Re: [galaxy-user] Adding a custom genome for using MACS in Galaxy

10 Sep 2012

Hello Mathew,

If you have already mapped your data, then you can just upload the 
BAM/SAM dataset(s), sort if necessary, leave the database unassigned, 
and run MACS. This workflow has an example of how to sort a BAM file and 
send it to MACS. You don't have to use it exactly; in fact, the settings 
(especially for MACS) are likely not appropriate for your data. Just 
examine the general sort rules, use the parts that make sense for your 
purposes, and either run the tools independently or modify the workflow 
to create your own:
http://main.g2.bx.psu.edu/u/jen-bx-galaxy-edu/w/sort-bam-for-peak-calling-ma...

If you want to convert SAM-to-BAM (not really necessary), or if you are 
starting with raw sequence data that needs to be mapped (or find that 
you want to map it again), then the custom reference genome should be 
loaded along with the sequence data. Again, leave the database 
unassigned for all datasets. The general protocol is covered in #3 from 
the Using Galaxy paper (make adjustments for tag size, effective genome 
size, etc. as needed, using the MACS documentation linked from the 
tool's page as a guide):
http://main.g2.bx.psu.edu/u/galaxyproject/p/using-galaxy-2012
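
For a custom genome there is no preset effective genome size, so you 
will need a number of your own. The sketch below is only a rough 
starting point: it counts the non-N bases in a FASTA file. Using that 
count as the effective genome size is an assumption on my part (the 
uniquely mappable portion is usually somewhat smaller), and the function 
name and example sequences are made up for illustration. Please check 
the MACS documentation before relying on the value.

```python
def count_non_n_bases(fasta_lines):
    """Count bases that are not N across all sequences in a FASTA.

    fasta_lines: any iterable of text lines (an open file works).
    ASSUMPTION: the non-N base count is only a rough upper bound on the
    effective (mappable) genome size, not a value MACS prescribes.
    """
    total = 0
    for line in fasta_lines:
        line = line.strip()
        # Skip blank lines and ">" title lines; only sequence lines count.
        if not line or line.startswith(">"):
            continue
        total += sum(1 for base in line.upper() if base != "N")
    return total


# Hypothetical two-sequence genome, just to show the call.
example_fasta = [">contig_1", "ACGTNNAC", "ggtt", ">contig_2", "NNNNAC"]
genome_size_estimate = count_non_n_bases(example_fasta)
print(genome_size_estimate)
```

With a real file you would pass the open file handle instead of the 
example list, for instance `count_non_n_bases(open("my_genome.fasta"))`, 
where the file name is a placeholder.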

To prepare, load, troubleshoot, and use a custom reference genome with 
tools (such as mapping tools), please see this wiki and the links it 
points to:
http://wiki.g2.bx.psu.edu/Support#Custom_reference_genome
In short, tool forms that have a custom genome option will ask "Choose 
the source for the reference list:" or similar. Select "History" and 
then select the dataset where your custom reference genome has been 
uploaded in fasta format and assigned the datatype "fasta". It is very 
important that the chromosome/scaffold identifiers in the reference 
genome are identical to those in any other files that refer to it (for 
example, a SAM or GTF dataset). This is where doing all of the analysis 
within Galaxy can sometimes be easier, since our tools maintain this 
internal data consistency.
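
If the data was mapped outside of Galaxy, a quick check of the 
identifiers can save some troubleshooting. The following is a minimal 
sketch, not a Galaxy tool: it collects the identifiers from the FASTA 
title lines and the reference names from a SAM file, then reports any 
SAM names that are missing from the FASTA. Taking the first 
whitespace-delimited word of each title line as the identifier is an 
assumption (some tools use the whole line), and the function names and 
example records are hypothetical.

```python
def fasta_ids(fasta_lines):
    """Return the set of identifiers found on FASTA title lines.

    ASSUMPTION: the identifier is the first whitespace-delimited word
    after ">"; anything after it is treated as a description.
    """
    ids = set()
    for line in fasta_lines:
        if line.startswith(">"):
            words = line[1:].split()
            if words:
                ids.add(words[0])
    return ids


def sam_reference_names(sam_lines):
    """Return reference names from @SQ header lines and the RNAME column."""
    names = set()
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            if fields[0] == "@SQ":
                # The SN: tag on an @SQ line holds the reference name.
                for field in fields[1:]:
                    if field.startswith("SN:"):
                        names.add(field[3:])
            continue
        # Alignment line: column 3 is RNAME; "*" means unmapped.
        if len(fields) > 2 and fields[2] != "*":
            names.add(fields[2])
    return names


def names_missing_from_fasta(fasta_lines, sam_lines):
    """Sorted SAM reference names that have no matching FASTA identifier."""
    return sorted(sam_reference_names(sam_lines) - fasta_ids(fasta_lines))


# Made-up records: the SAM uses "contig1" while the FASTA uses "contig_1".
example_fasta = [">contig_1 example description", "ACGT"]
example_sam = [
    "@SQ\tSN:contig1\tLN:4",
    "read1\t0\tcontig1\t1\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
missing = names_missing_from_fasta(example_fasta, example_sam)
print(missing)
```

An empty list means every reference name in the SAM data has a matching 
identifier in the genome; anything listed needs to be renamed on one 
side or the other before running tools.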

This should help to get you started, but please let us know if you need 
more help as the analysis proceeds.

Best,

Jen
Galaxy team

On 9/10/12 9:59 AM, Mathew Bunj wrote:
...
I have ChIP-seq data that has been aligned against a bacterial genome. I
am trying to figure out how I can use the peak caller MACS on the Galaxy
main server. Do I need to select the bacterial genome (in the genome
option of data upload) when uploading the data? Could someone tell me
whether I can add my own custom genome for the MACS program within
Galaxy main?
Thanks
-- 
Jennifer Jackson
http://galaxyproject.org