Hi Peter,

Florian's answers are very good. I am not sure this will add much, but perhaps a little for the Galaxy output dataset parts of the questions.

The latest Using Galaxy paper, protocol 3, includes all of the "optional" output that MACS in Galaxy will produce (in addition to the linked files from the HTML report). Apart from the primary BED file and HTML output, there are 4 files paired by tags/control: 2 interval and 2 wig.

The coordinate system used by each file specification can vary, as you already observed. See the documentation links for exactly how these files are formatted. But regardless of the file coordinate system, a browser that interprets the datatype correctly will display the start/stop correctly, which is where the output datasets in Galaxy can be useful. That is, whether the start in the file is 1-based or 0-based, the actual start base will be visualized as the same base. To better understand this, load the output into the UCSC Browser or Trackster in Galaxy, scroll into one of the regions, and compare the display with the files (both the datasets in Galaxy and those downloaded through the links).

Full documentation for core MACS output is in the MACS documentation (link given by Florian, also linked from the MACS tool page).

Documentation/examples for the Galaxy output files are in our paper:
http://main.g2.bx.psu.edu/u/galaxyproject/p/using-galaxy-2012 (scroll to protocol 3)
http://onlinelibrary.wiley.com/doi/10.1002/0471250953.bi1005s38/full#bi1005-... (see step #6)

More help for datatypes:
http://wiki.g2.bx.psu.edu/Learn/Datatypes (bed, interval, wig are all covered with links to more resources)

Florian mostly covered these, but I'll also address them to be clear:

On 9/11/12 9:45 AM, peter scot wrote:
I ran MACS on my chipseq dataset and found various files:
1. Under the HTML report there are two files: one is negative peaks.xls and the second is peaks.xls. The file peaks.xls is the same as the peaks.interval file in the output on the right, with one bp added to each position, e.g. if the peak coordinates under the HTML report are 99 to 120, then in peaks.interval they are 100 to 121. Which one should be followed?

This is related to the different coordinate systems. See the file specifications.
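
To make the coordinate point concrete, here is a minimal sketch of how one region looks under two common conventions. It assumes a BED-style start (0-based, half-open) and a 1-based, inclusive convention as used by wig; the function names are made up for illustration and are not part of MACS or Galaxy. Whether a particular MACS output file follows either convention, and whether the end coordinate also shifts, should be checked against the file specifications.

```python
def to_one_based(start, end):
    """Convert a 0-based, half-open interval to 1-based, inclusive.

    Only the start changes: the half-open end already equals the
    1-based position of the last base in the region.
    """
    return start + 1, end


def to_zero_based(start, end):
    """Convert a 1-based, inclusive interval to 0-based, half-open."""
    return start - 1, end


# Hypothetical region: start 99 in a 0-based, half-open file.
zero_start, zero_end = 99, 120
one_start, one_end = to_one_based(zero_start, zero_end)

# Both descriptions cover the same number of bases.
length_zero_based = zero_end - zero_start
length_one_based = one_end - one_start + 1

print(one_start, one_end)
print(length_zero_based == length_one_based)
```

The first base of the region is the same in both cases, which is why a browser that knows the datatype draws the peak in the same place.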
2. What is the meaning of the negative peak.interval file?

It is a type of control data: basically, the inputs are flipped to produce it. It may not be needed/useful for further downstream analysis. The advice to read the MACS documentation to fully understand it is a good one.
3. I used ctrl and treated samples to run MACS. There are two wig files, one ctrl.wig and another treatment.wig. Do these two files belong to the ctrl and treated samples, and if so, where are the corresponding bed files?
These show the data density (pileup) in a graphical format. There are no corresponding bed files, although you can visualize these against the other bed and/or interval peak data to see how density was interpreted when calling peaks.

Hopefully this helps!

Jen
Galaxy team
If someone can direct me to a description of the output we get in Galaxy when using MACS, that will be helpful.
Thanks
-- Jennifer Jackson http://galaxyproject.org