Hi Yan,
I suspected that this was what you were originally asking, but then reconsidered when I read the subject line again. This is because one of the functions of Cufflinks is to do what you are asking - it brings together mapped RNA-seq data to produce transcripts/genes based on either read overlap alone or read overlap plus overlap with a reference annotation (GTF).
As far as I know, modifying the RNA-seq input by assembling it first or by collapsing redundancy would change the nature of the experiment. Reviewing the Cufflinks documentation will help with understanding how this processing was designed to work with the expected inputs:
http://cufflinks.cbcb.umd.edu/manual.html
Assembly is not available on Galaxy Main, but there are other options. For general RNA-seq assembly purposes (not advised for Cufflinks, at least for the RNA-seq input), you could run a local or cloud instance (http://getgalaxy.org) and consider Trinity (alpha). This was announced in the May News Brief:
http://wiki.g2.bx.psu.edu/DevNewsBriefs/2012_05_11#Tools
There are also tools available from the Tool Shed to consider. Search for 'assembly' or 'trinity' - but be sure the tool is for RNA and not DNA. Tools here are supported by the tool wrapper authors themselves - the contact information is with each repository.
http://toolshed.g2.bx.psu.edu/
As an aside, using Bowtie is non-standard (TopHat is preferred, unless you are working with a genome that has an unspliced transcriptome). I was making the assumption that this was the case with your data, but I did want to mention it in case it wasn't clear.
If the desire to assemble is related to the use of a circular genome, then you may want to contact the tool authors at their support email to see what protocol advice is available: tophat.cufflinks@gmail.com. Posting back any replies, creating a tutorial, or adding a page in the Galaxy wiki on the subject would be most welcome - other Galaxy users would likely be very interested.
Hopefully this helps,
Jen
Galaxy team
On 8/14/12 8:43 PM, Yan He wrote:
Hi Jen,
Thanks for your reply! I know this workflow. I am just wondering if there is a tool in Galaxy to combine the reads that mapped to the same gene at different positions before running Cufflinks.
Thanks again,
Yan
-----Original Message-----
From: Jennifer Jackson [mailto:jen@bx.psu.edu]
Sent: Wednesday, August 15, 2012 11:01 AM
To: Yan He
Cc: galaxy-user@lists.bx.psu.edu
Subject: Re: [galaxy-user] how to sort mapped data?
Hello Yan,
To sort a SAM file produced by Bowtie before using it with Cufflinks (a requirement), please see this FAQ and workflow:
http://main.g2.bx.psu.edu/u/jeremy/p/transcriptome-analysis-faq#faq2
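For readers working outside Galaxy, the sort that Cufflinks expects (by reference name, then by numeric mapping position) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow from the FAQ: the function name `sort_sam_lines` and the example records are hypothetical, and keeping the `@` header lines at the top is a choice made here rather than something the thread specifies.

```python
def sort_sam_lines(lines):
    """Return SAM lines with header lines first, followed by alignment
    records sorted by reference name (column 3) and position (column 4)."""
    headers = []
    alignments = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith("@"):
            headers.append(line)  # header lines keep their original order
        else:
            alignments.append(line)

    def sort_key(record):
        fields = record.split("\t")
        # fields[2] is RNAME, fields[3] is the 1-based POS (compared numerically)
        return (fields[2], int(fields[3]))

    return headers + sorted(alignments, key=sort_key)


# Hypothetical records: three reads mapped to two reference sequences.
example = [
    "@SQ\tSN:geneA\tLN:1000",
    "@SQ\tSN:geneB\tLN:2000",
    "read3\t0\tgeneB\t50\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "read1\t0\tgeneA\t200\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t16\tgeneA\t15\t255\t4M\t*\t0\t0\tACGT\tIIII",
]
sorted_lines = sort_sam_lines(example)
for line in sorted_lines:
    print(line)
```

The whole file is held in memory, so for large SAM files a dedicated sorting tool or the Galaxy workflow linked above is likely the better choice.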
Best,
Jen
Galaxy team
On 8/14/12 7:33 PM, Yan He wrote:
Hi everyone,
I am working on RNA-seq data. First, I mapped the reads to the reference transcriptome using Bowtie. I found that some different reads mapped to the same gene at different positions. Before running Cufflinks, I would like to combine the reads that mapped to the same gene, even though they map to different positions. Is there a tool in Galaxy that can fulfill this purpose? Any suggestion would be much appreciated. Thanks!
Yan
___________________________________________________________
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
-- Jennifer Jackson http://galaxyproject.org