Should I check "Use raw Junctions" and "Only look for supplied junctions"?
Dear All,

I have two more questions about settings for TopHat. My aim is to look for differential splicing events between cell types.

After I checked "Use Own Junctions", three more options appeared:
1) "Use Gene Annotation Model"
2) "Use raw Junctions"
3) "Only look for supplied junctions"

As instructed by Jen, I checked "Use Gene Annotation Model" and supplied the iGenome mm9 genes.gtf as "Gene Model Annotations". However, I am not sure whether I should also choose "Use raw Junctions" and "Only look for supplied junctions". Please help me set up these two options.

Thanks.
Jianguang
Hi Jianguang,

Using known reference annotation ("Use Own Junctions", etc.) is not part of the example RNA-seq tutorial our team has published for this type of analysis. That doesn't mean that it cannot be used, but how it could or should be used probably needs to be tested to see whether the results from the various options meet your needs.

"Use raw Junctions" = will combine both the reference annotation and the novel junctions in the final junctions called.
"Only look for supplied junctions" = will limit to only those junctions in the reference annotation (no novel junctions from the input).

As I explained earlier, based on the TopHat documentation, using reference annotation at all causes those junctions to be given some favorable bias during the mapping.

When thinking about the options, a lot probably depends on how well the genome is annotated vs how novel your data is. This may not be known upfront. Also, if you are interested in discovery, it would probably be important to consider whether you want to bias towards known annotation early in the analysis - we didn't in our tutorial. However, it may be that you want to map primarily or only to characterized splicing events that are known (or suspected) to be linked to disease or other expression profiles of interest, and that are already present in the reference annotation. In that case, using reference annotation could help to focus the results.

Ultimately this is a decision you will need to make according to your end goals - and some testing would be recommended. Try a few runs with the different options, compare the differences between the TopHat-mapped and Cufflinks-assembled transcripts at the gene level, and see which results make the most sense - the Trackster tool ("Visualization") would be good for this. Apologies for not being more specific, but there is no single answer for this question.
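If it helps to see the comparison laid out, here is a minimal sketch of one way to set it up outside the Galaxy interface. It rests on several assumptions. I am assuming that the Galaxy options correspond to the TopHat command-line flags `-G/--GTF`, `-j/--raw-juncs` and `--no-novel-juncs`. The TopHat manual describes `-j` as taking a tab-delimited list of junctions that you supply yourself, so it is worth confirming what the Galaxy option does with your inputs. The index, reads and output names are hypothetical placeholders, and the run names are just labels. The second half reads the `junctions.bed` file that TopHat writes for each run and counts which junctions two runs share.

```python
from pathlib import Path

# Hypothetical inputs: replace with your own Bowtie index, reads and annotation.
BOWTIE_INDEX = "mm9"
READS = "cell_type_A.fastq"
GTF = "genes.gtf"

# Three runs to compare. The extra flags are TopHat command-line options;
# their mapping to the Galaxy form fields is an assumption.
RUNS = {
    "no_annotation": [],
    "annotation_plus_novel": ["-G", GTF],
    "annotation_only": ["-G", GTF, "--no-novel-juncs"],
}


def tophat_command(run_name):
    """Return the TopHat argument list for one run (nothing is executed here)."""
    return ["tophat", "-o", f"tophat_{run_name}", *RUNS[run_name], BOWTIE_INDEX, READS]


def read_junctions(bed_path):
    """Read a TopHat junctions.bed file into a set of
    (chrom, intron_start, intron_end, strand) tuples."""
    junctions = set()
    for line in Path(bed_path).read_text().splitlines():
        if not line.strip() or line.startswith("track"):
            continue  # skip blank lines and the BED track header
        fields = line.split("\t")
        chrom, start, strand = fields[0], int(fields[1]), fields[5]
        block_sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
        block_starts = [int(x) for x in fields[11].rstrip(",").split(",")]
        # Each record has two blocks (the read overhangs); the intron lies between them.
        intron_start = start + block_sizes[0]
        intron_end = start + block_starts[1]
        junctions.add((chrom, intron_start, intron_end, strand))
    return junctions


def compare_runs(junctions_a, junctions_b):
    """Count junctions shared by two runs and junctions unique to each."""
    return {
        "shared": len(junctions_a & junctions_b),
        "only_a": len(junctions_a - junctions_b),
        "only_b": len(junctions_b - junctions_a),
    }
```

For example, `compare_runs(read_junctions("tophat_no_annotation/junctions.bed"), read_junctions("tophat_annotation_only/junctions.bed"))` gives a rough count of how many junctions are lost or gained when the search is restricted to the annotation. It is only a first-pass summary and does not replace looking at the genes of interest in Trackster.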
You might try asking at tophat.cufflinks@gmail.com for advice from that community or the tool authors, or searching at seqanswers.com to see what others have been doing. This may give you a feel for the general usage trends (but it probably won't replace your own testing).

Take care,
Jen
Galaxy team

On 8/28/12 8:11 AM, Du, Jianguang wrote:
Dear All,
I have two more questions about settings for Tophat.
My aim is to look for differential splicing events between cell types.
After I checked "Use Own Junctions", three more options came out:
1) "Use Gene Annotation Model"
2) "Use raw Junctions"
3) "Only look for supplied junctions"
As instructed by Jen, I checked "Use Gene Annotation Model", and input iGenome mm9 genes.gtf as "Gene Model Annotations".
However, I am not sure if I should choose to "Use raw junctions" and "only look for supplied junctions". Please help me set up these two options.
Thanks.
Jianguang
-- Jennifer Jackson http://galaxyproject.org