Dear all,

I am a PhD student working on chicken genomics, with limited experience in bioinformatics. I performed an RNA-Seq experiment with single-end 50 bp reads to find genes that are differentially expressed between groups. I mapped the data with TopHat and used flagstat and Picard to check the number of mapped reads.

To check the coverage of my genome, I could multiply the number of mapped reads by the read length and divide by the genome size, but since I used mRNA as input material, the average coverage will be low (only exons are present). I would like to use Samtools Depth (as I read on SeqAnswers) to get the average coverage per covered base AND the total base coverage, but this does not seem to be included in Galaxy. Does anyone know a way around this? Other useful tips and tricks are also welcome.

Thank you very much. Have a nice day.

Yours sincerely,
Els

---
Ir. Els Willems
KU Leuven
Department of Biosystems
Division Livestock - Nutrition - Quality
Laboratory of Livestock Physiology
Kasteelpark Arenberg 30 bus 2456
B - 3001 Heverlee
T (+32) 016 32 17 29
F (+32) 016 32 19 94
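For illustration, the back-of-the-envelope estimate described in the question (mapped reads times read length, divided by genome size) can be written as a small Python function. This is only a sketch: the function name and the numbers in the example call are hypothetical placeholders, not values from the experiment, and for RNA-Seq the result understates the depth over expressed exons because the denominator is the whole genome.

```python
def naive_coverage(mapped_reads: int, read_length: int, genome_size: int) -> float:
    """Average per-base depth if the reads were spread evenly over the genome."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return mapped_reads * read_length / genome_size


# Hypothetical placeholder numbers: 20 million mapped 50 bp reads
# and a genome size of 1 Gbp (neither value comes from the thread).
print(naive_coverage(20_000_000, 50, 1_000_000_000))  # prints 1.0
```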
Hello Els,

Have you seen the tool "BEDTools -> Create a BedGraph of genome coverage"? This gives you the coverage values per interval, and you can then compute statistics on those numbers.

You could also use "Convert from BAM to BED" (there is an option to split spliced alignments) and, if you have a BED file of transcripts, use tools in this group or tools in "Operate on Genomic Intervals" to generate statistics.

You could also create your own statistics using "Text Manipulation -> Compute" or "Join, Subtract and Group -> Group".

Hopefully one of these options works out for you.

Jen
Galaxy team

On 3/15/13 8:54 AM, Els Willems wrote:
> I would like to use Samtools Depth (as I read on SeqAnswers) to get the average coverage per covered base AND the total base coverage, but this does not seem to be included in Galaxy. Does anyone know a way around this?
--
Jennifer Hillman-Jackson
Galaxy Support and Training
http://galaxyproject.org
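As a rough sketch of the kind of statistics that could be computed from the BedGraph suggested in the reply above, the Python function below sums up a four-column BedGraph (chrom, start, end, depth) once it has been downloaded from Galaxy. It is an illustration under assumptions, not part of any Galaxy tool: the function name and the example intervals are made up, and "total base coverage" is interpreted in two ways (number of covered bases, and depth summed over those bases) because the question does not say which is meant. Intervals with zero depth are skipped, in case the BedGraph was written with zero-coverage regions included.

```python
import io


def bedgraph_coverage_stats(lines):
    """Summarise a BedGraph: covered bases, summed depth, mean depth per covered base."""
    covered_bases = 0
    summed_depth = 0.0
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional track/browser/comment header lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        _chrom, start, end, depth = line.split()[:4]
        length = int(end) - int(start)  # BedGraph intervals are half-open
        depth = float(depth)
        if depth <= 0:
            continue  # zero-coverage interval, not a covered base
        covered_bases += length
        summed_depth += depth * length
    mean_depth = summed_depth / covered_bases if covered_bases else 0.0
    return {
        "covered_bases": covered_bases,
        "summed_depth": summed_depth,
        "mean_depth_per_covered_base": mean_depth,
    }


# Made-up example intervals, only to show the input layout.
example = io.StringIO(
    "chr1\t0\t100\t2\n"
    "chr1\t100\t150\t0\n"
    "chr1\t150\t200\t4\n"
)
stats = bedgraph_coverage_stats(example)
print(stats)
```

With a real file, the same function can be given an open file handle, for example `with open("coverage.bedgraph") as fh: bedgraph_coverage_stats(fh)`, where the file name is a placeholder.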