Trimming small RNA
Hello everybody,

I have small RNA sequences of 18 to 35 nt and need to accurately trim the adapter sequences before aligning the reads to the reference genome. Are there any tools available for this in Galaxy?

Thanks.

--
Thiago Mafra Batista
Molecular Biologist
PhD candidate in Bioinformatics - UFMG
LGB - ICB/Bloco K4 sala 245
Tel Lab: (31) 3409-2628
CV: http://lattes.cnpq.br/9414909432933240
Hi,

This must seem like a newbie question, but I can't get a clear answer. My understanding from the Galaxy wiki page http://wiki.g2.bx.psu.edu/Learn/FAQ#Learn.2BAC8-FAQ.Interval_and_BED_format is that all intervals in Galaxy are 0-based, start inclusive, end exclusive. But when I use generate pileup/filter pileup and convert to intervals, I get something like this:

chr10 1056309 1056310 G C +

When I look up the SNP (G-->C), it is pretty clearly at 1056310, which would make the "interval" end inclusive. This is key because when I annotate SNPs against dbSNP, I need to have the right coordinates.

Can anyone provide some guidance? Thanks!

rich
Hello Richard,

The coordinates have a zero-based start. Add 1 to the start, leave the end unchanged, and the bases included will match any visualization tool where the first base is labeled "1".

The data:

chr10 1056309 1056310 G C +

start = 1056309 + 1 = 1056310
end = 1056310

The SNP is a single-base change at position 1056310.

There are other details, but this is the key fact that you will likely need for most applications, especially those that are not stranded or that have been converted to the (+) strand. For the full details, including how to transform (-) stranded coordinates using this system, the description from UCSC is very handy:

http://genomewiki.cse.ucsc.edu/index.php/Coordinate_Transforms

Hopefully this helps,

Best,
Jen
Galaxy team

(In reply to Richard Mark White's message of 9/11/11, 8:04 PM.)
___________________________________________________________ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
-- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/Support
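The rule in the reply above (add 1 to the start, leave the end alone) can be sketched in a few lines of Python. This is only an illustration under the assumption that the input is a 0-based, end-exclusive interval; the function name `bed_to_one_based` is hypothetical and is not part of Galaxy or any pileup tool.

```python
from typing import Tuple


def bed_to_one_based(start: int, end: int) -> Tuple[int, int]:
    """Convert a 0-based, end-exclusive interval to 1-based, end-inclusive."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    # Only the start moves; the exclusive 0-based end already equals
    # the inclusive 1-based end.
    return start + 1, end


# The interval from this thread: chr10 1056309 1056310 G C +
first, last = bed_to_one_based(1056309, 1056310)
print(first, last)  # 1056310 1056310
```

Both values come out as 1056310, which matches the single-base SNP position Richard looked up.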
Hi Rich,

That is consistent with the BED format: in BED's 0-based coordinate system the SNP is at 1056309. In the more common 1-based system this translates to 1056310. If the end were inclusive, the SNP would span 1056309-1056310 in the BED world; that is, it would take up 2 positions. The first base of a genome is represented in BED coordinates as 0-1.

My quick rule of thumb for converting between coordinate systems is to add (or subtract) 1 from the start base and leave the end base alone.

Jim

(In reply to Richard Mark White's message of Sun, Sep 11, 2011, 10:58 PM.)
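To see why an inclusive end would make the SNP take up 2 positions, it can help to list the 1-based positions an interval covers under each reading. The Python sketch below is an illustration only; `covered_positions` and `one_based_to_bed` are hypothetical helper names, and the `end_inclusive` flag exists purely to show the reading that does not apply to BED.

```python
from typing import List, Tuple


def covered_positions(start: int, end: int, end_inclusive: bool = False) -> List[int]:
    """List the 1-based positions covered by a 0-based interval."""
    # With an exclusive end (BED), the last 1-based position equals `end`.
    # With an inclusive end, the interval would reach one base further.
    last = end + 1 if end_inclusive else end
    return list(range(start + 1, last + 1))


def one_based_to_bed(first: int, last: int) -> Tuple[int, int]:
    """Reverse direction: subtract 1 from the start, leave the end alone."""
    return first - 1, last


print(covered_positions(1056309, 1056310))                      # [1056310]
print(covered_positions(1056309, 1056310, end_inclusive=True))  # [1056310, 1056311]
print(covered_positions(0, 1))                                  # [1]
print(one_based_to_bed(1056310, 1056310))                       # (1056309, 1056310)
```

Under the end-exclusive reading the interval covers exactly one base, and the first base of a genome (0-1) maps to position 1.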
participants (4)
- James Robinson
- Jennifer Jackson
- Richard Mark White
- Thiago Mafra