Dear Galaxy expert(s),
I have a BED file of regions from mouse. I guess many of them span whole genes, i.e. many exons, and some might even extend into the gene flanks.
I need to get the repeat-masked sequences of only the annotated exons within these regions.
I see that if I use the tool "Fetch Sequences->Extract Genomic DNA" on these regions, it returns sequences with a mix of lowercase and uppercase letters.
Question I: what do the lowercase letters mean here, and what do the uppercase ones mean?
Is the sequence already repeat-masked, do the cases mark exons/introns, or is it something else?
(I downloaded some of these sequences and repeat-masked them myself. My masked sequences overlap with some of "yours" that are written in lowercase letters.)
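In case it helps to reproduce the comparison, here is a minimal sketch of how the lowercase stretches could be located and checked against one's own masked intervals. It assumes only that lowercase runs are the thing to compare (whether they really are repeats is exactly my question); the function names and the example sequence are hypothetical, and coordinates are 0-based, half-open offsets within the returned sequence.

```python
import re

def lowercase_runs(seq):
    """Return 0-based, half-open (start, end) intervals of lowercase runs."""
    return [(m.start(), m.end()) for m in re.finditer(r"[a-z]+", seq)]

def overlapping_pairs(a, b):
    """Return every pair of intervals (one from a, one from b) sharing a base."""
    return [(x, y) for x in a for y in b if x[0] < y[1] and y[0] < x[1]]

# Hypothetical sequence as it might come back from "Extract Genomic DNA"
galaxy_seq = "ACGTacgtnnACGTttACG"
# Hypothetical intervals from my own masking run on the same sequence
my_masked = [(8, 12)]

galaxy_lower = lowercase_runs(galaxy_seq)
print(galaxy_lower)
print(overlapping_pairs(galaxy_lower, my_masked))
```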
Question II: Is the strand "honored" by this tool?
I seem to remember from past experience that there was an issue, although I cannot recall what exactly.
Thank you in advance,
David