Hi all,
Are there any built-in Galaxy tools to do with GC percentage (or
indeed, AT percentage) that I have missed?
I'm thinking of a tool to calculate the GC percentage (and perhaps
related statistics like counts/percentages of A, C, G, T), plus a
related tool to filter on GC. Possible use cases include filtering
NGS reads to remove high/low GC reads from a contaminant.
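
To make the idea concrete, here is a rough sketch in plain Python of the kind of calculation and filter I mean. This is not an existing Galaxy tool; the function names (base_composition, gc_percent, filter_by_gc) and the choice to ignore ambiguous bases such as N are my own assumptions for illustration.

```python
from collections import Counter


def base_composition(seq):
    """Return (counts, percentages) of A, C, G, T for a sequence.

    Assumption: anything other than A/C/G/T (e.g. N) is ignored, so
    the percentages are relative to the unambiguous bases only.
    """
    counts = Counter(seq.upper())
    acgt = {base: counts.get(base, 0) for base in "ACGT"}
    total = sum(acgt.values())
    if total == 0:
        return acgt, {base: 0.0 for base in "ACGT"}
    return acgt, {base: 100.0 * n / total for base, n in acgt.items()}


def gc_percent(seq):
    """Return the GC percentage (0.0 if there are no A/C/G/T bases)."""
    acgt, _ = base_composition(seq)
    total = sum(acgt.values())
    if total == 0:
        return 0.0
    return 100.0 * (acgt["G"] + acgt["C"]) / total


def filter_by_gc(records, min_gc=0.0, max_gc=100.0):
    """Yield (name, seq) pairs whose GC percentage is within the range.

    `records` is any iterable of (name, sequence) tuples; the
    thresholds are hypothetical parameters a tool could expose.
    """
    for name, seq in records:
        if min_gc <= gc_percent(seq) <= max_gc:
            yield name, seq
```

For example, list(filter_by_gc(reads, min_gc=30, max_gc=70)) would keep only the reads with GC between 30% and 70%, where reads is a list of (name, sequence) tuples.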
Slightly more complicated, right now I want to calculate the GC (or in
fact AT) percentage from the first and last ~20 (configurable) bases.
In this case I am looking for (and filtering on) AT-rich ends of
contigs which may be indicative of viral sequences. A very similar
task would be looking for (and filtering on) poly-A tails of mRNA, or
if sequenced from the reverse strand, a poly-T start.
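
Again as a rough sketch only, the end-based version could look something like the following in plain Python. The names (at_percent, end_at_percents, has_at_rich_end), the default window of 20 bases, and the 80% threshold are all assumptions for illustration, as is the choice to flag a sequence when either end passes rather than both.

```python
def at_percent(seq):
    """Return the AT percentage, ignoring anything other than A/C/G/T."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    total = at + seq.count("C") + seq.count("G")
    return 100.0 * at / total if total else 0.0


def end_at_percents(seq, window=20):
    """Return AT percentages of the first and last `window` bases.

    If the sequence is shorter than the window, the two slices
    overlap (or are the whole sequence).
    """
    if window <= 0:
        raise ValueError("window must be a positive number of bases")
    return at_percent(seq[:window]), at_percent(seq[-window:])


def has_at_rich_end(seq, window=20, threshold=80.0):
    """True if either end is at least `threshold` percent AT.

    Assumption: one AT-rich end is enough; a real tool might offer
    an option to require both ends instead.
    """
    start, end = end_at_percents(seq, window)
    return start >= threshold or end >= threshold
```

With a threshold close to 100 and counting only A (or only T) instead of A plus T, the same windowed approach would cover the poly-A tail and poly-T start cases.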
Peter