I'm a new user, learning how to use Galaxy while I wait for my 454 results.
So I'm not actually playing with any data yet but I'm trying to set up a
draft workflow as practice. Two issues:
First, I am having trouble with the 'manipulate fastq' command. Without this, my
workflow saves quickly and seems fine, but when I include even a (seemingly
simple) 'manipulate fastq' step, it tries to save for many minutes,
unsuccessfully, until I get sick of it and close the window.
Second (this isn't really an issue, just a request for advice!): my dataset will
be a barcoded amplicon library, containing 8 different gene regions (which I
can recognise from the amplicon-specific primer sequences) amplified in 64
different individuals (which I can recognise by an individual-specific
barcode sequence). I thought I'd set up a workflow with the following steps:
1) Convert to FASTQ format.
2) Grooming, filtering to remove short reads etc.
3) 'Manipulate FASTQ' to match all sequences containing one of the eight
   reverse primer sequences, and reverse-complement them.
4) FASTQ-to-tabular format conversion.
5) Eight separate 'select' steps to select sequences with a match to either
   the forward primer or the reverse-complemented reverse primer of the
   desired gene region.
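As a rough illustration of the logic behind steps 3 and 5 (outside Galaxy, and not how the Galaxy tools are implemented), a minimal Python sketch might look like the following. The primer sequences, the region names and the function names are all made-up placeholders, and only two of the eight regions are shown.

```python
# Hypothetical primers for two of the eight gene regions.
# These are placeholder sequences for illustration only.
PRIMERS = {
    "geneA": {"fwd": "ACGTAC", "rev": "TTGCAG"},
    "geneB": {"fwd": "GGATCA", "rev": "CCTAGA"},
}

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient_and_assign(seq, qual):
    """Assign a read to a gene region and put it in forward orientation.

    A read containing a forward primer is kept as-is. A read containing a
    reverse primer is reverse-complemented, and its quality string is
    reversed so each score stays with its base. Returns (region, seq, qual);
    region is None when no primer matches exactly.
    """
    for region, primers in PRIMERS.items():
        if primers["fwd"] in seq:
            return region, seq, qual
        if primers["rev"] in seq:
            return region, reverse_complement(seq), qual[::-1]
    return None, seq, qual
```

This only finds exact primer matches; reads with sequencing errors inside the primer would be left unassigned.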
My question is: does this seem sensible? Is there a more efficient way to do
this that I haven't discovered yet? I was thinking I'd then set up another
workflow to label barcoded individuals, for which I could use each of the eight
gene 'output files' in turn as input.
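The barcode-labelling step could be sketched along the same lines. This assumes, purely for illustration, that the barcode is a fixed-length prefix at the very start of each read as stored in the file being labelled; the barcode sequences, their length and the individual names below are hypothetical.

```python
# Hypothetical barcode table: two of the 64 individuals, placeholder sequences.
BARCODES = {
    "ACGAGTGC": "individual_01",
    "ACGCTCGA": "individual_02",
}
BARCODE_LEN = 8  # assumed fixed barcode length


def label_individual(seq):
    """Return the individual whose barcode starts the read, or None."""
    return BARCODES.get(seq[:BARCODE_LEN])
```

If reads have already been reverse-complemented, the barcode may sit at the other end of the read, so the lookup position would need adjusting.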
Thanks so much for this service! The screencasts are especially great.
University of Melbourne, Australia