Issue with saving 'manipulate fastq' in workflow; and request for advice dealing with barcoded 454 data
Hi,

I'm a new user, learning how to use Galaxy while I wait for my 454 results. I'm not actually working with any data yet, but I'm trying to set up a draft workflow as practice. Two issues:

Issue 1. I am having trouble with the 'manipulate fastq' tool. Without it, my workflow saves quickly and seems fine, but when I include even a (seemingly simple) 'manipulate fastq' step, it tries to save for many minutes, unsuccessfully, until I get sick of it and close the window.

Issue 2. Well, this isn't really an issue, just a request for advice! My dataset will be a barcoded amplicon library containing 8 different gene regions (which I can recognise from the amplicon-specific primer sequences) amplified in 64 different individuals (which I can recognise by an individual-specific barcode sequence). I thought I'd set up a workflow with the following steps:

1) Convert to FASTQ format.
2) Grooming, and filtering to remove short reads etc.
3) 'Manipulate FASTQ' to match all sequences containing one of the eight reverse primer sequences, and reverse-complement them.
4) FASTQ-to-tabular format conversion.
5) Eight separate 'select' steps to select sequences with a match to either the forward primer or the reverse-complemented reverse primer of the desired gene region.

My question is: does this seem sensible? Is there a more efficient way to do this that I haven't discovered yet? I was thinking I'd then set up another workflow to label barcoded individuals, for which I could use each of the eight gene 'output files' in turn as input.

Thanks so much for this service! The screencasts are especially great.

Pip Griffin
University of Melbourne, Australia
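
For illustration, the logic of steps 3) to 5) can be sketched in plain Python, outside Galaxy. This is only a rough sketch under stated assumptions: the gene names, primer sequences, and function names are hypothetical placeholders (none come from the post or from Galaxy), reads are given as (name, sequence, quality) tuples, and primers are matched exactly rather than with mismatches allowed.

```python
# Hypothetical placeholder primers: (forward primer, reverse primer) per gene region.
PRIMERS = {
    "geneA": ("ACGTAC", "TTGACC"),
    "geneB": ("GGATTC", "CATTGG"),
}

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient_read(seq, qual):
    """Step 3: reverse-complement a read that contains any reverse primer.

    The quality string is reversed so it stays aligned with the bases.
    """
    for _fwd, rev in PRIMERS.values():
        if rev in seq:
            return reverse_complement(seq), qual[::-1]
    return seq, qual


def assign_gene(seq):
    """Step 5: find the gene whose forward primer, or reverse-complemented
    reverse primer, occurs in the (already oriented) read."""
    for gene, (fwd, rev) in PRIMERS.items():
        if fwd in seq or reverse_complement(rev) in seq:
            return gene
    return None


def bin_reads(records):
    """Orient each read, then group reads by gene region.

    Reads that match no primer are dropped.
    """
    bins = {gene: [] for gene in PRIMERS}
    for name, seq, qual in records:
        seq, qual = orient_read(seq, qual)
        gene = assign_gene(seq)
        if gene is not None:
            bins[gene].append((name, seq, qual))
    return bins
```

Each list in the returned dictionary corresponds to one of the per-gene 'output files' described above, and the same exact-match approach could in principle be repeated with barcode sequences to label individuals.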