Megablast question
Hi all
When using Galaxy megablast, is there a simple way to reduce my FASTA files from 23 million reads to 1/2 that size and submit to megablast separately?
Thanks -- Scott Tighe Advanced Genome Technology Lab Vermont Cancer Center at the University of Vermont 149 Beaumont Avenue Health Science Research Bd RM 305 Burlington Vermont USA 05405 lab 802-656-AGTC (2482) cell 802-999-6666
Noa has the right idea, but if you're asking how to split a dataset into two non-overlapping halves, you'll want to use "Select First" and "Select Last" instead of random lines. Get an accurate line count for your file using the "Line/Word/Character count" tool, then split it right in the middle using "Select First" and "Select Last".
-Dannon
On Feb 16, 2012, at 2:35 PM, Noa Sher wrote:
Hi Scott
I have never used megablast, so what I am writing applies to any FASTA file (if there is anything quirky in megablast that I don't know about, apologies!):
• Take your FASTA file and convert it to tabular (under "fasta manipulation"); this puts each record on one line.
• Then randomly choose however many reads you want using "select random lines from a file" under the text manipulation tab.
• Then convert the tabular file back to FASTA (under the fasta manipulation tab).
noa
On 16/02/2012 19:31, Scott Tighe wrote:
Hi all
When using Galaxy megablast, is there a simple way to reduce my FASTA files from 23 million reads to 1/2 that size and submit to megablast separately?
Thanks -- Scott Tighe Advanced Genome Technology Lab Vermont Cancer Center at the University of Vermont 149 Beaumont Avenue Health Science Research Bd RM 305 Burlington Vermont USA 05405 lab 802-656-AGTC (2482) cell 802-999-6666
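For anyone working outside the Galaxy interface, the two approaches in this thread (non-overlapping halves, or a random subset of reads) can be sketched in Python. This is only an illustration, not what the Galaxy tools do internally. The function names (`read_fasta`, `split_in_half`, `random_subset`, `to_fasta`) are made up for this example, and it holds all records in memory, which may not be practical for 23 million reads without adapting it to stream from disk. It splits on record boundaries rather than raw line counts, so a header is never separated from its sequence.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header line including '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs; multi-line sequences are joined."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_in_half(records: Iterable[Record]) -> Tuple[List[Record], List[Record]]:
    """Return two non-overlapping halves; the first gets the extra record."""
    records = list(records)
    mid = (len(records) + 1) // 2
    return records[:mid], records[mid:]


def random_subset(records: Iterable[Record], n: int, seed: int = 0) -> List[Record]:
    """Return n records chosen at random without replacement."""
    records = list(records)
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return rng.sample(records, min(n, len(records)))


def to_fasta(records: Iterable[Record]) -> str:
    """Format records as FASTA text, one sequence line per record."""
    return "".join(f"{header}\n{seq}\n" for header, seq in records)


if __name__ == "__main__":
    demo = [">r1\n", "ACGT\n", "AC\n", ">r2\n", "GGCC\n", ">r3\n", "TTAA\n"]
    first, second = split_in_half(read_fasta(demo))
    print(to_fasta(first), end="")
    print(to_fasta(second), end="")
```

To use it on a real file, pass an open file handle to `read_fasta` and write each half to its own output file with `to_fasta`.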
___________________________________________________________
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
participants (3)
- Dannon Baker
- Noa Sher
- Scott Tighe