Hello all,

Continuing the search for slowness in my local Galaxy server (see http://lists.bx.psu.edu/pipermail/galaxy-dev/2009-December/001549.html ): the datatypes/sequence.py file also scans and parses the entire file when creating a new FASTA/FASTQ file. That is nice and informative for small files, but with a 2.7GB FASTA file the python process stays at 100% CPU for a long time, making everything else very slow. The offending code is in sequence.py, method "set_meta", lines 30-39.

I think Illumina expects 25x coverage of the human genome in a single run by the end of the year. This roughly translates to 8 FASTQ files of more than 8GB each => FASTA files of 4GB each... Galaxy will not be able to just casually scan these files.

-gordon
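
P.S. A minimal sketch of one way to avoid the full scan: check the file size first and skip the sequence count for large files. This is not the actual set_meta code in Galaxy. The function name count_fasta_sequences and the MAX_SCAN_BYTES threshold are hypothetical, chosen only for illustration.

```python
import os

# Hypothetical threshold (not an existing Galaxy setting): files larger
# than this are not scanned at all.
MAX_SCAN_BYTES = 50 * 1024 * 1024


def count_fasta_sequences(path, max_scan_bytes=MAX_SCAN_BYTES):
    """Count FASTA records by counting header lines that start with '>'.

    Returns None instead of a count when the file is larger than
    max_scan_bytes, so the caller can leave the metadata unset.
    """
    if os.path.getsize(path) > max_scan_bytes:
        return None

    count = 0
    # Read in binary mode and iterate line by line, so memory use stays
    # small regardless of file size.
    with open(path, "rb") as handle:
        for line in handle:
            if line.startswith(b">"):
                count += 1
    return count
```

With something like this, a caller would treat a None result as "count unknown" and leave the sequence count blank for big files, instead of blocking the server while the whole file is parsed.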