Continuing the search for slowness in my local Galaxy server (see
The datatypes/sequence.py file is also scanning and parsing entire
files when creating a new FASTA/FASTQ file.
That is nice and informative for small files, but with a 2.7GB
FASTA file the Python process stays at 100% CPU for a very long
time, making everything else very slow.
The offending code is in sequence.py, method "set_meta", lines 30-39.
I think Illumina expects 25x coverage of the human genome in a single
run by the end of the year. This roughly translates to 8 FASTQ
files of more than 8GB each, and so to FASTA files of about 4GB each.
Galaxy will not be able to just casually scan these files.
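
One possible way around this is to skip the full scan once a file passes
some size limit. The sketch below is only an illustration, not the actual
set_meta code: the function name, the threshold and the assumption that
the scan is there to count sequences are all mine.

```python
import os

# Hypothetical limit (not an existing Galaxy setting): files larger
# than this are not scanned at all.
MAX_SCAN_BYTES = 50 * 1024 * 1024


def count_fasta_sequences(path, max_scan_bytes=MAX_SCAN_BYTES):
    """Count FASTA records (lines starting with '>').

    Returns None when the file is larger than max_scan_bytes, so the
    caller can leave the metadata unset instead of blocking on a
    multi-gigabyte scan.
    """
    if os.path.getsize(path) > max_scan_bytes:
        return None
    count = 0
    # Read in binary mode and iterate line by line, so memory use
    # stays small and no decoding work is done.
    with open(path, "rb") as handle:
        for line in handle:
            if line.startswith(b">"):
                count += 1
    return count
```

A caller could treat a None result as "unknown" and show the sequence
count only for files small enough to scan cheaply.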
galaxy-dev mailing list