Re: [galaxy-dev] Fastq_filter fails on large files

1 Oct 2010

Hi Isabelle,

There are no known limits, but perhaps you have found something new. We 
can explore two areas:

Your Galaxy instance config:
Is metadata being set externally? Specifically, we are wondering whether 
you have optional metadata configured to not count fastq blocks if 
the file is larger than a specified size or similar.

Example data & filter options:
  1 - If you could load some sample input files into a history on Galaxy 
main and share the link, that would be helpful. A sample of sequences 
that is representative of the entire dataset is enough.
  2 - Please note the specific filter options you used in the 
fastq_filter tool.
We can scale the data up, run with your filters, and try to see what is 
causing the problem.
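
If it helps with preparing the sample, here is a minimal sketch of one way to pull a random subset of reads out of a large fastq file without loading it into memory. It is not part of Galaxy or of fastq_filter. It assumes plain 4-line fastq records (no wrapped sequence lines), and the function names and the file names in the usage note are made up for illustration.

```python
import random
from itertools import islice


def read_fastq_records(handle):
    """Yield fastq records as lists of 4 lines.

    Assumption: every record is exactly 4 lines (sequence and
    quality strings are not wrapped across lines).
    """
    while True:
        record = list(islice(handle, 4))
        # Stop at end of file, or on trailing blank lines.
        if not record or not any(line.strip() for line in record):
            return
        if len(record) < 4 or not record[0].startswith("@"):
            raise ValueError("input does not look like 4-line fastq")
        yield record


def sample_fastq(in_handle, out_handle, k, seed=0):
    """Write up to k records chosen uniformly at random.

    Uses reservoir sampling, so only k records are held in memory
    at any time, regardless of the input size.
    Returns the number of records written.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, record in enumerate(read_fastq_records(in_handle)):
        if i < k:
            reservoir.append(record)
        else:
            # Keep the new record with probability k / (i + 1).
            j = rng.randint(0, i)  # both ends inclusive
            if j < k:
                reservoir[j] = record
    for record in reservoir:
        out_handle.writelines(record)
    return len(reservoir)
```

For example, opening a hypothetical reads.fastq for reading and sample.fastq for writing and calling sample_fastq(src, dst, 10000) would write 10,000 randomly chosen reads. The output order is not the input order, which should not matter for a filtering test.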

We look forward to your reply,

Jen
Galaxy team

On 9/13/10 3:31 PM, Isabelle Phan wrote:
...
Hello,
The tool fastq_filter worked on 1M reads, but fails (hangs) on 15M reads. I had to kill the job after the user let it run for a whole day. The debug.txt file containing a Python function "fastq_read_pass_filter" is created in the files/000/dataset_xxx_files directory. I am getting no error from the Galaxy server.
I wonder what could cause fastq_filter to fail? The fastx equivalent tool works, but it misses all the options of fastq_filter.
I'd be grateful for any hints to help me get fastq_filter to work on large fastq files.
Thanks
Isabelle
_______________________________________________
galaxy-dev mailing list
galaxy-dev@lists.bx.psu.edu
http://lists.bx.psu.edu/listinfo/galaxy-dev
-- 
Jennifer Jackson
http://usegalaxy.org