There are no known limits, but perhaps you have found something new. We
can explore two areas:
Your Galaxy instance config:
Is metadata being set externally? Specifically, we are wondering whether
you have optional metadata configured to skip counting fastq blocks when
the file exceeds a specified size, or something similar.
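For reference, external metadata handling is typically controlled in
universe_wsgi.ini. A minimal excerpt of the kind of setting to check,
assuming the option name from Galaxy configs of this era (please verify
the exact name and value in your own file):

    # universe_wsgi.ini
    # When True, metadata (e.g. fastq sequence counts) is computed in a
    # separate external process rather than inside the Galaxy server.
    set_metadata_externally = True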
Example data & filter options:
1 - If you could load some sample input files into a history on Galaxy
main and share the link, that would be helpful. A small sample of
sequences representative of the entire dataset is enough.
2 - Note the specific filter options used in the fastq_filter tool.
We can then scale the data up, run it with your exact filter options, and
try to reproduce the problem.
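As background on the debug.txt you mention below: fastq_filter generates
a small Python script containing a fastq_read_pass_filter function built
from the chosen filter options, then applies it to every read. A minimal
sketch of what such a generated function can look like (the cutoffs and
the read attributes used here are illustrative assumptions, not your
actual settings):

    def fastq_read_pass_filter( fastq_read ):
        # Reject reads outside an assumed length range
        if len( fastq_read.sequence ) < 20 or len( fastq_read.sequence ) > 100:
            return False
        # Reject reads whose minimum quality score is below an assumed cutoff
        if min( fastq_read.get_decimal_quality_scores() ) < 20:
            return False
        # Keep everything else; the tool writes out reads that return True
        return True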
We look forward to your reply,
On 9/13/10 3:31 PM, Isabelle Phan wrote:
The tool fastq_filter worked on 1M reads, but fails (hangs) on 15M reads. I had to kill
the job after the user let it run for a whole day. A debug.txt file containing a Python
function "fastq_read_pass_filter" is created in the files/000/dataset_xxx_files
directory. I am getting no error from the Galaxy server.
I wonder what could cause fastq_filter to fail. The equivalent fastx tool works, but it
lacks all the options of fastq_filter.
I'd be grateful for any hints to help me get fastq_filter to work on large fastq files.