Metadata indexing slow on large files

14 May 2010

      We are working with some very large sequence libraries (paired end 70M
reads+ each end - 15Gbx2). We already know what the file types are and that
they are appropriate for the  pipeline. There seems to be a large amount of
the processing effort expended after the completion after each step in the
workflow  analysing the files and determining their attributes - this is
related to the size of the files (which are large at every step) and is of
no practical use to us, except maybe on the final step.

Is there any way to suppress post processing steps and simply accept the
file as specified in the tool output tags? How can we reduce or eliminate
verification/indexing on metadata tags - what implications should we be
aware of.

Thanks

dennis

Dennis Gascoigne

Nate Coraor

tags

participants (2)