On Thu, Jul 4, 2013 at 9:49 PM, Robert Baertsch wrote:
Do these readers support gzip files?
reader = fastqVerboseErrorReader
reader = fastqReader
Presumably you are writing a Python script using this library?
The answer is a qualified yes. Rather than passing them a normal
file handle from open("example.fastq"), you pass them one from
gzip.open("example.fastq.gz"), after an import gzip.
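As a rough sketch of the idea, the snippet below swaps open() for gzip.open() in front of a reader that only expects a file handle. The read_fastq function is a hypothetical stand-in written for this example (it is not the Galaxy fastqReader, whose exact signature I have not reproduced here), and it assumes simple four-line FASTQ records. Note that on Python 3 gzip.open() defaults to binary mode, so text mode has to be requested with "rt".

```python
import gzip
import os
import tempfile


def read_fastq(handle):
    """Yield (title, sequence, quality) tuples from a text-mode handle.

    Hypothetical stand-in for a FASTQ reader; assumes four-line records.
    """
    while True:
        title = handle.readline().rstrip("\n")
        if not title:
            break  # end of file
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield title[1:], seq, qual  # drop the leading '@'


# Write a tiny gzipped FASTQ file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "example.fastq.gz")
with gzip.open(path, "wt") as out:
    out.write("@read1\nACGT\n+\nIIII\n")

# The only change from the uncompressed case: gzip.open instead of open.
with gzip.open(path, "rt") as handle:
    records = list(read_fastq(handle))
```

Any reader that only calls readline() or iterates over the handle should work the same way, since the object returned by gzip.open() behaves like an ordinary file handle.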
Do I have to define a special type in galaxy for gzipped files or
will the fastq type be ok?
This needs a special file format - but you are not the first person to
look at this, some groups have defined custom gzipped variants of
the FASTQ formats within their own Galaxy instances. I've not
done this but there should be some useful emails in the archive.
Note you'd also need to modify any tool definitions so that they
can accept a gzipped FASTQ file.
Ideally, I would like to keep my files zipped and not have galaxy
unzip them, since they triple in size when unzipped.
I'm happy to do a pull request if you don't support this, but I want
to make sure I'm in line with your roadmap.
Personally I would like a more general system in Galaxy for
potentially any file type to be held compressed in a range of
formats (e.g. using gzip, bgzf, xz, bz2, etc), with exclusions
for things like BAM which are already compressed. This way
naive tools would get the gzipped file uncompressed to a
temporary folder before use (i.e. no change for the tool wrapper),
but if a tool accepts a gzipped file it will get that (less disk IO
and CPU usage, but requires updating tool wrappers).
That idea is quite ambitious though ;)
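To make the idea concrete, here is a minimal sketch of the dispatch step, assuming gzip only and a filename ending in .gz. The path_for_tool function and its tool_accepts_gzip flag are hypothetical names invented for this illustration; nothing like this exists in Galaxy as far as I know.

```python
import gzip
import os
import shutil
import tempfile


def path_for_tool(dataset_path, tool_accepts_gzip, tmp_dir):
    """Return the path a tool should read (hypothetical helper, not Galaxy API).

    Gzip-aware tools get the compressed file as is; naive tools get an
    uncompressed copy written into tmp_dir.
    """
    if tool_accepts_gzip or not dataset_path.endswith(".gz"):
        return dataset_path
    name = os.path.basename(dataset_path)[:-3]  # strip the ".gz" suffix
    plain_path = os.path.join(tmp_dir, name)
    with gzip.open(dataset_path, "rb") as src, open(plain_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream, so large files are not held in memory
    return plain_path


# Build a small gzipped dataset to exercise both branches.
tmp_dir = tempfile.mkdtemp()
gz_path = os.path.join(tmp_dir, "reads.fastq.gz")
with gzip.open(gz_path, "wb") as out:
    out.write(b"@read1\nACGT\n+\nIIII\n")

aware = path_for_tool(gz_path, True, tmp_dir)   # unchanged compressed path
naive = path_for_tool(gz_path, False, tmp_dir)  # temporary uncompressed copy
```

A real implementation would also have to clean up the temporary copy after the job finishes and handle the other compression formats.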
I have written a simple tool to convert Illumina fastq to mapsplice
fastq. Does that already exist somewhere?
I don't know.