To (I think) fix this, I changed line 50 in rgFastQC.py from
infname = self.opts.inputfilename
to
infname = self.opts.input
This will force FastQC to look at the "real" file and not the renamed
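For anyone hitting the same problem, here is a minimal sketch of a more defensive alternative: check the real file's content and give the symlink a matching name. This is not the code in rgFastQC.py. The function names is_gzipped and link_for_fastqc are hypothetical, and the sketch assumes FastQC decides whether to decompress based on the file name, which is what the failure in this thread suggests.

```python
import os

# First two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file starts with the gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def link_for_fastqc(dataset_path, display_name, workdir):
    """Symlink dataset_path into workdir under a name that matches its content.

    display_name is the (possibly extension-stripped) name shown in Galaxy.
    If the real file is gzipped but the name lacks .gz, the suffix is added
    so that a tool keying off the file name treats it as compressed.
    """
    name = os.path.basename(display_name)
    if is_gzipped(dataset_path) and not name.endswith(".gz"):
        name += ".gz"
    link_path = os.path.join(workdir, name)
    # Point at the resolved file so a chain of links does not matter.
    os.symlink(os.path.realpath(dataset_path), link_path)
    return link_path
```

Checking the content rather than trusting the name covers both cases discussed in this thread: a library dataset linked to a .gz file on disk, and an ordinary uncompressed upload.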
On Mon, Jan 12, 2015 at 12:20 PM, Ryan G <ngsbioinformatics(a)gmail.com>
Yes, I'm doing a link to file on file system when doing a library
Does this mean I should link to the uncompressed file?
On Mon, Jan 12, 2015 at 12:14 PM, Peter Cock <p.j.a.cock(a)googlemail.com>
> Ah. Then this is more subtle... are you using the
> library import option where Galaxy just symlinks
> to existing files? I thought that was not possible
> with gzipped files (for the reasons given below).
> Perhaps this is not being blocked, leading to the
> confused state you're seeing?
> On Mon, Jan 12, 2015 at 4:52 PM, Ryan G <ngsbioinformatics(a)gmail.com>
> > Galaxy is not decompressing the file. The file is linked to on the
> > filesystem.
> > On Mon, Jan 12, 2015 at 10:28 AM, Peter Cock <p.j.a.cock(a)googlemail.com>
> > wrote:
> >> Hi Ryan,
> >> The problem isn't Galaxy stripping the extension, rather
> >> Galaxy is actually decompressing the file as part of the
> >> upload process.
> >> Unfortunately (and there is an open Trello enhancement
> >> request on this), Galaxy does not support storing any of
> >> the defined datatypes in compressed form UNLESS they
> >> are defined that way (like BAM files).
> >> This has led some Galaxy Admins to define a new datatype
> >> lgzippedfastq (or similar - I'd have to check my old emails
> >> for the exact name used as a gzipped alternative to the
> >> Galaxy sangerfastq datatype) and then modified many/all
> >> their tools to handle this. That is a lot of work, but does
> >> offer big disk savings for this key datatype.
> >> The Galaxy team instead use a compressed file system,
> >> so for usegalaxy.org ALL their data files are compressed
> >> but Galaxy can ignore this complexity.
> >> Peter
> >> On Mon, Jan 12, 2015 at 3:15 PM, Ryan G <ngsbioinformatics(a)gmail.com>
> >> wrote:
> >> > Hi all - I've got a bunch of fastq files uploaded into a data library in
> >> > Galaxy. The underlying files are gzipped; however, Galaxy strips the .gz
> >> > from the filename and displays it as .fastq. When the python wrapper
> >> > rgFastQC.py gets called, it correctly sees the fastq.gz file. The wrapper
> >> > creates a
> >> > symbolic link to the .gz file in a tmp directory. The link is
> >> > When
> >> > FastQC tries to read this file, it fails because it's compressed. So one
> >> > of two things is going wrong here:
> >> >
> >> > 1) It looks like the wrapper is incorrectly renaming the file, but
> >> > using the name given to it in Galaxy.
> >> >
> >> > 2) When the file is uploaded into the data library, Galaxy is stripping
> >> > off the .gz extension.
> >> >
> >> > I think #2 is the more correct problem. How can I keep Galaxy from
> >> > stripping the .gz extension?