To (I think) fix this, I changed line 50 in rgFastQC.py from
infname = self.opts.inputfilename

to
infname = self.opts.input

This will force FastQC to look at the "real" file and not the renamed dataset.
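
For anyone who would rather keep the renamed link, here is a rough sketch of an alternative: check the gzip magic bytes of the real file and give the symlink a matching .gz suffix. It assumes, as this thread suggests, that FastQC decides whether to decompress from the file name. The helper names (is_gzipped, link_with_matching_extension) are hypothetical and are not part of rgFastQC.py.

```python
import os

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file starts with the two-byte gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def link_with_matching_extension(real_path, display_name, workdir):
    """Symlink real_path into workdir under display_name.

    Hypothetical helper: if the real file is gzipped but the display
    name lacks a .gz suffix, append one so the link name reflects the
    actual content.
    """
    link_name = os.path.basename(display_name)
    if is_gzipped(real_path) and not link_name.endswith(".gz"):
        link_name += ".gz"
    link_path = os.path.join(workdir, link_name)
    os.symlink(os.path.abspath(real_path), link_path)
    return link_path
```

With a gzipped dataset shown in Galaxy as reads.fastq, the link would be created as reads.fastq.gz; an uncompressed dataset keeps its name unchanged.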


On Mon, Jan 12, 2015 at 12:20 PM, Ryan G <ngsbioinformatics@gmail.com> wrote:
Yes, I'm doing a link to file on file system when doing a library import.  Does this mean I should link to the uncompressed file?

On Mon, Jan 12, 2015 at 12:14 PM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
Ah. Then this is more subtle... are you using the
library import option where Galaxy just symlinks
to existing files? I thought that was not possible
with gzipped files (for the reasons given below).
Perhaps this is not being blocked, leading to the
confused state you're seeing?

Peter

On Mon, Jan 12, 2015 at 4:52 PM, Ryan G <ngsbioinformatics@gmail.com> wrote:
> Galaxy is not decompressing the file.  The file is linked to on the
> filesystem.
>
> On Mon, Jan 12, 2015 at 10:28 AM, Peter Cock <p.j.a.cock@googlemail.com>
> wrote:
>>
>> Hi Ryan,
>>
>> The problem isn't Galaxy stripping the extension, rather
>> Galaxy is actually decompressing the file as part of the
>> upload process.
>>
>> Unfortunately (and there is an open Trello enhancement
>> request on this), Galaxy does not support storing any of
>> the defined datatypes in compressed form UNLESS they
>> are defined that way (like BAM files).
>>
>> This has led some Galaxy Admins to define a new datatype
>> lgzippedfastq (or similar - I'd have to check my old emails
>> for the exact name used as a gzipped alternative to the
>> Galaxy sangerfastq datatype) and then modified many/all
>> their tools to handle this. That is a lot of work, but does
>> offer big disk savings for this key datatype.
>>
>> The Galaxy team instead use a compressed file system,
>> so for usegalaxy.org ALL their data files are compressed
>> but Galaxy can ignore this complexity.
>>
>> Peter
>>
>> On Mon, Jan 12, 2015 at 3:15 PM, Ryan G <ngsbioinformatics@gmail.com>
>> wrote:
>> > Hi all - I've got a bunch of fastq files uploaded into a data library in
>> > Galaxy.  The underlying files are gzipped; however, Galaxy strips the .gz
>> > from
>> > the filename and displays it as .fastq.  When the python wrapper
>> > rgFastQC.py
>> > gets called, it correctly sees the fastq.gz file.  The wrapper creates a
>> > symbolic link to the .gz file in a tmp directory.  The link is .fastq.
>> > When
>> > FastQC tries to read this file, it fails because it's compressed.  So one
>> > of
>> > two things is going wrong here:
>> >
>> > 1)  It looks like the wrapper is incorrectly renaming the file, but it's
>> > using the name given to it in Galaxy.
>> >
>> > 2)  When the file is uploaded into the data library, Galaxy is stripping
>> > off
>> > the .gz extension.
>> >
>> > I think #2 is the more correct problem.  How can I keep Galaxy from
>> > stripping the .gz extension?
>> >