uploading gzip compressed datasets
Hi all - I'm uploading datasets into my local instance by importing from the filesystem w/o copying into Galaxy.
Importing of uncompressed files works properly. The files I'm importing are owned by user1 and are readable by everyone, including the galaxy user. The galaxy user does not have write permissions in the directory where the files are stored.
I have fastq files that are compressed with gzip and owned by user1. I don't want Galaxy to uncompress them, but it appears it is trying to during import, and subsequently fails because of permissions.
Is there a way I can force Galaxy to import the compressed files w/o uncompressing them? I think most of the mapping programs work on compressed fastq files, so I'd rather not uncompress them, to save space.
--
CONFIDENTIALITY NOTICE: This email communication may contain private, confidential, or legally privileged information intended for the sole use of the designated and/or duly authorized recipient(s). If you are not the intended recipient or have received this email in error, please notify the sender immediately by email and permanently delete all copies of this email including all attachments without reading them. If you are the intended recipient, secure the contents in a manner that conforms to all applicable state and/or federal requirements related to privacy and confidentiality of such information.
Ryan Golhar wrote:
Hi all - I'm uploading datasets into my local instance by importing from the filesystem w/o copying into Galaxy.
Importing of uncompressed files works properly. The files I'm importing are owned by user1 and are readable by everyone, including the galaxy user. The galaxy user does not have write permissions in the directory where the files are stored.
I have fastq files that are compressed with gzip and owned by user1. I don't want Galaxy to uncompress them, but it appears it is trying to during import, and subsequently fails because of permissions.
Is there a way I can force Galaxy to import the compressed files w/o uncompressing them? I think most of the mapping programs work on compressed fastq files, so I'd rather not uncompress them, to save space.
Galaxy doesn't currently have a way to know whether a tool supports compressed data or not - most do not, but as you mention, some do. Because of this, all files are decompressed on upload. To prevent this, you'd need to comment out the section of tools/data_source/upload.py that does the decompression, which is the code under:
    elif is_gzipped and is_valid:
        # We need to uncompress the temp_name file, but BAM files must remain compressed in the BGZF format
But this will cause problems with filetype autodetection and setting metadata, which may take a fair amount of time to work around.
--nate
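For anyone attempting the workaround, the autodetection problem mentioned above comes down to inspecting a file's contents while it is still compressed. The following is only a rough sketch of that idea and is not Galaxy's code: the helper names `is_gzipped` and `looks_like_fastq` are made up for illustration, and the FASTQ check assumes simple four-line records. It reads the gzip stream in memory, so it needs only read access to the file and writes nothing next to it.

```python
import gzip

# First two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def looks_like_fastq(path, max_records=10):
    """Hypothetical sniffer: check the first few records of a plain or
    gzip-compressed file for the four-line FASTQ layout.

    Assumes one sequence line and one quality line per record.
    """
    opener = gzip.open if is_gzipped(path) else open
    checked = 0
    # Text mode ("rt") decompresses on the fly; nothing is written to disk.
    with opener(path, "rt") as fh:
        for _ in range(max_records):
            header = fh.readline()
            if not header:
                break  # end of file
            seq = fh.readline()
            plus = fh.readline()
            qual = fh.readline()
            if not (header.startswith("@") and plus.startswith("+")):
                return False
            if len(seq.rstrip("\n")) != len(qual.rstrip("\n")):
                return False
            checked += 1
    return checked > 0
```

Calling `looks_like_fastq("reads.fastq.gz")` (the file name is just an example) would return True or False without modifying the file. Setting metadata such as sequence counts would still require streaming the whole file, which is part of why the workaround takes time.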
participants (2)
- Nate Coraor
- Ryan Golhar