Uploading gzipped fastq files through the API
Hello,

I have been using the Galaxy API to upload files into a library (using a local folder for library import) and to run a workflow in a history. I have been following the example scripts that come with Galaxy.

When I upload a gzipped fastq file in autodetection mode (linking the file rather than copying it, to be faster), the file is not detected as fastq. Even if I explicitly say that it is a fastq, the file is uploaded as a fastq but with its contents still gzipped (so downstream analyses fail).

However, if I do this manually through the interface, the file is unzipped correctly.

Any idea how I can upload the gzipped fastq through the API so that it can be used properly?

Thank you.

=====================================
Daniel Sobral
Next Generation Sequencing Data Analyst
IGC - Instituto Gulbenkian de Ciencias
e-mail dsobral@igc.gulbenkian.pt
=====================================
Just an update: when I set it to copy, it works fine, as expected.

So I guess the only alternatives I have are:
- link the file, but as unzipped fastq
- copy the file (which copies the unzipped data to Galaxy)

It would be nice to have a way of working with gzipped fastq files, since most tools now work with them routinely. I'm probably missing something here?

Thanks,
Daniel

(In reply to dsobral@igc.gulbenkian.pt, 12/30/2013 10:18 PM.)
-- Daniel Sobral, Bioinformatics Unit Instituto Gulbenkian de Ciência
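The first alternative listed above (linking the file, but as unzipped fastq) could be scripted before the API call. The following is only a minimal sketch using the Python standard library; the helper names `is_gzipped` and `ensure_plain_fastq` are hypothetical and are not part of Galaxy or its example scripts.

```python
import gzip
import os
import shutil


def is_gzipped(path):
    """Return True if the file starts with the two gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"\x1f\x8b"


def ensure_plain_fastq(path, out_dir):
    """Return a path to an uncompressed copy of `path`.

    If the file is not gzipped, the original path is returned unchanged.
    Otherwise the data is decompressed into `out_dir` (hypothetical staging
    folder) and the new path is returned, ready to be linked into a library.
    """
    if not is_gzipped(path):
        return path
    name = os.path.basename(path)
    # Drop a trailing ".gz" so the result is named e.g. reads.fastq
    name = name[:-3] if name.endswith(".gz") else name + ".unzipped"
    out_path = os.path.join(out_dir, name)
    with gzip.open(path, "rb") as source, open(out_path, "wb") as target:
        shutil.copyfileobj(source, target)  # streams in chunks, low memory use
    return out_path
```

This keeps the link-instead-of-copy behaviour on the Galaxy side, at the cost of storing an uncompressed copy on the local filesystem.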
Can you set a more specific datatype? This is a hack from a heavy Galaxy user and developer:

"What I and others are doing is hacking the library file path uploads functionality. If you upload by file path, select your compressed fastq files, and set data type to fastqsanger (*this is the key step*), you can use these files in compressed format assuming a tool is smart enough to decompress based on file extension because the path is passed directly to the tool."

Galaxy needs to support compressed files directly but does not :(. There is an open Trello card on this, created by Peter Cock (https://trello.com/c/3RkTDnIn), as well as a more detailed follow-up e-mail (http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-November/017528.html).

My own preference for how to implement this would be to bring in the concept of implicit datatypes from galaxy-extras (https://bitbucket.org/msiappdev/galaxy-extras) and implement plugins to create implicit datatypes for each kind of compression (fastq -> fastq.gz).

I am not sure any of this is an immediate priority for the Galaxy team though - we compress things at the filesystem level, so this is less important for Galaxy main. But if people are passionate about this, I would encourage them to vote on the Trello card.

-John

(In reply to Daniel Sobral <dsobral@igc.gulbenkian.pt>, Thu, Jan 2, 2014 at 8:06 AM.)
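For readers scripting this, the file-path upload with an explicit datatype described in the reply above might look roughly like the sketch below. The payload field names (`upload_option`, `filesystem_paths`, `link_data_only`, and so on) are recalled from the example scripts shipped with Galaxy and should be treated as assumptions to check against your Galaxy version; the function names are hypothetical, and the code is written for Python 3 rather than the Python 2 used by the scripts of that era.

```python
import json
import urllib.request


def build_library_upload_payload(folder_id, path, file_type="fastqsanger", link=True):
    """Build the request body for a library upload by filesystem path.

    Setting file_type explicitly (instead of "auto") is the key step of the
    workaround; link=True asks Galaxy to link the file rather than copy it.
    Field names are assumptions based on Galaxy's example API scripts.
    """
    return {
        "folder_id": folder_id,
        "create_type": "file",
        "upload_option": "upload_paths",
        "filesystem_paths": path,
        "file_type": file_type,
        "dbkey": "",
        "link_data_only": "link_to_files" if link else "copy_files",
    }


def post_library_upload(api_url, api_key, library_id, payload):
    """POST the payload to the library contents endpoint (assumed URL layout)."""
    url = "%s/libraries/%s/contents?key=%s" % (api_url.rstrip("/"), library_id, api_key)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Note that a file linked this way stays compressed on disk, so it is only usable by tools that decompress their input themselves, as the quoted hack points out.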