Dataset's extra files
Hi all,

We have a local tool whose role is to transfer (i.e. copy) a dataset file to a directory on our NFS. This is extremely convenient, as it can be included in workflows and therefore saves the time of clicking the download button (we also have configurable renaming/compression as part of it). It is heavily used by our users.

The problem is with datasets that have associated files, like FASTQC output, as these extra files are simply ignored. We'd like to improve our 'NFS_transfer' tool so that it can deal with them in a similar fashion to the download button.

Foreseen solution:
* Check whether a directory named 'dataset_<id>_files' exists within the dataset store
* If so, 'cp -r' it into a tmp dir, and cp the dataset itself into the same tmp dir (with renaming on the fly)
* zip/tar.gz the tmp dir
* Copy it to the final NFS location

The question is: is this the right way to do it? As a non-Python specialist, I find it a little tricky to work out the right way to do it (I can't locate the piece of code that does this in Galaxy, i.e. behind the download button). In particular, can I get the list of extra files using the '$galaxyFile' object given in the tool by:

    <param type="data" name="galaxyFile" label="File to transfer"/>

i.e. in the same way we get the dataset name or file extension ($galaxyFile.dataset.name and $galaxyFile.ext)?

Any advice on how best to implement this, in a portable way, would be very much appreciated.

Thanks for your time,

Charles

=====================================
Charles Girardot
Head of Genome Biology Computational Support (GBCS)
European Molecular Biology Laboratory
Tel: +49 6221 387 -8585
Fax: +49-(0)6221-387-8166
Email: charles.girardot@embl.de
Room V205
Meyerhofstraße 1, 69117 Heidelberg, Germany
=====================================
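For illustration, the foreseen steps above could be sketched in Python roughly as follows. This is only a minimal sketch, not Galaxy's own download code: the function name bundle_dataset and all of its parameters are hypothetical, and it assumes the tool wrapper passes in the dataset path, the path of the extra files directory (if any), the NFS destination directory, and the new file name.

```python
import os
import shutil
import tarfile
import tempfile


def bundle_dataset(dataset_path, extra_files_dir, dest_dir, new_name):
    """Copy a dataset to dest_dir, bundling its extra files when present.

    Hypothetical helper: returns the path of the file written to dest_dir.
    """
    os.makedirs(dest_dir, exist_ok=True)

    # No extra files directory: plain copy with on-the-fly renaming.
    if not extra_files_dir or not os.path.isdir(extra_files_dir):
        target = os.path.join(dest_dir, new_name)
        shutil.copy2(dataset_path, target)
        return target

    # Stage the dataset and its extra files in a tmp dir, then archive it.
    with tempfile.TemporaryDirectory() as tmp:
        stage = os.path.join(tmp, new_name)
        os.makedirs(stage)
        shutil.copy2(dataset_path, os.path.join(stage, new_name))
        extra_name = os.path.basename(os.path.normpath(extra_files_dir))
        shutil.copytree(extra_files_dir, os.path.join(stage, extra_name))

        archive = os.path.join(tmp, new_name + ".tar.gz")
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(stage, arcname=new_name)

        # Copy the finished archive to the final (NFS) location.
        target = os.path.join(dest_dir, new_name + ".tar.gz")
        shutil.copy2(archive, target)
    return target
```

Building the archive in a temporary directory and copying it over only once it is complete mirrors the steps listed above; it also means a failed run is less likely to leave a partial archive in the destination directory.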
The directory can be obtained using $galaxyData.extra_files_path (be sure to check that it exists before zipping it up). I would discourage reusing Galaxy components directly from inside a tool or tool wrapper, but if you want to reference that code, it is actually inside the datatypes module: https://bitbucket.org/galaxy/galaxy-central/src/3fb927653301a0c06a0bf94f2b6b...

Hope this helps.

-John

On Tue, Mar 11, 2014 at 5:35 AM, Charles Girardot <charles.girardot@embl.de> wrote:
participants (2)
- Charles Girardot
- John Chilton