Dan (resending since I received a message bounce from list)
On Oct 6, 2015, at 9:59 AM, Daniel Blankenberg <dan@bx.psu.edu> wrote:
Hi Lance,
I looked at this a while ago and had similar concerns, particularly that the inputs and outputs are not well-defined. In addition to the output tarball round trip (download locally, extract, re-upload) not being great, as you mention, the input datatypes, etc., could use some work. At the very least, we should definitely create a nice biom datatype and have some converters available (import and export).
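To make the biom datatype and converter idea a bit more concrete, here is a rough sketch of a format check plus an export to tab-separated text. It assumes the JSON-based BIOM 1.0 layout (top-level `format`, `rows`, `columns`, `shape`, `data` keys) and does not cover HDF5-based BIOM files. The function names `looks_like_biom_json` and `biom_to_tsv` are hypothetical, and nothing here uses Galaxy's actual datatype classes; it only illustrates the kind of logic a sniffer and a converter might wrap.

```python
import json

# Top-level keys assumed to be present in a BIOM 1.0 JSON table.
REQUIRED_KEYS = {"format", "rows", "columns", "shape", "data"}


def looks_like_biom_json(text):
    """Return True if `text` parses as JSON and resembles a BIOM 1.0 table."""
    try:
        doc = json.loads(text)
    except ValueError:
        return False
    if not isinstance(doc, dict) or not REQUIRED_KEYS.issubset(doc):
        return False
    return str(doc["format"]).startswith("Biological Observation Matrix")


def biom_to_tsv(doc):
    """Convert a parsed BIOM 1.0 JSON document to tab-separated text."""
    row_ids = [row["id"] for row in doc["rows"]]
    col_ids = [col["id"] for col in doc["columns"]]
    n_rows, n_cols = doc["shape"]
    if doc.get("matrix_type") == "sparse":
        # Sparse data is a list of [row_index, column_index, value] triples.
        table = [[0] * n_cols for _ in range(n_rows)]
        for i, j, value in doc["data"]:
            table[i][j] = value
    else:
        # Dense data is already a list of rows.
        table = doc["data"]
    lines = ["#OTU ID\t" + "\t".join(col_ids)]
    for row_id, values in zip(row_ids, table):
        lines.append(row_id + "\t" + "\t".join(str(v) for v in values))
    return "\n".join(lines) + "\n"
```

A real datatype would also need the reverse (import) direction and some handling of row and column metadata, which this sketch ignores.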
It is definitely worth spending some extra time to make sure that the data flows well between the different parts/tools, and even better to make sure that it's done in a way that allows mixing and matching with other non-QIIME tools.
One thing that we want to avoid is large amounts of manual massaging of the automatically generated XML; fixing things up once might not be too bad, but having to do it with each new tool version can be "frustrating". That said, perhaps having a good starting point and only needing to make manual modifications for updates could be good enough (I'm not familiar enough with the extent of typical changes between QIIME versions to judge how much work that would be).
Dan
On Oct 5, 2015, at 5:26 PM, Lance Parsons <lparsons@princeton.edu> wrote:
I was recently asked if I could provide a QIIME analysis pipeline for 16S data in Galaxy using tools in the QIIME pipeline (http://qiime.org/).
I did a bit of looking around for existing Galaxy wrappers and found an application that generates the wrappers for QIIME scripts for Galaxy (https://github.com/qiime/qiime-galaxy). This is a very well written application that does a great job of wrapping the QIIME scripts for Galaxy. However, there are a few things about it that don't quite fit my needs.
1. The tools output tgz files of all of the output files. This means that to execute a pipeline, the user would have to download the tgz files, untar them, and then upload whichever file(s) are needed for the next step.
2. There is no toolshed repository to install the dependencies for these tools, making it tricky for administrators to automate installation and to maintain various versions of QIIME going forward.
3. There are no toolshed versions of the tools themselves, which also makes installation and integration a bit tricky and makes it hard for me to create and manage updates, fixes, tweaks, etc. There are also no tests.
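For the first point, the manual download/untar/upload step could in principle be replaced by a small unpacking step that exposes each file in the archive separately. The following is only a sketch using Python's standard `tarfile` module; the function name `unpack_outputs` and the idea of returning a name-to-path mapping are assumptions for illustration, not part of qiime-galaxy or Galaxy.

```python
import os
import tarfile


def unpack_outputs(archive_path, dest_dir):
    """Extract regular files from a .tgz archive into dest_dir.

    Returns a dict mapping each archive member name to its extracted path.
    """
    os.makedirs(dest_dir, exist_ok=True)
    dest_root = os.path.realpath(dest_dir)
    extracted = {}
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            # Only plain files are of interest; skip directories, links, etc.
            if not member.isfile():
                continue
            target = os.path.realpath(os.path.join(dest_root, member.name))
            # Skip members whose path would land outside dest_dir.
            if os.path.commonpath([dest_root, target]) != dest_root:
                continue
            archive.extract(member, dest_root)
            extracted[member.name] = target
    return extracted
```

Each extracted path could then be registered as its own output, so that a downstream tool receives the specific file it needs rather than the whole archive.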
For these reasons I decided to investigate the feasibility of using the generated wrappers as a basis for a "toolshed" version of QIIME. If anyone is interested in helping, or has suggestions, or is working on something related, I'd be very happy to collaborate.
The repository for the WIP is at https://github.com/lparsons/galaxy_tools/tree/qiime/tools/qiime1.9.0. There is also a package on the testtoolshed as well as a first pass at package_qiime_1_9_1 (https://github.com/lparsons/galaxy_tools/tree/qiime/packages/package_qiime_1...).
--
Lance Parsons - Scientific Programmer
134 Carl C. Icahn Laboratory
Lewis-Sigler Institute for Integrative Genomics
Princeton University