intended behaviour for multiple data sets
Dear list,

In addition to single datasets, tool file inputs allow choosing between:
- multiple datasets
- dataset collection
- repeat tags, which some tools use for multiple inputs

I would like to know what the intended behavior of Galaxy is for these three options when multiple datasets are involved. Is this described somewhere?

Is the tool applied to each of the multiple inputs separately, or to all of them at once? For some tools one case makes more sense than the other.

Best,
Matthias

--
Matthias Bernt
Bioinformatics Service
Molekulare Systembiologie (MOLSYB)
Helmholtz-Zentrum für Umweltforschung GmbH - UFZ / Helmholtz Centre for Environmental Research GmbH - UFZ
Permoserstraße 15, 04318 Leipzig, Germany
Phone +49 341 235 482296, m.bernt@ufz.de, www.ufz.de

Sitz der Gesellschaft/Registered Office: Leipzig
Registergericht/Registration Office: Amtsgericht Leipzig
Handelsregister Nr./Trade Register Nr.: B 4703
Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board: MinDirig Wilfried Kraus
Wissenschaftlicher Geschäftsführer/Scientific Managing Director: Prof. Dr. Dr. h.c. Georg Teutsch
Administrative Geschäftsführerin/Administrative Managing Director: Prof. Dr. Heike Graßmann
Hello,

How the data is consumed is explained on the tool form, in the data entry area. Please review how to interpret this below, and let us know if you are not sure about a particular tool. We might be able to explain how to access the help, or the tool might need an update to make the usage clearer.

Example 1: *Samtools Sort* > click on the Multiple Dataset or Dataset Collection icon and this new help text will be presented: "This is a batch mode input field. Separate jobs will be triggered for each dataset selection."

Example 2: *Samtools Mpileup* > multiple dataset selection is the default (one or more can be chosen), or click over to Collections, where one or more can be selected. No new help text is presented to warn about batch job mode. This means that the inputs are processed together in the same job.

More complex entry can be found on tools like *Compare two Datasets*, where there are two input sections and each can take a single or a multiple (batch) entry, either individually selected or in a collection. The batch mode help text comes up when multiple datasets or collections are selected. This is an expanded form of the *Sort* behavior above.

Even more complex entry can be found on tools like *MultiQC*, where one or more input sets can be selected and additional input set (Reports) sections can optionally be added. No batch mode text is shown, meaning that all data is run in the same job. This is an expanded form of the *Mpileup* behavior above: the inputs of each subsection are combined to produce a summary sub-report, then the results from each sub-report are combined into the final report.

Galaxy tutorials: https://galaxyproject.org/learn/

Hope that helps!

Jen

--
Jennifer Hillman-Jackson
Galaxy Application Support
http://usegalaxy.org
http://galaxyproject.org
http://biostar.usegalaxy.org

On Thu, Dec 14, 2017 at 3:34 AM, Matthias Bernt <m.bernt@ufz.de> wrote:
Is the tool applied to each of the multiple inputs separately, or to all of them at once? For some tools one case makes more sense than the other.
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: https://lists.galaxyproject.org/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/
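The two behaviors described in Jen's reply (batch mode versus all inputs in one job) can be pictured with a small model. This is only an illustrative sketch, not Galaxy's scheduling code; the function name `plan_jobs` and the `batch_mode` flag are assumptions made up for this example.

```python
from typing import List


def plan_jobs(selected: List[str], batch_mode: bool) -> List[List[str]]:
    """Return one list of input datasets per job that would be launched.

    Hypothetical model for illustration; it does not call Galaxy.
    """
    if batch_mode:
        # Batch mode (the Samtools Sort case): each selected dataset
        # gets its own separate job.
        return [[name] for name in selected]
    # Non-batch multiple input (the Samtools Mpileup case): a single
    # job receives all selected datasets together.
    return [list(selected)]


if __name__ == "__main__":
    bams = ["a.bam", "b.bam", "c.bam"]
    print(plan_jobs(bams, batch_mode=True))
    print(plan_jobs(bams, batch_mode=False))
```

With three selected datasets, the batch call yields three single-input jobs, while the non-batch call yields one job holding all three.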
Dear Jen,

thanks for the detailed info. I was asking more from a tool developer's perspective. But I found the answer: the multiple attribute of the input parameter tag. I had forgotten about it.

Best,
Matthias

On 14.12.2017 21:40, Jennifer Hillman-Jackson wrote:
Hello,
How the data is consumed is explained on the tool form in the data entry area. Please review how to interpret this below and let us know if you are not sure about a particular tool. We might be able to explain how to access the help, or the tool might need an update to make the usage clearer.
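From the tool developer's side, the answer Matthias mentions comes down to whether a data parameter in the tool XML carries multiple="true". The sketch below embeds a hypothetical tool fragment (the tool id, parameter names, and labels are invented for this example) and uses Python's standard XML parser to report how each data input would likely be consumed, assuming the behavior described in this thread: a plain data parameter runs as a batch when several datasets are chosen, while one marked multiple="true" passes them all to a single job.

```python
import xml.etree.ElementTree as ET

# Hypothetical tool fragment, written only for this illustration.
TOOL_XML = """
<tool id="example_tool" name="Example" version="0.1.0">
  <inputs>
    <param name="single_bam" type="data" format="bam" label="One BAM per job"/>
    <param name="all_bams" type="data" format="bam" multiple="true" label="All BAMs in one job"/>
  </inputs>
</tool>
"""


def describe_data_params(tool_xml: str) -> dict:
    """Map each data parameter name to the way its inputs are consumed."""
    root = ET.fromstring(tool_xml)
    result = {}
    for param in root.iter("param"):
        # Only dataset inputs are of interest here.
        if param.get("type") != "data":
            continue
        # The attribute is treated as false when it is absent.
        is_multiple = param.get("multiple", "false").lower() == "true"
        result[param.get("name")] = (
            "one job, all datasets" if is_multiple
            else "one job per dataset (batch)"
        )
    return result


if __name__ == "__main__":
    for name, mode in describe_data_params(TOOL_XML).items():
        print(f"{name}: {mode}")
```

Dataset collection and repeat inputs are left out of this sketch on purpose, since the thread does not spell out their behavior from the developer's perspective.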
participants (2)
- Jennifer Hillman-Jackson
- Matthias Bernt