Launching multiple jobs using one tool form with multiple selected datasets
Hi everyone,

I was wondering what would be the way in Galaxy to program the following:

- User clicks on a tool and the form is displayed
- They use a multi-select menu in the form to pick, let's say, X datasets from their history
- When they click submit, the tool launches X jobs in the history, one for each of the datasets selected.

I have a common use case where users have to manually run the same tool over and over again with the same parameters for each dataset of interest in their history. It would be great to be able, programmatically or otherwise, to use one form, multi-select the datasets, and then launch the parallel jobs in one go.

regards, Leandro
I've essentially asked the same question of the list in the past and gotten no real response.

I have the same interest, but from a workflow perspective:

* A module that allows me to select multiple data files (say fastq files)
* Then pass each data file to a separate instance of a workflow that runs tophat and cufflinks
* Then another module that takes the final outputs of each of the workflow runs and sends them to a final module that merges the results.

I am attempting to implement something like this using the API, though the API is still pretty green from my perspective.

I think functionality like this built into the workflow editor would be a great addition.

Dave

(In reply to Leandro Hermida's message of Apr 15, 2011, 8:14 AM.)
Hi,

You're not alone with this request! Unfortunately I wasn't able to join the 2-day Galaxy Hackathon, but I heard from a colleague who just came back that this was one of the topics they worked on. At the end of the hackathon they had a working prototype:

https://wiki.nbic.nl/index.php/NBIC_Galaxy_Hackathon_project#Loop_over_files...

Since they had only 2 days, it will need some polish, but I heard this hack already made it into the development branch of Galaxy, so we may see something like this in the near future :)

Cheers,
Pi

(In reply to Dave Walton's message of Apr 15, 2011, 4:28 PM.)
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
------------------------------------------------------------- mobile: +31 6 143 66 783 e-mail: pieter.neerincx@gmail.com skype: pieter.online -------------------------------------------------------------
Is the server below down? I'm trying to get there this morning and having no luck...

https://wiki.nbic.nl/index.php/NBIC_Galaxy_Hackathon_project#Loop_over_files_in_a_directory

Dave

(In reply to Pieter Neerincx's message of 4/15/11, 12:16 PM.)
Hi again,

Sorry if I don't quite understand how to get started, but I'm trying for something even without the added complexity of a workflow.

If you could point me in the right direction: how would I have a single tool where the tool form has a multi-select of certain datasets in the user's history? The user multi-selects X datasets in the form and hits submit, and Galaxy launches X tool jobs (one for each of the datasets) at the same time in the history.

How do I go about doing this?

Sorry for being daft, Leandro

(In reply to Dave Walton's message of Tue, Apr 19, 2011, 4:34 PM.)
Leandro,

I see what you mean; I misunderstood your original goal. There currently isn't a way to execute single tools in this fashion.

It isn't exactly straightforward, but you could construct a workflow that consists of two steps, an Input Dataset step and whatever tool you want to use, and then you'd be able to use the functionality described in this thread, as long as your instance is at 5386:67a19816034b or higher.

-Dannon

(In reply to Leandro Hermida's message of Apr 21, 2011, 7:49 AM.)
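The two-step workflow workaround amounts to a fan-out: one workflow run per selected dataset, with every other setting held the same. The following is a minimal Python sketch of that idea only, not Galaxy code. The function name `build_run_payloads`, the payload keys (`workflow_id`, `history`, `ds_map`), the `hda` source label, and all ids are assumptions made for illustration and should be checked against your own instance. Nothing is sent to a server here.

```python
import json


def build_run_payloads(workflow_id, history_name, input_step, dataset_ids):
    """Return one workflow-run payload per dataset.

    All key names are illustrative assumptions, not a documented schema.
    """
    payloads = []
    for ds_id in dataset_ids:
        payloads.append({
            "workflow_id": workflow_id,
            "history": history_name,
            # Map the workflow's single Input Dataset step to this dataset.
            "ds_map": {str(input_step): {"src": "hda", "id": ds_id}},
        })
    return payloads


if __name__ == "__main__":
    selected = ["ds_a", "ds_b", "ds_c"]  # hypothetical dataset ids
    for payload in build_run_payloads("wf_1", "batch run", 0, selected):
        print(json.dumps(payload, sort_keys=True))
```

Each payload differs only in the dataset mapped to the input step, which mirrors the request in this thread: same tool, same parameters, X datasets, X runs.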
Hi Dannon,

Thanks for your replies and advice. This leads me to a question: is it possible to execute a tool job via the Galaxy API, just like someone does via the UI? Could I then have a tool script that wraps this programmatic execution of X tool jobs via the API?

regards, Leandro

(In reply to Dannon Baker's message of Thu, Apr 21, 2011, 2:40 PM.)
Not currently, though the API is being continually extended with new features as the need arises.

-Dannon

(In reply to Leandro Hermida's message of Apr 27, 2011, 9:40 AM.)
Hi again,

Can one upload a file programmatically via the API, just like one does using the UI?

best, Leandro

(In reply to Dannon Baker's message of Wed, Apr 27, 2011, 3:43 PM.)
Hi Dave,

Thanks for the feedback. To me this would be a very useful feature: being able to spawn multiple identical jobs or workflows from one tool form, using different input datasets selected with a multi-select and with all other form parameters being the same.

Where are the docs and information for the Galaxy API? I'm wondering if I could program this exact functionality in the script that the tool would execute.

thanks, Leandro

(In reply to Dave Walton's message of Fri, Apr 15, 2011, 4:28 PM.)
Leandro,

As far as I can tell there are no real docs for the API. The best I’ve been able to find is what’s under the scripts/api directory. I believe the API is still considered a “beta” at best.

Galaxy team, are there any updates on the API, documentation plans, etc?

Thanks,
Dave

(In reply to Leandro Hermida's message of 4/15/11, 1:14 PM.)
You are correct in that, so far, there isn't any formal documentation of the API, and it should still be considered a beta feature. There is a workflow API, and the best way to see how it works right now is, as you mentioned, to browse the scripts/api directory for usage examples or to read the API source itself. The two relevant sample scripts for workflows are workflow_execute.py and, for a more involved demo, example_watch_folder.py, which uploads files from a regular system folder to a data library within Galaxy and then executes a workflow on them.

As Pieter mentioned earlier in the thread, we made significant progress towards addressing this within the standard Galaxy UI at the recent hackathon, and it has been committed to galaxy-central. Depending on your needs, you might not need to use the API at all. If you'd like to give it a try, the relevant changeset in galaxy-central is 5385:b9cff3fbd170, which is also on test.g2.bx.psu.edu. Feedback is definitely welcome.

Thanks,
-Dannon

(In reply to Dave Walton's message of 04/18/2011, 10:56 AM.)
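Since the API had no formal documentation at this point, the sample scripts under scripts/api remain the authoritative reference. As a rough illustration of what submitting a workflow run over HTTP could look like, here is a hedged Python sketch that only builds the request and does not send it. The endpoint path `/api/workflows`, the `key` query parameter, the payload contents, and the function name `make_workflow_request` are all assumptions, not confirmed by this thread. Verify them against the scripts on your own instance.

```python
import json
import urllib.parse
import urllib.request


def make_workflow_request(galaxy_url, api_key, payload):
    """Build (but do not send) a JSON POST request for a workflow run."""
    # Endpoint path and `key` query parameter are assumptions; verify locally.
    endpoint = galaxy_url.rstrip("/") + "/api/workflows"
    url = endpoint + "?" + urllib.parse.urlencode({"key": api_key})
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical server address, key, and payload, for illustration only.
    req = make_workflow_request(
        "http://localhost:8080/", "my_api_key", {"workflow_id": "wf_1"}
    )
    print(req.get_method(), req.full_url)
```

To actually submit, the request object would be passed to `urllib.request.urlopen`. Keeping construction separate from sending makes the request easy to inspect before anything touches a real server.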
Hi,

Our production pipeline does this on LSF through job arrays. It would be good if Galaxy supported job arrays.

Marina

(In reply to Leandro Hermida's message of 15/04/2011, 13:14.)
-- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
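For readers unfamiliar with job arrays: in LSF, a single `bsub` call with a name of the form `name[1-N]` submits N elements, and each element can read its own index from the `LSB_JOBINDEX` environment variable. The Python sketch below only assembles such a command line. It is not how the pipeline mentioned above or Galaxy works, and the helper name `job_array_command`, the wrapper script `run_one.sh`, and the log path are hypothetical.

```python
import shlex


def job_array_command(job_name, n_tasks, task_command,
                      log_pattern="logs/%J.%I.out"):
    """Return the argv list for one `bsub` call that submits an n-task array."""
    if n_tasks < 1:
        raise ValueError("a job array needs at least one task")
    # LSF array syntax: name[start-end]; %J is the job id, %I the array index.
    array_spec = "{}[1-{}]".format(job_name, n_tasks)
    return ["bsub", "-J", array_spec, "-o", log_pattern, task_command]


if __name__ == "__main__":
    # Hypothetical wrapper script that picks its input using $LSB_JOBINDEX.
    argv = job_array_command("tool_batch", 3, "./run_one.sh")
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The appeal for the use case in this thread is that one submission covers all X datasets, with the array index selecting which dataset each element processes.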
participants (5)
- Dannon Baker
- Dave Walton
- Leandro Hermida
- Marina Gourtovaia
- Pieter Neerincx