How to run a pipeline on many datasets?
Dear Galaxy users,

I would like to do something that is quite simple in theory: I have configured a Galaxy pipeline on a local Galaxy server (installed on a Sun Grid Engine cluster), and I would like to run it on many datasets (several thousand, in one directory) and get the result files in another directory.

With the web interface, using libraries or not, I didn't find any solution.

Is there a simple solution? Or has anybody run into the same problem?

Sincerely yours,
--
Jean-François Dufayard
Research engineer - ARCAD project
CIRAD - Montpellier - France
In reply to Jean-François Dufayard (11/2/10 1:39 AM):

Hello Jean-Francois,

The Galaxy wiki page describing production setup should help you develop a solution, but please let us know if you need more help.

General: http://bitbucket.org/galaxy/galaxy-central/wiki/Home -> For tool developers and labs
Specific: http://bitbucket.org/galaxy/galaxy-central/wiki/Config/ProductionServer

Best!
Jen
Galaxy team
_______________________________________________ galaxy-user mailing list galaxy-user@lists.bx.psu.edu http://lists.bx.psu.edu/listinfo/galaxy-user
-- Jennifer Jackson http://usegalaxy.org
In reply to Jennifer Jackson (Nov 2, 2010, 10:02 PM):

Hi,

I'm also very interested in how to loop over multiple datasets. Although the info above is important for making a Galaxy server scale to serve many users simultaneously, I don't see how it helps with looping. You'll still need to manually configure the same tool 1000 times and start 1000 jobs if you want to analyze 1000 files. With the current web interface this ain't much fun... Or am I missing something?

Cheers,
Pi
------------------------------------------------------------------
Biomolecular Mass Spectrometry & Proteomics group
Utrecht University
phone: +31 6 143 66 783
email: pieter.neerincx@gmail.com
skype: pieter.online
visiting address: H.R. Kruyt building // room O607 Padualaan 8 // 3584 CH Utrecht // The Netherlands
mail address: P.O. box 80.082 // 3508 TB Utrecht // The Netherlands
------------------------------------------------------------------
In reply to Pieter Neerincx (11/3/10 3:25 AM):

Here is a follow-up to the original question that applies to yours as well.

--
You are correct in that the simplest approach for this would be to specify multiple inputs at runtime. This feature does not currently exist, but I'll be working on it soon. You can follow the ticket here:
http://bitbucket.org/galaxy/galaxy-central/issue/409/static-and-library-inpu...
-Dannon
--

Best,
Jen
Galaxy team
-- Jennifer Jackson http://usegalaxy.org
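Since specifying multiple inputs at runtime is not available yet, the looping Pieter describes would have to happen outside the web interface for now. Below is a minimal sketch of such a loop in Python, under stated assumptions: `run_batch`, `process_one`, and `copy_upper` are hypothetical names made up for illustration and are not part of Galaxy, and how a single dataset actually gets submitted to a Galaxy pipeline is deliberately left out. It only shows the directory-in, directory-out bookkeeping from the original question.

```python
from pathlib import Path
from typing import Callable, List


def run_batch(input_dir: str, output_dir: str,
              process_one: Callable[[Path, Path], None],
              suffix: str = ".out") -> List[Path]:
    """Call process_one(input_file, output_file) for every file in input_dir.

    process_one is a hypothetical placeholder for whatever runs the
    pipeline on a single dataset; it is not a Galaxy function.
    """
    src = Path(input_dir)
    dst = Path(output_dir)
    dst.mkdir(parents=True, exist_ok=True)

    results = []
    # Sort the inputs so that repeated runs visit files in the same order.
    for in_path in sorted(p for p in src.iterdir() if p.is_file()):
        out_path = dst / (in_path.name + suffix)
        if out_path.exists():
            # Skip datasets that already have a result, so an interrupted
            # batch of several thousand files can simply be restarted.
            continue
        process_one(in_path, out_path)
        results.append(out_path)
    return results


def copy_upper(in_path: Path, out_path: Path) -> None:
    """Stand-in step for demonstration only: upper-case the file content."""
    out_path.write_text(in_path.read_text().upper())
```

A call such as `run_batch("datasets", "results", copy_upper)` would write one result file per input file into `results`; replacing `copy_upper` with a function that runs the real per-dataset step is the part that depends on the local setup.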
participants (3)
- Jean-François Dufayard
- Jennifer Jackson
- Pieter Neerincx