Is parallelism not working with galaxy-central?
Hi all,
can anyone confirm that the parallelism feature is not working in galaxy-central? Parallelism is used by, among others, the blast+ wrappers.
The only error message I get is:
galaxy.jobs DEBUG 2013-03-17 16:08:27,729 (64174) Working directory for job is: /media/data/web/galaxy-central/database/job_working_directory/064/64174
galaxy.jobs.handler DEBUG 2013-03-17 16:08:27,750 (64174) Dispatching to tasks runner
galaxy.jobs.handler ERROR 2013-03-17 16:08:27,750 put(): (64174) Invalid job runner: tasks
galaxy.datatypes.metadata DEBUG 2013-03-17 16:08:28,079 Cleaning up external metadata files
galaxy.jobs.handler INFO 2013-03-17 16:08:28,120 (64174) Job dispatched
It's working fine for me with galaxy-dist.
Thanks!
Bjoern
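The "Invalid job runner: tasks" line in the log above is the kind of message a name-based dispatcher produces when a job asks for a runner that was never registered. The following is only a minimal sketch of that pattern, not Galaxy's handler code: the class `DispatcherSketch`, its `put` method signature, and the job id 64173 are hypothetical names chosen for illustration.

```python
import logging

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("dispatcher_sketch")


class DispatcherSketch:
    """Toy dispatcher mapping runner names to callables (hypothetical, not Galaxy's class)."""

    def __init__(self, runners):
        # A missing "tasks" key models a setup where that runner was never loaded.
        self.runners = dict(runners)

    def put(self, job_id, runner_name):
        """Hand the job to the named runner; return False if the name is unknown."""
        runner = self.runners.get(runner_name)
        if runner is None:
            log.error("put(): (%s) Invalid job runner: %s", job_id, runner_name)
            return False
        runner(job_id)
        return True


handled = []
dispatcher = DispatcherSketch({"local": handled.append})
ok_local = dispatcher.put(64173, "local")  # known runner: job is handed over
ok_tasks = dispatcher.put(64174, "tasks")  # unknown runner: error is logged
```

Returning a status from `put` lets the caller avoid reporting a job as dispatched when no runner accepted it; whether the real handler does this is not stated in the thread.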
On Sun, Mar 17, 2013 at 3:13 PM, Björn Grüning wrote:
Hi all,
can anyone confirm that the parallelism feature is not working in galaxy-central? Parallelism is used by, among others, the blast+ wrappers.
Which hg revision are you on from galaxy-central? It was working for me last time I tested, but my setup isn't fully up-to-date with the development repository.
(This makes a nice change from it being me reporting that something on the trunk broke the parallelism features - that's happened a couple of times now, sadly.)
Regards,
Peter
On Sunday, 17.03.2013, at 15:26 +0000, Peter Cock wrote:
Which hg revision are you on from galaxy-central? It was working for me last time I tested, but my setup isn't fully up-to-date with the development repository.
I'm on 9088:1aa6224b32c2. Which hg revision are you using? Maybe I can track down the commit. handler.py changed a lot recently :(
(This makes a nice change from it being me reporting that something on the trunk broke the parallelism features - that's happened a couple of times now, sadly.)
You're welcome! Bjoern
On Sun, Mar 17, 2013 at 3:40 PM, Björn Grüning <bjoern.gruening@pharmazie.uni-freiburg.de> wrote:
I'm on 9088:1aa6224b32c2. Which hg revision are you using? Maybe I can track down the commit. handler.py changed a lot recently :(
So yours is very up to date, just 20 hours ago according to bitbucket: https://bitbucket.org/galaxy/galaxy-central/commits/1aa6224b32c2
For the default branch (which I merge into my working branch), I last updated 24 days ago - over three weeks: https://bitbucket.org/galaxy/galaxy-central/commits/45f1d93124ad
$ hg id
45f1d93124ad
$ hg log -l 1 -b default
changeset: 9548:45f1d93124ad
parent: 9546:1adf6fdd9c49
user: InitHello <inithello@gmail.com>
date: Thu Feb 21 11:06:11 2013 -0500
summary: Changed to_safe_string to use markupsafe.escape for unsafe characters.
Assuming I'm right and parallelism was working fine with that commit, we've only narrowed this down to the last 24 days. If I update mine and it breaks, I'll let you know.
Peter
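With one revision believed good and a later one known bad, the offending commit can be located by bisection, which is what Mercurial's built-in `hg bisect` command automates. The sketch below shows only the search itself, assuming a linear, oldest-to-newest list of revisions with a single good-to-bad transition (real history with merges is not linear). The function name `first_bad_revision`, the integer revisions and the break point 73 are made up for illustration.

```python
def first_bad_revision(revisions, is_good):
    """Binary search for the first bad revision.

    Assumes revisions are ordered oldest to newest, revisions[0] is good,
    revisions[-1] is bad, and there is exactly one good-to-bad transition.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(revisions[mid]):
            lo = mid
        else:
            hi = mid
    return revisions[hi]


# Illustrative data only: 100 hypothetical revisions, breakage introduced at 73.
tested = []


def is_good(rev):
    tested.append(rev)  # record each revision that had to be checked
    return rev < 73


culprit = first_bad_revision(list(range(100)), is_good)
```

Each call to `is_good` stands for checking out a revision and running a job that uses parallelism, so halving the range keeps the number of manual checks small.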
I believe the following changeset fixes this problem.
https://bitbucket.org/jmchilton/galaxy-central-lwr/commits/3b6c209d2d078bad2...
I have incorporated the fix into pull request 138 that has other job runner fixes and enhancements.
-John
P.S. Peter's point about the tasked job runner being frequently broken is a good one, it has been broken more often than not over the last 6 months. Digging through Trello it is clear the same applies to composite datatypes. If only there was a big important feature that depended on these that we could push through to ensure these enhancements that are important to the community got more attention :).
On Sun, Mar 17, 2013 at 11:00 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
Assuming I'm right and parallelism was working fine with that commit, we've only narrowed this down to the last 24 days. If I update mine and it breaks, I'll let you know.
Peter
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
Hi John, works great, thanks for your effort!
Maybe it's possible to run a few selected toolshed tools in an automatic test suite before each release or after such large changes. We have really complex tools in the toolshed that use nearly all of the available features.
Have a nice weekend! Bjoern
A new next-stable branch on Galaxy central is currently scheduled for tomorrow (3/18/13) in preparation for the next Galaxy dist release, currently scheduled for 4/1/13. This means that the main tool shed will be updated tomorrow to this next-stable branch.
This branch includes a new tool shed test framework component that will discover all tool shed repositories that contain tools. It will inspect these repositories for tests defined for each contained tool, and for the test data files for the tools, which must be contained in a test-data directory within the repository. For those repositories that meet these criteria, the test framework will install the repository into a Galaxy instance and run the functional test defined for each tool. Information about the results of these tests is uploaded to the repository for users to see and review. Repositories that contain tools that do not have functional tests defined, or that are missing test data files in the required test-data directory, will be updated with information about the invalid functional tests.
This new test framework will be executed against the main Galaxy tool shed on a regular schedule by cron.
This release includes other features that will filter out invalid or unwanted repositories in the tool shed as well. These and other features will be discussed during the next IUC teleconference and will be documented in the tool shed wiki before the next Galaxy release currently scheduled for 4/1/13.
Greg Von Kuster
On Mar 17, 2013, at 2:25 PM, Björn Grüning wrote:
Maybe it's possible to run a few selected toolshed tools in an automatic test suite before each release or after such large changes. We have really complex tools in the toolshed that use nearly all of the available features.
Have a nice weekend! Bjoern
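The repository inspection Greg describes (find the tools, check each for defined tests, check for a test-data directory) can be pictured roughly as below. This is a hedged sketch and not the tool shed framework's code: the function `inspect_repository`, the flat directory layout and the file names are assumptions made for illustration. It relies only on the `<tool>` root element and the `<tests>`/`<test>` elements of Galaxy tool XML.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def inspect_repository(repo_dir):
    """Return (tools_with_tests, tools_without_tests, has_test_data).

    Hypothetical helper: only looks at *.xml files directly inside repo_dir.
    """
    with_tests, without_tests = [], []
    for name in sorted(os.listdir(repo_dir)):
        if not name.endswith(".xml"):
            continue
        try:
            root = ET.parse(os.path.join(repo_dir, name)).getroot()
        except ET.ParseError:
            continue  # not well-formed XML, so not treated as a tool
        if root.tag != "tool":
            continue  # some other XML file, not a tool definition
        if root.find("tests/test") is not None:
            with_tests.append(name)
        else:
            without_tests.append(name)
    has_test_data = os.path.isdir(os.path.join(repo_dir, "test-data"))
    return with_tests, without_tests, has_test_data


# Build a throwaway repository with one tested and one untested tool.
with tempfile.TemporaryDirectory() as repo:
    os.mkdir(os.path.join(repo, "test-data"))
    with open(os.path.join(repo, "tested.xml"), "w") as fh:
        fh.write('<tool id="a"><tests><test/></tests></tool>')
    with open(os.path.join(repo, "untested.xml"), "w") as fh:
        fh.write('<tool id="b"></tool>')
    result = inspect_repository(repo)
```

A repository would count as installable for testing only when the second list is empty and the test-data directory exists; the actual criteria used by the framework may be stricter.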
On Sun, Mar 17, 2013 at 10:35 PM, Greg Von Kuster <greg@bx.psu.edu> wrote:
A new next-stable branch on Galaxy central is currently scheduled for tomorrow ...
This branch includes a new tool shed test framework component that will discover all tool shed repositories that contain tools. ..., the test framework will install the repository into a Galaxy instance and run the functional test defined for each tool. ...
These Tool Shed developments are definitely going to be a big step forward for automated testing - but that doesn't address the problem at hand: the <parallelism> code using the TaskedJobWrapper would only get tested if it is enabled in the Galaxy system being tested. Inevitably, as more features are added to Galaxy itself, including more complicated job runners for different computer/grid back ends, covering them all will require a suite of Galaxy test servers. That's hard.
I can see why the <parallelism> code isn't enabled by default on the main public Galaxy - the job splitting and merging adds additional IO and compute load, so does not make sense on a large and heavily used cluster (overall it will only slow jobs down). I suspect it is more suited to local institute clusters which are not typically under full load - where it helps by making each individual job complete more quickly.
Perhaps there is a case for one of the Galaxy core team's development instances to run with <parallelism> enabled?
Regards,
Peter
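Peter's trade-off can be illustrated with a toy cost model. This is an assumption-laden sketch, not a measurement of Galaxy: the function `wall_time`, the fixed per-chunk split/merge overhead and all the numbers are invented. Chunks run in "waves" limited by the number of free cluster slots, and every chunk adds overhead.

```python
import math


def wall_time(work, chunks, free_slots, overhead_per_chunk):
    """Toy estimate of elapsed time for a job split into equal chunks.

    Assumes a fixed split/merge cost per chunk and that chunks run in waves
    of at most free_slots at a time (a simplification of real scheduling).
    """
    waves = math.ceil(chunks / free_slots)
    return overhead_per_chunk * chunks + waves * (work / chunks)


# Invented numbers: 600 s of work, 5 s of split/merge overhead per chunk.
unsplit = wall_time(600.0, chunks=1, free_slots=1, overhead_per_chunk=0.0)
idle_cluster = wall_time(600.0, chunks=10, free_slots=10, overhead_per_chunk=5.0)
busy_cluster = wall_time(600.0, chunks=10, free_slots=1, overhead_per_chunk=5.0)
```

Under these assumptions the split job finishes much sooner when ten slots are free, but takes longer than the unsplit job when only one slot is available, since the overhead is paid without any gain from running chunks side by side.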
Björn, Peter, and John,
Thanks for finding and fixing this issue -- the fix has been merged into galaxy-central and should be available with the next distribution release currently scheduled for April 1. We're also looking into adding parallelism tests to our normal testing procedure to prevent this happening in the future.
-Dannon
On Sun, Mar 17, 2013 at 8:18 PM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
Perhaps there is a case for one of the Galaxy core team's development instances to run with <parallelism> enabled?
Regards,
Peter
On Mon, Mar 18, 2013 at 4:52 PM, Dannon Baker <dannon.baker@gmail.com> wrote:
Björn, Peter, and John,
Thanks for finding and fixing this issue -- the fix has been merged into galaxy-central and should be available with the next distribution release currently scheduled for April 1. We're also looking into adding parallelism tests to our normal testing procedure to prevent this happening in the future.
-Dannon
Excellent news - thank you all, Peter
participants (5)
- Björn Grüning
- Dannon Baker
- Greg Von Kuster
- John Chilton
- Peter Cock