Workflow Fails with persistence issue?

16 Sep 2010

Hi! I have a pretty simple NGS workflow set up. I'm running Galaxy's MACS
wrapper and pushing its output to MEME, a motif finder, in hopes of finding
overrepresented TF motifs.

Running MACS and then running MEME once the MACS job is complete works just
fine, but when I extract the workflow or create my own workflow and link the
pieces together, I get a strange SQLAlchemy persistence error when MACS is
run. MACS completes successfully and I can see its output, but the jobs that
follow in the workflow stay grayed out forever. I can still run other jobs,
so I don't think the dispatcher died. Here is the error message:

69.234.135.61 - - [16/Sep/2010:22:08:44 -0700] "GET /workflow/list_for_run HTTP/1.1" 200 - "http://iron.ics.uci.edu:8080/root/tool_menu" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.0.4) Gecko/2008102920 Firefox/3.0.4"
169.234.135.61 - - [16/Sep/2010:22:08:46 -0700] "GET /workflow/run?id=1cd8e2f6b131e891 HTTP/1.1" 200 - "http://iron.ics.uci.edu:8080/workflow/list_for_run" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.0.4) Gecko/2008102920 Firefox/3.0.4"
galaxy.jobs DEBUG 2010-09-16 22:09:09,483 dispatching job 33 to local runner
galaxy.jobs INFO 2010-09-16 22:09:09,725 job 33 dispatched
galaxy.jobs.runners.local DEBUG 2010-09-16 22:09:10,159 executing: python /home/galaxy/galaxy_dist/tools/peak_calling/macs_wrapper.py /home/galaxy/galaxy_dist/database/job_working_directory/33/tmpSvTS8S /home/galaxy/galaxy_dist/database/files/000/dataset_84.dat /home/galaxy/galaxy_dist/database/files/000/dataset_87.dat /home/galaxy/galaxy_dist/database/job_working_directory/33/dataset_87_files
galaxy.jobs ERROR 2010-09-16 22:09:11,379 failure running job 34
Traceback (most recent call last):
  File "/home/galaxy/galaxy_dist/lib/galaxy/jobs/__init__.py", line 186, in __monitor_step
    self.sa_session.refresh( job )
  File "/home/galaxy/galaxy_dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/scoping.py", line 127, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/home/galaxy/galaxy_dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/session.py", line 926, in refresh
    self._validate_persistent(state)
  File "/home/galaxy/galaxy_dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/session.py", line 1236, in _validate_persistent
    mapperutil.state_str(state))
InvalidRequestError: Instance '<Job at 0x5e04f90>' is not persistent within this Session
galaxy.jobs ERROR 2010-09-16 22:09:11,925 failure running job 35
Traceback (most recent call last):
  File "/home/galaxy/galaxy_dist/lib/galaxy/jobs/__init__.py", line 186, in __monitor_step
    self.sa_session.refresh( job )
  File "/home/galaxy/galaxy_dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/scoping.py", line 127, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/home/galaxy/galaxy_dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/session.py", line 926, in refresh
    self._validate_persistent(state)
  File "/home/galaxy/galaxy_dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/session.py", line 1236, in _validate_persistent
    mapperutil.state_str(state))
InvalidRequestError: Instance '<Job at 0x5e57d10>' is not persistent within this Session
galaxy.jobs ERROR 2010-09-16 22:09:12,666 failure running job 36
Traceback (most recent call last):
  File "/home/galaxy/galaxy_dist/lib/galaxy/jobs/__init__.py", line 186, in __monitor_step
    self.sa_session.refresh( job )
  File "/home/galaxy/galaxy_dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/scoping.py", line 127, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/home/galaxy/galaxy_dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/session.py", line 926, in refresh
    self._validate_persistent(state)
  File "/home/galaxy/galaxy_dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/session.py", line 1236, in _validate_persistent
    mapperutil.state_str(state))
InvalidRequestError: Instance '<Job at 0x5e4b350>' is not persistent within this Session
169.234.135.61 - - [16/Sep/2010:22:09:08 -0700] "POST /workflow/run?id=1cd8e2f6b131e891 HTTP/1.1" 200 - "http://iron.ics.uci.edu:8080/workflow/run?id=1cd8e2f6b131e891" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.0.4) Gecko/2008102920 Firefox/3.0.4"
...  a few job failures later
galaxy.jobs.runners.local DEBUG 2010-09-16 22:09:15,495 execution finished: python /home/galaxy/galaxy_dist/tools/peak_calling/macs_wrapper.py /home/galaxy/galaxy_dist/database/job_working_directory/33/tmpSvTS8S /home/galaxy/galaxy_dist/database/files/000/dataset_84.dat /home/galaxy/galaxy_dist/database/files/000/dataset_87.dat /home/galaxy/galaxy_dist/database/job_working_directory/33/dataset_87_files
galaxy.jobs DEBUG 2010-09-16 22:09:16,192 job 33 ended

It seems that the subsequent jobs (34, 35, etc) aren't waiting for MACS to
complete.
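
One way to check that ordering is to pull the job events out of the log and compare timestamps. The following is only a rough sketch: the regular expressions and the helper name job_events are assumptions made for illustration (they are not part of Galaxy), and the sample text is just the galaxy.jobs lines copied from the log above.

```python
import re
from datetime import datetime

# Assumed shape of a Galaxy log line, e.g.
# "galaxy.jobs ERROR 2010-09-16 22:09:11,379 failure running job 34"
LINE_RE = re.compile(
    r"^(?P<logger>galaxy\.\S+) (?P<level>[A-Z]+) "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (?P<msg>.*)$"
)
JOB_RE = re.compile(r"\bjob (\d+)\b")


def job_events(log_text):
    """Return (timestamp, job_id, message) tuples sorted by time (hypothetical helper)."""
    events = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # access-log lines, traceback lines, etc.
        job = JOB_RE.search(m.group("msg"))
        if not job:
            continue  # log line that does not name a job id
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        events.append((ts, int(job.group(1)), m.group("msg")))
    return sorted(events)


sample = """\
galaxy.jobs DEBUG 2010-09-16 22:09:09,483 dispatching job 33 to local runner
galaxy.jobs INFO 2010-09-16 22:09:09,725 job 33 dispatched
galaxy.jobs ERROR 2010-09-16 22:09:11,379 failure running job 34
galaxy.jobs ERROR 2010-09-16 22:09:11,925 failure running job 35
galaxy.jobs ERROR 2010-09-16 22:09:12,666 failure running job 36
galaxy.jobs DEBUG 2010-09-16 22:09:16,192 job 33 ended
"""

events = job_events(sample)
# Time at which the MACS job (33) reported "ended".
ended_33 = next(ts for ts, job, msg in events if job == 33 and msg.endswith("ended"))
# Jobs whose failure was logged before job 33 ended.
early_failures = [job for ts, job, msg in events
                  if msg.startswith("failure running") and ts < ended_33]
print(early_failures)  # [34, 35, 36]
```

On these lines, the failures for jobs 34 through 36 are all logged several seconds before job 33 reports that it ended, which is what makes me think the downstream jobs are being handled before MACS finishes.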

FYI: I'm running a fairly recent checkout of galaxy_dist (not more than a
month old). I'm currently using Postgres 8.4.4, but I saw the same issue
earlier on the default sqlite3 database. The machine runs Ubuntu 10.04, with
bash replacing the default dash.

Any thoughts or places I could look to figure out what's going on?

--
Jake Biesinger
Graduate Student
Xie Lab, UC Irvine
(949) 231-7587

Jacob Biesinger
