I should also have added: as an even *quicker* 'fix', you can probably restart that single handler process a few times and get lucky and have it start correctly, since this is a race condition.
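
To tell whether a given restart got lucky, one option is to scan the fresh handler log output for the mapper error quoted further down in this thread. This is only a minimal sketch: the function name handler_hit_mapper_race is hypothetical, and how you read the log (its path, and which portion is new since the restart) depends on your setup and is not shown.

```python
import re

# Marker taken from the error message quoted in this thread: the kombu
# Queue mapper fails to resolve the 'Message' class name.
MAPPER_ERROR = re.compile(
    r"When initializing mapper Mapper\|Queue\|kombu_queue, "
    r"expression 'Message' failed to locate a name"
)


def handler_hit_mapper_race(log_text):
    """Return True if the given handler log text contains the kombu mapper error.

    Hypothetical helper: pass in only the log output written since the last
    restart, otherwise errors from an earlier start will also match.
    """
    return bool(MAPPER_ERROR.search(log_text))
```

If this returns True for the output written since the restart, the handler lost the race again and needs another restart.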

On Thu, Apr 9, 2015 at 8:14 AM, Dannon Baker <dannon.baker@gmail.com> wrote:
Hi Cristian,

Unfortunately, I *have* seen this error before (though never on my own hardware) and we thought we'd fixed it a little while back.  It's a weird race condition in kombu[1] mapper initialization under sqlalchemy.  We use this library for message passing between processes.  Usually kombu is bound to a message queue like RabbitMQ, but it can also use a traditional database if one isn't available.

I'm going to spend some more time on this today, upgrade our kombu dependency to the latest version, and get back to you with hopefully better news. If you need to fix this *right now*, though, one sure way to do it is to set up one of the other transports[2] and specify it in the 'amqp_internal_connection' entry in your galaxy.ini. We've had success with rabbitmq, but mongo, redis, or really any of the other non-sqlalchemy transports should work if you already have infrastructure for it.
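
As a rough illustration of that workaround, here is a minimal Python sketch that parses a galaxy.ini-style fragment and checks whether 'amqp_internal_connection' points at a non-sqlalchemy transport. The [app:main] section name, the RabbitMQ URL with its guest credentials, and the helper name uses_sqlalchemy_transport are assumptions for illustration; substitute the values for your own broker.

```python
import configparser
from urllib.parse import urlsplit

# Hypothetical galaxy.ini fragment. The section name and the broker URL
# (default RabbitMQ guest account on localhost) are assumptions.
SAMPLE_INI = """
[app:main]
amqp_internal_connection = amqp://guest:guest@localhost:5672//
"""


def uses_sqlalchemy_transport(ini_text, section="app:main"):
    """Return True if the internal connection is unset or is a sqlalchemy+... URL."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    url = parser.get(section, "amqp_internal_connection", fallback="")
    if not url:
        # Assumption: with no entry, messaging falls back to the database.
        return True
    # kombu connection URLs carry the transport in the URL scheme.
    return urlsplit(url).scheme.startswith("sqlalchemy")
```

With the sample fragment above the check returns False, meaning the sqlalchemy transport (and with it the mapper initialization that fails here) is not being used for messaging.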

One of the primary reasons we chose kombu for handling Galaxy's messaging was so that things *would* continue to just work with existing infrastructure without requiring a message queue, so this is definitely something I'm going to figure out.

Thanks!

Dannon

[1] https://github.com/celery/kombu/
[2] https://kombu.readthedocs.org/en/latest/userguide/connections.html#amqp-transports


On Thu, Apr 9, 2015 at 5:36 AM, C. Ch. <tuto345@hotmail.com> wrote:
Hi all,

We had a problem on the NAS where we couldn't write anymore. This problem was solved, and I decided to take the opportunity to upgrade Galaxy.

Now we have a problem where jobs do not get launched and the handler log shows a lot of error messages. The first error message is copied below. Afterwards it alternates between

 galaxy.jobs.handler ERROR 2015-04-07 15:58:24,959 Exception in monitor_step 
 galaxy.jobs.runners ERROR 2015-04-07 15:58:23,542 Unhandled exception checking active jobs

with, mainly, the same error stack (second error message shown below).

Any idea what is going wrong?

My first impression is that it is trying to recover some job that Galaxy believes is still running, but the machine has been stopped and rebooted in the meantime.

Any help is welcome!

Cristian


galaxy.jobs.runners ERROR 2015-04-07 15:44:04,809 (27764) Unhandled exception calling queue_job
Traceback (most recent call last):
  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/runners/__init__.py", line 97, in run_next
    method(arg)
  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py", line 115, in queue_job
    if not self.prepare_job( job_wrapper, include_metadata=True ):
  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/runners/__init__.py", line 140, in prepare_job
    job_id = job_wrapper.get_id_tag()
  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/__init__.py", line 794, in get_id_tag
    return self.get_job().get_id_tag()
  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/__init__.py", line 790, in get_job
    return self.sa_session.query( model.Job ).get( self.job_id )
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 840, in get
    return loading.load_on_ident(self, key)
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/loading.py", line 231, in load_on_ident
    return q.one()
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2395, in one
    ret = list(self)
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2434, in __iter__
    context = self._compile_context()
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2810, in _compile_context
    entity.setup_context(self, context)
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 3194, in setup_context
    column_collection=context.primary_columns
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/interfaces.py", line 466, in setup
    strat.setup_query(context, entity, path, loader, adapter, **kwargs)
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/strategies.py", line 1111, in setup_query
    column_collection, parentmapper, chained_from_outerjoin
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/strategies.py", line 1234, in _generate_row_adapter
    use_mapper_path=True)
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/util.py", line 364, in __init__
    else mapper.with_polymorphic_mappers,
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/util/langhelpers.py", line 725, in __get__
    obj.__dict__[self.__name__] = result = self.fget(obj)
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/mapper.py", line 1877, in _with_polymorphic_mappers
    configure_mappers()
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/mapper.py", line 2589, in configure_mappers
    mapper._post_configure_properties()
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/mapper.py", line 1694, in _post_configure_properties
    prop.init()
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/interfaces.py", line 144, in init
    self.do_init()
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/relationships.py", line 1549, in do_init
    self._process_dependent_arguments()
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/relationships.py", line 1605, in _process_dependent_arguments
    self.target = self.mapper.mapped_table
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/util/langhelpers.py", line 725, in __get__
    obj.__dict__[self.__name__] = result = self.fget(obj)
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/relationships.py", line 1522, in mapper
    argument = self.argument()
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/ext/declarative/clsregistry.py", line 283, in __call__
    (self.prop.parent, self.arg, n.args[0], self.cls)
InvalidRequestError: When initializing mapper Mapper|Queue|kombu_queue, expression 'Message' failed to locate a name ("name 'Message' is not defined"). If this is a class name, consider adding this relationship() to the <class 'kombu.transport.sqlalchemy.Queue'> class after both dependent classes have been defined.

Repeated error messages:
galaxy.jobs.runners ERROR 2015-04-07 15:58:23,542 Unhandled exception checking active jobs
Traceback (most recent call last):
  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/runners/__init__.py", line 501, in monitor
    self.check_watched_items()
  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py", line 238, in check_watched_items
    galaxy_id_tag = ajs.job_wrapper.get_id_tag()
  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/__init__.py", line 794, in get_id_tag
    return self.get_job().get_id_tag()
  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/__init__.py", line 790, in get_job
    return self.sa_session.query( model.Job ).get( self.job_id )
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/scoping.py", line 150, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/session.py", line 1165, in query
    return self._query_cls(entities, self, **kwargs)
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 108, in __init__
    self._set_entities(entities)
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 118, in _set_entities
    self._set_entity_selectables(self._entities)
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 151, in _set_entity_selectables
    ent.setup_entity(*d[entity])
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 3036, in setup_entity
    self._with_polymorphic = ext_info.with_polymorphic_mappers
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/util/langhelpers.py", line 725, in __get__
    obj.__dict__[self.__name__] = result = self.fget(obj)
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/mapper.py", line 1877, in _with_polymorphic_mappers
    configure_mappers()
  File "/home/galaxy/galaxy-dist/eggs/SQLAlchemy-0.9.8-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/mapper.py", line 2586, in configure_mappers
    raise e
InvalidRequestError: One or more mappers failed to initialize - can't proceed with initialization of other mappers.  Original exception was: When initializing mapper Mapper|Queue|kombu_queue, expression 'Message' failed to locate a name ("name 'Message' is not defined"). If this is a class name, consider adding this relationship() to the <class 'kombu.transport.sqlalchemy.Queue'> class after both dependent classes have been defined.
