Hi again Ulf,

Thanks for the info. A few questions to help me track this down:

Is the Postgres database on a different box from Galaxy? And is it very large?

Running the latest galaxy may not change anything related to this particular issue, but you could always try it.

SQLAlchemy is pinned at the latest version we can currently support without reworking how the migration scripts function (which we will do in the future, by moving to Alembic). I do suspect that this is actually a bug in SQLAlchemy mapper initialization, but we should be able to come up with an interim workaround.

Finally, if this is a blocker for you: while it's not trivial (and I am still going to fix this bug), setting up an AMQP (RabbitMQ) server and configuring your Galaxy instances to communicate through it is a workaround.
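
To make that workaround a bit more concrete: the log line quoted below shows the queue worker starting on a sqlalchemy+postgres:// URL, so kombu is using the database itself as its transport, and that transport is where the kombu_queue mapper in the error comes from. Pointing the control connection at an amqp:// URL should avoid that code path. Here is a minimal Python sketch of the distinction. The RabbitMQ host, credentials, and vhost are made up, the helper name is hypothetical, and the option name in the comment (amqp_internal_connection) is my assumption about the setting, so please check it against the sample config that ships with your release.

```python
from urllib.parse import urlsplit


def control_transport(url):
    """Classify a kombu-style connection URL (hypothetical helper).

    Returns "database" when the URL routes messages through SQLAlchemy,
    otherwise returns the URL scheme (for example "amqp").
    """
    scheme = urlsplit(url).scheme
    return "database" if scheme.startswith("sqlalchemy+") else scheme


# URL taken from the queue worker log line in this thread.
current = "sqlalchemy+postgres://galaxy:xxx@158.119.147.86:5432/galaxyprod"

# Hypothetical RabbitMQ URL; it would go in the Galaxy config, e.g.
#   amqp_internal_connection = amqp://...   (option name is an assumption)
workaround = "amqp://galaxy_user:secret@rabbit.example.org:5672/galaxy_vhost"

for url in (current, workaround):
    print(control_transport(url))
```

Every Galaxy process (web and handlers) would need the same URL so that they all talk to the same broker.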

On Oct 8, 2014 10:45 AM, "Ulf Schaefer" <Ulf.Schaefer@phe.gov.uk> wrote:
Hi all again

Seems I am not so fortunate that this would just go away.

It appears to be happening sometimes at start-up time for one of the
handler processes. The first thing that appears to go wrong is this just
after starting the job handler queue:

---

galaxy.jobs.handler INFO 2014-10-06 14:37:51,220 job handler queue started
galaxy.sample_tracking.external_service_types DEBUG 2014-10-06 14:37:51,246 Loaded external_service_type: Simple unknown sequencer 1.0.0
galaxy.sample_tracking.external_service_types DEBUG 2014-10-06 14:37:51,253 Loaded external_service_type: Applied Biosystems SOLiD 1.0.0
galaxy.queue_worker INFO 2014-10-06 14:37:51,254 Initalizing Galaxy Queue Worker on sqlalchemy+postgres://galaxy:xxx@158.119.147.86:5432/galaxyprod
galaxy.jobs DEBUG 2014-10-06 14:37:51,416 (78355) Working directory for job is: /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/database/job_working_directory/078/78355
galaxy.web.framework.base DEBUG 2014-10-06 14:37:51,454 Enabling 'data_admin' controller, class: DataAdmin
galaxy.jobs.handler ERROR 2014-10-06 14:37:51,464 failure running job 78355
Traceback (most recent call last):
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py", line 243, in __monitor_step
    job_state = self.__check_if_ready_to_run( job )
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py", line 333, in __check_if_ready_to_run
    state = self.__check_user_jobs( job, self.job_wrappers[job.id] )
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py", line 417, in __check_user_jobs
    if job.user:
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/attributes.py", line 168, in __get__
    return self.impl.get(instance_state(instance),dict_)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/attributes.py", line 453, in get
    value = self.callable_(state, passive)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/strategies.py", line 508, in _load_for_state
    return self._emit_lazyload(session, state, ident_key)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/strategies.py", line 552, in _emit_lazyload
    return q._load_on_ident(ident_key)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2512, in _load_on_ident
    return q.one()
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2184, in one
    ret = list(self)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2227, in __iter__
    return self._execute_and_instances(context)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2240, in _execute_and_instances
    close_with_result=True)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2231, in _connection_from_session
    **kw)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/session.py", line 774, in connection
    bind = self.get_bind(mapper, clause=clause, **kw)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/session.py", line 1052, in get_bind
    c_mapper = mapper is not None and _class_to_mapper(mapper) or None
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/util.py", line 680, in _class_to_mapper
    mapperlib.configure_mappers()
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/mapper.py", line 2263, in configure_mappers
    mapper._post_configure_properties()
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/mapper.py", line 1172, in _post_configure_properties
    prop.init()
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/interfaces.py", line 128, in init
    self.do_init()
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/properties.py", line 910, in do_init
    self._process_dependent_arguments()
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/properties.py", line 998, in _process_dependent_arguments
    self.target = self.mapper.mapped_table
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/util/langhelpers.py", line 494, in __get__
    obj.__dict__[self.__name__] = result = self.fget(obj)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/properties.py", line 891, in mapper
    mapper_ = mapper.class_mapper(self.argument(),
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/ext/declarative.py", line 1428, in return_cls
    (prop.parent, arg, n.args[0], cls)
InvalidRequestError: When initializing mapper Mapper|Queue|kombu_queue, expression 'Message' failed to locate a name ("name 'Message' is not defined"). If this is a class name, consider adding this relationship() to the <class 'kombu.transport.sqlalchemy.Queue'> class after both dependent classes have been defined.

---

After that it starts throwing the exception in monitor_step that I
previously posted. Has anyone seen a potentially related issue? Would an
update to the latest galaxy code help? I see there are newer versions of
SQLAlchemy available. Are they part of a newer code base?

Thanks a lot for your help
Ulf

On 07/10/14 12:20, Ulf Schaefer wrote:
> Update:
>
> The usual switching it off and on again (server reboot) has resolved the
> problem (for now), albeit in a rather unsatisfactory manner.
>
> If there are any insights into what caused this behaviour and how it can be
> avoided in the future I'd be more than happy to hear them.
>
> Cheers
> Ulf
>
> On 07/10/14 11:04, Dannon Baker wrote:
>> One per second?  Can you tell me more about your configuration?   This is
>> an odd bug with multiple mapper initialization that I haven't been able to
>> reproduce yet, so any information will help.  Database configuration,
>> number of processes, etc.
>> On Oct 7, 2014 11:46 AM, "Ulf Schaefer" <Ulf.Schaefer@phe.gov.uk> wrote:
>>
>>> Dear all
>>>
>>> Maybe one of you can shed some light on this error message that I see in
>>> the log file for one of my handler processes. I get about one of them
>>> per second. The effect is that most of the jobs remain in the "waiting
>>> to run" stage.
>>>
>>> The postgres database is running on a separate server and appears to be
>>> doing just fine.
>>>
>>> Any help is greatly appreciated.
>>>
>>> Thanks
>>> Ulf
>>>
>>> ---
>>>
>>> galaxy.jobs.handler ERROR 2014-10-07 10:32:24,676 Exception in monitor_step
>>> Traceback (most recent call last):
>>>   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py", line 161, in __monitor
>>>     self.__monitor_step()
>>>   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py", line 184, in __monitor_step
>>>     hda_not_ready = self.sa_session.query(model.Job.id).enable_eagerloads(False) \
>>>   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/scoping.py", line 114, in do
>>>     return getattr(self.registry(), name)(*args, **kwargs)
>>>   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/session.py", line 1088, in query
>>>     return self._query_cls(entities, self, **kwargs)
>>>   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 108, in __init__
>>>     self._set_entities(entities)
>>>   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 117, in _set_entities
>>>     self._setup_aliasizers(self._entities)
>>>   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 132, in _setup_aliasizers
>>>     _entity_info(entity)
>>>   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/util.py", line 578, in _entity_info
>>>     mapperlib.configure_mappers()
>>>   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/mapper.py", line 2260, in configure_mappers
>>>     raise e
>>> InvalidRequestError: One or more mappers failed to initialize - can't proceed with initialization of other mappers.  Original exception was: When initializing mapper Mapper|Queue|kombu_queue, expression 'Message' failed to locate a name ("name 'Message' is not defined"). If this is a class name, consider adding this relationship() to the <class 'kombu.transport.sqlalchemy.Queue'> class after both dependent classes have been defined.
>>>
>>> ---
>>>
>>> **************************************************************************
>>> The information contained in the EMail and any attachments is confidential
>>> and intended solely and for the attention and use of the named
>>> addressee(s). It may not be disclosed to any other person without the
>>> express authority of Public Health England, or the intended recipient, or
>>> both. If you are not the intended recipient, you must not disclose, copy,
>>> distribute or retain this message or any part of it. This footnote also
>>> confirms that this EMail has been swept for computer viruses by
>>> Symantec.Cloud, but please re-sweep any attachments before opening or
>>> saving. http://www.gov.uk/PHE
>>> **************************************************************************
>>>
>>> ___________________________________________________________
>>> Please keep all replies on the list by using "reply all"
>>> in your mail client.  To manage your subscriptions to this
>>> and other Galaxy lists, please use the interface at:
>>>     http://lists.bx.psu.edu/
>>>
>>> To search Galaxy mailing lists use the unified search at:
>>>     http://galaxyproject.org/search/mailinglists/
>>>
>>
>
