Hi Dannon
Yes. The database is running on a different server from the one running Galaxy. Both are VMs running CentOS (6.5 on the Galaxy server, 6.2 on the database server). The postgres version is 8.4.9 and the database size is 712,161,040. I suspect that is not very large compared to some others. There are a number of other databases running on the same server; the one most frequently used is for our test Galaxy server, which runs on yet another VM. That one is much smaller (25,319,184). Both servers are on the same subnet. The problem is with our production Galaxy (of course).
Are there any instructions around on how to set up rabbitmq for my Galaxy?
Thanks for looking into this.
Ulf
On 08/10/14 11:26, Dannon Baker wrote:
Hi again Ulf,
Thanks for the info. A few questions to help me track this down:
Does the postgres database reside on a remote box from galaxy? And is it very large?
Running the latest galaxy may not change anything related to this particular issue, but you could always try it.
SQLAlchemy is pinned at the latest version we can currently support without reworking how migration scripts function (which we will do in the future, moving to Alembic). I do suspect that this is actually a bug in SQLAlchemy mapper initialization, but we should be able to come up with an interim workaround.
Finally, if this is a blocker for you, setting up an AMQP (rabbitmq) server and configuring your Galaxy instances to communicate through it is a workaround, though not a trivial one (and I am still going to fix this bug).
On Oct 8, 2014 10:45 AM, "Ulf Schaefer" <Ulf.Schaefer@phe.gov.uk> wrote:
Hi all again
Seems I am not so fortunate that this would just go away.
It appears to happen sometimes at start-up time for one of the handler processes. The first thing that appears to go wrong is this, just after starting the job handler queue:
---
galaxy.jobs.handler INFO 2014-10-06 14:37:51,220 job handler queue started
galaxy.sample_tracking.external_service_types DEBUG 2014-10-06 14:37:51,246 Loaded external_service_type: Simple unknown sequencer 1.0.0
galaxy.sample_tracking.external_service_types DEBUG 2014-10-06 14:37:51,253 Loaded external_service_type: Applied Biosystems SOLiD 1.0.0
galaxy.queue_worker INFO 2014-10-06 14:37:51,254 Initalizing Galaxy Queue Worker on sqlalchemy+postgres://galaxy:xxx@158.119.147.86:5432/galaxyprod
galaxy.jobs DEBUG 2014-10-06 14:37:51,416 (78355) Working directory for job is: /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/database/job_working_directory/078/78355
galaxy.web.framework.base DEBUG 2014-10-06 14:37:51,454 Enabling 'data_admin' controller, class: DataAdmin
galaxy.jobs.handler ERROR 2014-10-06 14:37:51,464 failure running job 78355
Traceback (most recent call last):
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py", line 243, in __monitor_step
    job_state = self.__check_if_ready_to_run( job )
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py", line 333, in __check_if_ready_to_run
    state = self.__check_user_jobs( job, self.job_wrappers[job.id] )
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py", line 417, in __check_user_jobs
    if job.user:
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/attributes.py", line 168, in __get__
    return self.impl.get(instance_state(instance),dict_)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/attributes.py", line 453, in get
    value = self.callable_(state, passive)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/strategies.py", line 508, in _load_for_state
    return self._emit_lazyload(session, state, ident_key)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/strategies.py", line 552, in _emit_lazyload
    return q._load_on_ident(ident_key)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2512, in _load_on_ident
    return q.one()
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2184, in one
    ret = list(self)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2227, in __iter__
    return self._execute_and_instances(context)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2240, in _execute_and_instances
    close_with_result=True)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2231, in _connection_from_session
    **kw)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/session.py", line 774, in connection
    bind = self.get_bind(mapper, clause=clause, **kw)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/session.py", line 1052, in get_bind
    c_mapper = mapper is not None and _class_to_mapper(mapper) or None
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/util.py", line 680, in _class_to_mapper
    mapperlib.configure_mappers()
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/mapper.py", line 2263, in configure_mappers
    mapper._post_configure_properties()
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/mapper.py", line 1172, in _post_configure_properties
    prop.init()
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/interfaces.py", line 128, in init
    self.do_init()
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/properties.py", line 910, in do_init
    self._process_dependent_arguments()
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/properties.py", line 998, in _process_dependent_arguments
    self.target = self.mapper.mapped_table
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/util/langhelpers.py", line 494, in __get__
    obj.__dict__[self.__name__] = result = self.fget(obj)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/properties.py", line 891, in mapper
    mapper_ = mapper.class_mapper(self.argument(),
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/ext/declarative.py", line 1428, in return_cls
    (prop.parent, arg, n.args[0], cls)
InvalidRequestError: When initializing mapper Mapper|Queue|kombu_queue, expression 'Message' failed to locate a name ("name 'Message' is not defined"). If this is a class name, consider adding this relationship() to the <class 'kombu.transport.sqlalchemy.Queue'> class after both dependent classes have been defined.
---
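The galaxy.queue_worker line in the excerpt above reports a connection URL beginning with sqlalchemy+postgres://, which suggests the queue worker is using the Galaxy database itself as its message transport rather than an AMQP broker. Below is a minimal sketch for checking which transport such a URL selects. It is written for a current Python 3 interpreter (not the Python 2.6 seen in the tracebacks) and uses only the standard library. The helper names and the amqp:// URL are illustrative assumptions, not Galaxy or kombu code.

```python
from urllib.parse import urlsplit


def describe_transport(url):
    """Split a connection URL scheme into (transport, backend).

    "sqlalchemy+postgres://..." gives ("sqlalchemy", "postgres");
    a plain "amqp://..." gives ("amqp", None).
    """
    scheme = urlsplit(url).scheme
    transport, _, backend = scheme.partition("+")
    return transport, backend or None


def uses_database_transport(url):
    """True when messages would be routed through a database."""
    return describe_transport(url)[0] == "sqlalchemy"


# URL copied from the log excerpt in this thread (password already masked).
logged_url = "sqlalchemy+postgres://galaxy:xxx@158.119.147.86:5432/galaxyprod"

# Hypothetical broker URL, used only for comparison; the host and the
# credentials are placeholders, not values from this thread.
example_amqp_url = "amqp://guest:guest@localhost:5672//"

for url in (logged_url, example_amqp_url):
    print(describe_transport(url), uses_database_transport(url))
```

With the workaround Dannon describes, one would expect the logged URL to start with amqp:// instead.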
After that it starts throwing the exception in monitor_step that I previously posted. Has anyone seen a potentially related issue? Would an update to the latest galaxy code help? I see there are newer versions of SQLAlchemy available. Are they part of a newer code base?
Thanks a lot for your help Ulf
On 07/10/14 12:20, Ulf Schaefer wrote:
Update:
The usual switching it off and on again (server reboot) has resolved the problem (for now), albeit in a rather unsatisfactory manner.
If there are any insights into what caused this behaviour and how it can be avoided in the future, I'd be more than happy to hear them.
Cheers Ulf
On 07/10/14 11:04, Dannon Baker wrote:
One per second? Can you tell me more about your configuration? This is an odd bug with multiple mapper initialization that I haven't been able to reproduce yet, so any information will help. Database configuration, number of processes, etc.
On Oct 7, 2014 11:46 AM, "Ulf Schaefer" <Ulf.Schaefer@phe.gov.uk> wrote:
Dear all
Maybe one of you can shed some light on this error message that I see in the log file for one of my handler processes. I get about one of them per second. The effect is that most of the jobs remain in the "waiting to run" stage.
The postgres database is running on a separate server and appears to be doing just fine.
Any help is greatly appreciated.
Thanks Ulf
---
galaxy.jobs.handler ERROR 2014-10-07 10:32:24,676 Exception in monitor_step
Traceback (most recent call last):
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py", line 161, in __monitor
    self.__monitor_step()
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py", line 184, in __monitor_step
    hda_not_ready = self.sa_session.query(model.Job.id).enable_eagerloads(False) \
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/scoping.py", line 114, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/session.py", line 1088, in query
    return self._query_cls(entities, self, **kwargs)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 108, in __init__
    self._set_entities(entities)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 117, in _set_entities
    self._setup_aliasizers(self._entities)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 132, in _setup_aliasizers
    _entity_info(entity)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/util.py", line 578, in _entity_info
    mapperlib.configure_mappers()
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/mapper.py", line 2260, in configure_mappers
    raise e
InvalidRequestError: One or more mappers failed to initialize - can't proceed with initialization of other mappers. Original exception was: When initializing mapper Mapper|Queue|kombu_queue, expression 'Message' failed to locate a name ("name 'Message' is not defined"). If this is a class name, consider adding this relationship() to the <class 'kombu.transport.sqlalchemy.Queue'> class after both dependent classes have been defined.
---
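Going only by the wording of the two error messages in this thread, the pattern looks like this: a relationship that refers to 'Message' by name is resolved before that name exists, the failure is remembered, and every later query re-raises it with the "One or more mappers failed to initialize" wrapper. That would fit one error per monitor loop until the process is restarted. The toy model below illustrates that pattern under those assumptions. It is written for Python 3 and uses only the standard library. ToyRegistry and all of its methods are hypothetical names, and this is not SQLAlchemy's or kombu's implementation.

```python
class ConfigurationError(Exception):
    """Stand-in for the InvalidRequestError seen in the tracebacks."""


class ToyRegistry:
    """Toy registry that resolves relationship targets by name, lazily."""

    def __init__(self):
        self.classes = {}    # name -> placeholder for a mapped class
        self.pending = []    # (owner, target_name) pairs not yet resolved
        self.failure = None  # first configuration error, kept forever

    def define(self, name, relates_to=None):
        self.classes[name] = object()
        if relates_to is not None:
            self.pending.append((name, relates_to))

    def configure(self):
        # A remembered failure is re-raised on every later call.
        if self.failure is not None:
            raise ConfigurationError(
                "One or more mappers failed to initialize. "
                "Original exception was: %s" % self.failure
            )
        while self.pending:
            owner, target = self.pending[0]
            if target not in self.classes:
                self.failure = ConfigurationError(
                    "When initializing %s, expression %r failed to "
                    "locate a name" % (owner, target)
                )
                raise self.failure
            self.pending.pop(0)

    def query(self, name):
        # Every query configures first, as the tracebacks suggest.
        self.configure()
        return self.classes[name]


registry = ToyRegistry()
registry.define("Queue", relates_to="Message")

messages = []
for attempt in range(2):
    try:
        registry.query("Queue")
    except ConfigurationError as exc:
        messages.append(str(exc))
    # "Message" arrives too late: the first failure is already recorded.
    registry.define("Message")

for text in messages:
    print(text)
```

In this model the second query still fails even though 'Message' has been defined by then, which is the reason a restart clears the condition while waiting does not.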
The information contained in the EMail and any attachments is confidential and intended solely and for the attention and use of the named addressee(s). It may not be disclosed to any other person without the express authority of Public Health England, or the intended recipient, or both. If you are not the intended recipient, you must not disclose, copy, distribute or retain this message or any part of it. This footnote also confirms that this EMail has been swept for computer viruses by Symantec.Cloud, but please re-sweep any attachments before opening or saving. http://www.gov.uk/PHE
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/