This is only a guess, but it may help you troubleshoot.
Python could be hitting a stack size limit: run ulimit -s and raise the value if needed.
It is also possible that the DRMAA library is missing a linked shared library on the Red Hat system. You can check by running ldd on libdrmaa.so.
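
Both checks can also be run from Python, which reports the limits the interpreter actually sees. This is only a minimal sketch: the library path is copied from the job_conf.xml quoted below, the helper names (describe_limit, missing_libs, run_ldd) are made up for illustration, and it assumes ldd is on the PATH.

```python
from __future__ import print_function

import os
import resource
import subprocess

# Path copied from the job_conf.xml quoted in this thread; adjust as needed.
DRMAA_LIB = "/opt/gridware/lsf/9.1/linux2.6-glibc2.3-x86_64/lib/libdrmaa.so"


def describe_limit(value):
    """Render a limit in kB (as ulimit -s does); RLIM_INFINITY is 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return "{0} kB".format(value // 1024)


def missing_libs(ldd_output):
    """Return the names of libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def run_ldd(path):
    """Run ldd on path and return its output as text ('' if ldd is absent)."""
    try:
        proc = subprocess.Popen(["ldd", path], stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT,
                                universal_newlines=True)
    except OSError:
        return ""
    return proc.communicate()[0]


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    print("stack soft limit:", describe_limit(soft))
    print("stack hard limit:", describe_limit(hard))
    if os.path.exists(DRMAA_LIB):
        print("unresolved libraries:", missing_libs(run_ldd(DRMAA_LIB)))
    else:
        print("no file at", DRMAA_LIB)
```

An empty list of unresolved libraries does not rule out a mismatched library version, so treat it as one data point rather than a clean bill of health.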
Regards,
Iyad Kandalaft
Microbial Biodiversity Bioinformatics
Agriculture and Agri-Food Canada | Agriculture et Agroalimentaire Canada
960 Carling Ave.| 960 Ave. Carling
Ottawa, ON| Ottawa (ON) K1A 0C6
E-mail Address / Adresse courriel Iyad.Kandalaft@agr.gc.ca
Telephone | Téléphone 613-759-1228
Facsimile | Télécopieur 613-759-1701
Teletypewriter | Téléimprimeur 613-773-2600
Government of Canada | Gouvernement du Canada
From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu]
On Behalf Of I Kozin
Sent: Tuesday, June 10, 2014 12:42 PM
To: galaxy-dev@lists.bx.psu.edu
Subject: [galaxy-dev] troubleshooting Galaxy with LSF
Hello,
This is largely a repost from the biostar forum following the suggestion there to post here.
I'm taking my first steps in setting up a Galaxy server with an LSF job scheduler. LSF recently started supporting DRMAA again, so I decided to give it a go.
I have two setups. The one that works is a standalone server (OpenSUSE 12.1, Python 2.7.2, LSF 9.1.2). By "works" I mean that when I log in to Galaxy using a browser and upload a file, a job gets submitted and run, and everything seems fine.
The second setup does not work (RH 6.4, Python 2.6.6, LSF 9.1.2). It's a server running Galaxy that is meant to submit jobs to an LSF cluster. When I similarly pick and download a file, I get:
Job <72266> is submitted to queue <short>.
./run.sh: line 79: 99087 Segmentation fault python ./scripts/paster.py serve universe_wsgi.ini $@
For the moment, I'm not concerned with the full server setup. I'm only testing whether Galaxy works with LSF, so I run ./run.sh as a user.
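
One way to narrow this down is to submit a job through DRMAA from a standalone script, outside Galaxy. The sketch below assumes the drmaa-python package is installed for the same interpreter; the function name submit_test_job and the /bin/hostname command are placeholders, and the library path and native specification are copied from the job_conf.xml in this message. If this script also segfaults, the problem likely lies in the libdrmaa.so and Python combination rather than in Galaxy itself.

```python
from __future__ import print_function

import os

# drmaa-python reads DRMAA_LIBRARY_PATH when the module is imported.
# The path is copied from the job_conf.xml in this message.
os.environ.setdefault(
    "DRMAA_LIBRARY_PATH",
    "/opt/gridware/lsf/9.1/linux2.6-glibc2.3-x86_64/lib/libdrmaa.so",
)


def submit_test_job(native_spec="-W 24:00"):
    """Submit /bin/hostname through DRMAA and return a one-line status."""
    try:
        import drmaa
    except (ImportError, RuntimeError, OSError) as exc:
        # Raised when the package or the shared library cannot be loaded.
        return "could not load drmaa: {0}".format(exc)

    session = drmaa.Session()
    session.initialize()
    try:
        template = session.createJobTemplate()
        template.remoteCommand = "/bin/hostname"  # harmless placeholder
        template.nativeSpecification = native_spec
        job_id = session.runJob(template)
        info = session.wait(job_id, drmaa.Session.TIMEOUT_WAIT_FOREVER)
        session.deleteJobTemplate(template)
        return "job {0} finished, exit status {1}".format(
            job_id, info.exitStatus)
    finally:
        session.exit()


if __name__ == "__main__":
    print(submit_test_job())
```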
The job configuration job_conf.xml is identical in both cases:
<?xml version="1.0"?>
<job_conf>
    <plugins>
        <plugin id="lsf" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner">
            <param id="drmaa_library_path">/opt/gridware/lsf/9.1/linux2.6-glibc2.3-x86_64/lib/libdrmaa.so</param>
        </plugin>
    </plugins>
    <handlers>
        <handler id="main"/>
    </handlers>
    <destinations default="lsf_default">
        <destination id="lsf_default" runner="lsf">
            <param id="nativeSpecification">-W 24:00</param>
        </destination>
    </destinations>
</job_conf>
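
To confirm that both machines really resolve the same settings, the parameters can be pulled out of the configuration with a few lines of Python. This is a rough sketch assuming Python 2.7 or later; the configuration is inlined as a string so the example is self-contained, and read_params is a hypothetical helper, not part of Galaxy.

```python
from __future__ import print_function

import os
import xml.etree.ElementTree as ET

# Copy of the job_conf.xml above, inlined so the sketch runs on its own.
JOB_CONF = """<?xml version="1.0"?>
<job_conf>
    <plugins>
        <plugin id="lsf" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner">
            <param id="drmaa_library_path">/opt/gridware/lsf/9.1/linux2.6-glibc2.3-x86_64/lib/libdrmaa.so</param>
        </plugin>
    </plugins>
    <handlers>
        <handler id="main"/>
    </handlers>
    <destinations default="lsf_default">
        <destination id="lsf_default" runner="lsf">
            <param id="nativeSpecification">-W 24:00</param>
        </destination>
    </destinations>
</job_conf>
"""


def read_params(xml_text):
    """Map every <param id="..."> in the config to its stripped text."""
    root = ET.fromstring(xml_text)
    return dict((p.get("id"), (p.text or "").strip())
                for p in root.iter("param"))


if __name__ == "__main__":
    params = read_params(JOB_CONF)
    for name in sorted(params):
        print(name, "=", params[name])
    # Check that the configured library exists on this machine.
    lib = params.get("drmaa_library_path", "")
    print("library present:", os.path.exists(lib))
```

In practice the string would be replaced by the contents of the real job_conf.xml on each server.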
run.sh is only changed to allow remote access.
Most recently I tried replacing Python with 2.7.5, to no avail: I still get the same kind of error. I also updated Galaxy.
Any hints would be much appreciated. Thank you.