Problems with Galaxy cluster using grid engine
Hi All,

I am trying to set up a Galaxy cluster using Rocks Gridengine.
OS is CentOS 6.5, psql (9.1.18), shell bash.

I am getting the error messages in paster.log shown below. I can submit jobs to Gridengine using qsub, so that is not the issue. But when I try "Upload File from your computer", the history indicates that the job does not complete.

Any help would be appreciated.

galaxy.tools.actions.upload_common DEBUG 2015-10-02 09:35:02,272 Changing ownership of /share/apps/galaxy/database/tmp/upload_file_data_xvpaYs with: /usr/bin/sudo -E /share/apps/galaxy/scripts/external_chown_script.py /share/apps/galaxy/database/tmp/upload_file_data_xvpaYs rpolich 507
galaxy.tools.actions.upload_common WARNING 2015-10-02 09:35:02,297 Changing ownership of uploaded file /share/apps/galaxy/database/tmp/upload_file_data_xvpaYs failed: sudo: no tty present and no askpass program specified
galaxy.tools.actions.upload_common DEBUG 2015-10-02 09:35:02,297 Changing ownership of /share/apps/galaxy/database/tmp/tmplIgC3n with: /usr/bin/sudo -E /share/apps/galaxy/scripts/external_chown_script.py /share/apps/galaxy/database/tmp/tmplIgC3n rpolich 507
galaxy.tools.actions.upload_common WARNING 2015-10-02 09:35:02,323 Changing ownership of uploaded file /share/apps/galaxy/database/tmp/tmplIgC3n failed: sudo: no tty present and no askpass program specified
galaxy.tools.actions.upload_common INFO 2015-10-02 09:35:02,357 tool upload1 created job id 101
galaxy.tools.execute DEBUG 2015-10-02 09:35:02,423 Tool [upload1] created job [101] (332.351 ms)
206.124.61.6 - - [02/Oct/2015:09:34:59 -0500] "POST /api/tools HTTP/1.1" 200 - "http://galaxy.txbiomedgenetics.org:8080/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:40.0) Gecko/20100101 Firefox/40.0"
206.124.61.6 - - [02/Oct/2015:09:35:02 -0500] "GET /api/histories/1fad1eaf5f4f1766/contents HTTP/1.1" 200 - "http://galaxy.txbiomedgenetics.org:8080/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:40.0) Gecko/20100101 Firefox/40.0"
galaxy.jobs DEBUG 2015-10-02 09:35:02,676 (101) Working directory for job is: /share/apps/galaxy/database/job_working_directory/000/101
galaxy.jobs.handler DEBUG 2015-10-02 09:35:02,682 (101) Dispatching to drmaa runner
galaxy.jobs DEBUG 2015-10-02 09:35:02,894 (101) Persisting job destination (destination id: sge_default)
galaxy.jobs.runners DEBUG 2015-10-02 09:35:02,903 Job [101] queued (220.456 ms)
galaxy.jobs.handler INFO 2015-10-02 09:35:02,958 (101) Job dispatched
galaxy.jobs.command_factory INFO 2015-10-02 09:35:03,821 Built script [/share/apps/galaxy/database/job_working_directory/000/101/tool_script.sh] for tool command[/share/apps/galaxy/database/job_working_directory/000/101/tool_script.sh]
galaxy.jobs.runners DEBUG 2015-10-02 09:35:04,010 (101) command is: /share/apps/galaxy/database/job_working_directory/000/101/tool_script.sh; return_code=$?; python "/share/apps/galaxy/database/job_working_directory/000/101/set_metadata_QaaegG.py" "/share/apps/galaxy/database/tmp/tmpglze54" "/share/apps/galaxy/database/job_working_directory/000/101/galaxy.json" "/share/apps/galaxy/database/job_working_directory/000/101/metadata_in_HistoryDatasetAssociation_71_7DAllZ,/share/apps/galaxy/database/job_working_directory/000/101/metadata_kwds_HistoryDatasetAssociation_71_YiPkTL,/share/apps/galaxy/database/job_working_directory/000/101/metadata_out_HistoryDatasetAssociation_71_JbkolS,/share/apps/galaxy/database/job_working_directory/000/101/metadata_results_HistoryDatasetAssociation_71_d93tKG,/share/apps/galaxy/database/job_working_directory/000/101/galaxy_dataset_71.dat,/share/apps/galaxy/database/job_working_directory/000/101/metadata_override_HistoryDatasetAssociation_71_ih81Fj" 5242880; sh -c "exit $return_code"
galaxy.jobs.runners.drmaa DEBUG 2015-10-02 09:35:04,074 (101) submitting file /share/apps/galaxy/database/job_working_directory/000/101/galaxy_101.sh
galaxy.jobs.runners.drmaa DEBUG 2015-10-02 09:35:04,075 (101) native specification is: -q galaxy.q -V
galaxy.jobs DEBUG 2015-10-02 09:35:04,075 (101) Changing ownership of working directory with: /usr/bin/sudo -E /share/apps/galaxy/scripts/external_chown_script.py /share/apps/galaxy/database/job_working_directory/000/101 rpolich 507
galaxy.jobs ERROR 2015-10-02 09:35:04,102 (101) Failed to change ownership of /share/apps/galaxy/database/job_working_directory/000/101, making world-writable instead
Traceback (most recent call last):
  File "/share/apps/galaxy/lib/galaxy/jobs/__init__.py", line 1649, in change_ownership_for_run
    self._change_ownership( self.user_system_pwent[0], str( self.user_system_pwent[3] ) )
  File "/share/apps/galaxy/lib/galaxy/jobs/__init__.py", line 1643, in _change_ownership
    assert p.returncode == 0
AssertionError
galaxy.jobs.runners.drmaa DEBUG 2015-10-02 09:35:04,102 (101) submitting with credentials: rpolich [uid: 1006]
galaxy.jobs.runners.drmaa DEBUG 2015-10-02 09:35:04,104 (101) Job script for external submission is: /share/apps/galaxy/database/gridengine/101.jt_json
galaxy.jobs.runners.drmaa INFO 2015-10-02 09:35:04,104 Running command ['/usr/bin/sudo', '-E', '/share/apps/galaxy/scripts/drmaa_external_runner.py', '1006', '/share/apps/galaxy/database/gridengine/101.jt_json']
galaxy.jobs.runners.drmaa INFO 2015-10-02 09:35:04,308 (101) queued as 239
galaxy.jobs DEBUG 2015-10-02 09:35:04,375 (101) Persisting job destination (destination id: sge_default)
galaxy.jobs.runners.drmaa DEBUG 2015-10-02 09:35:05,462 (101/239) state change: job is queued and active
206.124.61.6 - - [02/Oct/2015:09:35:06 -0500] "GET /api/histories/1fad1eaf5f4f1766/contents HTTP/1.1" 200 - "http://galaxy.txbiomedgenetics.org:8080/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:40.0) Gecko/20100101 Firefox/40.0"
206.124.61.6 - - [02/Oct/2015:09:35:10 -0500] "GET /api/histories/1fad1eaf5f4f1766/contents HTTP/1.1" 200 - "http://galaxy.txbiomedgenetics.org:8080/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:40.0) Gecko/20100101 Firefox/40.0"
206.124.61.6 - - [02/Oct/2015:09:35:14 -0500] "GET /api/histories/1fad1eaf5f4f1766/contents HTTP/1.1" 200 - "http://galaxy.txbiomedgenetics.org:8080/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:40.0) Gecko/20100101 Firefox/40.0"
galaxy.jobs.runners.drmaa DEBUG 2015-10-02 09:35:17,626 (101/239) state change: job is running
206.124.61.6 - - [02/Oct/2015:09:35:18 -0500] "GET /api/histories/1fad1eaf5f4f1766/contents HTTP/1.1" 200 - "http://galaxy.txbiomedgenetics.org:8080/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:40.0) Gecko/20100101 Firefox/40.0"
galaxy.jobs.runners.drmaa INFO 2015-10-02 09:35:22,084 (101/239) job left DRM queue with following message: code 18: The job specified by the 'jobid' does not exist.
galaxy.jobs DEBUG 2015-10-02 09:35:22,212 (101) Changing ownership of working directory with: /usr/bin/sudo -E /share/apps/galaxy/scripts/external_chown_script.py /share/apps/galaxy/database/job_working_directory/000/101 galaxy 507
galaxy.jobs.runners ERROR 2015-10-02 09:35:22,240 (unknown) Unhandled exception calling finish_job
Traceback (most recent call last):
  File "/share/apps/galaxy/lib/galaxy/jobs/runners/__init__.py", line 100, in run_next
    method(arg)
  File "/share/apps/galaxy/lib/galaxy/jobs/runners/__init__.py", line 554, in finish_job
    job_state.job_wrapper.reclaim_ownership()
  File "/share/apps/galaxy/lib/galaxy/jobs/__init__.py", line 1657, in reclaim_ownership
    self._change_ownership( self.galaxy_system_pwent[0], str( self.galaxy_system_pwent[3] ) )
  File "/share/apps/galaxy/lib/galaxy/jobs/__init__.py", line 1643, in _change_ownership
    assert p.returncode == 0
AssertionError
206.124.61.6 - - [02/Oct/2015:09:35:22 -0500] "GET /api/histories/1fad1eaf5f4f1766/contents HTTP/1.1" 200 - "http://galaxy.txbiomedgenetics.org:8080/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:40.0) Gecko/20100101 Firefox/40.0"

My job_conf.xml is below:

<?xml version="1.0"?>
<!-- A sample job config that explicitly configures job running the way it is configured by default (if there is no explicit config). -->
<job_conf>
    <plugins>
        <plugin id="drmaa" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner"/>
        <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner" workers="4"/>
    </plugins>
    <handlers>
        <handler id="main"/>
    </handlers>
    <destinations default="sge_default">
        <!--destination id="big_jobs" runner="drmaa">
            <param id="nativeSpecification">-P bignodes -R y -pe threads 8</param>
        </destination-->
        <destination id="sge_default" runner="drmaa">
            <param id="nativeSpecification">-q galaxy.q -V</param>
        </destination>
        <destination id="local" runner="local"/>
    </destinations>
</job_conf>

Output from qacct -j:

qname        galaxy.q
hostname     compute-1-1703.local
group        galaxy
owner        rpolich
project      NONE
department   defaultdepartment
jobname      g101_upload1_rpolich_txbiomed_org
jobnumber    239
taskid       undefined
account      sge
priority     0
qsub_time    Fri Oct 2 09:35:04 2015
start_time   Fri Oct 2 09:35:17 2015
end_time     Fri Oct 2 09:35:21 2015
granted_pe   NONE
slots        1
failed       0
exit_status  0
ru_wallclock 4
ru_utime     1.975
ru_stime     0.494
ru_maxrss    37792
ru_ixrss     0
ru_ismrss    0
ru_idrss     0
ru_isrss     0
ru_minflt    63746
ru_majflt    7
ru_nswap     0
ru_inblock   26720
ru_oublock   152
ru_msgsnd    0
ru_msgrcv    0
ru_nsignals  0
ru_nvcsw     6188
ru_nivcsw    522
cpu          2.469
mem          0.372
io           0.225
iow          0.000
maxvmem      463.258M
arid         undefined

Thank you,

Richard Polich
Systems Administrator
Department of Genetics
Texas Biomedical Research Institute
7620 NW Loop 410, San Antonio, TX 78227-5301
Phone: (210)258-9727
Email: rpolich@txbiomed.org
Do you have drmaa_external_** options set in config/galaxy.ini? It seems like maybe you do. I would try to get Galaxy working without those first, just submitting everything as the Galaxy user.

-John
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: https://lists.galaxyproject.org/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
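For anyone checking the same thing John asks about, here is a minimal sketch of how one might scan the text of galaxy.ini for external-script options that are still enabled. The three option names come from this thread; the helper name `find_enabled_external_options` and the sample paths in the usage lines are hypothetical, and the line-based parsing is a simplification rather than how Galaxy itself reads its config.

```python
import re

# Option names as they appear in this thread; the helper is illustrative only.
EXTERNAL_OPTIONS = (
    "drmaa_external_runjob_script",
    "drmaa_external_killjob_script",
    "external_chown_script",
)

# Matches "name = value" lines; section headers and blank lines do not match.
_OPTION_RE = re.compile(r"^\s*([A-Za-z0-9_]+)\s*=\s*(.*?)\s*$")


def find_enabled_external_options(ini_text):
    """Return {option: value} for external-script options that are not commented out."""
    enabled = {}
    for line in ini_text.splitlines():
        if line.lstrip().startswith(("#", ";")):
            continue  # commented-out setting
        match = _OPTION_RE.match(line)
        if match and match.group(1) in EXTERNAL_OPTIONS and match.group(2):
            enabled[match.group(1)] = match.group(2)
    return enabled


if __name__ == "__main__":
    # Hypothetical sample: one option commented out, one still active.
    sample = (
        "#drmaa_external_runjob_script = /path/to/drmaa_external_runner.py\n"
        "external_chown_script = /path/to/external_chown_script.py\n"
    )
    print(find_enabled_external_options(sample))
```

An empty result would suggest Galaxy is submitting everything as the Galaxy user, which is the configuration John recommends getting to work first.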
Hi John,

Yes, I came across a listing that said to enable this in /config/galaxy.ini for Gridengine. Peter van Heusden brought this error to my attention. I have now commented these out and can upload files.

#drmaa_external_runjob_script = /share/apps/galaxy/scripts/drmaa_external_runner.py
#drmaa_external_killjob_script = /share/apps/galaxy/scripts/drmaa_external_killer.py
#external_chown_script = /share/apps/galaxy/scripts/external_chown_script.py

Thank you very much,
Richard
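To confirm that a change like this has taken effect, one option is to search paster.log for the sudo failure reported in the original post. The sketch below assumes the WARNING message format shown in that log; the function name `find_sudo_tty_failures` is hypothetical.

```python
import re

# Message text copied from the WARNING entries in the original post.
SUDO_TTY_FAILURE = re.compile(
    r"Changing ownership of uploaded file (\S+) failed: "
    r"sudo: no tty present and no askpass program specified"
)


def find_sudo_tty_failures(log_text):
    """Return the paths of uploaded files whose chown failed with the sudo tty error."""
    return SUDO_TTY_FAILURE.findall(log_text)


if __name__ == "__main__":
    line = (
        "galaxy.tools.actions.upload_common WARNING 2015-10-02 09:35:02,297 "
        "Changing ownership of uploaded file "
        "/share/apps/galaxy/database/tmp/upload_file_data_xvpaYs failed: "
        "sudo: no tty present and no askpass program specified"
    )
    print(find_sudo_tty_failures(line))
```

If the list is empty for log entries written after the restart, the external chown script is no longer being invoked through sudo for uploads.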
participants (2)
- John Chilton
- Richard Polich