lastz on local galaxy failing
Hi,
when I start a lastz job it submits a job to GridEngine. But this job seems to hang. The process running is
python /data/galaxy/galaxy-dist/tools/sr_mapping/lastz_wrapper.py --ref_source= ...
This wrapper does not seem to start lastz, although lastz is in the path. After 2 days, Galaxy shows "Job did not produce output".
Here is the paster.log from the start of the job.
galaxy.jobs DEBUG 2012-11-15 09:03:07,539 (254) Working directory for job is: /data/galaxy/galaxy-dist/database/job_working_directory/000/254
galaxy.jobs.handler DEBUG 2012-11-15 09:03:07,540 dispatching job 254 to drmaa runner
galaxy.jobs.handler INFO 2012-11-15 09:03:07,688 (254) Job dispatched
galaxy.tools DEBUG 2012-11-15 09:03:07,980 Building dependency shell command for dependency 'lastz'
galaxy.tools WARNING 2012-11-15 09:03:07,981 Failed to resolve dependency on 'lastz', ignoring
galaxy.jobs.runners.drmaa DEBUG 2012-11-15 09:03:08,669 (254) submitting file /data/galaxy/galaxy-dist/database/pbs/galaxy_254.sh
galaxy.jobs.runners.drmaa DEBUG 2012-11-15 09:03:08,670 (254) command is: python /data/galaxy/galaxy-dist/tools/sr_mapping/lastz_wrapper.py --ref_source=cached --source_select=pre_set --out_format=sam --input2=/data/galaxy/galaxy-dist/database/files/027/dataset_27795.dat --input1="/data/galaxy/galaxy-dist/tool-data/shared/ucsc/hg18/seq/hg18.2bit" --ref_sequences="None" --pre_set_options=yasra98 --identity_min=0 --identity_max=100 --coverage=0 --output=/data/galaxy/galaxy-dist/database/job_working_directory/000/254/galaxy_dataset_27800.dat --unmask=yes --lastzSeqsFileDir=/data/galaxy/galaxy-dist/tool-data; cd /data/galaxy/galaxy-dist; /data/galaxy/galaxy-dist/set_metadata.sh ./database/files /data/galaxy/galaxy-dist/database/job_working_directory/000/254 . /data/galaxy/galaxy-dist/universe_wsgi.ini /data/galaxy/galaxy-dist/database/tmp/tmpL4MMJV /data/galaxy/galaxy-dist/database/job_working_directory/000/254/galaxy.json /data/galaxy/galaxy-dist/database/job_working_directory/000/254/metadata_in_HistoryDatasetAssociation_288_9ynzEN,/data/galaxy/galaxy-dist/database/job_working_directory/000/254/metadata_kwds_HistoryDatasetAssociation_288_AEIseO,/data/galaxy/galaxy-dist/database/job_working_directory/000/254/metadata_out_HistoryDatasetAssociation_288_RhsufT,/data/galaxy/galaxy-dist/database/job_working_directory/000/254/metadata_results_HistoryDatasetAssociation_288_86DYsR,/data/galaxy/galaxy-dist/database/job_working_directory/000/254/galaxy_dataset_27800.dat,/data/galaxy/galaxy-dist/database/job_working_directory/000/254/metadata_override_HistoryDatasetAssociation_288_HMRNEG
galaxy.jobs.runners.drmaa DEBUG 2012-11-15 09:03:08,676 run as user ['kuntzagk', '600']
galaxy.jobs DEBUG 2012-11-15 09:03:08,676 (254) Changing ownership of working directory with: /usr/bin/sudo -E scripts/external_chown_script.py /data/galaxy/galaxy-dist/database/job_working_directory/000/254 kuntzagk 600
galaxy.jobs.runners.drmaa DEBUG 2012-11-15 09:03:09,148 (254) Job script for external submission is: /data/galaxy/galaxy-dist/database/pbs/254.jt_json
141.80.188.178 - - [15/Nov/2012:09:03:10 +0200] "POST /galaxy/root/history_item_updates HTTP/1.1" 200 - "http://bbc.mdc-berlin.de/galaxy/history" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:16.0) Gecko/20100101 Firefox/16.0"
galaxy.jobs.runners.drmaa INFO 2012-11-15 09:03:11,553 (254) queued as 1282605
galaxy.jobs.runners.drmaa DEBUG 2012-11-15 09:03:11,767 (254/1282605) state change: job is running
--
Andreas Kuntzagk
SystemAdministrator
Berlin Institute for Medical Systems Biology at the Max-Delbrueck-Center for Molecular Medicine
Robert-Roessle-Str. 10, 13125 Berlin, Germany
http://www.mdc-berlin.de/en/bimsb/BIMSB_groups/Dieterich
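The log above reports "Failed to resolve dependency on 'lastz', ignoring", and the job is submitted to run as a different user on a cluster node, so the PATH seen by the job may not be the PATH of an interactive shell. A minimal sketch of checking this from inside the job environment follows. The helper name `resolve_tool` is hypothetical and is not part of Galaxy or of lastz_wrapper.py; it only wraps the standard library's `shutil.which`.

```python
import os
import shutil


def resolve_tool(name, search_path=None):
    """Return the path at which `name` is found, or None if it is not found.

    `search_path` is a PATH-style string; when it is None, the PATH of the
    current process is used, which is what a job on a cluster node would see.
    """
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    # Print what this process actually sees, not what a login shell sees.
    print("PATH =", os.environ.get("PATH", ""))
    found = resolve_tool("lastz")
    print("lastz ->", found if found else "not found on this PATH")
```

Running a script like this through the same queue and user as the Galaxy job would show whether lastz is visible there; it does not say anything about memory limits.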
Hi,
I just noticed that this was another case of a tool that needs more than the 1 GB of memory that is the default on our cluster. After adjusting the job_runner settings everything seems fine.
regards, Andreas
On 15.11.2012 09:08, Andreas Kuntzagk wrote:
when I start a lastz job it submits a job to GridEngine. But this job seems to hang. [...]
Howdy, Andreas,
Lastz memory requirements are dependent on the size of the input sequences (mainly on the size of the reference sequence) and, to a lesser extent, the genome's repeat content.
I'm a little confused/concerned by how this failure was indicated. When run from a console, if lastz has a memory allocation failure, a message is written to stderr (e.g. "call to realloc failed to allocate 1,000,000,000 bytes") and the program exits to the shell with the status EXIT_FAILURE (an ISO C definition which I presume corresponds to a standard shell error code).
Usually, if lastz isn't going to have enough memory, an allocation failure will occur early in the run as all the long term data structures are being built. This would normally be within the first five minutes. A later failure would (probably) mean that the long term structures were close to the memory limit and then alignments for one of the query sequences required enough additional memory to push us over the limit.
I assume Galaxy's "Job did not produce output" message must be based on a lack of any output to stdout. What is strange is the fact that it took 2 days to get this message. Lack of output to stdout suggests that the failure occurred before any queries were processed (strictly speaking, before any alignments were output). This should have occurred in the first few minutes of the run.
Would it be possible for you to point me at the input sequences for this run, so that I can try running this via the console, and see if there's something happening in lastz that I don't understand?
Bob H
On Nov 26, 2012, at 3:35 AM, Andreas Kuntzagk wrote:
I just noticed that this was another case of a tool that needs more than the 1 GB of memory that is the default on our cluster. [...]
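The failure signalling described in this message (a message on stderr plus a nonzero exit status) can be checked by whatever process launches the tool. The following is a minimal sketch of that check, not the code of lastz_wrapper.py. The function `run_and_check` and the stand-in child `FAILING_CHILD` are hypothetical; the child is a small Python process that imitates an allocation failure, since the real inputs are not available here. On POSIX systems EXIT_FAILURE is typically 1, which is what the stand-in uses.

```python
import subprocess
import sys


def run_and_check(cmd):
    """Run `cmd`, wait for it to finish, and raise if it exits nonzero.

    On success the captured stdout is returned. On failure the exit status
    and the captured stderr are included in the exception message, so the
    caller sees the error immediately instead of waiting for output.
    """
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    if proc.returncode != 0:
        raise RuntimeError(
            "command exited with status %d: %s"
            % (proc.returncode, proc.stderr.strip())
        )
    return proc.stdout


# Hypothetical stand-in for a tool that fails to allocate memory:
# it writes a message to stderr and exits with status 1.
FAILING_CHILD = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('call to realloc failed\\n'); sys.exit(1)",
]
```

Calling `run_and_check(FAILING_CHILD)` raises a `RuntimeError` that carries both the status and the stderr text. A launcher that only looks at stdout would see nothing at all in this case.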
Hi,
What I remember is that while lastz had died, the Python wrapper was still hanging around for these 2 days. Maybe it's related to the built-in scheduler of the lastz wrapper. Unfortunately I don't have time now to reproduce the error conditions.
regards, Andreas
On 26.11.2012 15:47, Bob Harris wrote:
[...] What is strange is the fact that it took 2 days to get this message. [...]
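One way a wrapper can outlive a dead child is a worker-queue design in which a failed task is never marked as finished, so the final wait on the queue never returns. Whether lastz_wrapper.py works this way was not verified in this thread, so the following is only a sketch under that assumption. The function `run_jobs` is hypothetical. It uses the standard `queue` and `threading` modules and marks every task as done in a `finally` clause, so a failing or unstartable child cannot block the wait.

```python
import queue
import subprocess
import threading


def run_jobs(commands, num_workers=2):
    """Run each command in a worker thread and return (cmd, returncode) pairs.

    A returncode of None means the command could not be started at all.
    """
    jobs = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            cmd = jobs.get()
            try:
                if cmd is None:  # sentinel: no more work for this worker
                    return
                try:
                    code = subprocess.call(cmd)
                except OSError:
                    code = None  # e.g. executable not found
                with lock:
                    results.append((cmd, code))
            finally:
                # Always mark the item as processed, even on failure,
                # so that jobs.join() below cannot wait forever.
                jobs.task_done()

    threads = [
        threading.Thread(target=worker, daemon=True) for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for cmd in commands:
        jobs.put(cmd)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    jobs.join()
    return results
```

The caller can then inspect the returned status codes and exit nonzero itself if any child failed, which lets the cluster scheduler and Galaxy see the failure promptly.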
participants (2)
- Andreas Kuntzagk
- Bob Harris