Status on importing BAM file into Library does not update
I'm adding Data Libraries to my local Galaxy instance by importing directories that contain BAM and BAI files. I see the bam/bai files get added on the admin page and the message is "This job is running". qstat shows the job run and complete, and runner0.log registers the PBS job as completed successfully, but the web page never updates. I tried refreshing the page by navigating away from it and back, but it still reads "This job is running". How do I fix this?
On Wed, Jan 4, 2012 at 5:17 PM, Ryan Golhar <ngsbioinformatics@gmail.com> wrote:

I'm adding Data Libraries to my local galaxy instance. [...]
Some more information... I checked my head node and I see samtools running there. It's running 'samtools index'. So, two problems:
1) samtools is not using the cluster. I assume this is a configuration setting somewhere.
2) Why is Galaxy trying to index the BAM files if the .bai files exist in the same directory as the BAM files? The BAM files are sorted and have 'SO:coordinate'. I also have samtools-0.1.18 installed.
On Wed, Jan 4, 2012 at 5:17 PM, Ryan Golhar <ngsbioinformatics@gmail.com> wrote:

[...] So two problems:
1) samtools is not using the cluster. I assume this is a configuration setting somewhere.
2) Why is galaxy trying to index the bam files if the bai files exist in the same directory as the bam file? The BAM files are sorted and have 'SO:coordinate'. I also have samtools-0.1.18 installed.
It also appears:
3) Galaxy is unable to import .bai files. It says there was an error importing these files: "The uploaded binary file contains inappropriate content".
4) Galaxy is trying to change the permissions on the files I'm importing (as links). Thankfully the data tree is read-only. If I'm linking Galaxy to my data, why does Galaxy want to change the permissions? This seems like something it shouldn't be doing, i.e. Galaxy should leave external data alone.
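For reference, the check the poster expected Galaxy to perform, "is there a pre-created index sitting next to each BAM?", can be sketched in a few lines of Python. The helper name and file layout below are illustrative only; this is not Galaxy code:

```python
import os
import tempfile

def has_precreated_index(bam_path):
    """Return True if a .bai index already sits beside the BAM file.

    samtools accepts either sample.bam.bai or sample.bai, so check both.
    Illustrative helper, not part of Galaxy.
    """
    root, _ = os.path.splitext(bam_path)
    return os.path.exists(bam_path + ".bai") or os.path.exists(root + ".bai")

# Demonstration with empty placeholder files
tmpdir = tempfile.mkdtemp()
bam = os.path.join(tmpdir, "sample.bam")
for path in (bam, bam + ".bai"):
    open(path, "w").close()
print(has_precreated_index(bam))  # True
```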
On Jan 4, 2012, at 6:44 PM, Ryan Golhar wrote:
On Wed, Jan 4, 2012 at 5:17 PM, Ryan Golhar <ngsbioinformatics@gmail.com> wrote:

I'm adding Data Libraries to my local galaxy instance. [...]
Some more information... I checked my head node and I see samtools running there. It's running 'samtools index'. So, two problems:
1) samtools is not using the cluster. I assume this is a configuration setting somewhere.
See set_metadata_externally in universe_wsgi.ini. This should be set to True to run on the cluster. If you haven't seen the rest of the production server documentation, see http://usegalaxy.org/production
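For readers following along, the setting Nate refers to is a single line in the main config file (file layout varies by install; the comment is mine):

```ini
; universe_wsgi.ini -- run metadata setting (e.g. BAM indexing) as a
; separate job instead of inside the runner process
set_metadata_externally = True
```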
2) Why is Galaxy trying to index the BAM files if the .bai files exist in the same directory as the BAM file? The BAM files are sorted and have 'SO:coordinate'. I also have samtools-0.1.18 installed.
Galaxy does not yet have a method to upload BAM files with a precreated .bai.
It also appears:
3) Galaxy is unable to import .bai files. It says there was an error importing these files "The uploaded binary file contains inappropriate content"
See #2. Galaxy will always create its own .bai.
4) Galaxy is trying to change the permissions on the files I'm importing (as links). Thankfully the data tree is read-only. If I'm linking Galaxy to my data, why does Galaxy want to change the permissions? This seems like something it shouldn't be doing, i.e. Galaxy should leave external data alone.
Hrm, this is not good. I'll have a look at this. --nate
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
On Jan 4, 2012, at 6:44 PM, Ryan Golhar wrote:

[...]
See set_metadata_externally in universe_wsgi.ini. This should be set to True to run on the cluster.
If you haven't seen the rest of the production server documentation, see http://usegalaxy.org/production
This is already set. I set this in universe_wsgi.ini (and universe_wsgi.webapp.ini and universe_wsgi.running.ini since I'm using a proxy server and load balancer on Apache). This was one of the first things I set up.
On Jan 5, 2012, at 11:29 AM, Ryan Golhar wrote:
[...]
This is already set. I set this in universe_wsgi.ini (and universe_wsgi.webapp.ini and universe_wsgi.running.ini since I'm using a proxy server and load balancer on Apache). This was one of the first things I set up.
Does the upload tool run on the cluster? See upload1 under [galaxy:tool_runners] in universe_wsgi.runner.ini. --nate
I set it to run on the cluster:

[galaxy@bic galaxy-dist]$ grep upload1 universe_wsgi.runner.ini
#upload1 = local:///

On Thu, Jan 5, 2012 at 11:33 AM, Nate Coraor <nate@bx.psu.edu> wrote:
[...]

Does the upload tool run on the cluster? See upload1 under [galaxy:tool_runners] in universe_wsgi.runner.ini.

--nate
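One thing worth noting about the grep output above: the leading '#' means the upload1 line is commented out, so the config parser never sees it and the tool falls back to the instance's default runner rather than an explicit one. A stand-alone sketch of that behavior (the config string below is illustrative, not Ryan's actual file):

```python
from configparser import ConfigParser

# Stand-in for the [galaxy:tool_runners] section. Lines starting with '#'
# are comments, so 'upload1' is simply absent from the parsed config.
ini_text = """\
[galaxy:tool_runners]
#upload1 = local:///
bowtie = pbs:///
"""

cfg = ConfigParser()
cfg.read_string(ini_text)
print(cfg.has_option("galaxy:tool_runners", "upload1"))  # False
print(cfg.get("galaxy:tool_runners", "bowtie"))          # pbs:///
```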
On Jan 5, 2012, at 11:41 AM, Ryan Golhar wrote:
I set it to run on the cluster:
[galaxy@bic galaxy-dist]$ grep upload1 universe_wsgi.runner.ini
#upload1 = local:///
Could you set use_heartbeat = True in the runner's config file and then check the resulting heartbeat log files created in the root directory to get a stack trace to the call of samtools?

Thanks,
--nate
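The debugging switch Nate refers to is also a one-liner; in this multi-process setup it goes in the runner's config file, and the heartbeat_* logs then appear in Galaxy's root directory (file name per this thread's setup; the comment is mine):

```ini
; universe_wsgi.runner.ini -- periodically dump a stack trace of every thread
use_heartbeat = True
```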
On Thu, Jan 5, 2012 at 11:59 AM, Nate Coraor <nate@bx.psu.edu> wrote:
On Jan 5, 2012, at 11:41 AM, Ryan Golhar wrote:
I set it to run on the cluster:
[galaxy@bic galaxy-dist]$ grep upload1 universe_wsgi.runner.ini
#upload1 = local:///
Could you set use_heartbeat = True in the runner's config file and then check the resulting heartbeat log files created in the root directory to get a stack trace to the call of samtools?
Thanks, --nate
Hi Nate,

I just tried importing another BAM file. I see the upload working on a compute node, but the indexing happens on the head node. 'samtools index' is never submitted to the cluster. Attached is a copy of the heartbeat log. It's 990K; hopefully it will go through.

Ryan
On Jan 5, 2012, at 2:48 PM, Ryan Golhar wrote:
[...]
Hi Nate,
I just tried importing another BAM file. I see the upload working on a compute node, but the indexing happens on the head node. 'samtools index' is never submitted to the cluster. Attached is a copy of the heartbeat log. It's 990K; hopefully it will go through.
Thread 1180711232, <Thread(Thread-8, started 1180711232)>:
  File "/share/apps/Python-2.6.7/lib/python2.6/threading.py", line 504, in __bootstrap
    self.__bootstrap_inner()
  File "/share/apps/Python-2.6.7/lib/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/share/apps/Python-2.6.7/lib/python2.6/threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/galaxy/galaxy-dist-9/lib/galaxy/jobs/runners/pbs.py", line 190, in run_next
    self.finish_job( obj )
  File "/home/galaxy/galaxy-dist-9/lib/galaxy/jobs/runners/pbs.py", line 514, in finish_job
    pbs_job_state.job_wrapper.finish( stdout, stderr )
  File "/home/galaxy/galaxy-dist-9/lib/galaxy/jobs/__init__.py", line 611, in finish
    dataset.set_meta( overwrite = False )
  File "/home/galaxy/galaxy-dist-9/lib/galaxy/model/__init__.py", line 886, in set_meta
    return self.datatype.set_meta( self, **kwd )
  File "/home/galaxy/galaxy-dist-9/lib/galaxy/datatypes/binary.py", line 173, in set_meta
    exit_code = proc.wait()
  File "/share/apps/Python-2.6.7/lib/python2.6/subprocess.py", line 1182, in wait
    pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
  File "/share/apps/Python-2.6.7/lib/python2.6/subprocess.py", line 455, in _eintr_retry_call
    return func(*args)

This indicates that set_meta is running locally, in the runner process. Can you make sure there's not a typo in your config? The other possibility is that external metadata setting failed and it's being retried internally (if that were true, you'd see messages indicating such in the server log).

--nate
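The bottom frames of that traceback correspond to a pattern like the one below: the runner thread launches the external indexer with subprocess and blocks on wait(), which is exactly why 'samtools index' shows up on the head node when metadata is set in-process. This is a sketch of the pattern, not Galaxy's actual code, and 'echo' stands in for samtools:

```python
import subprocess

def set_meta_in_process(command):
    """Launch an external indexer and block until it finishes.

    Mirrors the shape of the traceback above: Popen followed by wait()
    inside the runner process itself, so the child runs wherever the
    runner runs -- here, the head node.
    """
    proc = subprocess.Popen(command, shell=True,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    exit_code = proc.wait()  # runner thread blocks here
    return exit_code

print(set_meta_in_process("echo samtools index placeholder.bam"))  # 0
```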
Ryan

<heartbeat.log>
This indicates that set_meta is running locally, in the runner process. Can you make sure there's not a typo in your config? The other possibility is that external metadata setting failed and it's being retried internally (if that were true, you'd see messages indicating such in the server log).
I'm pretty sure there isn't a typo. Here is anything meta-related (with comment lines removed) in my universe_wsgi.*.ini files:

[galaxy@bic galaxy-dist]$ grep set_meta *.ini
universe_wsgi.ini:set_metadata_externally = True
universe_wsgi.runner.ini:set_metadata_externally = True
universe_wsgi.webapp.ini:set_metadata_externally = True

[galaxy@bic galaxy-dist]$ grep meta *.ini
universe_wsgi.ini:set_metadata_externally = True
universe_wsgi.runner.ini:set_metadata_externally = True
universe_wsgi.webapp.ini:set_metadata_externally = True

I just tried it again on some BAM files, and nothing comes up in /var/log/messages or /var/log/httpd/error_log. runner0.log also doesn't show anything except for the upload job being completed.
On Fri, Jan 6, 2012 at 12:55 PM, Ryan Golhar <ngsbioinformatics@gmail.com> wrote:
[...]
I'm still trying to track this one down. Can I add a debug output string to show what the value of set_metadata_externally is when it's read in? If so, where would I do this?
On Jan 9, 2012, at 2:38 PM, Ryan Golhar wrote:
On Fri, Jan 6, 2012 at 12:55 PM, Ryan Golhar <ngsbioinformatics@gmail.com> wrote:

[...]
I'm still trying to track this one down. Can I add a debug output string to show what the value of set_metadata_externally is when it's read in? If so, where would I do this?
Hi Ryan,

You could check it in lib/galaxy/config.py, after it's read. By any chance, are you using galaxy-central vs. galaxy-dist? It's possible that due to a bug I recently fixed and a certain combination of options, metadata for BAMs would always fail externally and be retried internally, although you should still see log messages indicating that this has happened.

--nate
participants (2)
- Nate Coraor
- Ryan Golhar