Unsorted BAM file --> Galaxy crash?
Dear all,

We are experiencing a crash of our Galaxy instance (latest git log is early January 2016) and it seems unable to restart. The last entries in paster.log say the following:

galaxy.web.framework.base DEBUG 2016-05-24 17:46:42,593 Enabling 'histories' API controller, class: HistoriesController
galaxy.web.framework.base DEBUG 2016-05-24 17:46:42,622 Enabling 'history_contents' API controller, class: HistoryContentsController
[bam_index_core] the alignment is not sorted (XXX): 21-th chr > 11-th chr
[bam_index_build2] fail to index the BAM file.
galaxy.web.framework.base DEBUG 2016-05-24 17:46:42,632 Enabling 'history_content_tags' API controller, class: HistoryContentTagsController
[bam_index_core] the alignment is not sorted (XXX): 24-th chr > 9-th chr
[bam_index_build2] fail to index the BAM file.
galaxy.datatypes.metadata DEBUG 2016-05-24 17:46:42,638 setting metadata externally failed for HistoryDatasetAssociation 442: External set_meta() not called
galaxy.jobs.runners.slurm WARNING 2016-05-24 17:46:42,642 (313/189770) Job not found, assuming job check exceeded MinJobAge and completing as successful

So, the system cannot start and if we visit the site, it says "503 Service Temporarily Unavailable".

The log file seems to indicate some problem with the sorting of the BAM file. The problem looks a bit like something from 9 months ago (https://www.biostars.org/p/159336/), but the messages I've seen do not indicate that the problem prevents Galaxy from starting up.

I would have thought that if there is a problem with the input file, the job would fail but not prevent Galaxy from starting up again. So, I'm wondering if that is the problem, or maybe it's the line after it about "setting metadata externally failed".

Has anyone seen this problem before or might have an idea about what to do about it?

Thank you in advance!

Ray
Hi Raymond,

In case you didn't know, Galaxy likes to keep all BAM files in coordinate sorted order with a BAI index (using samtools). Somehow this is failing on your system (perhaps a suitable version of samtools is not on the $PATH).

I'm sure a Galaxy expert can and will comment, but it looks like while trying to sort the BAM file in order to index it, something fails on the cluster side. This means the BAI index does not exist, so Galaxy cannot access the BAM file metadata. I'm still surprised if this is why Galaxy is not starting up - but perhaps you've found a new bug?

If no one else has any suggestions and this is urgent, I would try manually sorting and indexing the problem BAM file(s) outside of Galaxy using samtools, and see if after that Galaxy can restart?

Peter

___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: https://lists.galaxyproject.org/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
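The manual sort-and-index step suggested in the reply above could look roughly like the sketch below. It rests on assumptions that are not in the thread: samtools 1.x option syntax (`sort -T ... -o ...`), and `input.bam` as a placeholder file name. With `DRY_RUN=1` the hypothetical helper `sort_and_index` only prints the commands it would run, which is a safe way to check them first.

```shell
#!/bin/sh
# Sketch: coordinate-sort a BAM file and build its BAI index outside Galaxy.
# Assumes samtools 1.x; file names are placeholders, not paths from the thread.

sort_and_index() {
    in_bam=$1
    # Default output name: input.bam -> input.sorted.bam
    out_bam=${2:-${in_bam%.bam}.sorted.bam}

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only show what would be executed.
        echo "samtools sort -T ${out_bam%.bam}.tmp -o $out_bam $in_bam"
        echo "samtools index $out_bam"
        return 0
    fi

    # -T sets the prefix for temporary files, -o names the sorted output.
    samtools sort -T "${out_bam%.bam}.tmp" -o "$out_bam" "$in_bam" &&
        # Writes the index next to the sorted file as <out_bam>.bai.
        samtools index "$out_bam"
}

# Example (dry run): DRY_RUN=1 sort_and_index input.bam
```

Whether a sorted copy can simply replace the file in Galaxy's dataset store is a separate question that this sketch does not answer.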
Hi Peter,

On Tue, May 24, 2016 at 6:29 PM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
In case you didn't know, Galaxy likes to keep all BAM files in coordinate sorted order with a BAI index (using samtools). Somehow this is failing on your system (perhaps a suitable version of samtools is not on the $PATH).
No, I wasn't aware of this. Actually, I was looking just now and galaxy/database/sw_dir/samtools/ looks like this:

$ ls
0.1.16 0.1.18 0.1.19 1.1 1.2

So, it seems like we have several versions installed. (I don't know why -- I presume various tools pulled in different dependencies? I didn't know it looked like this until I went searching.) However, we have command-line access and /usr/bin/samtools *is* 0.1.18.

I can get back to the user and ask him which tool he was using at the time of the failure. Is it difficult to "force" the version of samtools that the tool uses? I don't know how tools are installed, but if it's a matter of editing XML files, then maybe we can do that.
I'm sure a Galaxy expert can and will comment, but it looks like while trying to sort the BAM file in order to index it, something fails on the cluster side. This means the BAI index does not exist, so Galaxy cannot access the BAM file metadata. I'm still surprised if this is why Galaxy is not starting up - but perhaps you've found a new bug?
So the instance we are using is from January. As the system is "in production", we are reluctant to update it from GitHub. Of course, this is preventing the system from starting, so if a recent update fixes this problem, we'll be more inclined to do a pull.
If no one else has any suggestions and this is urgent, I would try manually sorting and indexing the problem BAM file(s) outside of Galaxy using samtools, and see if after that Galaxy can restart?
Ok -- that is worth considering. Can I find the offending file, sort it (via the command-line) and replace it? I'm hesitant to do this since I'm worried it will negatively affect the database (i.e., corrupting it, if it stores some kind of md5 signature of files). But if there is no harm, then we will give that a try.

Thanks for the suggestion!

Ray
Galaxy has a complex tool dependency mechanism which allows for user-facing tools to call a specific version of an installed command line tool. However, for the metadata processing, I believe Galaxy itself just uses whichever samtools is on the $PATH (and ignores all the clever Galaxy specific tool dependency stuff).

Peter
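If the metadata step really does use whichever samtools comes first on the $PATH, it helps to see every candidate in search order. The following is a minimal sketch; `list_on_path` is a hypothetical helper, not part of Galaxy or samtools, and it should be run as the account that starts Galaxy so that the same PATH is inspected.

```shell
#!/bin/sh
# Sketch: list every executable called "samtools" found on PATH, in search order.
# The first line printed is the one a plain "samtools" call would run.

list_on_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            echo "$dir/$name"
        fi
    done
    IFS=$old_ifs
}

list_on_path samtools
```

An empty result means no samtools is visible on that PATH at all.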
Hello Ray

One of my colleagues encountered a similar-sounding error on our local test instance, where an unsorted BAM file seemed to crash the Galaxy handler processes and prevented them from restarting.

In our case the default samtools version in the Galaxy environment was 0.1.18; we found that updating this to samtools 1.2 fixed the crash problem.

Hope this helps,

Best wishes

Peter
-- Peter Briggs peter.briggs@manchester.ac.uk Bioinformatics Core Facility University of Manchester B.1083 Michael Smith Bldg Tel: (0161) 2751482
Hi Peter,

On Tue, May 24, 2016 at 6:29 PM, Peter Briggs <peter.briggs@manchester.ac.uk> wrote:
One of my colleagues encountered a similar-sounding error on our local test instance, where an unsorted BAM file seemed to crash the Galaxy handler processes and prevented them from restarting.
In our case the default samtools version in the Galaxy environment was 0.1.18; we found that updating this to samtools 1.2 fixed the crash problem.
Ah! That does sound familiar. As I just replied in another message, we seem to have at least 5 versions of samtools in the system. I'm not sure if that is normal or not for a Galaxy instance...

Let me see if I can figure out (through logs, but most likely by asking the user) what tool he was using and figure out which samtools executable was being run. If it is 0.1.18, I'll look into updating it... thank you for sharing your colleague's experience!

Ray
Hello Ray

Just to clarify, for us it wasn't dependent on the tool - it was actually the version of samtools installed in the 'galaxy' user's environment i.e. on the 'galaxy' user's PATH.

I set it explicitly by creating a file called "local_env.sh" in the 'config' directory of the Galaxy installation with the following content:

$ cat local_env.sh
# Prepend samtools 1.2
export PATH=$HOME/apps/samtools/1.2/bin:$PATH
##
#

which is automatically picked up by 'run.sh' when Galaxy is started (I don't know how that plays if you're using a uwsgi/supervisor setup).

Best wishes

Peter
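A slightly more defensive variant of the local_env.sh idea is sketched below. It is only an illustration: `prepend_path` and the `SAMTOOLS_BIN` variable are hypothetical names, and the default directory is simply the one quoted above, so it has to be adjusted to wherever samtools 1.2 actually lives. The directory is prepended only if it exists and is not already on the PATH, so sourcing the file twice does no harm.

```shell
# Sketch of a more defensive config/local_env.sh.
# SAMTOOLS_BIN defaults to the path quoted in the thread; adjust to your install.
SAMTOOLS_BIN=${SAMTOOLS_BIN:-$HOME/apps/samtools/1.2/bin}

prepend_path() {
    # Prepend $1 to PATH only if it is a directory and not already listed.
    case ":$PATH:" in
        *":$1:"*) ;;                          # already present, do nothing
        *) [ -d "$1" ] && PATH=$1:$PATH ;;    # skip silently if missing
    esac
    export PATH
}

prepend_path "$SAMTOOLS_BIN"
```

After Galaxy is restarted, `command -v samtools` run in the same environment should point into that directory.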
To follow up on what Peter says:

It is the code in https://github.com/galaxyproject/galaxy/blob/dev/lib/galaxy/datatypes/binary... from line 182:

class Bam( Binary ):

This code (set_meta) is called after the tool has finished running, while it is copying the resulting data into the Galaxy database. If the BAM file is unsorted or incorrectly sorted, any tool fails. But if the _get_samtools_version method found the early version of samtools, it crashed our Galaxy.

2 issues here.

1. Galaxy will only handle sorted BAM files (regrettable but that is political)
2. While Galaxy has wheels (previously eggs) for all other dependencies, it relies on locally installed samtools.

Can this really not be fixed???????

Christian
University of Manchester
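Since the crash depended on which samtools version was found, a quick pre-flight check before starting Galaxy may be useful. The sketch below is not Galaxy's own logic. It assumes that running `samtools` with no arguments prints a usage message containing a `Version:` line (true of the 0.1.x and 1.x builds I am aware of), that `sort -V` is available, and that 1.2 is an acceptable minimum because that is the version reported to work in this thread. All function names are hypothetical.

```shell
#!/bin/sh
# Sketch: report whether the samtools on PATH is older than a chosen minimum.

version_lt() {
    # True if $1 sorts strictly before $2 as a version string (needs sort -V).
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

samtools_version() {
    # Pull the version out of the "Version: x.y.z ..." line of the usage text.
    samtools 2>&1 | sed -n 's/^Version: *\([^ ]*\).*/\1/p' | head -n 1
}

check_samtools() {
    min=${1:-1.2}
    found=$(samtools_version)
    if [ -z "$found" ]; then
        echo "samtools not found or version not recognised"
        return 2
    fi
    if version_lt "$found" "$min"; then
        echo "samtools $found is older than $min"
        return 1
    fi
    echo "samtools $found is OK"
}

# Example: check_samtools 1.2
```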
Hello Peter and everyone,

Just wanted to update all of you that replacing samtools with 1.2 solved the problem. As you suggested, we didn't replace the system-installed one at /usr/bin (though I suppose we could have) but modified the environment variable.

I don't know the details of the recovery since I look after Galaxy and not the overall system (i.e., I'm not the root user) but it sounds like the administrators didn't need to perform any job deletions and/or Galaxy database editing. Once samtools was replaced, the system managed to restart correctly.

Honestly, we never would have guessed that using an old version of samtools would cause Galaxy to crash. Even with the log files alluding to it, we were initially in disbelief.

And thanks to Christian for pointing out the main cause of the problem in the source code. We had thought (hoped?) that Galaxy was a self-contained system. So, this experience helped us better understand how Galaxy works.

Thank you all for your prompt help!

Ray
participants (4)
- Christian Brenninkmeijer
- Peter Briggs
- Peter Cock
- Raymond Wan