Greetings,
I'm unfortunately having problems with segfaults on my server. The server runs CentOS with Python 2.4.3. Since installing and setting up Galaxy, a growing number of segfaults have been thrown by multiple processes. Originally, 2 distinct tools (run by Galaxy) threw segfaults, *and* Galaxy's own Python process was segfaulting as well. This is happening with increasing frequency. Most recently, Python segfaults (and Galaxy dies) before we can even run any tools. Example from syslog:
May 18 13:29:12 florence kernel: python[10657]: segfault at 00002aab33ff1000 rip 00000033a427bf0b rsp 000000004942ac38 error 6
Notes:
1) We ran a full spectrum of hardware diagnostics, which found no problems.
2) This morning, while troubleshooting, I rebooted the server. Galaxy started as a service and *threw another segfault shortly after instantiating*, long before anyone could have interacted with the server. This leads me to believe the problem is Python related.
I plan to upgrade Python, possibly go through some core dumps, etc. However, before going any further, I wanted to see if anyone has had experience with a similar problem. Any information is appreciated.
Thanks very much,
Chris Zaleski
CSHL
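For readers trying to interpret a kernel line like the one quoted above, here is a minimal sketch that parses it and decodes the trailing error number. It assumes the x86 page-fault error-code layout (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode) and that the kernel prints the number in hexadecimal. The function name `decode_segfault` and the regular expression are illustrative, not part of Galaxy or any library.

```python
import re

# Matches the 64-bit kernel segfault line format quoted in the message above.
# Assumption: other kernel versions may word this line differently.
SEGFAULT_RE = re.compile(
    r"(?P<prog>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Return a dict describing the fault, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    # Assumption: the error code is printed in hex (same result for values < 10).
    err = int(m.group("err"), 16)
    return {
        "program": m.group("prog"),
        "pid": int(m.group("pid")),
        "fault_address": int(m.group("addr"), 16),
        "instruction_pointer": int(m.group("rip"), 16),
        "stack_pointer": int(m.group("rsp"), 16),
        # Bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if err & 1 else "page not present",
        # Bit 1: 0 = read access, 1 = write access
        "access": "write" if err & 2 else "read",
        # Bit 2: 0 = kernel mode, 1 = user mode
        "mode": "user" if err & 4 else "kernel",
    }


if __name__ == "__main__":
    line = ("May 18 13:29:12 florence kernel: python[10657]: segfault at "
            "00002aab33ff1000 rip 00000033a427bf0b rsp 000000004942ac38 error 6")
    print(decode_segfault(line))
```

Under these assumptions, "error 6" reads as a user-mode write to a page that was not mapped. That describes the faulting access only; it does not identify the root cause.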
Hi Chris,
Although Python has segfaulted on me before, it's never done it with this sort of frequency. I'd be interested to see what happens after upgrading Python (have you checked closed bug reports for this release on CentOS to see if there are any hints there?).
--nate
galaxy-dev mailing list galaxy-dev@lists.bx.psu.edu http://lists.bx.psu.edu/listinfo/galaxy-dev
Hello Nate,
Yeah, the day I wrote this, it was dying horribly all over the place. However, the first thing I tried was switching Galaxy to Python 2.6.4, and it has been stable ever since! Unfortunately, I wasn't able to dig any further into what the root cause was; since the change, we've just been rolling forward. But at least it's good to know the Python upgrade did the trick.
Thanks,
Chris
On Thu, May 20, 2010 at 3:06 PM, Nate Coraor nate@bx.psu.edu wrote:
Hi Chris,
Although Python has segfaulted on me before, it's never done it with this sort of frequency. I'd be interested to see what happens after upgrading Python (have you checked closed bug reports for this release on CentOS to see if there are any hints there?).
--nate
Great to hear, Chris.
For the record, Python 2.4 is supported by Galaxy and has been used with success. I suspect this was OS related, but if anyone experiences similar issues with Python 2.4, they should be reported.
--nate
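For anyone who does report a similar issue, here is a minimal sketch of collecting the interpreter and OS details that came up in this thread. The function name `environment_report` and the choice of fields are illustrative assumptions, not a format required by the Galaxy project.

```python
import platform
import sys


def environment_report():
    """Return a short text summary of the running Python and host platform."""
    return "\n".join([
        "Python version: %s" % platform.python_version(),
        "Python executable: %s" % sys.executable,
        "Platform: %s" % platform.platform(),
        "Machine: %s" % platform.machine(),
    ])


if __name__ == "__main__":
    print(environment_report())
```

Run it with the same interpreter that starts Galaxy, so the reported version matches the process that is actually crashing.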
galaxy-dev@lists.galaxyproject.org