Re: [galaxy-dev] Server stops itself

9 Jan 2013

Hello All,

After more research, I found that the crash of the Galaxy server is 
caused by stopping jobs. We are working with our own SGE cluster.

It's strange, because we can kill jobs from the history or the 
administration panel without any problem.

In paster.log, this is the only message we get before the server crashes:
...
galaxy.jobs.handler DEBUG 2013-01-08 16:52:39,877 Stopping job 3065:
galaxy.jobs.handler DEBUG 2013-01-08 16:52:39,877 stopping job 3065 in drmaa runner

I think this problem occurs when there are many jobs in the "running", 
"new" and "queued" states.
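
To check which jobs were being stopped just before a crash, something like 
the following could be run over paster.log. It is only a sketch: the pattern 
is inferred from the two log lines quoted above and may not match other 
Galaxy versions, and the names STOP_RE and stopped_jobs are made up for this 
example. It is written for Python 3.

```python
import re

# Pattern inferred from the two log lines quoted above (assumption):
# logger name, level, timestamp, then "Stopping job <id>" or "stopping job <id>".
STOP_RE = re.compile(
    r"^galaxy\.jobs\.handler DEBUG "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"[Ss]topping job (?P<job_id>\d+)"
)


def stopped_jobs(lines):
    """Return (timestamp, job_id) pairs for each stop message, in log order."""
    hits = []
    for line in lines:
        match = STOP_RE.match(line.strip())
        if match:
            hits.append((match.group("ts"), int(match.group("job_id"))))
    return hits


# The two messages from the log excerpt, with the mail line wrap removed.
sample = [
    "galaxy.jobs.handler DEBUG 2013-01-08 16:52:39,877 Stopping job 3065:",
    "galaxy.jobs.handler DEBUG 2013-01-08 16:52:39,877 stopping job 3065 in drmaa runner",
]

for timestamp, job_id in stopped_jobs(sample):
    print(timestamp, job_id)
```

On a real log, the open file object can be passed directly to 
stopped_jobs(), since it only iterates over lines.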

Cheers,
Cyril

On 01/08/2013 04:11 PM, MONJEAUD wrote:
...
Hello All,
I'm trying to deploy my instance of Galaxy in production. Some tests 
we've done show that when the number of people connected is high (more 
than 20 at the same time), the server stops itself.
Sometimes I get this error in paster.log:
...
Exception happened during processing of request from ('127.0.0.1', 60575)
Traceback (most recent call last):
  File "/opt/galaxy-dist/eggs/Paste-1.6-py2.7.egg/paste/httpserver.py", line 1053, in process_request_in_thread
    self.finish_request(request, client_address)
  File "/local/python/2.7-bis/lib/python2.7/SocketServer.py", line 323, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/local/python/2.7-bis/lib/python2.7/SocketServer.py", line 641, in __init__
    self.finish()
  File "/local/python/2.7-bis/lib/python2.7/SocketServer.py", line 694, in finish
    self.wfile.flush()
  File "/local/python/2.7-bis/lib/python2.7/socket.py", line 301, in flush
    self._sock.sendall(view[write_offset:write_offset+buffer_size])
error: [Errno 32] Broken pipe
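
As far as I understand, Errno 32 (EPIPE) is raised when a process writes to 
a connection that the other side has already closed, so this traceback may 
only mean that a client went away before the response was flushed. Below is 
a minimal sketch that triggers the same error. It assumes Python 3 on Linux 
and uses a local socket pair from the standard library, not Galaxy or Paste; 
the function name is made up for the example.

```python
import errno
import socket


def write_after_peer_closed():
    """Send on a socket whose peer is already closed; return the errno raised."""
    server_side, client_side = socket.socketpair()
    client_side.close()  # the "client" disconnects before the reply is sent
    try:
        server_side.sendall(b"HTTP/1.0 200 OK\r\n\r\n")
    except OSError as exc:
        # Python ignores SIGPIPE at startup, so the failure surfaces as an
        # exception carrying the errno instead of killing the process.
        return exc.errno
    finally:
        server_side.close()
    return None


print(write_after_peer_closed() == errno.EPIPE)
```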
Do you have any ideas about this and how to resolve it?
Cheers!!
Cyril
-- 

Cyril Monjeaud
Equipe Symbiose / Plate-forme GenOuest
Bureau D156
IRISA-INRIA, Campus de Beaulieu
35042 Rennes cedex, France
Tél: +33 (0) 2 99 84 74 17