Hi All,

I'm still having this issue despite several attempts to resolve it. I've booted it on an 80GB VM; there are no users on it and only 1 or 2 tools installed from the tool shed. I have loaded around 150 fasta.gz files into a couple of data libraries, which are on an NFS share.

When Galaxy starts it has a 57GB RAM footprint. If I leave it and do nothing, around 5 mins after I start Galaxy something kicks in and starts consuming all the RAM, and then it segfaults.

root@galaxy:~# top
top - 10:01:34 up 20 min, 2 users, load average: 0.84, 0.49, 0.43
Tasks: 180 total, 1 running, 179 sleeping, 0 stopped, 0 zombie
%Cpu(s): 12.4 us, 0.2 sy, 0.0 ni, 87.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.3 st
KiB Mem: 81295232 total, 58937820 used, 22357408 free, 13508 buffers
KiB Swap: 8640508 total, 69940 used, 8570568 free. 86132 cached Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2867 galaxy 20 0 57.856g 0.054t 11460 S 101.6 71.4 1:38.15 python

This is what I get in syslog when it crashes like this:

Jul 20 09:50:25 galaxy kernel: [ 569.351158] show_signal_msg: 18 callbacks suppressed
Jul 20 09:50:25 galaxy kernel: [ 569.351168] python[1883]: segfault at 24 ip 0000000000558077 sp 00007fc5cb9e6400 error 6 in python2.7[400000+2bc000]
Jul 20 09:50:25 galaxy kernel: [ 569.444890] Core dump to |/usr/share/apport/apport 1409 11 0 1409 pipe failed

If there isn't sufficient memory in the first place (i.e. less than 57GB), I get something more like this:

Jul 16 20:36:41 galaxy kernel: [ 117.123921] Out of memory: Kill process 1390 (python) score 986 or sacrifice child
Jul 16 20:36:41 galaxy kernel: [ 117.124087] Killed process 1390 (python) total-vm:43496348kB, anon-rss:32611892kB, file-rss:1800kB (END)

I can't see anything in the paster.log. I'm at a bit of a loss where to look for what is causing it.

Any help would be greatly appreciated.

Many thanks,

Martin

On 07/16/2015 08:48 PM, Martin Vickers [mjv08] wrote:
Hi Nate,
Thanks for the reply. In syslog I'm getting:
Jul 16 20:36:41 galaxy kernel: [ 117.123921] Out of memory: Kill process 1390 (python) score 986 or sacrifice child
Jul 16 20:36:41 galaxy kernel: [ 117.124087] Killed process 1390 (python) total-vm:43496348kB, anon-rss:32611892kB, file-rss:1800kB (END)
It's a 32GB VM. I could increase it but I wouldn't expect 32GB to be too little. I've attached the full syslog.
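For reference, the kB figures in the "Killed process" line above are easier to compare against the VM size once converted to GiB. The following is only a minimal sketch: the helper name `oom_sizes_gib` and the regular expression are illustrative assumptions, not part of Galaxy or of any kernel tooling, and the input is simply the line quoted above.

```python
import re

# Kernel OOM-killer line copied from the syslog excerpt above.
OOM_LINE = (
    "Killed process 1390 (python) total-vm:43496348kB, "
    "anon-rss:32611892kB, file-rss:1800kB"
)


def oom_sizes_gib(line):
    """Return {field: size in GiB} for every 'name:<n>kB' field in the line.

    Hypothetical helper for illustration only.
    """
    fields = re.findall(r"([\w-]+):(\d+)kB", line)
    # 1 GiB = 1024 * 1024 kB
    return {name: float(kb) / (1024 * 1024) for name, kb in fields}


sizes = oom_sizes_gib(OOM_LINE)
for name in ("total-vm", "anon-rss", "file-rss"):
    print("{}: {:.2f} GiB".format(name, sizes[name]))
```

The anon-rss value works out to roughly 31.1 GiB, so the python process was holding close to all of the RAM on a 32GB VM when the kernel killed it.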
Dr. Martin Vickers
Data Manager/HPC Systems Administrator Institute of Biological, Environmental and Rural Sciences IBERS New Building Aberystwyth University
w: http://www.martin-vickers.co.uk/ e: mjv08@aber.ac.uk t: 01970 62 2807
-------------------------
*From:* Nate Coraor <nate@bx.psu.edu>
*Sent:* 16 July 2015 04:36 PM
*To:* Martin Vickers [mjv08]
*Cc:* galaxy-dev@lists.galaxyproject.org
*Subject:* Re: [galaxy-dev] ./run.sh segfault
Hi Martin,
Is there anything in the syslog?
--nate
On Thu, Jul 16, 2015 at 11:26 AM, Martin Vickers <mjv08@aber.ac.uk> wrote:
Hi All,
I have a weird issue that's just cropped up. After a new install of Galaxy (checked out on Monday from GitHub) on an Ubuntu VM, using postgres rather than sqlite as well as a few other production recommendations, I started playing around with the Data Libraries functionality. I linked a bunch of fastq.gz files into Galaxy (around 150 in total) and everything was working fine. I went home, and the next day it was down.
I tried to start it up as usual (using an init.d script); it worked for less than a minute and then disappeared again. So I tried running it as the galaxy user using ./run.sh, and I get a segfault:
Starting server in PID 23173.
serving on http://144.124.110.39:8080
Segmentation fault
Tried again with strace
Starting server in PID 23552.
serving on http://144.124.110.39:8080
[{WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL}], 0, NULL) = 23552
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=23552, si_status=SIGKILL, si_utime=1590, si_stime=1930} ---
rt_sigreturn() = 23552
write(2, "Killed\n", 7Killed
) = 7
read(10, "", 8192) = 0
exit_group(137) = ?
+++ exited with 137 +++
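As a side note on the trace above: the exit status 137 follows the usual shell convention of 128 plus the signal number, and 137 - 128 = 9 is SIGKILL, which matches the si_status=SIGKILL field. A minimal sketch of that decoding follows; the function name `describe_exit_status` is hypothetical, it assumes Python 3.5+ on Linux, and a program could in principle also call exit(137) itself, so this is a heuristic and not proof.

```python
import signal


def describe_exit_status(status):
    """Interpret a shell-style exit status (0-255).

    Hypothetical helper: assumes the '128 + signal number' convention
    that shells use when a child process is terminated by a signal.
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return "killed by signal {} ({})".format(signum, name)
    return "exited normally with code {}".format(status)


# Exit status reported at the end of the strace output above.
print(describe_exit_status(137))
```

A SIGKILL that the process did not send to itself is consistent with the kernel's OOM killer, which is what the syslog lines quoted elsewhere in this thread report.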
I can't see anything odd in the log file and I've turned debugging on in galaxy.ini. I'm at a bit of a loss. Does anyone know what might be causing it?
Cheers,
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: https://lists.galaxyproject.org/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/