It seems that even under very light load (i.e. just a couple of long-running jobs present), the job runner slowly but steadily (1-3 MB per sec) grows its memory consumption until it is killed by the Linux OOM killer. My job runner config: http://pastebin.com/vMWDHAQm I'm currently restarting the runner from crontab when it is killed by the OOM killer, but that is not a sensible solution by any means. I wonder if anyone has encountered this and how it was solved. Thanks, Alex
On Dec 14, 2011, at 1:51 PM, Oleksandr Moskalenko wrote:
It seems that even under very light load (i.e. just a couple of long-running jobs present), the job runner slowly but steadily (1-3 MB per sec) grows its memory consumption until it is killed by the Linux OOM killer.
My job runner config: http://pastebin.com/vMWDHAQm
I'm currently restarting the runner from crontab when it is killed by the OOM killer, but that is not a sensible solution by any means.
I wonder if anyone has encountered this and how it was solved.
Hi Alex,

I believe this is a leak either in pbs_python or libtorque.so. I haven't yet been able to track down the culprit, so in the meantime, we simply restart the job runner process once it reaches a specified amount of memory usage.

--nate
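The thread does not say how the memory-triggered restart is implemented, so the following is only a minimal sketch of one way to do it on Linux, assuming the runner's PID is known. The 2048 MB limit, the restart command path, and all function names are hypothetical placeholders, not part of Galaxy, pbs_python, or the setup described above.

```python
import subprocess
import sys

# Hypothetical values; adjust both for the local installation.
RSS_LIMIT_MB = 2048
RESTART_CMD = ["/path/to/restart-job-runner.sh"]  # placeholder, not a real Galaxy script


def parse_rss_kb(status_text):
    """Return the VmRSS value (in kB) from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     123456 kB".
            return int(line.split()[1])
    return None


def read_rss_kb(pid):
    """Read the resident set size of a running process from /proc (Linux only)."""
    with open("/proc/%d/status" % pid) as handle:
        return parse_rss_kb(handle.read())


def over_limit(rss_kb, limit_mb):
    """True if a known RSS value exceeds the limit given in megabytes."""
    return rss_kb is not None and rss_kb > limit_mb * 1024


def main(argv):
    pid = int(argv[1])
    try:
        rss_kb = read_rss_kb(pid)
    except FileNotFoundError:
        # The process is already gone; nothing to measure here.
        return 0
    if over_limit(rss_kb, RSS_LIMIT_MB):
        # Restart before the OOM killer gets involved.
        return subprocess.call(RESTART_CMD)
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

Run periodically (for example from cron, with the runner's PID as the only argument), this restarts the runner before memory is exhausted rather than after the OOM killer has already acted. It works around the leak and does nothing to fix it.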
Thanks,
Alex
Participants (2): Nate Coraor, Oleksandr Moskalenko