Fw: "No peek" issue and datasets wrongly reported as "Empty"
Dear Galaxy developers,

I know I am not the only one with this issue, as over time I've stumbled on a few mailing-list threads from other users with the same problem. And I know the recommended solution is to use the -noac mount option (http://wiki.galaxyproject.org/Admin/Config/Performance/Cluster#Unified_Metho...).

However, the -noac mount option is said to come with a performance trade-off, so when we first ran into this issue (datasets showing "Empty" and "No peek", even though the file on disk is full of content), we implemented the hack found in this thread: http://dev.list.galaxyproject.org/What-s-causing-this-error-td4141958.html#a...

In this thread, John suggested adding a "sleep()" to the "finish_job" method of the "galaxy_dist/lib/galaxy/jobs/runners/drmaa.py" file. It worked very well for us: adding a sleep(30) made every job wait 30 seconds before finishing, but at least the "No peek" issue disappeared.

However, since the latest Galaxy updates, this file (drmaa.py) has changed drastically and the "finish_job" method no longer exists there. Hence, I had to remove this hack, hoping that the issue would have disappeared as well. Unfortunately, the "No peek" issue is still there and is causing many headaches for some of our workflow users.

My question is then: can I put this "sleep(30)" in some other place (method and/or file) in order to achieve the same result? I would really like to solve this "No peek" issue without resorting to the "-noac" mount option. Actually, I am not even sure our system administrator would allow it.

Thanks again for your help!
Jean-François
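For readers unfamiliar with the hack described above, here is a minimal sketch of the idea: delay a job's "finish" step by a fixed number of seconds so the shared filesystem has time to show the output files. This is not Galaxy's code. The `delay_before` helper and the stand-in `finish_job` function are hypothetical names used only for illustration; the original hack simply added a `sleep()` call inside the runner's own method.

```python
import functools
import time


def delay_before(seconds, sleep=time.sleep):
    """Return a decorator that waits `seconds` before calling the wrapped function.

    `sleep` is injectable so the delay can be replaced in tests.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            sleep(seconds)  # give the filesystem time to catch up
            return func(*args, **kwargs)
        return wrapper
    return decorator


# Hypothetical stand-in for a runner's finish step (not Galaxy's real signature).
@delay_before(30)
def finish_job(job_id):
    return "finished %s" % job_id
```

The obvious cost, as noted in the message, is that every job pays the full delay even when its outputs are already visible.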
On Nov 7, 2013, at 2:45 PM, Jean-Francois Payotte wrote:
Hi Jean-François,

The job runners have been largely refactored into lib/galaxy/jobs/runners/__init__.py, which is where you'll find finish_job(). However, we also recently added some tricks to work around this issue, which have solved the problem (for usegalaxy.org, at least) without needing -noac. This is available in Monday's distribution release. Here's the commit: https://bitbucket.org/galaxy/galaxy-central/commits/384240b8cd29963f302a0349...

To use it, set retry_job_output_collection > 0 in the Galaxy config.

--nate
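The general idea behind a retry setting like this can be sketched as follows: instead of always sleeping for a fixed time, check the output file and retry a bounded number of times only if it still looks missing or empty. This is an illustrative sketch, not the code from the commit linked above, whose details are not shown in this thread. The function name `wait_for_output` and its parameters are assumptions made for the example.

```python
import os
import time


def wait_for_output(path, retries=5, delay=1.0, sleep=time.sleep):
    """Return True once `path` exists and is non-empty.

    Checks once, then retries up to `retries` more times, waiting
    `delay` seconds between checks. Returns False if the file is
    still missing or empty after the last check.
    """
    for attempt in range(retries + 1):
        try:
            if os.stat(path).st_size > 0:
                return True
        except OSError:
            pass  # file not visible yet; treat like an empty file
        if attempt < retries:
            sleep(delay)
    return False
```

Compared with an unconditional sleep(30), a job whose output is already visible finishes immediately, and only jobs affected by stale file attributes pay for the extra checks.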
___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Thanks Nate,

I will try manually editing the lib/galaxy/jobs/runners/__init__.py file for now, as we are not ready to update to the latest distribution yet. I should be able to try the new distribution's fix in a couple of weeks.

Many thanks for solving this issue! I will let you know if it doesn't solve our issue when updating to the November distribution. :)

Thanks,
Jean-François

From: Nate Coraor <nate@bx.psu.edu>
To: Jean-Francois Payotte <jean-francois.payotte@dnalandmarks.ca>
Cc: galaxy-dev@lists.bx.psu.edu
Date: 08/11/2013 09:40 AM
Subject: Re: [galaxy-dev] Fw: "No peek" issue and datasets wrongly reported as "Empty"
participants (2)
- Jean-Francois Payotte
- Nate Coraor