Hi all,
I have exactly the same problem as described here: http://gmod.827538.n3.nabble.com/Job-output-not-returned-from-cluster-td8795...
All the SGE stuff is administered within our company, and unfortunately they know nothing about Galaxy, so it's my task to get it running. Everything works fine except the problem mentioned above. I've contacted Erick but haven't received a response yet.
If anybody can give me some hints on where to look to solve the problem, or at least what I can tell the admins, I would be really grateful.
Thanks in advance. If you need further information, let me know!
Cheers, Sascha
Hi! Unfortunately I was not able to fix this problem yet. Is anybody out there who had a similar problem while using Galaxy with SGE or has the knowledge about the things I can look at? Thanks in advance! Cheers, Sascha
On Jul 9, 2012, at 11:25 AM, Sascha Kastens wrote:
Hi Sascha,
The error message you are getting is because the SGE job's stdout and stderr files are not where Galaxy expects to find them. The working directory will be output to the debug log prior to the job's execution and should resemble:
<job_working_directory>/<job id hash ...>/<job id>
where:
<job_working_directory> is the absolute path to the value of job_working_directory in universe_wsgi.ini
<job id hash> is determined based on the job ID but is most likely '000' if you are just setting up a new server
<job id> is the job's ID as shown in the debug log
Upon job completion, this directory should contain files like <job id>.drmout and <job id>.drmerr. Is it possible that your SGE installation is overriding the stdout/stderr paths, or that job_working_directory is not a shared filesystem?
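A quick way to check the layout described above is to build the expected paths and see which files are missing. This is only a minimal sketch: the function names are hypothetical and not part of Galaxy, and the three arguments have to be taken from your own universe_wsgi.ini and debug log.

```python
import os


def expected_job_files(job_working_directory, job_id_hash, job_id):
    """Build the stdout/stderr paths following the layout described in the thread."""
    job_dir = os.path.join(job_working_directory, job_id_hash, str(job_id))
    return {
        "drmout": os.path.join(job_dir, "%s.drmout" % job_id),
        "drmerr": os.path.join(job_dir, "%s.drmerr" % job_id),
    }


def missing_job_files(job_working_directory, job_id_hash, job_id):
    """Return the names of the expected files that are absent on this host."""
    paths = expected_job_files(job_working_directory, job_id_hash, job_id)
    return sorted(name for name, path in paths.items() if not os.path.isfile(path))
```

Running missing_job_files(...) on both the Galaxy server and a cluster node after a job completes should show whether only one of the two files is absent, or whether the directory is not visible from one side at all.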
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
Hi Nate,
thanks a lot for your hints. I was finally able to fix the problem. Galaxy couldn't find the .drmerr file because our SGE installation merged .drmerr into .drmout... now everything works fine!
Cheers, Sascha
Original Message
Subject: Re: [galaxy-dev] Job output not returned from Cluster
Sent: Friday, 13 July 2012 19:35
From: Nate Coraor (nate@bx.psu.edu)
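The thread does not say how the merge of .drmerr into .drmout was configured. One possibility worth checking with the admins is a site-wide default of the qsub option -j y, which merges a job's stderr into its stdout and can be set in a default request file such as sge_request. The sketch below scans the text of such a file for that option; the function name is hypothetical, and treating a later -j as overriding an earlier one is an assumption of this sketch.

```python
def merges_stderr(request_text):
    """Return True if the option text asks SGE to merge stderr into stdout (-j y)."""
    merged = False
    for line in request_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines
        if not line or line.startswith("#"):
            continue
        tokens = line.split()
        for i, token in enumerate(tokens[:-1]):
            if token == "-j":
                # Assumption of this sketch: the last -j seen wins
                merged = tokens[i + 1].lower().startswith("y")
    return merged
```

Feeding it the contents of the cluster's default request file gives the admins something concrete to look at if it returns True.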
Hi,
We have set up a local instance of galaxy-dist, using pbs-python to communicate with our HPC. Everything is working great, except for the upload functionality.
When I assign upload1 to local:///, uploading small files through the website works, and uploading large files by FTP works as well. When I let upload1 be handled by the default_cluster_job_runner (= pbs:///), I get empty datasets. However, Galaxy does not report any errors, and the dataset state is ok.
Has anybody seen this issue and solved it? Using the local:/// job handler causes a massive performance hit on the Galaxy process.
Best regards,
Geert Vandeweyer
On Jul 16, 2012, at 6:14 AM, Geert Vandeweyer wrote:
Hi Geert,
Sorry for the delayed response. If you're still having this issue: when you view the empty dataset, can you see the data it's supposed to contain? That is, does the underlying output file for the upload tool actually contain the data, or is it really empty on disk?
--nate
Hi Nate,
The datafile is really empty on disk (size 0, checked at the "full path" entry from the dataset info).
Best regards,
Geert Vandeweyer
On 08/29/2012 06:05 PM, Nate Coraor wrote:
-- Geert Vandeweyer, Ph.D. Department of Medical Genetics University of Antwerp Prins Boudewijnlaan 43 2650 Edegem Belgium Tel: +32 (0)3 275 97 56 E-mail: geert.vandeweyer@ua.ac.be http://ua.ac.be/cognitivegenetics http://www.linkedin.com/pub/geert-vandeweyer/26/457/726
On Aug 31, 2012, at 8:33 AM, Geert Vandeweyer wrote:
Although it should generate an error rather than an empty file, can you ensure that new_file_path in Galaxy's config is pointed at a filesystem that is shared between the cluster and the Galaxy server? If it is, you may need to add some debugging to the upload tool to figure out exactly where the problem is. --nate
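One way to test whether new_file_path is really shared is to write a marker file from the Galaxy server and then look for it from a cluster node. This is a rough sketch under stated assumptions: the function names and the marker file naming are hypothetical, and the directory passed in must be the new_file_path value from your own Galaxy config.

```python
import os
import uuid


def write_marker(new_file_path):
    """Run on the Galaxy server: create a uniquely named marker file."""
    token = uuid.uuid4().hex
    marker = os.path.join(new_file_path, "shared_fs_check_%s" % token)
    with open(marker, "w") as handle:
        handle.write(token)
    return token


def marker_visible(new_file_path, token):
    """Run on a cluster node: check that the marker exists with the same content."""
    marker = os.path.join(new_file_path, "shared_fs_check_%s" % token)
    try:
        with open(marker) as handle:
            return handle.read() == token
    except (IOError, OSError):
        return False
```

Call write_marker on the server, pass the returned token to marker_visible inside a test job on the cluster, and remove the marker file afterwards. A False result on the node would suggest the directory is not shared or is mounted at a different path.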
I saw this same problem with our upload1 set to run on the cluster (it had worked properly a few months prior). I found out that the cluster sysadmins had set the nodes for local access only (no web access). Direct file uploads and FTP to local disk worked, but URL-based uploads did not. The job indicated success; however, the file was empty. Interestingly, the 'info' box had the error:
Unable to fetch ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Agrobacterium_tumefaciens_C58_uid57865/NC_003063.fna [Errno ftp error] [Errno 113] No route to host
Switching it back to the local job runner fixed it.
chris
On Aug 31, 2012, at 9:27 AM, Nate Coraor <nate@bx.psu.edu> wrote:
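To find out whether a cluster node has the same "No route to host" restriction Chris describes, a small connectivity probe can be run as a job on the node. This is a minimal sketch; can_reach is a hypothetical helper, not part of Galaxy, and it only tests that a TCP connection can be opened, not that a full download would succeed.

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened from this machine."""
    try:
        connection = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts, DNS failures and "No route to host"
        return False
    connection.close()
    return True
```

For the URL in the error above, can_reach("ftp.ncbi.nlm.nih.gov", 21) probes the standard FTP control port. If it returns True on the Galaxy server but False on a compute node, URL-based uploads will most likely need to stay on the local job runner, or the nodes will need outbound access.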
participants (4)
- Fields, Christopher J
- Geert Vandeweyer
- Nate Coraor
- Sascha Kastens