
On Nov 29, 2011, at 3:13 AM, Peter Cock wrote:
On Monday, November 28, 2011, Joseph Hargitai <joseph.hargitai@einstein.yu.edu> wrote:
Ed,
We had the classic goof on our cluster with this. Four nodes could not see the /home/galaxy folder due to a missing entry in /etc/fstab. When jobs hit those nodes (which explains the randomness), we got the error message.
What was bothersome was the lack of good logs to go on. The error message was too generic. However, I discovered that Galaxy was depositing the error and out messages in the /pbs folder, and you could briefly read them before they got deleted. There the message was the classic SGE input/output message - /home/galaxy.... file not found.
Hence my follow-up question - how can I have Galaxy NOT delete these SGE error and out files?
best, joe
Better yet, Galaxy should read the SGE o and e files and record their contents as it would for a directly executed tool's stdout and stderr.
Peter
...or at least have the option to do so, maybe as a level of verbosity. I have been bitten by the lack of stderr output myself, where having it might have saved some manual debugging. chris
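
The two suggestions above (read the SGE o and e files, and optionally keep them) could look roughly like the sketch below. This is only an illustration, not Galaxy's actual job runner code: the function name collect_job_output and the parameters keep_dir and max_bytes are hypothetical, and it assumes the scheduler has already written both files to paths the Galaxy server can read.

```python
import os
import shutil


def collect_job_output(out_path, err_path, keep_dir=None, max_bytes=32768):
    """Return (stdout, stderr) text from a scheduler's o and e files.

    Hypothetical helper, not part of Galaxy. If keep_dir is given, the
    files are copied there so they survive any later cleanup.
    """

    def read_text(path):
        # A missing or unreadable file is reported in place of its content,
        # so the caller always has something to record for the job.
        try:
            with open(path, "r", errors="replace") as handle:
                return handle.read(max_bytes)
        except OSError as exc:
            return "(could not read %s: %s)" % (path, exc)

    stdout = read_text(out_path)
    stderr = read_text(err_path)

    if keep_dir is not None:
        os.makedirs(keep_dir, exist_ok=True)
        for path in (out_path, err_path):
            if os.path.exists(path):
                # copy2 also preserves timestamps, which helps when debugging
                shutil.copy2(path, keep_dir)

    return stdout, stderr
```

A caller would run this once the job finishes and before the working files are removed, passing keep_dir only when the extra verbosity is wanted.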