
On Monday, November 28, 2011, Joseph Hargitai <joseph.hargitai@einstein.yu.edu> wrote:
Ed,
we had the classic goof on our cluster with this. 4 nodes could not see the /home/galaxy folder due to a missing entry in /etc/fstab. When the jobs hit those nodes (which explains the randomness) we got the error message.
Bothersome was the lack of good logs to go on. The error message was too generic. However, I discovered that Galaxy was depositing the error and out messages in the /pbs folder, and you could briefly read them before they got deleted. There the message was the classic SGE input/output message: /home/galaxy.... file not found.
Hence my follow-up question: how can I get Galaxy NOT to delete these SGE error and out files?
best, joe
Better yet, Galaxy should read the SGE o and e files and record their contents as it would for a directly executed tool's stdout and stderr.
Peter
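
A minimal sketch of that idea in Python, not Galaxy's actual code: it assumes the scheduler has already written the job's o and e files and that their paths are known. The function name, its arguments, and the size cap are all hypothetical. It reads both files so their contents could be recorded as stdout and stderr before anything cleans them up.

```python
import os


def read_job_streams(out_path, err_path, max_chars=32768):
    """Return (stdout, stderr) text from a finished job's o and e files.

    Hypothetical helper for illustration; not part of Galaxy or SGE.
    A missing file yields an empty string instead of raising.
    """
    def read_capped(path):
        if not os.path.exists(path):
            return ""
        # errors="replace" keeps odd bytes in tool output from raising.
        with open(path, "r", errors="replace") as handle:
            return handle.read(max_chars)  # read at most max_chars characters

    return read_capped(out_path), read_capped(err_path)
```

Capping the read keeps a runaway tool from flooding whatever stores the job record; the cap value here is arbitrary.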