I have my Galaxy instance configured using Torque PBS with a MySQL backend. Everything works pretty well, although I've recently been getting errors when I attempt to submit more than 2 workflows. These workflows are very simple. They go from a fastq file, to a converter, then get mapped and converted to bam format. Occasionally I'll have a job fail, and never at the same point in the process, that says"
"An error occurred running this job: unable to run this job due to a cluster error"
I'm not sure where to even begin looking for cluster problems, it seems most output gets wiped out pretty quickly within galaxy. Re-running the job typically works fine, but since this is occurring in the middle of a workflow its rather disruptive. Any suggestions on where to look for problems would be much appreciated.
Also, I assume changing the log_level value in universe_wsgi.ini to something other than DEBUG might help? what are the level values galaxy uses? I can't seem to find any notes on this anywhere. Thanks!
Juan Perin