If I had to guess, I would say this is caused by a misconfigured proxy (nginx or Apache) that is resubmitting a POST request that Galaxy is taking too long to respond to. The order of events would be something like:

- User clicks to upload library items.
- Proxy gets the request and passes it to Galaxy.
- Galaxy takes a long time to process the request and doesn't respond within the timeout.
- Proxy resends the POST request to Galaxy.
- Galaxy takes a long time to process the request and doesn't respond within the timeout.
- ...

Proxies should never resend POST requests to Galaxy as far as I can imagine, but we have seen this before, for instance when submitting workflows; some people have had their proxy retry that request repeatedly. I don't know whether this is a problem with the default proxy configurations we list on the wiki or whether it comes down to customizations or extra loaded extensions at the sites that have encountered it.

Is this enough to help debug the problem? I'm not really an expert on specific proxies, and you have the setup there and seem to be able to reproduce the problem. If you do want further help, I would post the proxy you are using, the extensions, the configuration, and the Galaxy logs corresponding to this incident, so we can look for the repeated POSTs and the route that is being posted to.

If you are not using a proxy, then I am stumped :(.

-John

On Fri, Sep 4, 2015 at 12:04 PM, Martin Vickers <mjv08@aber.ac.uk> wrote:
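As an illustration of the kind of proxy setting involved (this is a hedged sketch assuming an nginx front end with an upstream named `galaxy_app`; the exact names, paths, and timeout values are assumptions, not taken from Martin's setup), the retry loop described above is typically avoided by giving the upstream a long read timeout and disabling request retries:

```nginx
# Hypothetical nginx location block for a Galaxy server behind a proxy.
# Key points: allow long-running library uploads to finish, and do not
# let nginx pass a request to the upstream again after an error/timeout,
# which is how a slow POST can end up being submitted repeatedly.
location / {
    proxy_pass http://galaxy_app;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

    # Generous timeout for slow library-upload requests (value is a guess).
    proxy_read_timeout 600s;

    # Never retry a request on another (or the same) upstream server.
    proxy_next_upstream off;
}
```

With `proxy_next_upstream off`, a timed-out request fails back to the client instead of being resent, so at worst the user sees an error rather than Galaxy receiving the same POST several times.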
Hi All,
I've noticed an issue a couple of times now where I've added a directory of fastqs from an NFS-mounted filesystem (reference only, rather than copying into galaxy) and then galaxy times out. The load average begins to get really high, and then it consumes all the RAM and sometimes crashes. These are the same symptoms as I had before with this issue, which was never resolved:
http://dev.list.galaxyproject.org/run-sh-segfault-td4667549.html#a4667553
What I've noticed is that in the dataset I'm uploading to galaxy, there are suddenly many duplicates. In this example that's just happened, there are 288 fastq.gz files in the physical folder, but galaxy has created 6 references to each file, resulting in 1728 datasets in the folder (see attached images).
When this happened before and crashed the galaxy application, whenever it restarted it would try to resume what it was doing, which created an endless loop of retrying and crashing until the job was removed.
Does anyone know what may be causing this?
Cheers,
Martin
--
Dr. Martin Vickers
Data Manager/HPC Systems Administrator
Institute of Biological, Environmental and Rural Sciences
IBERS New Building
Aberystwyth University
w: http://www.martin-vickers.co.uk/
e: mjv08@aber.ac.uk
t: 01970 62 2807
___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: https://lists.galaxyproject.org/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/