Hi guys,
Today my institute ran a Galaxy training with about 20 people. We had not set up multiple instances of the Galaxy Python process. When everyone started submitting jobs, the thread pool filled up very quickly. Soon after, Galaxy threw a "worker queue full" error and then became unresponsive. It had to be restarted.
So we have decided to set up multiple instances overnight, in preparation for the second day of the training. Our setup is an 8-core host that runs the Galaxy server itself, and a 500-core cluster that handles all the jobs.
I am wondering how I should distribute the 8 cores among the different roles (web, manager, job handler). In the wiki, the author says he runs six web server processes on an 8-CPU server, so I assume he runs only one manager and one job handler. Is one job handler enough? Even for the public Galaxy?
My second question: it makes sense to have all the web roles listen on a public IP (on different ports). For the manager and the job handler, can I just set them to listen on 127.0.0.1? Or do they have to listen on the same IP as the web roles?
Regards,
Derrick