Re: [galaxy-dev] Galaxy on HPC and Bright Cluster Manager?
Hi Carlos,

Sorry for the slow reply. With a small 10 user setup (I assume just one group) your installation is probably going to be a lot less complex than ours. We are running services for users from multiple labs and departments, so a chief concern was making sure private datasets always stay private - which is relatively involved when using Galaxy to run jobs on a general-purpose shared cluster.

I think if I had to recommend picking a job scheduler I would suggest SLURM since it's probably the most 'fashionable' choice at present. You also have the advantage that SLURM is used with Galaxy in the Galaxy docker image etc. which ensures people notice if the Galaxy->DRMAA->SLURM setup isn't working. I've also used Galaxy with GridEngine in the past, and that was fine - but is becoming a less common choice as a scheduler.

Having said that, I don't think that the job scheduler needs to be your biggest concern. I would focus most on the file system and user account setup you are going to need.

* How are you going to migrate from standalone Galaxy to a situation where your new cluster can see the Galaxy data files, tools etc.? Are you purchasing storage with the cluster? If so do you move Galaxy onto that storage, or can you mount existing Galaxy data onto the cluster nodes? If you can do that, is your networking such that performance is sufficient for the type of analysis you are going to run?

* Do you need to, or will you need to, keep track of per-user usage of the cluster for things that Galaxy will be running? If not then you can just have a galaxy user on your cluster and things are pretty easy for file permissions etc. If you need to track jobs per-user then it becomes more complex, and the solution depends on how much privacy you need for datasets, how your cluster will authenticate users etc.

The filesystem and user accounts issues are, in my mind, the ones to focus on. You can always modify Galaxy's config to switch to a different job scheduler fairly easily. You cannot as easily move around large amounts of data, and reconcile local vs cluster user accounts, should that be necessary.

Cheers,

Dave Trudgian

-----Original Message-----
From: Carlos Lijeron [mailto:clijeron@hunter.cuny.edu]
Sent: Wednesday, April 22, 2015 10:54 AM
To: David Trudgian; John Chilton
Cc: RODRIGO GONZALEZ SERRANO
Subject: Re: [galaxy-dev] Galaxy on HPC and Bright Cluster Manager?

Hello David,

Thank you for the great feedback. We are at Hunter College in NYC, part of the City University of New York. We recently ordered the cluster which comes with Bright Cluster Management, and our PI wants to implement Galaxy for all the users (about 10) on the cluster and manage all job submissions through a job scheduler. So, to answer your question, we are not really using any scheduler at this point, but only a stand alone server with a local installation of Galaxy.

Our Cluster should be assembled and installed by the end of May, so I'm trying to gather as much information as possible in preparation for the deployment. Based on your experience, what do you think I should focus on to ensure we maximize outcome and reduce the possibility of mistakes? In other words, any lessons learned that you would like to share will be greatly appreciated.

Thanks again,

Carlos Lijeron.

On 4/22/15, 10:34 AM, "David Trudgian" <David.Trudgian@UTSouthwestern.edu> wrote:
Carlo,
We have Bright Cluster Manager in use on our cluster for node provisioning etc. but the actual job scheduler in use in our case is SLURM, which we use directly.
Are you using one of the integrated workload managers such as SLURM / SGE / TORQUE directly, or indirectly via cmsub?
I guess the easiest way to come up with some kind of advice is if you can provide an example of a generic job script you are using on your system. If you're using cmsub, is it specifying a --wlmanager etc.?
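For readers unsure what a "generic job script" means here, the following is a minimal sketch that renders one for SLURM. It assumes a plain SLURM setup with no cmsub involved. The function name render_slurm_script and its parameters are hypothetical, chosen for illustration; the #SBATCH directives are standard sbatch options, and the values any real site needs (partition, account, memory) are deliberately left out.

```python
def render_slurm_script(job_name, command, ntasks=1, time_limit="01:00:00"):
    """Return the text of a minimal SLURM batch script (illustrative only)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --ntasks={ntasks}",
        f"#SBATCH --time={time_limit}",
        # %j is replaced by SLURM with the numeric job ID at run time.
        f"#SBATCH --output={job_name}.%j.out",
        f"#SBATCH --error={job_name}.%j.err",
        "",
        command,
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file and passing that file to sbatch would submit it; a script of roughly this shape is the sort of thing that helps when comparing schedulers.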
DT
-----Original Message-----
From: galaxy-dev [mailto:galaxy-dev-bounces@lists.galaxyproject.org] On Behalf Of John Chilton
Sent: Wednesday, April 22, 2015 8:26 AM
To: Carlos Lijeron
Cc: galaxy-dev@lists.galaxyproject.org
Subject: Re: [galaxy-dev] Galaxy on HPC and Bright Cluster Manager?
Hello Carlos,
I have never heard of anyone running Galaxy with Bright Cluster Manager (though hopefully someone will chime in if they have). If you are interested in adding support it should be possible. One complication is that Bright Cluster Manager doesn't appear to have a DRMAA interface (http://www.drmaa.org/), which is the most direct way to utilize new DRMs. Without that, my approach would be to build a new CLI runner.
There are a few examples here that one can use as a template:
https://github.com/galaxyproject/galaxy/tree/dev/lib/galaxy/jobs/runners/util/cli/job
I guess you would have to write a new one targeting cmsub - you also need to be able to parse a job status somehow - I haven't figured out how to do that from the documentation - but I assume there is a way.
It looks like Bright supports running SGE, SLURM, and Torque on the cluster - doing this and interfacing with one of those more common options directly might be a better approach for Galaxy (and other users if your cluster has them).
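To make the "parse a job status" step concrete, here is a rough sketch of what it could look like when talking to SLURM directly. It assumes the status text comes from `squeue -h -o "%i %t"` (job ID and compact state code, no header). The function name, the coarse state names, and the mapping are illustrative assumptions and not Galaxy's actual plugin code; a cmsub equivalent would need its own parser once the output format is known.

```python
# Map SLURM compact state codes to coarse states.
# The coarse state names are illustrative, not Galaxy's own constants.
SLURM_STATE_MAP = {
    "PD": "queued",    # pending
    "CF": "queued",    # configuring (nodes being prepared)
    "R": "running",
    "CG": "running",   # completing
    "S": "running",    # suspended, still holds its allocation
    "CD": "complete",  # completed
    "F": "failed",
    "CA": "failed",    # cancelled
    "TO": "failed",    # timeout
    "NF": "failed",    # node failure
}


def parse_squeue_output(text):
    """Parse `squeue -h -o "%i %t"` output into a {job_id: state} dict.

    Lines that do not have exactly two fields are skipped, and state
    codes missing from the map are reported as "unknown".
    """
    statuses = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        job_id, code = parts
        statuses[job_id] = SLURM_STATE_MAP.get(code, "unknown")
    return statuses
```

One caveat with this approach: jobs drop out of squeue output shortly after they finish, so a job ID that is absent from the result has probably left the queue and would need a separate check (for example with sacct) to tell success from failure.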
-John
On Wed, Apr 22, 2015 at 8:56 AM, Carlos Lijeron <clijeron@hunter.cuny.edu> wrote:
Good day everyone,
Have any of you been able to implement Galaxy on an HPC cluster using Bright Cluster Manager as the main DRM? I noticed that only a few DRMs are known to work with Galaxy, and the list does not include Bright. Any advice/ideas will be greatly appreciated.
TORQUE Resource Manager
PBS Professional
Open Grid Engine
Univa Grid Engine (previously known as Sun Grid Engine and Oracle Grid Engine)
Platform LSF
HTCondor
Slurm
Galaxy Pulsar (formerly LWR)
Thanks.
Carlo
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: https://lists.galaxyproject.org/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/