Round-Robin Scheduling

22 Apr 2009

Hello,

I have another question regarding the local job scheduler:

Is it possible to limit the number of jobs *per user*?

That is, can any given user be limited to at most X jobs running
concurrently, regardless of the value of local_job_queue_workers?

Imagine the following situation:

    local_job_queue_workers = 5
    job_scheduler_policy = galaxy.jobs.schedulingpolicy.roundrobin:UserRoundRobin

This means that at any given moment, Galaxy can run at most five jobs.

Now, Galaxy is completely idle, no jobs are running.
One user starts 7 very long-running jobs (each job will take about two
hours).
If I understand correctly, since no jobs are running, 5 of the user's
jobs will be started immediately, even with the round-robin policy, right?

And this means that for the next two hours, any job started by another
user will be either new or limbo-running, but none will actually be
started, right?
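
The scenario above can be sketched as a toy simulation. This is only an illustration, not Galaxy's scheduler code: the `dispatch` function, the `(user, job_number)` tuples, and the user names `alice` and `bob` are all hypothetical, and the only assumption taken from the configuration above is a pool of five worker slots.

```python
from collections import deque

def dispatch(queue, running, workers):
    """Start queued jobs while free worker slots remain; return the jobs started."""
    started = []
    while queue and len(running) < workers:
        job = queue.popleft()
        running.append(job)
        started.append(job)
    return started

WORKERS = 5  # mirrors local_job_queue_workers = 5 (assumed meaning: max concurrent jobs)

running = []  # jobs currently occupying a worker slot
queue = deque(("alice", n) for n in range(7))  # one user submits 7 long jobs
first_batch = dispatch(queue, running, WORKERS)

queue.append(("bob", 0))  # a second user submits a job a moment later
second_batch = dispatch(queue, running, WORKERS)

print(len(first_batch), "jobs started for alice")
print(len(second_batch), "jobs started after bob submitted")
print("still waiting:", list(queue))
```

In this sketch, a round-robin policy could only change the order of the waiting jobs. It cannot free a slot that a running job already holds, so the second user waits until one of the first five jobs finishes.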

I think I'm experiencing this situation on my Galaxy server.
Users are complaining that their jobs have been running 'forever' or
have not started for a very long time.

Close examination shows that there are 5 running jobs (all from the same
user) which have been running for three hours, and they are hogging all
the worker threads.

To make a long story short:
I would like to make sure a single user can't hog Galaxy.
Is it possible with the local job runner, and if not, is it possible
with the SGE job runner?
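
The kind of limit being asked for could look roughly like the following. It is a minimal sketch under stated assumptions, not an existing Galaxy option or API: `dispatch_with_user_cap` and `max_per_user` are hypothetical names, and jobs are again plain `(user, job_number)` tuples.

```python
from collections import Counter, deque

def dispatch_with_user_cap(queue, running, workers, max_per_user):
    """Start queued jobs in order, skipping users who are already at their cap."""
    per_user = Counter(user for user, _ in running)
    started = []
    skipped = deque()
    while queue and len(running) < workers:
        job = queue.popleft()
        user = job[0]
        if per_user[user] >= max_per_user:
            skipped.append(job)  # user is at the cap; job keeps waiting
            continue
        running.append(job)
        per_user[user] += 1
        started.append(job)
    # Put skipped jobs back at the front of the queue in their original order.
    queue.extendleft(reversed(skipped))
    return started

running = []
queue = deque(("alice", n) for n in range(7))  # hypothetical user with 7 long jobs
first_batch = dispatch_with_user_cap(queue, running, workers=5, max_per_user=3)

queue.append(("bob", 0))  # hypothetical second user
second_batch = dispatch_with_user_cap(queue, running, workers=5, max_per_user=3)

print("first batch:", first_batch)
print("second batch:", second_batch)
print("still waiting:", list(queue))
```

With a cap of 3 per user, two of the five slots stay free for other users even when one user floods the queue, at the cost of leaving slots idle when only one user is active.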

Thanks for reading so far,
    Gordon.

Assaf Gordon

Greg Von Kuster
