Detailed SGE timing information about galaxy jobs
Hi all,
I got around to writing the report I always needed: how much time each job actually runs and how much time it spends waiting in the SGE queue.
The attached shell script produces the report, combining information from SGE's qacct with the Galaxy job/dataset information.
The output contains: job, user, tool name, dbkey, total input size in bytes, waiting time in the SGE queue, actual SGE running time, and some other tidbits.
This makes it possible to find how much running time each user had on the cluster, how much time each tool/user spent idly waiting, and some possible correlations between tools, dbkeys, input size and running time.
The script is tightly coupled with SGE and PostgreSQL, but can probably be adapted to PBS/MySQL.
Hope this helps someone,
-gordon
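For readers without the attachment, here is a minimal Python sketch of the timing part only; it is not the attached shell script. It assumes qacct-style text with one "field value" pair per line, records separated by lines of "=" characters, and timestamps laid out like "Tue Jul 19 11:45:00 2011" (this layout can differ between Grid Engine versions). The sample records and the function names parse_qacct and job_timings are made up for illustration. Mapping SGE job numbers to Galaxy users, tools and dbkeys requires the Galaxy database and is not shown.

```python
from datetime import datetime

# Assumed timestamp layout of the accounting output; adjust for your SGE version.
QACCT_TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_qacct(text):
    """Split qacct-style text into one dict of field -> value per job record."""
    records, current = [], {}
    for line in text.splitlines():
        if line.startswith("====="):  # a separator line begins a new record
            if current:
                records.append(current)
            current = {}
            continue
        key, _, value = line.strip().partition(" ")
        if key:
            current[key] = value.strip()
    if current:
        records.append(current)
    return records


def job_timings(record):
    """Return (seconds waiting in the queue, seconds running) for one record."""
    submitted = datetime.strptime(record["qsub_time"], QACCT_TIME_FORMAT)
    started = datetime.strptime(record["start_time"], QACCT_TIME_FORMAT)
    ended = datetime.strptime(record["end_time"], QACCT_TIME_FORMAT)
    waiting = (started - submitted).total_seconds()
    running = (ended - started).total_seconds()
    return waiting, running


# Hypothetical sample data, not taken from a real cluster.
SAMPLE = """\
==============================================================
owner        galaxy
jobnumber    1001
qsub_time    Tue Jul 19 11:45:00 2011
start_time   Tue Jul 19 11:47:30 2011
end_time     Tue Jul 19 12:00:00 2011
==============================================================
owner        galaxy
jobnumber    1002
qsub_time    Tue Jul 19 11:50:00 2011
start_time   Tue Jul 19 12:00:00 2011
end_time     Tue Jul 19 12:01:00 2011
"""

records = parse_qacct(SAMPLE)
for record in records:
    waiting, running = job_timings(record)
    print(record["jobnumber"], waiting, running)
```

Waiting time here is start_time minus qsub_time and running time is end_time minus start_time, which matches the two quantities described in the message above.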
Hi Gordon,
Very nice! Do you think it would be helpful to add this to the Tool Shed for sharing? I didn't see it there, so apologies if you already added it!
Best,
Jen
Galaxy team
On 7/19/11 11:45 AM, Assaf Gordon wrote:
[...]
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
-- Jennifer Jackson http://usegalaxy.org/ http://galaxyproject.org/
Jennifer Jackson wrote, On 07/19/2011 05:04 PM:
Do you think it would be helpful to add this into the Tool Shed for sharing? I didn't see it there, so apologies if you already added!
It's not a Galaxy tool per se. Are you sure you want to start a "galaxy admin" section in the Tool Shed?
Hello Assaf,
I've added this to the ~/contrib directory in change set 5820:c609c2574c66. Thanks very much for contributing this.
Greg Von Kuster
On Jul 19, 2011, at 2:45 PM, Assaf Gordon wrote:
[...]
<collect_sge_job_timings.sh>
Greg Von Kuster Galaxy Development Team greg@bx.psu.edu
participants (3)
- Assaf Gordon
- Greg Von Kuster
- Jennifer Jackson