I checked the database and had a look at the job entries that handler0 tried to stop before it shut down:
| 3088 | 2013-01-03 14:25:38 | 2013-01-03 14:27:05 | 531 | toolshed.g2.bx.psu.edu/repos/kevyin/homer/homer_findPeaks/0.1.2 | 0.1.2 | deleted_new | Job output deleted by user before job completed. | NULL | NULL | NULL | NULL | NULL | NULL | 1659 | drmaa://-V -j n -R y -q intel.q/ | NULL | NULL | 76 | 0 | NULL | NULL | handler0 | NULL |
| 3091 | 2013-01-04 10:52:19 | 2013-01-07 09:14:34 | 531 | toolshed.g2.bx.psu.edu/repos/kevyin/homer/homer_findPeaks/0.1.2 | 0.1.2 | deleted_new | Job output deleted by user before job completed. | NULL | NULL | NULL | NULL | NULL | NULL | 1659 | drmaa://-V -j n -R y -q intel.q/ | NULL | NULL | 76 | 0 | NULL | NULL | handler0 | NULL |
| 3093 | 2013-01-07 22:02:21 | 2013-01-07 22:16:27 | 531 | toolshed.g2.bx.psu.edu/repos/kevyin/homer/homer_pos2bed/1.0.0 | 1.0.0 | deleted_new | Job output deleted by user before job completed. | NULL | NULL | NULL | NULL | NULL | NULL | 1749 | drmaa://-V -j n -R y -q intel.q/ | NULL | NULL | 76 | 0 | NULL | NULL | handler0 | NULL |
So basically the job table has several entries that are assigned to handler0 and marked as "deleted_new". When handler0 comes up, it starts stopping these jobs; after the first job has been "stopped", handler0 crashes and dies. That job is nevertheless marked as "deleted" afterwards.
I think if I manually change the job state from "deleted_new" to "deleted" in the db, handler0 will be fine. I am just concerned about how these jobs ended up in this state (assigned to a handler but marked as "deleted_new").
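
For reference, here is a rough sketch of the manual fix I have in mind. It is only an illustration: it runs against an in-memory SQLite table, not the real database, and the table name `job` and the column names `id`, `state` and `handler` are my assumptions about the schema, so they should be checked against the actual table (and the db backed up) before trying anything like this. The helper names `make_demo_db` and `mark_deleted` are made up for the example.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory table that mimics the rows shown above.

    The table and column names are assumptions, not the verified schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE job (id INTEGER PRIMARY KEY, state TEXT, handler TEXT)"
    )
    conn.executemany(
        "INSERT INTO job (id, state, handler) VALUES (?, ?, ?)",
        [
            (3088, "deleted_new", "handler0"),
            (3091, "deleted_new", "handler0"),
            (3093, "deleted_new", "handler0"),
            (4000, "ok", "handler0"),           # unrelated job, must stay untouched
            (4001, "deleted_new", "handler1"),  # other handler, must stay untouched
        ],
    )
    conn.commit()
    return conn


def mark_deleted(conn, handler):
    """Set 'deleted_new' jobs of one handler to 'deleted'.

    Returns the ids of the affected jobs so the change can be reviewed.
    """
    rows = conn.execute(
        "SELECT id FROM job WHERE state = 'deleted_new' AND handler = ? ORDER BY id",
        (handler,),
    ).fetchall()
    ids = [row[0] for row in rows]
    # 'with conn' commits on success and rolls back if the UPDATE raises.
    with conn:
        conn.execute(
            "UPDATE job SET state = 'deleted' "
            "WHERE state = 'deleted_new' AND handler = ?",
            (handler,),
        )
    return ids


if __name__ == "__main__":
    conn = make_demo_db()
    print("changed jobs:", mark_deleted(conn, "handler0"))
```

Restricting the update to one handler and to the "deleted_new" state keeps it from touching any other jobs. It does not explain how the jobs got into this state in the first place, which is still my main question.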