On Fri, Jun 27, 2014 at 3:13 PM, John Chilton <jmchilton@gmail.com> wrote:
On Fri, Jun 27, 2014 at 5:16 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
On Wed, Jun 18, 2014 at 12:14 PM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
John - regarding that Trello issue you logged, https://trello.com/c/0XQXVhRz "Generic infrastructure to let deployers specify limits for tools based on input metadata (number of sequences, file size, etc...)"
Would it be fair to say this is not likely to be implemented in the near future? i.e. Should we consider implementing the BLAST query limit approach as a short-term hack?
It would be good functionality - but I don't foresee myself or anyone on the core team getting to it in the next six months, say.
...
I am now angry with myself though, because I realized that dynamic job destinations are a better way to implement this in the meantime (the environment stuff was very fresh when I responded, so I think I just jumped there). You can build a flexible infrastructure locally that is largely decoupled from the tools, and that may (?) work around the task-splitting problem Peter brought up.
Outline of the idea: <snip>
Hi John,

So the idea is to define a dynamic job mapper which checks the query input size: if it is too big, it raises an error; otherwise it passes the job on to the configured job handler (e.g. the SGE cluster). See https://wiki.galaxyproject.org/Admin/Config/Jobs

It sounds like this ought to be possible right now, but you are suggesting that, since this seems like quite a general use case, code to help build a dynamic mapper using things like file size (in bytes or number of sequences) could be added to Galaxy?

This approach would need the Galaxy admin to set up a custom job mapper for BLAST (one which knows to look at the query file), but it taps into an existing Galaxy framework. By providing a reference implementation this ought to be fairly easy to set up, and it can be extended to be more clever about the limits - e.g. for BLAST we should consider both the number (and length) of the queries, plus the size of the database.

Regards,

Peter