I am not using job splitting, because I am implementing this for a client with a small (single-machine) Galaxy setup.

Implementing a query limit feature in Galaxy core would probably be the best solution, but it would probably also require an admin screen to edit those limits, and I don't think I can sell the required time to my boss under the contract we have with the client.
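
In the meantime, the environment variable approach could look roughly like the sketch below. The variable name BLAST_MAX_QUERIES and both function names are hypothetical placeholders, not anything that exists in Galaxy or galaxy_blast, and the sketch assumes the query file is plain FASTA.

```python
import os


def count_fasta_records(path):
    """Count FASTA records by counting header lines that start with '>'."""
    count = 0
    # Read as bytes so the check behaves the same on Python 2.6 and 3.
    with open(path, "rb") as handle:
        for line in handle:
            if line.startswith(b">"):
                count += 1
    return count


def check_query_limit(path, env_var="BLAST_MAX_QUERIES"):
    """Return (ok, n_queries, limit).

    The limit is read from the (hypothetical) environment variable;
    if it is unset or empty, no limit applies and limit is None.
    """
    raw = os.environ.get(env_var, "").strip()
    n_queries = count_fasta_records(path)
    if not raw:
        return True, n_queries, None
    limit = int(raw)
    return n_queries <= limit, n_queries, limit
```

A wrapper script could call check_query_limit() on the query file before starting BLAST and exit with an error message when the first element of the result is False.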

I made a quick attempt earlier at making the blast2html tool run under both Python 2.6 and 3, but gave up because of too many encoding issues. The client's machine has Python 2.6. Maybe I should have another look.
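
A minimal sketch of one common way to handle text I/O on both Python 2.6 and 3, assuming the encoding problems come mainly from reading and writing files. The helper names read_text and write_text are made up for illustration and are not part of blast2html.

```python
import io


def read_text(path, encoding="utf-8"):
    """Read a file and return unicode text on both Python 2.6+ and 3."""
    # io.open behaves like Python 3's built-in open() on Python 2.6 and later.
    with io.open(path, "r", encoding=encoding) as handle:
        return handle.read()


def write_text(path, text, encoding="utf-8"):
    """Write text to a file, decoding byte strings first if needed."""
    # On Python 2, 'bytes' is an alias for 'str', so plain str values
    # are decoded here before being handed to the text-mode file.
    if isinstance(text, bytes):
        text = text.decode(encoding)
    with io.open(path, "w", encoding=encoding) as handle:
        handle.write(text)
```

The idea is to decode once when reading, work with unicode internally, and encode once when writing, so the rest of the code does not need to care which Python version it runs under.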

Jan


On 17 June 2014 21:55, Peter Cock <p.j.a.cock@googlemail.com> wrote:
On Tue, Jun 17, 2014 at 4:57 PM, Jan Kanis <jan.code@jankanis.nl> wrote:
> Too bad there aren't any really good options. I will use the environment
> variable approach for the query size limit.

Are you using the optional job splitting (parallelism) feature in Galaxy?
That seems to me to be a good place to insert a Galaxy level
job size limit. e.g. BLAST+ jobs are split into 1000 query chunks,
so you might wish to impose a 25 chunk limit?

Long term, being able to set limits on the input file parameters
of each tool would be nicer - e.g. Limit BLASTN to at most
20,000 queries, limit MIRA to at most 50GB FASTQ files, etc.

> For the gene bank links I guess modifying the .loc file is the least
> bad way. Maybe it can be merged into galaxy_blast, that would at
> least solve the interoperability problems.

It would have to be sufficiently general, and backward compatible.

FYI other people have also looked at extending the blast *.loc
files (e.g. adding a category column for helping filter down a
very large BLAST database list).

> @Peter: One potential problem in merging my blast2html tool
> could be that I have written it in python3, and the current tool
> wrapper therefore installs python3 and a host of its dependencies,
> making for a quite large download.

Without seeing your code, it is hard to say, but actually writing
Python code which works unmodified under Python 2.7 and
Python 3 is quite doable (and under Python 2.6 with a few
more provisos). Both NumPy and Biopython do this if you
wanted some reassurance.

On the other hand, Galaxy itself will need to move to Python 3
at some point, and certainly individual tools will too. This will
probably mean (as with Linux Python packages) having double
entries on the ToolShed (one for Python 2, one for Python 3),

e.g. ToolShed package for NumPy under Python 2 (done)
and under Python 3 (needed).

Peter