From: Kelly Vincent <kpvincent(a)bx.psu.edu>
Date: August 18, 2010 11:46:32 AM EDT
Subject: Re: [galaxy-user] [galaxy-dev] Tool Integration:
Are you currently calling a script file (Python, Perl, etc.) in your
command tag, or calling the tool directly? If you are running it
only with an XML file, you cannot do what you want here. The command
that will be run is generated by converting the variable values you
put in the <command> tag to the names of the actual input and output
files that Galaxy creates when the tool is run.
However, this is not particularly difficult if you are calling a
script in the <command> tag, though it's a little convoluted. There
are several tools that deal with this situation (the NGS tools
again, in particular). There are a couple of basic situations. If
you just need to have a single file with a certain extension, simply
create a temp file with the extension specified (in Python
"tempfile.mkstemp( suffix='.ext' )" will do it). It's slightly more
complicated if you need associated files in the same directory. For
instance, Samtools expects a fasta file like "hg19.fa" and its
index "hg19.fa.fai" to be in the same place: when you call the
indexing command with the name of the fasta file, it automatically
creates the index file with that name plus ".fai". In this case,
you should symlink the original file to a temp file (with or without
a specified extension, as appropriate) and then get the name of that
temp file. The name of the other file is that name PLUS the
appropriate extension. For example, sam_to_bam.py does this in lines
73 through 76.
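As a rough illustration of the two situations above, here is a minimal Python sketch. It is not the code from sam_to_bam.py; the function names, the "galaxy_tool_" prefix, and the "input" link name are all made up for the example, and it symlinks into a fresh temp directory rather than onto a temp file name, which is one of several ways to get the same effect.

```python
import os
import shutil
import tempfile


def temp_file_with_extension(suffix):
    """Case 1: create an empty temp file that ends in the given extension."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # only the path is needed; the tool opens the file itself
    return path


def link_into_temp_dir(dataset_path, suffix=".fa"):
    """Case 2: symlink a dataset into a fresh temp directory.

    Returns (tmp_dir, link_path). Anything the tool writes next to
    link_path lands in tmp_dir, not beside the original dataset.
    The names used here are hypothetical, chosen for the example.
    """
    tmp_dir = tempfile.mkdtemp(prefix="galaxy_tool_")
    link_path = os.path.join(tmp_dir, "input" + suffix)
    os.symlink(os.path.abspath(dataset_path), link_path)
    return tmp_dir, link_path


def companion_path(link_path, extension=".fai"):
    """Name of the associated file: the linked name PLUS the extension."""
    return link_path + extension


def cleanup(tmp_dir):
    """Remove the temp directory and everything the tool left in it."""
    shutil.rmtree(tmp_dir)
```

A wrapper script would call link_into_temp_dir() on the input dataset, run the tool on the returned link, copy or move whatever it needs from the temp directory to the output paths Galaxy passed in, and then call cleanup().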
Note that the reason for symlinking to a temp file is that otherwise
the tool will create a new file in a database/files subdirectory,
which is not the right way to do things. So in the Samtools case, if
the uploaded fasta file is "database/files/000/dataset_23.dat", a
new file "database/files/000/dataset_23.dat.fai" will be created,
but the database won't know about it and it will not be deleted
after the tool finishes (but it still won't be available to future
The paster.log contains the command as generated by the <command>
tag, but it's not possible to print to that log from a tool script.
You can simply print the relevant command to stdout or stderr from
the script and when you run it in the browser, you will see it in
the info section.
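A minimal sketch of that idea in a Python wrapper follows. The helper names are invented for the example, and it writes to stdout; whether stdout or stderr is the better choice depends on how your Galaxy instance treats stderr output, so check that before relying on it.

```python
import subprocess
import sys


def format_command(cmd_args):
    """Join the argument list into one line for display purposes only."""
    return " ".join(cmd_args)


def run_and_report(cmd_args):
    """Echo the command to stdout, run it, and return its exit code."""
    sys.stdout.write("Running: %s\n" % format_command(cmd_args))
    sys.stdout.flush()  # make sure the echo appears before the tool's output
    return subprocess.call(cmd_args)
```

For example, run_and_report(["2bwt-builder", fasta_path]) would echo the command line and then run it, where fasta_path stands for whatever file name the wrapper has prepared.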
Hopefully this makes sense--let us know if you need further
On Aug 17, 2010, at 1:21 PM, Branden Timm wrote:
> Just a follow up here ... if the <command> tag needs to list all
> output files, but the tool itself does not specify output files on
> the command line, does that mean I need to write a wrapper for my
> tool that accepts each output file as a command-line parameter?
> Currently I've tried specifying all of the output files on the
> <command> line, where each variable corresponds to a <data> line
> under outputs, but then SOAPaligner/soap2 tries to read those non-
> existent files as FASTA files.
> As a more general question, is there a way to see the exact
> command(s) that Galaxy is dispatching on my behalf when I execute a
> tool? I checked paster.log but it wasn't there. It seems like
> that would be a great debugging feature.
> On Aug 17, 2010, at 10:06 AM, Hans-Rudolf Hotz wrote:
>> Hi Branden
>>> I'm very new to Galaxy, and trying to use SOAPaligner/soap2 as an
>>> integration case.
>>> soap2 includes two executables, 2bwt-builder and soap. 2bwt-builder
>>> takes a FASTA file and generates a set of 13 different index files,
>>> which soap needs in order to do its alignment.
>>> I have started by just creating the tool XML configuration for
>>> 2bwt-builder. The configuration follows:
>>> <tool id="2bwt-builder" name="2bwt-Builder">
>>> <description>build index files for the SOAPaligner/soap2</description>
>>> <command>2bwt-builder $input</command>
>> the "command line" needs all output files listed, see:
>> However, in your case: Do you really want to make an extra tool
>> for the indexing step? Wouldn't it make more sense to have the
>> indices pre-built for some genomes?
>> Your soap galaxy tool can then re-use the indices again and again.
>> This is also much more space efficient, as all the user share the
>> same index files.
>> Regards, Hans
>>> <param type="data" format="fasta"
>>> <data format="tabular" name=".amb Index File"/>
>>> <data format="tabular" name=".ann Index File"/>
>>> <data format="tabular" name=".bwt Index File"/>
>>> <data format="tabular" name=".fmv Index File"/>
>>> <data format="tabular" name=".hot Index File"/>
>>> <data format="tabular" name=".lkt Index File"/>
>>> <data format="tabular" name=".pac Index File"/>
>>> <data format="tabular" name=".rev.bwt Index File"/>
>>> <data format="tabular" name=".rev.fmv Index File"/>
>>> <data format="tabular" name=".rev.lkt Index File"/>
>>> <data format="tabular" name=".rev.pac Index File"/>
>>> <data format="tabular" name=".sa Index File"/>
>>> <data format="tabular" name=".sai Index File"/>
>>> I've used the tabular data type for the output files, which I'm
>>> not sure
>>> is correct. When the script runs, it generates 13 output files in
>>> history, but they are all empty according to galaxy. When I look at
>>> galaxy_dist/database/files/.../, the output files have been created
>>> correctly and are non-empty.
>>> Where am I going wrong? Thank you in advance for any advice.
>>> Branden Timm
>>> System Administrator
>>> Great Lakes Bioenergy Research Center
>>> University of Wisconsin
>>> galaxy-dev mailing list
galaxy-user mailing list