If you look at some of the NGS tools (see SAM-to-BAM for instance,
because it's simple) you can see the way we have handled this, which
is to offer several pre-built indexes but also give the user the
option to use a fasta file for which there is no pre-built index,
through the use of conditionals. So there is definitely no need to
make an entirely separate tool for the indexing. Also, note that
handling files used for input/output/indexes for external tools can
sometimes be tricky, if those tools expect the files to have
particular extensions. SAM-to-BAM also handles this situation with
some temp file renaming trickery.
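To make the renaming idea concrete, here is a minimal sketch of the kind of staging a wrapper script could do. It is not the actual SAM-to-BAM code. The helper names (stage_with_extension, collect_outputs) are hypothetical, and it assumes the external tool writes each output next to its input as "<input path><suffix>", which you should check against what 2bwt-builder really produces.

```python
import os
import shutil
import tempfile


def stage_with_extension(src_path, extension, work_dir):
    """Expose src_path inside work_dir under a name ending in `extension`.

    Galaxy datasets are typically stored with a generic name, so a tool
    that insists on a particular extension needs a renamed view of the file.
    """
    staged = os.path.join(work_dir, "input" + extension)
    try:
        # A symlink avoids copying a large FASTA file.
        os.symlink(os.path.abspath(src_path), staged)
    except (OSError, NotImplementedError, AttributeError):
        # Fall back to a real copy where symlinks are unavailable.
        shutil.copyfile(src_path, staged)
    return staged


def collect_outputs(staged_path, outputs):
    """Move each '<staged_path><suffix>' file to its destination path.

    `outputs` maps a suffix such as '.amb' to the path the framework
    expects that output to be written to (assumed naming scheme).
    Returns the list of suffixes the tool did not produce.
    """
    missing = []
    for suffix, dest in outputs.items():
        produced = staged_path + suffix
        if os.path.exists(produced):
            shutil.move(produced, dest)
        else:
            missing.append(suffix)
    return missing
```

A wrapper would create a scratch directory with tempfile.mkdtemp(), call stage_with_extension() on the input dataset, run the external tool on the staged path, then call collect_outputs() with the output paths passed in on the command line, and finally remove the scratch directory.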
For the pre-built indexes, once you have them built, have a look at
the DataIntegration wiki page, but
will also be helpful to you (start with the "Setting Up the
Reference Genomes for NGS Tools" section).
Let us know if you run into any more issues.
On Aug 17, 2010, at 11:35 AM, Branden Timm wrote:
Thanks for the advice. I agree it does make more sense to generate
the indexes once. Would I create those indexes in the same
directory as the FASTA and follow the DataIntegration document here?
On 8/17/2010 10:06 AM, Hans-Rudolf Hotz wrote:
> Hi Branden
>> I'm very new to Galaxy, and trying to use SOAPaligner/soap2 as an
>> integration case.
>> soap2 includes two executables, 2bwt-builder and soap. 2bwt-builder
>> takes a FASTA file and generates a set of 13 different index files,
>> which soap needs in order to do its alignment.
>> I have started by just creating the tool XML configuration for
>> 2bwt-builder. The configuration follows:
>> <tool id="2bwt-builder" name="2bwt-Builder">
>> <description>build index files for the SOAPaligner/soap2</description>
>> <command>2bwt-builder $input</command>
> the "command line" needs all output files listed, see:
> However, in your case: Do you really want to make an extra tool for
> the indexing step? Wouldn't it make more sense to have the indices
> pre-built for some genomes?
> Your soap galaxy tool can then re-use the indices again and again.
> This is also much more space efficient, as all the users share the
> same index files.
> Regards, Hans
>> <param type="data" format="fasta" name="input"
>> <data format="tabular" name=".amb Index File"/>
>> <data format="tabular" name=".ann Index File"/>
>> <data format="tabular" name=".bwt Index File"/>
>> <data format="tabular" name=".fmv Index File"/>
>> <data format="tabular" name=".hot Index File"/>
>> <data format="tabular" name=".lkt Index File"/>
>> <data format="tabular" name=".pac Index File"/>
>> <data format="tabular" name=".rev.bwt Index File"/>
>> <data format="tabular" name=".rev.fmv Index File"/>
>> <data format="tabular" name=".rev.lkt Index File"/>
>> <data format="tabular" name=".rev.pac Index File"/>
>> <data format="tabular" name=".sa Index File"/>
>> <data format="tabular" name=".sai Index File"/>
>> I've used the tabular data type for the output files, which I'm
>> not sure is correct. When the script runs, it generates 13 output
>> files in my history, but they are all empty according to Galaxy.
>> When I look at galaxy_dist/database/files/.../, the output files
>> have been created correctly and are non-empty.
>> Where am I going wrong? Thank you in advance for any advice.
>> Branden Timm
>> System Administrator
>> Great Lakes Bioenergy Research Center
>> University of Wisconsin
>> galaxy-dev mailing list