From p.j.a.cock@googlemail.com Wed Apr 9 07:04:22 2014
From: Peter Cock
To: galaxy-dev@lists.galaxyproject.org
Subject: [galaxy-dev] Data Tables and *.loc files: Using named columns versus from_data_table
Date: Wed, 09 Apr 2014 12:04:11 +0100
Message-ID:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============2600076960375802030=="
--===============2600076960375802030==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Hi all,
In discussion about adding an NCBI BLAST data manager
https://github.com/peterjc/galaxy_blast/issues/22 based on
Dan's example, Michael Li has suggested using the new(ish)
Data Table functionality of Galaxy for using *.loc files:
https://wiki.galaxyproject.org/Admin/Tools/Data%20Tables
Currently the BLAST+ wrappers access the blastdb.loc file
for picking a system installed nucleotide BLAST database
like this:
See https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/ncbi_macros.xml
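As a rough sketch of that pattern (the column order and the label text
here are illustrative assumptions; the linked macros file is the
authoritative version):

```xml
<!-- Sketch only: column names/order and label are assumed for illustration -->
<param name="database" type="select" label="Nucleotide BLAST database">
    <options from_file="blastdb.loc">
        <!-- Each column maps a tab-separated field of the *.loc file -->
        <column name="value" index="0"/>
        <column name="name" index="1"/>
        <column name="path" index="2"/>
    </options>
</param>
```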
With the from_data_table feature this would be much shorter:
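Something along these lines, assuming the data table is registered
under the name blastdb (the table name and label are assumptions for
illustration):

```xml
<!-- Sketch only: "blastdb" is an assumed data table name -->
<param name="database" type="select" label="Nucleotide BLAST database">
    <!-- No column definitions here; they come from the data table -->
    <options from_data_table="blastdb"/>
</param>
```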
For this to work, the column information must instead be
defined centrally in ``tool_data_table_conf.xml`` (via a
``tool_data_table_conf.xml.sample`` file), e.g.
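A minimal sketch of such an entry, assuming a table called blastdb with
three columns and the *.loc file living under tool-data/ (all three are
assumptions for illustration):

```xml
<tables>
    <!-- Assumed table name, column list and file path -->
    <table name="blastdb" comment_char="#">
        <columns>value, name, path</columns>
        <file path="tool-data/blastdb.loc"/>
    </table>
</tables>
```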
For simple tools this seems quite neat, but within a single tool
suite using XML macros seems equally effective for centrally
defining the columns in the *.loc files (we do this currently).
However, what worries me is the data table XML configuration
file adds a new complexity for dependency management between
different ToolShed repositories using a *.loc file (like the *.loc
files for BLAST databases).
For the BLAST database *.loc files, the simplest solution seems
to be to carry on as we do now and not use the Data Tables feature.
The next best solution seems to be to put the sample *.loc files
and associated data table definition XML files into a shared
ToolShed repository (called blast_data_tables, or
blast_databases?) which would be declared as a dependency
of anything using the BLAST database *.loc files (e.g. the
BLAST+ wrappers and any data managers).
[This would be like the existing blast_datatypes ToolShed
repository which is a declared dependency of many tools
using BLAST]
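For illustration, a dependent repository might declare this in its
repository_dependencies.xml roughly as follows. This is only a sketch:
the repository name is the one proposed above and does not exist yet,
and the owner, toolshed and changeset_revision values are placeholders.

```xml
<?xml version="1.0"?>
<repositories description="Requires the shared BLAST *.loc data table definitions">
    <!-- Hypothetical repository; OWNER, TOOLSHED_URL and REVISION are placeholders -->
    <repository name="blast_data_tables" owner="OWNER"
                toolshed="TOOLSHED_URL" changeset_revision="REVISION"/>
</repositories>
```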
Is that a good plan? What benefits does it have over simply
not using the Data Table functionality?
Thanks,
Peter
--===============2600076960375802030==--