From p.j.a.cock@googlemail.com Wed Apr 9 07:04:22 2014
From: Peter Cock
To: galaxy-dev@lists.galaxyproject.org
Subject: [galaxy-dev] Data Tables and *.loc files: Using named columns versus from_data_table
Date: Wed, 09 Apr 2014 12:04:11 +0100

Hi all,

In discussion about adding an NCBI BLAST data manager
https://github.com/peterjc/galaxy_blast/issues/22
based on Dan's example, Michael Li has suggested using the
new(ish) Data Table functionality of Galaxy for using *.loc files:
https://wiki.galaxyproject.org/Admin/Tools/Data%20Tables

Currently the BLAST+ wrappers access the blastdb.loc file for
picking a system-installed nucleotide BLAST database like this:

See https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/ncbi_macros.xml

With the from_data_table feature this would be much shorter:

For this to work, the column information must instead be defined
centrally in ``tool_data_table_conf.xml`` (via a
``tool_data_table_conf.xml.sample`` file), e.g.

value, name, path
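
To make the comparison above concrete, here is a rough sketch of the two styles, assuming Galaxy's usual tool XML syntax. This is not the actual BLAST+ wrapper code (see the ncbi_macros.xml link for that). The parameter name "database", the label text, the table name "blastdb" and the file path are assumptions for illustration; only the column names value, name, path come from the message itself.

```xml
<!-- Style 1 (assumed): named columns declared inside the tool itself -->
<param name="database" type="select" label="Nucleotide BLAST database">
    <options from_file="blastdb.loc">
        <column name="value" index="0"/>
        <column name="name" index="1"/>
        <column name="path" index="2"/>
    </options>
</param>

<!-- Style 2 (assumed): the tool only refers to a data table by name -->
<param name="database" type="select" label="Nucleotide BLAST database">
    <options from_data_table="blastdb"/>
</param>

<!-- For style 2, the columns move to tool_data_table_conf.xml
     (table name and file path here are hypothetical) -->
<tables>
    <table name="blastdb" comment_char="#">
        <columns>value, name, path</columns>
        <file path="tool-data/blastdb.loc"/>
    </table>
</tables>
```

In the second style the tool XML no longer knows the column layout, so every repository that reads the same *.loc file has to agree on one table definition.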
For simple tools this seems quite neat, but within a single tool
suite using XML macros seems equally effective for centrally
defining the columns in the *.loc files (we do this currently).

However, what worries me is that the data table XML configuration
file adds new complexity for dependency management between
different ToolShed repositories using a *.loc file (like the *.loc
files for BLAST databases).

For the BLAST database *.loc files, the simplest solution seems to
be not to use the Data Tables feature (as we do now).

The next best solution seems to be to put the sample *.loc files
and associated data table definition XML files into a shared
ToolShed repository (called blast_data_tables, or blast_databases?)
which would be declared as a dependency of anything using the
BLAST database *.loc files (e.g. the BLAST+ wrappers and any
data managers).

[This would be like the existing blast_datatypes ToolShed repository,
which is a declared dependency of many tools using BLAST]

Is that a good plan? What benefits does it have over simply
not using the Data Table functionality?

Thanks,

Peter