From p.j.a.cock@googlemail.com Wed Apr 9 07:04:22 2014
From: Peter Cock
To: galaxy-dev@lists.galaxyproject.org
Subject: [galaxy-dev] Data Tables and *.loc files: Using named columns versus
from_data_table
Date: Wed, 09 Apr 2014 12:04:11 +0100
Message-ID:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============9191318633861772883=="
--===============9191318633861772883==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Hi all,
In discussion about adding an NCBI BLAST data manager
https://github.com/peterjc/galaxy_blast/issues/22 based on
Dan's example, Michael Li has suggested using the new(ish)
Data Table functionality of Galaxy for using *.loc files:
https://wiki.galaxyproject.org/Admin/Tools/Data%20Tables
Currently the BLAST+ wrappers access the blastdb.loc file
for picking a system installed nucleotide BLAST database
like this:
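
[A rough sketch only, not the exact macro from the repository linked
below. It assumes the usual three-column blastdb.loc layout (value,
name, path); the parameter name and label are illustrative.]

```xml
<param name="database" type="select" label="Nucleotide BLAST database">
    <!-- Columns are named per tool, by position in the tab-separated
         blastdb.loc file (assumed layout: id, caption, path) -->
    <options from_file="blastdb.loc">
        <column name="value" index="0"/>
        <column name="name" index="1"/>
        <column name="path" index="2"/>
    </options>
</param>
```
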
See https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/ncbi_macros.xml
With the from_data_table feature this would be much shorter:
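
[Again a sketch: the table name "blastdb" is an assumption and must
match whatever name the data table is registered under.]

```xml
<param name="database" type="select" label="Nucleotide BLAST database">
    <!-- Column names now come from the data table definition,
         not from the tool XML -->
    <options from_data_table="blastdb"/>
</param>
```
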
For this to work, the column information must instead be
defined centrally in ``tool_data_table_conf.xml`` (via a
``tool_data_table_conf.xml.sample`` file), e.g.
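
[Illustrative entry only: the table name, column list and *.loc path
are assumptions based on the usual blastdb.loc layout.]

```xml
<tables>
    <!-- Hypothetical data table for nucleotide BLAST databases -->
    <table name="blastdb" comment_char="#">
        <!-- Column names, in the order they appear in the *.loc file -->
        <columns>value, name, path</columns>
        <file path="tool-data/blastdb.loc"/>
    </table>
</tables>
```
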
For simple tools this seems quite neat, but within a single tool
suite using XML macros seems equally effective for centrally
defining the columns in the *.loc files (we do this currently).
However, what worries me is the data table XML configuration
file adds a new complexity for dependency management between
different ToolShed repositories using a *.loc file (like the *.loc
files for BLAST databases).
For the BLAST database *.loc files, the simplest solution seems
to be not to use the Data Tables feature (as we do now).
The next best solution seems to be to put the sample *.loc files
and associated data table definition XML files into a shared
ToolShed repository (called blast_data_tables, or
blast_databases?) which would be declared as a dependency
of anything using the BLAST database *.loc files (e.g. the
BLAST+ wrappers and any data managers).
[This would be like the existing blast_datatypes ToolShed
repository which is a declared dependency of many tools
using BLAST]
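
[For comparison, a dependency such as blast_datatypes is declared in a
repository_dependencies.xml file roughly as follows. The Tool Shed URL
and owner shown are assumptions, and the changeset revision is a
placeholder, not a real value.]

```xml
<?xml version="1.0"?>
<repositories description="Requires the BLAST datatype definitions.">
    <!-- toolshed, owner and changeset_revision are illustrative only -->
    <repository toolshed="http://toolshed.g2.bx.psu.edu"
                name="blast_datatypes"
                owner="devteam"
                changeset_revision="REVISION_PLACEHOLDER"/>
</repositories>
```
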
Is that a good plan? What benefits does it have over simply
not using the Data Table functionality?
Thanks,
Peter
--===============9191318633861772883==--
From dan@bx.psu.edu Wed Apr 9 11:14:20 2014
From: Daniel Blankenberg
To: galaxy-dev@lists.galaxyproject.org
Subject: Re: [galaxy-dev] Data Tables and *.loc files: Using named columns
versus from_data_table
Date: Wed, 09 Apr 2014 11:14:18 -0400
Message-ID: <4A1994A3-412D-478E-B9D2-36B71F1EE998@bx.psu.edu>
In-Reply-To:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============7088091603408269563=="
--===============7088091603408269563==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Hi Peter,
Having a standalone repository that just contained the tool data table
and .loc file that could be a dependency of other repositories would
be a good way to go here. Unfortunately, this isn’t supported right
now. I’ve opened a trello card for this: https://trello.com/c/VZxV08Qt

However, even though you currently need to include the tool data table
definition and .loc sample in each repository in order for the tool to be
valid, it is still a best practice to use tool data tables.
Thanks,
Dan
On Apr 9, 2014, at 7:04 AM, Peter Cock wrote:
> Hi all,
>
> In discussion about adding an NCBI BLAST data manager
> https://github.com/peterjc/galaxy_blast/issues/22 based on
> Dan's example, Michael Li has suggested using the new(ish)
> Data Table functionality of Galaxy for using *.loc files:
> https://wiki.galaxyproject.org/Admin/Tools/Data%20Tables
>
> Currently the BLAST+ wrappers access the blastdb.loc file
> for picking a system installed nucleotide BLAST database
> like this:
>
> See https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/ncbi_macros.xml
>
> With the from_data_table feature this would be much shorter:
>
> For this to work, the column information must instead be
> defined centrally in ``tool_data_table_conf.xml`` (via a
> ``tool_data_table_conf.xml.sample`` file), e.g.
>
> For simple tools this seems quite neat, but within a single tool
> suite using XML macros seems equally effective for centrally
> defining the columns in the *.loc files (we do this currently).
>
> However, what worries me is the data table XML configuration
> file adds a new complexity for dependency management between
> different ToolShed repositories using a *.loc file (like the *.loc
> files for BLAST databases).
>
> For the BLAST database *.loc files, the simplest solution seems
> to be not to use the Data Tables feature (as we do now).
>
> The next best solution seems to be to put the sample *.loc files
> and associated data table definition XML files into a shared
> ToolShed repository (called blast_data_tables, or
> blast_databases?) which would be declared as a dependency
> of anything using the BLAST database *.loc files (e.g. the
> BLAST+ wrappers and any data managers).
>
> [This would be like the existing blast_datatypes ToolShed
> repository which is a declared dependency of many tools
> using BLAST]
>
> Is that a good plan? What benefits does it have over simply
> not using the Data Table functionality?
>
> Thanks,
>
> Peter
--===============7088091603408269563==--
From p.j.a.cock@googlemail.com Wed Apr 9 11:28:36 2014
From: Peter Cock
To: galaxy-dev@lists.galaxyproject.org
Subject: Re: [galaxy-dev] Data Tables and *.loc files: Using named columns
versus from_data_table
Date: Wed, 09 Apr 2014 16:28:25 +0100
Message-ID:
In-Reply-To: <4A1994A3-412D-478E-B9D2-36B71F1EE998@bx.psu.edu>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============6658388432725242824=="
--===============6658388432725242824==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
On Wed, Apr 9, 2014 at 4:14 PM, Daniel Blankenberg wrote:
> Hi Peter,
>
> Having a standalone repository that just contained the tool data table
> and .loc file that could be a dependency of other repositories would
> be a good way to go here. Unfortunately, this isn’t supported right
> now. I’ve opened a trello card for this: https://trello.com/c/VZxV08Qt
>
> However, even though you currently need to include the tool data table
> definition and .loc sample in each repository in order for the tool to be
> valid, it is still a best practice to use tool data tables.
OK, thanks Dan.
Peter
--===============6658388432725242824==--