Bumping one of my old queries again, with some more use-cases at the end:
On Wed, Aug 10, 2011 at 11:28 AM, Peter Cock <p.j.a.cock(a)googlemail.com> wrote:
Bumping this old query:
On Mon, Nov 22, 2010 at 10:11 AM, Peter <peter(a)maubp.freeserve.co.uk> wrote:
> On Fri, Nov 19, 2010 at 12:20 PM, Peter <peter(a)maubp.freeserve.co.uk> wrote:
>> Hi all,
>> I'd like to know more about Galaxy's column metadata for tabular files.
>> In the workflow editor under "Edit Step Actions" you can pick
>> Columns", and then give column numbers for five predefined cases:
>> Chrom, Start, End, Strand, Name.
>> Do these "named columns" get shown anywhere in the Galaxy UI?
>> For example, in a column select parameter widget?
> I've spotted some of these in the "peep" view (the right hand side
> history column) for interval data. Are they specific to interval data
> only, with no general mechanism available for other tabular data?
>> Is it possible to assign these columns in a tool's wrapper XML file?
>> From http://bitbucket.org/galaxy/galaxy-central/wiki/ToolConfigSyntax
>> I'm aware of the metadata_source attribute to *copy* the meta data
>> from the input file, but that isn't always relevant. Can I somehow
>> specify that my tool has tabular output where column 1 is "Name"?
>> Is it possible to introduce additional column types? e.g. "evalue" or
Is there any mechanism for tools to set tabular files' column metadata?
I was prompted to return to this issue after going through a fairly
simple BLAST data analysis flow with a biology colleague - and
being reminded just how non-obvious some of the task steps
were [*]. Galaxy could still be much easier to use.
Most of my protein analysis tool wrappers output tabular files,
where column 1 is the query name, and the rest of the columns
will be some sort of predictive model outcome or score. I do of
course document the column meanings in the tool's help (and
include a #header line in the output where possible), but this
could be much more user friendly.
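
To make the #header convention concrete, here is a minimal Python sketch of how a consumer could recover column names from such a file. This is only an illustration, not Galaxy code: the function name read_named_columns and the example column names (ID, Score, Prediction) are made up, and it assumes the first line is a tab-separated header prefixed with '#'.

```python
def read_named_columns(lines):
    """Return (column_names, rows) from tab-separated lines.

    Assumption: the first line is a '#'-prefixed header naming the columns.
    Each row is returned as a dict keyed by column name (values stay strings).
    """
    lines = iter(lines)
    header = next(lines)
    if not header.startswith("#"):
        raise ValueError("expected a #header line as the first line")
    names = header[1:].rstrip("\r\n").split("\t")
    rows = [
        dict(zip(names, line.rstrip("\r\n").split("\t")))
        for line in lines
        if line.strip()  # skip blank lines
    ]
    return names, rows


# Hypothetical output from a protein analysis tool wrapper.
example = [
    "#ID\tScore\tPrediction\n",
    "prot1\t0.93\tsecreted\n",
    "prot2\t0.12\tcytoplasmic\n",
]
names, rows = read_named_columns(example)
```

With names like these available, a column select widget could in principle show "Score" rather than "c2", which is the kind of behaviour I am asking about.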
A specific example is BLAST+ tabular output - where I have taken
pains to document the columns in the tools' help text, but it would
be much nicer to be able to annotate the columns within Galaxy's
metadata as well. If this isn't possible in the base 'tabular' datatype,
can it be done as a custom 'blast-tabular' datatype instead?
This relates to an open issue on improving the parameter widget
for selecting a column (or columns):
Bitbucket issue 554: Show column names, headers or first entry in
column select parameters
For instance, when sorting a BLAST tabular file, it should be
trivially easy to sort by bitscore without first having to go away
and look up the meaning of each column in order to know this is
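
As a rough sketch of what sorting by name rather than by number could look like, assuming the standard 12 column BLAST+ tabular layout (qseqid through bitscore). The helper sort_by_column and the two example hits are hypothetical, and this is plain Python rather than anything Galaxy provides:

```python
# Column names of the standard 12 column BLAST+ tabular output.
BLAST_STD_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def sort_by_column(lines, name, columns=BLAST_STD_COLUMNS, reverse=True):
    """Sort tab-separated rows numerically by a named column.

    Assumption: every row has at least as many fields as `columns`
    and the chosen column holds numeric values.
    """
    index = columns.index(name)  # raises ValueError for an unknown name
    rows = [line.rstrip("\r\n").split("\t") for line in lines if line.strip()]
    return sorted(rows, key=lambda row: float(row[index]), reverse=reverse)


# Two made-up hits for the same query, weaker hit listed first.
hits = [
    "q1\ts2\t80.0\t150\t30\t2\t10\t159\t20\t169\t2e-30\t120\n",
    "q1\ts1\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-100\t370\n",
]
ranked = sort_by_column(hits, "bitscore")  # highest bitscore first
```

The point is that the user asks for "bitscore" and the name-to-number lookup (column 12 here) happens behind the scenes.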
[*] This is one reason why I've just switched the default BLAST+
output from the standard 12 column output to the extended 24
column output in v0.0.17 of the wrappers: