Hi Galaxy team,
Here are some thoughts after using Galaxy for a while, which might be
interesting for future development...
1. Interface consistency: "Save"
* There are three nice icons at the top of all my dataset items in the
history panel on the right for view, edit and delete. So why is there
no save icon at the same location instead of a link further down?
* When I edit a workflow there is a save button above the canvas and
there is another one in the panel on the right when I edit the
properties of a specific workflow item. As far as I can tell these
buttons are not completely redundant, but why do I need two save
buttons?
2. Provenance data
* Reproducibility is important and it is nice that Galaxy
automatically captures your analysis in histories. But after, let's
say, a few months I may want to have a second look at my data to
figure out what exactly I did and how a certain combination of data
and tools produced a certain result. For example, if I executed a
workflow once every two weeks on updated data for many months, I might
want to retrieve the history for a certain version of a database. So I
might want to say "give me the histories containing datasets tagged as
Ensembl version 48, or UniProt 3, or some version of a reference
assembly", etc. Or I might want to see how the results changed for a
certain gene over time as a result of updated databases and/or tools.
So I might want to say to Galaxy "show me all histories containing
ENSGALG000012589 or NM_45689725". Hence, I'd love to be able to search
histories. In addition, to make it a bit easier to trace things in
browse mode, it would be nice if the date a history was last modified
were visible. Currently I only have the age of the history in
minutes, hours or days. That is convenient for recent items, but for
things from longer ago a date makes more sense to me...
* There is a fixed "Database/Build" popup that I can use to tag my
datasets, but this feels artificially limited. Is there any reason
why the species and database version cannot be separate items? If
there were a popup first to select a species, followed by a second
popup to select the genome assembly version, the lists could be a lot
smaller and hence easier to navigate.
In addition, there are cases where I do have a species but don't have
an assembly, or where there are additional version numbers to keep
track of.
For example, I have lots of Ensembl data. Ensembl does not have a
single version number, but 3 version numbers. There is one for the
database schema, one for the assembly and one for the annotation/
genebuild. The current version for mouse is for example: 55 37 h, where
55 is the release and schema version number, 37 the assembly and "h"
the version of the gene build. In addition, I recently moved to a
proteomics group and might want to capture DB version numbers for
species without a reference assembly. For example, I might know the
species name and the fact that I'm using UniProt 15.5... but currently
I cannot easily capture that in a consistent way. (I know I might add
this to the "info" for a dataset, but it's free text, with all kinds
of possible spelling variants as a result...)
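To make the two suggestions under point 2 a bit more concrete, here is
a rough Python sketch of what I have in mind: a structured version tag
instead of a single "Database/Build" value, plus a simple search over
histories. This is only an illustration. The names SourceVersion,
History and find_histories are made up by me and are not part of
Galaxy's actual data model or API, and the species attached to the
UniProt example is just a placeholder.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SourceVersion:
    """Hypothetical structured replacement for a single Database/Build value."""
    species: str
    database: str                    # e.g. "Ensembl" or "UniProt"
    release: str                     # release / schema version
    assembly: Optional[str] = None   # None when there is no reference assembly
    genebuild: Optional[str] = None  # annotation / gene build version


@dataclass
class History:
    """Hypothetical stand-in for a Galaxy history (not Galaxy's real model)."""
    name: str
    sources: list = field(default_factory=list)    # SourceVersion tags
    identifiers: set = field(default_factory=set)  # gene/transcript IDs in datasets


def find_histories(histories, database=None, release=None, identifier=None):
    """Return the histories that match every criterion that was given."""
    matches = []
    for h in histories:
        if database is not None or release is not None:
            # At least one source tag must match both database and release.
            if not any(
                (database is None or s.database == database)
                and (release is None or s.release == release)
                for s in h.sources
            ):
                continue
        if identifier is not None and identifier not in h.identifiers:
            continue
        matches.append(h)
    return matches


# Ensembl mouse "55 37 h": release 55, assembly 37, gene build h.
mouse_55 = SourceVersion("Mus musculus", "Ensembl", "55", assembly="37", genebuild="h")
# UniProt 15.5 for a species without a reference assembly (placeholder species).
uniprot = SourceVersion("Gallus gallus", "UniProt", "15.5")

histories = [
    History("mouse run", [mouse_55], {"NM_45689725"}),
    History("proteomics run", [uniprot], {"ENSGALG000012589"}),
]

ensembl_55 = find_histories(histories, database="Ensembl", release="55")
with_gene = find_histories(histories, identifier="ENSGALG000012589")
```

With separate fields like these, "give me the histories tagged as
Ensembl version 55" and "show me all histories containing
ENSGALG000012589" become simple lookups, and there is no room for
spelling variants as there is in a free-text "info" field.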
Cheers,
Pi
-------------------------------------------------------------
Biomolecular Mass Spectrometry and Proteomics
Utrecht University
Visiting address:
H.R. Kruyt building room O607
Padualaan 8
3584 CH Utrecht
The Netherlands
Mail address:
P.O. box 80.082
3508 TB Utrecht
The Netherlands
phone: +31 (0)6-143 66 783
email: pieter.neerincx(a)gmail.com
skype: pieter.online
------------------------------------------------------------