Re: [galaxy-dev] Feedback

15 Jul 2009

      On Jul 15, 2009, at 8:43 AM, Pieter Neerincx wrote:
...
The following is just a bunch of thoughts after using Galaxy for a
while and which might be interesting for future developments...
Thanks for the suggestions, lots of good ideas here.
...
* There are three nice icons at the top of all my dataset items in the
history panel on the right for view, edit and delete. So why is there
no save icon at the same location instead of a link further down?
This is simply because there is limited space. We've made the three  
most frequently needed options available as icons, but there is not  
space to include all options (save is just one, for certain datatypes  
we also have browse links and other options). I've played with a few  
ideas on how to make these options easier to get to. The most likely  
candidate is adding a popup menu, but this needs further study since  
any change to the history has potential to confuse current users.
...
* When I edit a workflow there is a save button above the canvas and
there is another one in the panel on the right when I edit the
properties of a specific workflow item. As far as I can tell these
buttons are not completely redundant, but why do I need two save
buttons?
The reason is that one save button just saves and validates the  
parameters for a single step, the other saves and validates parameters  
for the entire workflow. We want to make sure that each step is  
validated at the time it is being edited. Otherwise we would need to  
guide the user through a complex validation process at save time.

I'm not happy with the way this works right now either, but again have
not found a better solution. We could autosave the forms whenever a  
user changes a field, but this has some tricky UI implications.
...
Ensembl version 48, or UniProt 3 or some version of a reference
assembly, etc. Or I might want to see how the results changed for a
certain gene over time as result of updated databases and /or tools.
So I might want to say to Galaxy show me all histories containing
ENSGALG000012589 or NM_45689725. Hence, I'd love to be able to search
histories.
This is a great idea, but computationally complex. If we just want to  
search the metadata in the histories (title, info, etc.), that is pretty
straightforward. Searching the actual data would be very costly, since  
it would require some sort of full text indexing of every dataset in  
Galaxy. Of course, this could be done in the background rather than in  
realtime... worth considering.

What would make this work better is if we could have additional  
metadata automatically added to datasets we get from external  
datasources that would include the sort of information people would be  
interested in searching for.
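
As a rough illustration of the background-indexing idea, here is a minimal
Python sketch of an inverted index from identifier to history. It is not
Galaxy code: the names build_index and histories_containing, and the layout
of the datasets mapping, are assumptions made up for this example.

```python
import re
from collections import defaultdict

# Treat runs of letters, digits and underscores as searchable tokens,
# so identifiers such as NM_45689725 stay in one piece.
TOKEN = re.compile(r"[A-Za-z0-9_]+")


def build_index(datasets):
    """Build an inverted index: token -> set of history names.

    `datasets` is a hypothetical mapping of
    (history_name, dataset_name) -> dataset text.
    This is the expensive step that would run in the background.
    """
    index = defaultdict(set)
    for (history, _dataset), text in datasets.items():
        for token in TOKEN.findall(text):
            index[token.upper()].add(history)
    return index


def histories_containing(index, identifier):
    """Return the sorted names of histories whose data mention `identifier`."""
    return sorted(index.get(identifier.upper(), set()))
```

Once the index exists, a lookup is a single dictionary access, which is why
moving the indexing out of the request path makes the search itself cheap.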
...
In addition, to make it a bit easier to trace things in
browse mode it would be nice if the date a history was last modified
would be visible. Currently I only have the age of the history in
minutes, hours or days. That is convenient for recent items, but for
things that are longer ago a date makes more sense to me....
This is no problem. We can display a date if the history is more than
a few days old.
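
A minimal Python sketch of that rule, for illustration only. The function
name describe_update_time and the three-day cutoff are assumptions; "a few
days" is not pinned down above.

```python
from datetime import timedelta


def _plural(count, unit):
    """Format a count with a singular or plural unit, e.g. '1 hour', '2 days'."""
    return f"{count} {unit}" if count == 1 else f"{count} {unit}s"


def describe_update_time(updated, now, cutoff=timedelta(days=3)):
    """Show a relative age for recent histories and a date for older ones.

    `updated` and `now` are datetime objects; `cutoff` is an assumed
    threshold for switching from age to date.
    """
    age = now - updated
    if age >= cutoff:
        return updated.strftime("%Y-%m-%d")
    minutes = int(age.total_seconds() // 60)
    if minutes < 60:
        return _plural(minutes, "minute") + " ago"
    hours = minutes // 60
    if hours < 24:
        return _plural(hours, "hour") + " ago"
    return _plural(hours // 24, "day") + " ago"
```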
...
* There is a fixed "Database/Build" popup that I can use to tag my
data sets, but this feels artificially limited. Is there any reason
why the species and database version cannot be separate items? If
there would be a popup first to select a species followed by a second
popup to select the genome assembly version, the lists could be a lot
smaller and hence easier to navigate.
Absolutely. We're playing with a variety of ways to make
database/build easier to work with. The main problem is that since we draw
builds from multiple sources, getting the appropriate data to group  
them in a reliable way is challenging.
...
In addition there are cases where I do have a species, but don't have
an assembly or where there are additional version numbers to keep
track of.
For example I have lots of Ensembl data. Ensembl does not have a
single version number, but 3 version numbers. There is one for the
database schema, one for the assembly and one for the
annotation/genebuild. The current version for mouse is for example: 55 37 h, where
55 is the release and schema version number, 37 the assembly and "h"
the version of the gene build. In addition I recently moved to a
proteomics group and might want to capture DB version numbers for
species without a reference assembly. For example I might know the
species name and the fact I'm using UniProt 15.5... but currently I
cannot easily capture that in a consistent way. (I know I might add
this to the "info" for a dataset, but it's free text, with all kinds
of possible spelling variants as a result...)
Builds and versioning are one of the trickiest problems we have to  
deal with. I'd love to hear your suggestions on how you'd like the  
tracking of this information to work.
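
To make the discussion concrete, one possible shape for such a record is
sketched below in Python. This is only a starting point, not an existing
Galaxy structure: the SourceVersion class, its field names and
parse_ensembl_version are all hypothetical. The idea is that species, source
and release are always present, while assembly and gene build are optional,
which would cover both the Ensembl "55 37 h" case and a UniProt 15.5 dataset
with no reference assembly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceVersion:
    """Hypothetical structured replacement for a free-text version note."""
    species: str
    source: str                      # e.g. "Ensembl" or "UniProt"
    release: str                     # e.g. "55" or "15.5"
    assembly: Optional[str] = None   # absent for species without an assembly
    genebuild: Optional[str] = None  # Ensembl gene build, e.g. "h"


def parse_ensembl_version(species, text):
    """Parse an Ensembl-style 'release assembly genebuild' string."""
    parts = text.split()
    if len(parts) != 3:
        raise ValueError(f"expected 'release assembly genebuild', got {text!r}")
    release, assembly, genebuild = parts
    return SourceVersion(species, "Ensembl", release, assembly, genebuild)
```

Because the fields are separate, two datasets can be compared or filtered by
release or assembly without relying on consistent spelling in the info text.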

Thanks,
James

James Taylor