Re: [galaxy-user] Workflow improvement requests (long)

15 Nov 2008

      Dear Assaf and everybody else,

I can only reinforce what you said: Great work! I have also run into
similar problems. In particular, when working with workflows that have,
say, 50 different steps, things can become very confusing. It would
help if one could define the outputs of a workflow and hide all the steps
in the history that are inside the workflow and not related to its inputs
and outputs.

Another feature that I would find very helpful in designing larger
workflows is the ability to use workflows within a larger workflow.
In my case I have a set of tasks that have to be repeated with several
different settings within a larger workflow.
I realize that workflows are still in beta and that it might be too
early to ask for such features... but it would be great to see them in
beta soon.

Thanks a lot for your efforts!

Gunnar

On 14.11.2008, at 22:15, Assaf Gordon wrote:
...
Dear all,
Recently, users of our local Galaxy server started using
workflows, and they are very pleased. However, as workflows get more
complicated, it gets harder to track the inputs and outputs of the
workflows.
I'd like to share an example, to illustrate the problems that we  
encounter.
The workflow (pictured in the attached 'workflow.png') takes 4 input
datasets, and produces 4 output datasets.
The first problem is that there's no way to differentiate between
the input datasets (they appear simply as "Step 1: Input dataset",
"Step 2: Input dataset", etc.). Since each dataset has a specific
role, I've had to print the workflow and give the users instructions
as to which dataset (in their history) goes into which input (see
attached 'crosstab_workflow_input_datasets.jpg').
The second problem is that whenever I change something in the
workflow and save it, the order of the datasets changes!
So what was once dataset 1 can now be dataset 2, 3 or 4.
Users have no way of knowing this... (keen users might notice that
the description of the first tool changed from "Output dataset
'output' from step 2" to "Output dataset 'output' from step 4", but
this is very obscure...).
The third problem is that once the workflow completes, the resulting
datasets have cryptic names such as "Join two queries on Data 10 and
Data 2". Since "Data 10" is "Awk on Data 8", "Data 8" is "Generic
Annotations on Data 7 and Data 1", and "Data 7" is "Intersect data 1
and data 6", it gets a bit hard to know what's going on (see
attached 'crosstab_history.png').
For the time being, I've simply given written instructions on what each
dataset means (see attached
'crosstab_workflow_dataset_explnanations.jpg').
If I may suggest a feature - it would be great if I could name a  
dataset inside the workflow. Instead of naming it "Input dataset" I  
could give it a descriptive name, so even if the order of the input  
datasets changes, users will know which dataset goes into which input.
Regarding the output dataset names, the 'label' option in the tools'  
XML is a good start, but still creates very long, hard-to-understand  
names.
Another great feature would be the possibility to add an 'output
label' for each step in the workflow.
Regardless of the above, I'd like to say (once again) that Galaxy is  
a great tool, and workflows are really cool - we have several long  
workflows which do wonderful things.
Thanks for reading so far,
  Gordon.
<workflow.png>
<crosstab_workflow_input_datasets.jpg>
<crosstab_history.png>
<crosstab_workflow_dataset_explnanations.jpg>
_______________________________________________
galaxy-user mailing list
galaxy-user@bx.psu.edu
http://mail.bx.psu.edu/cgi-bin/mailman/listinfo/galaxy-user
+-------------------------------------------------------------------+
Gunnar Rätsch                         http://www.fml.mpg.de/raetsch
Friedrich Miescher Laboratory       Gunnar.Raetsch@tuebingen.mpg.de
Max Planck Society                          Tel: (+49) 7071 601 820
Spemannstraße 39, 72076 Tübingen, Germany   Fax: (+49) 7071 601 801
