Hi Jeremy and Ross,

I agree that names built by the current chaining mechanism would get very long after even a few steps.

Usually the most important information is in the first and last steps.
E.g.
The TopHat run should be called
TopHat on SOLiD 24A

The alignment stats should be
SAM/BAM Summary Metrics of SOLiD 24A
with the rest of the tools in the chain identified in the "more information" box.

This would also give graph-generating tools a fighting chance of presenting something useful.
E.g.
GC Bias Plot of SOLiD 24A could have a title of SOLiD 24A instead of dataset_234.dat.
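
A minimal sketch of this first-last naming rule, in Python. The function name first_last_name, its arguments, and the "on" joiner are assumptions made for illustration only; none of this is existing Galaxy code.

```python
def first_last_name(tool_chain, source_name):
    """Build a short display name from a chain of tools (hypothetical helper).

    tool_chain  -- tool names in the order they were applied
    source_name -- name of the original input dataset (the "first" step)

    Returns (title, details): the title keeps only the last tool and the
    original input; details lists the intermediate tools, which could be
    shown in the "more information" box.
    """
    if not tool_chain:
        # Nothing has been run yet, so the dataset keeps its own name.
        return source_name, []
    last_tool = tool_chain[-1]
    title = f"{last_tool} on {source_name}"
    details = list(tool_chain[:-1])
    return title, details


chain = ["Filter FASTQ", "TopHat", "SAM/BAM Summary Metrics"]
title, details = first_last_name(chain, "SOLiD 24A")
print(title)
print(details)
```

For the chain above this gives the title "SAM/BAM Summary Metrics on SOLiD 24A" and leaves Filter FASTQ and TopHat for the details box, however long the chain gets.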

What do you think of this first-last model?

Brad

--
Brad Langhorst
New England Biolabs
langhorst@neb.com



From: Jeremy Goecks <jeremy.goecks@emory.edu>
Date: Tue, 31 Jan 2012 09:00:38 -0500
To: Brad Langhorst <langhorst@neb.com>
Cc: "galaxy-dev@lists.bx.psu.edu" <galaxy-dev@lists.bx.psu.edu>
Subject: Re: [galaxy-dev] naming of history steps

Brad,

> But I think it might be a lot easier to manage if step names were based on the titles of the history items instead of "data 2" or whatever.
>
> Has this been tried and rejected for some reason?

It's been tried and rejected because dataset names get very long and unwieldy. E.g. "Sam/Bam Alignment Summary Metrics on Tophat on Filter FASTQ on my_rna_seq_reads".

> Would a pull request implementing this change be welcomed?

What we imagine would help is a way to easily show/find a dataset's analysis path -- its parents and its descendants -- so that it's possible to trace the datasets/tools used to create a dataset and the tools/datasets subsequently used.

This is something we'd like to do but haven't put much effort into yet. Community contributions in this space would be great.
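
One way to picture the analysis path is as a graph walk over recorded parent links. The sketch below is in Python and assumes a plain dict mapping each dataset name to the names of its direct inputs; the dict layout, the dataset names, and the functions ancestors and descendants are all hypothetical and do not reflect how Galaxy stores this information.

```python
from collections import defaultdict


def _reachable(start, edges):
    """Return every node reachable from start by following edges, nearest first."""
    seen = []
    stack = list(edges.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.append(node)
            stack.extend(edges.get(node, []))
    return seen


def ancestors(dataset, parents):
    """Datasets that were used, directly or indirectly, to create dataset."""
    return _reachable(dataset, parents)


def descendants(dataset, parents):
    """Datasets that were later derived, directly or indirectly, from dataset."""
    # Invert the parent links so the same walk can follow them forwards.
    children = defaultdict(list)
    for child, inputs in parents.items():
        for parent in inputs:
            children[parent].append(child)
    return _reachable(dataset, children)


# Hypothetical history: each key was produced from the datasets in its list.
parents = {
    "Filter FASTQ": ["my_rna_seq_reads"],
    "Tophat": ["Filter FASTQ"],
    "Sam/Bam Alignment Summary Metrics": ["Tophat"],
}

print(ancestors("Sam/Bam Alignment Summary Metrics", parents))
print(descendants("my_rna_seq_reads", parents))
```

With links like these, a short dataset name could stay short while the full path remains available on request.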

Best,
J.