
On Mon, May 16, 2011 at 4:53 PM, Duddy, John <jduddy@illumina.com> wrote:
Doesn’t this violate one of the basic tenets of Galaxy – reproducibility? Without the ability to provide full traceability to the inputs, one can make no guarantees about the outputs.
Yes, but not everyone makes that the number one priority for their use of Galaxy. Also, given that software inevitably changes, the versions of the underlying tools on a Galaxy server are potentially in flux.

As another example, we have a local copy of the NCBI NR BLAST database set up as nr, which is kept in sync with the latest releases. This implicitly means any workflow using this database (e.g. identifying novel proteins, those with no matches against NR) can give different results as the database changes. IIRC the Galaxy documentation suggests using multiple copies of a database, each date stamped, to try and ensure reproducibility.

Anyway, tools with no inputs were brought up at the end of last year:

http://lists.bx.psu.edu/pipermail/galaxy-dev/2010-November/003812.html
http://lists.bx.psu.edu/pipermail/galaxy-dev/2010-December/003955.html
...
http://lists.bx.psu.edu/pipermail/galaxy-dev/2010-December/003974.html

Peter