Hi Peter,

On May 8, 2013, at 6:45 AM, Peter Cock wrote:
On Tue, May 7, 2013 at 7:02 PM, Greg Von Kuster <greg@bx.psu.edu> wrote:
Hi Peter,
Missing test components means either a tool config that does not define a test (i.e., a missing test definition) or a tool config that defines a test whose required input or output files are missing from the repository.
This seems to be our point of confusion: I don't understand combining these two categories - it seems unhelpful to me.
I feel this is just a difference of opinion. Combining missing tests and missing test data into a single category is certainly justifiable. For any repository in this category, the display clearly states to the owner what is missing, so the owner can easily see that work is needed to prepare the repository contents for testing, whether that work is adding a missing test or adding missing test data.
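To make the distinction concrete, here is a rough sketch (Python, not the actual Tool Shed code; the function name and the assumption that test data lives in a test-data directory next to the tool config are mine) of the kind of per-tool check involved:

    import os
    from xml.etree import ElementTree

    def missing_test_components(tool_config_path, repo_dir):
        """Return the reasons a tool cannot be tested, or an empty list."""
        reasons = []
        tool = ElementTree.parse(tool_config_path).getroot()
        tests = tool.findall('tests/test')
        if not tests:
            # Case 1: the tool config does not define a test at all.
            reasons.append('no test defined in the tool config')
            return reasons
        test_data_dir = os.path.join(repo_dir, 'test-data')
        for test in tests:
            # Case 2: a test is defined, but its input or output files are
            # missing from the repository.  (This crude sketch treats every
            # param value as a potential file; the real framework knows the
            # parameter types.)
            for param in test.findall('param'):
                value = param.get('value')
                if value and not os.path.exists(os.path.join(test_data_dir, value)):
                    reasons.append('missing input file: %s' % value)
            for output in test.findall('output'):
                fname = output.get('file')
                if fname and not os.path.exists(os.path.join(test_data_dir, fname)):
                    reasons.append('missing output file: %s' % fname)
        return reasons

Either kind of result puts the repository in the same "missing test components" category.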
Tools missing a test definition clearly can't be tested - but since we'd like every tool to have tests, having this as an easily viewed listing is useful both for authors and reviewers.
But it is an easily viewed listing. It is currently very easy to determine if a tool is missing a defined test, is missing test data, or both.
It highlights tools which need some work - or in some cases work on the Galaxy test framework itself. They are neither passing nor failing tests - and it makes sense to list them separately.
Tools with a test definition should be tested
This is where I disagree. It currently takes only a few seconds for our script that checks repositories for missing test components to crawl the entire main Tool Shed and set flags on those repositories. However, the separate script that crawls the main Tool Shed and installs and tests the repositories that are not missing test components currently takes hours to run, even though fewer than 10% of the repositories are currently tested (due to missing test components in most of them). Installing and testing repositories whose tools have defined tests but are missing test data is potentially very costly in time.

Let's take a simple example. Repo A has 1 tool that includes a defined test but is missing the required test data from the repository. The tool in repo A defines 2 third-party tool dependencies that must be installed and compiled. In addition, repo A defines a repository dependency whose ultimate chain of repository installations results in 4 additional repositories with 16 additional third-party tool dependencies, for a total installation time of 2 hours. All of this time is spent in order to test the tool in repo A when we already know the test cannot succeed because it is missing test data. This is certainly a realistic scenario.
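In other words, the cheap flag set by the first crawl is what lets the expensive run skip that whole dependency chain. A simplified illustration of the shape of that logic (the class, flag, and function names below are invented for the example, not the actual script):

    class Repo(object):
        """Minimal stand-in for a Tool Shed repository record."""
        def __init__(self, name, missing_test_components):
            self.name = name
            self.missing_test_components = missing_test_components

    def install_and_test(repositories):
        for repo in repositories:
            if repo.missing_test_components:
                # The fast crawl already flagged this repository, so we skip
                # the potentially hours-long install of its repository and
                # tool dependency chain for a test we know cannot pass.
                print('skipping %s: missing test components' % repo.name)
                continue
            # In the real script this step installs the repository plus all
            # of its dependencies and then runs the functional tests.
            print('installing and testing %s' % repo.name)

    install_and_test([Repo('repo_a', True), Repo('repo_b', False)])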
- if they are missing an input or output file this is just a special case of a test failure (and can be spotted without actually attempting to run the tool).
Yes, but this is what we are doing now. We spot this scenario without installing the repository or running the tool to execute any defined tests.
This is clearly a broken test and the tool author should be able to fix this easily (by uploading the missing test data file)
Yes, but the owner can already see this clearly without having to install the repository or run any tests.
I don't see the benefit of the above, where you place tools missing tests into a different category than tools with defined tests but missing test data. If any of the test components (the test definition or the required input or output files) are missing, then the test cannot be executed, so calling it a failing test in either case is a bit misleading. It is actually a tool that is missing the test components required for an execution that would produce a pass / fail status.
It is still a failing test (just for the trivial reason of missing a test data file).
It would be much simpler to change the filter for failing tests to include those that are missing test components so that the list of missing test components is a subset of the list of failing tests.
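A sketch of that filter change (the record and field names are invented for illustration):

    from collections import namedtuple

    # Simplified repository record, invented for this example.
    Repo = namedtuple('Repo', ['name', 'tests_failed', 'missing_test_components'])

    def is_failing(repo):
        # Proposed filter: missing test components also count as failing, so
        # the missing-components list becomes a subset of the failing list.
        return repo.tests_failed or repo.missing_test_components

    repos = [Repo('repo_a', False, True), Repo('repo_b', True, False)]
    failing = [r.name for r in repos if is_failing(r)]   # ['repo_a', 'repo_b']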
What I would like is three lists (roughly sketched in code after the list):
Latest revision: missing tool tests - repositories where at least 1 tool has no test defined
[The medium term TO-DO list for the Tool Author]
Latest revision: failing tool tests - repositories where at least 1 tool has a failing test (where I include tests missing their input or output test data files)
[The priority TO-DO list for the Tool Author]
Latest revision: all tool tests pass - repositories where every tool has tests and they all pass
[The good list, Tool Authors should aim to have everything here]
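A rough sketch of how membership in those three lists might be computed for the latest revision of each repository (the summary record and its field names are invented; note that, as defined above, a repository can appear in both of the first two lists):

    from collections import namedtuple

    # Simplified per-repository summary of its latest revision, invented for
    # this example.
    RepoSummary = namedtuple('RepoSummary',
                             ['name',
                              'tools_missing_test_definition',
                              'tools_missing_test_data',
                              'tools_with_failing_tests'])

    def in_missing_list(repo):
        # At least 1 tool has no test defined.
        return repo.tools_missing_test_definition > 0

    def in_failing_list(repo):
        # At least 1 tool has a failing test, where a test with missing input
        # or output data files counts as failing.
        return (repo.tools_with_failing_tests > 0
                or repo.tools_missing_test_data > 0)

    def in_passing_list(repo):
        # Every tool has tests and they all pass.
        return not in_missing_list(repo) and not in_failing_list(repo)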
Right now http://testtoolshed.g2.bx.psu.edu/view/peterjc/ncbi_blast_plus would appear under both "missing tool tests" and "failing tool tests", but I hope to fix this and have this under "missing tool tests" only (until my current roadblocks with the Galaxy Test Framework are resolved).
I hope I've managed a clearer explanation this time,
Thanks,
Peter