I'm a semantic web folk (I'm on the PROV Data Model), and I'm willing to advise as needed. PROV is very generalized, but workflow provenance was one of the primary use cases for its design (it descends from OPM, which was almost entirely workflow provenance-oriented). Workflow tools should probably be typed as prov:Plan, a subclass of prov:Entity. PROV can cover attribution of things, agents (people and orgs) acting on behalf of other agents, and virtual and physical locations (prov:atLocation). Tool provenance is a little smaller-scoped than workflow invocation provenance, but obviously we want the two to mesh neatly.

Below I'm going to use Turtle to write out an example, but the tool XML can obviously embed RDF/XML instead. We would need to supply a little bit of our own vocabulary:

    galaxy:Tool rdfs:subClassOf prov:Plan .

Citing a book in a tool might look like this:

    :mytool a galaxy:Tool ;
        prov:wasDerivedFrom <http://dx.doi.org/10.1119/1.15378> ;  # For DOI-able references. [1]
        dc:creator <http://tw.rpi.edu/instances/JamesMcCusker> .

[1] dx.doi.org will return RDF for the document if you ask for it. If a reference has no DOI, we can use the same vocabulary within our own RDF to describe it. This is the RDF returned for the DOI above:

    @prefix doi: <http://dx.doi.org/> .
    @prefix prism: <http://prismstandard.org/namespaces/basic/2.1/> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix dc: <http://purl.org/dc/terms/> .
    @prefix bibo: <http://purl.org/ontology/bibo/> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .

    <http://id.crossref.org/contributor/milton-abramowitz-1hbgxt653tpie>
        a foaf:Person ;
        foaf:familyName "Abramowitz" ;
        foaf:givenName "Milton" ;
        foaf:name "Milton Abramowitz" .

    <http://dx.doi.org/10.1119/1.15378>
        prism:doi "10.1119/1.15378" ;
        prism:startingPage "958" ;
        prism:volume "56" ;
        dc:creator <http://id.crossref.org/contributor/milton-abramowitz-1hbgxt653tpie> ;
        dc:date "1988"^^xsd:gYear ;
        dc:identifier "10.1119/1.15378" ;
        dc:isPartOf <http://id.crossref.org/issn/0002-9505> ;
        dc:publisher "American Association of Physics Teachers (AAPT)" ;
        dc:title "Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables" ;
        bibo:doi "10.1119/1.15378" ;
        bibo:pageStart "958" ;
        bibo:volume "56" ;
        owl:sameAs <doi:10.1119/1.15378> , <info:doi/10.1119/1.15378> .

    <http://id.crossref.org/issn/0002-9505>
        a bibo:Journal ;
        prism:issn "0002-9505" ;
        dc:title "American Journal of Physics" ;
        bibo:issn "0002-9505" ;
        owl:sameAs <urn:issn:0002-9505> .

Jim

On Tue, May 27, 2014 at 2:17 PM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
On Tue, May 27, 2014 at 6:21 PM, Jim McCusker <jmccusker@5amsolutions.com> wrote:
I would suggest using as much as possible from PROV, especially since other workflow engines (Taverna and Pegasus come to mind) already support it. Rather than looking for bibtex mappings in XML, we should be looking for vocabularies that represent the elements we need to represent, and the relevant bibtex should be generated from that. PROV and Dublin Core Terms can get us most of the way there, I think.
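To make the "generate the BibTeX from the vocabulary" idea concrete, here is a minimal Python sketch. It assumes the RDF has already been parsed and flattened into a plain dictionary keyed by the Dublin Core and PRISM property names used in the CrossRef record for 10.1119/1.15378; a real implementation would presumably read the triples with an RDF library instead. The function name `to_bibtex`, the `FIELD_MAP` table, and the citation key are hypothetical, chosen only for illustration.

```python
# Hypothetical sketch: the dictionary stands in for parsed RDF, and the
# property-to-field mapping below is an assumption, not an agreed standard.

record = {
    "dc:title": "Handbook of Mathematical Functions with Formulas, "
                "Graphs, and Mathematical Tables",
    "dc:creator": ["Milton Abramowitz"],
    "dc:date": "1988",
    "dc:publisher": "American Association of Physics Teachers (AAPT)",
    # dc:isPartOf points at the journal resource; its dc:title is used here.
    "dc:isPartOf": "American Journal of Physics",
    "prism:volume": "56",
    "prism:startingPage": "958",
    "prism:doi": "10.1119/1.15378",
}

# Vocabulary property -> BibTeX field name
FIELD_MAP = {
    "dc:title": "title",
    "dc:date": "year",
    "dc:publisher": "publisher",
    "dc:isPartOf": "journal",
    "prism:volume": "volume",
    "prism:startingPage": "pages",
    "prism:doi": "doi",
}


def to_bibtex(rec, key):
    """Render a flattened metadata record as a BibTeX @article entry."""
    fields = []
    creators = rec.get("dc:creator", [])
    if creators:
        # BibTeX separates multiple authors with " and ".
        fields.append(("author", " and ".join(creators)))
    for prop, bib_field in FIELD_MAP.items():
        if prop in rec:
            fields.append((bib_field, rec[prop]))
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@article{{{key},\n{body}\n}}"


bibtex = to_bibtex(record, "abramowitz1988")
print(bibtex)
```

The point of the design is that the tool's metadata stays in PROV and Dublin Core terms, and BibTeX is just one rendering of it; other citation formats could be produced from the same record with a different mapping table.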
Jim
This PROV ontology?
http://www.w3.org/TR/prov-o/
http://www.taverna.org.uk/documentation/taverna-2-x/provenance/
https://github.com/wf4ever/taverna-prov
It looks potentially relevant, but it tackles a wider issue. Are you aware of specific examples covering tool citation within the context of workflow provenance? I think many people (myself included) would find a basic example of what might go into a Galaxy Tool's XML file useful.
Maybe we need to CC some semantic web folk to advise... or schedule a get together as a BoF at GCC2014? It seems once a format is agreed, there is willingness on the Galaxy side to start coding :)
Peter