Script to help maintain toolshed repos across toolsheds
On Thu, May 2, 2013 at 7:16 AM, Ira Cooke <iracooke@gmail.com> wrote:
Hi all,
I've written a script to help deal with the problem of maintaining toolshed tools across multiple toolsheds (e.g. test and release).
The problem I encountered was that switching between test and production versions of a suite of tools can be quite painful, because every repository definition like this
<repository toolshed="http://toolshed.g2.bx.psu.edu" name="proteomics_datatypes" owner="iracooke" changeset_revision="463328a6967f"/>
needs to be updated to a different toolshed URL and (by extension) a different changeset revision.
The idea with this script is that you should be able to point it at a directory containing a toolshed repository, and it will create a copy of that repository in which the toolshed URLs (and changeset revisions) have been updated to the correct values for a different toolshed.
I'm not sure how others are dealing with this issue (perhaps there is another, easier way), but I've found this helped me a lot, so I thought I'd share:
https://bitbucket.org/iracooke/galaxy_repo_bundler/
Cheers
Ira
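As a minimal sketch of the kind of rewrite described above (this is not the linked script, just an illustration), the following Python retargets each `<repository>` element using a lookup table. The `REVISION_MAP` table, the `retarget` function, and the `<repositories>` wrapper in the sample are assumptions made for the example; the two seq_filter_by_id revisions are the ones quoted later in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup table:
# (source toolshed, owner, name, source revision) -> (target toolshed, target revision)
REVISION_MAP = {
    ("http://toolshed.g2.bx.psu.edu", "peterjc", "seq_filter_by_id", "abdd608c869b"):
        ("http://testtoolshed.g2.bx.psu.edu", "66d1ca92fb38"),
}


def retarget(xml_text, revision_map):
    """Return xml_text with every mapped <repository> pointed at its target toolshed."""
    root = ET.fromstring(xml_text)
    for repo in root.iter("repository"):
        key = (
            repo.get("toolshed"),
            repo.get("owner"),
            repo.get("name"),
            repo.get("changeset_revision"),
        )
        if key in revision_map:
            toolshed, revision = revision_map[key]
            repo.set("toolshed", toolshed)
            repo.set("changeset_revision", revision)
    return ET.tostring(root, encoding="unicode")


# Sample input; the <repositories> wrapper element is assumed for illustration.
SAMPLE = """<repositories>
  <repository toolshed="http://toolshed.g2.bx.psu.edu" owner="peterjc" name="seq_filter_by_id" changeset_revision="abdd608c869b"/>
</repositories>"""

print(retarget(SAMPLE, REVISION_MAP))
```

Entries that are missing from the table are left untouched, so a partially filled table cannot silently point a dependency at the wrong revision.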
On Thu, May 2, 2013 at 10:24 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
Thanks Ira,
I've not made as heavy use of inter-repository dependencies as you. Thus far I have ignored the problem (only a couple of my repositories are affected), in the hope that this limitation will be fixed sooner rather than later.
Peter
On May 6, 2013, at 11:05 AM, Peter Cock wrote:
Thinking out loud, another way to solve this would be to allow multiple equivalent <repository> entries as a group, where any one would be OK.
E.g. for v0.0.5 of my seq_filter_by_id tool,
<repository toolshed="http://toolshed.g2.bx.psu.edu" owner="peterjc" name="seq_filter_by_id" changeset_revision="abdd608c869b"/>
for http://toolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_id/abdd608c869b
or:
<repository toolshed="http://testtoolshed.g2.bx.psu.edu" owner="peterjc" name="seq_filter_by_id" changeset_revision="66d1ca92fb38"/>
for http://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_id/66d1ca92fb38
Something like this, maybe?
<any_one_of>
    <repository toolshed="http://toolshed.g2.bx.psu.edu" owner="peterjc" name="seq_filter_by_id" changeset_revision="abdd608c869b"/>
    <repository toolshed="http://testtoolshed.g2.bx.psu.edu" owner="peterjc" name="seq_filter_by_id" changeset_revision="66d1ca92fb38"/>
</any_one_of>
Of course, more generally we might also post this on a local public Tool Shed as well. The point is that the same repository_dependencies.xml could then be used on either Tool Shed without modification.
Fixing the declaration of a dependency on an external Tool Shed would be better, but perhaps we'll need both, or some sort of mirroring federated system, in the long term?
Peter
___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Hello Peter and Ira,
These are great ideas and contributions, and I'll make sure to incorporate some version of them into the Tool Shed framework as soon as possible. I've created the following Trello card for this.
Thanks!
https://trello.com/card/toolshed-enable-dependency-definitions-across-tool-s...
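To make Peter's `<any_one_of>` idea concrete, here is a rough Python sketch of how a consumer might pick the matching alternative from such a group. The `<any_one_of>` element is only a proposal from this thread, not an existing Tool Shed feature, and the `pick_for_toolshed` helper is a hypothetical name introduced for illustration.

```python
import xml.etree.ElementTree as ET

# The grouping proposed in the thread; <any_one_of> is not an existing Tool Shed tag.
PROPOSED = """<any_one_of>
  <repository toolshed="http://toolshed.g2.bx.psu.edu" owner="peterjc" name="seq_filter_by_id" changeset_revision="abdd608c869b"/>
  <repository toolshed="http://testtoolshed.g2.bx.psu.edu" owner="peterjc" name="seq_filter_by_id" changeset_revision="66d1ca92fb38"/>
</any_one_of>"""


def pick_for_toolshed(group_xml, current_toolshed):
    """Return the attributes of the first alternative hosted on current_toolshed, or None."""
    group = ET.fromstring(group_xml)
    wanted = current_toolshed.rstrip("/")
    for repo in group.findall("repository"):
        if repo.get("toolshed", "").rstrip("/") == wanted:
            return dict(repo.attrib)
    return None


choice = pick_for_toolshed(PROPOSED, "http://testtoolshed.g2.bx.psu.edu")
print(choice["changeset_revision"])
```

Returning `None` when no alternative matches leaves it to the caller to decide whether a missing match is an error or a reason to fall back to an external Tool Shed.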
On May 7, 2013, at 5:49 AM, Björn Grüning wrote:
Hi,
nice work Ira! That problem also bothers me. I have written a similar script in Python, but it's not documented ;) I really think a more advanced solution like Peter's is needed in the medium term.
To fix the revision problem: what if leaving the revision tag blank meant that the toolshed should insert the latest revision of a tool dependency during upload?
For me, the vast majority of repo uploads refer to the latest revisions of other repos. So I end up uploading one repo, running my script to insert the new revision tag, uploading the second, rerunning the script, and so on ...
Ciao, Bjoern
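The blank-revision suggestion above could behave roughly like the Python sketch below: any `<repository>` whose `changeset_revision` is empty or missing gets the latest known revision filled in. This is only an illustration of the idea, not Tool Shed behaviour. The `LATEST` table stands in for whatever lookup the Tool Shed would perform at upload time, and both it and `fill_blank_revisions` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for asking a Tool Shed which revision is currently the latest.
LATEST = {("peterjc", "seq_filter_by_id"): "abdd608c869b"}

# The <repositories> wrapper element is assumed for illustration.
SAMPLE = """<repositories>
  <repository toolshed="http://toolshed.g2.bx.psu.edu" owner="peterjc" name="seq_filter_by_id" changeset_revision=""/>
  <repository toolshed="http://toolshed.g2.bx.psu.edu" owner="iracooke" name="proteomics_datatypes" changeset_revision="463328a6967f"/>
</repositories>"""


def fill_blank_revisions(xml_text, latest):
    """Fill empty or missing changeset_revision attributes from the latest table."""
    root = ET.fromstring(xml_text)
    for repo in root.iter("repository"):
        if not repo.get("changeset_revision"):
            key = (repo.get("owner"), repo.get("name"))
            repo.set("changeset_revision", latest[key])  # KeyError if the repo is unknown
    return root


filled = fill_blank_revisions(SAMPLE, LATEST)
for repo in filled.iter("repository"):
    print(repo.get("name"), repo.get("changeset_revision"))
```

Explicitly pinned revisions are left alone, so authors who need an exact revision keep that control.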
Hello Björn,
I've added your comments to the following Trello card. I'm doing some thinking on this issue to make sure I come up with the optimal solution. Thanks for the valuable input!
https://trello.com/card/toolshed-enable-dependency-definitions-across-tool-s...
Greg Von Kuster
Hi Bjoern,
It's good to know I'm not the only one to encounter this. I think it tends to bite when you have a lot of dependencies.
I use the same method as you (uploading one repo, then another, and so on), using the script each time to automate the upload.
BTW, I made a little addition to the script today so that it automatically commits and checks in the changes via Mercurial, which makes the workflow really quite smooth.
Your suggestion of possibly allowing the revision tag to be blank makes a lot of sense; it could simply imply "always use the latest available". That would certainly make things much easier for those who aren't using a script.
Cheers
Ira
Hi Ira,
that is a great new feature, thanks! I will try it soon.
Ciao, Bjoern
On May 7, 2013, at 8:57 AM, Peter Cock wrote:
On Tue, May 7, 2013 at 10:49 AM, Björn Grüning <bjoern.gruening@pharmazie.uni-freiburg.de> wrote:
> Hi,
> nice work Ira! That problem also bothers me. I have written a similar script in Python, but it's not documented ;) I really think a more advanced solution like Peter's is needed in the medium term.
I've not written any code, so it's just an idea for now rather than a ready-to-try solution ;)
> To fix the revision problem: what if leaving the revision tag blank meant that the toolshed should insert the latest revision of a tool dependency during upload?
> For me, the vast majority of repo uploads refer to the latest revisions of other repos. So I end up uploading one repo, running my script to insert the new revision tag, uploading the second, rerunning the script, and so on ...
Is the current revision setting treated as an exact match, or as a minimum version? There are advantages to both, but if it is an exact match then it would be sensible to have a way to say "just give me the latest version of that repository".
Peter
The current revision setting is treated as a minimum, so any available updates are retrieved automatically at the time of installation. Even so, it may still make sense to allow for a default of the latest installable revision if the definition lacks a revision value.
Greg Von Kuster
Hi Greg,
> The current revision setting is treated as a minimum, so any available updates are retrieved automatically at the time of installation.
If I get that right, we should always have the latest revision installed, regardless of the revision specified in the requirement. Is that correct? I do not think that is working at the moment. It seems it's using an exact match.
Ciao, Björn
Hello Björn, On May 12, 2013, at 12:36 PM, Björn Grüning wrote:
Hi Greg,
The current revision setting is treated as a minimum, so any available updates are retrieved automatically at the time of installation.
if I get that right, we should always have the latest revision installed, regardless of the specified requirement revision. Is that correct?
No, this is not correct. By "minimum", we mean that the revision setting is treated as a minimum within the set of changeset revisions up to, but not including, the next installable changeset revision in the change log.
For example, assume a change log like this:
changeset revision    Installable revision
0: sjekvub            yes
1: jjtofvp
2: htocegy
3: jswofpt            yes
4: jaqvkrc
In the above example, sjekvub is considered the "minimum" revision for revs 0, 1, 2, and jswofpt is considered the minimum revision for revs 3, 4. If a dependency definition defined revision sjekvub, then what will actually be installed is "2: htocegy".
This approach guarantees reproducibility. For example, assume revision 0: sjekvub contains version 1 of tool A and revision 3: jswofpt contains version 2 of tool A. If you install 0: sjekvub, then you cannot upgrade beyond 2: htocegy for that specific repository installation. To get version 2 of tool A you have to install revision 3: jswofpt of the repository as a separate installation.
This ensures that both versions of the same tool are always available to you for reproducibility.
This information is documented in the following section of the tool shed wiki: http://wiki.galaxyproject.org/RepositoryRevisions
I do not think that is working atm. It seems it's using an exact match.
If you are seeing behavior that contradicts the above information or the information in the tool shed wiki, please let me know some specifics. Thanks!
Ciao, Björn
Even so, it may still make sense to allow for a default of the latest installable revision if the definition lacks a revision value.
On Mon, May 13, 2013 at 4:03 PM, Greg Von Kuster <greg@bx.psu.edu> wrote:
Hello Björn,
On May 12, 2013, at 12:36 PM, Björn Grüning wrote:
Hi Greg,
The current revision setting is treated as a minimum, so any available updates are retrieved automatically at the time of installation.
if I get that right, we should always have the latest revision installed, regardless of the specified requirement revision. Is that correct?
No, this is not correct. By "minimum", we mean that the revision setting is treated as a minimum within the set of changeset revisions up to, but not including, the next installable changeset revision in the change log.
For example, assume a change log like this:
changeset revision    Installable revision
0: sjekvub            yes
1: jjtofvp
2: htocegy
3: jswofpt            yes
4: jaqvkrc
In the above example, sjekvub is considered the "minimum" revision for revs 0, 1, 2, and jswofpt is considered the minimum revision for revs 3, 4. If a dependency definition defined revision sjekvub, then what will actually be installed is "2: htocegy".
This approach guarantees reproducibility. For example, assume revision 0: sjekvub contains version 1 of tool A and revision 3: jswofpt contains version 2 of tool A. If you install 0: sjekvub, then you cannot upgrade beyond 2: htocegy for that specific repository installation. To get version 2 of tool A you have to install revision 3: jswofpt of the repository as a separate installation.
This ensures that both versions of the same tool are always available to you for reproducibility.
This information is documented in the following section of the tool shed wiki:
http://wiki.galaxyproject.org/RepositoryRevisions
I do not think that is working atm. It seems it's using an exact match.
If you are seeing behavior that contradicts the above information or the information in the tool shed wiki, please let me know some specifics.
Thanks!
Woosh - that largely went over my head. It sounds rather complicated. Did I get the gist right:
To paraphrase, if I declare a dependency on revision X, then what will be installed could be X or an EARLIER revision - stepping back until an installable revision is found.
(Here I'd have called X a "maximum" revision, but this is seemingly open to two opposite view points - thus our confusion).
That is very different to what I (and Bjorn?) had understood you to mean by "minimum", namely if we declare a dependency on revision X, then we'll get either X or a LATER revision.
I guess this means an option to explicitly ask for the latest revision would be regarded as counter to the Galaxy reproducibility goals?
Thanks,
Peter
On Monday, 13 May 2013, at 17:13 +0100, Peter Cock wrote:
Woosh - that largely went over my head. It sounds rather complicated. Did I get the gist right:
To paraphrase, if I declare a dependency on revision X, then what will be installed could be X or an EARLIER revision - stepping back until an installable revision is found.
(Here I'd have called X a "maximum" revision, but this is seemingly open to two opposite view points - thus our confusion).
As far as I understand, every revision that does not change any metadata is marked as 'preferred installable', and the revision before it loses this tag. Any change to the metadata creates an additional tag, so we end up with two 'preferred installable' tags. If I now specify a version that has no tag, the toolshed will climb the revision history until it finds a 'preferred installable' tag; that revision has the same metadata as my specified version. In the end I get either my specified version or a later version, but not necessarily the latest.
That is very different to what I (and Bjorn?) had understood you to mean by "minimum", namely if we declare a dependency on revision X, then we'll get either X or a LATER revision.
Yes, that also confused me ...
I guess this means an option to explicitly ask for the latest revision would be regarded as counter to the Galaxy reproducibility goals?
The point of that shortcut is that otherwise my git changelog gets spammed with revision changes, and during development I only need the latest version of a dependency, right? So leaving the revision field empty would mean that the toolshed fills in the latest current revision of my dependency automatically. For reproducibility nothing would change, I think: in the toolshed the field is filled with a specific revision, and only on my development box and in my git account is it empty.
Thanks,
Peter
Hi Peter, On May 13, 2013, at 12:13 PM, Peter Cock wrote:
Woosh - that largely went over my head. It sounds rather complicated. Did I get the gist right:
To paraphrase, if I declare a dependency on revision X, then what will be installed could be X or an EARLIER revision - stepping back until an installable revision is found.
No, nothing EARLIER is installed, only revisions that came LATER (than the declared dependency revision) up to, but not including the next revision in the change log (AFTER the declared dependency revision) that has metadata associated with it. If you look at the change log page for a repository in the tool shed, you'll see all of the revisions that have metadata associated with them.
(Here I'd have called X a "maximum" revision, but this is seemingly open to two opposite view points - thus our confusion).
That is very different to what I (and Bjorn?) had understood you to mean by "minimum", namely if we declare a dependency on revision X, then we'll get either X or a LATER revision.
This is actually the case, so I'm not sure what I stated in the example above that caused the confusion.
I guess this means an option to explicitly ask for the latest revision would be regarded as counter to the Galaxy reproducibility goals?
Not necessarily, but there is a Trello card here: https://trello.com/card/toolshed-add-the-ability-to-deprecate-a-repository-r... based on a request at the last IUC teleconference that could easily impact reproducibility. This request will take careful consideration in how it is implemented so that we can ensure reproducibility.
Thanks,
Peter
On Mon, May 13, 2013 at 6:35 PM, Greg Von Kuster <greg@bx.psu.edu> wrote:
Woosh - that largely went over my head. It sounds rather complicated. Did I get the gist right:
To paraphrase, if I declare a dependency on revision X, then what will be installed could be X or an EARLIER revision - stepping back until an installable revision is found.
No, nothing EARLIER is installed, only revisions that came LATER (than the declared dependency revision) up to, but not including the next revision in the change log (AFTER the declared dependency revision) that has metadata associated with it. If you look at the change log page for a repository in the tool shed, you'll see all of the revisions that have metadata associated with them.
Oh. OK, so you'd get revision X or anything later with the same metadata (but not any more recent changes where the tool metadata was altered).
This is actually the case, so I'm not sure what I stated in the example above that caused the confusion.
Well one thing which threw me was this table:
For example, assume a change log like this:
changeset revision    Installable revision
0: sjekvub            yes
1: jjtofvp
2: htocegy
3: jswofpt            yes
4: jaqvkrc
In the above example, sjekvub is considered the "minimum" revision for revs 0, 1, 2, and jswofpt is considered the minimum revision for revs 3, 4.
Assuming I followed correctly, in your example revisions 0, 1 and 2 all had the same metadata (Tool A, version 1) while later revisions 3 and 4 represented a jump (they have Tool A, version 2). Of these sets, only revisions 0 and 3 are marked as installable - which I took to mean you only install the minimum revision (oldest revision) in each batch of revisions with the same metadata. This was contradicted however by the next bit - saying if I asked for 0:sjekvub then I'd get 2:htocegy (which is the maximum revision for the set 0, 1, 2).
Perhaps this is clearer?:
changeset revision    Installable revision
-----------------------------------------------------------------------------
0: sjekvub            not any more, superseded by r1
1: jjtofvp            not any more, superseded by r2
2: htocegy            yes, final revision with this metadata
-----------------------------------------------------------------------------
3: jswofpt            not any more, superseded by r4
4: jaqvkrc            yes, final revision with this metadata
Regards,
Peter
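The rule as clarified in this exchange (you get the requested revision or any later one, up to but not including the next revision that has its own metadata) can be written down as a small sketch. This is not tool shed code. The function name and the list-of-tuples representation of the change log are assumptions for illustration; the revision hashes are the ones from the example.

```python
# Change log from the example, oldest first. The boolean marks revisions
# that have metadata associated with them ("yes" in Greg's table).
CHANGELOG = [
    ("sjekvub", True),
    ("jjtofvp", False),
    ("htocegy", False),
    ("jswofpt", True),
    ("jaqvkrc", False),
]


def resolve_installed_revision(changelog, requested):
    """Return the revision that would be installed for a requested one.

    Starting at the requested revision, move forward through the change log
    and stop just before the next revision that has its own metadata.
    """
    hashes = [changeset for changeset, _ in changelog]
    if requested not in hashes:
        raise ValueError("unknown changeset revision: %s" % requested)
    index = hashes.index(requested)
    while index + 1 < len(changelog) and not changelog[index + 1][1]:
        index += 1
    return changelog[index][0]


if __name__ == "__main__":
    for changeset, _ in CHANGELOG:
        print(changeset, "->", resolve_installed_revision(CHANGELOG, changeset))
```

With the example change log, a request for sjekvub, jjtofvp or htocegy resolves to htocegy, and a request for jswofpt or jaqvkrc resolves to jaqvkrc, which matches the "final revision with this metadata" rows in the table.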
Peter, yes, your enhanced information is correct, and to add a bit more information:
Assume a user had at some point installed 0:sjekvub. Then as soon as 1:jjtofvp was made available in the tool shed, the associated installed repository in their Galaxy instance would be displayed in yellow in the Manage installed tool shed repositories page, alerting the user that there are updates available for that installed repository. Updates to installed tool shed repositories are discovered when the Galaxy server is started or after the defined time delay using the following config settings:
# Enable automatic polling of relative tool sheds to see if any updates
# are available for installed repositories. Ideally only one Galaxy
# server process should be able to check for repository updates. The
# setting for hours_between_check should be an integer between 1 and 24.
enable_tool_shed_check = True
hours_between_check = 12
On May 13, 2013, at 3:07 PM, Peter Cock wrote:
Perhaps this is clearer?:
changeset revision    Installable revision
-----------------------------------------------------------------------------
0: sjekvub            not any more, superseded by r1
1: jjtofvp            not any more, superseded by r2
2: htocegy            yes, final revision with this metadata
-----------------------------------------------------------------------------
3: jswofpt            not any more, superseded by r4
4: jaqvkrc            yes, final revision with this metadata
Regards,
Peter
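For anyone wanting to sanity-check the two polling settings quoted in Greg's message, here is a rough sketch that reads them and enforces the documented 1 to 24 range. The `[app:main]` section name, the function name and the fallback values are assumptions for illustration; only the two option names and the range come from the quoted config comments.

```python
import configparser

# Sample config text; the [app:main] section header is an assumption.
SAMPLE = """
[app:main]
enable_tool_shed_check = True
hours_between_check = 12
"""


def read_update_check_settings(ini_text, section="app:main"):
    """Return (enabled, hours) for the tool shed update check."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # The fallbacks here are illustrative defaults, not documented ones.
    enabled = parser.getboolean(section, "enable_tool_shed_check", fallback=False)
    hours = parser.getint(section, "hours_between_check", fallback=12)
    if not 1 <= hours <= 24:
        raise ValueError("hours_between_check must be an integer between 1 and 24")
    return enabled, hours


if __name__ == "__main__":
    print(read_update_check_settings(SAMPLE))
```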
participants (4)
- Björn Grüning
- Greg Von Kuster
- Ira Cooke
- Peter Cock