Hi Martin,

After we upgraded to the March release, I was still seeing behavior similar to what I originally described back in February, even after uninstalling and reinstalling the affected repositories. On a separate instance running the 15.05 release with the new `install_database_connection` support, I've been working to duplicate all of our installed repositories from production using scripted API calls for reproducibility, and have again run into similar problems.

What I've learned, though, is that the problem does not appear to be caused by installing a *single* repository, but rather by some kind of interaction between multiple repositories. For example, against a brand new 15.05 instance, I ran:

```
$ python ./scripts/api/install_tool_shed_repositories.py --local $GALAXY_INSTALL_URL --api $GALAXY_INSTALL_KEY --tool-deps --repository-deps --url https://toolshed.g2.bx.psu.edu --owner iuc --name samtools_sort --revision 38ea74bd4054
```

followed by: 

```
$ python ./scripts/api/install_tool_shed_repositories.py --local $GALAXY_INSTALL_URL --api $GALAXY_INSTALL_KEY --tool-deps --repository-deps --url https://toolshed.g2.bx.psu.edu --owner lparsons --name htseq_count --revision 6f920f33c5eb
```

The first repository, samtools_sort, installed fine:

```
Response
--------
/api/tool_shed_repositories/f2db41e1fa331b3e
  name: samtools_sort
  status: Installed
  tool_shed_status: {u'latest_installable_revision': u'True', u'revision_update': u'False', u'revision_upgrade': u'False', u'repository_deprecated': u'False'}
  deleted: False
  ctx_rev: 2
  error_message:
  dist_to_shed: False
  tool_shed: toolshed.g2.bx.psu.edu
  installed_changeset_revision: 38ea74bd4054
  uninstalled: False
  owner: iuc
  changeset_revision: 38ea74bd4054
  id: f2db41e1fa331b3e
  includes_datatypes: False
/api/tool_shed_repositories/f597429621d6eb2b
  name: package_samtools_0_1_19
  status: Installed
  tool_shed_status: {u'latest_installable_revision': u'True', u'revision_update': u'False', u'revision_upgrade': u'False', u'repository_deprecated': u'False'}
  deleted: False
  ctx_rev: 1
  error_message:
  dist_to_shed: False
  tool_shed: toolshed.g2.bx.psu.edu
  installed_changeset_revision: 95d2c4aefb5f
  uninstalled: False
  owner: devteam
  changeset_revision: 95d2c4aefb5f
  id: f597429621d6eb2b
  includes_datatypes: False
```

However, the subsequent installation of htseq_count failed and logged the following:

```
tool_shed.galaxy_install.repository_dependencies.repository_dependency_manager DEBUG 2015-06-20 02:31:33,466 Creating repository dependency objects...
tool_shed.util.shed_util_common DEBUG 2015-06-20 02:31:34,552 Adding new row for repository 'htseq_count' in the tool_shed_repository table, status set to 'New'.
tool_shed.util.shed_util_common DEBUG 2015-06-20 02:31:34,880 Adding new row for repository 'package_numpy_1_7' in the tool_shed_repository table, status set to 'New'.
tool_shed.galaxy_install.repository_dependencies.repository_dependency_manager DEBUG 2015-06-20 02:31:34,892 Skipping installation of revision 95d2c4aefb5f of repository 'package_samtools_0_1_19' because it was installed with the (possibly updated) revision 95d2c4aefb5f and its current installation status is 'Installed'.
tool_shed.util.shed_util_common DEBUG 2015-06-20 02:31:35,242 Adding new row for repository 'package_pysam_0_7_7' in the tool_shed_repository table, status set to 'New'.
tool_shed.galaxy_install.repository_dependencies.repository_dependency_manager DEBUG 2015-06-20 02:31:35,255 Building repository dependency relationships...
tool_shed.galaxy_install.repository_dependencies.repository_dependency_manager DEBUG 2015-06-20 02:31:35,266 Creating new repository_dependency record for installed revision 0c288abd2a1e of repository: package_numpy_1_7 owned by devteam.
tool_shed.galaxy_install.repository_dependencies.repository_dependency_manager DEBUG 2015-06-20 02:31:35,332 Creating new repository_dependency record for installed revision b62538c8c664 of repository: package_pysam_0_7_7 owned by iuc.
galaxy.web.framework.decorators ERROR 2015-06-20 02:31:35,387 Uncaught exception in exposed API method:
Traceback (most recent call last):
  File "/mnt/scdata/scdata_03/galaxy/containers/galaxy-builder/stable/lib/galaxy/web/framework/decorators.py", line 251, in decorator
    rval = func( self, trans, *args, **kwargs)
  File "/mnt/scdata/scdata_03/galaxy/containers/galaxy-builder/stable/lib/galaxy/webapps/galaxy/api/tool_shed_repositories.py", line 246, in install_repository_revision
    payload )
  File "/mnt/scdata/scdata_03/galaxy/containers/galaxy-builder/stable/lib/tool_shed/galaxy_install/install_manager.py", line 709, in install
    install_options
  File "/mnt/scdata/scdata_03/galaxy/containers/galaxy-builder/stable/lib/tool_shed/galaxy_install/install_manager.py", line 801, in __initiate_and_install_repositories
    return self.install_repositories(tsr_ids, decoded_kwd, reinstalling=False)
  File "/mnt/scdata/scdata_03/galaxy/containers/galaxy-builder/stable/lib/tool_shed/galaxy_install/install_manager.py", line 844, in install_repositories
    reinstalling=reinstalling )
  File "/mnt/scdata/scdata_03/galaxy/containers/galaxy-builder/stable/lib/tool_shed/galaxy_install/install_manager.py", line 864, in install_tool_shed_repository
    repo_info_tuple = repo_info_dict[ tool_shed_repository.name ]
KeyError: u'package_pysam_0_7_7'
```

This left some of the dependency repositories in strange states, even after a restart:

Inline image 1
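
A quick way to list such leftovers without the admin UI is to filter the repository records by status. This is only a sketch: it assumes the records have already been fetched (for example from the `/api/tool_shed_repositories` listing) and parsed into dicts with the same fields as the response shown earlier. The helper name and the sample records are made up for illustration.

```python
def repositories_not_installed(records):
    """Return (owner, name, changeset_revision, status) for every record
    whose status is anything other than 'Installed'."""
    return [
        (r["owner"], r["name"], r["changeset_revision"], r["status"])
        for r in records
        if r["status"] != "Installed"
    ]


# Illustrative records only; field names follow the API response above.
sample_records = [
    {"owner": "devteam", "name": "package_samtools_0_1_19",
     "changeset_revision": "95d2c4aefb5f", "status": "Installed"},
    {"owner": "devteam", "name": "package_numpy_1_7",
     "changeset_revision": "0c288abd2a1e", "status": "New"},
]

for entry in repositories_not_installed(sample_records):
    print(entry)
```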

Attempting to `purge` the package_numpy_1_7 repository, which was still listed as 'New', resulted in an error traceback:

Inline image 2

However, without installing samtools_sort first, htseq_count and its dependencies install fine.

This is the only failing case I've reduced to the bare minimum so far, out of the ~110 tools I was trying to install. Before trying to reduce it to a test case, I provoked similar errors from at least three other repositories, so the problem doesn't seem to be limited to this combination.
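
In case it helps with reproducing this, the two-step sequence can be scripted roughly as below. It is a sketch rather than what I actually ran: it just wraps the same `install_tool_shed_repositories.py` invocations shown above, in the same order. The function names and the `runner` parameter are my own, added so the command construction can be checked without touching a Galaxy instance.

```python
import subprocess

INSTALL_SCRIPT = "./scripts/api/install_tool_shed_repositories.py"
TOOL_SHED_URL = "https://toolshed.g2.bx.psu.edu"

# (owner, name, revision), in the order that triggers the failure.
REPOSITORIES = [
    ("iuc", "samtools_sort", "38ea74bd4054"),
    ("lparsons", "htseq_count", "6f920f33c5eb"),
]


def build_install_command(galaxy_url, api_key, owner, name, revision):
    """Build the argument list for one install call (same flags as above)."""
    return [
        "python", INSTALL_SCRIPT,
        "--local", galaxy_url,
        "--api", api_key,
        "--tool-deps", "--repository-deps",
        "--url", TOOL_SHED_URL,
        "--owner", owner,
        "--name", name,
        "--revision", revision,
    ]


def install_in_order(repositories, galaxy_url, api_key, runner=subprocess.call):
    """Run the installs one after another and collect (name, exit code).

    The exit code may not reflect API-level failures, so the Galaxy log
    still needs to be checked after each step.
    """
    results = []
    for owner, name, revision in repositories:
        command = build_install_command(galaxy_url, api_key, owner, name, revision)
        results.append((name, runner(command)))
    return results
```

Calling `install_in_order(REPOSITORIES, os.environ["GALAXY_INSTALL_URL"], os.environ["GALAXY_INSTALL_KEY"])` (after `import os`) against a fresh instance should walk through the same two installs. Reversing the list, again on a fresh instance, corresponds to the ordering that worked for me.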


Cheers,

Brian


On Wed, Apr 15, 2015 at 2:48 PM, Brian Claywell <bclaywel@fredhutch.org> wrote:
Hi Martin,

We haven't upgraded to the March release yet, and I haven't really touched the tools I referred to in my original email since I got them working (despite the duplicate installation records in the admin interface, etc.). I've started testing the new release, though, and will take another look after we complete the upgrade. Thanks for following up!


Cheers,

Brian


On Mon, Apr 13, 2015 at 7:27 AM, Martin Čech <marten@bx.psu.edu> wrote:
Hello Brian,

We have made some changes in the past month that might address the issue you are having. Are you still encountering problems? If so, could you please try reinstalling the tools?

Thank you,

Martin

On Tue, Feb 24, 2015 at 5:15 PM Brian Claywell <bclaywel@fredhutch.org> wrote:
Hi Martin,

Thanks for the prompt reply! My responses are inline below.

On Tue, Feb 24, 2015 at 8:26 AM, Martin Čech <marten@bx.psu.edu> wrote:
Are you using Admin UI for installation or API scripts?

The admin UI exclusively.
 
Does the 'installed tool shed repos' page show multiple installations of the same package (with the same changesets)? If so what are the states of these?

Yes -- the iuc package_numpy_1_7 revision a4bdc17eed4a is listed twice, with both showing the green "Installed" status. Interestingly the installed repo page shows that they're installed in different locations:


At one point during the troubleshooting steps I described in my original email, package_pybamparser_0_0_1 and package_pybamtools_0_0_1 showed up in the same way, with two records each having the same revision and a green "Installed" status.
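
For what it's worth, duplicates like these can also be spotted programmatically by grouping the installation records on tool shed, owner, name, and installed revision. The snippet below is a rough sketch under the assumption that the records are available as dicts with the field names Galaxy reports for installed repositories; the helper name and the record ids in the sample are invented for illustration.

```python
from collections import defaultdict


def find_duplicate_installs(records):
    """Map (tool_shed, owner, name, revision) to the record ids sharing it,
    keeping only the keys that occur more than once."""
    groups = defaultdict(list)
    for r in records:
        key = (r["tool_shed"], r["owner"], r["name"],
               r["installed_changeset_revision"])
        groups[key].append(r["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Illustrative records only; the ids are placeholders, not real values.
sample_records = [
    {"id": "record-1", "tool_shed": "toolshed.g2.bx.psu.edu", "owner": "iuc",
     "name": "package_numpy_1_7", "installed_changeset_revision": "a4bdc17eed4a"},
    {"id": "record-2", "tool_shed": "toolshed.g2.bx.psu.edu", "owner": "iuc",
     "name": "package_numpy_1_7", "installed_changeset_revision": "a4bdc17eed4a"},
    {"id": "record-3", "tool_shed": "toolshed.g2.bx.psu.edu", "owner": "devteam",
     "name": "package_samtools_0_1_19", "installed_changeset_revision": "95d2c4aefb5f"},
]

for key, ids in find_duplicate_installs(sample_records).items():
    print(key, ids)
```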

If you check on the filesystem in the {{configured_dependency_dir}}/path/to/duplicated/package do you see any strange stuff/duplication?

I didn't see anything that struck me as out of the ordinary. Here's a gist of the directory structures for the affected packages under both shed_tools and tool_deps:


How big is your Galaxy in terms of installed tools, complexity of setup and traffic?

I don't think it's terribly big -- we have a couple of large tool suites like BEDTools, the gops interval operations suites, GATK 1 and 2, etc, and then maybe 15-20 other tools and their dependencies?

Our setup isn't very complex, either; Galaxy is running with one web process and one handler process behind an nginx reverse proxy, all running on a single machine. Jobs are submitted to our Slurm cluster via the SSH CLI runner.

Traffic is pretty low -- we're just now rolling out of beta.


Cheers,

Brian

--
Brian Claywell | programmer/analyst
Matsen Group   | http://matsen.fredhutch.org
Fred Hutchinson Cancer Research Center