I tried to run an alignment using Bowtie2 and got this message:
format: bam, database: mm9
Could not locate a Bowtie index corresponding to basename "/galaxy/data/mm9/mm9canon/bowtie2_index/mm9canon"
Error: Encountered internal Bowtie 2 exception (#1)
Command: /galaxy/software/linux2.6-x86_64/pkg/bowtie2-2.1.0/bin/bowtie2-align --wrapper basic
I imported Illumina fastq data, groomed them, and then ran the analysis using the built-in mouse index (both full and male), and got the same error message each time.
I'm relatively new to this, and don't see what I missed.
Mary E. Davis, Ph.D.
Department of Physiology & Pharmacology
West Virginia University Health Sciences Center
PO Box 9229
Morgantown, WV 26506-9229
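The error above says Bowtie 2 could not find index files for the given basename. A minimal sketch for checking that on a server you control, assuming the usual Bowtie 2 naming (six files per index, ending in .bt2, or .bt2l for the large-index format in newer releases). The function name is hypothetical and not part of Galaxy or Bowtie 2:

```python
import os

# Suffixes Bowtie 2 appends to the basename when it builds an index.
BT2_SUFFIXES = (".1", ".2", ".3", ".4", ".rev.1", ".rev.2")


def missing_bowtie2_index_files(basename):
    """Return the expected small-index files that are absent for `basename`.

    An empty list means a complete index (small or large format) was found.
    """
    small = [f"{basename}{suffix}.bt2" for suffix in BT2_SUFFIXES]
    large = [f"{basename}{suffix}.bt2l" for suffix in BT2_SUFFIXES]
    if all(os.path.isfile(path) for path in large):
        return []
    return [path for path in small if not os.path.isfile(path)]


if __name__ == "__main__":
    # Basename taken from the error message quoted above.
    base = "/galaxy/data/mm9/mm9canon/bowtie2_index/mm9canon"
    for path in missing_bowtie2_index_files(base):
        print("missing:", path)
```

On a public server the index directory is not visible to users, so this only helps administrators of a local instance.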
Sending to galaxy-dev instead.
From: Srinivas Maddhi <iihg-galaxy-admin(a)uiowa.edu>
Date: Friday, November 1, 2013 11:56 AM
To: "galaxy-user(a)lists.bx.psu.edu" <galaxy-user(a)lists.bx.psu.edu>
Subject: Empty bowtie2 output
In follow-up to http://user.list.galaxyproject.org/Empty-bowtie2-output-tp4656137.html, is there:
- an ETA for fixing the issue with Bowtie2, in the August 2013 distribution, generating empty output (if it is not already fixed)?
- a suggested workaround (reverting to an older version of that particular tool, etc.) in the meantime?
Unrelated: wasn't able to determine how to update that thread to request status, hence creating a new one.
Dear Galaxy developers,
I know I am not the only one with this issue, as over time I've stumbled
on a few mailing-list threads with other users having the same problem.
And I know the recommended solution is to use the -noac mount option.
However, it is said that using this -noac mount option comes with a
performance trade-off, so when we first ran into this issue (datasets
showing "Empty" and "No peek", even though the file on the hard drive is
full of content), we implemented the hack found in this thread:
In this thread, John suggested adding a "sleep()" to the "finish_job"
method of the "galaxy_dist/lib/galaxy/jobs/runners/drmaa.py" file.
It worked very well for us. Adding a sleep(30) made every job wait
30 seconds before finishing, but the "No peek" issue had at least
disappeared.
However, since the latest Galaxy updates, this file (drmaa.py) has been
drastically changed and the "finish_job" method doesn't exist anymore.
Hence, I had to remove this hack, hoping that this issue would have
disappeared as well. Unfortunately, this "No peek" issue is still there
and causing many headaches for some of our workflow users.
My question is then: Can I put this "sleep(30)" in some other place
(method and/or file) in order to achieve the same result?
I would really like to solve this "No peek" issue without resorting to the
"-noac" mount option. Actually, I am not even sure our system
administrator would allow it.
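One possible variation on the fixed sleep(30) is to poll the output file until it looks populated and give up after a timeout. This is only a sketch: the helper name is hypothetical, where it would be called in the current runner code is not established in this thread, and whether polling gets past stale NFS attributes depends on the mount settings.

```python
import os
import time


def wait_for_nonempty(path, timeout=30.0, interval=1.0):
    """Poll until `path` exists with a non-zero size, or the timeout expires.

    Returns True as soon as the file looks populated, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if os.stat(path).st_size > 0:
                return True
        except OSError:
            pass  # the file is not visible from this host yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Compared with an unconditional sleep, jobs whose output is already visible return immediately. A job with a legitimately empty output still waits for the full timeout.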
Thanks again for your help!
What's the status of bowtie2/mm9 index on PSU main?
When I select tophat2, it offers me mm9 as a choice for built-in indexes.
However, when the job runs, I get the following error, indicating the bowtie2/mm9 indexes are missing (below).
Any insight into whether this is expected, or what the ETA is until the index would be installed, would be great.
I'm trying to reproduce work on PSU I ran on my local galaxy, so that we can link to it for supplemental materials for a paper.
PS - I clicked the submit bug button a few days ago, but haven't received a response yet.
Fatal error: Tool execution failed
[2013-10-29 10:13:27] Beginning TopHat run (v2.0.9)
[2013-10-29 10:13:27] Checking for Bowtie
Bowtie version: 22.214.171.124
[2013-10-29 10:13:27] Checking for Samtools
Samtools version: 0.1.18.0
[2013-10-29 10:13:27] Checking for Bowtie index files (genome)..
Error: Could not find Bowtie 2 index files (/galaxy/data/mm9/mm9full/bowtie2_index/mm9full.*.bt2)
From: Jennifer Jackson
Sent: Friday, September 20, 2013 4:00 PM
To: Curtis Hendrickson (Campus)
Subject: Re: [galaxy-dev] datacache & bowtie2 for mm9 ?
I am actually working to try to get mm9 out there right now. No promises, but it is just one genome (well, three, including variants)! If the technical side is a go, then I will do it. Ideally others will follow soonish. We'll see.
The last news brief has help for the Data manager, it may be that you need to do some config changes to get it going. I am certainly no expert - this is Dan's and under active development - but is where I would start.
On 9/20/13 1:25 PM, Curtis Hendrickson (Campus) wrote:
Thanks for the rapid reply! I have some questions and comments, but need to read up on Data Managers (that admin page seems non-functional in our local galaxy, despite being on latest code) first.
From: Jennifer Jackson
Sent: Friday, September 20, 2013 2:34 PM
To: Curtis Hendrickson (Campus)
Subject: Re: [galaxy-dev] datacache & bowtie2 for mm9 ?
The datacache was originally pointed to the data staging area and is now pointed to the data published area. The difference is that the published area contains data and location (.loc) files that are in sync and have completed final testing. It is your choice whether to use the staged-only data - it depends on how risk tolerant your project is and whether you plan on testing. But, that said, I think it is almost certainly fine or our team wouldn't have staged it yet. A vanishingly small number of datasets are pulled back once they make it to staging, and this is why we were comfortable pointing datacache there in the first place (we were unable to point to the published area at first, but wanted to make the data available ASAP).
Going forward - I can let you know that these indexes are very easy to create: one command-line execution, then add one line to the associated .loc file. Instructions are here, see "Bowtie and Tophat":
For one or a few genomes, this is not a problem. For hundreds of genomes with variants, it can become tedious even with helper tools, and in our case the processing interacted with disk that was undergoing changes (as we have been working on system configuration most of the summer). Also, now that the Data Manager is available, creating batch indexes for use via rsync has become a lower priority. Even so, I would expect more indexes to be fully published once the final configuration is in place, as many are already staged or close to being staged (watch the yellow banner on Main).
Hopefully this helps to explain the data, guides you to making an informed decision, and aids with creating your own indexes as needed.
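The two steps described above (one command-line execution, then one line in the .loc file) can be sketched as follows. The paths are hypothetical, and the four-column tab-separated layout (build id, dbkey, display name, index basename) is an assumption that should be checked against the sample .loc file shipped with your Galaxy:

```python
import shlex


def bowtie2_build_command(fasta_path, index_basename):
    """Return the argv for building a Bowtie 2 index from a FASTA file."""
    return ["bowtie2-build", fasta_path, index_basename]


def loc_line(build_id, dbkey, display_name, index_basename):
    """Return one tab-separated .loc entry (the column order is an assumption)."""
    return "\t".join([build_id, dbkey, display_name, index_basename])


if __name__ == "__main__":
    base = "/data/mm9/bowtie2_index/mm9"  # hypothetical index location
    argv = bowtie2_build_command("/data/mm9/seq/mm9.fa", base)
    print(" ".join(shlex.quote(arg) for arg in argv))
    print(loc_line("mm9", "mm9", "Mouse (mm9)", base))
```

The sketch only prints the command and the line to add. Running the build and restarting Galaxy so the new entry is picked up are left to the administrator.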
On 9/18/13 1:04 PM, Curtis Hendrickson (Campus) wrote:
First, I wanted to thank you for making the datacache available (http://wiki.galaxyproject.org/Admin/Data%20Integration; rsync://datacache.g2.bx.psu.edu). It's a great resource.
However, what is the best way to stay abreast of changes to what's in datacache, and understand how these indexes are computed?
We are currently upgrading to bowtie2, but I notice that the bowtie2 indices for mm9, which used to be in
have been removed, and only the hg19 genome has bowtie2 indices. Why only that one, and not the others?
Where are the scripts you use to make these indices, in case I want to create bowtie2 indices for other genomes?
So, how do I find out *why* they were removed? (Can I safely use the copy I have, or was there a problem with them?)
More generally, how do I understand the policies and logic behind the datacache indices, and be notified of changes, short of running my own periodic rsync/diff?
Finally, since I'm doing "reproducible research", is anything planned for systematically versioning genome indices, so I can easily tell what version of a system (i.e., what BWA version) was used to create the index, and be sure that an index will not suddenly disappear?
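Until something like that exists, the "periodic rsync/diff" mentioned above could be approximated by keeping a checksum manifest of a local copy of the indices and comparing manifests over time. A minimal sketch with hypothetical function names; it records file contents only, not which tool version built the index:

```python
import hashlib
import os


def index_manifest(root):
    """Map each file under `root` (relative path) to its SHA-256 hex digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            full = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            with open(full, "rb") as handle:
                # Read in 1 MiB chunks so large index files fit in memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(full, root)] = digest.hexdigest()
    return manifest


def diff_manifests(old, new):
    """Return (removed, added, changed) relative paths between two manifests."""
    removed = sorted(set(old) - set(new))
    added = sorted(set(new) - set(old))
    changed = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return removed, added, changed
```

Saving the manifest alongside a paper's supplemental materials would at least make it possible to tell later whether the same index files were used.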
Research Associate/CTSA-Informatics Team
University of Alabama at Birmingham
This is in regard to this:
Overall, this is very useful, just what I need, thanks. I'd *really*
like to see this feature in the mainline Galaxy. Is there some voting
necessary on Trello to achieve this, or is it enough to be
I tested the ModuleDependencyResolver, and fixed three problems:
1. Fixed up module loading to work properly. The problem is that
'module' is not a first class command, it's a shell function. And
it only works from interactive shells. The solution is to use the
underlying modulecmd command. This requires deeper knowledge in
the modules resolver of how environment modules work, which
obviates the DEFAULT_MODULE_COMMAND and the flexibility to
2. Made versionless fallback work, i.e. use the matching version if
it exists, and only fall back to a generic match if it doesn't.
3. Enhanced the DirectoryModuleChecker to look along the modulepath,
not just in a single directory. The default path is initialised
appropriately from environment variables MODULEPATH, MODULESHOME,
as per module(1). This can be overridden with the attribute
modulepath rather than directory in the config file.
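The three points above can be illustrated with a short sketch. This is not the attached patch: the function names are hypothetical, the MODULESHOME/modulefiles fallback is an assumption about a typical layout, and the last helper only builds the argument list for asking modulecmd to emit Python code rather than running it.

```python
import os


def module_search_path(environ=None):
    """Directories to search for modulefiles, from MODULEPATH or MODULESHOME."""
    environ = os.environ if environ is None else environ
    if environ.get("MODULEPATH"):
        return [d for d in environ["MODULEPATH"].split(":") if d]
    if environ.get("MODULESHOME"):
        # Assumed fallback location when MODULEPATH is unset.
        return [os.path.join(environ["MODULESHOME"], "modulefiles")]
    return []


def resolve_module(name, version, search_path):
    """Prefer name/version; fall back to the bare name if that version is absent.

    Returns the module string to load, or None when the module is not found.
    """
    for directory in search_path:
        if version and os.path.isfile(os.path.join(directory, name, version)):
            return f"{name}/{version}"
    for directory in search_path:
        if os.path.exists(os.path.join(directory, name)):
            return name
    return None


def modulecmd_argv(module, modulecmd="modulecmd"):
    """argv asking modulecmd to print Python code that loads `module`."""
    return [modulecmd, "python", "load", module]
```

Calling modulecmd directly avoids depending on the `module` shell function, which is the problem described in point 1.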
Fix attached - I presume a Mercurial export is all you need?
It may be better to default prefetch to false (but I didn't change
that). Otherwise the Galaxy server needs restarting after new system
packages become available.
Now, there's one more thing required, which I'm not sure how to
achieve. I intend to run with this config:
<modules prefetch="false" versionless="true"/>
So in particular I'm not interested in tool_shed_packages. However,
when I install from the toolshed, say, the emboss tool, it still
downloads the source tarball and tries to compile it locally (which
fails, as I don't have make installed on my production Galaxy, nor do
I want it). The emboss tool status in the "Manage installed tool shed
repositories" list is "Installed, missing tool dependencies", but
actually my installed modules mean the tool dependencies are satisfied.
The behaviour I'm after is not even to try to do the actions in a
tool_dependency.xml package spec in the toolshed, if I have dependency
resolvers configured without tool_shed_packages.
What are your thoughts on that?
I would like to set up a local Galaxy instance behind an Apache server with our local CAS for authentication.
It would be great if you could give me a hint for the httpd.conf. My problem is that after authenticating against CAS in the browser, I get the following error message, and REMOTE_USER doesn't seem to be in the HTTP header passed to Galaxy (I can see REMOTE_USER in the access_log of Apache, but no longer in the paster.log of Galaxy).
"Access to Galaxy is denied
Galaxy is configured to authenticate users via an external method (such as HTTP authentication in Apache), but a username was not provided by the upstream (proxy) server. This is generally due to a misconfiguration in the upstream server."
I know that the same question was already asked in the following post but I haven't seen an option to extend the post and I haven't found an answer.
Any help is much appreciated.
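One way to narrow this down is to put a tiny WSGI application behind the same Apache proxy rule in place of Galaxy and see what actually arrives. The sketch below assumes that Galaxy looks for the user name under the WSGI key HTTP_REMOTE_USER (that is, a request header named REMOTE_USER). The application name is hypothetical and it does not replace a correct httpd.conf:

```python
def remote_user_probe(environ, start_response):
    """Tiny WSGI app that reports whether a REMOTE_USER header was forwarded."""
    # Request headers reach a WSGI app as HTTP_<NAME> keys in `environ`.
    user = environ.get("HTTP_REMOTE_USER")
    body = f"HTTP_REMOTE_USER = {user!r}\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

If the probe shows None while Apache's access_log shows the user, the name is known to Apache but is not being copied into a request header before proxying.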
We are working on a galaxy tool suite for data analysis.
We use a sqlite db to keep result data centralised between the different tools.
At one point the tool configuration options of a tool should be dependent on the rows within a table of the sqlite db that is the output of the previous step. In other words, we would like to be able to set selectable parameters based on an underlying sql statement. If sql is not possible, an alternative would be to output the table content into a txt file and subsequently parse the txt file instead of the sqlite_db within the xml configuration file.
When looking through the galaxy wiki and mailing lists I came across the <code> tag, which would be ideal: we could run a python script in the background to fetch data from the sqlite table. However, that function is deprecated.
Does anybody know of other ways to achieve this?
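The fallback mentioned above, writing the table content to a text file, could look roughly like this. It is a sketch with a hypothetical function name, and how the next tool's parameters would then read their options from that file is not shown:

```python
import csv
import sqlite3


def dump_table(db_path, table, out_path):
    """Write every row of `table` to a tab-separated file with a header row.

    Returns the list of column names that were written as the header.
    """
    connection = sqlite3.connect(db_path)
    try:
        # Table names cannot be bound as parameters, so quote the identifier.
        quoted = '"' + table.replace('"', '""') + '"'
        cursor = connection.execute(f"SELECT * FROM {quoted}")
        header = [column[0] for column in cursor.description]
        with open(out_path, "w", newline="") as handle:
            writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
            writer.writerow(header)
            writer.writerows(cursor)
    finally:
        connection.close()
    return header
```

The previous tool in the suite would run this as a final step and declare the text file as an extra output dataset.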
Ir. Jeroen Crappé, PhD Student
Lab of Bioinformatics and Computational Genomics (Biobix)
FBW - Ghent University
There are two copies of the wiggle_to_simple tool in the main repository,
and this duplication appears to have happened back in 2009.
$ grep wiggle_to_simple tool_conf.xml.sample
<tool file="filters/wiggle_to_simple.xml" />
<tool file="stats/wiggle_to_simple.xml" />
$ diff tools/filters/wiggle_to_simple.py tools/stats/wiggle_to_simple.py
$ diff -w tools/filters/wiggle_to_simple.xml tools/stats/wiggle_to_simple.xml
< <param name="input" value="3.wig" />
< <output name="out_file1" file="3_wig.bed"/>
The tools/filters/wiggle_to_simple.xml version has Windows newlines,
and 2 tests.
The tools/stats/wiggle_to_simple.xml version has Unix newlines, but only 1 test.
I would therefore suggest merging the two (Unix newlines, both tests).
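Duplicates like this one can be spotted mechanically. A minimal sketch, with a hypothetical function name, that groups the `<tool file="...">` entries of a tool_conf file by base name:

```python
import posixpath
import xml.etree.ElementTree as ET
from collections import defaultdict


def duplicate_tool_files(tool_conf_xml):
    """Group <tool file="..."> entries by base name; keep names seen twice or more."""
    by_name = defaultdict(list)
    for tool in ET.fromstring(tool_conf_xml).iter("tool"):
        path = tool.get("file")
        if path:
            by_name[posixpath.basename(path)].append(path)
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}
```

Two tools sharing a file name are not necessarily copies of each other, so the diff shown above is still needed to confirm each hit.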
A security vulnerability was recently discovered by John Chilton with Galaxy's "Filter data on any column using simple expressions" and "Filter on ambiguities in polymorphism datasets" tools that can allow for arbitrary execution of code on the command line.
The fix for these tools has been committed to the Galaxy source. The timing of this commit coincides with the next Galaxy stable release (which has also been pushed out today).
To apply the fix and simultaneously update to the new Galaxy stable release, ensure you are on the stable branch and upgrade to the latest changeset:
% hg branch
% hg pull -u
For Galaxy installations that administrators are not yet ready to upgrade to the latest release, there are three workarounds.
First, for Galaxy installations running on a relatively new version of the stable release (e.g. release_2013.08.12), Galaxy can be updated to the specific changeset that contains the fix. This will include all of the stable (non-feature) commits that have accumulated since the 8/12 release plus any new features included with (and prior to) the 8/12 release, but without all of the new features included in the 11/4 release. Ensure you are on the stable branch and then upgrade to the specific changeset:
% hg pull -u -r e094c73fed4d
Second, the patch can be downloaded and applied manually:
% wget -O security.patch https://bitbucket.org/galaxy/galaxy-central/commits/e094c73fed4dc66b58993...
Then apply it with either Mercurial or patch(1), not both:
% hg patch security.patch
% patch -p1 < security.patch
Third, the tools can be completely disabled by removing them from the tool configuration file (by default, tool_conf.xml) and restarting all Galaxy server processes. The relevant lines in tool_conf.xml are:
<tool file="stats/dna_filtering.xml" />
<tool file="stats/filtering.xml" />
The full 11/4 Galaxy Distribution News Brief will be available later today and will contain details of changes since the last release.