Committed your changeset.

On Tue, Nov 16, 2010 at 11:43 AM, Peter <peter@maubp.freeserve.co.uk> wrote:
On Fri, Nov 12, 2010 at 2:05 AM, Kanwei Li <kanwei@gmail.com> wrote:
All changesets in the please_merge branch have been merged. Thanks for the contribution!
-Kanwei
Hi Kanwei & Kelly,
I've just updated my test installation of Galaxy and realised that there is a problem with the loc file handling for BLAST+ due to this commit from Kelly Vincent:
"Converted several tools to data table style of loc file handling (Bowtie, BWA, Lastz, Megablast, PerM, SRMA). Cleaned up several tool XML files, removing unnecessary None parameters."
http://bitbucket.org/galaxy/galaxy-central/changeset/535d276c92bc
When I wrote the BLAST+ wrappers, blastdb.loc (for nucleotides) used only two columns (caption and path). The same was true of the newly introduced blastdb_p.loc file.
The legacy megablast_wrapper.xml treated the first word of the caption as an ID and passed it to megablast_wrapper.py which used the loc file to look up the real path to use to call blastall. This seems convoluted to me.
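
For illustration, here is a minimal Python sketch of the kind of lookup described above: take the first word of the caption as an ID, then scan the two-column loc file for the matching path. This is not the actual code from megablast_wrapper.py. The function name, the tab-separated layout, and the example entry are assumptions made purely for this sketch.

```python
def legacy_lookup(db_id, loc_lines):
    """Return the path whose caption starts with db_id.

    Assumes a two-column, tab-separated loc format: caption<TAB>path.
    (Hypothetical helper, not the real megablast_wrapper.py logic.)
    """
    for line in loc_lines:
        line = line.rstrip("\n")
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        caption, path = fields[0], fields[1]
        words = caption.split()
        # The first word of the caption doubles as the identifier.
        if words and words[0] == db_id:
            return path
    raise KeyError(db_id)


# Hypothetical loc file content, for illustration only.
EXAMPLE_LOC = [
    "# caption<TAB>path\n",
    "nt 02-Dec-2009\t/data/blastdb/nt\n",
]

print(legacy_lookup("nt", EXAMPLE_LOC))
```

The indirection is visible here: the tool XML hands over only an ID, and the script has to re-read the loc file to recover the path.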
For my BLAST+ wrappers I just need the caption (to show to the user) and the path (to use at the command line), which were column indices 0 and 1 (Python counting), thus:

<options from_file="blastdb.loc">
  <column name="name" index="0"/>
  <column name="value" index="1"/>
</options>
Then came the patch from Kelly Vincent quoted above (changeset 535d276c92bc).
After this patch, the blastdb.loc and blastdb_p.loc files have three columns (id, caption, path), with the recommendation that if you were using the old megablast_wrapper.xml then pick the first word of the caption as the id (for backwards compatibility).
The XML for the BLAST+ wrappers now (wrongly) uses this,
<options from_file="blastdb.loc">
  <column name="name" index="2"/>
  <column name="value" index="0"/>
</options>
That means the name shown to the users is column 2 (in Python counting, i.e. the third column), which is the path (!), and the value used to call the executable is column 0 (i.e. the first column), which is the new identifier column.
This could still run, but only if the identifier happened to be the name of a valid BLAST database (e.g. nr) that was on the BLAST database path. Maybe that is the case on Kelly's machine?
What it should be using is column indexes 1 and 2 (for the caption and path, ignoring the new id column):
<options from_file="blastdb.loc">
  <column name="name" index="1"/>
  <column name="value" index="2"/>
</options>
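
To make the difference concrete, the following is a small Python sketch of how the two index settings pick fields from a three-column loc line (id, caption, path). It only mimics the column selection. It is not Galaxy's own parsing code, and the function name, the tab separator, and the example entry are assumptions for illustration.

```python
def select_option(loc_line, name_index, value_index):
    """Return (name shown to the user, value passed to the tool).

    Assumes tab-separated columns, indexed from 0 as in the tool XML.
    (Hypothetical helper, not part of Galaxy.)
    """
    fields = loc_line.rstrip("\n").split("\t")
    return fields[name_index], fields[value_index]


# Hypothetical three-column entry: id<TAB>caption<TAB>path
LINE = "nt_2009\tnt 02-Dec-2009\t/data/blastdb/nt\n"

# Indices from the current (wrong) XML: name=2, value=0
broken = select_option(LINE, name_index=2, value_index=0)
# Indices from the proposed fix: name=1, value=2
fixed = select_option(LINE, name_index=1, value_index=2)

print(broken)  # ('/data/blastdb/nt', 'nt_2009')
print(fixed)   # ('nt 02-Dec-2009', '/data/blastdb/nt')
```

With the wrong indices the user sees the path as the label and the tool receives the identifier. With indices 1 and 2 the user sees the caption and the tool receives the path.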
This is done in the following changeset: http://bitbucket.org/peterjc/galaxy-central/changeset/6b499b39b804
Could one of you apply that please?
I'd also like to know why the extra ID column was added - I don't understand what it is for. Can we remove it again?
Regards,
Peter