On Fri, Nov 12, 2010 at 2:05 AM, Kanwei Li <kanwei@gmail.com> wrote:
All changesets in the please_merge branch have been merged. Thanks for the contribution!
-Kanwei
Hi Kanwei & Kelly,

I've just updated my test installation of Galaxy and realised that there is a problem with the loc file handling for BLAST+ due to this commit from Kelly Vincent:

"Converted several tools to data table style of loc file handling (Bowtie, BWA, Lastz, Megablast, PerM, SRMA). Cleaned up several tool XML files, removing unnecessary None parameters."
http://bitbucket.org/galaxy/galaxy-central/changeset/535d276c92bc

When I wrote the BLAST+ wrappers, blastdb.loc (for nucleotides) used two columns only (caption and path). Likewise for the newly introduced blastdb_p.loc file. The legacy megablast_wrapper.xml treated the first word of the caption as an ID and passed it to megablast_wrapper.py, which used the loc file to look up the real path to use when calling blastall. This seems convoluted to me.

For my BLAST+ wrappers I just need the caption (to show to the user) and the path (to use at the command line), which were column indices 0 and 1 (Python counting), thus:

    <options from_file="blastdb.loc">
      <column name="name" index="0"/>
      <column name="value" index="1"/>
    </options>

Then came Kelly's patch (the changeset linked above). After this patch, the blastdb.loc and blastdb_p.loc files have three columns (id, caption, path), with the recommendation that if you were using the old megablast_wrapper.xml you pick the first word of the caption as the id (for backwards compatibility).

The XML for the BLAST+ wrappers now (wrongly) uses this:

    <options from_file="blastdb.loc">
      <column name="name" index="2"/>
      <column name="value" index="0"/>
    </options>

That means the name shown to the users is column 2 (in Python counting, i.e. the third column), which is the path (!), and the value used to call the executable is column 0 (i.e. the first column), which is the new identifier column. It is possible that this would run, but only if the identifier was actually the name of a valid BLAST database (e.g. nr) which was on the BLAST database path. Maybe that is the case on Kelly's machine?

What it should be using is column indexes 1 and 2 (for the caption and path, ignoring the new id column):

    <options from_file="blastdb.loc">
      <column name="name" index="1"/>
      <column name="value" index="2"/>
    </options>

This is done in the following changeset:
http://bitbucket.org/peterjc/galaxy-central/changeset/6b499b39b804

Could one of you apply that please? I'd also like to know why the extra ID column was added - I don't understand what it is for. Can we remove it again?

Regards,

Peter
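P.S. To make the index mix-up concrete, here is a minimal Python sketch of how the two column mappings behave on a three-column loc line. It is only an illustration, not Galaxy's actual loc file parser: the sample line, the paths in it, and the helper name resolve_option are all made up, and I am assuming the columns are tab separated.

```python
# Hypothetical three-column loc line (id, caption, path), tab separated.
# The values are invented for illustration only.
SAMPLE_LOC_LINE = "nt_example\tNucleotide example database\t/data/blastdb/nt_example"


def resolve_option(loc_line, name_index, value_index):
    """Return (name shown to the user, value passed to the command line).

    Indices use Python counting, so 0 is the first column.
    """
    columns = loc_line.rstrip("\n").split("\t")
    return columns[name_index], columns[value_index]


# Mapping currently in the BLAST+ wrappers: name=2, value=0.
broken = resolve_option(SAMPLE_LOC_LINE, name_index=2, value_index=0)

# Mapping proposed in this email: name=1, value=2.
fixed = resolve_option(SAMPLE_LOC_LINE, name_index=1, value_index=2)

print("broken:", broken)
print("fixed: ", fixed)
```

With the current mapping the user sees the path and the executable receives the id; with the proposed mapping the user sees the caption and the executable receives the path.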