Re: [galaxy-dev] NCBI BLAST+ wrappers in Galaxy?
On Wed, Oct 13, 2010 at 10:22 AM, Peter <peter@maubp.freeserve.co.uk> wrote:
On Wed, Oct 13, 2010 at 12:58 AM, Kanwei Li <kanwei@gmail.com> wrote:
Hi Peter,
The BLAST+ branch has been transplanted into galaxy-central. Thank you!
Kanwei
Wow. Thanks :)
There are another few change sets on this follow up branch, http://bitbucket.org/peterjc/galaxy-central/src/blastplus2 These are: Include BLAST-XML to tabular in tool_conf.xml.sample http://bitbucket.org/peterjc/galaxy-central/changeset/629448f1a17b Workaround Issue 397 using 'if' in Cheetah template to avoid hidden parameters http://bitbucket.org/peterjc/galaxy-central/changeset/02ce0b68936a Ignore Unix editor backup files ending with tilde (not BLAST+ specific): http://bitbucket.org/peterjc/galaxy-central/changeset/708816ffd451 Ignore # lines in BLAST tabular output in my filter script http://bitbucket.org/peterjc/galaxy-central/changeset/b2dcd6b8d434 And a capitalisation fix in my XML caption: http://bitbucket.org/peterjc/galaxy-central/changeset/6efadbea0cfc
I mentioned it needs tests:
I've worked out how to add unit tests to the XML wrapper (that is well documented) and run them via run_functional_tests.sh - so adding some BLAST+ tests should be possible now. Peter
Hi again, On Mon, Oct 18, 2010 at 10:01 AM, Peter <peter@maubp.freeserve.co.uk> wrote:
There are another few change sets on this follow up branch, http://bitbucket.org/peterjc/galaxy-central/src/blastplus2
I've tidied that up a bit, please see: http://bitbucket.org/peterjc/galaxy-central/src/blastplus_please_merge I've been working on adding more arguments but this is contingent on support for optional interger/float parameters (Issue 403): http://bitbucket.org/galaxy/galaxy-central/issue/403/enhance-tool-parameters... Thanks, Peter
Hi all, Did this email get through? There was a problem on the the mailing list server, "Insufficient disk space; try again later". As an aside, how does the Galaxy team deal with giving external contributors commit access? Would you consider that here - say provisionally just for updating the BLAST+ wrappers? Thanks, Peter ---------- Forwarded message ---------- From: Peter <peter@maubp.freeserve.co.uk> Date: Mon, Oct 25, 2010 at 11:17 AM Subject: Re: [galaxy-dev] NCBI BLAST+ wrappers in Galaxy? To: galaxy-dev@lists.bx.psu.edu Hi again, On Mon, Oct 18, 2010 at 10:01 AM, Peter <peter@maubp.freeserve.co.uk> wrote:
There are another few change sets on this follow up branch, http://bitbucket.org/peterjc/galaxy-central/src/blastplus2
I've tidied that up a bit, please see: http://bitbucket.org/peterjc/galaxy-central/src/blastplus_please_merge I've been working on adding more arguments but this is contingent on support for optional interger/float parameters (Issue 403): http://bitbucket.org/galaxy/galaxy-central/issue/403/enhance-tool-parameters... Thanks, Peter
On Mon, Oct 25, 2010 at 11:17 AM, Peter <peter@maubp.freeserve.co.uk> wrote:
Hi again,
On Mon, Oct 18, 2010 at 10:01 AM, Peter <peter@maubp.freeserve.co.uk> wrote:
There are another few change sets on this follow up branch, http://bitbucket.org/peterjc/galaxy-central/src/blastplus2
I've tidied that up a bit, please see: http://bitbucket.org/peterjc/galaxy-central/src/blastplus_please_merge
Hopefully this will still merge nicely, note that the following recent commit makes a few minor changes to the BLAST+ wrappers: http://bitbucket.org/galaxy/galaxy-central/changeset/535d276c92bc
I've been working on adding more arguments but this is contingent on support for optional interger/float parameters (Issue 403): http://bitbucket.org/galaxy/galaxy-central/issue/403/enhance-tool-parameters...
Peter
Hi Peter, I will be responsible for merging in your changes so please send emails directly to me (you can still CC the list) when you have changes ready. I will try to merge this branch this week. Thanks! Kanwei On Mon, Nov 8, 2010 at 4:46 AM, Peter <peter@maubp.freeserve.co.uk> wrote:
On Mon, Oct 25, 2010 at 11:17 AM, Peter <peter@maubp.freeserve.co.uk> wrote:
Hi again,
On Mon, Oct 18, 2010 at 10:01 AM, Peter <peter@maubp.freeserve.co.uk> wrote:
There are another few change sets on this follow up branch, http://bitbucket.org/peterjc/galaxy-central/src/blastplus2
I've tidied that up a bit, please see: http://bitbucket.org/peterjc/galaxy-central/src/blastplus_please_merge
Hopefully this will still merge nicely, note that the following recent commit makes a few minor changes to the BLAST+ wrappers: http://bitbucket.org/galaxy/galaxy-central/changeset/535d276c92bc
I've been working on adding more arguments but this is contingent on support for optional interger/float parameters (Issue 403): http://bitbucket.org/galaxy/galaxy-central/issue/403/enhance-tool-parameters...
Peter _______________________________________________ galaxy-dev mailing list galaxy-dev@lists.bx.psu.edu http://lists.bx.psu.edu/listinfo/galaxy-dev
All changesets in the please_merge branch have been merged. Thanks for the contribution! -Kanwei On Mon, Nov 8, 2010 at 4:40 PM, Kanwei Li <kanwei@gmail.com> wrote:
Hi Peter,
I will be responsible for merging in your changes so please send emails directly to me (you can still CC the list) when you have changes ready. I will try to merge this branch this week.
Thanks!
Kanwei
On Mon, Nov 8, 2010 at 4:46 AM, Peter <peter@maubp.freeserve.co.uk> wrote:
On Mon, Oct 25, 2010 at 11:17 AM, Peter <peter@maubp.freeserve.co.uk> wrote:
Hi again,
On Mon, Oct 18, 2010 at 10:01 AM, Peter <peter@maubp.freeserve.co.uk> wrote:
There are another few change sets on this follow up branch, http://bitbucket.org/peterjc/galaxy-central/src/blastplus2
I've tidied that up a bit, please see: http://bitbucket.org/peterjc/galaxy-central/src/blastplus_please_merge
Hopefully this will still merge nicely, note that the following recent commit makes a few minor changes to the BLAST+ wrappers: http://bitbucket.org/galaxy/galaxy-central/changeset/535d276c92bc
I've been working on adding more arguments but this is contingent on support for optional interger/float parameters (Issue 403): http://bitbucket.org/galaxy/galaxy-central/issue/403/enhance-tool-parameters...
Peter _______________________________________________ galaxy-dev mailing list galaxy-dev@lists.bx.psu.edu http://lists.bx.psu.edu/listinfo/galaxy-dev
On Fri, Nov 12, 2010 at 2:05 AM, Kanwei Li <kanwei@gmail.com> wrote:
On Mon, Nov 8, 2010 at 4:40 PM, Kanwei Li <kanwei@gmail.com> wrote:
Hi Peter,
I will be responsible for merging in your changes so please send emails directly to me (you can still CC the list) when you have changes ready. I will try to merge this branch this week.
Thanks!
Kanwei
All changesets in the please_merge branch have been merged. Thanks for the contribution!
-Kanwei
Thank you Kanwei :) Would you be able to look at my patch for Issue 403? With that solved it would be much simpler to support a number of optional arguments in the BLAST+ wrappers: Issue 403: Enhance tool parameters to be optional http://bitbucket.org/galaxy/galaxy-central/issue/403/enhance-tool-parameters... Regards, Peter
On Fri, Nov 12, 2010 at 2:05 AM, Kanwei Li <kanwei@gmail.com> wrote:
All changesets in the please_merge branch have been merged. Thanks for the contribution!
-Kanwei
Hi Kanwei & Kelly, I've just updated my test installation of Galaxy and realised that there is a problem with the loc file handling for BLAST+ due to this commit from Kelly Vincent: "Converted several tools to data table style of loc file handling (Bowtie, BWA, Lastz, Megablast, PerM, SRMA). Cleaned up several tool XML files, removing unnecessary None parameters." http://bitbucket.org/galaxy/galaxy-central/changeset/535d276c92bc When I wrote the BLAST+ wrappers, blastdb.loc (for nucleotides) used two columns only (caption and path). Likewise for the introduced blastdb_p.loc file. The legacy megablast_wrapper.xml treated the first word of the caption as an ID and passed it to megablast_wrapper.py which used the loc file to look up the real path to use to call blastall. This seems convoluted to me. For my BLAST+ wrappers I just need the caption (to show to the user) and the path (to use at the command line), which were column indices 0 and 1 (python counting), thus: <options from_file="blastdb.loc"> <column name="name" index="1"/> <column name="value" index="2"/> </options> Then came this patch, from Kelly Vincent: "Converted several tools to data table style of loc file handling (Bowtie, BWA, Lastz, Megablast, PerM, SRMA). Cleaned up several tool XML files, removing unnecessary None parameters." http://bitbucket.org/galaxy/galaxy-central/changeset/535d276c92bc After this patch, the blastdb.loc and blastdb_p.loc files have three columns (id, caption, path), with the recommendation that if you were using the old megablast_wrapper.xml then pick the first word of the caption as the id (for backwards compatibility). The XML for the BLAST+ wrappers now (wrongly) uses this, <options from_file="blastdb.loc"> <column name="name" index="2"/> <column name="value" index="0"/> </options> That means the name shown to the users is column 2 (in python speak, i.e. the third column) which is the path (!) and the value used to call the executable is column 0 (in python speak, i.e. the first column) which is the new identifier column. Is it possible that this would run, but only if the identifier was actually the name of a valid blast database (e.g. nr) which was on the blast database path. Maybe that is the case on Kelly's machine? What it should be using is column indexes 1 and 2 (for the caption and path, ignoring the new id column): <options from_file="blastdb.loc"> <column name="name" index="1"/> <column name="value" index="2"/> </options> This is done in the following changeset: http://bitbucket.org/peterjc/galaxy-central/changeset/6b499b39b804 Could one of you apply that please? I'd also like to know why the extra ID column was added - I don't understand what it is for. Can we remove it again? Regards, Peter
Committed your changeset On Tue, Nov 16, 2010 at 11:43 AM, Peter <peter@maubp.freeserve.co.uk> wrote:
On Fri, Nov 12, 2010 at 2:05 AM, Kanwei Li <kanwei@gmail.com> wrote:
All changesets in the please_merge branch have been merged. Thanks for the contribution!
-Kanwei
Hi Kanwei & Kelly,
I've just updated my test installation of Galaxy and realised that there is a problem with the loc file handling for BLAST+ due to this commit from Kelly Vincent:
"Converted several tools to data table style of loc file handling (Bowtie, BWA, Lastz, Megablast, PerM, SRMA). Cleaned up several tool XML files, removing unnecessary None parameters."
http://bitbucket.org/galaxy/galaxy-central/changeset/535d276c92bc
When I wrote the BLAST+ wrappers, blastdb.loc (for nucleotides) used two columns only (caption and path). Likewise for the introduced blastdb_p.loc file.
The legacy megablast_wrapper.xml treated the first word of the caption as an ID and passed it to megablast_wrapper.py which used the loc file to look up the real path to use to call blastall. This seems convoluted to me.
For my BLAST+ wrappers I just need the caption (to show to the user) and the path (to use at the command line), which were column indices 0 and 1 (python counting), thus:
<options from_file="blastdb.loc"> <column name="name" index="1"/> <column name="value" index="2"/> </options>
Then came this patch, from Kelly Vincent: "Converted several tools to data table style of loc file handling (Bowtie, BWA, Lastz, Megablast, PerM, SRMA). Cleaned up several tool XML files, removing unnecessary None parameters." http://bitbucket.org/galaxy/galaxy-central/changeset/535d276c92bc
After this patch, the blastdb.loc and blastdb_p.loc files have three columns (id, caption, path), with the recommendation that if you were using the old megablast_wrapper.xml then pick the first word of the caption as the id (for backwards compatibility).
The XML for the BLAST+ wrappers now (wrongly) uses this,
<options from_file="blastdb.loc"> <column name="name" index="2"/> <column name="value" index="0"/> </options>
That means the name shown to the users is column 2 (in python speak, i.e. the third column) which is the path (!) and the value used to call the executable is column 0 (in python speak, i.e. the first column) which is the new identifier column.
Is it possible that this would run, but only if the identifier was actually the name of a valid blast database (e.g. nr) which was on the blast database path. Maybe that is the case on Kelly's machine?
What it should be using is column indexes 1 and 2 (for the caption and path, ignoring the new id column):
<options from_file="blastdb.loc"> <column name="name" index="1"/> <column name="value" index="2"/> </options>
This is done in the following changeset: http://bitbucket.org/peterjc/galaxy-central/changeset/6b499b39b804
Could one of you apply that please?
I'd also like to know why the extra ID column was added - I don't understand what it is for. Can we remove it again?
Regards,
Peter
Hi again Kanwei, I've got a few more BLAST+ wrapper changes, most of this is adding unit tests which just assume BLAST 2.2.24+ is installed (they do not use any databases, just FASTA file vs FASTA file). I also fixed the binary dependency declarations (all five wrappers said they depended on the blastn binary, which was sloppy of me). Note that currently Galaxy doesn't use this information, see Issue 82: http://bitbucket.org/galaxy/galaxy-central/issue/82/fix-the-tag-set-in-the-t... All the changes are on this new branch, http://bitbucket.org/peterjc/galaxy-central/src/blastplus_nov18 Change sets are: http://bitbucket.org/peterjc/galaxy-central/changeset/4f37444d4038 http://bitbucket.org/peterjc/galaxy-central/changeset/cf6c71e06097 http://bitbucket.org/peterjc/galaxy-central/changeset/3b11c4d0a22c http://bitbucket.org/peterjc/galaxy-central/changeset/54be89e916b9 http://bitbucket.org/peterjc/galaxy-central/changeset/94b013214fd3 To run the new tests, use: ./run_functional_tests.sh -id ncbi_blastp_wrapper ./run_functional_tests.sh -id ncbi_tblastn_wrapper Please let me know if there are any problems with the tests on your Galaxy -- for example discrepancies in the output due to the BLAST+ version installed. Regards, Peter
Peter, As Nate mentions in that issue, there are a bunch of problems checking the binary requirement tag at load time, most obvious that the place where the web server is running might have different tools than where jobs are actually run. However, the package tag is very useful and will be needed for these to work well on cloud instances for example. I'd suggest adding <requirement type='package'>ncbi-blast+</requirement> to all of these. The binary requirement can still be there as well. On Nov 18, 2010, at 10:33 AM, Peter wrote:
I also fixed the binary dependency declarations (all five wrappers said they depended on the blastn binary, which was sloppy of me).
-- jt James Taylor, Assistant Professor, Biology / Computer Science, Emory University
On Thu, Nov 18, 2010 at 4:17 PM, James Taylor <james@jamestaylor.org> wrote:
Peter,
As Nate mentions in that issue, there are a bunch of problems checking the binary requirement tag at load time, most obvious that the place where the web server is running might have different tools than where jobs are actually run.
Right. On Issue 82 Nate suggested skipping the load time check if there is a cluster configured. That sounds like a good compromise to me (I'd have updated the patch if I knew how to check that), and we could also check for required python modules (again, quite a few tools declare this and it isn't checked yet, is it?).
However, the package tag is very useful and will be needed for these to work well on cloud instances for example. I'd suggest adding <requirement type='package'>ncbi-blast+</requirement> to all of these. The binary requirement can still be there as well.
I'd have to also define the ncbi-blast+ package too, right? Is any of this new stuff documented yet (beyond the notes on Issue 82)? Thanks, Peter
Committed those, let me know when the next ones are ready
This changeset undoes part of the change to data tables in my commit, which was not an accident. The blastdb.loc.sample and blastdb_p.loc.sample now do not match the columns expected in the ncbi_blast_plus tools. Megablast uses the blastdb.loc file and it expects it to match the spec in tool_data_table_conf.xml. Data tables is far more flexible than the raw loc approach, which is why we changed it. The unique ID is necessary for the data tables approach and allows for the structure of the loc file and/or data location to change without breaking things. The former approach was to store the path as the value for the parameter. This means that if it was set in a workflow and if that path changed (i.e. the data directory was restructured), the workflow would no longer work. However, if we use the unique ID, it's possible to maintain backwards compatibility. Instead of the path, it stores the unique ID, which can be used to obtain the path so that it can be passed to the Python file. And for items that were already in the loc file, you set the unique ID to be the same as the original path, so that the parameter values in existing workflows is still the same. But new items can have nicer-looking IDs. And if extra columns ever need to be added, it's easy. Ever since James' original data tables commit (in August), we have been wanting to change everything over to this style, so I am going to change these files back. If you don't want to reformat the loc files, just use the tool_data_table_conf.xml.oldlocstyle instead of tool_data_table_conf.xml.sample as the source for tool_data_table_conf.xml. This is where the columns are defined, and it's just a matter of defining name and value. (In case you're not seeing it, it's not showing up in one of my clones for some reason, but it is definitely in the repository.) I'm working on a wiki page that will explain data tables, since they're pretty much undocumented at this point. Kelly On Nov 16, 2010, at 11:43 AM, Peter wrote:
On Fri, Nov 12, 2010 at 2:05 AM, Kanwei Li <kanwei@gmail.com> wrote:
All changesets in the please_merge branch have been merged. Thanks for the contribution!
-Kanwei
Hi Kanwei & Kelly,
I've just updated my test installation of Galaxy and realised that there is a problem with the loc file handling for BLAST+ due to this commit from Kelly Vincent:
"Converted several tools to data table style of loc file handling (Bowtie, BWA, Lastz, Megablast, PerM, SRMA). Cleaned up several tool XML files, removing unnecessary None parameters."
http://bitbucket.org/galaxy/galaxy-central/changeset/535d276c92bc
When I wrote the BLAST+ wrappers, blastdb.loc (for nucleotides) used two columns only (caption and path). Likewise for the introduced blastdb_p.loc file.
The legacy megablast_wrapper.xml treated the first word of the caption as an ID and passed it to megablast_wrapper.py which used the loc file to look up the real path to use to call blastall. This seems convoluted to me.
For my BLAST+ wrappers I just need the caption (to show to the user) and the path (to use at the command line), which were column indices 0 and 1 (python counting), thus:
<options from_file="blastdb.loc"> <column name="name" index="1"/> <column name="value" index="2"/> </options>
Then came this patch, from Kelly Vincent: "Converted several tools to data table style of loc file handling (Bowtie, BWA, Lastz, Megablast, PerM, SRMA). Cleaned up several tool XML files, removing unnecessary None parameters." http://bitbucket.org/galaxy/galaxy-central/changeset/535d276c92bc
After this patch, the blastdb.loc and blastdb_p.loc files have three columns (id, caption, path), with the recommendation that if you were using the old megablast_wrapper.xml then pick the first word of the caption as the id (for backwards compatibility).
The XML for the BLAST+ wrappers now (wrongly) uses this,
<options from_file="blastdb.loc"> <column name="name" index="2"/> <column name="value" index="0"/> </options>
That means the name shown to the users is column 2 (in python speak, i.e. the third column) which is the path (!) and the value used to call the executable is column 0 (in python speak, i.e. the first column) which is the new identifier column.
Is it possible that this would run, but only if the identifier was actually the name of a valid blast database (e.g. nr) which was on the blast database path. Maybe that is the case on Kelly's machine?
What it should be using is column indexes 1 and 2 (for the caption and path, ignoring the new id column):
<options from_file="blastdb.loc"> <column name="name" index="1"/> <column name="value" index="2"/> </options>
This is done in the following changeset: http://bitbucket.org/peterjc/galaxy-central/changeset/6b499b39b804
Could one of you apply that please?
I'd also like to know why the extra ID column was added - I don't understand what it is for. Can we remove it again?
Regards,
Peter
On Thu, Nov 18, 2010 at 5:57 PM, Kelly Vincent <kpvincent@bx.psu.edu> wrote:
This changeset undoes part of the change to data tables in my commit, which was not an accident. ...
I didn't think it was an accident, but it had broken the BLAST+ wrappers.
Data tables is far more flexible than the raw loc approach, which is why we changed it. The unique ID is necessary for the data tables approach and allows for the structure of the loc file and/or data location to change without breaking things. The former approach was to store the path as the value for the parameter. This means that if it was set in a workflow and if that path changed (i.e. the data directory was restructured), the workflow would no longer work. However, if we use the unique ID, it's possible to maintain backwards compatibility. Instead of the path, it stores the unique ID, which can be used to obtain the path so that it can be passed to the Python file. And for items that were already in the loc file, you set the unique ID to be the same as the original path, so that the parameter values in existing workflows is still the same. But new items can have nicer-looking IDs. And if extra columns ever need to be added, it's easy. Ever since James' original data tables commit (in August), we have been wanting to change everything over to this style, so I am going to change these files back.
...
I'm working on a wiki page that will explain data tables, since they're pretty much undocumented at this point.
I see - that was my guess for what the new ID was for, but the setup as you'd left it wasn't working for the BLAST+ wrappers. The new blastdb.loc.sample and blastdb_p.loc.sample have three columns: identifier (to be stored in the database to record what the user picked), name (to be shown to the user in the drop down select box), and path (to be given to the BLAST executable). Perhaps XML for the BLAST+ tools should be: <param name="database" type="select" label="Nucleotide BLAST database"> <!-- The BLAST loc file has three columns: column 0 is an identifier (what is saved in the database), column 1 is the caption (show this to the user), column 2 is the database path (given to BLAST+) --> <options from_file="blastdb.loc"> <column name="dbkey" index="0"/> <column name="name" index="1"/> <column name="value" index="2"/> </options> </param> where I'm assuming name="dbkey" is how you define the field which gets recorded in the database. Please go ahead and fix it - I'm about to go home for the day here. Peter
On Thu, Nov 18, 2010 at 6:09 PM, Peter <peter@maubp.freeserve.co.uk> wrote:
On Thu, Nov 18, 2010 at 5:57 PM, Kelly Vincent <kpvincent@bx.psu.edu> wrote:
This changeset undoes part of the change to data tables in my commit, which was not an accident. ...
I didn't think it was an accident, but it had broken the BLAST+ wrappers.
Data tables is far more flexible than the raw loc approach, which is why we changed it. The unique ID is necessary for the data tables approach and allows for the structure of the loc file and/or data location to change without breaking things. The former approach was to store the path as the value for the parameter. This means that if it was set in a workflow and if that path changed (i.e. the data directory was restructured), the workflow would no longer work. However, if we use the unique ID, it's possible to maintain backwards compatibility. Instead of the path, it stores the unique ID, which can be used to obtain the path so that it can be passed to the Python file. And for items that were already in the loc file, you set the unique ID to be the same as the original path, so that the parameter values in existing workflows is still the same. But new items can have nicer-looking IDs. And if extra columns ever need to be added, it's easy. Ever since James' original data tables commit (in August), we have been wanting to change everything over to this style, so I am going to change these files back.
...
I'm working on a wiki page that will explain data tables, since they're pretty much undocumented at this point.
I see - that was my guess for what the new ID was for, but the setup as you'd left it wasn't working for the BLAST+ wrappers.
The new blastdb.loc.sample and blastdb_p.loc.sample have three columns: identifier (to be stored in the database to record what the user picked), name (to be shown to the user in the drop down select box), and path (to be given to the BLAST executable).
...
Please go ahead and fix it - I'm about to go home for the day here.
I see you put some documentation on the wiki yesterday, http://bitbucket.org/galaxy/galaxy-central/wiki/DataTables I was expecting we might have to use $param.path rather than $param in the cheetah template for the command line in the wrapper XML, or that it might happen automatically (i.e. some code takes care of things so that the database path would be provided to the cheetah template as $param, while the value column would be used to record the step in the history/workflow). I have to say the proposed way to get at the path from within the wrapper XML file looks very nasty. Is this a short term solution? Regards, Peter
Hi Peter,
I have to say the proposed way to get at the path from within the wrapper XML file looks very nasty. Is this a short term solution?
Yes, that is very nasty and a bad stop-gap solution that needs to not propogate. Data tables are pretty new and need lots of enhancement, something along the lines of what you expected (e.g. $param.path) is the intention. Any other data table suggestions? Thanks, Dan
On Thu, Nov 18, 2010 at 6:09 PM, Peter <peter@maubp.freeserve.co.uk> wrote:
On Thu, Nov 18, 2010 at 5:57 PM, Kelly Vincent <kpvincent@bx.psu.edu> wrote:
This changeset undoes part of the change to data tables in my commit, which was not an accident. ...
I didn't think it was an accident, but it had broken the BLAST+ wrappers.
Data tables is far more flexible than the raw loc approach, which is why we changed it. The unique ID is necessary for the data tables approach and allows for the structure of the loc file and/or data location to change without breaking things. The former approach was to store the path as the value for the parameter. This means that if it was set in a workflow and if that path changed (i.e. the data directory was restructured), the workflow would no longer work. However, if we use the unique ID, it's possible to maintain backwards compatibility. Instead of the path, it stores the unique ID, which can be used to obtain the path so that it can be passed to the Python file. And for items that were already in the loc file, you set the unique ID to be the same as the original path, so that the parameter values in existing workflows is still the same. But new items can have nicer-looking IDs. And if extra columns ever need to be added, it's easy. Ever since James' original data tables commit (in August), we have been wanting to change everything over to this style, so I am going to change these files back.
...
I'm working on a wiki page that will explain data tables, since they're pretty much undocumented at this point.
I see - that was my guess for what the new ID was for, but the setup as you'd left it wasn't working for the BLAST+ wrappers.
The new blastdb.loc.sample and blastdb_p.loc.sample have three columns: identifier (to be stored in the database to record what the user picked), name (to be shown to the user in the drop down select box), and path (to be given to the BLAST executable).
...
Please go ahead and fix it - I'm about to go home for the day here.
I see you put some documentation on the wiki yesterday, http://bitbucket.org/galaxy/galaxy-central/wiki/DataTables
I was expecting we might have to use $param.path rather than $param in the cheetah template for the command line in the wrapper XML, or that it might happen automatically (i.e. some code takes care of things so that the database path would be provided to the cheetah template as $param, while the value column would be used to record the step in the history/workflow).
I have to say the proposed way to get at the path from within the wrapper XML file looks very nasty. Is this a short term solution?
Regards,
Peter _______________________________________________ galaxy-dev mailing list galaxy-dev@lists.bx.psu.edu http://lists.bx.psu.edu/listinfo/galaxy-dev
Hi Dan, On Wed, Nov 24, 2010 at 2:49 PM, Daniel Blankenberg <dan@bx.psu.edu> wrote:
Hi Peter,
I have to say the proposed way to get at the path from within the wrapper XML file looks very nasty. Is this a short term solution?
Yes, that is very nasty and a bad stop-gap solution that needs to not propogate.
Data tables are pretty new and need lots of enhancement, something along the lines of what you expected (e.g. $param.path) is the intention. Any other data table suggestions?
I see that has now landed on the trunk, https://bitbucket.org/galaxy/galaxy-central/changeset/7be369e7cc6f Is that stable and ready for public use now? I'll need to look at making updates to the BLAST+ wrappers to take advantage of this (probably after clearing though my back log of other BLAST+ stuff like the extended tabular output). Peter
Hi Peter (and -dev), Would something like ${param.fields.path} be preferred over ${param.get_fields( 'path' )}? Thanks, Dan On Jan 3, 2011, at 4:28 PM, Peter wrote:
Hi Dan,
On Wed, Nov 24, 2010 at 2:49 PM, Daniel Blankenberg <dan@bx.psu.edu> wrote:
Hi Peter,
I have to say the proposed way to get at the path from within the wrapper XML file looks very nasty. Is this a short term solution?
Yes, that is very nasty and a bad stop-gap solution that needs to not propogate.
Data tables are pretty new and need lots of enhancement, something along the lines of what you expected (e.g. $param.path) is the intention. Any other data table suggestions?
I see that has now landed on the trunk, https://bitbucket.org/galaxy/galaxy-central/changeset/7be369e7cc6f
Is that stable and ready for public use now? I'll need to look at making updates to the BLAST+ wrappers to take advantage of this (probably after clearing though my back log of other BLAST+ stuff like the extended tabular output).
Peter
On Tue, Jan 4, 2011 at 2:26 AM, Daniel Blankenberg <dan@bx.psu.edu> wrote:
Hi Peter (and -dev),
Would something like ${param.fields.path} be preferred over ${param.get_fields( 'path' )}?
Thanks,
Dan
That looks nicer to type in the XML wrapper, but I am not overly bothered. Could we use either syntax for things like filters and conditional checks (in other parameters)? Peter
I would think the get_fields way is more robust since you wouldn't run into issues with special characters On Tue, Jan 4, 2011 at 12:35 PM, Peter <peter@maubp.freeserve.co.uk> wrote:
On Tue, Jan 4, 2011 at 2:26 AM, Daniel Blankenberg <dan@bx.psu.edu> wrote:
Hi Peter (and -dev),
Would something like ${param.fields.path} be preferred over ${param.get_fields( 'path' )}?
Thanks,
Dan
That looks nicer to type in the XML wrapper, but I am not overly bothered. Could we use either syntax for things like filters and conditional checks (in other parameters)?
Peter _______________________________________________ galaxy-dev mailing list galaxy-dev@lists.bx.psu.edu http://lists.bx.psu.edu/listinfo/galaxy-dev
Hi Peter, The method to access additional fields from .loc files has been changed in 4804:376cce23623d to be of the form: ${param.fields.path}. Thanks again for your input, Dan On Jan 4, 2011, at 12:35 PM, Peter wrote:
On Tue, Jan 4, 2011 at 2:26 AM, Daniel Blankenberg <dan@bx.psu.edu> wrote:
Hi Peter (and -dev),
Would something like ${param.fields.path} be preferred over ${param.get_fields( 'path' )}?
Thanks,
Dan
That looks nicer to type in the XML wrapper, but I am not overly bothered. Could we use either syntax for things like filters and conditional checks (in other parameters)?
Peter
On Wed, Jan 5, 2011 at 8:02 PM, Daniel Blankenberg <dan@bx.psu.edu> wrote:
Hi Peter,
The method to access additional fields from .loc files has been changed in 4804:376cce23623d to be of the form: ${param.fields.path}.
:) Did you see Kanwei's email pointing out that while this is nice, it won't work with special characters (e.g. hyphens), or another problem would be reserved Python names (e.g. in)? I don't think this will be a problem in practice, but should be touched on in the documentation for this (and setting up loc files). Peter
Hi Peter, We had some discussion today about Kanwei's email and opted for the nicer looking access method. I've added a note to the ToolConfigSyntax wiki page about special characters. Thanks, Dan On Jan 5, 2011, at 3:07 PM, Peter wrote:
On Wed, Jan 5, 2011 at 8:02 PM, Daniel Blankenberg <dan@bx.psu.edu> wrote:
Hi Peter,
The method to access additional fields from .loc files has been changed in 4804:376cce23623d to be of the form: ${param.fields.path}.
:)
Did you see Kanwei's email pointing out that while this is nice, it won't work with special characters (e.g. hyphens), or another problem would be reserved Python names (e.g. in)? I don't think this will be a problem in practice, but should be touched on in the documentation for this (and setting up loc files).
Peter
On Wed, Jan 5, 2011 at 9:07 PM, Daniel Blankenberg <dan@bx.psu.edu> wrote:
Hi Peter,
We had some discussion today about Kanwei's email and opted for the nicer looking access method. I've added a note to the ToolConfigSyntax wiki page about special characters.
Thanks,
Dan
Thanks Dan, Kanwei - could you merge/transplant this changeset to use the new parameter settings in the NCBI BLAST+ wrappers please: https://bitbucket.org/peterjc/galaxy-central/changeset/aa82d8273063 This is currently the only commit on this new branch, https://bitbucket.org/peterjc/galaxy-central/changeset/blastplus_jan31 This took a while because I wanted to get the other BLAST+ changes merged first - and that's done now. Thanks, Peter
Hi Kanwei, On Mon, Jan 31, 2011 at 5:16 PM, Peter <peter@maubp.freeserve.co.uk> wrote:
On Wed, Jan 5, 2011 at 9:07 PM, Daniel Blankenberg <dan@bx.psu.edu> wrote:
Hi Peter,
We had some discussion today about Kanwei's email and opted for the nicer looking access method. I've added a note to the ToolConfigSyntax wiki page about special characters.
Thanks,
Dan
Thanks Dan,
Kanwei - could you merge/transplant this changeset to use the new parameter settings in the NCBI BLAST+ wrappers please:
https://bitbucket.org/peterjc/galaxy-central/changeset/aa82d8273063
This is currently the only commit on this new branch,
https://bitbucket.org/peterjc/galaxy-central/changeset/blastplus_jan31
This took a while because I wanted to get the other BLAST+ changes merged first - and that's done now.
Thanks,
Peter
Actually there are two commits on that branch now, I've added a workaround for Galaxy issue 325 which is causing us real problems with BLAST warnings being treated as errors: https://bitbucket.org/peterjc/galaxy-central/changeset/1bd227e7eb4c Once issue 325 is fixed, this wrapper script can be removed. Even my proposed first step of getting Galaxy to check the return code would allow a simpler workaround of redirecting stderr to stdout in the XML for the BLAST+ tools. Thanks, Peter
Hi Kanewi, Sorry - make that three changesets on this branch to consider mering/transplanting: https://bitbucket.org/peterjc/galaxy-central/changeset/blastplus_jan31 Parameter handling, https://bitbucket.org/peterjc/galaxy-central/changeset/aa82d8273063 Cope with errors on stderr, https://bitbucket.org/peterjc/galaxy-central/changeset/1bd227e7eb4c Ensure blast jobs get killed, https://bitbucket.org/peterjc/galaxy-central/changeset/607b7268693a Peter
Committed On Tue, Feb 1, 2011 at 10:27 AM, Peter <peter@maubp.freeserve.co.uk> wrote:
Hi Kanewi,
Sorry - make that three changesets on this branch to consider mering/transplanting: https://bitbucket.org/peterjc/galaxy-central/changeset/blastplus_jan31
Parameter handling, https://bitbucket.org/peterjc/galaxy-central/changeset/aa82d8273063
Cope with errors on stderr, https://bitbucket.org/peterjc/galaxy-central/changeset/1bd227e7eb4c
Ensure blast jobs get killed, https://bitbucket.org/peterjc/galaxy-central/changeset/607b7268693a
Peter
participants (5)
-
Daniel Blankenberg
-
James Taylor
-
Kanwei Li
-
Kelly Vincent
-
Peter