#3351 should fix this - Juan, can you please try pulling the current code to
see if it works for you?
Some notes on how to maintain a local code repository have also been
added to section 6 of
http://bitbucket.org/galaxy/galaxy-central/wiki/GetGalaxy - no need to do a
fresh clone each time you want to update your local copy.
Fixes and comments on the updated notes are welcome.
On Mon, Feb 8, 2010 at 12:37 PM, <galaxy-dev-request(a)lists.bx.psu.edu> wrote:
>
>
> Message: 1
> Date: Fri, 5 Feb 2010 18:13:05 -0500
> From: Kanwei Li <kanwei(a)gmail.com>
> To: Juan Perin <juanperin(a)gmail.com>, James Taylor
> <james.taylor(a)emory.edu>
> Cc: galaxy-dev(a)bx.psu.edu
> Subject: Re: [galaxy-dev] Python problems
> Message-ID:
> <1469e4b41002051513o5a21793an348c37646724cb3e(a)mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> Hi Juan,
>
> After some searching it seems that it is an issue with python2.4 and
> hashlib. We recently introduced a change that might be responsible for
> this (commit 3311) and we'll look into it.
>
> Thanks,
>
> Kanwei
>
> On Fri, Feb 5, 2010 at 12:47 PM, Juan Perin <juanperin(a)gmail.com> wrote:
> > I've been working with a much older release of galaxy for a while, and it
> > has worked great for the last few months. I noticed some progress with the
> > NGS tools, so decided to attempt updating. My first mistake was in not
> > knowing how to do so, so I essentially decided to start from the hg clone
> > step and rebuild galaxy entirely in a new place on the same machine. I
> > stupidly ran the database update script without listening to the warning
> > about backing up the original galaxy db. So, my original instance won't
> > work now...
> > So, I'm trying to get the new instance working. Copied over my universe
> > file and the custom .loc files from tool-data/ . Everything seems to be ok,
> > however I'm getting a python error that essentially repeats for anything I
> > try to do, I'll paste the full output below for reference. I'm using python
> > 2.4
> > Any ideas?
> > Thanks in advance.
> > URL: http://variome.chop.edu:8082/tool_runner?tool_id=bowtie_wrapper
> > File '/opt/galaxy-dist/eggs/py2.4-noplatform/WebError-0.8a-py2.4.egg/weberror/evalexception/middleware.py', line 364 in respond
> >   app_iter = self.application(environ, detect_start_response)
> > File '/opt/galaxy-dist/eggs/py2.4-noplatform/Paste-1.6-py2.4.egg/paste/debug/prints.py', line 97 in __call__
> >   status, headers, body = wsgilib.intercept_output(
> > File '/opt/galaxy-dist/eggs/py2.4-noplatform/Paste-1.6-py2.4.egg/paste/wsgilib.py', line 539 in intercept_output
> >   app_iter = application(environ, replacement_start_response)
> > File '/opt/galaxy-dist/eggs/py2.4-noplatform/Paste-1.6-py2.4.egg/paste/recursive.py', line 80 in __call__
> >   return self.application(environ, start_response)
> > File '/opt/galaxy-dist/eggs/py2.4-noplatform/Paste-1.6-py2.4.egg/paste/httpexceptions.py', line 632 in __call__
> >   return self.application(environ, start_response)
> > File '/opt/galaxy-dist/lib/galaxy/web/framework/base.py', line 125 in __call__
> >   body = method( trans, **kwargs )
> > File '/opt/galaxy-dist/lib/galaxy/web/controllers/tool_runner.py', line 61 in index
> >   return trans.fill_template( template, history=history, toolbox=toolbox, tool=tool, util=util, add_frame=add_frame, **vars )
> > File '/opt/galaxy-dist/lib/galaxy/web/framework/__init__.py', line 602 in fill_template
> >   return self.fill_template_mako( filename, **kwargs )
> > File '/opt/galaxy-dist/lib/galaxy/web/framework/__init__.py', line 613 in fill_template_mako
> >   return template.render( **data )
> > File '/opt/galaxy-dist/eggs/py2.4-noplatform/Mako-0.2.5-py2.4.egg/mako/template.py', line 133 in render
> >   return runtime._render(self, self.callable_, args, data)
> > File '/opt/galaxy-dist/eggs/py2.4-noplatform/Mako-0.2.5-py2.4.egg/mako/runtime.py', line 364 in _render
> >   _render_context(template, callable_, context, *args, **_kwargs_for_callable(callable_, data))
> > File '/opt/galaxy-dist/eggs/py2.4-noplatform/Mako-0.2.5-py2.4.egg/mako/runtime.py', line 381 in _render_context
> >   _exec_template(inherit, lclcontext, args=args, kwargs=kwargs)
> > File '/opt/galaxy-dist/eggs/py2.4-noplatform/Mako-0.2.5-py2.4.egg/mako/runtime.py', line 414 in _exec_template
> >   callable_(context, *args, **kwargs)
> > File '/opt/galaxy-dist/database/compiled_templates/tool_form.mako.py', line 103 in render_body
> >   __M_writer(unicode(util.object_to_string( tool_state.encode( tool, app ) )))
> > File '/opt/galaxy-dist/lib/galaxy/tools/__init__.py', line 216 in encode
> >   a = hmac_new( app.config.tool_secret, value )
> > File '/opt/galaxy-dist/lib/galaxy/util/hash_util.py', line 33 in hmac_new
> >   return hmac.new( key, value, sha1 ).hexdigest()
> > File '/usr/lib64/python2.4/hmac.py', line 107 in new
> >   return HMAC(key, msg, digestmod)
> > File '/usr/lib64/python2.4/hmac.py', line 42 in __init__
> >   self.outer = digestmod.new()
> > AttributeError: 'builtin_function_or_method' object has no attribute 'new'
>
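The AttributeError at the end of the traceback above fits Kanwei's diagnosis: the quoted hmac.py from Python 2.4 calls digestmod.new(), so digestmod has to be a module-like object such as the old sha module, and a bare constructor function such as sha1 has no new attribute. Below is a minimal sketch of a version-aware wrapper. It is an illustration only and makes no claim about what changeset 3351 actually does; the hmac_new name is borrowed from the traceback, and the version check is an assumption.

```python
import hmac
import sys

if sys.version_info >= (2, 5):
    # Python 2.5 and later: hmac accepts a hash constructor as digestmod.
    import hashlib
    digestmod = hashlib.sha1
else:
    # Python 2.4: hmac calls digestmod.new(), so pass the sha module itself.
    import sha
    digestmod = sha

def hmac_new(key, value):
    """Return the hex HMAC-SHA1 digest of value under key."""
    return hmac.new(key, value, digestmod).hexdigest()
```

On Python 3 both arguments must be bytes, for example hmac_new(b"secret", b"payload"). Checking the interpreter version rather than whether hashlib can be imported avoids the same failure on a 2.4 install that has a hashlib backport available.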
details: http://www.bx.psu.edu/hg/galaxy/rev/b36c13131ac7
changeset: 3358:b36c13131ac7
user: Kelly Vincent <kpvincent(a)bx.psu.edu>
date: Mon Feb 08 23:52:56 2010 -0500
description:
Initial version of DNA code filter tool
diffstat:
test-data/dna_filter_in1.bed | 49 ++++++++++
test-data/dna_filter_out1.bed | 4 +
test-data/dna_filter_out2.bed | 39 ++++++++
test-data/dna_filter_out3.bed | 41 ++++++++
test-data/dna_filter_out4.bed | 24 +++++
tool_conf.xml.sample | 1 +
tools/stats/dna_filtering.py | 195 ++++++++++++++++++++++++++++++++++++++++++
tools/stats/dna_filtering.xml | 114 ++++++++++++++++++++++++
8 files changed, 467 insertions(+), 0 deletions(-)
diffs (506 lines):
diff -r dedb7be9aa44 -r b36c13131ac7 test-data/dna_filter_in1.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/dna_filter_in1.bed Mon Feb 08 23:52:56 2010 -0500
@@ -0,0 +1,49 @@
+chr1 256 257 A N M N - M N U N N A N D N G N N K N N N
+chr1 468 469 C C C N M N N K . N C U N H N G N N M N S
+chr1 582 583 G G G N G R N R N - N M N V K N N N G C R
+chr1 602 603 G G G N G N Y N R G G N N U N T N A K N R
+chr1 4792 4793 A A M K N W S S N N Y N N N N N M R N R N
+chr1 6119 6120 G G M N S N N W B N S D N N H V N B W N N
+chr1 6357 6358 G G N M K N G - N N G U N N N B N N K N S
+chr1 6433 6434 G G N R N N C N N N . N N . N N N N N R N
+chr1 39160 39161 T T T N N Y N - N N N N N N N V N N N N Y
+chr1 41920 41921 G C G N M C G N A N G N K N W S N N N V N
+chr1 42100 42101 T T T Y R W N N N V N M R N N G N M Y N K
+chr1 45026 45027 C A C N N Y N S Y N N X N A D N N K N N A
+chr1 45161 45162 C T C . N X H V N N C R N Y N N N N R N Y
+chr2 45407 45408 C N C S B N N N N N C N Y N N T K G N C N
+chr2 45788 45789 T T T N W S N Y N R Y N S N W M N C T N C
+chr2 46243 46244 T T T N W N N B V N U N T N N Y C N U N N
+chr2 47814 47815 A C A S N X D N N H W N G N Y C N N M R N
+chr2 48073 48074 A G A Y W . N K N N N G N N N G N N N Y N
+chr2 48633 48634 T T T N G N N N . N N N N S N Y N . N N N
+chr2 51304 51305 A G N N C N W - N S Y N . N N G N N N W R
+chr2 51324 51325 T T N R N N N N N - N U N W A N N N N N N
+chr2 52065 52066 T C T N N N S N . N T N M N S W N T Y C N
+chr2 53130 53131 T C T K R . N B N N T N N M N Y N N Y N N
+chr2 53505 53506 A A A M N N Y N N N N - K N W N N N S N R
+chr2 53559 53560 T T T N N V R V N N T N U N N B N M N V Y
+chr2 55607 55608 A N A U S N N H R K N N N Y N N G N N N N
+chr10 55659 55660 T N T C N K N N N U N S N N N V C R S N N
+chr10 55734 55735 T N T G N C N M M G C N B N . N G N N N N
+chr10 55870 55871 C G C N H G - N N N C N H K N M G N N N N
+chr10 56024 56025 A T A N D U N Y B N N X N N Y N T N - N N
+chr10 56100 56101 T T A W N N W N S N K M N R N R N R N G N
+chr10 56120 56121 A - A N A N N Y N N N W V N N Y G N N W N
+chr10 56137 56138 A A A N A Y H . Y N G N . D N N T N N N N
+chr10 56174 56175 A T A Y A N N N N N N N N N . S T Y N B N
+chr10 59373 59374 A G A N N N N N N T N S N N N G N N N V N
+chr10 68912 68913 G T G R N B R N H N U W Y N N N N N N N T
+chr10 72946 72947 T A N N N N N N B N N . B D W U N U N D A
+chr10 77052 77053 G A R N G N N Y N N N N N N B R N W N N R
+chr18 78200 78201 G G G N N H N N V N G N N N N A A N K X N
+chr18 81076 81077 T A T B N N G N N X W N X N V N N D N N N
+chr18 81198 81199 A T A N N N N - N N X N K T N M N K X N W
+chr18 81216 81217 G A G Y N N D N X N N N N A N S N N N D N
+chr18 81398 81399 G T G N - W N N M N G C N K N S N N N N K
+chr18 91548 91549 A A A S N X H S R N A K N N N N U A R N N
+chr18 93895 93896 T T T H N N V W Y N N N - N N N N N N Y N
+chr18 98172 98173 T T T N . N N N S N T N Y N N Y X D V N Y
+chr18 110904 110905 T - A A N A N A W A N N A X N W N N N N N
+chr18 140324 140325 A A A N M N N Y N S N V N N X N C N N . M
+chr18 160592 160593 C G G G N G N G N G N N G N N M T N Y N N
\ No newline at end of file
diff -r dedb7be9aa44 -r b36c13131ac7 test-data/dna_filter_out1.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/dna_filter_out1.bed Mon Feb 08 23:52:56 2010 -0500
@@ -0,0 +1,4 @@
+chr1 582 583 G G G N G R N R N - N M N V K N N N G C R
+chr1 602 603 G G G N G N Y N R G G N N U N T N A K N R
+chr2 48633 48634 T T T N G N N N . N N N N S N Y N . N N N
+chr10 77052 77053 G A R N G N N Y N N N N N N B R N W N N R
diff -r dedb7be9aa44 -r b36c13131ac7 test-data/dna_filter_out2.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/dna_filter_out2.bed Mon Feb 08 23:52:56 2010 -0500
@@ -0,0 +1,39 @@
+chr1 256 257 A N M N - M N U N N A N D N G N N K N N N
+chr1 602 603 G G G N G N Y N R G G N N U N T N A K N R
+chr1 4792 4793 A A M K N W S S N N Y N N N N N M R N R N
+chr1 6119 6120 G G M N S N N W B N S D N N H V N B W N N
+chr1 6357 6358 G G N M K N G - N N G U N N N B N N K N S
+chr1 6433 6434 G G N R N N C N N N . N N . N N N N N R N
+chr1 39160 39161 T T T N N Y N - N N N N N N N V N N N N Y
+chr1 41920 41921 G C G N M C G N A N G N K N W S N N N V N
+chr1 42100 42101 T T T Y R W N N N V N M R N N G N M Y N K
+chr2 45788 45789 T T T N W S N Y N R Y N S N W M N C T N C
+chr2 46243 46244 T T T N W N N B V N U N T N N Y C N U N N
+chr2 47814 47815 A C A S N X D N N H W N G N Y C N N M R N
+chr2 48633 48634 T T T N G N N N . N N N N S N Y N . N N N
+chr2 51304 51305 A G N N C N W - N S Y N . N N G N N N W R
+chr2 51324 51325 T T N R N N N N N - N U N W A N N N N N N
+chr2 53130 53131 T C T K R . N B N N T N N M N Y N N Y N N
+chr2 53505 53506 A A A M N N Y N N N N - K N W N N N S N R
+chr2 53559 53560 T T T N N V R V N N T N U N N B N M N V Y
+chr2 55607 55608 A N A U S N N H R K N N N Y N N G N N N N
+chr10 55659 55660 T N T C N K N N N U N S N N N V C R S N N
+chr10 55734 55735 T N T G N C N M M G C N B N . N G N N N N
+chr10 56024 56025 A T A N D U N Y B N N X N N Y N T N - N N
+chr10 56100 56101 T T A W N N W N S N K M N R N R N R N G N
+chr10 56120 56121 A - A N A N N Y N N N W V N N Y G N N W N
+chr10 56137 56138 A A A N A Y H . Y N G N . D N N T N N N N
+chr10 56174 56175 A T A Y A N N N N N N N N N . S T Y N B N
+chr10 59373 59374 A G A N N N N N N T N S N N N G N N N V N
+chr10 68912 68913 G T G R N B R N H N U W Y N N N N N N N T
+chr10 72946 72947 T A N N N N N N B N N . B D W U N U N D A
+chr10 77052 77053 G A R N G N N Y N N N N N N B R N W N N R
+chr18 78200 78201 G G G N N H N N V N G N N N N A A N K X N
+chr18 81076 81077 T A T B N N G N N X W N X N V N N D N N N
+chr18 81198 81199 A T A N N N N - N N X N K T N M N K X N W
+chr18 81216 81217 G A G Y N N D N X N N N N A N S N N N D N
+chr18 81398 81399 G T G N - W N N M N G C N K N S N N N N K
+chr18 91548 91549 A A A S N X H S R N A K N N N N U A R N N
+chr18 98172 98173 T T T N . N N N S N T N Y N N Y X D V N Y
+chr18 110904 110905 T - A A N A N A W A N N A X N W N N N N N
+chr18 160592 160593 C G G G N G N G N G N N G N N M T N Y N N
diff -r dedb7be9aa44 -r b36c13131ac7 test-data/dna_filter_out3.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/dna_filter_out3.bed Mon Feb 08 23:52:56 2010 -0500
@@ -0,0 +1,41 @@
+chr1 468 469 C C C N M N N K . N C U N H N G N N M N S
+chr1 582 583 G G G N G R N R N - N M N V K N N N G C R
+chr1 602 603 G G G N G N Y N R G G N N U N T N A K N R
+chr1 6119 6120 G G M N S N N W B N S D N N H V N B W N N
+chr1 6357 6358 G G N M K N G - N N G U N N N B N N K N S
+chr1 6433 6434 G G N R N N C N N N . N N . N N N N N R N
+chr1 39160 39161 T T T N N Y N - N N N N N N N V N N N N Y
+chr1 41920 41921 G C G N M C G N A N G N K N W S N N N V N
+chr1 42100 42101 T T T Y R W N N N V N M R N N G N M Y N K
+chr1 45026 45027 C A C N N Y N S Y N N X N A D N N K N N A
+chr1 45161 45162 C T C . N X H V N N C R N Y N N N N R N Y
+chr2 45407 45408 C N C S B N N N N N C N Y N N T K G N C N
+chr2 45788 45789 T T T N W S N Y N R Y N S N W M N C T N C
+chr2 46243 46244 T T T N W N N B V N U N T N N Y C N U N N
+chr2 48073 48074 A G A Y W . N K N N N G N N N G N N N Y N
+chr2 48633 48634 T T T N G N N N . N N N N S N Y N . N N N
+chr2 51324 51325 T T N R N N N N N - N U N W A N N N N N N
+chr2 52065 52066 T C T N N N S N . N T N M N S W N T Y C N
+chr2 53130 53131 T C T K R . N B N N T N N M N Y N N Y N N
+chr2 53559 53560 T T T N N V R V N N T N U N N B N M N V Y
+chr2 55607 55608 A N A U S N N H R K N N N Y N N G N N N N
+chr10 55659 55660 T N T C N K N N N U N S N N N V C R S N N
+chr10 55734 55735 T N T G N C N M M G C N B N . N G N N N N
+chr10 55870 55871 C G C N H G - N N N C N H K N M G N N N N
+chr10 56100 56101 T T A W N N W N S N K M N R N R N R N G N
+chr10 56120 56121 A - A N A N N Y N N N W V N N Y G N N W N
+chr10 56174 56175 A T A Y A N N N N N N N N N . S T Y N B N
+chr10 59373 59374 A G A N N N N N N T N S N N N G N N N V N
+chr10 68912 68913 G T G R N B R N H N U W Y N N N N N N N T
+chr10 72946 72947 T A N N N N N N B N N . B D W U N U N D A
+chr10 77052 77053 G A R N G N N Y N N N N N N B R N W N N R
+chr18 78200 78201 G G G N N H N N V N G N N N N A A N K X N
+chr18 81076 81077 T A T B N N G N N X W N X N V N N D N N N
+chr18 81198 81199 A T A N N N N - N N X N K T N M N K X N W
+chr18 81216 81217 G A G Y N N D N X N N N N A N S N N N D N
+chr18 81398 81399 G T G N - W N N M N G C N K N S N N N N K
+chr18 93895 93896 T T T H N N V W Y N N N - N N N N N N Y N
+chr18 98172 98173 T T T N . N N N S N T N Y N N Y X D V N Y
+chr18 110904 110905 T - A A N A N A W A N N A X N W N N N N N
+chr18 140324 140325 A A A N M N N Y N S N V N N X N C N N . M
+chr18 160592 160593 C G G G N G N G N G N N G N N M T N Y N N
diff -r dedb7be9aa44 -r b36c13131ac7 test-data/dna_filter_out4.bed
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/dna_filter_out4.bed Mon Feb 08 23:52:56 2010 -0500
@@ -0,0 +1,24 @@
+chr1 582 583 G G G N G R N R N - N M N V K N N N G C R
+chr1 602 603 G G G N G N Y N R G G N N U N T N A K N R
+chr1 6119 6120 G G M N S N N W B N S D N N H V N B W N N
+chr1 6433 6434 G G N R N N C N N N . N N . N N N N N R N
+chr1 41920 41921 G C G N M C G N A N G N K N W S N N N V N
+chr1 45161 45162 C T C . N X H V N N C R N Y N N N N R N Y
+chr2 45788 45789 T T T N W S N Y N R Y N S N W M N C T N C
+chr2 46243 46244 T T T N W N N B V N U N T N N Y C N U N N
+chr2 48633 48634 T T T N G N N N . N N N N S N Y N . N N N
+chr2 51304 51305 A G N N C N W - N S Y N . N N G N N N W R
+chr2 51324 51325 T T N R N N N N N - N U N W A N N N N N N
+chr2 52065 52066 T C T N N N S N . N T N M N S W N T Y C N
+chr2 53559 53560 T T T N N V R V N N T N U N N B N M N V Y
+chr10 55734 55735 T N T G N C N M M G C N B N . N G N N N N
+chr10 55870 55871 C G C N H G - N N N C N H K N M G N N N N
+chr10 56120 56121 A - A N A N N Y N N N W V N N Y G N N W N
+chr10 59373 59374 A G A N N N N N N T N S N N N G N N N V N
+chr10 72946 72947 T A N N N N N N B N N . B D W U N U N D A
+chr10 77052 77053 G A R N G N N Y N N N N N N B R N W N N R
+chr18 81198 81199 A T A N N N N - N N X N K T N M N K X N W
+chr18 98172 98173 T T T N . N N N S N T N Y N N Y X D V N Y
+chr18 110904 110905 T - A A N A N A W A N N A X N W N N N N N
+chr18 140324 140325 A A A N M N N Y N S N V N N X N C N N . M
+chr18 160592 160593 C G G G N G N G N G N N G N N M T N Y N N
diff -r dedb7be9aa44 -r b36c13131ac7 tool_conf.xml.sample
--- a/tool_conf.xml.sample Mon Feb 08 21:33:12 2010 -0500
+++ b/tool_conf.xml.sample Mon Feb 08 23:52:56 2010 -0500
@@ -49,6 +49,7 @@
<tool file="filters/headWrapper.xml" />
<tool file="filters/tailWrapper.xml" />
<tool file="filters/trimmer.xml" />
+ <tool file="stats/dna_filtering.xml" />
</section>
<section name="Filter and Sort" id="filter">
<tool file="stats/filtering.xml" />
diff -r dedb7be9aa44 -r b36c13131ac7 tools/stats/dna_filtering.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/stats/dna_filtering.py Mon Feb 08 23:52:56 2010 -0500
@@ -0,0 +1,195 @@
+#!/usr/bin/env python
+
+"""
+This tool takes a tab-delimited text file as input and creates filters on columns based on certain properties. The tool will skip over invalid lines within the file, informing the user about the number of lines skipped.
+
+usage: %prog [options]
+    -i, --input=i: tabular input file
+    -o, --output=o: filtered output file
+    -c, --cond=c: conditions to filter on
+    -n, --n_handling=n: how to handle N and X
+    -l, --columns=l: columns
+    -t, --col_types=t: column types
+
+"""
+
+#from __future__ import division
+import os.path, re, string, sys
+from galaxy import eggs
+import pkg_resources; pkg_resources.require( "bx-python" )
+from bx.cookbook import doc_optparse
+
+# Older py compatibility
+try:
+    set()
+except:
+    from sets import Set as set
+
+#assert sys.version_info[:2] >= ( 2, 4 )
+
+def get_operands( filter_condition ):
+    # Note that the order of all_operators is important
+    items_to_strip = [ '==', '!=', ' and ', ' or ' ]
+    for item in items_to_strip:
+        if filter_condition.find( item ) >= 0:
+            filter_condition = filter_condition.replace( item, ' ' )
+    operands = set( filter_condition.split( ' ' ) )
+    return operands
+
+def stop_err( msg ):
+    sys.stderr.write( msg )
+    sys.exit()
+
+def __main__():
+    #Parse Command Line
+    options, args = doc_optparse.parse( __doc__ )
+    input = options.input
+    output = options.output
+    cond = options.cond
+    n_handling = options.n_handling
+    columns = options.columns
+    col_types = options.col_types
+
+    try:
+        in_columns = int( columns )
+        assert col_types #check to see that the column types variable isn't null
+        in_column_types = col_types.split( ',' )
+    except:
+        stop_err( "Data does not appear to be tabular. This tool can only be used with tab-delimited data." )
+
+    # Unescape if input has been escaped
+    cond_text = cond.replace( '__eq__', '==' ).replace( '__ne__', '!=' ).replace( '__sq__', "'" )
+    orig_cond_text = cond_text
+    # Expand to allow for DNA codes
+    dot_letters = [ letter for letter in string.uppercase if letter not in \
+                    [ 'A', 'T', 'U', 'G', 'C', 'K', 'M', 'R', 'Y', 'S', 'W', 'B', 'V', 'H', 'D', 'N', 'X' ] ]
+    codes = {'A': [ 'A', 'M', 'R', 'W', 'V', 'H', 'D' ],
+             'T': [ 'T', 'U', 'K', 'Y', 'W', 'B', 'H', 'D' ],
+             'G': [ 'G', 'K', 'R', 'S', 'B', 'V', 'D' ],
+             'C': [ 'C', 'M', 'Y', 'S', 'B', 'V', 'H' ],
+             'U': [ 'T', 'U', 'K', 'Y', 'W', 'B', 'H', 'D' ],
+             'K': [ 'K', 'G', 'T' ],
+             'M': [ 'M', 'A', 'C' ],
+             'R': [ 'R', 'A', 'G' ],
+             'Y': [ 'Y', 'C', 'T' ],
+             'S': [ 'S', 'C', 'G' ],
+             'W': [ 'W', 'A', 'T' ],
+             'B': [ 'B', 'C', 'G', 'T' ],
+             'V': [ 'V', 'A', 'C', 'G' ],
+             'H': [ 'H', 'A', 'C', 'T' ],
+             'D': [ 'D', 'A', 'G', 'T' ],
+             '.': dot_letters,
+             '-': [ '-' ]}
+    # Add handling for N and X
+    if n_handling == "all":
+        codes[ 'N' ] = [ 'G', 'A', 'T', 'C', 'U', 'K', 'M', 'R', 'Y', 'S', 'W', 'B', 'V', 'H', 'D', 'N', 'X' ]
+        codes[ 'X' ] = [ 'G', 'A', 'T', 'C', 'U', 'K', 'M', 'R', 'Y', 'S', 'W', 'B', 'V', 'H', 'D', 'N', 'X' ]
+        for code in codes.keys():
+            if code != '.' and code != '-':
+                codes[code].append( 'N' )
+                codes[code].append( 'X' )
+    else:
+        codes[ 'N' ] = dot_letters
+        codes[ 'X' ] = dot_letters
+    # Expand conditions to allow for DNA codes
+    try:
+        match_replace = {}
+        pat = re.compile( "c\d+\s*[!=]=\s*[\w']+" )
+        matches = pat.findall( cond_text )
+        for match in matches:
+            if match.find( '==' ) > 0:
+                match_parts = match.split( '==' )
+                new_match = '(%s in codes[%s] and %s in codes[%s])' % ( match_parts[0], match_parts[1], match_parts[1], match_parts[0] )
+            elif match.find( '!=' ) > 0 :
+                match_parts = match.split( '!=' )
+                new_match = '(%s not in codes[%s] or %s not in codes[%s])' % ( match_parts[0], match_parts[1], match_parts[1], match_parts[0] )
+            else:
+                raise Exception
+            if match_parts[1].find( "'" ) >= 0:
+                assert match_parts[1].replace( "'", '' ) in [ 'G', 'A', 'T', 'C', 'U', 'K', 'M', 'R', 'Y', 'S', 'W', 'B', 'V', 'H', 'D', 'N', 'X', '-', '.' ]
+            else:
+                assert match_parts[1].startswith( 'c' )
+            match_replace[match] = new_match
+        for match in match_replace.keys():
+            cond_text = cond_text.replace(match, match_replace[match])
+        if len( match_replace ) == 0:
+            raise Exception
+    except:
+        stop_err( "One of your conditions is invalid. Make sure to use only '!=' or '==', valid column numbers, and valid base values." )
+
+    # Attempt to determine if the condition includes executable stuff and, if so, exit
+    secured = dir()
+    operands = get_operands( cond_text )
+    for operand in operands:
+        try:
+            check = int( operand )
+        except:
+            if operand in secured:
+                stop_err( "Illegal value '%s' in condition '%s'" % ( operand, cond_text ) )
+
+    # Prepare the column variable names and wrappers for column data types
+    cols, type_casts = [], []
+    for col in range( 1, in_columns + 1 ):
+        col_name = "c%d" % col
+        cols.append( col_name )
+        col_type = in_column_types[ col - 1 ]
+        type_cast = "%s(%s)" % ( col_type, col_name )
+        type_casts.append( type_cast )
+
+    col_str = ', '.join( cols ) # 'c1, c2, c3, c4'
+    type_cast_str = ', '.join( type_casts ) # 'str(c1), int(c2), int(c3), str(c4)'
+    assign = "%s = line.split( '\\t' )" % col_str
+    wrap = "%s = %s" % ( col_str, type_cast_str )
+    skipped_lines = 0
+    first_invalid_line = 0
+    invalid_line = None
+    lines_kept = 0
+    total_lines = 0
+    out = open( output, 'wt' )
+    # Read and filter input file, skipping invalid lines
+    code = '''
+for i, line in enumerate( file( input ) ):
+    total_lines += 1
+    line = line.rstrip( '\\r\\n' )
+    if not line or line.startswith( '#' ):
+        skipped_lines += 1
+        if not invalid_line:
+            first_invalid_line = i + 1
+            invalid_line = line
+        continue
+    try:
+        %s = line.split( '\\t' )
+        %s = %s
+        if %s:
+            lines_kept += 1
+            print >> out, line
+    except Exception, e:
+        skipped_lines += 1
+        if not invalid_line:
+            first_invalid_line = i + 1
+            invalid_line = line
+''' % ( col_str, col_str, type_cast_str, cond_text )
+
+    valid_filter = True
+    try:
+        exec code
+    except Exception, e:
+        out.close()
+        if str( e ).startswith( 'invalid syntax' ):
+            valid_filter = False
+            stop_err( 'Filter condition "%s" likely invalid. See tool tips, syntax and examples.' % orig_cond_text )
+        else:
+            stop_err( str( e ) )
+
+    if valid_filter:
+        out.close()
+        valid_lines = total_lines - skipped_lines
+        print 'Filtering with %s, ' % orig_cond_text
+        if valid_lines > 0:
+            print 'kept %4.2f%% of %d lines.' % ( 100.0*lines_kept/valid_lines, total_lines )
+        else:
+            print 'Possible invalid filter condition "%s" or non-existent column referenced. See tool tips, syntax and examples.' % orig_cond_text
+        if skipped_lines > 0:
+            print 'Skipped %d invalid lines starting at line #%d: "%s"' % ( skipped_lines, first_invalid_line, invalid_line )
+
+if __name__ == "__main__" : __main__()
diff -r dedb7be9aa44 -r b36c13131ac7 tools/stats/dna_filtering.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/stats/dna_filtering.xml Mon Feb 08 23:52:56 2010 -0500
@@ -0,0 +1,114 @@
+<tool id="dna_filter" name="DNA Filter" version="1.0.0">
+ <description>filter column data on DNA ambiguity codes using simple expressions</description>
+ <command interpreter="python">
+ dna_filtering.py
+ --input=$input
+ --output=$out_file1
+ --cond="$cond"
+ --n_handling=$n_handling
+ --columns=${input.metadata.columns}
+ --col_types="${input.metadata.column_types}"
+ </command>
+ <inputs>
+ <param format="tabular" name="input" type="data" label="Filter" help="Query missing? See TIP below."/>
+ <param name="cond" size="40" type="text" value="c8=='G'" label="With following condition" help="Double equal signs, ==, must be used as shown above. To filter for an arbitrary string, use the Select tool.">
+ <validator type="empty_field" message="Enter a valid filtering condition, see syntax and examples below."/>
+ </param>
+ <param name="n_handling" type="select" label="Do you want N (and X) to match A or C or G or T OR nothing?">
+ <option value="all">N = A or C or G or T</option>
+ <option value="none">N = nothing</option>
+ </param>
+ </inputs>
+ <outputs>
+ <data format="input" name="out_file1" metadata_source="input"/>
+ </outputs>
+ <tests>
+ <test>
+ <param name="input" value="dna_filter_in1.bed" />
+ <param name="cond" value="c8=='G'" />
+ <param name="n_handling" value="all" />
+ <output name="out_file1" file="dna_filter_out1.bed" />
+ </test>
+ <test>
+ <param name="input" value="dna_filter_in1.bed" />
+ <param name="cond" value="(c10==c11 or c17==c18) and c6!='C' and c23=='R'" />
+ <param name="n_handling" value="all" />
+ <output name="out_file1" file="dna_filter_out2.bed" />
+ </test>
+ <test>
+ <param name="input" value="dna_filter_in1.bed" />
+ <param name="cond" value="c4=='B' or c9==c10" />
+ <param name="n_handling" value="none" />
+ <output name="out_file1" file="dna_filter_out3.bed" />
+ </test>
+ <test>
+ <param name="input" value="dna_filter_in1.bed" />
+ <param name="cond" value="c7!='Y' and c9!='U'" />
+ <param name="n_handling" value="none" />
+ <output name="out_file1" file="dna_filter_out4.bed" />
+ </test>
+ </tests>
+ <help>
+
+.. class:: warningmark
+
+Double equal signs, ==, must be used as *"equal to"* (e.g., **c1 == 'G'**)
+
+.. class:: infomark
+
+**TIP:** If your data is not TAB delimited, use *Text Manipulation->Convert*
+
+.. class:: infomark
+
+**TIP:** This tool is intended primarily for comparing column values (such as "c5==c12"), although it is also possible to filter on specific values (like "c6!='G'"). Be aware that when searching for specific values, any possible match is considered. So if you search on "c6!='G'", rows will be excluded when c6 is G, K, R, S, B, V, or D (plus N or X if you set that to equal "all"), because it is possible those values could be G.
+
+-----
+
+**Syntax**
+
+The filter tool allows you to restrict the dataset using simple conditional statements.
+
+- Columns are referenced with **c** and a **number**. For example, **c1** refers to the first column of a tab-delimited file
+- Make sure that multi-character operators contain no white space ( e.g., **!=** is valid while **! =** is not valid )
+- When using the 'equal-to' operator, a **double equal sign '==' must be used** ( e.g., **c1=='chr1'** )
+- Non-numerical values must be included in single or double quotes ( e.g., **c6=='C'** )
+- Filtering condition can include logical operators, but **make sure operators are all lower case** ( e.g., **(c1!='chrX' and c1!='chrY') or c6=='+'** )
+
+-----
+
+**DNA Codes**
+
+The following are the DNA codes used for filtering::
+
+    Code    Meaning
+    ----    ---------------------------
+    A       A
+    T       T
+    U       T
+    G       G
+    C       C
+    K       G or T
+    M       A or C
+    R       A or G
+    Y       C or T
+    S       C or G
+    W       A or T
+    B       C, G or T
+    V       A, C or G
+    H       A, C or T
+    D       A, G or T
+    X       A, C, G or T
+    N       A, C, G or T
+    .       not (A, C, G or T)
+    -       gap of indeterminate length
+
+-----
+
+**Example**
+
+- **c8=='A'** selects lines in which the eighth column is A, M, R, W, V, H, or D (and N or X if appropriate)
+- **c12==c15** selects lines where the value in the twelfth column could be the same as the fifteenth and the fifteenth column could be the same as the twelfth column (based on appropriate codes)
+- **c9!=c19** selects lines where column nine could not be the same as column nineteen and column nineteen could not be the same as column nine (using appropriate codes)
+
+</help>
+</tool>
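The matching rule described in the DNA Filter help text above can be sketched in a few lines. This is an illustration under stated assumptions, not the committed code: the table is copied from dna_filtering.py but leaves out the '.' entry and the optional N/X handling, could_match and expand_condition are hypothetical helper names, and the rewrite uses re.sub with a callback where the tool uses findall followed by string replacement.

```python
import re

# Lookup table copied from dna_filtering.py ('.' and N/X handling omitted).
CODES = {
    'A': ['A', 'M', 'R', 'W', 'V', 'H', 'D'],
    'T': ['T', 'U', 'K', 'Y', 'W', 'B', 'H', 'D'],
    'G': ['G', 'K', 'R', 'S', 'B', 'V', 'D'],
    'C': ['C', 'M', 'Y', 'S', 'B', 'V', 'H'],
    'U': ['T', 'U', 'K', 'Y', 'W', 'B', 'H', 'D'],
    'K': ['K', 'G', 'T'],
    'M': ['M', 'A', 'C'],
    'R': ['R', 'A', 'G'],
    'Y': ['Y', 'C', 'T'],
    'S': ['S', 'C', 'G'],
    'W': ['W', 'A', 'T'],
    'B': ['B', 'C', 'G', 'T'],
    'V': ['V', 'A', 'C', 'G'],
    'H': ['H', 'A', 'C', 'T'],
    'D': ['D', 'A', 'G', 'T'],
    '-': ['-'],
}

def could_match(left, right, codes=CODES):
    """Hypothetical helper: the symmetric test the tool generates for '=='."""
    # Unknown codes simply fail to match here; the tool instead skips the line.
    return left in codes.get(right, []) and right in codes.get(left, [])

# Same pattern the tool uses to find simple comparisons such as c8=='G'.
COMPARISON = re.compile(r"c\d+\s*[!=]=\s*[\w']+")

def expand_condition(cond):
    """Hypothetical helper: rewrite each comparison into lookups against `codes`."""
    def rewrite(match):
        text = match.group(0)
        if '==' in text:
            left, right = [part.strip() for part in text.split('==')]
            return '(%s in codes[%s] and %s in codes[%s])' % (left, right, right, left)
        left, right = [part.strip() for part in text.split('!=')]
        return '(%s not in codes[%s] or %s not in codes[%s])' % (left, right, right, left)
    return COMPARISON.sub(rewrite, cond)

print(could_match('G', 'R'))
print(expand_condition("c8=='G'"))
```

With this table, two ambiguity codes that share a base do not necessarily match each other: could_match('K', 'R') is false because R is not listed under K, even though both could stand for G.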
details: http://www.bx.psu.edu/hg/galaxy/rev/dedb7be9aa44
changeset: 3357:dedb7be9aa44
user: Dan Blankenberg <dan(a)bx.psu.edu>
date: Mon Feb 08 21:33:12 2010 -0500
description:
Change naming of converter to conversion for 3356:c64ef44ed4c5 to more properly reflect the function.
diffstat:
lib/galaxy/tools/__init__.py | 18 ++++++++--------
lib/galaxy/tools/actions/__init__.py | 40 ++++++++++++++++++------------------
lib/galaxy/tools/parameters/basic.py | 8 +++---
3 files changed, 33 insertions(+), 33 deletions(-)
diffs (128 lines):
diff -r c64ef44ed4c5 -r dedb7be9aa44 lib/galaxy/tools/__init__.py
--- a/lib/galaxy/tools/__init__.py Mon Feb 08 12:45:28 2010 -0500
+++ b/lib/galaxy/tools/__init__.py Mon Feb 08 21:33:12 2010 -0500
@@ -1198,27 +1198,27 @@
current = values["__current_case__"]
wrap_values( input.cases[current].inputs, values )
elif isinstance( input, DataToolParameter ):
- ##FIXME: We're populating param_dict with converters when wrapping values,
+ ##FIXME: We're populating param_dict with conversions when wrapping values,
##this should happen as a separate step before wrapping (or call this wrapping step something more generic)
##(but iterating this same list twice would be wasteful)
- #add explicit converters by name to current parent
- for converter_name, converter_extensions, converter_datatypes in input.converters:
+ #add explicit conversions by name to current parent
+ for conversion_name, conversion_extensions, conversion_datatypes in input.conversions:
#if we are at building cmdline step, then converters have already executed
- conv_ext, converted_dataset = input_values[ input.name ].find_conversion_destination( converter_datatypes )
+ conv_ext, converted_dataset = input_values[ input.name ].find_conversion_destination( conversion_datatypes )
#when dealing with optional inputs, we'll provide a valid extension to be used for None converted dataset
if not conv_ext:
- conv_ext = converter_extensions[0]
+ conv_ext = conversion_extensions[0]
#input_values[ input.name ] is None when optional dataset,
#'conversion' of optional dataset should create wrapper around NoneDataset for converter output
if input_values[ input.name ] and not converted_dataset:
#input that converter is based from has a value, but converted dataset does not exist
- raise Exception, 'A path for explicit datatype conversion has not been found: %s --/--> %s' % ( input_values[ input.name ].extension, converter_extensions )
+ raise Exception, 'A path for explicit datatype conversion has not been found: %s --/--> %s' % ( input_values[ input.name ].extension, conversion_extensions )
else:
- input_values[ converter_name ] = \
+ input_values[ conversion_name ] = \
DatasetFilenameWrapper( converted_dataset,
datatypes_registry = self.app.datatypes_registry,
- tool = Bunch( converter_name = Bunch( extensions = conv_ext ) ), #trick wrapper into using target conv ext (when None) without actually being a tool parameter
- name = converter_name )
+ tool = Bunch( conversion_name = Bunch( extensions = conv_ext ) ), #trick wrapper into using target conv ext (when None) without actually being a tool parameter
+ name = conversion_name )
#wrap actual input dataset
input_values[ input.name ] = \
DatasetFilenameWrapper( input_values[ input.name ],
diff -r c64ef44ed4c5 -r dedb7be9aa44 lib/galaxy/tools/actions/__init__.py
--- a/lib/galaxy/tools/actions/__init__.py Mon Feb 08 12:45:28 2010 -0500
+++ b/lib/galaxy/tools/actions/__init__.py Mon Feb 08 21:33:12 2010 -0500
@@ -63,41 +63,41 @@
# are stored as name1, name2, ...
for i, v in enumerate( value ):
input_datasets[ prefix + input.name + str( i + 1 ) ] = process_dataset( v )
- converters = []
- for converter_name, converter_extensions, converter_datatypes in input.converters:
- new_data = process_dataset( input_datasets[ prefix + input.name + str( i + 1 ) ], converter_datatypes )
- if not new_data or isinstance( new_data.datatype, converter_datatypes ):
- input_datasets[ prefix + converter_name + str( i + 1 ) ] = new_data
- converters.append( ( converter_name, new_data ) )
+ conversions = []
+ for conversion_name, conversion_extensions, conversion_datatypes in input.conversions:
+ new_data = process_dataset( input_datasets[ prefix + input.name + str( i + 1 ) ], conversion_datatypes )
+ if not new_data or isinstance( new_data.datatype, conversion_datatypes ):
+ input_datasets[ prefix + conversion_name + str( i + 1 ) ] = new_data
+ conversions.append( ( conversion_name, new_data ) )
else:
- raise Exception, 'A path for explicit datatype conversion has not been found: %s --/--> %s' % ( input_datasets[ prefix + input.name + str( i + 1 ) ].extension, converter_extensions )
+ raise Exception, 'A path for explicit datatype conversion has not been found: %s --/--> %s' % ( input_datasets[ prefix + input.name + str( i + 1 ) ].extension, conversion_extensions )
if parent:
parent[input.name] = input_datasets[ prefix + input.name + str( i + 1 ) ]
- for converter_name, converter_data in converters:
+ for conversion_name, conversion_data in conversions:
#allow explicit conversion to be stored in job_parameter table
- parent[ converter_name ] = converter_data.id #a more robust way to determine JSONable value is desired
+ parent[ conversion_name ] = conversion_data.id #a more robust way to determine JSONable value is desired
else:
param_values[input.name][i] = input_datasets[ prefix + input.name + str( i + 1 ) ]
- for converter_name, converter_data in converters:
+ for conversion_name, conversion_data in conversions:
#allow explicit conversion to be stored in job_parameter table
- param_values[ converter_name ][i] = converter_data.id #a more robust way to determine JSONable value is desired
+ param_values[ conversion_name ][i] = conversion_data.id #a more robust way to determine JSONable value is desired
else:
input_datasets[ prefix + input.name ] = process_dataset( value )
- converters = []
- for converter_name, converter_extensions, converter_datatypes in input.converters:
- new_data = process_dataset( input_datasets[ prefix + input.name ], converter_datatypes )
- if not new_data or isinstance( new_data.datatype, converter_datatypes ):
- input_datasets[ prefix + converter_name ] = new_data
- converters.append( ( converter_name, new_data ) )
+ conversions = []
+ for conversion_name, conversion_extensions, conversion_datatypes in input.conversions:
+ new_data = process_dataset( input_datasets[ prefix + input.name ], conversion_datatypes )
+ if not new_data or isinstance( new_data.datatype, conversion_datatypes ):
+ input_datasets[ prefix + conversion_name ] = new_data
+ conversions.append( ( conversion_name, new_data ) )
else:
- raise Exception, 'A path for explicit datatype conversion has not been found: %s --/--> %s' % ( input_datasets[ prefix + input.name ].extension, converter_extensions )
+ raise Exception, 'A path for explicit datatype conversion has not been found: %s --/--> %s' % ( input_datasets[ prefix + input.name ].extension, conversion_extensions )
target_dict = parent
if not target_dict:
target_dict = param_values
target_dict[ input.name ] = input_datasets[ prefix + input.name ]
- for converter_name, converter_data in converters:
+ for conversion_name, conversion_data in conversions:
#allow explicit conversion to be stored in job_parameter table
- target_dict[ converter_name ] = converter_data.id #a more robust way to determine JSONable value is desired
+ target_dict[ conversion_name ] = conversion_data.id #a more robust way to determine JSONable value is desired
tool.visit_inputs( param_values, visitor )
return input_datasets
diff -r c64ef44ed4c5 -r dedb7be9aa44 lib/galaxy/tools/parameters/basic.py
--- a/lib/galaxy/tools/parameters/basic.py Mon Feb 08 12:45:28 2010 -0500
+++ b/lib/galaxy/tools/parameters/basic.py Mon Feb 08 21:33:12 2010 -0500
@@ -1168,15 +1168,15 @@
else:
self.options = dynamic_options.DynamicOptions( options, self )
self.is_dynamic = self.options is not None
- # Load converters required for the dataset input
- self.converters = []
- for conv_elem in elem.findall( "converter" ):
+ # Load conversions required for the dataset input
+ self.conversions = []
+ for conv_elem in elem.findall( "conversion" ):
name = conv_elem.get( "name" ) #name for commandline substitution
conv_extensions = conv_elem.get( "type" ) #target datatype extension
# FIXME: conv_extensions should be able to be an ordered list
assert None not in [ name, type ], 'A name (%s) and type (%s) are required for explicit conversion' % ( name, type )
conv_types = tool.app.datatypes_registry.get_datatype_by_extension( conv_extensions.lower() ).__class__
- self.converters.append( ( name, conv_extensions, conv_types ) )
+ self.conversions.append( ( name, conv_extensions, conv_types ) )
def get_html_field( self, trans=None, value=None, other_values={} ):
filter_value = None
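A side note on the assertion carried through the hunk above: it checks `[ name, type ]`, while the value read from the element is stored in `conv_extensions`. Assuming no local variable named `type` is in scope at that point (the hunk does not show one), `type` resolves to the Python builtin, which is never `None`, so a missing `type` attribute would not trip the assertion. The sketch below is standalone and not Galaxy code; both function names are hypothetical and exist only to contrast the two checks.

```python
def passes_as_written(name, conv_extensions):
    # Mirrors the check in the diff: `type` here is the builtin,
    # so only a missing name can make this return False.
    return None not in [name, type]


def passes_with_local_names(name, conv_extensions):
    # Checks the two values actually read from the XML element.
    return None not in [name, conv_extensions]


# A converter element with a name but no type attribute:
print(passes_as_written("input1_as_a_bed_file", None))
print(passes_with_local_names("input1_as_a_bed_file", None))
```

If that assumption holds, a missing `type` would instead surface on the next line as an `AttributeError` from `conv_extensions.lower()`.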
details: http://www.bx.psu.edu/hg/galaxy/rev/c64ef44ed4c5
changeset: 3356:c64ef44ed4c5
user: Dan Blankenberg <dan(a)bx.psu.edu>
date: Mon Feb 08 12:45:28 2010 -0500
description:
First pass at allowing explicit datatype conversion to be specified. This can be used, e.g., to provide a tool with a non-metadata-based index file.
Defined as part of a DataToolParameter:
<param name="input1" type="interval" label="An Interval File">
<converter name='input1_as_a_bed_file' type='bed'/>
</param>
Both the original input as well as the converted dataset can be accessed like:
<command>some_binary $input1 $input1_as_a_bed_file </command>
If $input1 is already BED, it will be used for input1_as_a_bed_file.
The name is placed in the dictionary space of the data input parameter's parent; so for Grouping objects, e.g. a repeat:
<repeat name="queries" title="Query">
<param name="input2" type="data" label="Select" >
<converter name='input2_as_a_bed_file' type='bed'/>
</param>
</repeat>
the converted dataset is accessed like this (passing both the original and the converted dataset as arguments on the command line):
<command> ...
#for $q in $queries
${q.input2} ${q.input2_as_a_bed_file}
#end for
... </command>
See notes in code in commit for additional comments.
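As a rough illustration of how a declaration like the one above could be read into (name, extension) pairs, here is a minimal standalone sketch using the standard library's `xml.etree.ElementTree`. It follows the `elem.findall( "converter" )` loop from the basic.py diff in this changeset, but it is not Galaxy code: `load_converters` is a hypothetical helper, and the datatypes registry lookup that Galaxy performs is left out.

```python
import xml.etree.ElementTree as ET

# The <param> declaration from the commit description.
PARAM_XML = """
<param name="input1" type="interval" label="An Interval File">
    <converter name='input1_as_a_bed_file' type='bed'/>
</param>
"""


def load_converters(param_elem):
    """Collect (name, target_extension) tuples from a <param> element."""
    converters = []
    for conv_elem in param_elem.findall("converter"):
        name = conv_elem.get("name")             # name for command-line substitution
        conv_extension = conv_elem.get("type")   # target datatype extension
        if name is None or conv_extension is None:
            raise ValueError(
                "A name (%s) and type (%s) are required for explicit conversion"
                % (name, conv_extension)
            )
        converters.append((name, conv_extension))
    return converters


converters = load_converters(ET.fromstring(PARAM_XML))
print(converters)
```

Each resulting name (here `input1_as_a_bed_file`) is what the `<command>` template would reference alongside the original `$input1`.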
diffstat:
lib/galaxy/tools/__init__.py | 28 ++++++++++++++++++++++++
lib/galaxy/tools/actions/__init__.py | 41 +++++++++++++++++++++++++++++------
lib/galaxy/tools/parameters/basic.py | 13 +++++++++-
3 files changed, 73 insertions(+), 9 deletions(-)
diffs (147 lines):
diff -r 26f01eafc6bd -r c64ef44ed4c5 lib/galaxy/tools/__init__.py
--- a/lib/galaxy/tools/__init__.py Mon Feb 08 11:20:44 2010 -0500
+++ b/lib/galaxy/tools/__init__.py Mon Feb 08 12:45:28 2010 -0500
@@ -1198,6 +1198,28 @@
current = values["__current_case__"]
wrap_values( input.cases[current].inputs, values )
elif isinstance( input, DataToolParameter ):
+ ##FIXME: We're populating param_dict with converters when wrapping values,
+ ##this should happen as a separate step before wrapping (or call this wrapping step something more generic)
+ ##(but iterating this same list twice would be wasteful)
+ #add explicit converters by name to current parent
+ for converter_name, converter_extensions, converter_datatypes in input.converters:
+ #if we are at building cmdline step, then converters have already executed
+ conv_ext, converted_dataset = input_values[ input.name ].find_conversion_destination( converter_datatypes )
+ #when dealing with optional inputs, we'll provide a valid extension to be used for None converted dataset
+ if not conv_ext:
+ conv_ext = converter_extensions[0]
+ #input_values[ input.name ] is None when optional dataset,
+ #'conversion' of optional dataset should create wrapper around NoneDataset for converter output
+ if input_values[ input.name ] and not converted_dataset:
+ #input that converter is based from has a value, but converted dataset does not exist
+ raise Exception, 'A path for explicit datatype conversion has not been found: %s --/--> %s' % ( input_values[ input.name ].extension, converter_extensions )
+ else:
+ input_values[ converter_name ] = \
+ DatasetFilenameWrapper( converted_dataset,
+ datatypes_registry = self.app.datatypes_registry,
+ tool = Bunch( converter_name = Bunch( extensions = conv_ext ) ), #trick wrapper into using target conv ext (when None) without actually being a tool parameter
+ name = converter_name )
+ #wrap actual input dataset
input_values[ input.name ] = \
DatasetFilenameWrapper( input_values[ input.name ],
datatypes_registry = self.app.datatypes_registry,
@@ -1212,6 +1234,12 @@
# tools (e.g. UCSC) should really be handled in a special way.
if self.check_values:
wrap_values( self.inputs, param_dict )
+ ###FIXME: when self.check_values==True, input datasets are being wrapped twice
+ ### (above and below, creating 2 separate DatasetFilenameWrapper objects - first is overwritten by second),
+ ###is this necessary? - if we get rid of this way to access children, can we stop this redundancy, or is there another reason for this?
+ ###Only necessary when self.check_values is False (==external dataset tool?: can this be abstracted out as part of being a datasouce tool?)
+ ### but we still want (ALWAYS) to wrap input datasets
+ ### (this should be checked to prevent overhead of creating a new object?)
# Additionally, datasets go in the param dict. We wrap them such that
# if the bare variable name is used it returns the filename (for
# backwards compatibility). We also add any child datasets to the
diff -r 26f01eafc6bd -r c64ef44ed4c5 lib/galaxy/tools/actions/__init__.py
--- a/lib/galaxy/tools/actions/__init__.py Mon Feb 08 11:20:44 2010 -0500
+++ b/lib/galaxy/tools/actions/__init__.py Mon Feb 08 12:45:28 2010 -0500
@@ -31,11 +31,13 @@
"""
input_datasets = dict()
def visitor( prefix, input, value, parent = None ):
- def process_dataset( data ):
- if data and not isinstance( data.datatype, input.formats ):
+ def process_dataset( data, formats = None ):
+ if formats is None:
+ formats = input.formats
+ if data and not isinstance( data.datatype, formats ):
# Need to refresh in case this conversion just took place, i.e. input above in tool performed the same conversion
trans.sa_session.refresh( data )
- target_ext, converted_dataset = data.find_conversion_destination( input.formats, converter_safe = input.converter_safe( param_values, trans ) )
+ target_ext, converted_dataset = data.find_conversion_destination( formats, converter_safe = input.converter_safe( param_values, trans ) )
if target_ext:
if converted_dataset:
data = converted_dataset
@@ -61,16 +63,41 @@
# are stored as name1, name2, ...
for i, v in enumerate( value ):
input_datasets[ prefix + input.name + str( i + 1 ) ] = process_dataset( v )
+ converters = []
+ for converter_name, converter_extensions, converter_datatypes in input.converters:
+ new_data = process_dataset( input_datasets[ prefix + input.name + str( i + 1 ) ], converter_datatypes )
+ if not new_data or isinstance( new_data.datatype, converter_datatypes ):
+ input_datasets[ prefix + converter_name + str( i + 1 ) ] = new_data
+ converters.append( ( converter_name, new_data ) )
+ else:
+ raise Exception, 'A path for explicit datatype conversion has not been found: %s --/--> %s' % ( input_datasets[ prefix + input.name + str( i + 1 ) ].extension, converter_extensions )
if parent:
parent[input.name] = input_datasets[ prefix + input.name + str( i + 1 ) ]
+ for converter_name, converter_data in converters:
+ #allow explicit conversion to be stored in job_parameter table
+ parent[ converter_name ] = converter_data.id #a more robust way to determine JSONable value is desired
else:
param_values[input.name][i] = input_datasets[ prefix + input.name + str( i + 1 ) ]
+ for converter_name, converter_data in converters:
+ #allow explicit conversion to be stored in job_parameter table
+ param_values[ converter_name ][i] = converter_data.id #a more robust way to determine JSONable value is desired
else:
input_datasets[ prefix + input.name ] = process_dataset( value )
- if parent:
- parent[input.name] = input_datasets[ prefix + input.name ]
- else:
- param_values[input.name] = input_datasets[ prefix + input.name ]
+ converters = []
+ for converter_name, converter_extensions, converter_datatypes in input.converters:
+ new_data = process_dataset( input_datasets[ prefix + input.name ], converter_datatypes )
+ if not new_data or isinstance( new_data.datatype, converter_datatypes ):
+ input_datasets[ prefix + converter_name ] = new_data
+ converters.append( ( converter_name, new_data ) )
+ else:
+ raise Exception, 'A path for explicit datatype conversion has not been found: %s --/--> %s' % ( input_datasets[ prefix + input.name ].extension, converter_extensions )
+ target_dict = parent
+ if not target_dict:
+ target_dict = param_values
+ target_dict[ input.name ] = input_datasets[ prefix + input.name ]
+ for converter_name, converter_data in converters:
+ #allow explicit conversion to be stored in job_parameter table
+ target_dict[ converter_name ] = converter_data.id #a more robust way to determine JSONable value is desired
tool.visit_inputs( param_values, visitor )
return input_datasets
diff -r 26f01eafc6bd -r c64ef44ed4c5 lib/galaxy/tools/parameters/basic.py
--- a/lib/galaxy/tools/parameters/basic.py Mon Feb 08 11:20:44 2010 -0500
+++ b/lib/galaxy/tools/parameters/basic.py Mon Feb 08 12:45:28 2010 -0500
@@ -50,14 +50,14 @@
def get_html( self, trans=None, value=None, other_values={}):
"""
- Returns the html widget corresponding to the paramter.
+ Returns the html widget corresponding to the parameter.
Optionally attempt to retain the current value specific by 'value'
"""
return self.get_html_field( trans, value, other_values ).get_html()
def from_html( self, value, trans=None, other_values={} ):
"""
- Convert a value from an HTML POST into the parameters prefered value
+ Convert a value from an HTML POST into the parameters preferred value
format.
"""
return value
@@ -1168,6 +1168,15 @@
else:
self.options = dynamic_options.DynamicOptions( options, self )
self.is_dynamic = self.options is not None
+ # Load converters required for the dataset input
+ self.converters = []
+ for conv_elem in elem.findall( "converter" ):
+ name = conv_elem.get( "name" ) #name for commandline substitution
+ conv_extensions = conv_elem.get( "type" ) #target datatype extension
+ # FIXME: conv_extensions should be able to be an ordered list
+ assert None not in [ name, type ], 'A name (%s) and type (%s) are required for explicit conversion' % ( name, type )
+ conv_types = tool.app.datatypes_registry.get_datatype_by_extension( conv_extensions.lower() ).__class__
+ self.converters.append( ( name, conv_extensions, conv_types ) )
def get_html_field( self, trans=None, value=None, other_values={} ):
filter_value = None
details: http://www.bx.psu.edu/hg/galaxy/rev/26f01eafc6bd
changeset: 3355:26f01eafc6bd
user: Dan Blankenberg <dan(a)bx.psu.edu>
date: Mon Feb 08 11:20:44 2010 -0500
description:
On edit attributes page, only sanitize and add annotation when it exists.
Metadata editing was throwing server error when not logged in.
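The guard described above can be sketched in isolation as follows. This is only an illustration of the "sanitize and store only when a value exists" pattern, not the controller code: `sanitize` is a stand-in built on `html.escape` rather than Galaxy's `sanitize_html`, and `update_annotation` and the plain dictionaries are hypothetical.

```python
import html


def sanitize(text):
    # Stand-in for a real HTML sanitizer; this only escapes markup.
    return html.escape(text)


def update_annotation(params, store):
    """Sanitize and store the annotation only when one was submitted."""
    annotation = params.get("annotation")
    if annotation:
        store["annotation"] = sanitize(annotation)
        return True
    # No annotation in the request: leave the store untouched.
    return False


store = {}
update_annotation({}, store)                             # nothing stored
update_annotation({"annotation": "<b>note</b>"}, store)  # stored, escaped
print(store)
```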
diffstat:
lib/galaxy/web/controllers/root.py | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)
diffs (15 lines):
diff -r 4e8785b6815c -r 26f01eafc6bd lib/galaxy/web/controllers/root.py
--- a/lib/galaxy/web/controllers/root.py Mon Feb 08 10:00:30 2010 -0500
+++ b/lib/galaxy/web/controllers/root.py Mon Feb 08 11:20:44 2010 -0500
@@ -307,8 +307,9 @@
setattr( data.metadata, name, spec.unwrap( params.get (name, None) ) )
data.datatype.after_setting_metadata( data )
# Sanitize annotation before adding it.
- annotation = sanitize_html( params.annotation, 'utf-8', 'text/html' )
- self.add_item_annotation( trans, data, annotation )
+ if params.annotation:
+ annotation = sanitize_html( params.annotation, 'utf-8', 'text/html' )
+ self.add_item_annotation( trans, data, annotation )
else:
msg = ' (Metadata could not be changed because this dataset is currently being used as input or output. You must cancel or wait for these jobs to complete before changing metadata.)'
trans.sa_session.flush()