[hg] galaxy 3364: Changed some of the filtering logic for DNA co...
details:   http://www.bx.psu.edu/hg/galaxy/rev/33b475ccc316
changeset: 3364:33b475ccc316
user:      Kelly Vincent <kpvincent@bx.psu.edu>
date:      Wed Feb 10 13:18:45 2010 -0500
description:
Changed some of the filtering logic for DNA code filter tool, and got tests working

diffstat:

 test-data/dna_filter_in1.tabular  |  49 +++++++++++++++++++++
 test-data/dna_filter_out1.tabular |  35 +++++++++++++++
 test-data/dna_filter_out2.tabular |  31 +++++++++++++
 test-data/dna_filter_out3.tabular |  42 ++++++++++++++++++
 test-data/dna_filter_out4.tabular |   6 ++
 tools/stats/dna_filtering.py      |  87 ++++++++++++++++++++++++--------------
 tools/stats/dna_filtering.xml     |  24 +++++---
 7 files changed, 230 insertions(+), 44 deletions(-)

diffs (389 lines):

diff -r 1fc06b260097 -r 33b475ccc316 test-data/dna_filter_in1.tabular
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/dna_filter_in1.tabular Wed Feb 10 13:18:45 2010 -0500
@@ -0,0 +1,49 @@
+chr1 256 257 A N M N - M N U N N A N D N G N N K N N N +
+chr1 468 469 C C C N M N N K . N C U N H N G N N M N S -
+chr1 582 583 G G G N G R N R N - N M N V K N N N G C R +
+chr1 602 603 G G G N G N Y N R G G N N U N T N A K N R +
+chr1 4792 4793 A A M K N W S S N N Y N N N N N M R N R N +
+chr1 6119 6120 G G M N S N N W B N S D N N H V N B W N N +
+chr1 6357 6358 G G N M K N G - N N G U N N N B N N K N S -
+chr1 6433 6434 G G N R N N C N N N . N N . N N N N N R N +
+chr1 39160 39161 T T T N N Y N - N N N N N N N V N N N N Y +
+chr1 41920 41921 G C G N M C G N A N G N K N W S N N N V N +
+chr1 42100 42101 T T T Y R W N N N V N M R N N G N M Y N K -
+chr1 45026 45027 C A C N N Y N S Y N N X N A D N N K N N A +
+chr1 45161 45162 C T C . N X H V N N C R N Y N N N N R N Y +
+chr2 45407 45408 C N C S B N N N N N C N Y N N T K G N C N +
+chr2 45788 45789 T T T N W S N Y N R Y N S N W M N C T N C -
+chr2 46243 46244 T T T N W N N B V N U N T N N Y C N U N N +
+chr2 47814 47815 A C A S N X D N N H W N G N Y C N N M R N +
+chr2 48073 48074 A G A Y W . N K N N N G N N N G N N N Y N +
+chr2 48633 48634 T T T N G N N N . N N N N S N Y N . N N N +
+chr2 51304 51305 A G N N C N W - N S Y N . N N G N N N W R +
+chr2 51324 51325 T T N R N N N N N - N U N W A N N N N N N -
+chr2 52065 52066 T C T N N N S N . N T N M N S W N T Y C N -
+chr2 53130 53131 T C T K R . N B N N T N N M N Y N N Y N N +
+chr2 53505 53506 A A A M N N Y N N N N - K N W N N N S N R +
+chr2 53559 53560 T T T N N V R V N N T N U N N B N M N V Y +
+chr2 55607 55608 A N A U S N N H R K N N N Y N N G N N N N +
+chr10 55659 55660 T N T C N K N N N U N S N N N V C R S N N +
+chr10 55734 55735 T N T G N C N M M G C N B N . N G N N N N +
+chr10 55870 55871 C G C N H G - N N N C N H K N M G N N N N +
+chr10 56024 56025 A T A N D U N Y B N N X N N Y N T N - N N +
+chr10 56100 56101 T T A W N N W N S N K M N R N R N R N G N +
+chr10 56120 56121 A - A N A N N Y N N N W V N N Y G N N W N -
+chr10 56137 56138 A A A N A Y H . Y N G N . D N N T N N N N +
+chr10 56174 56175 A T A Y A N N N N N N N N N . S T Y N B N +
+chr10 59373 59374 A G A N N N N N N T N S N N N G N N N V N +
+chr10 68912 68913 G T G R N B R N H N U W Y N N N N N N N T +
+chr10 72946 72947 T A N N N N N N B N N . B D W U N U N D A -
+chr10 77052 77053 G A R N G N N Y N N N N N N B R N W N N R +
+chr18 78200 78201 G G G N N H N N V N G N N N N A A N K X N +
+chr18 81076 81077 T A T B N N G N N X W N X N V N N D N N N +
+chr18 81198 81199 A T A N N N N - N N X N K T N M N K X N W +
+chr18 81216 81217 G A G Y N N D N X N N N N A N S N N N D N -
+chr18 81398 81399 G T G N - W N N M N G C N K N S N N N N K +
+chr18 91548 91549 A A A S N X H S R N A K N N N N U A R N N +
+chr18 93895 93896 T T T H N N V W Y N N N - N N N N N N Y N +
+chr18 98172 98173 T T T N . N N N S N T N Y N N Y X D V N Y +
+chr18 110904 110905 T - A A N A N A W A N N A X N W N N N N N +
+chr18 140324 140325 A A A N M N N Y N S N V N N X N C N N . M +
+chr18 160592 160593 C G G G N G N G N G N N G N N M T N Y N N -
\ No newline at end of file
diff -r 1fc06b260097 -r 33b475ccc316 test-data/dna_filter_out1.tabular
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/dna_filter_out1.tabular Wed Feb 10 13:18:45 2010 -0500
@@ -0,0 +1,35 @@
+chr1 582 583 G G G N G R N R N - N M N V K N N N G C R +
+chr1 602 603 G G G N G N Y N R G G N N U N T N A K N R +
+chr1 4792 4793 A A M K N W S S N N Y N N N N N M R N R N +
+chr1 6119 6120 G G M N S N N W B N S D N N H V N B W N N +
+chr1 6357 6358 G G N M K N G - N N G U N N N B N N K N S -
+chr1 6433 6434 G G N R N N C N N N . N N . N N N N N R N +
+chr1 39160 39161 T T T N N Y N - N N N N N N N V N N N N Y +
+chr1 42100 42101 T T T Y R W N N N V N M R N N G N M Y N K -
+chr1 45026 45027 C A C N N Y N S Y N N X N A D N N K N N A +
+chr1 45161 45162 C T C . N X H V N N C R N Y N N N N R N Y +
+chr2 45407 45408 C N C S B N N N N N C N Y N N T K G N C N +
+chr2 47814 47815 A C A S N X D N N H W N G N Y C N N M R N +
+chr2 48633 48634 T T T N G N N N . N N N N S N Y N . N N N +
+chr2 51324 51325 T T N R N N N N N - N U N W A N N N N N N -
+chr2 52065 52066 T C T N N N S N . N T N M N S W N T Y C N -
+chr2 53130 53131 T C T K R . N B N N T N N M N Y N N Y N N +
+chr2 53505 53506 A A A M N N Y N N N N - K N W N N N S N R +
+chr2 53559 53560 T T T N N V R V N N T N U N N B N M N V Y +
+chr2 55607 55608 A N A U S N N H R K N N N Y N N G N N N N +
+chr10 55659 55660 T N T C N K N N N U N S N N N V C R S N N +
+chr10 55734 55735 T N T G N C N M M G C N B N . N G N N N N +
+chr10 56024 56025 A T A N D U N Y B N N X N N Y N T N - N N +
+chr10 56100 56101 T T A W N N W N S N K M N R N R N R N G N +
+chr10 59373 59374 A G A N N N N N N T N S N N N G N N N V N +
+chr10 68912 68913 G T G R N B R N H N U W Y N N N N N N N T +
+chr10 72946 72947 T A N N N N N N B N N . B D W U N U N D A -
+chr10 77052 77053 G A R N G N N Y N N N N N N B R N W N N R +
+chr18 78200 78201 G G G N N H N N V N G N N N N A A N K X N +
+chr18 81076 81077 T A T B N N G N N X W N X N V N N D N N N +
+chr18 81198 81199 A T A N N N N - N N X N K T N M N K X N W +
+chr18 81216 81217 G A G Y N N D N X N N N N A N S N N N D N -
+chr18 91548 91549 A A A S N X H S R N A K N N N N U A R N N +
+chr18 93895 93896 T T T H N N V W Y N N N - N N N N N N Y N +
+chr18 110904 110905 T - A A N A N A W A N N A X N W N N N N N +
+chr18 160592 160593 C G G G N G N G N G N N G N N M T N Y N N -
\ No newline at end of file
diff -r 1fc06b260097 -r 33b475ccc316 test-data/dna_filter_out2.tabular
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/dna_filter_out2.tabular Wed Feb 10 13:18:45 2010 -0500
@@ -0,0 +1,31 @@
+chr1 602 603 G G G N G N Y N R G G N N U N T N A K N R +
+chr1 39160 39161 T T T N N Y N - N N N N N N N V N N N N Y +
+chr1 41920 41921 G C G N M C G N A N G N K N W S N N N V N +
+chr1 42100 42101 T T T Y R W N N N V N M R N N G N M Y N K -
+chr2 45788 45789 T T T N W S N Y N R Y N S N W M N C T N C -
+chr2 46243 46244 T T T N W N N B V N U N T N N Y C N U N N +
+chr2 47814 47815 A C A S N X D N N H W N G N Y C N N M R N +
+chr2 48633 48634 T T T N G N N N . N N N N S N Y N . N N N +
+chr2 53130 53131 T C T K R . N B N N T N N M N Y N N Y N N +
+chr2 53505 53506 A A A M N N Y N N N N - K N W N N N S N R +
+chr2 53559 53560 T T T N N V R V N N T N U N N B N M N V Y +
+chr2 55607 55608 A N A U S N N H R K N N N Y N N G N N N N +
+chr10 55659 55660 T N T C N K N N N U N S N N N V C R S N N +
+chr10 55734 55735 T N T G N C N M M G C N B N . N G N N N N +
+chr10 56024 56025 A T A N D U N Y B N N X N N Y N T N - N N +
+chr10 56100 56101 T T A W N N W N S N K M N R N R N R N G N +
+chr10 56120 56121 A - A N A N N Y N N N W V N N Y G N N W N -
+chr10 56137 56138 A A A N A Y H . Y N G N . D N N T N N N N +
+chr10 56174 56175 A T A Y A N N N N N N N N N . S T Y N B N +
+chr10 59373 59374 A G A N N N N N N T N S N N N G N N N V N +
+chr10 68912 68913 G T G R N B R N H N U W Y N N N N N N N T +
+chr10 77052 77053 G A R N G N N Y N N N N N N B R N W N N R +
+chr18 78200 78201 G G G N N H N N V N G N N N N A A N K X N +
+chr18 81076 81077 T A T B N N G N N X W N X N V N N D N N N +
+chr18 81198 81199 A T A N N N N - N N X N K T N M N K X N W +
+chr18 81216 81217 G A G Y N N D N X N N N N A N S N N N D N -
+chr18 81398 81399 G T G N - W N N M N G C N K N S N N N N K +
+chr18 91548 91549 A A A S N X H S R N A K N N N N U A R N N +
+chr18 98172 98173 T T T N . N N N S N T N Y N N Y X D V N Y +
+chr18 110904 110905 T - A A N A N A W A N N A X N W N N N N N +
+chr18 160592 160593 C G G G N G N G N G N N G N N M T N Y N N -
\ No newline at end of file
diff -r 1fc06b260097 -r 33b475ccc316 test-data/dna_filter_out3.tabular
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/dna_filter_out3.tabular Wed Feb 10 13:18:45 2010 -0500
@@ -0,0 +1,42 @@
+chr1 468 469 C C C N M N N K . N C U N H N G N N M N S -
+chr1 582 583 G G G N G R N R N - N M N V K N N N G C R +
+chr1 602 603 G G G N G N Y N R G G N N U N T N A K N R +
+chr1 6119 6120 G G M N S N N W B N S D N N H V N B W N N +
+chr1 6357 6358 G G N M K N G - N N G U N N N B N N K N S -
+chr1 6433 6434 G G N R N N C N N N . N N . N N N N N R N +
+chr1 39160 39161 T T T N N Y N - N N N N N N N V N N N N Y +
+chr1 41920 41921 G C G N M C G N A N G N K N W S N N N V N +
+chr1 42100 42101 T T T Y R W N N N V N M R N N G N M Y N K -
+chr1 45026 45027 C A C N N Y N S Y N N X N A D N N K N N A +
+chr1 45161 45162 C T C . N X H V N N C R N Y N N N N R N Y +
+chr2 45407 45408 C N C S B N N N N N C N Y N N T K G N C N +
+chr2 45788 45789 T T T N W S N Y N R Y N S N W M N C T N C -
+chr2 46243 46244 T T T N W N N B V N U N T N N Y C N U N N +
+chr2 48073 48074 A G A Y W . N K N N N G N N N G N N N Y N +
+chr2 48633 48634 T T T N G N N N . N N N N S N Y N . N N N +
+chr2 51324 51325 T T N R N N N N N - N U N W A N N N N N N -
+chr2 52065 52066 T C T N N N S N . N T N M N S W N T Y C N -
+chr2 53130 53131 T C T K R . N B N N T N N M N Y N N Y N N +
+chr2 53559 53560 T T T N N V R V N N T N U N N B N M N V Y +
+chr2 55607 55608 A N A U S N N H R K N N N Y N N G N N N N +
+chr10 55659 55660 T N T C N K N N N U N S N N N V C R S N N +
+chr10 55734 55735 T N T G N C N M M G C N B N . N G N N N N +
+chr10 55870 55871 C G C N H G - N N N C N H K N M G N N N N +
+chr10 56100 56101 T T A W N N W N S N K M N R N R N R N G N +
+chr10 56120 56121 A - A N A N N Y N N N W V N N Y G N N W N -
+chr10 56137 56138 A A A N A Y H . Y N G N . D N N T N N N N +
+chr10 56174 56175 A T A Y A N N N N N N N N N . S T Y N B N +
+chr10 59373 59374 A G A N N N N N N T N S N N N G N N N V N +
+chr10 68912 68913 G T G R N B R N H N U W Y N N N N N N N T +
+chr10 72946 72947 T A N N N N N N B N N . B D W U N U N D A -
+chr10 77052 77053 G A R N G N N Y N N N N N N B R N W N N R +
+chr18 78200 78201 G G G N N H N N V N G N N N N A A N K X N +
+chr18 81076 81077 T A T B N N G N N X W N X N V N N D N N N +
+chr18 81198 81199 A T A N N N N - N N X N K T N M N K X N W +
+chr18 81216 81217 G A G Y N N D N X N N N N A N S N N N D N -
+chr18 81398 81399 G T G N - W N N M N G C N K N S N N N N K +
+chr18 93895 93896 T T T H N N V W Y N N N - N N N N N N Y N +
+chr18 98172 98173 T T T N . N N N S N T N Y N N Y X D V N Y +
+chr18 110904 110905 T - A A N A N A W A N N A X N W N N N N N +
+chr18 140324 140325 A A A N M N N Y N S N V N N X N C N N . M +
+chr18 160592 160593 C G G G N G N G N G N N G N N M T N Y N N -
\ No newline at end of file
diff -r 1fc06b260097 -r 33b475ccc316 test-data/dna_filter_out4.tabular
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/dna_filter_out4.tabular Wed Feb 10 13:18:45 2010 -0500
@@ -0,0 +1,6 @@
+chr2 45788 45789 T T T N W S N Y N R Y N S N W M N C T N C -
+chr2 51324 51325 T T N R N N N N N - N U N W A N N N N N N -
+chr2 52065 52066 T C T N N N S N . N T N M N S W N T Y C N -
+chr10 56120 56121 A - A N A N N Y N N N W V N N Y G N N W N -
+chr10 72946 72947 T A N N N N N N B N N . B D W U N U N D A -
+chr18 160592 160593 C G G G N G N G N G N N G N N M T N Y N N -
\ No newline at end of file
diff -r 1fc06b260097 -r 33b475ccc316 tools/stats/dna_filtering.py
--- a/tools/stats/dna_filtering.py Wed Feb 10 12:17:38 2010 -0500
+++ b/tools/stats/dna_filtering.py Wed Feb 10 13:18:45 2010 -0500
@@ -14,7 +14,7 @@
 """
 #from __future__ import division

-import os.path, re, string, sys
+import os.path, re, string, string, sys
 from galaxy import eggs
 import pkg_resources; pkg_resources.require( "bx-python" )
 from bx.cookbook import doc_optparse
@@ -62,28 +62,29 @@
     orig_cond_text = cond_text
     # Expand to allow for DNA codes
     dot_letters = [ letter for letter in string.uppercase if letter not in \
-                    [ 'A', 'T', 'U', 'G', 'C', 'K', 'M', 'R', 'Y', 'S', 'W', 'B', 'V', 'H', 'D', 'N', 'X' ] ]
-    codes = {'A': [ 'A', 'M', 'R', 'W', 'V', 'H', 'D' ],
-             'T': [ 'T', 'U', 'K', 'Y', 'W', 'B', 'H', 'D' ],
-             'G': [ 'G', 'K', 'R', 'S', 'B', 'V', 'D' ],
-             'C': [ 'C', 'M', 'Y', 'S', 'B', 'V', 'H' ],
-             'U': [ 'T', 'U', 'K', 'Y', 'W', 'B', 'H', 'D' ],
-             'K': [ 'K', 'G', 'T' ],
-             'M': [ 'M', 'A', 'C' ],
-             'R': [ 'R', 'A', 'G' ],
-             'Y': [ 'Y', 'C', 'T' ],
-             'S': [ 'S', 'C', 'G' ],
-             'W': [ 'W', 'A', 'T' ],
-             'B': [ 'B', 'C', 'G', 'T' ],
-             'V': [ 'V', 'A', 'C', 'G' ],
-             'H': [ 'H', 'A', 'C', 'T' ],
-             'D': [ 'D', 'A', 'G', 'T' ],
+                    [ 'A', 'C', 'G', 'T', 'U', 'B', 'D', 'H', 'K', 'M', 'N', 'R', 'S', 'V', 'W', 'X', 'Y' ] ]
+    dot_letters.append( '.' )
+    codes = {'A': [ 'A', 'D', 'H', 'M', 'R', 'V', 'W' ],
+             'C': [ 'C', 'B', 'H', 'M', 'S', 'V', 'Y' ],
+             'G': [ 'G', 'B', 'D', 'K', 'R', 'S', 'V' ],
+             'T': [ 'T', 'U', 'B', 'D', 'H', 'K', 'W', 'Y' ],
+             'U': [ 'T', 'U', 'B', 'D', 'H', 'K', 'W', 'Y' ],
+             'K': [ 'G', 'T', 'U', 'B', 'D', 'H', 'K', 'R', 'S', 'V', 'W', 'Y' ],
+             'M': [ 'A', 'C', 'B', 'D', 'H', 'M', 'R', 'S', 'V', 'W', 'Y' ],
+             'R': [ 'A', 'G', 'B', 'D', 'H', 'K', 'M', 'R', 'S', 'V', 'W' ],
+             'Y': [ 'C', 'T', 'U', 'B', 'D', 'H', 'K', 'M', 'S', 'V', 'W', 'Y' ],
+             'S': [ 'C', 'G', 'B', 'D', 'H', 'K', 'M', 'R', 'S', 'V', 'Y' ],
+             'W': [ 'A', 'T', 'U', 'B', 'D', 'H', 'K', 'M', 'R', 'V', 'W', 'Y' ],
+             'B': [ 'C', 'G', 'T', 'U', 'B', 'D', 'H', 'K', 'M', 'R', 'S', 'V', 'W', 'Y' ],
+             'V': [ 'A', 'C', 'G', 'B', 'D', 'H', 'K', 'M', 'R', 'S', 'V', 'W' ],
+             'H': [ 'A', 'C', 'T', 'U', 'B', 'D', 'H', 'K', 'M', 'R', 'S', 'V', 'W', 'Y' ],
+             'D': [ 'A', 'G', 'T', 'U', 'B', 'D', 'H', 'K', 'M', 'R', 'S', 'V', 'W', 'Y' ],
              '.': dot_letters,
              '-': [ '-' ]}
     # Add handling for N and X
     if n_handling == "all":
-        codes[ 'N' ] = [ 'G', 'A', 'T', 'C', 'U', 'K', 'M', 'R', 'Y', 'S', 'W', 'B', 'V', 'H', 'D', 'N', 'X' ]
-        codes[ 'X' ] = [ 'G', 'A', 'T', 'C', 'U', 'K', 'M', 'R', 'Y', 'S', 'W', 'B', 'V', 'H', 'D', 'N', 'X' ]
+        codes[ 'N' ] = [ 'A', 'C', 'G', 'T', 'U', 'B', 'D', 'H', 'K', 'M', 'N', 'R', 'S', 'V', 'W', 'X', 'Y' ]
+        codes[ 'X' ] = [ 'A', 'C', 'G', 'T', 'U', 'B', 'D', 'H', 'K', 'M', 'N', 'R', 'S', 'V', 'W', 'X', 'Y' ]
         for code in codes.keys():
             if code != '.' and code != '-':
                 codes[code].append( 'N' )
@@ -91,31 +92,51 @@
     else:
         codes[ 'N' ] = dot_letters
         codes[ 'X' ] = dot_letters
+        codes[ '.' ].extend( [ 'N', 'X' ] )
     # Expand conditions to allow for DNA codes
     try:
         match_replace = {}
-        pat = re.compile( "c\d+\s*[!=]=\s*[\w']+" )
+        pat = re.compile( 'c\d+\s*[!=]=\s*[\w\d"\'+-.]+' )
         matches = pat.findall( cond_text )
         for match in matches:
-            if match.find( '==' ) > 0:
+            if match.find( 'chr' ) >= 0 or match.find( 'scaffold' ) >= 0 or match.find( '+' ) >= 0:
+                if match.find( '==' ) >= 0:
+                    match_parts = match.split( '==' )
+                elif match.find( '!=' ) >= 0:
+                    match_parts = match.split( '!=' )
+                else:
+                    raise Exception, "The operators '==' and '!=' were not found."
+                left = match_parts[0].strip()
+                right = match_parts[1].strip()
+                new_match = "(%s)" % ( match )
+            elif match.find( '==' ) > 0:
                 match_parts = match.split( '==' )
-                new_match = '(%s in codes[%s] and %s in codes[%s])' % ( match_parts[0], match_parts[1], match_parts[1], match_parts[0] )
+                left = match_parts[0].strip()
+                right = match_parts[1].strip()
+                new_match = '(%s in codes[%s] and %s in codes[%s])' % ( left, right, right, left )
             elif match.find( '!=' ) > 0 :
                 match_parts = match.split( '!=' )
-                new_match = '(%s not in codes[%s] or %s not in codes[%s])' % ( match_parts[0], match_parts[1], match_parts[1], match_parts[0] )
+                left = match_parts[0].strip()
+                right = match_parts[1].strip()
+                new_match = '(%s not in codes[%s] or %s not in codes[%s])' % ( left, right, right, left )
             else:
-                raise Exception
-            if match_parts[1].find( "'" ) >= 0:
-                assert match_parts[1].replace( "'", '' ) in [ 'G', 'A', 'T', 'C', 'U', 'K', 'M', 'R', 'Y', 'S', 'W', 'B', 'V', 'H', 'D', 'N', 'X', '-', '.' ]
+                raise Exception, "The operators '==' and '!=' were not found."
+            assert left.startswith( 'c' ), 'The column names should start with c (lowercase)'
+            if right.find( "'" ) >= 0 or right.find( '"' ) >= 0:
+                test = right.replace( "'", '' ).replace( '"', '' )
+                assert test in string.uppercase or test.find( '+' ) >= 0 or test.find( '.' ) >= 0 or test.find( '-' ) >= 0\
+                       or test.startswith( 'chr' ) or test.startswith( 'scaffold' ), \
+                       'The value to search for should be a valid base, code, plus sign, chromosome (like "chr1") or scaffold (like "scaffold5"). ' \
+                       'Use the general filter tool to filter on anything else first'
             else:
-                assert match_parts[1].startswith( 'c' )
+                assert right.startswith( 'c' ), 'The column names should start with c (lowercase)'
             match_replace[match] = new_match
+        if len( match_replace.keys() ) == 0:
+            raise Exception, 'There do not appear to be any valid conditions'
         for match in match_replace.keys():
-            cond_text = cond_text.replace(match, match_replace[match])
-        if len( match_replace ) == 0:
-            raise Exception
-    except:
-        stop_err( "One of your conditions is invalid. Make sure to use only '!=' or '==', valid column numbers, and valid base values." )
+            cond_text = cond_text.replace( match, match_replace[match] )
+    except Exception, e:
+        stop_err( "At least one of your conditions is invalid. Make sure to use only '!=' or '==', valid column numbers, and valid base values.\n" + str(e) )

     # Attempt to determine if the condition includes executable stuff and, if so, exit
     secured = dir()
@@ -177,7 +198,7 @@
         out.close()
         if str( e ).startswith( 'invalid syntax' ):
             valid_filter = False
-            stop_err( 'Filter condition "%s" likely invalid. See tool tips, syntax and examples.' % orig_cond_text )
+            stop_err( 'Filter condition "%s" likely invalid. See tool tips, syntax and examples.' % orig_cond_text + ' '+str(e))
         else:
             stop_err( str( e ) )

diff -r 1fc06b260097 -r 33b475ccc316 tools/stats/dna_filtering.xml
--- a/tools/stats/dna_filtering.xml Wed Feb 10 12:17:38 2010 -0500
+++ b/tools/stats/dna_filtering.xml Wed Feb 10 13:18:45 2010 -0500
@@ -11,7 +11,7 @@
   </command>
   <inputs>
     <param format="tabular" name="input" type="data" label="Filter" help="Query missing? See TIP below."/>
-    <param name="cond" size="40" type="text" value="c8=='G'" label="With following condition" help="Double equal signs, ==, must be used as shown above. To filter for an arbitrary string, use the Select tool.">
+    <param name="cond" size="40" type="text" value="c4 == 'G'" label="With following condition" help="Double equal signs, ==, must be used as shown above. To filter for an arbitrary string, use the Select tool.">
       <validator type="empty_field" message="Enter a valid filtering condition, see syntax and examples below."/>
     </param>
     <param name="n_handling" type="select" label="Do you want N (and X) to match A or C or G or T OR nothing?">
@@ -24,28 +24,28 @@
   </outputs>
   <tests>
     <test>
-      <param name="input" value="dna_filter_in1.bed" />
+      <param name="input" ftype="tabular" value="dna_filter_in1.tabular" />
       <param name="cond" value="c8=='G'" />
       <param name="n_handling" value="all" />
-      <output name="out_file1" file="dna_filter_out1.bed" />
+      <output name="out_file1" ftype="tabular" file="dna_filter_out1.tabular" />
     </test>
     <test>
-      <param name="input" value="dna_filter_in1.bed" />
-      <param name="cond" value="(c10==c11 or c17==c18) and c6!='C' and c23=='R'" />
+      <param name="input" value="dna_filter_in1.tabular" />
+      <param name="cond" value="(c10 == c11 or c17 == c18) and c6 != 'C' and c23 == 'R'" />
       <param name="n_handling" value="all" />
-      <output name="out_file1" file="dna_filter_out2.bed" />
+      <output name="out_file1" file="dna_filter_out2.tabular" />
     </test>
     <test>
-      <param name="input" value="dna_filter_in1.bed" />
+      <param name="input" value="dna_filter_in1.tabular" />
       <param name="cond" value="c4=='B' or c9==c10" />
       <param name="n_handling" value="none" />
-      <output name="out_file1" file="dna_filter_out3.bed" />
+      <output name="out_file1" file="dna_filter_out3.tabular" />
     </test>
     <test>
-      <param name="input" value="dna_filter_in1.bed" />
-      <param name="cond" value="c7!='Y' and c9!='U'" />
+      <param name="input" value="dna_filter_in1.tabular" />
+      <param name="cond" value="c1!='chr1' and c7!='Y' and c25!='+'" />
       <param name="n_handling" value="none" />
-      <output name="out_file1" file="dna_filter_out4.bed" />
+      <output name="out_file1" file="dna_filter_out4.tabular" />
     </test>
   </tests>
   <help>
@@ -73,6 +73,7 @@
 - When using 'equal-to' operator **double equal sign '==' must be used** ( e.g., **c1=='chr1'** )
 - Non-numerical values must be included in single or double quotes ( e.g., **c6=='C'** )
 - Filtering condition can include logical operators, but **make sure operators are all lower case** ( e.g., **(c1!='chrX' and c1!='chrY') or c6=='+'** )
+- You can use spaces between the arguments and equality sign or not (e.g. both **c9 == c10** and **c9==c10** are valid)

 -----

@@ -109,6 +110,7 @@
 - **c8=='A'** selects lines in which the eighth column is A, M, R, W, V, H, D and N or X if appropriate
 - **c12==c15** selects lines where the value in the twelfth column could be the same as the fifteenth and the fifteenth column could be the same as the twelfth column (based on appropriate codes)
 - **c9!=c19** selects lines where column nine could not be the same as column nineteen and column nineteen could not be the same as column nine (using appropriate codes)
+- **c4 == 'A' and c4 == c5** selects lines where column 4 and 5 are both A, M, R, W, V, H, D and N or X if appropriate

 </help>
 </tool>
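
For readers skimming the patch, the condition rewriting in dna_filtering.py can be sketched roughly as follows. This is a Python 3 illustration, not the committed code (which is Python 2). The names `CODES`, `PATTERN`, `expand_condition` and `row_matches` are hypothetical, the table is only a five-entry excerpt of the one in the diff, and `re.sub` stands in for the patch's `findall` plus `str.replace` loop.

```python
import re

# Excerpt of the post-patch code table (illustrative subset, not the full table).
CODES = {
    "A": ["A", "D", "H", "M", "R", "V", "W"],
    "C": ["C", "B", "H", "M", "S", "V", "Y"],
    "G": ["G", "B", "D", "K", "R", "S", "V"],
    "T": ["T", "U", "B", "D", "H", "K", "W", "Y"],
    "R": ["A", "G", "B", "D", "H", "K", "M", "R", "S", "V", "W"],
}

# Same idea as the patched pattern: a column, '==' or '!=', optional spaces,
# then a quoted value or another column name.
PATTERN = re.compile(r"""c\d+\s*[!=]=\s*[\w"'+.-]+""")


def expand_condition(cond_text):
    """Rewrite each comparison so that it consults the code table."""
    def rewrite(m):
        match = m.group(0)
        op = "==" if "==" in match else "!="
        left, right = [part.strip() for part in match.split(op)]
        # Chromosome, scaffold and strand comparisons stay literal.
        if "chr" in match or "scaffold" in match or "+" in match:
            return "(%s)" % match
        if op == "==":
            return "(%s in codes[%s] and %s in codes[%s])" % (left, right, right, left)
        return "(%s not in codes[%s] or %s not in codes[%s])" % (left, right, right, left)
    return PATTERN.sub(rewrite, cond_text)


def row_matches(fields, cond_text, codes=CODES):
    """Evaluate the expanded condition against one row (columns are c1, c2, ...)."""
    env = {"codes": codes}
    env.update({"c%d" % (i + 1): value for i, value in enumerate(fields)})
    # Assumption: eval with empty builtins is enough for a sketch; the real
    # tool does its own checks before executing the condition.
    return eval(expand_condition(cond_text), {"__builtins__": {}}, env)


print(expand_condition("c4 == 'G'"))
print(row_matches(["chr1", "582", "583", "R"], "c4 == 'G'"))
```

With this excerpt, a row whose fourth column is R satisfies `c4 == 'G'` because R is listed under G and G is listed under R, which is the two-way check the patch introduces.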