[hg] galaxy 3467: Modifications to pileup_parser
details:   http://www.bx.psu.edu/hg/galaxy/rev/765c454bcaa7
changeset: 3467:765c454bcaa7
user:      Anton Nekrutenko <anton@bx.psu.edu>
date:      Tue Mar 02 16:35:21 2010 -0500
description:
Modifications to pileup_parser

diffstat:

 static/images/pileup_parser_help1.png | 0
 static/images/pileup_parser_help2.png | 0
 static/images/pileup_parser_help3.png | 0
 static/images/pileup_parser_help4.png | 0
 test-data/pileup_parser.10col.20-3-yes-yes-yes-no.pileup.out | 86 ++++++
 test-data/pileup_parser.10col.20-3-yes-yes-yes-yes.pileup.out | 86 ++++++
 tools/samtools/pileup_parser.pl | 22 +-
 tools/samtools/pileup_parser.xml | 128 ++++++++-
 8 files changed, 307 insertions(+), 15 deletions(-)

diffs (526 lines):

diff -r 1d11aec88053 -r 765c454bcaa7 static/images/pileup_parser_help1.png
Binary file static/images/pileup_parser_help1.png has changed
diff -r 1d11aec88053 -r 765c454bcaa7 static/images/pileup_parser_help2.png
Binary file static/images/pileup_parser_help2.png has changed
diff -r 1d11aec88053 -r 765c454bcaa7 static/images/pileup_parser_help3.png
Binary file static/images/pileup_parser_help3.png has changed
diff -r 1d11aec88053 -r 765c454bcaa7 static/images/pileup_parser_help4.png
Binary file static/images/pileup_parser_help4.png has changed
diff -r 1d11aec88053 -r 765c454bcaa7 test-data/pileup_parser.10col.20-3-yes-yes-yes-no.pileup.out
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/pileup_parser.10col.20-3-yes-yes-yes-no.pileup.out Tue Mar 02 16:35:21 2010 -0500
@@ -0,0 +1,86 @@
+chrM 13 14 A A 56 0 25 18 16 0 1 0 17 1
+chrM 18 19 T T 55 0 25 20 0 0 1 18 19 1
+chrM 35 36 A A 103 0 25 30 27 0 1 0 28 1
+chrM 58 59 A A 50 0 24 16 12 1 0 0 13 1
+chrM 59 60 C C 50 0 24 16 1 15 0 0 16 1
+chrM 157 158 A G 117 117 19 62 0 0 56 0 56 56
+chrM 170 171 T T 141 0 21 46 0 0 1 45 46 1
+chrM 172 173 A A 130 0 20 42 36 1 0 0 37 1
+chrM 173 174 A A 122 0 21 40 38 1 0 0 39 1
+chrM 187 188 A A 36 0 21 13 11 0 1 0 12 1
+chrM 195 196 T T 17 0 19 5 0 0 1 4 5 1
+chrM 284 285 G G 34 0 15 11 0
0 10 1 11 1 +chrM 285 286 T T 33 0 15 11 1 0 0 7 8 1 +chrM 303 304 G G 41 0 18 13 1 0 11 0 12 1 +chrM 310 311 T T 41 0 18 13 0 1 0 11 12 1 +chrM 347 348 T T 4 0 20 3 1 0 0 2 3 1 +chrM 354 355 A C 14 36 25 4 1 3 0 0 4 3 +chrM 355 356 T C 39 39 25 4 0 4 0 0 4 4 +chrM 380 381 C T 77 108 25 28 0 0 0 24 24 24 +chrM 383 384 A A 76 0 25 30 25 1 0 0 26 1 +chrM 385 386 A G 88 120 25 32 0 0 29 0 29 29 +chrM 414 415 C T 75 75 25 16 0 0 0 15 15 15 +chrM 464 465 C C 169 0 25 65 1 58 0 0 59 1 +chrM 468 469 T T 207 0 25 66 0 0 1 56 57 1 +chrM 471 472 C C 187 0 25 72 1 64 0 0 65 1 +chrM 490 491 C C 212 0 23 79 0 73 0 1 74 1 +chrM 506 507 A A 150 0 21 76 64 1 0 0 65 1 +chrM 510 511 C C 213 0 21 72 1 68 0 0 69 1 +chrM 526 527 A A 164 0 22 58 51 1 0 0 52 1 +chrM 527 528 A A 164 0 22 58 48 0 1 0 49 1 +chrM 536 537 A A 138 0 23 49 42 1 0 0 43 1 +chrM 557 558 T T 83 0 23 27 0 0 1 23 24 1 +chrM 606 607 G G 105 0 24 43 0 0 33 2 35 2 +chrM 618 619 T T 117 0 24 48 0 1 0 34 35 1 +chrM 627 628 G G 108 0 22 57 1 0 51 0 52 1 +chrM 659 660 C C 166 0 19 58 1 53 0 0 54 1 +chrM 668 669 C C 135 0 19 42 1 40 1 0 42 2 +chrM 713 714 A A 127 0 23 96 90 1 0 0 91 1 +chrM 719 720 T T 130 0 24 96 0 0 1 77 78 1 +chrM 736 737 A A 98 0 24 98 87 1 0 0 88 1 +chrM 747 748 A A 97 0 25 86 80 1 0 0 81 1 +chrM 749 750 A A 93 0 25 77 71 2 0 0 73 2 +chrM 759 760 T T 106 0 25 86 0 0 1 69 70 1 +chrM 762 763 A A 99 0 25 83 76 1 0 0 77 1 +chrM 763 764 G G 107 0 25 86 1 0 75 0 76 1 +chrM 764 765 G G 88 0 25 77 0 0 68 1 69 1 +chrM 767 768 T T 102 0 25 71 1 0 0 65 66 1 +chrM 777 778 G G 108 0 25 63 0 0 58 1 59 1 +chrM 784 785 A A 99 0 25 63 51 1 0 0 52 1 +chrM 790 791 G G 155 0 25 60 0 0 55 1 56 1 +chrM 794 795 T T 212 0 25 74 1 0 0 62 63 1 +chrM 807 808 C C 226 0 22 132 1 110 0 0 111 1 +chrM 808 809 T T 242 0 22 134 0 1 0 109 110 1 +chrM 809 810 A A 243 0 22 144 128 0 1 0 129 1 +chrM 814 815 C C 246 0 22 160 1 142 0 0 143 1 +chrM 815 816 A A 255 0 22 169 159 0 1 0 160 1 +chrM 816 817 A A 254 0 22 171 150 1 0 0 151 1 +chrM 
819 820 A A 255 0 22 183 157 0 0 1 158 1 +chrM 822 823 T T 252 0 22 192 0 1 0 171 172 1 +chrM 825 826 A A 255 0 22 198 177 0 1 0 178 1 +chrM 829 830 G G 255 0 22 198 1 0 185 0 186 1 +chrM 837 838 G G 255 0 22 169 0 0 160 1 161 1 +chrM 841 842 C C 239 0 22 173 1 156 0 0 157 1 +chrM 856 857 C C 254 0 23 124 1 114 0 0 115 1 +chrM 858 859 A A 250 0 23 118 110 0 1 0 111 1 +chrM 859 860 A A 255 0 23 115 101 1 0 0 102 1 +chrM 864 865 G G 255 0 23 115 1 0 109 0 110 1 +chrM 866 867 A A 224 0 23 105 81 1 0 0 82 1 +chrM 872 873 C C 206 0 23 94 1 84 0 0 85 1 +chrM 873 874 A A 208 0 23 94 80 1 0 0 81 1 +chrM 931 932 C C 53 0 23 22 2 18 0 0 20 2 +chrM 936 937 C C 98 0 24 31 1 21 0 0 22 1 +chrM 950 951 C C 191 0 23 76 1 61 0 0 62 1 +chrM 951 952 A A 223 0 24 81 78 2 0 0 80 2 +chrM 952 953 C C 179 0 24 86 2 67 0 0 69 2 +chrM 956 957 T T 237 0 23 110 0 0 1 91 92 1 +chrM 957 958 C C 251 0 23 117 1 96 0 0 97 1 +chrM 966 967 A A 246 0 23 125 119 1 0 0 120 1 +chrM 974 975 C C 255 0 23 125 1 111 1 0 113 2 +chrM 980 981 C C 247 0 23 123 0 111 1 0 112 1 +chrM 981 982 C C 255 0 23 122 2 106 0 0 108 2 +chrM 982 983 C C 252 0 23 122 1 105 0 0 106 1 +chrM 983 984 A A 250 0 23 119 105 1 0 0 106 1 +chrM 987 988 A A 243 0 23 94 79 1 0 0 80 1 +chrM 1005 1006 C C 138 0 18 49 1 43 0 0 44 1 +chrM 1025 1026 G G 19 0 8 33 0 0 31 1 32 1 diff -r 1d11aec88053 -r 765c454bcaa7 test-data/pileup_parser.10col.20-3-yes-yes-yes-yes.pileup.out --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/pileup_parser.10col.20-3-yes-yes-yes-yes.pileup.out Tue Mar 02 16:35:21 2010 -0500 @@ -0,0 +1,86 @@ +chrM 13 14 A A 56 0 25 18 .......G.........^:. BIIIIIII+IIIIIIIII 16 0 1 0 17 1 +chrM 18 19 T T 55 0 25 20 ..................GG IIIIIIIIIIIIIIIIII'A 0 0 1 18 19 1 +chrM 35 36 A A 103 0 25 30 .$..N...G.....C...............^:. 7:>"EIIIEI5><$C7B?B=IIIIIIIIII 27 0 1 0 28 1 +chrM 58 59 A A 50 0 24 16 ...............C IB20III:<DIII#II 12 1 0 0 13 1 +chrM 59 60 C C 50 0 24 16 .$.............A. 
I?>=IIIIIIIIIIBI 1 15 0 0 16 1 +chrM 157 158 A G 117 117 19 62 GGGGGGGGGGGGGGGGGGGNNNGGGGGG.GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGg^:g II6IIII<4I+IIIIIIII"""IIIIII$IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII> 0 0 56 0 56 56 +chrM 170 171 T T 141 0 21 46 .$.$.$.$.$.....................................,,.^:G IIIII=>IIIIIIIIIIIIIIDDII7IGIIIIIHIIIIIIIIIBI8 0 0 1 45 46 1 +chrM 172 173 A A 130 0 20 42 .$.$...C...............................,,... 914HG841?IA0III:IB@>;@FIIIIIIIEIIBII;IIIII 36 1 0 0 37 1 +chrM 173 174 A A 122 0 21 40 .$.$..........C......................,,... ?DCI?@I7C5I*I9C9IIIE?C>::I8IIIIIIIIIIIII 38 1 0 0 39 1 +chrM 187 188 A A 36 0 21 13 G$.$.$....,,.... II5II5,IIIIII 11 0 1 0 12 1 +chrM 195 196 T T 17 0 19 5 G.... <IIII 0 0 1 4 5 1 +chrM 284 285 G G 34 0 15 11 ....T....^!.^!. IIIIIIIIIII 0 0 10 1 11 1 +chrM 285 286 T T 33 0 15 11 ..C..N.A... AI&II"II(II 1 0 0 7 8 1 +chrM 303 304 G G 41 0 18 13 ...A....,.,., IIIIIIIIIIII* 1 0 11 0 12 1 +chrM 310 311 T T 41 0 18 13 .$...C...,.,., IIIIIIIIIIII* 0 1 0 11 12 1 +chrM 347 348 T T 4 0 20 3 ,$.$A IIA 1 0 0 2 3 1 +chrM 354 355 A C 14 36 25 4 .ccc IIII 1 3 0 0 4 3 +chrM 355 356 T C 39 39 25 4 Cccc IIII 0 4 0 0 4 4 +chrM 380 381 C T 77 108 25 28 tttttttTttttgttttttttttt^:t^:t^:t^:t IIIIIIIIIIII&I7CI>BCA296+IA1 0 0 0 24 24 24 +chrM 383 384 A A 76 0 25 30 ,,,,,,,.,c,,g,,,,,,,g,,,,,,g,. IIIIIIIIIIII$IIIIIII$IIIIF,4II 25 1 0 0 26 1 +chrM 385 386 A G 88 120 25 32 gggggggGggnggggggggggggggggggGgg III>IIIIII"I&IIIIIIIIIIII?IIDI94 0 0 29 0 29 29 +chrM 414 415 C T 75 75 25 16 t$t$tttttTttttTTTT IIIIIII>II93IIII 0 0 0 15 15 15 +chrM 464 465 C C 169 0 25 65 ,.............,,.....A.,......,..............,,,..........,a...,^:A IIIIBI>H1IIIIB$)IIIII$IIIIIIIIII0IIIII6IIIIIIHIIIIIIIIIIIII&IIICI 1 58 0 0 59 1 +chrM 468 469 T T 207 0 25 66 .$.$.$.........,,.......,......,..............,n,..........,,...,G... 
III>I44EIIICI@IIEDC$IBIIII(IIIIDIIII@IIIIII1"0IIII%IIIIII1III7IIII 0 0 1 56 57 1 +chrM 471 472 C C 187 0 25 72 .$.$.....,t.......,....A.,..............,a,..........,a...,A.......,,^:.^:.^:.^:.^:. I@IIIIII&IIIFG6IIIIII$IIIIIIIII$IIFIIII:IIIIIIIIIIII%IIII,IIIIII/I-IIIII 1 64 0 0 65 1 +chrM 490 491 C C 212 0 23 79 ..n,,....A.....,,...,........t,.....,....,..........,....,.,,...,.....,,..,.,., II"IIIIII&II:IIIIIIII(II@HIII;IIIIIIIIIIIIIIIIII?IIIIIIIIIIFIIIIIIIIIII+IIII3II 0 73 0 1 74 1 +chrM 506 507 A A 150 0 21 76 .$.$.$.$C$,....,..........c....,.,,...,.....,,..,.,.,,,,...,,,..,,....,,.,....^:.^:,^:, /I:3&I33.;I,./<G=I?IIIAIIIIIIIDI*IIIIIIIIIIIHDIIIIIIII=DIIIICIIII@IIIIII(I@6 64 1 0 0 65 1 +chrM 510 511 C C 213 0 21 72 .$.$.$....,....,.,,...,.....,,..,.,.,,,,...,,,..,,....,,.,...A.,,..,,,,,,,^:, III@/IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIGHIIIIIICIIEII6II/562I 1 68 0 0 69 1 +chrM 526 527 A A 164 0 22 58 ,,,...,,,..,,....,,.,.....,,..,,,,,,,,,..,,,..C,,,,,.,,,,, III3I=III;/II7I24II3I'II9IIIFIIIIIIIIIIIIIIIIIIIIIIIIIIIII 51 1 0 0 52 1 +chrM 527 528 A A 164 0 22 58 g$,$,...,,,..,,....,,.,.....,,..,,,,,,,,,..,,,...,,,,,.,,,,, IIIF$9IIIIIII0A6(II4I+FI%:II4=IIIII'IIIIIIIIIIIIIIII3IIIII 48 0 1 0 49 1 +chrM 536 537 A A 138 0 23 49 .c,.,.....,,..,,,,,,,,,..,,,...,,,,,.,,,,,.,,,,,, 7II+I$0I8III/@IIIIIIIIIIIIIIIIIIIIIIIIIIIIII4I46I 42 1 0 0 43 1 +chrM 557 558 T T 83 0 23 27 ,$,$,$.,,,,,.,,,,,,.,.,....G.^:. III+IIIIIII>IHDIG4I4IIIIIII 0 0 1 23 24 1 +chrM 606 607 G G 105 0 24 43 ..............,....T..T..................^:.^!, BI-&*1&6IIIGIIII%;9I.;>IIIIIIIIIIIIIIIIIII3 0 0 33 2 35 2 +chrM 618 619 T T 117 0 24 48 ,.........C..C..............,...........N..,...^:. I2&-+)(4I+>CI$&I15@IIIIIIIII,IIIIIIIIIII"II5IIII 0 1 0 34 35 1 +chrM 627 628 G G 108 0 22 57 .$...........A........,..............,................A^!.^!.^:. 
5GI<=4;>G=I<%IIIIII.IIII'IIIIIIII$IIIIIIIIIIIIIIIIIIIIIII 1 0 51 0 52 1 +chrM 659 660 C C 166 0 19 58 .$.$.............,...n,.,,......,......................a,... /CII>9@II277IIII9II"IIIIIIFIIIIIIBIIIIIIIIIIIIIIII$II7IIII 1 53 0 0 54 1 +chrM 668 669 C C 135 0 19 42 .$,$,$......,.....................Ga,......,^:. IIIIIII@IIFI@IIIIIIIII;IIIIHIII56IIIIIIIAI 1 40 1 0 42 2 +chrM 713 714 A A 127 0 23 96 ...,..C.....................,......................,............................C...........^:.^:.^:.^:. 67'II>(IIIAFI<IIIII?I$EI>GAGII4III8IIDIIIDIIIFIIIIG$IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 90 1 0 0 91 1 +chrM 719 720 T T 130 0 24 96 .....C...........,...C.....G............,............G....................A...................^:.^:. D?76/$I'@'<,I7;I<I5GI'H+B.3%;=<3II1CGIIB@FII8@II0II4II$IIIII537IIIIIIIIIII#IIIIIIIII=III$IIIIIII 0 0 1 77 78 1 +chrM 736 737 A A 98 0 24 98 .$.$.$..............................C................T.......................C..................^:.^:.^:.^:.^:. AI709I4I>IDII.7IIC+938AEC5?IDIC5I*IIII7I21?5B1AIII#IIIIIIIIIIAIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 87 1 0 0 88 1 +chrM 747 748 A A 97 0 25 86 .$.$C$.$.$.$.$.$.$.........................................................................^:.^:.^:.^:. ?DIIIII<I8I@>B2IIFB%IIIIIIHIII;4III3IIIEEIIIIIIIIIIIIIIIIIIIIIIIIII$IIIIIIIIIIII@IIIII 80 1 0 0 81 1 +chrM 749 750 A A 93 0 25 77 .$.$............................C......C......................................^:. I2GII,II%@IAIIIII9IIIIIIIIIDII>I>II=IIIIIIIIIIIIIIIIIIIIIIIII8IIIII.IIIIIIIII 71 2 0 0 73 2 +chrM 759 760 T T 106 0 25 86 A$.....A..............G........................A........G........................^:.^:.^:.^:.^:.^:. #6<EF9$330.II4*;EII7+$I;II:3IICB4073I6;IEIIII5+II@IIIII:I+IIIIIIIIIIIIIIIIIIIIIIIIIIII 0 0 1 69 70 1 +chrM 762 763 A A 99 0 25 83 N........................................................................C......... 
"IC:6,1III=CIIIIB6GIII7G<DI>IIIIII8IIII$IHIII=I&IIIIIIIIIIIIIIIIIIIIIIII39IIIIIIIII 76 1 0 0 77 1 +chrM 763 764 G G 107 0 25 86 .$.$.$.$.$.$.$.$T$.....NN......................................AN...........................^:.^:.^:. 9I9I@8I4-I7IC&""=&=I+II=>I>IIIIIIIIIIIII@IIII+IIIIIIIII"II%IIIII@IIIIIIIIIIIIIIIIIIIII 1 0 75 0 76 1 +chrM 764 765 G G 88 0 25 77 .$.$.$.$.$.T..T.T............................................N.................... :,I;&I7D.%I'III<HD7IIIII*CIIIIIIIIII%IIIIIIIIIIIIIIIHIII"IIIIIIIIIIIIIIIIIIII 0 0 68 1 69 1 +chrM 767 768 T T 102 0 25 71 .$.$................................A.................................... &IBGII@5ICF>0?BII2IIIIII@+II;IEIII7I,III8IIIIIIIIIIIE@IIIIIIII8IIIIIIII 1 0 0 65 66 1 +chrM 777 778 G G 108 0 25 63 .$.$...T.....C......T...................T.......................^:. AG(II$IIIII&6BIII:'IIIII9IIIII=IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 0 0 58 1 59 1 +chrM 784 785 A A 99 0 25 63 .$.........................................................C...^:. II/+III%I.4IE:6..;8+I$B@II2IIIIDI@0IIIEGIFIIIIIIIIIIIIIIIICIIII 51 1 0 0 52 1 +chrM 790 791 G G 155 0 25 60 .......T.............................................,..,.^:T^:, 7III5II$IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII%4II/ 0 0 55 1 56 1 +chrM 794 795 T T 212 0 25 74 .$.$.$.$.$.$...A....................................N.,..,..,.,,,,,,,.,,,^:.^:,^:,^!.^:.^:,^:, II$IIIIII9I9@IIDIIIIIIIIIIIIIIIIIIIIIIIIIIIIII"IIIIIII*IB$I#%#II$2$II:II4I 1 0 0 62 63 1 +chrM 807 808 C C 226 0 22 132 .$..............NN........N,..,..,.,,,,,,,.,aa.,,..,,a,,,.....,..,,.,.,,,.,,,.,,,,....,.,,,..a.,,,,,N..,.,..,.....,...,.....^:,^:,^:,^!.^!.^:,^!.^!.^:, IIIII;AIIIIIII.""I?IIIIII"=IIIII%II$I'I+II$)7I=IGIII%;IIIIIIIIIIAIIII=IIIII+I43IIIIIIII%IIII/IIII5;"IIII1IIIIIIII7III/IIIIID46II4II: 1 110 0 0 111 1 +chrM 808 809 T T 242 0 22 134 .$.......................N,..,..,.,,,g,,,.,,,.,,..,,,,,,.....,..,,.,.,,n.,,,.,,,,....,.,,,..,.,,,,,...,.,..,.....,C..,.....,,,..,..,^:,^:,^!. 
IIIIII:IIIIIII01II8I1'II"EI,III3IIII&I$II(.(IIFIIIIII+IIIIIIIII1,IIIBI"III5I34ICIIIIII/IIII?IICC53IIIHII>I2IIIIII:IIIIIIII91III1II5I4I 0 1 0 109 110 1 +chrM 809 810 A A 243 0 22 144 .$.$......................,.C,..,.,,,,,,,.,,,.,,..,,,,,,.....,..,,C,.,,,.,,,.,,g,....,.,,,..,.,,,,,...,.,..,.....,...,.....,,,..,..,,,.^:,^:.^:g^:,^:,^:,^!N^!N^:,^:,^:, HIII28IIIGII;III*I,II>IB62$2IIDDIIIIIIII&IIIIIIIIIIIIIII*IIIIIII$IIIII>IIIIIIIIIIIIII#IIIIIIIIIIIIIIIII>IIIIIHEI.IIIIIIIIIIIIIIIIIIII(I#III""GI9 128 0 1 0 129 1 +chrM 814 815 C C 246 0 22 160 .$.$..............,..,..,.,,,,,,,.,,a.,,..,,,,,,.....,..,,.,.,,t.,,,.,,,,....,.,,,..,.,,,,,...,.,..,.....,...,.....,,,..,..,,,.,.,,,,..,,,...,,,,,a....,,....,a^:.^:.^:, IIIIIIIIIIIIII:IIIIIII8IHIII$III*6=IIIIIII@I*IIIEI8IIIIIIIIII-IIIII:IIIIIIIII$IIII<IIIIIIIIIII2IIIIIIICIEIIIIIIII/I2IIIIIAIII*I#I2$IIII*III8C5$9$IIII9/IIII<&II8 1 142 0 0 143 1 +chrM 815 816 A A 255 0 22 169 ..............,.C,..,.,,,,,,,.,,,.,,..,,,,,,.....,..,,.,.,,,.,,,.,,,,....,.,,,..,.,,,,,...,.,..,.....,...,.....,,,..,..,,,.,.,,,,..,,,.NN,,,g,,....,,....,,..,^:,^:,^:.^:,^:,^:.^:.^:,^:,^:,^!. IHEF8II?+GI:I@I7%IIIIAIIII+IIA'II>II@IIIIIII3IIIII/III7IIIII5IIIIIIIIIIIIII+IIIIIIIIIIIIIIIAI6IIIIII;I8IIIIIIIIIIFIIIIIIIIIIIIIIIIIIIII""IIIAIIIIIIIIIIIIIIIIIIIIIIIIIIII 159 0 1 0 160 1 +chrM 816 817 A A 254 0 22 171 .$.C.......T..Nn..,..,.,,,,,,,.,,,.,,..,,,,,,.....,..,,.,.,n,N,,,.,,,,....,.,,,..,.,,,,,...,.,..,.....,...,.....,,,..,..,,,.,.,,,,..,,,...,,,,,,....,,....,,..,,,.,,..,,,.^:.^!. 
IDIII?9=<0$I2"":I*III1IIIIIIII.I@FII5IIIIIIII322+IIIII3III"I"IIIGIIII2>IIIIIIIIIIIIIIIIIF4IIIIIIIII)-I>IIIIII>IIIIIIIIIIIIIIIIIIIIIII6IIIIII$IIIIIII?IIIIIIIIIIIIIIIIIIIIII 150 1 0 0 151 1 +chrM 819 820 A A 255 0 22 183 .$.......,..,..,.,,,,,,,.,,,.,,..,,,,,,....C,..,,.,.,,,.,,,.,,,,....,.,,,..,.,,,,,...,.,..,...C.,...t.....,,,..,..,,,.,.,,,,..,,,...,,,,,,....,,....,,..,,,.g,..,,,...,g.,..,,,..^!.^:.^:.^:.^:,^:,^:, I3/I+:DII7&I&I*<III$$(II;II@II2IIIIIIIIIII&IIIIE2IHIII)IIIDIIIIDCI>/BIHIIIIIIIIIIII'III6IGIII+7IIII@IIIIIIEIIIDIIFI9IIIIII9II1,+III.I(83IIIII4GIIIIIIIIIIBI,BIIIIIIIII$IIII<5@IIIIIII9I 157 0 0 1 158 1 +chrM 822 823 T T 252 0 22 192 .$,$..,..,.,,,,,,,.,,,.,,..,,,,,,.....,..,,.,.,,,.,,,.,,,,....,.,c,..,.,,,,,...,.,..,.....,...,.....,,,..,..,,,.,.,n,,..,,,...,,,g,,....,,....,,..,,,.,,..,,,...,,.,..,,,......,,,,.,,,.,^:.^!.^:,^:,^:,^:,^:,^:,^:, IIIIIII79I0IIIIII$&>IIIEIIIIIAIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIEII$IIIIEEIIIIIIII4IIIIIIII'IIIIIIII(;III;IICI@I+II"<CII>=/III6:B#0IIIII0FIIIIDIII5E6I@8IIIIGIII9II=II.$)IIIIII9>0:I59BIIII/E2?BA/ 0 1 0 171 172 1 +chrM 825 826 A A 255 0 22 198 .$,$.,,,,,,,.,,,.,,..,c,,,,....C,..,,.,.,,,.,,,.,,,,....,.,,,..,.,,,,,...,.,C.,.....,...,.....,,,..,..,,,.,.,,,,..,,,...,,,,,,....,,....,,..,,,.,,..,,,...,,.,..,,,......,,,,.,,,.g..,,,,,,,.,.,.,,,g^!.^:.^:, IIIIIIIIIIIIII=II4II#IIIIIIII'IAIIIFIIIIIIIIIFIIII40IIIIIII/IIIIIIII8IIIII'I@III4II/IIIIIIIIIIIIIIIIIICI#IIIIF%#II)IIIIADIGIIIIIIIIIIIIIIIIIII%AIIIIIIIIIIIII/337II@IIII/I:IFIII#II/BI9I;EIIIEIIII@III 177 0 1 0 178 1 +chrM 829 830 G G 255 0 22 198 .$,$a$.$.$,$,$,,,,.NNNN,..,,.,.,,,.,,,.,,,,....,.,,,..,.,,,,,...,.,..,.....,C..,...C.,,,..,..,,,.,.,,,,..,,,...,,,,,,....,,....,,..,,,.,,..,,,...,,.,..,,,......,,,,.,,,.,..,,,,,,,.,.,.,,,,..,....,,,,^:.^!.^:c^!.^:,^:, <II1IIIIIIIA""""I0III;IIIIIIIII=IIIIC<IIIIIIIIIIIIIIIIII.III3<IIIIIII)IIIIIF%)IIIIIIIIIIIIDIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIIIIHIIIIIIIII9III=IFIIIIIIIIBIIIIIIIIII:IIIIIIIII:III8III9IIIIIII7II#IBI 1 0 185 0 186 1 +chrM 837 
838 G G 255 0 22 169 .$.$.$,$.$,$..,.....,A..,.....,,,..,..,,,.,.,,,,..,,t...,,,,,,....,,....,,..,,,.,,..,,,.C.,,.,..,,,......,,,,.,,,.,..,,,,,,,.,.,.,n,,..,....,,,,..a.,,......,..,,,..,,.,.,,,..^!. AIEIII;>IIFI<?I,FIII8I)IIII*1IIIIIIIIIIIIIIIII9IIIIIIIIIGDG;IIIIIIIIIIIIIIIIIIIIII%IIIEIE6IIIIIIIIIIIIIIAIIIIIIIIIIIIIIIIHII"@IIIIIIIIDI%@II#IIIIIIIIIGIIIIIIIIII5IIF:III 0 0 160 1 161 1 +chrM 841 842 C C 239 0 22 173 .$.$.$.$,,,..,..,,,.,.,,,,..,,,...,,,,,,....,,....,,..,,,.,,..,,,.T.,,.,..,,,......,,,,.,,,.,..,,,,,,,.,.,.,,,,..,....aa,,..,.,,......,..,,,..,,.,.,,,..A...,..,....,,..N.....^:a^:.^!. IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIC5CIIIIIIIIIIIIIIIIIIIIIII#II%+II5IIIIIIIIII-IIIIIII<IIIIIEIII4IIIII:IIIIIIIII-II>II$I*IIIIIIIIIIIIIII10$7I<BDII&III>IIIIIII0'II"IIIII&II 1 156 0 0 157 1 +chrM 856 857 C C 254 0 23 124 ,$..,,,,,,,.,.,.,,,,..,....,,g,..,.,,......,..,,,..,,.,.,,,......,..,....,,........a.......,....,....,..,..,...........,...^:.^:. I06IIIIIIIIICIIIIIIIIIIIIIII#IHIIIIII3IDI9IHI+$IIIII#IIIII>IIIIIIIIIIIII#=IIII8I5IFIIIIIIIIIIIIDIIII/IIDII>IIIIIIIIIII1IIIII 1 114 0 0 115 1 +chrM 858 859 A A 250 0 23 118 .$,$.$,$.$,$,,g..,....,,,,..,.,,......,..,,,..,,T,.,,,......,..,....,,........,.......,....,....,..g..,...........,.......^:.^:. II9IIIIII@DIC=.EIIIIE@IIIII*CI8-IB8IIIIIII$IHIIIIIIIIIIICIHIIIIIIIIIII@IIIIIHIIIFIIIIIIFI-III*III)IIIIIIIIIIIIIIIIIIII 110 0 1 0 111 1 +chrM 859 860 A A 255 0 23 115 ,$,$,$..,....,,,,..,.,,......,..,,,..,,.,.,,,......,..,....c,........,.......,N.N.,....,..,..,..G........,........C^:.^:.^:. 
IIII2IEA(IIIIIHAIIIIG,=G1,IIIIII:EIIII6IIIIIII@IIIIIIIIIEIC@43:I,I@IIIIIIII"I"IIIIIIAIIIIIII*4IIIIEEIIIIIIIIIII%III 101 1 0 0 102 1 +chrM 864 865 G G 255 0 23 115 .$.$,$.$,$,$......,..,,,..,,.,.,,,......,..,....,,........,......N,....,...T,..,..,..........A,..............,.........^:.^!, IDIIIII6IIIIIIIIIIIIII9IAIIIIIIIIIIIIIIIII7IIIIIIIIIIIIIIII"IIIIIII,I1IIIIIIII+IEIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII3 1 0 109 0 110 1 +chrM 866 867 A A 224 0 23 105 .$.$,$..,,,..,,T,.,,,......,..,....,,........,......N,....,....c.N,..,.CC........,..............,.........., .'IG4III/III$IFIII3III&;I@CII160II99II.G%GIFI:)ID"IHIIIIF,F*II"II;II$%+B0I/-A.IHIIIIIIIIIIIIIIIIIIIIIIIII 81 1 0 0 82 1 +chrM 872 873 C C 206 0 23 94 .$...,..,....,,........,.......,N.N.a....,..,..,...........,..............,....N.....,.g....T.. II9EII9IFIIIIIIII/EIBIIII?GIIII"I"III/I-IIIIIIIIBIIIIIIIIIIHIIIIIIIIIIIIIIIIII"IIIII?+$IIII$II 1 84 0 0 85 1 +chrM 873 874 A A 208 0 23 94 .$.$.$,$.$.$,$....,,......C.,.......,....,....c..,..,...........,.....N........,..........,.g.......^:. II<IA&IIEIIII9DII(I&IIGI63IIIIIIIIII.I/II&IIIIIBII@5I3.II)IIIII"IIIFIGIIIIIIIIIIII1II)IIIIIIII 80 1 0 0 81 1 +chrM 931 932 C C 53 0 23 22 ,,.,,.,,,a,a,,,,,,,,,^:, II*IIII.I7I6III@9<::<I 2 18 0 0 20 2 +chrM 936 937 C C 98 0 24 31 .,,.,,,,,,,t,a,,,,,,,,.,,,.,,a^:, IIGII/IIHI'#15I4IIGII.IIIII-I$1 1 21 0 0 22 1 +chrM 950 951 C C 191 0 23 76 ,$,.,,,,,,,,,a,,,,,a,,.,,,.,,,,,,,,,,,,,,,.a..,.,..,........,,,.,,.,,,,,^:.^!.^:,^:a^!. IIIIIIIIIIII:IIIII%II5IIIIIIIIII?0B1I<IIGI)II1IDII*IIIIIIII)&III&I4>%5*II4(I 1 61 0 0 62 1 +chrM 951 952 A A 223 0 24 81 c.,,,,,,,,,,,,,,,c,,.,,,.,,,,,,,,,,,,,,,.,..,.,..,........,,,.,,.,,,,,..,,.^:.^:.^:.^:,^:,^:, IBIIIIIIIIIIIIIIIIIIIIIIIIGIIIIIIIII8IIIIIIIIIIIIIIIIIIIIICIIII4IGIIIIIIIIIIIIII? 
78 2 0 0 80 2 +chrM 952 953 C C 179 0 24 86 ,.,,,,,,,,,,,,,,,,,,.n,,.,,a,,,,,,a,,,,,.,..,.,..,........a,,.,,.,,,,,..,a....,,a^:.^!.^!.^:a^:a IIIIIIIIIIIIIIIIIIIII"IIIII)III6I1)I4II(I7II?I9II(IIIIIIII+1III0I+214%II79IIIIBB0III%I 2 67 0 0 69 2 +chrM 956 957 T T 237 0 23 110 ,.,,,,,,,,,,,,,,,,,,N,,,N,,,,,,,,,,,,,,,G,..,.,..,........,,,.,,.,,,,,..,,....,,,...,,.......,..,..,NN,^:.^:,^:,^:,^:.^:.^:. I/IIIIIIIIIIIIIIIIII"III"III4IIIII)I6GHI63II86III0IIIIIIII8BII6+II56I.II71IIII;/+IIIDDIIIIIIIIII;II,""2I@03III 0 0 1 91 92 1 +chrM 957 958 C C 251 0 23 117 ,.,,,t,t,,,,,,,,,,,,.,,,.,,,,,,,,,,,,,,,.,..,.,..,........,,,.,,.,,,,,..,a....,a,...,,.......,..,..,..,.,,,...^:,^:,^:.^!.^:,^:,^:, I9III(I&IIIIIIIIIIIIIIIIIIIIIIIIII>I3III*III3IIII:IIIIIIII(/IIIBIII(&$IIIIIIII<+%III>@IIIIIIIIIIIII,II,I%(6IIII1II(-/ 1 96 0 0 97 1 +chrM 966 967 A A 246 0 23 125 ,$c,G,,,.,,,,,,,,,,,,,n,.,..,.,..,........,,,.,,.,,,,,..,,....,,,...,,.......,..,..,..,.,,,...,,..,,,......g...,,,.,,,,,,,.,.^:, I5I#IIIIIIIIIIIIIIIBI"IIIIII;IIIIIIIIIIII;2IIIIIIIIIIIII@IIIIIIIIIIIIIIIIIII6IIIIIIIIIIIIIIIIIIIIIIIIII/II#IIIIIIIIIIIII>IIII 119 1 0 0 120 1 +chrM 974 975 C C 255 0 23 125 ,,,,,,,.,..,.,..,........,,,.,,.,,,,,..,,....,,,...,,.......,..,..,..,.,,,...,,..,,,......,...,,n.,,aa,,,N,.,.,,.gg,,,,....^:n^:, IIIIIII4IIIIIIEII/I?IIIIIIII7II<IIIIIIIIIIEIIIIIIIIIIII2IIIIIIIIIIIIIIGIIIIIIIIIIIIIIII%II3IIIII"III%IIII"IIII@.IB*BIDIIIII", 1 111 1 0 113 2 +chrM 980 981 C C 247 0 23 123 .$.$,........,,,.,,.,,,,,..,,....,,,...,,.......,..,..,..,.,,,...,,..,,,......,...,,,.,,a,,,,.,.,.,,.gt,,,,....,,,..,.,,,,^:.^:,^:, IIIII6IIIIIIII;IIAIIIIIIIIICI5III&IIIIIII<IIIEIIIIIIIIIIIIII=IIIIIIIIII5I&IIIIIIHIIIII'III9IIIII92I6#9II3IIIIII+II>I&71'I45 0 111 1 0 112 1 +chrM 981 982 C C 255 0 23 122 ,$.$.$......,,,.,,.,,,,,..,,....,,,...,,.......,..,..,..,.,,,...,,..,,,......,...,,,.,,,,,,a.,.,.,,.g,,,,,....a,,..,.,,,a.,,^!, 
IIFBIEIIIIII?II@IIIIIIIIIIIIIII6II9IIIIIIIIIIIIIIIIIIIIIII5IIIIIIIIIIII&IIIFIIDIIII0$III<IIIII2II%9,?&IIIIII*/II+I59I)I1-) 2 106 0 0 108 2 +chrM 982 983 C C 252 0 23 122 .$.$.$...,,,.,,.,,,,,..,,....,,,...,,.......,..,..,..,.,,,...,,..,,,......,...,,,.,,,,,,a.,.,.,a.,,,,,,....,,,..,.a,,a.,,,^!,^!,^!, :IIIIIIIICIIIIIIIIIIIIIIIIIIIIIIIIIIAIIIIIIIIIBIIII?III2I5IIIIIIIIAI#IIIIIIIIIII&)IIIIIIIIII+II&9F8IIIIIII2II)I')I$I++)I(+ 1 105 0 0 106 1 +chrM 983 984 A A 250 0 23 119 .$.$.$,$,$,$.$,$,$.$,$,,,,..,,....,,,...,,.......,..,..,..,.,,,...c,..,,,......,...,,,.,,,,,,,C,.,.,,.,,,,,,....,,,..,.,,,,.,,,,,, C:IIII5II3IIIII95II/1.IIII>A5II4IA=)-)I>HIIDIIII-IIIII,III;IIII?:$III6=IIII8IIIIIII)IIIIII4I@IIIIII@=IIIIIIIIIDIIIIIID= 105 1 0 0 106 1 +chrM 987 988 A A 243 0 23 94 .$.$C$,$,$.......,..,..,..,.,,,...,,..,,,......,...,,,.,,,,,,,.,.,.,,.,c,,,,....,,,..,.,,,,.,,,,,,^!, I<%II,C21-@,IIGIIII==I+III%A2II<.IIIC/C$II$6EIIII9IIIIIIIIIIIIII3I6IIIII7@AIIIIIIIIIIIIGIIIIDI 79 1 0 0 80 1 +chrM 1005 1006 C C 138 0 18 49 .$,$,$,,,,....,,,..,.,,,,.,,,,,,,,,,,,,.,,,,,,a,,,,^!, 6IIIIII00@<III<IIIIIIIIIIIIIIIFI*I8IIIIII?IGII<&2 1 43 0 0 44 1 +chrM 1025 1026 G G 19 0 8 33 ,$t.,,,,,,,,,,,,,,,,,,,,,,,,,,,.., II.IIIIIIIIIIIIIIIIIIIIIB=@HIIIII 0 0 31 1 32 1 diff -r 1d11aec88053 -r 765c454bcaa7 tools/samtools/pileup_parser.pl --- a/tools/samtools/pileup_parser.pl Tue Mar 02 13:20:47 2010 -0500 +++ b/tools/samtools/pileup_parser.pl Tue Mar 02 16:35:21 2010 -0500 @@ -4,7 +4,7 @@ use POSIX; -die "Usage: pileup_parser.pl <in_file> <ref_base_column> <read_bases_column> <base_quality_column> <coverage column> <qv cutoff> <coverage cutoff> <SNPs only?> <output bed?> <coord_column> <out_file>\n" unless @ARGV == 11; +die "Usage: pileup_parser.pl <in_file> <ref_base_column> <read_bases_column> <base_quality_column> <coverage column> <qv cutoff> <coverage cutoff> <SNPs only?> <output bed?> <coord_column> <out_file> <total_diff> <print_qual_bases>\n" unless @ARGV == 13; my $in_file = $ARGV[0]; my 
$ref_base_column = $ARGV[1]-1; # 1 based
@@ -17,6 +17,8 @@
 my $bed = $ARGV[8]; #set to "Yes" to convert coordinates to bed format (0-based start, 1-based end); set to "No" to leave as is
 my $coord_column = $ARGV[9]-1; #1 based
 my $out_file = $ARGV[10];
+my $total_diff = $ARGV[11]; # set to "Yes" to print total number of deviant based
+my $print_qual_bases = $ARGV[12]; #set to "Yes" to print quality and read base columns
 my $invalid_line_counter = 0;
 my $first_skipped_line = "";
@@ -24,6 +26,7 @@
 my $above_qv_bases = 0;
 my $SNPs_exist = 0;
 my $out_string = "";
+my $diff_count = 0;
 open (IN, "<$in_file") or die "Cannot open $in_file $!\n";
 open (OUT, ">$out_file") or die "Cannot open $out_file $!\n";
@@ -65,6 +68,7 @@
 {
 $SNPs_exist = 1;
 $SNPs{ uc( $bases[ $base ] ) } += 1;
+ $diff_count += 1;
 } elsif ( $bases[ $base ] =~ m/[\.,]/ ) {
 $SNPs{ uc( $fields[ $ref_base_column ] ) } += 1;
 }
@@ -77,11 +81,24 @@
 $fields[ $coord_column ] = "$start\t$end";
 }
+ if ($print_qual_bases ne "Yes") {
+ $fields[ $base_quality_column ] = "";
+ $fields[ $read_bases_column ] = "";
+ }
+
+
 $out_string = join("\t", @fields); # \t$read_bases\t$base_quality";
 foreach my $SNP (sort keys %SNPs) {
 $out_string .= "\t$SNPs{$SNP}";
 }
- $out_string .= "\t$above_qv_bases\n";
+
+ if ($total_diff eq "Yes") {
+ $out_string .= "\t$above_qv_bases\t$diff_count\n";
+ } else {
+ $out_string .= "\t$above_qv_bases\n";
+ }
+
+ $out_string =~ s/\t+/\t/g;
 if ( $SNPs_only eq "Yes" ) {
 print OUT $out_string if $SNPs_exist == 1;
@@ -94,6 +111,7 @@
 %SNPs = ('A',0,'T',0,'C',0,'G',0);
 $above_qv_bases = 0;
 $SNPs_exist = 0;
+ $diff_count = 0;
 }
diff -r 1d11aec88053 -r 765c454bcaa7 tools/samtools/pileup_parser.xml
--- a/tools/samtools/pileup_parser.xml Tue Mar 02 13:20:47 2010 -0500
+++ b/tools/samtools/pileup_parser.xml Tue Mar 02 16:35:21 2010 -0500
@@ -1,9 +1,9 @@
-<tool id="pileup_parser" name="Filter pileup" version="1.0.1">>
+<tool id="pileup_parser" name="Filter pileup" version="1.0.2">>
 <description>on coverage
and SNPs</description>
 <command interpreter="perl">
- #if $pileup_type.type_select == "six" #pileup_parser.pl $input "3" "5" "6" "4" $qv_cutoff $cvrg_cutoff $snps_only $interval "2" $out_file1
- #elif $pileup_type.type_select == "ten" #pileup_parser.pl $input "3" "9" "10" "8" $qv_cutoff $cvrg_cutoff $snps_only $interval "2" $out_file1
- #elif $pileup_type.type_select == "manual" #pileup_parser.pl $input $pileup_type.ref_base_column $pileup_type.read_bases_column $pileup_type.read_qv_column $pileup_type.cvrg_column $qv_cutoff $cvrg_cutoff $snps_only $interval $pileup_type.coord_column $out_file1
+ #if $pileup_type.type_select == "six" #pileup_parser.pl $input "3" "5" "6" "4" $qv_cutoff $cvrg_cutoff $snps_only $interval "2" $out_file1 $diff $qc_base
+ #elif $pileup_type.type_select == "ten" #pileup_parser.pl $input "3" "9" "10" "8" $qv_cutoff $cvrg_cutoff $snps_only $interval "2" $out_file1 $diff $qc_base
+ #elif $pileup_type.type_select == "manual" #pileup_parser.pl $input $pileup_type.ref_base_column $pileup_type.read_bases_column $pileup_type.read_qv_column $pileup_type.cvrg_column $qv_cutoff $cvrg_cutoff $snps_only $interval $pileup_type.coord_column $out_file1 $diff $qc_base
 #end if#
 </command>
 <inputs>
@@ -36,6 +36,14 @@
 <option value="No" selected="true">No</option>
 <option value="Yes">Yes</option>
 </param>
+ <param name="diff" label="Print total number of differences?" type="select" help="See "Example 3" below for explanation">
+ <option value="No" selected="true">No</option>
+ <option value="Yes">Yes</option>
+ </param>
+ <param name="qc_base" label="Print quality and base string?"
type="select" help="See "Example 4" below for explanation">
+ <option value="No">No</option>
+ <option value="Yes" selected="true">Yes</option>
+ </param>
 </inputs>
 <outputs>
@@ -54,6 +62,8 @@
 <param name="cvrg_cutoff" value="3" />
 <param name="snps_only" value="Yes"/>
 <param name="interval" value="Yes" />
+ <param name="diff" value="No" />
+ <param name="qc_base" value="Yes" />
 </test>
 <test>
 <param name="input" value="pileup_parser.6col.pileup"/>
@@ -63,6 +73,8 @@
 <param name="cvrg_cutoff" value="3" />
 <param name="snps_only" value="Yes"/>
 <param name="interval" value="No" />
+ <param name="diff" value="No" />
+ <param name="qc_base" value="Yes" />
 </test>
 <test>
 <param name="input" value="pileup_parser.6col.pileup"/>
@@ -72,6 +84,8 @@
 <param name="cvrg_cutoff" value="3" />
 <param name="snps_only" value="No"/>
 <param name="interval" value="No" />
+ <param name="diff" value="No" />
+ <param name="qc_base" value="Yes" />
 </test>
 <test>
 <param name="input" value="pileup_parser.10col.pileup"/>
@@ -81,6 +95,8 @@
 <param name="cvrg_cutoff" value="3" />
 <param name="snps_only" value="Yes"/>
 <param name="interval" value="Yes" />
+ <param name="diff" value="No" />
+ <param name="qc_base" value="Yes" />
 </test>
 <test>
 <param name="input" value="pileup_parser.10col.pileup"/>
@@ -95,7 +111,43 @@
 <param name="cvrg_cutoff" value="3" />
 <param name="snps_only" value="Yes"/>
 <param name="interval" value="Yes" />
+ <param name="diff" value="No" />
+ <param name="qc_base" value="Yes" />
 </test>
+ <test>
+ <param name="input" value="pileup_parser.10col.pileup"/>
+ <output name="out_file1" file="pileup_parser.10col.20-3-yes-yes-yes-yes.pileup.out"/>
+ <param name="type_select" value="manual"/>
+ <param name="ref_base_column" value="3"/>
+ <param name="read_bases_column" value="9"/>
+ <param name="read_qv_column" value="10"/>
+ <param name="cvrg_column" value="8"/>
+ <param name="coord_column" value="2"/>
+ <param name="qv_cutoff" value="20" />
+ <param name="cvrg_cutoff" value="3" />
+ <param name="snps_only" value="Yes"/>
+ <param name="interval" value="Yes" />
+ <param name="diff" value="Yes" />
+ <param name="qc_base" value="Yes" />
+ </test>
+ <test>
+ <param name="input" value="pileup_parser.10col.pileup"/>
+ <output name="out_file1" file="pileup_parser.10col.20-3-yes-yes-yes-no.pileup.out"/>
+ <param name="type_select" value="manual"/>
+ <param name="ref_base_column" value="3"/>
+ <param name="read_bases_column" value="9"/>
+ <param name="read_qv_column" value="10"/>
+ <param name="cvrg_column" value="8"/>
+ <param name="coord_column" value="2"/>
+ <param name="qv_cutoff" value="20" />
+ <param name="cvrg_cutoff" value="3" />
+ <param name="snps_only" value="Yes"/>
+ <param name="interval" value="Yes" />
+ <param name="diff" value="Yes" />
+ <param name="qc_base" value="No" />
+ </test>
+
+
 </tests>
 <help>
@@ -120,7 +172,7 @@
 ---------------------------------
 chrM 412 A 2 ., II
 chrM 413 G 4 ..t, IIIH
- chrM 414 C 4 ...a III2
+ chrM 414 C 4 ..Ta III2
 chrM 415 C 4 TTTt III7
 where::
@@ -143,7 +195,7 @@
 ------------------------------------------------
 chrM 412 A A 75 0 25 2 ., II
 chrM 413 G G 72 0 25 4 ..t, IIIH
- chrM 414 C C 75 0 25 4 ...a III2
+ chrM 414 C C 75 0 25 4 ..Ta III2
 chrM 415 C T 75 75 25 4 TTTt III7
 where::
@@ -178,18 +230,21 @@
 - Number of **T** variants
 - Number of read bases covering this position, where quality is equal to or higher than the value set by **Do not consider read bases with quality lower than** option.
+Optionally, if **Print total number of differences?** is set to **Yes**, the tool will append the sixth column with the total number of deviants (see below).
+
 2. If **Convert coordinates to intervals?** is set to **Yes**, the tool replaces the position column (typically the second column) with a pair of tab-delimited start/end values.
For example, if you are calling variants with base quality above 20 on this dataset::
 chrM 412 A 2 ., II
 chrM 413 G 4 ..t, III2
- chrM 414 C 4 ...a III2
+ chrM 414 C 4 ..Ta III2
 chrM 415 C 4 TTTt III7
 you will get::
 chrM 413 G 4 ..t, IIIH 0 0 2 1 3
+ chrM 414 C 4 ..Ta III2 1 1 0 1 3
 chrM 415 C 4 TTTt III7 0 0 0 4 4
 where::
@@ -200,18 +255,29 @@
 2 Position (1-based)
 3 Reference base at that position
 4 Coverage (# reads aligning over that position)
- 5 Bases within reads where (see Galaxy wiki for more info)
+ 5 Bases within reads where (see Galaxy wiki for more info)
 6 Quality values (phred33 scale, see Galaxy wiki for more)
 7 Number of A variants
 8 Number of C variants
 9 Number of G variants
 10 Number of T variants
 11 Quality adjusted coverage:
- Number of read bases (i.e., # of reads) with quality above the set threshold
+ 12 Number of read bases (i.e., # of reads) with quality above the set threshold
+ 13 Total number of deviants (if Convert coordinates to intervals? is set to yes)
+
+if **Print total number of differences?** is set to **Yes**, you will get::
+
+ chrM 413 G 4 ..t, IIIH 0 0 2 1 3 1
+ chrM 414 C 4 ..Ta III2 1 2 0 1 3 2
+ chrM 415 C 4 TTTt III7 0 0 0 4 4 0
-and if **Convert coordinates to intervals?** is set to **Yes**, you will get one additional column::
+Note the additional column 13, that contains the number of deviant reads (e.g., there are two deviants, T and a, for position 414).
+
+
+Finally, if **Convert coordinates to intervals?** is set to **Yes**, you will get one additional column with the end coordinate::
 chrM 412 413 G 4 ..t, III2 0 0 2 1 3
+ chrM 414 415 C 4 ..Ta III2 1 2 0 1 3
 chrM 414 415 C 4 TTTt III7 0 0 0 4 4
 where::
@@ -230,13 +296,14 @@
 10 Number of G variants
 11 Number of T variants
 12 Quality adjusted coverage
+ 13 Total number of deviants (if Convert coordinates to intervals?
is set to yes) Note that in this case the coordinates of SNPs were converted to intervals, where the start coordinate is 0-based and the end coordinate in 1-based using the UCSC Table Browser convention. Although three positions have variants in the original file (413, 414, and 415), only 413 and 415 are reported because the quality values associated with these two SNPs are above the threshold of 20. In the case of 414 the **a** allele has a quality value of 17 ( ord("2")-33 ), and is therefore not reported. Note that five columns have been added to each of the reported lines:: - chrM 413 G 4 ..t, IIIH 0 0 0 1 3 + chrM 413 G 4 ..t, IIIH 0 0 2 1 3 Here, there is one variant, **t**. Because the fourth column represents **T** counts, it is incremented by 1. The last column shows that at this position, three reads have bases above the quality threshold of 20. @@ -248,7 +315,7 @@ chrM 412 A 2 ., II chrM 413 G 4 ..t, III2 - chrM 414 C 4 ...a III2 + chrM 414 C 4 ..Ta III2 chrM 415 C 4 TTTt III7 To call all variants (with no restriction by coverage) with quality above phred value of 20, we will need to set the parameters as follows: @@ -258,6 +325,7 @@ Running the tool with these parameters will return:: chrM 413 G 4 ..t, IIIH 0 0 0 1 3 + chrM 414 C 4 ..Ta III2 0 2 0 1 3 chrM 415 C 4 TTTt III7 0 0 0 4 4 **Note** that position 414 is not reported because the *a* variant has associated quality value of 17 (because ord('2')-33 = 17) and is below the phred threshold of 20 set by the **Count variants with quality above this value** parameter. @@ -274,12 +342,46 @@ chrM 412 A 2 ., II 2 0 0 0 2 chrM 413 G 4 ..t, III2 0 0 2 1 3 - chrM 414 C 4 ...a III2 3 0 0 0 3 + chrM 414 C 4 ..Ta III2 0 2 0 1 3 chrM 415 C 4 TTTt III7 0 0 0 4 4 Here, you can see that although the total coverage at position 414 is 4 (column 4), the quality adjusted coverage is 3 (last column). 
This is because only three out of four reads have bases with quality above the set threshold of 20 (the actual qualities are III2 or, after conversion, 40, 40, 40, 17). One can use the last column of this dataset to filter out (using Galaxy's **Filter** tool) positions where quality adjusted coverage (last column) is below a set threshold. + +------ + +**Example 3**: Report everything and print total number of differences + +If you set the **Print total number of differences?** to **Yes** the tool will print an additional column with the total number of reads where a devinat base is above the quality threshold cutoff. So, seetiing parametrs like this: + +.. image:: ../static/images/pileup_parser_help3.png + +will produce this:: + + chrM 412 A 2 ., II 2 0 0 0 2 0 + chrM 413 G 4 ..t, III2 0 0 2 1 3 1 + chrM 414 C 4 ..Ta III2 0 2 0 1 3 1 + chrM 415 C 4 TTTt III7 0 0 0 4 4 0 + + +----- + +**Example 4**: Report everything, print total number of differences, and ignore qualities and read bases + +Setting **Print quality and base string?** to **Yes** as shown here: + +.. image:: ../static/images/pileup_parser_help4.png + +will produce this:: + + chrM 412 A 2 2 0 0 0 2 0 + chrM 413 G 4 0 0 2 1 3 1 + chrM 414 C 4 0 2 0 1 3 1 + chrM 415 C 4 0 0 0 4 4 0 + + + </help> </tool>
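The counting rules described in the help text above (phred33 decoding via `ord(ch) - 33`, per-base counts, quality-adjusted coverage, and total deviants) can be sketched in Python as follows. This is a minimal illustration, not the code in `pileup_parser.pl`: the function names `phred33` and `count_bases` are hypothetical, the sketch assumes a plain read-base string with one quality character per base (no indels and no `^` or `$` markers), and it treats a base as passing when its quality is equal to or higher than the cutoff.

```python
def phred33(ch):
    """Decode one phred33 quality character, e.g. '2' -> 17."""
    return ord(ch) - 33


def count_bases(ref, bases, quals, qv_cutoff=20):
    """Return (A, C, G, T, quality-adjusted coverage, deviants).

    Hypothetical helper: assumes len(bases) == len(quals) and that
    the read-base string contains no indel or read start/end markers.
    """
    ref = ref.upper()
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    coverage = 0
    deviants = 0
    for base, qual in zip(bases, quals):
        if phred33(qual) < qv_cutoff:
            continue  # base is below the quality cutoff, so it is ignored
        coverage += 1
        # '.' and ',' mean "matches the reference" on either strand
        called = ref if base in ".," else base.upper()
        if called in counts:
            counts[called] += 1
            if called != ref:
                deviants += 1
    return (counts["A"], counts["C"], counts["G"], counts["T"],
            coverage, deviants)


# Position 414 from the help text: reference C, bases ..Ta, qualities III2
print(count_bases("C", "..Ta", "III2"))
```

For position 414 the `a` base has quality 17 and is skipped, so only the two reference matches and the `T` contribute to the counts and to the quality-adjusted coverage.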
participants (1): Greg Von Kuster