# HG changeset patch -- Bitbucket.org
# Project galaxy-dist
# URL http://bitbucket.org/galaxy/galaxy-dist/overview
# User Greg Von Kuster <greg@bx.psu.edu>
# Date 1279632434 14400
# Node ID 84671d8f46845b7a5bcfcc8028e68eaf241a5058
# Parent 28e629ad59490af90a28bc45073a7cfc10ed646c
Fix the sqlalchemy-migrate scripts so they function across webapps. galaxy is the default webapp, so no command-line changes are necessary when migrating the db.
--- a/manage_db.sh
+++ b/manage_db.sh
@@ -2,7 +2,7 @@
#######
# NOTE: To downgrade to a specific version, use something like:
-# sh manage_db.sh downgrade --version=3
+# sh manage_db.sh downgrade --version=3 <community if using that webapp - galaxy is the default>
#######
cd `dirname $0`
--- a/scripts/manage_db.py
+++ b/scripts/manage_db.py
@@ -12,11 +12,20 @@ pkg_resources.require( "sqlalchemy-migra
from migrate.versioning.shell import main
from ConfigParser import SafeConfigParser
-
log = logging.getLogger( __name__ )
+if sys.argv[-1] in [ 'community' ]:
+ # Need to pop the last arg so the command line args will be correct
+ # for sqlalchemy-migrate
+ webapp = sys.argv.pop()
+ config_file = 'community_wsgi.ini'
+ repo = 'lib/galaxy/webapps/community/model/migrate'
+else:
+ config_file = 'universe_wsgi.ini'
+ repo = 'lib/galaxy/model/migrate'
+
cp = SafeConfigParser()
-cp.read( "universe_wsgi.ini" )
+cp.read( config_file )
if cp.has_option( "app:main", "database_connection" ):
db_url = cp.get( "app:main", "database_connection" )
@@ -43,4 +52,4 @@ except KeyError:
# Let this go, it could possibly work with db's we don't support
log.error( "database_connection contains an unknown SQLAlchemy database dialect: %s" % dialect )
-main( repository='lib/galaxy/model/migrate', url=db_url )
+main( repository=repo, url=db_url )
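For context, here is a minimal standalone sketch of the webapp-selection logic this changeset adds to scripts/manage_db.py. It is not part of the patch, and the helper name select_migrate_target is hypothetical:

import sys

# Optional trailing webapp name -> ( config file, sqlalchemy-migrate repository ).
# Anything else falls through to the galaxy defaults, so existing command lines
# such as "sh manage_db.sh upgrade" keep working unchanged; community users run
# "sh manage_db.sh upgrade community".
WEBAPP_TARGETS = {
    'community': ( 'community_wsgi.ini', 'lib/galaxy/webapps/community/model/migrate' ),
}

def select_migrate_target( argv ):
    # Pop the webapp name off the end of argv so the remaining command line
    # is exactly what sqlalchemy-migrate's shell main() expects.
    if argv and argv[-1] in WEBAPP_TARGETS:
        return WEBAPP_TARGETS[ argv.pop() ]
    return ( 'universe_wsgi.ini', 'lib/galaxy/model/migrate' )

config_file, repo = select_migrate_target( sys.argv )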
# HG changeset patch -- Bitbucket.org
# Project galaxy-dist
# URL http://bitbucket.org/galaxy/galaxy-dist/overview
# User Nate Coraor <nate@bx.psu.edu>
# Date 1279556993 14400
# Node ID 28e629ad59490af90a28bc45073a7cfc10ed646c
# Parent 0f5eb93a7d61478565d63563bddcbffc6b6a59d9
Upgrade pbs_python to 4.1.0 and use PBS exit_status (if keep_completed is set) so we can detect PBS failures. This is a reapplication of 3786:48432330228e, which was backed out in a subsequent revision due to crashes experienced in pbs_python 2.9.8.
--- a/eggs.ini
+++ b/eggs.ini
@@ -17,7 +17,7 @@ Cheetah = 2.2.2
DRMAA_python = 0.2
MySQL_python = 1.2.3c1
numpy = 1.3.0
-pbs_python = 2.9.4
+pbs_python = 4.1.0
psycopg2 = 2.0.13
pycrypto = 2.0.1
pysam = 0.1.1
--- a/lib/galaxy/jobs/runners/pbs.py
+++ b/lib/galaxy/jobs/runners/pbs.py
@@ -50,6 +50,19 @@ cd %s
%s
"""
+# From pbs' job.h
+JOB_EXIT_STATUS = {
+ 0: "job exec successful",
+ -1: "job exec failed, before files, no retry",
+ -2: "job exec failed, after files, no retry",
+ -3: "job execution failed, do retry",
+ -4: "job aborted on MOM initialization",
+ -5: "job aborted on MOM init, chkpt, no migrate",
+ -6: "job aborted on MOM init, chkpt, ok migrate",
+ -7: "job restart failed",
+ -8: "exec() of user command failed",
+}
+
class PBSJobState( object ):
def __init__( self ):
"""
@@ -65,6 +78,7 @@ class PBSJobState( object ):
self.efile = None
self.runner_url = None
self.check_count = 0
+ self.stop_job = False
class PBSJobRunner( object ):
"""
@@ -193,8 +207,9 @@ class PBSJobRunner( object ):
pbs_options = self.determine_pbs_options( runner_url )
c = pbs.pbs_connect( pbs_server_name )
if c <= 0:
+ errno, text = pbs.error()
job_wrapper.fail( "Unable to queue job for execution. Resubmitting the job may succeed." )
- log.error( "Connection to PBS server for submit failed" )
+ log.error( "Connection to PBS server for submit failed: %s: %s" % ( errno, text ) )
return
# define job attributes
@@ -336,58 +351,78 @@ class PBSJobRunner( object ):
log.debug( "(%s/%s) Skipping state check because PBS server connection failed" % ( galaxy_job_id, job_id ) )
new_watched.append( pbs_job_state )
continue
- if statuses.has_key( job_id ):
+ try:
status = statuses[job_id]
- if status.job_state != old_state:
- log.debug("(%s/%s) job state changed from %s to %s" % ( galaxy_job_id, job_id, old_state, status.job_state ) )
- if status.job_state == "R" and not pbs_job_state.running:
- pbs_job_state.running = True
- pbs_job_state.job_wrapper.change_state( model.Job.states.RUNNING )
- if status.job_state == "R" and ( pbs_job_state.check_count % 20 ) == 0:
- # Every 20th time the job status is checked, do limit checks (if configured)
- if self.app.config.output_size_limit > 0:
- # Check the size of the job outputs
- fail = False
- for outfile, size in pbs_job_state.job_wrapper.check_output_sizes():
- if size > self.app.config.output_size_limit:
- pbs_job_state.fail_message = 'Job output grew too large (greater than %s), please try different job parameters or' \
- % nice_size( self.app.config.output_size_limit )
- log.warning( '(%s/%s) Dequeueing job due to output %s growing larger than %s limit' \
- % ( galaxy_job_id, job_id, os.path.basename( outfile ), nice_size( self.app.config.output_size_limit ) ) )
- self.work_queue.put( ( 'fail', pbs_job_state ) )
- fail = True
- break
- if fail:
- continue
- if self.job_walltime is not None:
- # Check the job's execution time
- if status.get( 'resources_used', False ):
- # resources_used may not be in the status for new jobs
- h, m, s = [ int( i ) for i in status.resources_used.walltime.split( ':' ) ]
- time_executing = timedelta( 0, s, 0, 0, m, h )
- if time_executing > self.job_walltime:
- pbs_job_state.fail_message = 'Job ran longer than maximum allowed execution time (%s), please try different job parameters or' \
- % self.app.config.job_walltime
- log.warning( '(%s/%s) Dequeueing job since walltime has been reached' \
- % ( galaxy_job_id, job_id ) )
- self.work_queue.put( ( 'fail', pbs_job_state ) )
- continue
- pbs_job_state.old_state = status.job_state
- new_watched.append( pbs_job_state )
- else:
+ except KeyError:
try:
- # recheck to make sure it wasn't a communication problem
+ # Recheck to make sure it wasn't a communication problem
self.check_single_job( pbs_server_name, job_id )
- log.warning( "(%s/%s) job was not in state check list, but was found with individual state check" % ( galaxy_job_id, job_id ) )
+ log.warning( "(%s/%s) PBS job was not in state check list, but was found with individual state check" % ( galaxy_job_id, job_id ) )
new_watched.append( pbs_job_state )
except:
errno, text = pbs.error()
- if errno != 15001:
- log.info("(%s/%s) state check resulted in error (%d): %s" % (galaxy_job_id, job_id, errno, text) )
+ if errno == 15001:
+ # 15001 == job not in queue
+ log.debug("(%s/%s) PBS job has left queue" % (galaxy_job_id, job_id) )
+ self.work_queue.put( ( 'finish', pbs_job_state ) )
+ else:
+ # Unhandled error, continue to monitor
+ log.info("(%s/%s) PBS state check resulted in error (%d): %s" % (galaxy_job_id, job_id, errno, text) )
new_watched.append( pbs_job_state )
- else:
- log.debug("(%s/%s) job has left queue" % (galaxy_job_id, job_id) )
- self.work_queue.put( ( 'finish', pbs_job_state ) )
+ continue
+ if status.job_state != old_state:
+ log.debug("(%s/%s) PBS job state changed from %s to %s" % ( galaxy_job_id, job_id, old_state, status.job_state ) )
+ if status.job_state == "R" and not pbs_job_state.running:
+ pbs_job_state.running = True
+ pbs_job_state.job_wrapper.change_state( model.Job.states.RUNNING )
+ if status.job_state == "R" and ( pbs_job_state.check_count % 20 ) == 0:
+ # Every 20th time the job status is checked, do limit checks (if configured)
+ if self.app.config.output_size_limit > 0:
+ # Check the size of the job outputs
+ fail = False
+ for outfile, size in pbs_job_state.job_wrapper.check_output_sizes():
+ if size > self.app.config.output_size_limit:
+ pbs_job_state.fail_message = 'Job output grew too large (greater than %s), please try different job parameters or' \
+ % nice_size( self.app.config.output_size_limit )
+ log.warning( '(%s/%s) Dequeueing job due to output %s growing larger than %s limit' \
+ % ( galaxy_job_id, job_id, os.path.basename( outfile ), nice_size( self.app.config.output_size_limit ) ) )
+ pbs_job_state.stop_job = True
+ self.work_queue.put( ( 'fail', pbs_job_state ) )
+ fail = True
+ break
+ if fail:
+ continue
+ if self.job_walltime is not None:
+ # Check the job's execution time
+ if status.get( 'resources_used', False ):
+ # resources_used may not be in the status for new jobs
+ h, m, s = [ int( i ) for i in status.resources_used.walltime.split( ':' ) ]
+ time_executing = timedelta( 0, s, 0, 0, m, h )
+ if time_executing > self.job_walltime:
+ pbs_job_state.fail_message = 'Job ran longer than maximum allowed execution time (%s), please try different job parameters or' \
+ % self.app.config.job_walltime
+ log.warning( '(%s/%s) Dequeueing job since walltime has been reached' \
+ % ( galaxy_job_id, job_id ) )
+ pbs_job_state.stop_job = True
+ self.work_queue.put( ( 'fail', pbs_job_state ) )
+ continue
+ elif status.job_state == "C":
+ # "keep_completed" is enabled in PBS, so try to check exit status
+ try:
+ assert int( status.exit_status ) == 0
+ log.debug("(%s/%s) PBS job has completed successfully" % ( galaxy_job_id, job_id ) )
+ except AssertionError:
+ pbs_job_state.fail_message = 'Job cannot be completed due to a cluster error. Please retry or'
+ log.error( '(%s/%s) PBS job failed: %s' % ( galaxy_job_id, job_id, JOB_EXIT_STATUS.get( int( status.exit_status ), 'Unknown error: %s' % status.exit_status ) ) )
+ self.work_queue.put( ( 'fail', pbs_job_state ) )
+ continue
+ except AttributeError:
+ # No exit_status, can't verify proper completion so we just have to assume success.
+ log.debug("(%s/%s) PBS job has completed" % ( galaxy_job_id, job_id ) )
+ self.work_queue.put( ( 'finish', pbs_job_state ) )
+ continue
+ pbs_job_state.old_state = status.job_state
+ new_watched.append( pbs_job_state )
# Replace the watch list with the updated version
self.watched = new_watched
@@ -411,9 +446,10 @@ class PBSJobRunner( object ):
log.debug("connection to PBS server %s for state check failed" % pbs_server_name )
failures.append( pbs_server_name )
continue
- stat_attrl = pbs.new_attrl(2)
+ stat_attrl = pbs.new_attrl(3)
stat_attrl[0].name = pbs.ATTR_state
stat_attrl[1].name = pbs.ATTR_used
+ stat_attrl[2].name = pbs.ATTR_exitstat
jobs = pbs.pbs_statjob( c, None, stat_attrl, None )
pbs.pbs_disconnect( c )
statuses.update( self.convert_statjob_to_bunches( jobs ) )
@@ -480,7 +516,8 @@ class PBSJobRunner( object ):
"""
Separated out so we can use the worker threads for it.
"""
- self.stop_job( self.sa_session.query( self.app.model.Job ).get( pbs_job_state.job_wrapper.job_id ) )
+ if pbs_job_state.stop_job:
+ self.stop_job( self.sa_session.query( self.app.model.Job ).get( pbs_job_state.job_wrapper.job_id ) )
pbs_job_state.job_wrapper.fail( pbs_job_state.fail_message )
self.cleanup( ( pbs_job_state.ofile, pbs_job_state.efile, pbs_job_state.job_file ) )
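A minimal sketch, assuming keep_completed is set on the PBS server so completed jobs retain an exit_status attribute, of how the new state "C" branch above maps that value through JOB_EXIT_STATUS. interpret_completion is a hypothetical helper, not code from the patch:

# Abridged copy of the table added above (see lib/galaxy/jobs/runners/pbs.py
# for the full set of codes from pbs' job.h).
JOB_EXIT_STATUS = {
    0: "job exec successful",
    -8: "exec() of user command failed",
}

def interpret_completion( exit_status ):
    # exit_status may be missing entirely (keep_completed unset); the runner
    # above treats that case as success because it cannot verify otherwise.
    if exit_status is None:
        return True, 'completed (exit status unavailable)'
    code = int( exit_status )
    if code == 0:
        return True, 'completed successfully'
    return False, JOB_EXIT_STATUS.get( code, 'Unknown error: %s' % code )

print interpret_completion( '-8' )   # (False, 'exec() of user command failed')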
--- a/scripts/scramble/scripts/pbs_python.py
+++ b/scripts/scramble/scripts/pbs_python.py
@@ -29,8 +29,8 @@ if not os.path.exists( 'setup.py.orig' )
i = open( 'setup.py.orig', 'r' )
o = open( 'setup.py', 'w' )
for line in i.readlines():
- if line == " version = '2.9.0',\n":
- line = " version = '2.9.4',\n"
+ if line == " version = '4.0.0',\n":
+ line = " version = '4.1.0',\n"
print >>o, line,
i.close()
o.close()
# HG changeset patch -- Bitbucket.org
# Project galaxy-dist
# URL http://bitbucket.org/galaxy/galaxy-dist/overview
# User Dan Blankenberg <dan@bx.psu.edu>
# Date 1279204138 14400
# Node ID 4f157f2c6fd9cc1b692e55fe38dd66243e881439
# Parent 9638200fbfdcc470675ca6b65f0220f28ed1809c
Bug fix for history/view when the history has no user.
--- a/templates/history/view.mako
+++ b/templates/history/view.mako
@@ -68,8 +68,10 @@
<%
##TODO: is there a better way to create this URL? Can't use 'f-username' as a key b/c it's not a valid identifier.
href_to_published_histories = h.url_for( controller='/history', action='list_published')
- href_to_user_histories = h.url_for( controller='/history', action='list_published', xxx=history.user.username)
- href_to_user_histories = href_to_user_histories.replace( 'xxx', 'f-username')
+ if history.user is not None:
+ href_to_user_histories = h.url_for( controller='/history', action='list_published', xxx=history.user.username).replace( 'xxx', 'f-username')
+ else:
+ href_to_user_histories = h.url_for( controller='/history', action='list_published' ) ## should this instead be None or an empty string?
%>
<div class="unified-panel-header" unselectable="on">
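A plain-Python sketch of the template logic above (the function and its url_for parameter are illustrative): 'f-username' is not a valid Python identifier, so it cannot be passed as a keyword argument to h.url_for; the template builds the URL with a placeholder key and renames it afterwards, and the fix guards the case where history.user is None:

def href_to_user_histories( url_for, user ):
    if user is not None:
        # 'f-username' can't be a keyword argument, so substitute it into
        # the generated URL after the fact.
        url = url_for( controller='/history', action='list_published', xxx=user.username )
        return url.replace( 'xxx', 'f-username' )
    # No owning user: fall back to the unfiltered published-histories list.
    return url_for( controller='/history', action='list_published' )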