[hg] galaxy 3425: Bug fixes to LCA tool
details:   http://www.bx.psu.edu/hg/galaxy/rev/c0192e97a79a
changeset: 3425:c0192e97a79a
user:      gua110
date:      Mon Feb 22 16:22:12 2010 -0500
description:
Bug fixes to LCA tool

diffstat:

 test-data/lca_input2.taxonomy  |    4 +
 test-data/lca_input3.taxonomy  |  100 +++++++++++++++++++++++++++++++++++++++++
 test-data/lca_output2.taxonomy |    1 +
 test-data/lca_output3.taxonomy |   40 ++++++++++++++++
 tools/taxonomy/lca.py          |   49 +++++++++++++------
 tools/taxonomy/lca.xml         |   24 +++++++--
 6 files changed, 195 insertions(+), 23 deletions(-)

diffs (308 lines):

diff -r 9d9f0dfeeebf -r c0192e97a79a test-data/lca_input2.taxonomy
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/lca_input2.taxonomy Mon Feb 22 16:22:12 2010 -0500
@@ -0,0 +1,4 @@
+read_1 1 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum1 subphylum1 superclass1 class1 subclass1 superorder1 order1 suborder1 superfamily1 family1 subfamily1 tribe1 subtribe1 genus1 subgenus1 species1 subspecies1 1 2 3 4
+read_1 2 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum1 subphylum1 superclass1 class1 subclass1 superorder1 order1 suborder1 superfamily1 family1 subfamily1 tribe1 subtribe1 genus2 subgenus2 species2 subspecies2 1 2 3 4
+read_2 3 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum3 subphylum3 superclass3 class3 subclass3 superorder3 order3 suborder3 superfamily3 family3 subfamily3 tribe3 subtribe3 genus3 subgenus3 species3 subspecies3 1 X Y Z
+read_2 4 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum4 subphylum4 superclass4 class4 subclass4 superorder4 order4 suborder4 superfamily4 family4 subfamily4 tribe4 subtribe4 genus4 subgenus4 species4 subspecies4 1 X Y Z
diff -r 9d9f0dfeeebf -r c0192e97a79a test-data/lca_input3.taxonomy
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/lca_input3.taxonomy Mon Feb 22 16:22:12 2010 -0500
@@ -0,0 +1,100 @@
+IA_1-79371 591020 root Bacteria n n n Proteobacteria n n Gammaproteobacteria n n Enterobacteriales n n Enterobacteriaceae n n n Shigella n Shigella flexneri n 281604065
+IA_1-84488 591020 root Bacteria n n n Proteobacteria n n Gammaproteobacteria n n Enterobacteriales n n Enterobacteriaceae n n n Shigella n Shigella flexneri n 281604065
+IA_1-270826 591020 root Bacteria n n n Proteobacteria n n Gammaproteobacteria n n Enterobacteriales n n Enterobacteriaceae n n n Shigella n Shigella flexneri n 281604070
+IA_1-285361 591020 root Bacteria n n n Proteobacteria n n Gammaproteobacteria n n Enterobacteriales n n Enterobacteriaceae n n n Shigella n Shigella flexneri n 281604070
+IA_1-93958 591020 root Bacteria n n n Proteobacteria n n Gammaproteobacteria n n Enterobacteriales n n Enterobacteriaceae n n n Shigella n Shigella flexneri n 281604070
+IA_1-99821 591020 root Bacteria n n n Proteobacteria n n Gammaproteobacteria n n Enterobacteriales n n Enterobacteriaceae n n n Shigella n Shigella flexneri n 281604070
+IA_1-144417 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604077
+IA_1-278966 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604077
+IA_1-314709 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604077
+IA_1-324951 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604077
+IA_1-27817 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604153
+IA_1-95255 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604181
+IA_1-104173 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-135979 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-139090 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-139090 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-139090 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-144996 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-160446 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-160446 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-160446 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-160446 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-160446 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-160446 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-160446 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-160446 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-161439 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-205154 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-205154 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-205154 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-205154 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-205154 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-205154 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-205154 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-205154 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-205154 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-205154 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-216231 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-236286 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-236286 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-236286 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-236286 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-236286 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-236286 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-236286 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-236286 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-236286 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-236286 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-236286 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-236286 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-237681 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-250166 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-254274 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-254274 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-27817 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-29000 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-291427 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-291427 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-293054 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-293054 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-296315 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-296315 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus
+IA_1-310974 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-310974 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-311282 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-311282 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-322295 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n
+IA_1-42600 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-45102 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-45102 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-48105 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-48105 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-57254 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-61975 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-61975 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-66943 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-68288 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-82334 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-95526 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
diff -r 9d9f0dfeeebf -r c0192e97a79a test-data/lca_output2.taxonomy
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/lca_output2.taxonomy Mon Feb 22 16:22:12 2010 -0500
@@ -0,0 +1,1 @@
+read_1 1 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum1 subphylum1 superclass1 class1 subclass1 superorder1 order1 suborder1 superfamily1 family1 subfamily1 tribe1 subtribe1 n n n n 1 2 3 4
diff -r 9d9f0dfeeebf -r c0192e97a79a test-data/lca_output3.taxonomy
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/lca_output3.taxonomy Mon Feb 22 16:22:12 2010 -0500
@@ -0,0 +1,40 @@
+IA_1-104173 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-135979 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-139090 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-144417 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604077
+IA_1-144996 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-160446 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-161439 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-190855 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-205154 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-216231 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-236286 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-237681 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-250166 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-254274 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-270826 591020 root Bacteria n n n Proteobacteria n n Gammaproteobacteria n n Enterobacteriales n n Enterobacteriaceae n n n Shigella n Shigella flexneri n 281604070
+IA_1-27817 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604153
+IA_1-278966 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604077
+IA_1-285361 591020 root Bacteria n n n Proteobacteria n n Gammaproteobacteria n n Enterobacteriales n n Enterobacteriaceae n n n Shigella n Shigella flexneri n 281604070
+IA_1-29000 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-291427 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-293054 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-296315 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-310974 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-311282 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-314709 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604077
+IA_1-324951 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604077
+IA_1-42600 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-45102 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-48105 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-57254 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-61975 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-66943 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-68288 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-79371 591020 root Bacteria n n n Proteobacteria n n Gammaproteobacteria n n Enterobacteriales n n Enterobacteriaceae n n n Shigella n Shigella flexneri n 281604065
+IA_1-82334 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-84488 591020 root Bacteria n n n Proteobacteria n n Gammaproteobacteria n n Enterobacteriales n n Enterobacteriaceae n n n Shigella n Shigella flexneri n 281604065
+IA_1-93958 591020 root Bacteria n n n Proteobacteria n n Gammaproteobacteria n n Enterobacteriales n n Enterobacteriaceae n n n Shigella n Shigella flexneri n 281604070
+IA_1-95255 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604181
+IA_1-95526 10116 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Rodentia Sciurognathi n Muridae Murinae n n Rattus n Rattus norvegicus n 281604186
+IA_1-99821 591020 root Bacteria n n n Proteobacteria n n Gammaproteobacteria n n Enterobacteriales n n Enterobacteriaceae n n n Shigella n Shigella flexneri n 281604070
diff -r 9d9f0dfeeebf -r c0192e97a79a tools/taxonomy/lca.py
--- a/tools/taxonomy/lca.py Mon Feb 22 13:56:51 2010 -0500
+++ b/tools/taxonomy/lca.py Mon Feb 22 16:22:12 2010 -0500
@@ -42,6 +42,16 @@
     except:
         stop_err("Syntax error: Use correct syntax: program infile outfile")
+    fin = open(sys.argv[1],'r')
+    for j, line in enumerate( fin ):
+        elems = line.strip().split('\t')
+        if len(elems) < 24:
+            stop_err("The format of the input dataset is incorrect. Taxonomy datatype should contain at least 24 columns.")
+        if j > 30:
+            break
+        cols = range(1,len(elems))
+    fin.close()
+
     group_col = 0
     tmpfile = tempfile.NamedTemporaryFile()
@@ -68,11 +78,11 @@
     remaining_vals = []
     skipped_lines = 0
     fout = open(outfile, "w")
-    cols = range(1,25)
     block_valid = False
+
     for ii, line in enumerate( file( tmpfile.name )):
-        if line and not line.startswith( '#' ):
+        if line and not line.startswith( '#' ) and len(line.split('\t')) >= 24: #Taxonomy datatype should have at least 24 columns
             line = line.rstrip( '\r\n' )
             try:
                 fields = line.split("\t")
@@ -95,14 +105,11 @@
                         corresponding aggregate values into the output file. This works due to the sort on group_col we've applied to the data above.
                         """
-                        out_list = ['']*25
+                        out_list = ['']*24
                         out_list[0] = str(prev_item)
                         out_list[1] = str(prev_vals[0][0])
                         out_list[2] = str(prev_vals[1][0])
-                        try:
-                            out_list[24] = str(prev_vals[23][0])
-                        except:
-                            pass
+
                         for k, col in enumerate(cols):
                             if col >= 3 and col < 24:
                                 if len(set(prev_vals[k])) == 1:
@@ -113,11 +120,17 @@
                             out_list[k+1] = 'n'
                             k += 1
+                        j = 0
+                        while True:
+                            try:
+                                out_list.append(str(prev_vals[23+j][0]))
+                                j += 1
+                            except:
+                                break
+
                         if rank_bound == 0:
                             print >>fout, '\t'.join(out_list).strip()
-                            #print 'n'*( 24 - rank_bound )
                         else:
-                            #print '\t'.join(out_list[rank_bound:24])
                             if ''.join(out_list[rank_bound:24]) != 'n'*( 24 - rank_bound ):
                                 print >>fout, '\t'.join(out_list).strip()
@@ -144,15 +157,11 @@
                 skipped_lines += 1
     # Handle the last grouped value
-    out_list = ['']*25
+    out_list = ['']*24
     out_list[0] = str(prev_item)
     out_list[1] = str(prev_vals[0][0])
     out_list[2] = str(prev_vals[1][0])
-    try:
-        out_list[24] = str(prev_vals[23][0])
-    except:
-        pass
-
+
     for k, col in enumerate(cols):
         if col >= 3 and col < 24:
             if len(set(prev_vals[k])) == 1:
@@ -163,11 +172,17 @@
         out_list[k+1] = 'n'
         k += 1
+    j = 0
+    while True:
+        try:
+            out_list.append(str(prev_vals[23+j][0]))
+            j += 1
+        except:
+            break
+
     if rank_bound == 0:
         print >>fout, '\t'.join(out_list).strip()
     else:
-        #print ''.join(out_list[rank_bound:24])
-        #print 'n'*( 24 - rank_bound )
         if ''.join(out_list[rank_bound:24]) != 'n'*( 24 - rank_bound ):
             print >>fout, '\t'.join(out_list).strip()
diff -r 9d9f0dfeeebf -r c0192e97a79a tools/taxonomy/lca.xml
--- a/tools/taxonomy/lca.xml Mon Feb 22 13:56:51 2010 -0500
+++ b/tools/taxonomy/lca.xml Mon Feb 22 16:22:12 2010 -0500
@@ -1,4 +1,4 @@
-<tool id="lca1" name="Find lowest diagnostic rank" version="1.0.0">
+<tool id="lca1" name="Find lowest diagnostic rank" version="1.0.1">
   <description></description>
   <command interpreter="python">
     lca.py $input1 $out_file1 $rank_bound
@@ -34,11 +34,23 @@
     <data format="taxonomy" name="out_file1" metadata_source="input1" />
   </outputs>
   <tests>
-    <test>
-      <param name="input1" value="lca_input.taxonomy" ftype="taxonomy"/>
-      <param name="rank_bound" value="0" />
-      <output name="out_file1" file="lca_output.taxonomy" ftype="taxonomy"/>
-    </test>
+    <test>
+      <param name="input1" value="lca_input.taxonomy" ftype="taxonomy"/>
+      <param name="rank_bound" value="0" />
+      <output name="out_file1" file="lca_output.taxonomy" ftype="taxonomy"/>
+    </test>
+    <test>
+      <param name="input1" value="lca_input2.taxonomy" ftype="taxonomy"/>
+      <param name="rank_bound" value="7" />
+      <output name="out_file1" file="lca_output2.taxonomy" ftype="taxonomy"/>
+    </test>
+
+    <!--Test case with invalid lines -->
+    <test>
+      <param name="input1" value="lca_input3.taxonomy" ftype="taxonomy"/>
+      <param name="rank_bound" value="10" />
+      <output name="out_file1" file="lca_output3.taxonomy" ftype="taxonomy"/>
+    </test>
   </tests>
   <help>
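
For readers who want the gist of the change without tracing the hunks, here is a minimal Python 3 sketch of the behaviour the patched tool appears to implement: rows with fewer than 24 columns are skipped, rows are grouped by read name, rank columns 3 to 23 are kept only up to the first rank on which the group disagrees, and any columns past the 24th are carried over from the first row of the group. This is not the code in tools/taxonomy/lca.py. The names `lca`, `collapse_group`, `MIN_COLS` and `FIRST_RANK_COL` are hypothetical, and the in-memory sort is an assumption standing in for the external sort the real script relies on.

```python
# Illustrative sketch only (Python 3); NOT the code from tools/taxonomy/lca.py.
# All names below are hypothetical and chosen for this example.
from itertools import groupby

MIN_COLS = 24        # per the diff: taxonomy rows need at least 24 columns
FIRST_RANK_COL = 3   # assumption from `col >= 3 and col < 24` in the diff


def collapse_group(rows):
    """Collapse all rows of one read into a single row.

    Rank columns keep their value until the first rank where the rows
    disagree; from that rank down they become 'n'. Columns beyond the
    24th are copied from the first row of the group.
    """
    out = list(rows[0][:FIRST_RANK_COL])
    agree = True
    for col in range(FIRST_RANK_COL, MIN_COLS):
        values = {row[col] for row in rows}
        agree = agree and len(values) == 1
        out.append(rows[0][col] if agree else "n")
    out.extend(rows[0][MIN_COLS:])
    return out


def lca(lines, rank_bound=0):
    """Return one collapsed row per read name.

    With rank_bound > 0, a read is dropped when every rank column from
    rank_bound up to column 23 collapsed to 'n'.
    """
    rows = [
        line.rstrip("\r\n").split("\t")
        for line in lines
        if line.strip() and not line.startswith("#")
    ]
    rows = [r for r in rows if len(r) >= MIN_COLS]  # skip invalid lines
    rows.sort(key=lambda r: r[0])  # stable sort; stands in for the external sort
    result = []
    for _, group in groupby(rows, key=lambda r: r[0]):
        out = collapse_group(list(group))
        if rank_bound == 0 or any(v != "n" for v in out[rank_bound:MIN_COLS]):
            result.append(out)
    return result
```

Under these assumptions, feeding it rows shaped like lca_input2.taxonomy with `rank_bound=7` should keep only read_1 (genus through subspecies collapsed to `n`, trailing columns preserved) and drop read_2, whose two hits already differ at column 7; that matches the single line in lca_output2.taxonomy.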
Greg Von Kuster