lists.galaxyproject.org
galaxy-dev
September 2008
galaxy-dev@lists.galaxyproject.org
7 participants
63 discussions
[hg] galaxy 1523: Adding tools to compute Substitution rates.
by greg@scofield.bx.psu.edu
22 Sep '08
details: http://www.bx.psu.edu/hg/galaxy/rev/dabed25dfbaf
changeset: 1523:dabed25dfbaf
user: guru
date: Sun Sep 21 17:36:28 2008 -0400
description:
Adding tools to compute Substitution rates.

7 file(s) affected in this change:

test-data/subRates1.out
test-data/subs.out
tool_conf.xml.sample
tools/regVariation/substitution_rates.py
tools/regVariation/substitution_rates.xml
tools/regVariation/substitutions.py
tools/regVariation/substitutions.xml

diffs (1734 lines):

diff -r 05974294cbf1 -r dabed25dfbaf test-data/subRates1.out
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/subRates1.out Sun Sep 21 17:36:28 2008 -0400
@@ -0,0 +1,13 @@
+#Seq1 Start1 End1 Seq2 Start2 End2 L N p
+hg17.chrX 3816458 3816983 fr1.chrUn 343715247 343715776 525 188 0.3581
+hg17.chrX 3795168 3795525 fr1.chrUn 343710815 343711179 357 92 0.2577
+hg17.chrX 3787425 3787599 fr1.chrUn 343708230 343708404 174 37 0.2126
+hg17.chrX 3787284 3787384 fr1.chrUn 62078707 62078816 100 33 0.3300
+hg17.chrX 3776942 3777227 fr1.chrUn 343707053 343707336 283 122 0.4311
+hg17.chrX 3760375 3760468 fr1.chrUn 343706399 343706492 93 20 0.2151
+hg17.chrX 3733405 3733881 fr1.chrUn 303515824 303516268 444 186 0.4189
+hg17.chrX 3731355 3731463 fr1.chrUn 303515724 303515815 91 36 0.3956
+hg17.chrX 3730591 3731038 fr1.chrUn 303515378 303515724 346 126 0.3642
+hg17.chrX 3729219 3729457 fr1.chrUn 343703525 343703763 238 57 0.2395
+hg17.chrX 3700391 3700698 fr1.chrUn 241017738 241018068 307 112 0.3648
+hg17.chrX 3639441 3639646 fr1.chrUn 333536350 333536563 205 66 0.3220
diff -r 05974294cbf1 -r dabed25dfbaf test-data/subs.out
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/subs.out Sun Sep 21 17:36:28 2008 -0400
@@ -0,0 +1,1379 @@
+#Chr Start End
+hg17.chrX 3816460 3816460
+fr1.chrUn 343715249 343715249
+hg17.chrX 3816462 3816463
+fr1.chrUn 343715251 343715252
+hg17.chrX 3816466 3816466
+fr1.chrUn 343715255 343715255
+hg17.chrX 3816471 3816471
+fr1.chrUn 343715260 343715260
+hg17.chrX 3816473 3816474
+fr1.chrUn 343715262 343715263
+hg17.chrX 3816478 3816479
+fr1.chrUn 343715267 343715268 +hg17.chrX 3816484 3816485 +fr1.chrUn 343715273 343715274 +hg17.chrX 3816493 3816494 +fr1.chrUn 343715284 343715285 +hg17.chrX 3816496 3816499 +fr1.chrUn 343715287 343715290 +hg17.chrX 3816502 3816502 +fr1.chrUn 343715293 343715293 +hg17.chrX 3816504 3816505 +fr1.chrUn 343715295 343715296 +hg17.chrX 3816507 3816507 +fr1.chrUn 343715298 343715298 +hg17.chrX 3816511 3816511 +fr1.chrUn 343715302 343715302 +hg17.chrX 3816515 3816516 +fr1.chrUn 343715306 343715307 +hg17.chrX 3816518 3816518 +fr1.chrUn 343715312 343715312 +hg17.chrX 3816521 3816522 +fr1.chrUn 343715315 343715316 +hg17.chrX 3816524 3816524 +fr1.chrUn 343715318 343715318 +hg17.chrX 3816531 3816531 +fr1.chrUn 343715324 343715324 +hg17.chrX 3816534 3816542 +fr1.chrUn 343715327 343715335 +hg17.chrX 3816544 3816544 +fr1.chrUn 343715337 343715337 +hg17.chrX 3816547 3816549 +fr1.chrUn 343715340 343715342 +hg17.chrX 3816551 3816554 +fr1.chrUn 343715344 343715347 +hg17.chrX 3816556 3816558 +fr1.chrUn 343715349 343715351 +hg17.chrX 3816561 3816561 +fr1.chrUn 343715354 343715354 +hg17.chrX 3816564 3816564 +fr1.chrUn 343715357 343715357 +hg17.chrX 3816568 3816568 +fr1.chrUn 343715361 343715361 +hg17.chrX 3816570 3816571 +fr1.chrUn 343715363 343715364 +hg17.chrX 3816578 3816579 +fr1.chrUn 343715367 343715368 +hg17.chrX 3816582 3816582 +fr1.chrUn 343715371 343715371 +hg17.chrX 3816586 3816591 +fr1.chrUn 343715375 343715380 +hg17.chrX 3816595 3816597 +fr1.chrUn 343715384 343715386 +hg17.chrX 3816600 3816602 +fr1.chrUn 343715389 343715391 +hg17.chrX 3816604 3816604 +fr1.chrUn 343715393 343715393 +hg17.chrX 3816607 3816607 +fr1.chrUn 343715396 343715396 +hg17.chrX 3816611 3816611 +fr1.chrUn 343715402 343715402 +hg17.chrX 3816614 3816616 +fr1.chrUn 343715405 343715407 +hg17.chrX 3816619 3816621 +fr1.chrUn 343715410 343715412 +hg17.chrX 3816625 3816625 +fr1.chrUn 343715416 343715416 +hg17.chrX 3816627 3816628 +fr1.chrUn 343715418 343715419 +hg17.chrX 3816632 3816635 +fr1.chrUn 343715423 
343715426 +hg17.chrX 3816639 3817164 +fr1.chrUn 343715430 343715959 +hg17.chrX 3816645 3816646 +fr1.chrUn 343715441 343715442 +hg17.chrX 3816649 3816650 +fr1.chrUn 343715445 343715446 +hg17.chrX 3816662 3816662 +fr1.chrUn 343715467 343715467 +hg17.chrX 3816665 3816665 +fr1.chrUn 343715470 343715470 +hg17.chrX 3816667 3816668 +fr1.chrUn 343715472 343715473 +hg17.chrX 3816670 3816670 +fr1.chrUn 343715475 343715475 +hg17.chrX 3816672 3816672 +fr1.chrUn 343715477 343715477 +hg17.chrX 3816674 3816678 +fr1.chrUn 343715479 343715483 +hg17.chrX 3816680 3816682 +fr1.chrUn 343715485 343715487 +hg17.chrX 3816684 3816684 +fr1.chrUn 343715489 343715489 +hg17.chrX 3816687 3816687 +fr1.chrUn 343715492 343715492 +hg17.chrX 3816690 3816690 +fr1.chrUn 343715495 343715495 +hg17.chrX 3816693 3816693 +fr1.chrUn 343715498 343715498 +hg17.chrX 3816695 3816695 +fr1.chrUn 343715500 343715500 +hg17.chrX 3816698 3816699 +fr1.chrUn 343715503 343715504 +hg17.chrX 3816714 3816714 +fr1.chrUn 343715519 343715519 +hg17.chrX 3816720 3816720 +fr1.chrUn 343715525 343715525 +hg17.chrX 3816726 3816727 +fr1.chrUn 343715531 343715532 +hg17.chrX 3816736 3816736 +fr1.chrUn 343715541 343715541 +hg17.chrX 3816741 3816741 +fr1.chrUn 343715546 343715546 +hg17.chrX 3816748 3816750 +fr1.chrUn 343715553 343715555 +hg17.chrX 3816752 3816753 +fr1.chrUn 343715557 343715558 +hg17.chrX 3816756 3816757 +fr1.chrUn 343715561 343715562 +hg17.chrX 3816771 3816772 +fr1.chrUn 343715576 343715577 +hg17.chrX 3816777 3816778 +fr1.chrUn 343715582 343715583 +hg17.chrX 3816780 3816781 +fr1.chrUn 343715585 343715586 +hg17.chrX 3816784 3816784 +fr1.chrUn 343715589 343715589 +hg17.chrX 3816786 3816786 +fr1.chrUn 343715591 343715591 +hg17.chrX 3816789 3816790 +fr1.chrUn 343715594 343715595 +hg17.chrX 3816796 3816797 +fr1.chrUn 343715597 343715598 +hg17.chrX 3816800 3816800 +fr1.chrUn 343715601 343715601 +hg17.chrX 3816805 3816808 +fr1.chrUn 343715606 343715609 +hg17.chrX 3816810 3816811 +fr1.chrUn 343715611 343715612 +hg17.chrX 
3816814 3816814 +fr1.chrUn 343715615 343715615 +hg17.chrX 3816818 3816819 +fr1.chrUn 343715619 343715620 +hg17.chrX 3816835 3816835 +fr1.chrUn 343715625 343715625 +hg17.chrX 3816837 3816837 +fr1.chrUn 343715627 343715627 +hg17.chrX 3816841 3816842 +fr1.chrUn 343715631 343715632 +hg17.chrX 3816844 3816846 +fr1.chrUn 343715634 343715636 +hg17.chrX 3816849 3816849 +fr1.chrUn 343715639 343715639 +hg17.chrX 3816853 3816853 +fr1.chrUn 343715643 343715643 +hg17.chrX 3816868 3816868 +fr1.chrUn 343715661 343715661 +hg17.chrX 3816870 3816870 +fr1.chrUn 343715663 343715663 +hg17.chrX 3816878 3816879 +fr1.chrUn 343715671 343715672 +hg17.chrX 3816882 3816882 +fr1.chrUn 343715675 343715675 +hg17.chrX 3816891 3816891 +fr1.chrUn 343715684 343715684 +hg17.chrX 3816894 3816894 +fr1.chrUn 343715687 343715687 +hg17.chrX 3816903 3816903 +fr1.chrUn 343715696 343715696 +hg17.chrX 3816906 3816906 +fr1.chrUn 343715699 343715699 +hg17.chrX 3816909 3816909 +fr1.chrUn 343715702 343715702 +hg17.chrX 3816912 3816912 +fr1.chrUn 343715705 343715705 +hg17.chrX 3816915 3816915 +fr1.chrUn 343715708 343715708 +hg17.chrX 3816918 3816920 +fr1.chrUn 343715711 343715713 +hg17.chrX 3816924 3816924 +fr1.chrUn 343715717 343715717 +hg17.chrX 3816930 3816931 +fr1.chrUn 343715723 343715724 +hg17.chrX 3816935 3816935 +fr1.chrUn 343715728 343715728 +hg17.chrX 3816939 3816939 +fr1.chrUn 343715732 343715732 +hg17.chrX 3816952 3816952 +fr1.chrUn 343715745 343715745 +hg17.chrX 3816958 3816958 +fr1.chrUn 343715751 343715751 +hg17.chrX 3816961 3816961 +fr1.chrUn 343715754 343715754 +hg17.chrX 3816964 3816964 +fr1.chrUn 343715757 343715757 +hg17.chrX 3816966 3816968 +fr1.chrUn 343715759 343715761 +hg17.chrX 3816972 3816972 +fr1.chrUn 343715765 343715765 +hg17.chrX 3816974 3816974 +fr1.chrUn 343715767 343715767 +hg17.chrX 3816976 3816977 +fr1.chrUn 343715769 343715770 +hg17.chrX 3816979 3816980 +fr1.chrUn 343715772 343715773 +hg17.chrX 3795168 3795168 +fr1.chrUn 343710815 343710815 +hg17.chrX 3795170 3795170 +fr1.chrUn 
343710817 343710817 +hg17.chrX 3795175 3795175 +fr1.chrUn 343710822 343710822 +hg17.chrX 3795188 3795188 +fr1.chrUn 343710827 343710827 +hg17.chrX 3795192 3795194 +fr1.chrUn 343710831 343710833 +hg17.chrX 3795196 3795198 +fr1.chrUn 343710835 343710837 +hg17.chrX 3795207 3795208 +fr1.chrUn 343710846 343710847 +hg17.chrX 3795210 3795211 +fr1.chrUn 343710849 343710850 +hg17.chrX 3795218 3795222 +fr1.chrUn 343710861 343710865 +hg17.chrX 3795225 3795226 +fr1.chrUn 343710868 343710869 +hg17.chrX 3795229 3795230 +fr1.chrUn 343710874 343710875 +hg17.chrX 3795235 3795235 +fr1.chrUn 343710887 343710887 +hg17.chrX 3795239 3795239 +fr1.chrUn 343710891 343710891 +hg17.chrX 3795241 3795242 +fr1.chrUn 343710893 343710894 +hg17.chrX 3795245 3795251 +fr1.chrUn 343710897 343710903 +hg17.chrX 3795254 3795259 +fr1.chrUn 343710906 343710911 +hg17.chrX 3795265 3795265 +fr1.chrUn 343710917 343710917 +hg17.chrX 3795268 3795268 +fr1.chrUn 343710920 343710920 +hg17.chrX 3795272 3795272 +fr1.chrUn 343710924 343710924 +hg17.chrX 3795274 3795275 +fr1.chrUn 343710926 343710927 +hg17.chrX 3795284 3795284 +fr1.chrUn 343710940 343710940 +hg17.chrX 3795312 3795312 +fr1.chrUn 343710968 343710968 +hg17.chrX 3795317 3795317 +fr1.chrUn 343710973 343710973 +hg17.chrX 3795326 3795326 +fr1.chrUn 343710982 343710982 +hg17.chrX 3795332 3795332 +fr1.chrUn 343710988 343710988 +hg17.chrX 3795336 3795336 +fr1.chrUn 343710992 343710992 +hg17.chrX 3795338 3795338 +fr1.chrUn 343710994 343710994 +hg17.chrX 3795344 3795344 +fr1.chrUn 343711000 343711000 +hg17.chrX 3795350 3795350 +fr1.chrUn 343711006 343711006 +hg17.chrX 3795353 3795353 +fr1.chrUn 343711009 343711009 +hg17.chrX 3795356 3795356 +fr1.chrUn 343711012 343711012 +hg17.chrX 3795359 3795359 +fr1.chrUn 343711015 343711015 +hg17.chrX 3795377 3795377 +fr1.chrUn 343711033 343711033 +hg17.chrX 3795380 3795380 +fr1.chrUn 343711036 343711036 +hg17.chrX 3795383 3795383 +fr1.chrUn 343711039 343711039 +hg17.chrX 3795386 3795386 +fr1.chrUn 343711042 343711042 
+hg17.chrX 3795389 3795389 +fr1.chrUn 343711045 343711045 +hg17.chrX 3795398 3795398 +fr1.chrUn 343711054 343711054 +hg17.chrX 3795401 3795401 +fr1.chrUn 343711057 343711057 +hg17.chrX 3795407 3795408 +fr1.chrUn 343711063 343711064 +hg17.chrX 3795416 3795416 +fr1.chrUn 343711072 343711072 +hg17.chrX 3795422 3795422 +fr1.chrUn 343711078 343711078 +hg17.chrX 3795425 3795425 +fr1.chrUn 343711081 343711081 +hg17.chrX 3795434 3795434 +fr1.chrUn 343711090 343711090 +hg17.chrX 3795443 3795443 +fr1.chrUn 343711099 343711099 +hg17.chrX 3795446 3795446 +fr1.chrUn 343711102 343711102 +hg17.chrX 3795449 3795449 +fr1.chrUn 343711105 343711105 +hg17.chrX 3795455 3795455 +fr1.chrUn 343711111 343711111 +hg17.chrX 3795461 3795461 +fr1.chrUn 343711117 343711117 +hg17.chrX 3795464 3795464 +fr1.chrUn 343711120 343711120 +hg17.chrX 3795467 3795467 +fr1.chrUn 343711123 343711123 +hg17.chrX 3795481 3795481 +fr1.chrUn 343711131 343711131 +hg17.chrX 3795483 3795483 +fr1.chrUn 343711133 343711133 +hg17.chrX 3795488 3795488 +fr1.chrUn 343711138 343711138 +hg17.chrX 3795491 3795491 +fr1.chrUn 343711141 343711141 +hg17.chrX 3795493 3795493 +fr1.chrUn 343711143 343711143 +hg17.chrX 3795500 3795501 +fr1.chrUn 343711150 343711151 +hg17.chrX 3795505 3795507 +fr1.chrUn 343711159 343711161 +hg17.chrX 3795511 3795511 +fr1.chrUn 343711165 343711165 +hg17.chrX 3795513 3795513 +fr1.chrUn 343711167 343711167 +hg17.chrX 3795515 3795515 +fr1.chrUn 343711169 343711169 +hg17.chrX 3795521 3795521 +fr1.chrUn 343711175 343711175 +hg17.chrX 3795523 3795523 +fr1.chrUn 343711177 343711177 +hg17.chrX 3787426 3787426 +fr1.chrUn 343708231 343708231 +hg17.chrX 3787430 3787430 +fr1.chrUn 343708235 343708235 +hg17.chrX 3787432 3787432 +fr1.chrUn 343708237 343708237 +hg17.chrX 3787435 3787436 +fr1.chrUn 343708240 343708241 +hg17.chrX 3787440 3787440 +fr1.chrUn 343708245 343708245 +hg17.chrX 3787449 3787449 +fr1.chrUn 343708254 343708254 +hg17.chrX 3787452 3787452 +fr1.chrUn 343708257 343708257 +hg17.chrX 3787461 3787462 
+fr1.chrUn 343708266 343708267 +hg17.chrX 3787464 3787464 +fr1.chrUn 343708269 343708269 +hg17.chrX 3787471 3787471 +fr1.chrUn 343708276 343708276 +hg17.chrX 3787473 3787473 +fr1.chrUn 343708278 343708278 +hg17.chrX 3787476 3787477 +fr1.chrUn 343708281 343708282 +hg17.chrX 3787479 3787479 +fr1.chrUn 343708284 343708284 +hg17.chrX 3787491 3787491 +fr1.chrUn 343708296 343708296 +hg17.chrX 3787494 3787494 +fr1.chrUn 343708299 343708299 +hg17.chrX 3787500 3787500 +fr1.chrUn 343708305 343708305 +hg17.chrX 3787503 3787503 +fr1.chrUn 343708308 343708308 +hg17.chrX 3787510 3787510 +fr1.chrUn 343708315 343708315 +hg17.chrX 3787512 3787512 +fr1.chrUn 343708317 343708317 +hg17.chrX 3787515 3787515 +fr1.chrUn 343708320 343708320 +hg17.chrX 3787518 3787518 +fr1.chrUn 343708323 343708323 +hg17.chrX 3787539 3787539 +fr1.chrUn 343708344 343708344 +hg17.chrX 3787545 3787545 +fr1.chrUn 343708350 343708350 +hg17.chrX 3787548 3787548 +fr1.chrUn 343708353 343708353 +hg17.chrX 3787557 3787557 +fr1.chrUn 343708362 343708362 +hg17.chrX 3787561 3787561 +fr1.chrUn 343708366 343708366 +hg17.chrX 3787566 3787566 +fr1.chrUn 343708371 343708371 +hg17.chrX 3787569 3787569 +fr1.chrUn 343708374 343708374 +hg17.chrX 3787572 3787572 +fr1.chrUn 343708377 343708377 +hg17.chrX 3787578 3787578 +fr1.chrUn 343708383 343708383 +hg17.chrX 3787581 3787581 +fr1.chrUn 343708386 343708386 +hg17.chrX 3787584 3787584 +fr1.chrUn 343708389 343708389 +hg17.chrX 3787587 3787587 +fr1.chrUn 343708392 343708392 +hg17.chrX 3787590 3787590 +fr1.chrUn 343708395 343708395 +hg17.chrX 3787285 3787285 +fr1.chrUn 62078708 62078708 +hg17.chrX 3787293 3787296 +fr1.chrUn 62078716 62078719 +hg17.chrX 3787301 3787301 +fr1.chrUn 62078724 62078724 +hg17.chrX 3787303 3787303 +fr1.chrUn 62078726 62078726 +hg17.chrX 3787305 3787307 +fr1.chrUn 62078728 62078730 +hg17.chrX 3787323 3787423 +fr1.chrUn 62078739 62078848 +hg17.chrX 3787326 3787326 +fr1.chrUn 62078741 62078741 +hg17.chrX 3787328 3787328 +fr1.chrUn 62078743 62078743 +hg17.chrX 
3787332 3787333 +fr1.chrUn 62078747 62078748 +hg17.chrX 3787335 3787336 +fr1.chrUn 62078750 62078751 +hg17.chrX 3787339 3787339 +fr1.chrUn 62078754 62078754 +hg17.chrX 3787342 3787343 +fr1.chrUn 62078757 62078758 +hg17.chrX 3787346 3787346 +fr1.chrUn 62078761 62078761 +hg17.chrX 3787348 3787448 +fr1.chrUn 62078763 62078872 +hg17.chrX 3787349 3787349 +fr1.chrUn 62078768 62078768 +hg17.chrX 3787355 3787355 +fr1.chrUn 62078774 62078774 +hg17.chrX 3787357 3787358 +fr1.chrUn 62078776 62078777 +hg17.chrX 3787360 3787360 +fr1.chrUn 62078779 62078779 +hg17.chrX 3787364 3787364 +fr1.chrUn 62078783 62078783 +hg17.chrX 3787369 3787369 +fr1.chrUn 62078796 62078796 +hg17.chrX 3787372 3787372 +fr1.chrUn 62078799 62078799 +hg17.chrX 3787378 3787378 +fr1.chrUn 62078810 62078810 +hg17.chrX 3776943 3776944 +fr1.chrUn 343707054 343707055 +hg17.chrX 3776946 3776946 +fr1.chrUn 343707057 343707057 +hg17.chrX 3776948 3776949 +fr1.chrUn 343707059 343707060 +hg17.chrX 3776951 3776951 +fr1.chrUn 343707062 343707062 +hg17.chrX 3776954 3776954 +fr1.chrUn 343707065 343707065 +hg17.chrX 3776957 3776958 +fr1.chrUn 343707068 343707069 +hg17.chrX 3776960 3776961 +fr1.chrUn 343707071 343707072 +hg17.chrX 3776963 3776963 +fr1.chrUn 343707074 343707074 +hg17.chrX 3776965 3776966 +fr1.chrUn 343707076 343707077 +hg17.chrX 3776968 3776969 +fr1.chrUn 343707079 343707080 +hg17.chrX 3776974 3776976 +fr1.chrUn 343707085 343707087 +hg17.chrX 3776980 3776980 +fr1.chrUn 343707091 343707091 +hg17.chrX 3776983 3776986 +fr1.chrUn 343707094 343707097 +hg17.chrX 3776995 3776995 +fr1.chrUn 343707102 343707102 +hg17.chrX 3776997 3776997 +fr1.chrUn 343707104 343707104 +hg17.chrX 3776999 3777000 +fr1.chrUn 343707106 343707107 +hg17.chrX 3777002 3777002 +fr1.chrUn 343707109 343707109 +hg17.chrX 3777005 3777007 +fr1.chrUn 343707112 343707114 +hg17.chrX 3777009 3777010 +fr1.chrUn 343707116 343707117 +hg17.chrX 3777012 3777012 +fr1.chrUn 343707119 343707119 +hg17.chrX 3777014 3777015 +fr1.chrUn 343707121 343707122 
+hg17.chrX 3777018 3777018 +fr1.chrUn 343707125 343707125 +hg17.chrX 3777022 3777022 +fr1.chrUn 343707129 343707129 +hg17.chrX 3777024 3777026 +fr1.chrUn 343707131 343707133 +hg17.chrX 3777028 3777028 +fr1.chrUn 343707135 343707135 +hg17.chrX 3777030 3777033 +fr1.chrUn 343707137 343707140 +hg17.chrX 3777035 3777039 +fr1.chrUn 343707142 343707146 +hg17.chrX 3777041 3777041 +fr1.chrUn 343707148 343707148 +hg17.chrX 3777044 3777044 +fr1.chrUn 343707151 343707151 +hg17.chrX 3777046 3777046 +fr1.chrUn 343707153 343707153 +hg17.chrX 3777049 3777050 +fr1.chrUn 343707156 343707157 +hg17.chrX 3777053 3777054 +fr1.chrUn 343707160 343707161 +hg17.chrX 3777056 3777057 +fr1.chrUn 343707163 343707164 +hg17.chrX 3777059 3777059 +fr1.chrUn 343707166 343707166 +hg17.chrX 3777062 3777063 +fr1.chrUn 343707169 343707170 +hg17.chrX 3777065 3777066 +fr1.chrUn 343707172 343707173 +hg17.chrX 3777068 3777068 +fr1.chrUn 343707175 343707175 +hg17.chrX 3777071 3777073 +fr1.chrUn 343707178 343707180 +hg17.chrX 3777076 3777076 +fr1.chrUn 343707185 343707185 +hg17.chrX 3777081 3777081 +fr1.chrUn 343707190 343707190 +hg17.chrX 3777084 3777084 +fr1.chrUn 343707193 343707193 +hg17.chrX 3777087 3777087 +fr1.chrUn 343707196 343707196 +hg17.chrX 3777090 3777090 +fr1.chrUn 343707199 343707199 +hg17.chrX 3777092 3777095 +fr1.chrUn 343707201 343707204 +hg17.chrX 3777099 3777099 +fr1.chrUn 343707208 343707208 +hg17.chrX 3777103 3777103 +fr1.chrUn 343707212 343707212 +hg17.chrX 3777108 3777111 +fr1.chrUn 343707217 343707220 +hg17.chrX 3777119 3777120 +fr1.chrUn 343707228 343707229 +hg17.chrX 3777123 3777124 +fr1.chrUn 343707232 343707233 +hg17.chrX 3777126 3777127 +fr1.chrUn 343707235 343707236 +hg17.chrX 3777129 3777129 +fr1.chrUn 343707238 343707238 +hg17.chrX 3777131 3777132 +fr1.chrUn 343707240 343707241 +hg17.chrX 3777135 3777135 +fr1.chrUn 343707244 343707244 +hg17.chrX 3777139 3777141 +fr1.chrUn 343707248 343707250 +hg17.chrX 3777144 3777144 +fr1.chrUn 343707253 343707253 +hg17.chrX 3777148 3777148 
+fr1.chrUn 343707257 343707257 +hg17.chrX 3777153 3777153 +fr1.chrUn 343707262 343707262 +hg17.chrX 3777156 3777156 +fr1.chrUn 343707265 343707265 +hg17.chrX 3777159 3777160 +fr1.chrUn 343707268 343707269 +hg17.chrX 3777162 3777163 +fr1.chrUn 343707271 343707272 +hg17.chrX 3777177 3777178 +fr1.chrUn 343707286 343707287 +hg17.chrX 3777180 3777181 +fr1.chrUn 343707289 343707290 +hg17.chrX 3777186 3777186 +fr1.chrUn 343707295 343707295 +hg17.chrX 3777189 3777189 +fr1.chrUn 343707298 343707298 +hg17.chrX 3777193 3777193 +fr1.chrUn 343707302 343707302 +hg17.chrX 3777198 3777198 +fr1.chrUn 343707307 343707307 +hg17.chrX 3777200 3777200 +fr1.chrUn 343707309 343707309 +hg17.chrX 3777204 3777204 +fr1.chrUn 343707313 343707313 +hg17.chrX 3777206 3777207 +fr1.chrUn 343707315 343707316 +hg17.chrX 3777211 3777211 +fr1.chrUn 343707320 343707320 +hg17.chrX 3777213 3777213 +fr1.chrUn 343707322 343707322 +hg17.chrX 3777216 3777216 +fr1.chrUn 343707325 343707325 +hg17.chrX 3777219 3777219 +fr1.chrUn 343707328 343707328 +hg17.chrX 3760376 3760376 +fr1.chrUn 343706400 343706400 +hg17.chrX 3760382 3760382 +fr1.chrUn 343706406 343706406 +hg17.chrX 3760385 3760385 +fr1.chrUn 343706409 343706409 +hg17.chrX 3760388 3760388 +fr1.chrUn 343706412 343706412 +hg17.chrX 3760391 3760391 +fr1.chrUn 343706415 343706415 +hg17.chrX 3760400 3760400 +fr1.chrUn 343706424 343706424 +hg17.chrX 3760409 3760410 +fr1.chrUn 343706433 343706434 +hg17.chrX 3760415 3760415 +fr1.chrUn 343706439 343706439 +hg17.chrX 3760418 3760418 +fr1.chrUn 343706442 343706442 +hg17.chrX 3760421 3760421 +fr1.chrUn 343706445 343706445 +hg17.chrX 3760430 3760432 +fr1.chrUn 343706454 343706456 +hg17.chrX 3760436 3760436 +fr1.chrUn 343706460 343706460 +hg17.chrX 3760442 3760442 +fr1.chrUn 343706466 343706466 +hg17.chrX 3760445 3760445 +fr1.chrUn 343706469 343706469 +hg17.chrX 3760448 3760448 +fr1.chrUn 343706472 343706472 +hg17.chrX 3760460 3760460 +fr1.chrUn 343706484 343706484 +hg17.chrX 3760465 3760465 +fr1.chrUn 343706489 
343706489 +hg17.chrX 3733406 3733406 +fr1.chrUn 303515825 303515825 +hg17.chrX 3733409 3733409 +fr1.chrUn 303515828 303515828 +hg17.chrX 3733413 3733414 +fr1.chrUn 303515832 303515833 +hg17.chrX 3733417 3733419 +fr1.chrUn 303515836 303515838 +hg17.chrX 3733426 3733427 +fr1.chrUn 303515845 303515846 +hg17.chrX 3733429 3733429 +fr1.chrUn 303515848 303515848 +hg17.chrX 3733431 3733431 +fr1.chrUn 303515850 303515850 +hg17.chrX 3733433 3733433 +fr1.chrUn 303515852 303515852 +hg17.chrX 3733436 3733436 +fr1.chrUn 303515855 303515855 +hg17.chrX 3733440 3733440 +fr1.chrUn 303515859 303515859 +hg17.chrX 3733445 3733445 +fr1.chrUn 303515864 303515864 +hg17.chrX 3733454 3733454 +fr1.chrUn 303515871 303515871 +hg17.chrX 3733456 3733457 +fr1.chrUn 303515873 303515874 +hg17.chrX 3733479 3733479 +fr1.chrUn 303515877 303515877 +hg17.chrX 3733484 3733488 +fr1.chrUn 303515882 303515886 +hg17.chrX 3733491 3733491 +fr1.chrUn 303515889 303515889 +hg17.chrX 3733493 3733494 +fr1.chrUn 303515891 303515892 +hg17.chrX 3733496 3733499 +fr1.chrUn 303515894 303515897 +hg17.chrX 3733501 3733501 +fr1.chrUn 303515899 303515899 +hg17.chrX 3733503 3733504 +fr1.chrUn 303515901 303515902 +hg17.chrX 3733506 3733506 +fr1.chrUn 303515904 303515904 +hg17.chrX 3733508 3733508 +fr1.chrUn 303515906 303515906 +hg17.chrX 3733510 3733510 +fr1.chrUn 303515908 303515908 +hg17.chrX 3733519 3733519 +fr1.chrUn 303515910 303515910 +hg17.chrX 3733521 3733521 +fr1.chrUn 303515912 303515912 +hg17.chrX 3733523 3733523 +fr1.chrUn 303515914 303515914 +hg17.chrX 3733528 3733529 +fr1.chrUn 303515919 303515920 +hg17.chrX 3733537 3733538 +fr1.chrUn 303515925 303515926 +hg17.chrX 3733541 3733541 +fr1.chrUn 303515929 303515929 +hg17.chrX 3733543 3733543 +fr1.chrUn 303515931 303515931 +hg17.chrX 3733549 3733549 +fr1.chrUn 303515937 303515937 +hg17.chrX 3733551 3733553 +fr1.chrUn 303515939 303515941 +hg17.chrX 3733555 3733559 +fr1.chrUn 303515943 303515947 +hg17.chrX 3733563 3733564 +fr1.chrUn 303515951 303515952 +hg17.chrX 
3733567 3733567 +fr1.chrUn 303515955 303515955 +hg17.chrX 3733569 3733569 +fr1.chrUn 303515957 303515957 +hg17.chrX 3733574 3733574 +fr1.chrUn 303515962 303515962 +hg17.chrX 3733579 3733581 +fr1.chrUn 303515967 303515969 +hg17.chrX 3733591 3733592 +fr1.chrUn 303515979 303515980 +hg17.chrX 3733594 3733596 +fr1.chrUn 303515982 303515984 +hg17.chrX 3733600 3733601 +fr1.chrUn 303515988 303515989 +hg17.chrX 3733607 3733608 +fr1.chrUn 303515995 303515996 +hg17.chrX 3733610 3734086 +fr1.chrUn 303515998 303516442 +hg17.chrX 3733612 3733612 +fr1.chrUn 303516003 303516003 +hg17.chrX 3733614 3733614 +fr1.chrUn 303516005 303516005 +hg17.chrX 3733617 3733618 +fr1.chrUn 303516008 303516009 +hg17.chrX 3733620 3733620 +fr1.chrUn 303516011 303516011 +hg17.chrX 3733623 3733625 +fr1.chrUn 303516014 303516016 +hg17.chrX 3733629 3733632 +fr1.chrUn 303516020 303516023 +hg17.chrX 3733634 3733634 +fr1.chrUn 303516025 303516025 +hg17.chrX 3733636 3733636 +fr1.chrUn 303516027 303516027 +hg17.chrX 3733642 3733642 +fr1.chrUn 303516033 303516033 +hg17.chrX 3733644 3733645 +fr1.chrUn 303516035 303516036 +hg17.chrX 3733647 3733648 +fr1.chrUn 303516038 303516039 +hg17.chrX 3733651 3733651 +fr1.chrUn 303516042 303516042 +hg17.chrX 3733653 3734129 +fr1.chrUn 303516044 303516488 +hg17.chrX 3733657 3733657 +fr1.chrUn 303516053 303516053 +hg17.chrX 3733661 3733662 +fr1.chrUn 303516057 303516058 +hg17.chrX 3733666 3733666 +fr1.chrUn 303516062 303516062 +hg17.chrX 3733670 3733671 +fr1.chrUn 303516066 303516067 +hg17.chrX 3733673 3733673 +fr1.chrUn 303516069 303516069 +hg17.chrX 3733677 3733677 +fr1.chrUn 303516073 303516073 +hg17.chrX 3733680 3733685 +fr1.chrUn 303516076 303516081 +hg17.chrX 3733689 3733692 +fr1.chrUn 303516085 303516088 +hg17.chrX 3733694 3733695 +fr1.chrUn 303516090 303516091 +hg17.chrX 3733697 3733698 +fr1.chrUn 303516093 303516094 +hg17.chrX 3733700 3733704 +fr1.chrUn 303516096 303516100 +hg17.chrX 3733709 3733710 +fr1.chrUn 303516105 303516106 +hg17.chrX 3733715 3733716 +fr1.chrUn 
303516111 303516112 +hg17.chrX 3733718 3733718 +fr1.chrUn 303516114 303516114 +hg17.chrX 3733720 3733720 +fr1.chrUn 303516116 303516116 +hg17.chrX 3733723 3733723 +fr1.chrUn 303516119 303516119 +hg17.chrX 3733733 3733733 +fr1.chrUn 303516127 303516127 +hg17.chrX 3733735 3733736 +fr1.chrUn 303516129 303516130 +hg17.chrX 3733741 3733741 +fr1.chrUn 303516135 303516135 +hg17.chrX 3733747 3733748 +fr1.chrUn 303516143 303516144 +hg17.chrX 3733750 3733751 +fr1.chrUn 303516146 303516147 +hg17.chrX 3733753 3733753 +fr1.chrUn 303516149 303516149 +hg17.chrX 3733758 3733762 +fr1.chrUn 303516154 303516158 +hg17.chrX 3733765 3733765 +fr1.chrUn 303516161 303516161 +hg17.chrX 3733767 3733767 +fr1.chrUn 303516163 303516163 +hg17.chrX 3733769 3733769 +fr1.chrUn 303516165 303516165 +hg17.chrX 3733771 3733773 +fr1.chrUn 303516167 303516169 +hg17.chrX 3733775 3733775 +fr1.chrUn 303516171 303516171 +hg17.chrX 3733778 3733778 +fr1.chrUn 303516174 303516174 +hg17.chrX 3733781 3733781 +fr1.chrUn 303516177 303516177 +hg17.chrX 3733787 3734263 +fr1.chrUn 303516183 303516627 +hg17.chrX 3733809 3733810 +fr1.chrUn 303516191 303516192 +hg17.chrX 3733814 3733814 +fr1.chrUn 303516196 303516196 +hg17.chrX 3733819 3733819 +fr1.chrUn 303516206 303516206 +hg17.chrX 3733823 3733823 +fr1.chrUn 303516210 303516210 +hg17.chrX 3733825 3733825 +fr1.chrUn 303516212 303516212 +hg17.chrX 3733829 3733830 +fr1.chrUn 303516216 303516217 +hg17.chrX 3733832 3733832 +fr1.chrUn 303516219 303516219 +hg17.chrX 3733834 3733834 +fr1.chrUn 303516221 303516221 +hg17.chrX 3733836 3733837 +fr1.chrUn 303516223 303516224 +hg17.chrX 3733843 3733846 +fr1.chrUn 303516230 303516233 +hg17.chrX 3733850 3733854 +fr1.chrUn 303516237 303516241 +hg17.chrX 3733856 3733858 +fr1.chrUn 303516243 303516245 +hg17.chrX 3733861 3733861 +fr1.chrUn 303516248 303516248 +hg17.chrX 3733863 3733865 +fr1.chrUn 303516250 303516252 +hg17.chrX 3733869 3733869 +fr1.chrUn 303516256 303516256 +hg17.chrX 3733871 3733874 +fr1.chrUn 303516258 303516261 
+hg17.chrX 3733879 3733879 +fr1.chrUn 303516266 303516266 +hg17.chrX 3731359 3731359 +fr1.chrUn 303515728 303515728 +hg17.chrX 3731361 3731361 +fr1.chrUn 303515730 303515730 +hg17.chrX 3731363 3731363 +fr1.chrUn 303515732 303515732 +hg17.chrX 3731365 3731366 +fr1.chrUn 303515734 303515735 +hg17.chrX 3731368 3731368 +fr1.chrUn 303515737 303515737 +hg17.chrX 3731376 3731376 +fr1.chrUn 303515739 303515739 +hg17.chrX 3731378 3731378 +fr1.chrUn 303515741 303515741 +hg17.chrX 3731381 3731382 +fr1.chrUn 303515744 303515745 +hg17.chrX 3731385 3731385 +fr1.chrUn 303515748 303515748 +hg17.chrX 3731391 3731391 +fr1.chrUn 303515753 303515753 +hg17.chrX 3731395 3731397 +fr1.chrUn 303515757 303515759 +hg17.chrX 3731400 3731400 +fr1.chrUn 303515762 303515762 +hg17.chrX 3731403 3731407 +fr1.chrUn 303515765 303515769 +hg17.chrX 3731410 3731410 +fr1.chrUn 303515772 303515772 +hg17.chrX 3731412 3731415 +fr1.chrUn 303515774 303515777 +hg17.chrX 3731419 3731419 +fr1.chrUn 303515781 303515781 +hg17.chrX 3731430 3731430 +fr1.chrUn 303515786 303515786 +hg17.chrX 3731433 3731433 +fr1.chrUn 303515789 303515789 +hg17.chrX 3731435 3731435 +fr1.chrUn 303515791 303515791 +hg17.chrX 3731439 3731439 +fr1.chrUn 303515795 303515795 +hg17.chrX 3731441 3731443 +fr1.chrUn 303515797 303515799 +hg17.chrX 3731446 3731446 +fr1.chrUn 303515802 303515802 +hg17.chrX 3731449 3731449 +fr1.chrUn 303515805 303515805 +hg17.chrX 3730593 3730593 +fr1.chrUn 303515380 303515380 +hg17.chrX 3730596 3730597 +fr1.chrUn 303515383 303515384 +hg17.chrX 3730600 3730600 +fr1.chrUn 303515387 303515387 +hg17.chrX 3730602 3730602 +fr1.chrUn 303515389 303515389 +hg17.chrX 3730604 3730608 +fr1.chrUn 303515391 303515395 +hg17.chrX 3730610 3730612 +fr1.chrUn 303515397 303515399 +hg17.chrX 3730618 3730618 +fr1.chrUn 303515405 303515405 +hg17.chrX 3730622 3730623 +fr1.chrUn 303515409 303515410 +hg17.chrX 3730628 3730628 +fr1.chrUn 303515415 303515415 +hg17.chrX 3730630 3730631 +fr1.chrUn 303515417 303515418 +hg17.chrX 3730633 3730633 
+fr1.chrUn 303515420 303515420 +hg17.chrX 3730635 3730635 +fr1.chrUn 303515422 303515422 +hg17.chrX 3730639 3730642 +fr1.chrUn 303515426 303515429 +hg17.chrX 3730644 3730644 +fr1.chrUn 303515433 303515433 +hg17.chrX 3730646 3730647 +fr1.chrUn 303515435 303515436 +hg17.chrX 3730651 3730651 +fr1.chrUn 303515440 303515440 +hg17.chrX 3730659 3730659 +fr1.chrUn 303515448 303515448 +hg17.chrX 3730662 3730662 +fr1.chrUn 303515451 303515451 +hg17.chrX 3730664 3730664 +fr1.chrUn 303515453 303515453 +hg17.chrX 3730666 3730666 +fr1.chrUn 303515455 303515455 +hg17.chrX 3730670 3730670 +fr1.chrUn 303515457 303515457 +hg17.chrX 3730672 3730674 +fr1.chrUn 303515459 303515461 +hg17.chrX 3730681 3731128 +fr1.chrUn 303515468 303515814 +hg17.chrX 3730685 3730685 +fr1.chrUn 303515471 303515471 +hg17.chrX 3730688 3730690 +fr1.chrUn 303515474 303515476 +hg17.chrX 3730694 3730694 +fr1.chrUn 303515480 303515480 +hg17.chrX 3730696 3730696 +fr1.chrUn 303515482 303515482 +hg17.chrX 3730700 3730701 +fr1.chrUn 303515486 303515487 +hg17.chrX 3730703 3730705 +fr1.chrUn 303515489 303515491 +hg17.chrX 3730717 3730717 +fr1.chrUn 303515500 303515500 +hg17.chrX 3730721 3730721 +fr1.chrUn 303515504 303515504 +hg17.chrX 3730723 3730723 +fr1.chrUn 303515506 303515506 +hg17.chrX 3730726 3730728 +fr1.chrUn 303515509 303515511 +hg17.chrX 3730730 3730730 +fr1.chrUn 303515513 303515513 +hg17.chrX 3730732 3730733 +fr1.chrUn 303515515 303515516 +hg17.chrX 3730756 3730756 +fr1.chrUn 303515525 303515525 +hg17.chrX 3730758 3730758 +fr1.chrUn 303515527 303515527 +hg17.chrX 3730760 3730760 +fr1.chrUn 303515529 303515529 +hg17.chrX 3730762 3730762 +fr1.chrUn 303515531 303515531 +hg17.chrX 3730765 3730765 +fr1.chrUn 303515534 303515534 +hg17.chrX 3730774 3730774 +fr1.chrUn 303515540 303515540 +hg17.chrX 3730776 3730776 +fr1.chrUn 303515542 303515542 +hg17.chrX 3730778 3730779 +fr1.chrUn 303515544 303515545 +hg17.chrX 3730790 3730791 +fr1.chrUn 303515550 303515551 +hg17.chrX 3730796 3730796 +fr1.chrUn 303515556 
303515556 +hg17.chrX 3730798 3730799 +fr1.chrUn 303515558 303515559 +hg17.chrX 3730802 3730802 +fr1.chrUn 303515562 303515562 +hg17.chrX 3730804 3730804 +fr1.chrUn 303515564 303515564 +hg17.chrX 3730807 3730807 +fr1.chrUn 303515567 303515567 +hg17.chrX 3730810 3730810 +fr1.chrUn 303515570 303515570 +hg17.chrX 3730822 3730822 +fr1.chrUn 303515578 303515578 +hg17.chrX 3730824 3730824 +fr1.chrUn 303515580 303515580 +hg17.chrX 3730828 3730831 +fr1.chrUn 303515584 303515587 +hg17.chrX 3730834 3730834 +fr1.chrUn 303515590 303515590 +hg17.chrX 3730837 3730838 +fr1.chrUn 303515593 303515594 +hg17.chrX 3730841 3730841 +fr1.chrUn 303515597 303515597 +hg17.chrX 3730850 3730850 +fr1.chrUn 303515602 303515602 +hg17.chrX 3730854 3730855 +fr1.chrUn 303515606 303515607 +hg17.chrX 3730857 3730857 +fr1.chrUn 303515609 303515609 +hg17.chrX 3730861 3730861 +fr1.chrUn 303515613 303515613 +hg17.chrX 3730863 3730864 +fr1.chrUn 303515615 303515616 +hg17.chrX 3730876 3730876 +fr1.chrUn 303515624 303515624 +hg17.chrX 3730880 3730880 +fr1.chrUn 303515628 303515628 +hg17.chrX 3730882 3730883 +fr1.chrUn 303515630 303515631 +hg17.chrX 3730885 3730885 +fr1.chrUn 303515633 303515633 +hg17.chrX 3730887 3730889 +fr1.chrUn 303515635 303515637 +hg17.chrX 3730892 3730892 +fr1.chrUn 303515640 303515640 +hg17.chrX 3730928 3730928 +fr1.chrUn 303515646 303515646 +hg17.chrX 3730931 3730931 +fr1.chrUn 303515649 303515649 +hg17.chrX 3730933 3730933 +fr1.chrUn 303515651 303515651 +hg17.chrX 3730936 3730936 +fr1.chrUn 303515654 303515654 +hg17.chrX 3730938 3730938 +fr1.chrUn 303515656 303515656 +hg17.chrX 3730950 3730950 +fr1.chrUn 303515664 303515664 +hg17.chrX 3730952 3730952 +fr1.chrUn 303515666 303515666 +hg17.chrX 3730955 3730955 +fr1.chrUn 303515669 303515669 +hg17.chrX 3730957 3730957 +fr1.chrUn 303515671 303515671 +hg17.chrX 3730959 3730959 +fr1.chrUn 303515673 303515673 +hg17.chrX 3730977 3730977 +fr1.chrUn 303515675 303515675 +hg17.chrX 3730981 3730981 +fr1.chrUn 303515679 303515679 +hg17.chrX 
3730984 3730984 +fr1.chrUn 303515682 303515682 +hg17.chrX 3730988 3730988 +fr1.chrUn 303515686 303515686 +hg17.chrX 3730992 3731439 +fr1.chrUn 303515690 303516036 +hg17.chrX 3731005 3731005 +fr1.chrUn 303515693 303515693 +hg17.chrX 3731007 3731007 +fr1.chrUn 303515695 303515695 +hg17.chrX 3731019 3731019 +fr1.chrUn 303515705 303515705 +hg17.chrX 3731024 3731024 +fr1.chrUn 303515710 303515710 +hg17.chrX 3731026 3731027 +fr1.chrUn 303515712 303515713 +hg17.chrX 3731031 3731032 +fr1.chrUn 303515717 303515718 +hg17.chrX 3731034 3731034 +fr1.chrUn 303515720 303515720 +hg17.chrX 3729222 3729223 +fr1.chrUn 343703528 343703529 +hg17.chrX 3729234 3729234 +fr1.chrUn 343703540 343703540 +hg17.chrX 3729237 3729237 +fr1.chrUn 343703543 343703543 +hg17.chrX 3729240 3729240 +fr1.chrUn 343703546 343703546 +hg17.chrX 3729243 3729243 +fr1.chrUn 343703549 343703549 +hg17.chrX 3729246 3729246 +fr1.chrUn 343703552 343703552 +hg17.chrX 3729249 3729249 +fr1.chrUn 343703555 343703555 +hg17.chrX 3729252 3729252 +fr1.chrUn 343703558 343703558 +hg17.chrX 3729257 3729258 +fr1.chrUn 343703563 343703564 +hg17.chrX 3729262 3729264 +fr1.chrUn 343703568 343703570 +hg17.chrX 3729267 3729267 +fr1.chrUn 343703573 343703573 +hg17.chrX 3729270 3729270 +fr1.chrUn 343703576 343703576 +hg17.chrX 3729273 3729273 +fr1.chrUn 343703579 343703579 +hg17.chrX 3729276 3729276 +fr1.chrUn 343703582 343703582 +hg17.chrX 3729279 3729279 +fr1.chrUn 343703585 343703585 +hg17.chrX 3729288 3729288 +fr1.chrUn 343703594 343703594 +hg17.chrX 3729291 3729291 +fr1.chrUn 343703597 343703597 +hg17.chrX 3729295 3729295 +fr1.chrUn 343703601 343703601 +hg17.chrX 3729298 3729298 +fr1.chrUn 343703604 343703604 +hg17.chrX 3729300 3729301 +fr1.chrUn 343703606 343703607 +hg17.chrX 3729303 3729303 +fr1.chrUn 343703609 343703609 +hg17.chrX 3729306 3729306 +fr1.chrUn 343703612 343703612 +hg17.chrX 3729315 3729315 +fr1.chrUn 343703621 343703621 +hg17.chrX 3729324 3729324 +fr1.chrUn 343703630 343703630 +hg17.chrX 3729333 3729333 +fr1.chrUn 
343703639 343703639 +hg17.chrX 3729339 3729339 +fr1.chrUn 343703645 343703645 +hg17.chrX 3729342 3729342 +fr1.chrUn 343703648 343703648 +hg17.chrX 3729351 3729351 +fr1.chrUn 343703657 343703657 +hg17.chrX 3729360 3729360 +fr1.chrUn 343703666 343703666 +hg17.chrX 3729363 3729363 +fr1.chrUn 343703669 343703669 +hg17.chrX 3729369 3729369 +fr1.chrUn 343703675 343703675 +hg17.chrX 3729372 3729372 +fr1.chrUn 343703678 343703678 +hg17.chrX 3729375 3729375 +fr1.chrUn 343703681 343703681 +hg17.chrX 3729378 3729378 +fr1.chrUn 343703684 343703684 +hg17.chrX 3729381 3729381 +fr1.chrUn 343703687 343703687 +hg17.chrX 3729390 3729390 +fr1.chrUn 343703696 343703696 +hg17.chrX 3729393 3729393 +fr1.chrUn 343703699 343703699 +hg17.chrX 3729396 3729396 +fr1.chrUn 343703702 343703702 +hg17.chrX 3729402 3729402 +fr1.chrUn 343703708 343703708 +hg17.chrX 3729408 3729409 +fr1.chrUn 343703714 343703715 +hg17.chrX 3729411 3729412 +fr1.chrUn 343703717 343703718 +hg17.chrX 3729417 3729417 +fr1.chrUn 343703723 343703723 +hg17.chrX 3729426 3729426 +fr1.chrUn 343703732 343703732 +hg17.chrX 3729429 3729429 +fr1.chrUn 343703735 343703735 +hg17.chrX 3729432 3729432 +fr1.chrUn 343703738 343703738 +hg17.chrX 3729435 3729435 +fr1.chrUn 343703741 343703741 +hg17.chrX 3729449 3729449 +fr1.chrUn 343703755 343703755 +hg17.chrX 3729452 3729454 +fr1.chrUn 343703758 343703760 +hg17.chrX 3700392 3700392 +fr1.chrUn 241017739 241017739 +hg17.chrX 3700394 3700394 +fr1.chrUn 241017741 241017741 +hg17.chrX 3700396 3700396 +fr1.chrUn 241017743 241017743 +hg17.chrX 3700400 3700401 +fr1.chrUn 241017747 241017748 +hg17.chrX 3700406 3700406 +fr1.chrUn 241017753 241017753 +hg17.chrX 3700409 3700410 +fr1.chrUn 241017756 241017757 +hg17.chrX 3700412 3700412 +fr1.chrUn 241017759 241017759 +hg17.chrX 3700418 3700420 +fr1.chrUn 241017766 241017768 +hg17.chrX 3700425 3700425 +fr1.chrUn 241017774 241017774 +hg17.chrX 3700430 3700431 +fr1.chrUn 241017782 241017783 +hg17.chrX 3700433 3700433 +fr1.chrUn 241017785 241017785 
+hg17.chrX 3700438 3700438 +fr1.chrUn 241017790 241017790 +hg17.chrX 3700441 3700441 +fr1.chrUn 241017793 241017793 +hg17.chrX 3700448 3700449 +fr1.chrUn 241017800 241017801 +hg17.chrX 3700451 3700451 +fr1.chrUn 241017803 241017803 +hg17.chrX 3700454 3700460 +fr1.chrUn 241017806 241017812 +hg17.chrX 3700462 3700466 +fr1.chrUn 241017814 241017818 +hg17.chrX 3700469 3700469 +fr1.chrUn 241017821 241017821 +hg17.chrX 3700471 3700472 +fr1.chrUn 241017823 241017824 +hg17.chrX 3700474 3700474 +fr1.chrUn 241017826 241017826 +hg17.chrX 3700477 3700477 +fr1.chrUn 241017829 241017829 +hg17.chrX 3700480 3700787 +fr1.chrUn 241017832 241018162 +hg17.chrX 3700485 3700486 +fr1.chrUn 241017834 241017835 +hg17.chrX 3700489 3700489 +fr1.chrUn 241017838 241017838 +hg17.chrX 3700491 3700491 +fr1.chrUn 241017840 241017840 +hg17.chrX 3700493 3700493 +fr1.chrUn 241017842 241017842 +hg17.chrX 3700496 3700496 +fr1.chrUn 241017845 241017845 +hg17.chrX 3700502 3700502 +fr1.chrUn 241017851 241017851 +hg17.chrX 3700505 3700505 +fr1.chrUn 241017854 241017854 +hg17.chrX 3700511 3700511 +fr1.chrUn 241017860 241017860 +hg17.chrX 3700514 3700514 +fr1.chrUn 241017863 241017863 +hg17.chrX 3700517 3700517 +fr1.chrUn 241017866 241017866 +hg17.chrX 3700520 3700520 +fr1.chrUn 241017869 241017869 +hg17.chrX 3700526 3700526 +fr1.chrUn 241017875 241017875 +hg17.chrX 3700535 3700535 +fr1.chrUn 241017884 241017884 +hg17.chrX 3700547 3700549 +fr1.chrUn 241017896 241017898 +hg17.chrX 3700553 3700553 +fr1.chrUn 241017902 241017902 +hg17.chrX 3700563 3700564 +fr1.chrUn 241017921 241017922 +hg17.chrX 3700566 3700569 +fr1.chrUn 241017924 241017927 +hg17.chrX 3700571 3700571 +fr1.chrUn 241017929 241017929 +hg17.chrX 3700573 3700573 +fr1.chrUn 241017931 241017931 +hg17.chrX 3700579 3700579 +fr1.chrUn 241017937 241017937 +hg17.chrX 3700582 3700582 +fr1.chrUn 241017943 241017943 +hg17.chrX 3700584 3700584 +fr1.chrUn 241017945 241017945 +hg17.chrX 3700589 3700591 +fr1.chrUn 241017950 241017952 +hg17.chrX 3700597 3700597 
+fr1.chrUn 241017962 241017962 +hg17.chrX 3700601 3700602 +fr1.chrUn 241017966 241017967 +hg17.chrX 3700604 3700604 +fr1.chrUn 241017969 241017969 +hg17.chrX 3700606 3700606 +fr1.chrUn 241017971 241017971 +hg17.chrX 3700609 3700609 +fr1.chrUn 241017974 241017974 +hg17.chrX 3700611 3700613 +fr1.chrUn 241017976 241017978 +hg17.chrX 3700615 3700615 +fr1.chrUn 241017980 241017980 +hg17.chrX 3700619 3700619 +fr1.chrUn 241017984 241017984 +hg17.chrX 3700622 3700626 +fr1.chrUn 241017987 241017991 +hg17.chrX 3700628 3700628 +fr1.chrUn 241017993 241017993 +hg17.chrX 3700630 3700937 +fr1.chrUn 241017995 241018325 +hg17.chrX 3700636 3700637 +fr1.chrUn 241018004 241018005 +hg17.chrX 3700640 3700640 +fr1.chrUn 241018008 241018008 +hg17.chrX 3700643 3700644 +fr1.chrUn 241018011 241018012 +hg17.chrX 3700646 3700649 +fr1.chrUn 241018014 241018017 +hg17.chrX 3700656 3700656 +fr1.chrUn 241018022 241018022 +hg17.chrX 3700658 3700658 +fr1.chrUn 241018024 241018024 +hg17.chrX 3700663 3700665 +fr1.chrUn 241018029 241018031 +hg17.chrX 3700669 3700669 +fr1.chrUn 241018035 241018035 +hg17.chrX 3700677 3700986 +fr1.chrUn 241018045 241018377 +hg17.chrX 3700681 3700681 +fr1.chrUn 241018051 241018051 +hg17.chrX 3700685 3700686 +fr1.chrUn 241018055 241018056 +hg17.chrX 3700691 3700692 +fr1.chrUn 241018061 241018062 +hg17.chrX 3639443 3639443 +fr1.chrUn 333536352 333536352 +hg17.chrX 3639445 3639445 +fr1.chrUn 333536354 333536354 +hg17.chrX 3639449 3639449 +fr1.chrUn 333536358 333536358 +hg17.chrX 3639452 3639452 +fr1.chrUn 333536361 333536361 +hg17.chrX 3639454 3639456 +fr1.chrUn 333536363 333536365 +hg17.chrX 3639458 3639458 +fr1.chrUn 333536367 333536367 +hg17.chrX 3639468 3639469 +fr1.chrUn 333536381 333536382 +hg17.chrX 3639471 3639471 +fr1.chrUn 333536384 333536384 +hg17.chrX 3639474 3639474 +fr1.chrUn 333536387 333536387 +hg17.chrX 3639476 3639477 +fr1.chrUn 333536389 333536390 +hg17.chrX 3639479 3639479 +fr1.chrUn 333536392 333536392 +hg17.chrX 3639487 3639491 +fr1.chrUn 333536400 
333536404 +hg17.chrX 3639493 3639495 +fr1.chrUn 333536406 333536408 +hg17.chrX 3639498 3639498 +fr1.chrUn 333536411 333536411 +hg17.chrX 3639509 3639510 +fr1.chrUn 333536425 333536426 +hg17.chrX 3639512 3639512 +fr1.chrUn 333536428 333536428 +hg17.chrX 3639515 3639515 +fr1.chrUn 333536431 333536431 +hg17.chrX 3639517 3639520 +fr1.chrUn 333536433 333536436 +hg17.chrX 3639522 3639522 +fr1.chrUn 333536438 333536438 +hg17.chrX 3639525 3639525 +fr1.chrUn 333536441 333536441 +hg17.chrX 3639527 3639528 +fr1.chrUn 333536443 333536444 +hg17.chrX 3639532 3639533 +fr1.chrUn 333536451 333536452 +hg17.chrX 3639536 3639536 +fr1.chrUn 333536455 333536455 +hg17.chrX 3639539 3639539 +fr1.chrUn 333536458 333536458 +hg17.chrX 3639545 3639550 +fr1.chrUn 333536464 333536469 +hg17.chrX 3639552 3639552 +fr1.chrUn 333536471 333536471 +hg17.chrX 3639554 3639556 +fr1.chrUn 333536473 333536475 +hg17.chrX 3639563 3639564 +fr1.chrUn 333536480 333536481 +hg17.chrX 3639572 3639572 +fr1.chrUn 333536489 333536489 +hg17.chrX 3639576 3639577 +fr1.chrUn 333536493 333536494 +hg17.chrX 3639579 3639579 +fr1.chrUn 333536496 333536496 +hg17.chrX 3639588 3639588 +fr1.chrUn 333536505 333536505 +hg17.chrX 3639592 3639592 +fr1.chrUn 333536509 333536509 +hg17.chrX 3639598 3639598 +fr1.chrUn 333536515 333536515 +hg17.chrX 3639600 3639600 +fr1.chrUn 333536517 333536517 +hg17.chrX 3639603 3639603 +fr1.chrUn 333536520 333536520 +hg17.chrX 3639606 3639606 +fr1.chrUn 333536523 333536523 +hg17.chrX 3639612 3639612 +fr1.chrUn 333536529 333536529 +hg17.chrX 3639615 3639615 +fr1.chrUn 333536532 333536532 +hg17.chrX 3639622 3639622 +fr1.chrUn 333536539 333536539 +hg17.chrX 3639642 3639642 +fr1.chrUn 333536559 333536559 diff -r 05974294cbf1 -r dabed25dfbaf tool_conf.xml.sample --- a/tool_conf.xml.sample Sat Sep 20 18:14:24 2008 -0400 +++ b/tool_conf.xml.sample Sun Sep 21 17:36:28 2008 -0400 @@ -128,6 +128,8 @@ <tool file="regVariation/getIndels_2way.xml" /> <tool file="regVariation/getIndels_3way.xml" /> <tool 
file="regVariation/getIndelRates_3way.xml" /> + <tool file="regVariation/substitutions.xml" /> + <tool file="regVariation/substitution_rates.xml" /> </section> <section name="Multiple regression" id="multReg"> <tool file="regVariation/linear_regression.xml" /> diff -r 05974294cbf1 -r dabed25dfbaf tools/regVariation/substitution_rates.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/regVariation/substitution_rates.py Sun Sep 21 17:36:28 2008 -0400 @@ -0,0 +1,118 @@ +#! /usr/bin/python +#guruprasad Ananda +""" +Estimates substitution rates from pairwise alignments using JC69 model. +""" + +from galaxy import eggs +from galaxy.tools.util.galaxyops import * +from galaxy.tools.util import maf_utilities +import bx.align.maf +import sys, fileinput + +def stop_err(msg): + sys.stderr.write(msg) + sys.exit() + +if len(sys.argv) < 3: + stop_err("Incorrect number of arguments.") + +inp_file = sys.argv[1] +out_file = sys.argv[2] +fout = open(out_file, 'w') +int_file = sys.argv[3] +if int_file != "None": #The user has specified an interval file + dbkey_i = sys.argv[4] + chr_col_i, start_col_i, end_col_i, strand_col_i = parse_cols_arg( sys.argv[5] ) + + +def rateEstimator(block): + global alignlen, mismatches + + src1 = block.components[0].src + sequence1 = block.components[0].text + start1 = block.components[0].start + end1 = block.components[0].end + len1 = int(end1)-int(start1) + len1_withgap = len(sequence1) + mismatch = 0.0 + + for seq in range (1,len(block.components)): + src2 = block.components[seq].src + sequence2 = block.components[seq].text + start2 = block.components[seq].start + end2 = block.components[seq].end + len2 = int(end2)-int(start2) + for nt in range(len1_withgap): + if sequence1[nt] not in '-#$^*?' 
and sequence2[nt] not in '-#$^*?': #Not a gap or masked character + if sequence1[nt].upper() != sequence2[nt].upper(): + mismatch += 1 + + if int_file == "None": + p = mismatch/min(len1,len2) + print >>fout, "%s\t%s\t%s\t%s\t%s\t%s\t%d\t%d\t%.4f" %(src1,start1,end1,src2,start2,end2,min(len1,len2),mismatch,p) + else: + mismatches += mismatch + alignlen += min(len1,len2) + +def main(): + skipped = 0 + not_pairwise = 0 + + if int_file == "None": + try: + maf_reader = bx.align.maf.Reader( open(inp_file, 'r') ) + except: + stop_err("Your MAF file appears to be malformed.") + print >>fout, "#Seq1\tStart1\tEnd1\tSeq2\tStart2\tEnd2\tL\tN\tp" + for block in maf_reader: + if len(block.components) != 2: + not_pairwise += 1 + continue + try: + rateEstimator(block) + except: + skipped += 1 + else: + index, index_filename = maf_utilities.build_maf_index( inp_file, species = [dbkey_i] ) + if index is None: + print >> sys.stderr, "Your MAF file appears to be malformed." + sys.exit() + win = NiceReaderWrapper( fileinput.FileInput( int_file ), + chrom_col=chr_col_i, + start_col=start_col_i, + end_col=end_col_i, + strand_col=strand_col_i, + fix_strand=True) + species=None + mincols = 0 + global alignlen, mismatches + + for interval in win: + alignlen = 0 + mismatches = 0.0 + src = "%s.%s" % ( dbkey_i, interval.chrom ) + for block in maf_utilities.get_chopped_blocks_for_region( index, src, interval, species, mincols ): + if len(block.components) != 2: + not_pairwise += 1 + continue + try: + rateEstimator(block) + except: + skipped += 1 + if alignlen: + p = mismatches/alignlen + else: + p = 'NA' + interval.fields.append(str(alignlen)) + interval.fields.append(str(mismatches)) + interval.fields.append(str(p)) + print >>fout, "\t".join(interval.fields) + #num_blocks += 1 + + if not_pairwise: + print "Skipped %d non-pairwise blocks" %(not_pairwise) + if skipped: + print "Skipped %d blocks as invalid" %(skipped) +if __name__ == "__main__": + main() diff -r 05974294cbf1 -r dabed25dfbaf 
tools/regVariation/substitution_rates.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/regVariation/substitution_rates.xml Sun Sep 21 17:36:28 2008 -0400 @@ -0,0 +1,61 @@ +<tool id="subRate1" name="Estimate substitution rates " version="1.0.0"> + <description> for non-coding regions</description> + <command interpreter="python"> + substitution_rates.py + $input + $out_file1 + #if $region.type == "win": + ${region.input2} ${region.input2.dbkey} ${region.input2.metadata.chromCol},$region.input2.metadata.startCol,$region.input2.metadata.endCol,$region.input2.metadata.strandCol + #else: + "None" + #end if + </command> + <inputs> + <param format="maf" name="input" type="data" label="Select pair-wise alignment data"/> + <conditional name="region"> + <param name="type" type="select" label="Estimate rates corresponding to" multiple="false"> + <option value="align">Alignment block</option> + <option value="win">Intervals in your history</option> + </param> + <when value="win"> + <param format="interval" name="input2" type="data" label="Choose intervals"> + <validator type="unspecified_build" /> + </param> + </when> + <when value="align" /> + </conditional> + </inputs> + <outputs> + <data format="tabular" name="out_file1" metadata_source="input"/> + </outputs> + + <tests> + <test> + <param name="input" value="Interval2Maf_pairwise_out.maf"/> + <param name="type" value="align"/> + <output name="out_file1" file="subRates1.out"/> + </test> + </tests> + + <help> + +.. class:: infomark + +**What it does** + +This tool takes a pairwise MAF file as input and estimates substitution rate according to Jukes-Cantor JC69 model. The 3 new columns appended to the output are explanied below: + +- L: number of nucleotides compared +- N: number of different nucleotides +- p = N/L + +----- + +.. class:: warningmark + +**Note** + +Any block/s not containing exactly two sequences, will be omitted. 
+ + </help> +</tool> \ No newline at end of file diff -r 05974294cbf1 -r dabed25dfbaf tools/regVariation/substitutions.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/regVariation/substitutions.py Sun Sep 21 17:36:28 2008 -0400 @@ -0,0 +1,87 @@ +#! /usr/bin/python +#Guruprasad ANanda +""" +Fetches substitutions from pairwise alignments. +""" + +from galaxy import eggs + +from galaxy.tools.util import maf_utilities + +import bx.align.maf +import sys +import os, fileinput +def stop_err(msg): + sys.stderr.write(msg) + sys.exit() + +if len(sys.argv) < 3: + stop_err("Incorrect number of arguments.") + +inp_file = sys.argv[1] +out_file = sys.argv[2] +fout = open(out_file, 'w') + +def fetchSubs(block): + + src1 = block.components[0].src + sequence1 = block.components[0].text + start1 = block.components[0].start + end1 = block.components[0].end + len1 = int(end1)-int(start1) + len1_withgap = len(sequence1) + + for seq in range (1,len(block.components)): + src2 = block.components[seq].src + sequence2 = block.components[seq].text + start2 = block.components[seq].start + end2 = block.components[seq].end + len2 = int(end2)-int(start2) + sub_begin = None + sub_end = None + begin = False + + for nt in range(len1_withgap): + if sequence1[nt] not in '-#$^*?' 
and sequence2[nt] not in '-#$^*?': #Not a gap or masked character + if sequence1[nt].upper() != sequence2[nt].upper(): + if not(begin): + sub_begin = nt + begin = True + sub_end = nt + else: + if begin: + print >>fout, "%s\t%s\t%s" %(src1,start1+sub_begin-sequence1[0:sub_begin].count('-'),start1+sub_end-sequence1[0:sub_end].count('-')) + print >>fout, "%s\t%s\t%s" %(src2,start2+sub_begin-sequence2[0:sub_begin].count('-'),start2+sub_end-sequence2[0:sub_end].count('-')) + begin = False + + else: + if begin: + print >>fout, "%s\t%s\t%s" %(src1,start1+sub_begin-sequence1[0:sub_begin].count('-'),end1+sub_end-sequence1[0:sub_end].count('-')) + print >>fout, "%s\t%s\t%s" %(src2,start2+sub_begin-sequence2[0:sub_begin].count('-'),end2+sub_end-sequence2[0:sub_end].count('-')) + begin = False + ended = False + + +def main(): + skipped = 0 + not_pairwise = 0 + try: + maf_reader = bx.align.maf.Reader( open(inp_file, 'r') ) + except: + stop_err("Your MAF file appears to be malformed.") + print >>fout, "#Chr\tStart\tEnd" + for block in maf_reader: + if len(block.components) != 2: + not_pairwise += 1 + continue + try: + fetchSubs(block) + except: + skipped += 1 + + if not_pairwise: + print "Skipped %d non-pairwise blocks" %(not_pairwise) + if skipped: + print "Skipped %d blocks" %(skipped) +if __name__ == "__main__": + main() diff -r 05974294cbf1 -r dabed25dfbaf tools/regVariation/substitutions.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/regVariation/substitutions.xml Sun Sep 21 17:36:28 2008 -0400 @@ -0,0 +1,38 @@ +<tool id="substitutions1" name="Fetch substitutions " version="1.0.0"> + <description> from pairwise alignments</description> + <command interpreter="python"> + substitutions.py + $input + $out_file1 + </command> + <inputs> + <param format="maf" name="input" type="data" label="Select pair-wise alignment data"/> + </inputs> + <outputs> + <data format="tabular" name="out_file1" metadata_source="input"/> + </outputs> + + <tests> + <test> + <param 
name="input" value="Interval2Maf_pairwise_out.maf"/> + <output name="out_file1" file="subs.out"/> + </test> + </tests> + <help> + +.. class:: infomark + +**What it does** + +This tool takes a pairwise MAF file as input and fetches substitutions per alignment block. + +----- + +.. class:: warningmark + +**Note** + +Any block/s not containing exactly two sequences, will be omitted. + + </help> +</tool> \ No newline at end of file
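The help text in substitution_rates.xml defines p = N/L and names the Jukes-Cantor JC69 model, while the committed substitution_rates.py reports only the raw proportion p. The relationship between the two can be sketched as below. This is a minimal illustration, not the committed code: the function names `p_distance` and `jc69_distance` are hypothetical, L is counted here as the number of alignment columns actually compared (the script uses the shorter of the two ungapped lengths), and the correction applied is the standard JC69 formula d = -(3/4) ln(1 - (4/3) p).

```python
import math

# Same characters the committed scripts treat as gaps or masked positions.
GAP_CHARS = '-#$^*?'

def p_distance(seq1, seq2):
    """Return (L, N, p) for two aligned sequences of equal length.

    L: columns where neither sequence has a gap/masked character
    N: compared columns where the nucleotides differ (case-insensitive)
    p: N / L, or None when nothing could be compared
    """
    compared = 0
    mismatches = 0
    for a, b in zip(seq1, seq2):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue  # skip gapped or masked columns
        compared += 1
        if a.upper() != b.upper():
            mismatches += 1
    p = float(mismatches) / compared if compared else None
    return compared, mismatches, p

def jc69_distance(p):
    """Standard JC69 correction: d = -(3/4) * ln(1 - (4/3) * p).

    Returns None when p is missing or >= 0.75, where the
    logarithm is undefined (the sequences are saturated).
    """
    if p is None or p >= 0.75:
        return None
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

For example, `p_distance("ACGT-A", "ACTTAA")` compares five columns and finds one mismatch, so p is 0.2; passing that p to `jc69_distance` gives a slightly larger value, since the correction accounts for multiple substitutions at the same site.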
[hg] galaxy 1507: add SHRiMP mapper for short reads analysis.
by greg@scofield.bx.psu.edu
22 Sep '08
details:
http://www.bx.psu.edu/hg/galaxy/rev/842f1883cf53
changeset: 1507:842f1883cf53 user: wychung date: Mon Sep 15 15:04:41 2008 -0400 description: add SHRiMP mapper for short reads analysis. 6 file(s) affected in this change: test-data/shrimp_phix_anc.fa test-data/shrimp_wrapper_test1.fastq test-data/shrimp_wrapper_test1.out1 tool_conf.xml.sample tools/metag_tools/shrimp_wrapper.py tools/metag_tools/shrimp_wrapper.xml diffs (853 lines): diff -r 26825f08d362 -r 842f1883cf53 test-data/shrimp_phix_anc.fa --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/shrimp_phix_anc.fa Mon Sep 15 15:04:41 2008 -0400 @@ -0,0 +1,2 @@ +>PHIX174 +GAGTTTTATCGCTTCCATGACGCAGAAGTTAACACTTTCGGATATTTCTGATGAGTCGAAAAATTATCTTGATAAAGCAGGAATTACTACTGCTTGTTTACGAATTAAATCGAAGTGGACTGCTGGCGGAAAATGAGAAAATTCGACCTATCCTTGCGCAGCTCGAGAAGCTCTTACTTTGCGACCTTTCGCCATCAACTAACGATTCTGTCAAAAACTGACGCGTTGGATGAGGAGAAGTGGCTTAATATGCTTGGCACGTTCGTCAAGGACTGGTTTAGATATGAGTCACATTTTGTTCATGGTAGAGATTCTCTTGTTGACATTTTAAAAGAGCGTGGATTACTATCTGAGTCCGATGCTGTTCAACCACTAATAGGTAAGAAATCATGAGTCAAGTTACTGAACAATCCGTACGTTTCCAGACCGCTTTGGCCTCTATTAAGCTCATTCAGGCTTCTGCCGTTTTGGATTTAACCGAAGATGATTTCGATTTTCTGACGAGTAACAAAGTTTGGATTGCTACTGACCGCTCTCGTGCTCGTCGCTGCGTTGAGGCTTGCGTTTATGGTACGCTGGACTTTGTGGGATACCCTCGCTTTCCTGCTCCTGTTGAGTTTATTGCTGCCGTCATTGCTTATTATGTTCATCCCGTCAACATTCAAACGGCCTGTCTCATCATGGAAGGCGCTGAATTTACGGAAAACATTATTAATGGCGTCGAGCGTCCGGTTAAAGCCGCTGAATTGTTCGCGTTTACCTTGCGTGTACGCGCAGGAAACACTGACGTTCTTACTGACGCAGAAGAAAACGTGCGTCAAAAATTACGTGCaGAAGGAGTGATGTAATGTCTAAAGGTAAAAAACGTTCTGGCGCTCGCCCTGGTCGTCCGCAGCCGTTGCGAGGTACTAAAGGCAAGCGTAAAGGCGCTCGTCTTTGGTATGTAGGTGGTCAACAATTTTAATTGCAGGGGCTTCGGCCCCTTACTT 
GAGGATAAATTATGTCTAATATTCAAACTGGCGCCGAGCGTATGCCGCATGACCTTTCCCATCTTGGCTTCCTTGCTGGTCAGATTGGTCGTCTTATTACCATTTCAACTACTCCGGTTATCGCTGGCGACTCCTTCGAGATGGACGCCGTTGGCGCTCTCCGTCTTTCTCCATTGCGTCGTGGCCTTGCTATTGACTCTACTGTAGACATTTTTACTTTTTATGTCCCTCATCGTCACGTTTATGGTGAACAGTGGATTAAGTTCATGAAGGATGGTGTTAATGCCACTCCTCTCCCGACTGTTAACACTACTGGTTATATTGACCATGCCGCTTTTCTTGGCACGATTAACCCTGATACCAATAAAATCCCTAAGCATTTGTTTCAGGGTTATTTGAATATCTATAACAACTATTTTAAAGCGCCGTGGATGCCTGACCGTACCGAGGCTAACCCTAATGAGCTTAATCAAGATGATGCTCGTTATGGTTTCCGTTGCTGCCATCTCAAAAACATTTGGACTGCTCCGCTTCCTCCTGAGACTGAGCTTTCTCGCCAAATGACGACTTCTACCACATCTATTGACATTATGGGTCTGCAAGCTGCTTATGCTAATTTGCATACTGACCAAGAACGTGATTACTTCATGCAGCGTTACCgTGATGTTATTTCTTCATTTGGAGGTAAAACCTCTTATGACGCTGACAACCGTCCTTTACTTGTCATGCGCTCTAATCTCTGGGCATCTGGCTATGATGTTGATGGAACTGACCAAACGTCGTTAGGCCAGTTTTCTGGTCGTGTTCAACAGACCTATAAACATTCTGTGCCGCGTTTCTTTGTTCCTGAGCATGGCACTATGTTTACTCTTGCGCTTGTTCGTTTTCCGCCTACTGCGACTAAAGAGATTCAGTACCTTAACGCTAAAGGTGCTTTGACTTATACCGATATTGCTGGCGACCCTGTTTTGTATGGCAACTTGCCGCCG CGTGAAATTTCTATGAAGGATGTTTTCCGTTCTGGTGATTCGTCTAAGAAGTTTAAGATTGCTGAGGGTCAGTGGTATCGTTATGCGCCTTCGTATGTTTCTCCTGCTTATCACCTTCTTGAAGGCTTCCCATTCATTCAGGAACCGCCTTCTGGTGATTTGCAAGAACGCGTACTTATTCGCCACCATGATTATGACCAGTGTTTCCAGTCCGTTCAGTTGTTGCAGTGGAATAGTCAGGTTAAATTTAATGTGACCGTTTATCGCAATCTGCCGACCACTCGCGATTCAATCATGACTTCGTGATAAAAGATTGAGTGTGAGGTTATAACGCCGAAGCGGTAAAAATTTTAATTTTTGCCGCTGAGGGGTTGACCAAGCGAAGCGCGGTAGGTTTTCTGCTTAGGAGTTTAATCATGTTTCAGACTTTTATTTCTCGCCATAATTCAAACTTTTTTTCTGATAAGCTGGTTCTCACTTCTGTTACTCCAGCTTCTTCGGCACCTGTTTTACAGACACCTAAAGCTACATCGTCAACGTTATATTTTGATAGTTTGACGGTTAATGCTGGTAATGGTGGTTTTCTTCATTGCATTCAGATGGATACATCTGTCAACGCCGCTAATCAGGTTGTTTCTGTTGGTGCTGATATTGCTTTTGATGCCGACCCTAAATTTTTTGCCTGTTTGGTTCGCTTTGAGTCTTCTTCGGTTCCGACTACCCTCCCGACTGCCTATGATGTTTATCCTTTGAATGGTCGCCATGATGGTGGTTATTATACCGTCAAGGACTGTGTGACTATTGACGTCCTTCCCCGTACGCCGGGCAATAAtGTTTATGTTGGTTTCATGGTTTGGTCTAACTTTACCGCTACTAAATGCCGCGGATTGGTTTCGCTGAATCAGGTTATTAAAGAGATTATTTGTCTCCAGCCACTTAAGTGAGGTGATTTATGTTTGGTGCTATTGCTGGCGGTATTGCTTCTGCTC 
TTGCTGGTGGCGCCATGTCTAAATTGTTTGGAGGCGGTCAAAAAGCCGCCTCCGGTGGCATTCAAGGTGATGTGCTTGCTACCGATAACAATACTGTAGGCATGGGTGATGCTGGTATTAAATCTGCCATTCAAGGCTCTAATGTTCCTAACCCTGATGAGGCCGCCCCTAGTTTTGTTTCTGGTGCTATGGCTAAAGCTGGTAAAGGACTTCTTGAAGGTACGTTGCAGGCTGGCACTTCTGCCGTTTCTGATAAGTTGCTTGATTTGGTTGGACTTGGTGGCAAGTCTGCCGCTGATAAAGGAAAGGATACTCGTGATTATCTTGCTGCTGCATTTCCTGAGCTTAATGCTTGGGAGCGTGCTGGTGCTGATGCTTCCTCTGCTGGTATGGTTGACGCCGGATTTGAGAATCAAAAAGAGCTTACTAAAATGCAACTGGACAATCAGAAAGAGATTGCCGAGATGCAAAATGAGACTCAAAAAGAGATTGCTGGCATTCAGTCGGCGACTTCACGCCAGAATACGAAAGACCAGGTATATGCACAAAATGAGATGCTTGCTTATCAACAGAAGGAGTCTACTGCTCGCGTTGCGTCTATTATGGAAAACACCAATCTTTCCAAGCAACAGCAGGTTTCCGAGATTATGCGCCAAATGCTTACTCAAGCTCAAACGGCTGGTCAGTATTTTACCAATGACCAAATCAAAGAAATGACTCGCAAGGTTAGTGCTGAGGTTGACTTAGTTCATCAGCAAACGCAGAATCAGCGGTATGGCTCTTCTCATATTGGCGCTACTGCAAAGGATATTTCTAATGTCGTCACTGATGCTGCTTCTGGTGTGGTTGATATTTTTCATGGTATTGATAAAGCTGTTGCCGATACTTGGAACAATTTCTGGAAAGACGGTAAAGCTGATGGTATTGGCTCTAATTTGTCTAGGAAATAACCGTCAGGATTGACACCCTCCCAATTGTATGTTTTCATG CCTCCAAATCTTGGAGGCTTTTTTATGGTTCGTTCTTATTACCCTTCTGAATGTCACGCTGATTATTTTGACTTTGAGCGTATCGAGGCTCTTAAACCTGCTATTGAGGCTTGTGGCATTTCTACTCTTTCTCAATCCCCAATGCTTGGCTTCCATAAGCAGATGGATAACCGCATCAAGCTCTTGGAAGAGATTCTGTCTTTTCGTATGCAGGGCGTTGAGTTCGATAATGGTGATATGTATGTTGACGGCCATAAGGCTGCTTCTGACGTTCGTGATGAGTTTGTATCTGTTACTGAGAAGTTAATGGATGAATTGGCACAATGCTACAATGTGCTCCCCCAACTTGATATTAATAACACTATAGACCACCGCCCCGAAGGGGACGAAAAATGGTTTTTAGAGAACGAGAAGACGGTTACGCAGTTTTGCCGCAAGCTGGCTGCTGAACGCCCTCTTAAGGATATTCGCGATGAGTATAATTACCCCAAAAAGAAAGGTATTAAGGATGAGTGTTCAAGATTGCTGGAGGCCTCCACTATGAAATCGCGTAGAGGCTTTaCTATTCAGCGTTTGATGAATGCAATGCGACAGGCTCATGCTGATGGTTGGTTTATCGTTTTTGACACTCTCACGTTGGCTGACGACCGATTAGAGGCGTTTTATGATAATCCCAATGCTTTGCGTGACTATTTTCGTGATATTGGTCGTATGGTTCTTGCTGCCGAGGGTCGCAAGGCTAATGATTCACACGCCGACTGCTATCAGTATTTTTGTGTGCCTGAGTATGGTACAGCTAATGGCCGTCTTCATTTCCATGCGGTGCAtTTTATGCGGACACTTCCTACAGGTAGCGTTGACCCTAATTTTGGTCGTCGGGTACGCAATCGCCGCCAGTTAAATAGCTTGCAAAATACGTGGCCTTATGGTTACAGTATGCCCATCGCAGTTCGCTACACGCAGGACGCTTTTTCACGTTCTGGTTGGTT 
GTGGCCTGTTGATGCTAAAGGTGAGCCGCTTAAAGCTACCAGTTATATGGCTGTTGGTTTCTATGTGGCTAAATACGTTAACAAAAAGTCAGATATGGACCTTGCTGCTAAAGGTCTAGGAGCTAAAGAATGGAACAACTCACTAAAAACCAAGCTGTCGCTACTTCCCAAGAAGCTGTTCAGAATCAGAATGAGCCGCAACTTCGGGATGAAAATGCTCACAATGACAAATCTGTCCACGGAGTGCTTAATCCAACTTACCAAGCTGGGTTACGACGCGACGCCGTTCAACCAGATATTGAAGCAGAACGCAAAAAGAGAGATGAGATTGAGGCTGGGAAAAGTTACTGTAGCCGACGTTTTGGCGGCGCAACCTGTGACGACAAATCTGCTCAAATTTATGCGCGCTTCGATAAAAATGATTGGCGTATCCAACCTGCA diff -r 26825f08d362 -r 842f1883cf53 test-data/shrimp_wrapper_test1.fastq --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/shrimp_wrapper_test1.fastq Mon Sep 15 15:04:41 2008 -0400 @@ -0,0 +1,40 @@ +@HWI-EAS91_1_306UPAAXX:6:1:959:874 +GCGGGCTGCGACATAAAGCATACCGCCTGGGCGGCG ++HWI-EAS91_1_306UPAAXX:6:1:959:874 +hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh +@HWI-EAS91_1_306UPAAXX:6:1:1630:1975 +GAAAGAAAATCAGCAACAGTGGCATCGATTTTACGG ++HWI-EAS91_1_306UPAAXX:6:1:1630:1975 +hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh +@HWI-EAS91_1_306UPAAXX:6:1:770:994 +GCAGGCAGCGTGCTGCGAGTCTTTTCGAATGATAAG ++HWI-EAS91_1_306UPAAXX:6:1:770:994 +hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh +@HWI-EAS91_1_306UPAAXX:6:1:1274:306 +GTCTTTGGTATGTAGGTGGTCAACAATTTTAATTGC ++HWI-EAS91_1_306UPAAXX:6:1:1274:306 +hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh\h +@HWI-EAS91_1_306UPAAXX:6:1:1339:209 +GTTTGGTCAGTTCCATCAACATCATAGCCAGATGCC ++HWI-EAS91_1_306UPAAXX:6:1:1339:209 +hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh +@HWI-EAS91_1_306UPAAXX:6:1:203:1240 +GATTCTCTTGTTGACATTTTAAAAGAGCGTGGATTA ++HWI-EAS91_1_306UPAAXX:6:1:203:1240 +hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh +@HWI-EAS91_1_306UPAAXX:6:1:869:448 +GCTGGCCATCAGTTCGCGGATACCGGCGGCAAACAT ++HWI-EAS91_1_306UPAAXX:6:1:869:448 +hhhhhhhhhhhhhhhhhhhhhhhhhhhhKhhhhhhh +@HWI-EAS91_1_306UPAAXX:6:1:939:928 +GGAGGCCTCCAGCAATCTTGAACACTCATCCTTAAT ++HWI-EAS91_1_306UPAAXX:6:1:939:928 +hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh +@HWI-EAS91_1_306UPAAXX:6:1:1756:1476 +GCGTAGAGGCTTTACTATTCAGCGTTTGATGAATGC ++HWI-EAS91_1_306UPAAXX:6:1:1756:1476 +hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh 
+@HWI-EAS91_1_306UPAAXX:6:1:1528:181 +GGCTGGTCAGTATTTTACCAATGACCAAATCAAAGA ++HWI-EAS91_1_306UPAAXX:6:1:1528:181 +hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh diff -r 26825f08d362 -r 842f1883cf53 test-data/shrimp_wrapper_test1.out1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/shrimp_wrapper_test1.out1 Mon Sep 15 15:04:41 2008 -0400 @@ -0,0 +1,7 @@ +#FORMAT: readname contigname strand contigstart contigend readstart readend readlength score editstring +>HWI-EAS91_1_306UPAAXX:6:1:1528:181 PHIX174 + 3644 3679 1 36 36 3600 36 +>HWI-EAS91_1_306UPAAXX:6:1:1756:1476 PHIX174 + 4505 4540 1 36 36 3600 36 +>HWI-EAS91_1_306UPAAXX:6:1:203:1240 PHIX174 + 310 345 1 36 36 3600 36 +>HWI-EAS91_1_306UPAAXX:6:1:1274:306 PHIX174 + 933 968 1 36 36 3600 36 +>HWI-EAS91_1_306UPAAXX:6:1:939:928 PHIX174 - 4458 4493 1 36 36 3600 36 +>HWI-EAS91_1_306UPAAXX:6:1:1339:209 PHIX174 - 1732 1767 1 36 36 3600 36 diff -r 26825f08d362 -r 842f1883cf53 tool_conf.xml.sample --- a/tool_conf.xml.sample Sun Sep 14 14:58:50 2008 -0400 +++ b/tool_conf.xml.sample Mon Sep 15 15:04:41 2008 -0400 @@ -276,6 +276,7 @@ <tool file="metag_tools/blat_coverage_report.xml" /> </section> <section name="Short Read Mapping" id="solexa_tools"> + <tool file="metag_tools/shrimp_wrapper.xml" /> <tool file="sr_mapping/lastz_wrapper.xml" /> <tool file="metag_tools/megablast_wrapper.xml" /> <tool file="metag_tools/megablast_xml_parser.xml" /> diff -r 26825f08d362 -r 842f1883cf53 tools/metag_tools/shrimp_wrapper.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/metag_tools/shrimp_wrapper.py Mon Sep 15 15:04:41 2008 -0400 @@ -0,0 +1,577 @@ +#! 
/usr/bin/python + +""" +SHRiMP wrapper + +Inputs: + reference seq and reads + +Outputs: + table of 8 columns: + chrom ref_loc read_id read_loc ref_nuc read_nuc quality coverage + SHRiMP output + +Parameters: + -s Spaced Seed (default: 111111011111) + -n Seed Matches per Window (default: 2) + -t Seed Hit Taboo Length (default: 4) + -9 Seed Generation Taboo Length (default: 0) + -w Seed Window Length (default: 115.00%) + -o Maximum Hits per Read (default: 100) + -r Maximum Read Length (default: 1000) + -d Kmer Std. Deviation Limit (default: -1 [None]) + + -m S-W Match Value (default: 100) + -i S-W Mismatch Value (default: -150) + -g S-W Gap Open Penalty (Reference) (default: -400) + -q S-W Gap Open Penalty (Query) (default: -400) + -e S-W Gap Extend Penalty (Reference) (default: -70) + -f S-W Gap Extend Penalty (Query) (default: -70) + -h S-W Hit Threshold (default: 68.00%) + +Command: +%rmapper -s spaced_seed -n seed_matches_per_window -t seed_hit_taboo_length -9 seed_generation_taboo_length -w seed_window_length -o max_hits_per_read -r max_read_length -d kmer -m sw_match_value -i sw_mismatch_value -g sw_gap_open_ref -q sw_gap_open_query -e sw_gap_ext_ref -f sw_gap_ext_query -h sw_hit_threshold <query> <target> > <output> 2> <log> + +SHRiMP output: +>7:2:1147:982/1 chr3 + 36586562 36586595 2 35 36 2900 3G16G13 +>7:2:1147:982/1 chr3 + 95338194 95338225 4 35 36 2700 9T7C14 +>7:2:587:93/1 chr3 + 14913541 14913577 1 35 36 2960 19--16 + +Testing: +%python shrimp_wrapper.py single ~/Desktop/shrimp_wrapper/phix_anc.fa tmp tmp1 ~/Desktop/shrimp_wrapper/phix.10.solexa.fastq +%python shrimp_wrapper.py paired ~/Desktop/shrimp_wrapper/eca_ref_chrMT.fa tmp tmp1 ~/Desktop/shrimp_wrapper/eca.5.solexa_1.fastq ~/Desktop/shrimp_wrapper/eca.5.solexa_2.fastq + +""" + +import os, sys, tempfile, os.path + +assert sys.version_info[:2] >= (2.4) + +def stop_err( msg ): + + sys.stderr.write( "%s\n" % msg ) + sys.exit() + +def reverse_complement(s): + + complement_dna = {"A":"T", "T":"A", 
"C":"G", "G":"C", "a":"t", "t":"a", "c":"g", "g":"c", "N":"N", "n":"n" , ".":".", "-":"-"} + reversed_s = [] + for i in s: + reversed_s.append(complement_dna[i]) + reversed_s.reverse() + return "".join(reversed_s) + +def generate_sub_table(result_file, ref_file, score_files, table_outfile, hit_per_read): + + """ + TODO: the cross-over error has not been addressed yet. + """ + + insertion_size = 600 + + all_score_file = score_files.split('&') + + if len(all_score_file) != hit_per_read: stop_err('Un-equal number of files!') + + temp_table_name = tempfile.NamedTemporaryFile().name + temp_table = open(temp_table_name, 'w') + + outfile = open(table_outfile,'w') + + # reference seq: not a single fasta seq + refseq = {} + chrom_cov = {} + seq = '' + + for i, line in enumerate(file(ref_file)): + line = line.rstrip() + if not line or line.startswith('#'): continue + + if line.startswith('>'): + if seq: + if refseq.has_key(title): + pass + else: + refseq[title] = seq + chrom_cov[title] = {} + seq = '' + title = line[1:] + else: + seq += line + if seq: + if not refseq.has_key(title): + refseq[title] = seq + chrom_cov[title] = {} + + # find hits : one end and/or the other + hits = {} + for i, line in enumerate(file(result_file)): + line = line.rstrip() + if not line or line.startswith('#'): continue + + #FORMAT: readname contigname strand contigstart contigend readstart readend readlength score editstring + fields = line.split('\t') + readname = fields[0][1:] + chrom = fields[1] + strand = fields[2] + chrom_start = int(fields[3]) - 1 + chrom_end = int(fields[4]) + read_start = fields[5] + read_end = fields[6] + read_len = fields[7] + score = fields[8] + editstring = fields[9] + + if hit_per_read == 1: + endindex = '1' + else: + readname, endindex = readname.split('/') + + if hits.has_key(readname): + if hits[readname].has_key(endindex): + hits[readname][endindex].append([strand, editstring, chrom_start, chrom_end, read_start, chrom]) + else: + hits[readname][endindex] = 
[[strand, editstring, chrom_start, chrom_end, read_start, chrom]] + else: + hits[readname] = {} + hits[readname][endindex] = [[strand, editstring, chrom_start, chrom_end, read_start, chrom]] + + # find score : one end and the other end + hits_score = {} + readname = '' + score = '' + for num_score_file in range(len(all_score_file)): + score_file = all_score_file[num_score_file] + for i, line in enumerate(file(score_file)): + line = line.rstrip() + if not line or line.startswith('#'): continue + + if line.startswith('>'): + if score: + if hits.has_key(readname): + if len(hits[readname]) == hit_per_read: + if hits_score.has_key(readname): + if hits_score[readname].has_key(endindex): + pass + else: + hits_score[readname][endindex] = score + else: + hits_score[readname] = {} + hits_score[readname][endindex] = score + score = '' + if hit_per_read == 1: + readname = line[1:] + endindex = '1' + else: + readname, endindex = line[1:].split('/') + else: + score = line + if score: # the last one + if hits.has_key(readname): + if len(hits[readname]) == hit_per_read: + if hits_score.has_key(readname): + if hits_score[readname].has_key(endindex): + pass + else: + hits_score[readname][endindex] = score + else: + hits_score[readname] = {} + hits_score[readname][endindex] = score + + # mutation call to all mappings + for readkey in hits.keys(): + if len(hits[readkey]) != hit_per_read: continue + + matches = [] + match_count = 0 + + if hit_per_read == 1: + matches = [ hits[readkey]['1'] ] + match_count = 1 + else: + end1_data = hits[readkey]['1'] + end2_data = hits[readkey]['2'] + + for i, end1_hit in enumerate(end1_data): + crin_strand = {'+': False, '-': False} + crin_insertSize = {'+': False, '-': False} + + crin_strand[end1_hit[0]] = True + crin_insertSize[end1_hit[0]] = int(end1_hit[2]) + + for j, end2_hit in enumerate(end2_data): + crin_strand[end2_hit[0]] = True + crin_insertSize[end2_hit[0]] = int(end2_hit[2]) + + if end1_hit[-1] != end2_hit[-1] : continue + + if 
crin_strand['+'] and crin_strand['-']: + if (crin_insertSize['-'] - crin_insertSize['+']) <= insertion_size: + matches.append([end1_hit, end2_hit]) + match_count += 1 + + if match_count == 1: + for x, end_data in enumerate(matches[0]): + + end_strand, end_editstring, end_chr_start, end_chr_end, end_read_start, end_chrom = end_data + end_read_start = int(end_read_start) - 1 + + if end_strand == '-': + refsegment = reverse_complement(refseq[end_chrom][end_chr_start:end_chr_end]) + else: + refsegment = refseq[end_chrom][end_chr_start:end_chr_end] + + match_len = 0 + editindex = 0 + gap_read = 0 + + while editindex < len(end_editstring): + editchr = end_editstring[editindex] + chrA = '' + chrB = '' + locIndex = [] + if editchr.isdigit(): + editcode = '' + while editchr.isdigit() and editindex < len(end_editstring): + editcode += editchr + editindex += 1 + if editindex < len(end_editstring): editchr = end_editstring[editindex] + for baseIndex in range(int(editcode)): + chrA += refsegment[match_len+baseIndex] + chrB = chrA + match_len += int(editcode) + elif editchr == 'x': + # crossover: inserted between the appropriate two bases + # Two sequencing errors: 4x15x6 (25 matches with 2 crossovers) + # Treated as errors in the reads; Do nothing. + editindex += 1 + + elif editchr.isalpha(): + editcode = editchr + editindex += 1 + chrA = refsegment[match_len] + chrB = editcode + match_len += len(editcode) + + elif editchr == '-': + editcode = editchr + editindex += 1 + chrA = refsegment[match_len] + chrB = editcode + match_len += len(editcode) + gap_read += 1 + + elif editchr == '(': + editcode = '' + while editchr != ')' and editindex < len(end_editstring): + if editindex < len(end_editstring): editchr = end_editstring[editindex] + editcode += editchr + editindex += 1 + editcode = editcode[1:-1] + chrA = '-'*len(editcode) + chrB = editcode + + else: + print 'Warning! 
Unknown symbols', editchr + + if end_strand == '-': + chrA = reverse_complement(chrA) + chrB = reverse_complement(chrB) + + pos_line = '' + rev_line = '' + + for mappingIndex in range(len(chrA)): + # reference + chrAx = chrA[mappingIndex] + # read + chrBx = chrB[mappingIndex] + + if chrAx and chrBx and chrBx.upper() != 'N': + if end_strand == '+': + chrom_loc = end_chr_start+match_len-len(chrA)+mappingIndex + read_loc = end_read_start+match_len-len(chrA)+mappingIndex-gap_read + if chrAx == '-': chrom_loc -= 1 + + if chrBx == '-': + scoreBx = '-1' + else: + scoreBx = hits_score[readkey][str(x+1)].split()[read_loc] + + # 1-based on chrom_loc and read_loc + pos_line = pos_line + '\t'.join([end_chrom, str(chrom_loc+1), readkey+'/'+str(x+1), str(read_loc+1), chrAx, chrBx, scoreBx]) + '\n' + else: + chrom_loc = end_chr_end-match_len+mappingIndex + read_loc = end_read_start+match_len-1-mappingIndex-gap_read + if chrAx == '-': chrom_loc -= 1 + + if chrBx == '-': + scoreBx = '-1' + else: + scoreBx = hits_score[readkey][str(x+1)].split()[read_loc] + + # 1-based on chrom_loc and read_loc + rev_line = '\t'.join([end_chrom, str(chrom_loc+1), readkey+'/'+str(x+1), str(read_loc+1), chrAx, chrBx, scoreBx]) +'\n' + rev_line + + if chrom_cov.has_key(end_chrom): + if chrom_cov[end_chrom].has_key(chrom_loc): + chrom_cov[end_chrom][chrom_loc] += 1 + else: + chrom_cov[end_chrom][chrom_loc] = 1 + else: + chrom_cov[end_chrom] = {} + chrom_cov[end_chrom][chrom_loc] = 1 + + if pos_line: temp_table.write('%s\n' %(pos_line.rstrip('\r\n'))) + if rev_line: temp_table.write('%s\n' %(rev_line.rstrip('\r\n'))) + + temp_table.close() + + # chrom-wide coverage + for i, line in enumerate(open(temp_table_name)): + line = line.rstrip() + if not line or line.startswith('#'): continue + + fields = line.split() + chrom = fields[0] + eachBp = int(fields[1]) + readname = fields[2] + + if hit_per_read == 1: + fields[2] = readname.split('/')[0] + + if chrom_cov[chrom].has_key(eachBp): + 
outfile.write('%s\t%d\n' %('\t'.join(fields), chrom_cov[chrom][eachBp])) + else: + outfile.write('%s\t%d\n' %('\t'.join(fields), 0)) + + outfile.close() + + if os.path.exists(temp_table_name): os.remove(temp_table_name) + + return True + +def convert_fastqsolexa_to_fasta_qual(infile_name, query_fasta, query_qual): + + outfile_seq = open( query_fasta, 'w' ) + outfile_score = open( query_qual, 'w' ) + + seq_title_startswith = '' + qual_title_startswith = '' + + default_coding_value = 64 + fastq_block_lines = 0 + + for i, line in enumerate( file( infile_name ) ): + line = line.rstrip() + if not line or line.startswith( '#' ): continue + + fastq_block_lines = ( fastq_block_lines + 1 ) % 4 + line_startswith = line[0:1] + + if fastq_block_lines == 1: + # first line is @title_of_seq + if not seq_title_startswith: + seq_title_startswith = line_startswith + + if line_startswith != seq_title_startswith: + outfile_seq.close() + outfile_score.close() + stop_err( 'Invalid fastqsolexa format at line %d: %s.' % ( i + 1, line ) ) + + read_title = line[1:] + outfile_seq.write( '>%s\n' % line[1:] ) + + elif fastq_block_lines == 2: + # second line is nucleotides + read_length = len( line ) + outfile_seq.write( '%s\n' % line ) + + elif fastq_block_lines == 3: + # third line is +title_of_qualityscore ( might be skipped ) + if not qual_title_startswith: + qual_title_startswith = line_startswith + + if line_startswith != qual_title_startswith: + outfile_seq.close() + outfile_score.close() + stop_err( 'Invalid fastqsolexa format at line %d: %s.' % ( i + 1, line ) ) + + quality_title = line[1:] + if quality_title and read_title != quality_title: + outfile_seq.close() + outfile_score.close() + stop_err( 'Invalid fastqsolexa format at line %d: sequence title "%s" differes from score title "%s".' 
% ( i + 1, read_title, quality_title ) ) + + if not quality_title: + outfile_score.write( '>%s\n' % read_title ) + else: + outfile_score.write( '>%s\n' % line[1:] ) + + else: + # fourth line is quality scores + qual = '' + fastq_integer = True + # peek: ascii or digits? + val = line.split()[0] + try: + check = int( val ) + fastq_integer = True + except: + fastq_integer = False + + if fastq_integer: + # digits + qual = line + else: + # ascii + quality_score_length = len( line ) + if quality_score_length == read_length + 1: + # first char is qual_score_startswith + qual_score_startswith = ord( line[0:1] ) + line = line[1:] + elif quality_score_length == read_length: + qual_score_startswith = default_coding_value + else: + stop_err( 'Invalid fastqsolexa format at line %d: the number of quality scores ( %d ) is not the same as bases ( %d ).' % ( i + 1, quality_score_length, read_length ) ) + + for j, char in enumerate( line ): + score = ord( char ) - qual_score_startswith # 64 + qual = "%s%s " % ( qual, str( score ) ) + + outfile_score.write( '%s\n' % qual ) + + outfile_seq.close() + outfile_score.close() + + return True + +def __main__(): + + # I/O + type_of_reads = sys.argv[1] # single or paired + input_target = sys.argv[2] # fasta + shrimp_outfile = sys.argv[3] # shrimp output + table_outfile = sys.argv[4] # table output + + # SHRiMP parameters: total = 15 + # TODO: put threshold on each of these parameters + if len(sys.argv) == 21 or len(sys.argv) == 22: + spaced_seed = sys.argv[5] + seed_matches_per_window = sys.argv[6] + seed_hit_taboo_length = sys.argv[7] + seed_generation_taboo_length = sys.argv[8] + seed_window_length = sys.argv[9] + max_hits_per_read = sys.argv[10] + max_read_length = sys.argv[11] + kmer = sys.argv[12] + sw_match_value = sys.argv[13] + sw_mismatch_value = sys.argv[14] + sw_gap_open_ref = sys.argv[15] + sw_gap_open_query = sys.argv[16] + sw_gap_ext_ref = sys.argv[17] + sw_gap_ext_query = sys.argv[18] + sw_hit_threshold = sys.argv[19] + + # 
Single-end parameters + if type_of_reads == 'single': + input_query = sys.argv[20] # single-end + hit_per_read = 1 + query_fasta = tempfile.NamedTemporaryFile().name + query_qual = tempfile.NamedTemporaryFile().name + else: # Paired-end parameters + input_query_end1 = sys.argv[20] # paired-end + input_query_end2 = sys.argv[21] + hit_per_read = 2 + query_fasta_end1 = tempfile.NamedTemporaryFile().name + query_fasta_end2 = tempfile.NamedTemporaryFile().name + query_qual_end1 = tempfile.NamedTemporaryFile().name + query_qual_end2 = tempfile.NamedTemporaryFile().name + else: + spaced_seed = '111111011111' + seed_matches_per_window = '2' + seed_hit_taboo_length = '4' + seed_generation_taboo_length = '0' + seed_window_length = '115.0' + max_hits_per_read = '100' + max_read_length = '1000' + kmer = '-1' + sw_match_value = '100' + sw_mismatch_value = '-150' + sw_gap_open_ref = '-400' + sw_gap_open_query = '-400' + sw_gap_ext_ref = '-70' + sw_gap_ext_query = '-70' + sw_hit_threshold = '68.0' + + # Single-end parameters + if type_of_reads == 'single': + input_query = sys.argv[5] # single-end + hit_per_read = 1 + query_fasta = tempfile.NamedTemporaryFile().name + query_qual = tempfile.NamedTemporaryFile().name + else: # Paired-end parameters + input_query_end1 = sys.argv[5] # paired-end + input_query_end2 = sys.argv[6] + hit_per_read = 2 + query_fasta_end1 = tempfile.NamedTemporaryFile().name + query_fasta_end2 = tempfile.NamedTemporaryFile().name + query_qual_end1 = tempfile.NamedTemporaryFile().name + query_qual_end2 = tempfile.NamedTemporaryFile().name + + + # temp file for shrimp log file + shrimp_log = tempfile.NamedTemporaryFile().name + + # convert fastq to fasta and quality score files + if type_of_reads == 'single': + return_value = convert_fastqsolexa_to_fasta_qual(input_query, query_fasta, query_qual) + else: + return_value = convert_fastqsolexa_to_fasta_qual(input_query_end1, query_fasta_end1, query_qual_end1) + return_value = 
convert_fastqsolexa_to_fasta_qual(input_query_end2, query_fasta_end2, query_qual_end2) + + # SHRiMP command + if type_of_reads == 'single': + command = ' '.join(['rmapper-ls', '-s', spaced_seed, '-n', seed_matches_per_window, '-t', seed_hit_taboo_length, '-9', seed_generation_taboo_length, '-w', seed_window_length, '-o', max_hits_per_read, '-r', max_read_length, '-d', kmer, '-m', sw_match_value, '-i', sw_mismatch_value, '-g', sw_gap_open_ref, '-q', sw_gap_open_query, '-e', sw_gap_ext_ref, '-f', sw_gap_ext_query, '-h', sw_hit_threshold, query_fasta, input_target, '>', shrimp_outfile, '2>', shrimp_log]) + + try: + os.system(command) + except Exception, e: + if os.path.exists(query_fasta): os.remove(query_fasta) + if os.path.exists(query_qual): os.remove(query_qual) + stop_err(str(e)) + + else: + command_end1 = ' '.join(['rmapper-ls', '-s', spaced_seed, '-n', seed_matches_per_window, '-t', seed_hit_taboo_length, '-9', seed_generation_taboo_length, '-w', seed_window_length, '-o', max_hits_per_read, '-r', max_read_length, '-d', kmer, '-m', sw_match_value, '-i', sw_mismatch_value, '-g', sw_gap_open_ref, '-q', sw_gap_open_query, '-e', sw_gap_ext_ref, '-f', sw_gap_ext_query, '-h', sw_hit_threshold, query_fasta_end1, input_target, '>', shrimp_outfile, '2>', shrimp_log]) + command_end2 = ' '.join(['rmapper-ls', '-s', spaced_seed, '-n', seed_matches_per_window, '-t', seed_hit_taboo_length, '-9', seed_generation_taboo_length, '-w', seed_window_length, '-o', max_hits_per_read, '-r', max_read_length, '-d', kmer, '-m', sw_match_value, '-i', sw_mismatch_value, '-g', sw_gap_open_ref, '-q', sw_gap_open_query, '-e', sw_gap_ext_ref, '-f', sw_gap_ext_query, '-h', sw_hit_threshold, query_fasta_end2, input_target, '>>', shrimp_outfile, '2>>', shrimp_log]) + + try: + os.system(command_end1) + os.system(command_end2) + except Exception, e: + if os.path.exists(query_fasta_end1): os.remove(query_fasta_end1) + if os.path.exists(query_fasta_end2): os.remove(query_fasta_end2) + if 
os.path.exists(query_qual_end1): os.remove(query_qual_end1) + if os.path.exists(query_qual_end2): os.remove(query_qual_end2) + stop_err(str(e)) + + # convert to table + if type_of_reads == 'single': + return_value = generate_sub_table(shrimp_outfile, input_target, query_qual, table_outfile, hit_per_read) + else: + return_value = generate_sub_table(shrimp_outfile, input_target, query_qual_end1+'&'+query_qual_end2, table_outfile, hit_per_read) + + # remove temp. files + if type_of_reads == 'single': + if os.path.exists(query_fasta): os.remove(query_fasta) + if os.path.exists(query_qual): os.remove(query_qual) + else: + if os.path.exists(query_fasta_end1): os.remove(query_fasta_end1) + if os.path.exists(query_fasta_end2): os.remove(query_fasta_end2) + if os.path.exists(query_qual_end1): os.remove(query_qual_end1) + if os.path.exists(query_qual_end2): os.remove(query_qual_end2) + + if os.path.exists(shrimp_log): os.remove(shrimp_log) + +if __name__ == '__main__': __main__() + diff -r 26825f08d362 -r 842f1883cf53 tools/metag_tools/shrimp_wrapper.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/metag_tools/shrimp_wrapper.xml Mon Sep 15 15:04:41 2008 -0400 @@ -0,0 +1,196 @@ +<tool id="shrimp_wrapper" name="SHRiMP" version="1.0.0"> + <description>SHort Read Mapping Package</description> + <command interpreter="python"> + #if ($type_of_reads.single_or_paired=="single" and $param.skip_or_full=="skip"):#shrimp_wrapper.py $type_of_reads.single_or_paired $input_target $output1 $output2 $input_query + #elif ($type_of_reads.single_or_paired=="paired" and $param.skip_or_full=="skip"):#shrimp_wrapper.py $type_of_reads.single_or_paired $input_target $output1 $output2 ${type_of_reads.input1} ${type_of_reads.input2} + #elif ($type_of_reads.single_or_paired=="single" and $param.skip_or_full=="full"):#shrimp_wrapper.py $type_of_reads.single_or_paired $input_target $output1 $output2 $param.spaced_seed $param.seed_matches_per_window $param.seed_hit_taboo_length 
$param.seed_generation_taboo_length $param.seed_window_length $param.max_hits_per_read $param.max_read_length $param.kmer $param.sw_match_value $param.sw_mismatch_value $param.sw_gap_open_ref $param.sw_gap_open_query $param.sw_gap_ext_ref $param.sw_gap_ext_query $param.sw_hit_threshold $input_query + #elif ($type_of_reads.single_or_paired=="paired" and $param.skip_or_full=="full"):#shrimp_wrapper.py $type_of_reads.single_or_paired $input_target $output1 $output2 $param.spaced_seed $param.seed_matches_per_window $param.seed_hit_taboo_length $param.seed_generation_taboo_length $param.seed_window_length $param.max_hits_per_read $param.max_read_length $param.kmer $param.sw_match_value $param.sw_mismatch_value $param.sw_gap_open_ref $param.sw_gap_open_query $param.sw_gap_ext_ref $param.sw_gap_ext_query $param.sw_hit_threshold ${type_of_reads.input1} ${type_of_reads.input2} + #end if + </command> + <inputs> + <page> + <param name="input_target" type="data" format="fasta" label="Reference sequence" /> + <conditional name="type_of_reads"> + <param name="single_or_paired" type="select" label="Single- or Paired-ends"> + <option value="single">Single-end</option> + <option value="paired">Paired-end</option> + </param> + <when value="single"> + <param name="input_query" type="data" format="fastqsolexa" label="Sequence file" /> + </when> + <when value="paired"> + <param name="input1" type="data" format="fastqsolexa" label="One end" /> + <param name="input2" type="data" format="fastqsolexa" label="The other end" /> + </when> + </conditional> + <conditional name="param"> + <param name="skip_or_full" type="select" label="SHRiMP parameter selection"> + <option value="skip">Default setting</option> + <option value="full">Full list</option> + </param> + <when value="skip" /> + <when value="full"> + <param name="spaced_seed" type="text" size="30" value="111111011111" label="Spaced Seed" /> + <param name="seed_matches_per_window" type="integer" size="5" value="2" label="Seed Matches 
per Window" /> + <param name="seed_hit_taboo_length" type="integer" size="5" value="4" label="Seed Hit Taboo Length" /> + <param name="seed_generation_taboo_length" type="integer" size="5" value="0" label="Seed Generation Taboo Length" /> + <param name="seed_window_length" type="float" size="10" value="115.0" label="Seed Window Length" help="in percentage"/> + <param name="max_hits_per_read" type="integer" size="10" value="100" label="Maximum Hits per Read" /> + <param name="max_read_length" type="integer" size="10" value="1000" label="Maximum Read Length" /> + <param name="kmer" type="integer" size="10" value="-1" label="Kmer Std. Deviation Limit" help="-1 as None"/> + <param name="sw_match_value" type="integer" size="10" value="100" label="S-W Match Value" /> + <param name="sw_mismatch_value" type="integer" size="10" value="-150" label="S-W Mismatch Value" /> + <param name="sw_gap_open_ref" type="integer" size="10" value="-400" label="S-W Gap Open Penalty (Reference)" /> + <param name="sw_gap_open_query" type="integer" size="10" value="-400" label="S-W Gap Open Penalty (Query)" /> + <param name="sw_gap_ext_ref" type="integer" size="10" value="-70" label="S-W Gap Extend Penalty (Reference)" /> + <param name="sw_gap_ext_query" type="integer" size="10" value="-70" label="S-W Gap Extend Penalty (Query)" /> + <param name="sw_hit_threshold" type="float" size="10" value="68.0" label="S-W Hit Threshold" help="in percentage"/> + </when> + </conditional> + </page> + </inputs> + <outputs> + <data name="output1" format="tabular"/> + <data name="output2" format="tabular"/> + </outputs> + <requirements> + <requirement type="binary">SHRiMP_rmapper</requirement> + </requirements> + <tests> + <test> + <param name="single_or_paired" value="single" /> + <param name="skip_or_full" value="skip" /> + <param name="input_target" value="shrimp_phix_anc.fa" ftype="fasta" /> + <param name="input_query" value="shrimp_wrapper_test1.fastq" ftype="fastqsolexa"/> + <output name="output1" 
file="shrimp_wrapper_test1.out1" /> + </test> + <!-- + <test> + <param name="input1" value="shrimp_wrapper_test2_end1.fastq" ftype="fastqsolexa" /> + <param name="input2" value="shrimp_wrapper_test2_end2.fastq" ftype="fastqsolexa" /> + <param name="single_or_paired" value="paired" /> + <param name="skip_or_full" value="skip" /> + <param name="input_target" value="shrimp_eca_chrMT.fa" ftype="fasta" /> + <output name="output1" file="shrimp_wrapper_test2.out1" /> + </test> + <test> + <param name="single_or_paired" value="single" /> + <param name="skip_or_full" value="full" /> + <param name="input_target" value="shrimp_phix_anc.fa" ftype="fasta" /> + <param name="input_query" value="shrimp_wrapper_test1.fastq" ftype="fastqsolexa"/> + <param name="spaced_seed" value="111111011111" /> + <param name="seed_matches_per_window" value="2" /> + <param name="seed_hit_taboo_length" value="4" /> + <param name="seed_generation_taboo_length" value="0" /> + <param name="seed_window_length" value="115.0" /> + <param name="max_hits_per_read" value="100" /> + <param name="max_read_length" value="1000" /> + <param name="kmer" value="-1" /> + <param name="sw_match_value" value="100" /> + <param name="sw_mismatch_value" value="-150" /> + <param name="sw_gap_open_ref" value="-400" /> + <param name="sw_gap_open_query" value="-400" /> + <param name="sw_gap_ext_ref" value="-70" /> + <param name="sw_gap_ext_query" value="-70" /> + <param name="sw_hit_threshold" value="68.0" /> + <output name="output1" file="shrimp_wrapper_test1.out1" /> + </test> + <test> + <param name="single_or_paired" value="paired" /> + <param name="skip_or_full" value="full" /> + <param name="input_target" value="shrimp_eca_chrMT.fa" ftype="fasta" /> + <param name="spaced_seed" value="111111011111" /> + <param name="seed_matches_per_window" value="2" /> + <param name="seed_hit_taboo_length" value="4" /> + <param name="seed_generation_taboo_length" value="0" /> + <param name="seed_window_length" value="115.0" /> + <param 
name="max_hits_per_read" value="100" /> + <param name="max_read_length" value="1000" /> + <param name="kmer" value="-1" /> + <param name="sw_match_value" value="100" /> + <param name="sw_mismatch_value" value="-150" /> + <param name="sw_gap_open_ref" value="-400" /> + <param name="sw_gap_open_query" value="-400" /> + <param name="sw_gap_ext_ref" value="-70" /> + <param name="sw_gap_ext_query" value="-70" /> + <param name="sw_hit_threshold" value="68.0" /> + <param name="input1" value="shrimp_wrapper_test2_end1.fastq" ftype="fastqsolexa"/> + <param name="input2" value="shrimp_wrapper_test2_end2.fastq" ftype="fastqsolexa"/> + <output name="output1" file="shrimp_wrapper_test2.out1" /> + </test> + --> + </tests> +<help> + +.. class:: warningmark + +Only nucleotide sequences as query. + +----- + +**What it does** + +Run SHRiMP on letter-space reads. + +----- + +**Example** + +- Input a multiple-fastq file like the following:: + + @seq1 + TACCCGATTTTTTGCTTTCCACTTTATCCTACCCTT + +seq2 + hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh + +- Use default settings (for detail explanations, please see **Parameters** section) + +- Search against your own uploaded file, result will be in the following format:: + + +-------+-------+--------+----------+----------+---------+--------+--------+-------+------------+ + | id | chrom | strand | t.start | t.end | q.start | q.end | length | score | editstring | + +-------+-------+--------+----------+----------+---------+--------+--------+-------+------------+ + | >seq1 | chrMT | + | 14712 | 14747 | 1 | 36 | 36 | 3350 | 24T11 | + +-------+-------+--------+----------+----------+---------+--------+--------+-------+------------+ + +- The result will be formatted Table:: + + +-------+---------+---------+----------+---------+----------+---------+----------+ + | chrom | ref_loc | read_id | read_loc | ref_nuc | read_nuc | quality | coverage | + +-------+---------+---------+----------+---------+----------+---------+----------+ + | chrMT | 14711 | seq1 | 0 | T | 
T | 40 | 1 | + | chrMT | 14712 | seq1 | 1 | A | A | 40 | 1 | + | chrMT | 14713 | seq1 | 2 | C | C | 40 | 1 | + +-------+---------+---------+----------+---------+----------+---------+----------+ + +----- + +**Parameters** + +Parameter list with default value settings:: + + -s Spaced Seed (default: 111111011111) + -n Seed Matches per Window (default: 2) + -t Seed Hit Taboo Length (default: 4) + -9 Seed Generation Taboo Length (default: 0) + -w Seed Window Length (default: 115.00%) + -o Maximum Hits per Read (default: 100) + -r Maximum Read Length (default: 1000) + -d Kmer Std. Deviation Limit (default: -1 [None]) + + -m S-W Match Value (default: 100) + -i S-W Mismatch Value (default: -150) + -g S-W Gap Open Penalty (Reference) (default: -400) + -q S-W Gap Open Penalty (Query) (default: -400) + -e S-W Gap Extend Penalty (Reference) (default: -70) + -f S-W Gap Extend Penalty (Query) (default: -70) + -h S-W Hit Threshold (default: 68.00%) + +----- + +**Reference** + + **SHRiMP**: Stephen M. Rumble, Michael Brudno, Phil Lacroute, Vladimir Yanovsky, Marc Fiume, Adrian Dalca. shrimp at cs dot toronto dot edu. + +</help> +</tool>
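For readers following the quality column in the table above: the wrapper's convert_fastqsolexa_to_fasta_qual step turns each ASCII quality character into an integer by subtracting a coding offset (64 by default in the committed script). The snippet below is a minimal standalone sketch of that one step, not the wrapper's own code; the function name decode_solexa_quality and its offset parameter are illustrative assumptions.

```python
def decode_solexa_quality(quality_line, offset=64):
    """Convert an ASCII-encoded quality string into a list of integer scores.

    The default offset of 64 mirrors default_coding_value in the wrapper;
    the function name and signature are hypothetical, chosen for this sketch.
    """
    return [ord(char) - offset for char in quality_line]


if __name__ == "__main__":
    # 'h' is ASCII 104, so each 'h' decodes to 104 - 64 = 40.
    print(decode_solexa_quality("hhhh"))
```

With this offset a run of 'h' characters, as in the test FASTQ file, decodes to a score of 40 per base, which is the value shown in the quality column of the example table.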
[hg] galaxy 1509: Rewrote "Compare two queries" tool in Python.
by greg@scofield.bx.psu.edu
22 Sep '08
details:
http://www.bx.psu.edu/hg/galaxy/rev/eb941905fd70
changeset: 1509:eb941905fd70
user:      guru
date:      Tue Sep 16 14:09:16 2008 -0400
description:
Rewrote "Compare two queries" tool in Python.

2 file(s) affected in this change:

tools/filters/compare.xml
tools/filters/joinWrapper.py

diffs (68 lines):

diff -r ec547440ec97 -r eb941905fd70 tools/filters/compare.xml
--- a/tools/filters/compare.xml	Tue Sep 16 13:25:42 2008 -0400
+++ b/tools/filters/compare.xml	Tue Sep 16 14:09:16 2008 -0400
@@ -1,6 +1,6 @@
 <tool id="comp1" name="Compare two Queries">
   <description>to find common or distinct rows</description>
-  <command interpreter="perl">joinWrapper.pl $input1 $input2 $field1 $field2 $mode "Y" $out_file1</command>
+  <command interpreter="python">joinWrapper.py $input1 $input2 $field1 $field2 $mode $out_file1</command>
   <inputs>
     <param format="tabular" name="input1" type="data" label="Compare"/>
     <param name="field1" label="Using column" type="data_column" data_ref="input1" />
diff -r ec547440ec97 -r eb941905fd70 tools/filters/joinWrapper.py
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/filters/joinWrapper.py	Tue Sep 16 14:09:16 2008 -0400
@@ -0,0 +1,53 @@
+#!/usr/bin/env python
+#Guruprasad Ananda
+"""
+This tool provides the UNIX "join" functionality.
+"""
+import sys, os, tempfile
+
+def stop_err(msg):
+    sys.stderr.write(msg)
+    sys.exit()
+
+def main():
+    infile1 = sys.argv[1]
+    infile2 = sys.argv[2]
+    field1 = int(sys.argv[3])
+    field2 = int(sys.argv[4])
+    mode =sys.argv[5]
+    outfile = sys.argv[6]
+
+    tmpfile1 = tempfile.NamedTemporaryFile()
+    tmpfile2 = tempfile.NamedTemporaryFile()
+
+    try:
+        #Sort the two files based on specified fields
+        os.system("sort -k %d -o %s %s" %(field1, tmpfile1.name, infile1))
+        os.system("sort -k %d -o %s %s" %(field2, tmpfile2.name, infile2))
+    except Exception, exc:
+        stop_err( 'Initialization error -> %s' %str(exc) )
+
+    option = ""
+    for line in file(tmpfile1.name):
+        line = line.strip()
+        if line:
+            elems = line.split('\t')
+            for j in range(1,len(elems)+1):
+                if j == 1:
+                    option = "1.1"
+                else:
+                    option = option + ",1." + str(j)
+            break
+
+    if mode == "V":
+        cmdline = 'join -v 1 -o %s -1 %d -2 %d %s %s | tr " " "\t" > %s' %(option, field1, field2, tmpfile1.name, tmpfile2.name, outfile)
+    else:
+        cmdline = 'join -o %s -1 %d -2 %d %s %s | tr " " "\t" > %s' %(option, field1, field2, tmpfile1.name, tmpfile2.name, outfile)
+
+    try:
+        os.system(cmdline)
+    except Exception, exj:
+        stop_err('Error joining the two datasets -> %s' %str(exj))
+
+if __name__ == "__main__":
+    main()
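A note on the error handling in the script above: os.system() returns the command's exit status and does not raise an exception when the command itself fails, so the try/except blocks around the sort and join calls would not catch a failed sort or join. The following is a minimal sketch of one way to surface such failures, assuming the subprocess module is acceptable; the helper name run_checked is hypothetical and is not part of the committed tool.

```python
import subprocess
import sys


def run_checked(argv):
    """Run a command given as an argument list; raise if it exits non-zero.

    Hypothetical helper for illustration only. subprocess.call() returns the
    exit status, which is checked explicitly here.
    """
    status = subprocess.call(argv)
    if status != 0:
        raise RuntimeError(
            "command failed with exit status %d: %s" % (status, " ".join(argv))
        )
    return status


if __name__ == "__main__":
    # Use the running Python interpreter as a stand-in for sort/join so the
    # example does not depend on which external tools are installed.
    run_checked([sys.executable, "-c", "pass"])
    try:
        run_checked([sys.executable, "-c", "import sys; sys.exit(3)"])
    except RuntimeError as exc:
        sys.stderr.write("%s\n" % exc)
```

In the tool itself, the same check could wrap the sort and join invocations, with the caught error passed on to stop_err(). Passing an argument list rather than a shell string also avoids quoting problems with unusual file names, though the pipe through tr and the output redirection would then need to be handled separately.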
[hg] galaxy 1508: Small update for maf stats tool.
by greg@scofield.bx.psu.edu
22 Sep '08
details:
http://www.bx.psu.edu/hg/galaxy/rev/ec547440ec97
changeset: 1508:ec547440ec97 user: Dan Blankenberg <dan(a)bx.psu.edu> date: Tue Sep 16 13:25:42 2008 -0400 description: Small update for maf stats tool. 2 file(s) affected in this change: lib/galaxy/tools/util/maf_utilities.py tools/maf/maf_stats.py diffs (99 lines): diff -r 842f1883cf53 -r ec547440ec97 lib/galaxy/tools/util/maf_utilities.py --- a/lib/galaxy/tools/util/maf_utilities.py Mon Sep 15 15:04:41 2008 -0400 +++ b/lib/galaxy/tools/util/maf_utilities.py Tue Sep 16 13:25:42 2008 -0400 @@ -199,7 +199,7 @@ yield block def get_chopped_blocks_with_index_offset_for_region( index, src, region, species = None, mincols = 0, force_strand = None ): for block, idx, offset in index.get_as_iterator_with_index_and_offset( src, region.start, region.end ): - block = chop_block_by_region( block, src, region, species, mincols ) + block = chop_block_by_region( block, src, region, species, mincols, force_strand ) if block is not None: yield block, idx, offset @@ -209,6 +209,25 @@ else: alignment = RegionAlignment( end - start, primary_species ) return fill_region_alignment( alignment, index, primary_species, chrom, start, end, strand, species, mincols ) +#reduces a block to only positions exisiting in the src provided +def reduce_block_by_primary_genome( block, species, chromosome, region_start ): + #returns ( startIndex, {species:texts} + #where texts' contents are reduced to only positions existing in the primary genome + src = "%s.%s" % ( species, chromosome ) + ref = block.get_component_by_src( src ) + start_offset = ref.start - region_start + species_texts = {} + for c in block.components: + species_texts[ c.src.split( '.' 
)[0] ] = list( c.text ) + #remove locations which are gaps in the primary species, starting from the downstream end + for i in range( len( species_texts[ species ] ) - 1, -1, -1 ): + if species_texts[ species ][i] == '-': + for text in species_texts.values(): + text.pop( i ) + for spec, text in species_texts.items(): + species_texts[spec] = ''.join( text ) + return ( start_offset, species_texts ) + #fills a region alignment def fill_region_alignment( alignment, index, primary_species, chrom, start, end, strand = '+', species = None, mincols = 0 ): region = bx.intervals.Interval( start, end ) @@ -216,22 +235,7 @@ region.strand = strand primary_src = "%s.%s" % ( primary_species, chrom ) - def reduce_block_by_primary_genome( block ): - #returns ( startIndex, {species:texts} - #where texts' contents are reduced to only positions existing in the primary genome - ref = block.get_component_by_src( primary_src ) - start_offset = ref.start - start - species_texts = {} - for c in block.components: - species_texts[ c.src.split( '.' 
)[0] ] = list( c.text ) - #remove locations which are gaps in the primary species, starting from the downstream end - for i in range( len( species_texts[ primary_species ] ) - 1, -1, -1 ): - if species_texts[ primary_species ][i] == '-': - for text in species_texts.values(): - text.pop( i ) - for spec, text in species_texts.items(): - species_texts[spec] = ''.join( text ) - return ( start_offset, species_texts ) + #Order blocks overlaping this position by score, lowest first blocks = [] @@ -248,7 +252,7 @@ for block_dict in blocks: block = chop_block_by_region( block_dict[1].get_at_offset( block_dict[2] ), primary_src, region, species, mincols, strand ) if block is None: continue - start_offset, species_texts = reduce_block_by_primary_genome( block ) + start_offset, species_texts = reduce_block_by_primary_genome( block, primary_species, chrom, start ) for spec, text in species_texts.items(): try: alignment.set_range( start_offset, spec, text ) diff -r 842f1883cf53 -r ec547440ec97 tools/maf/maf_stats.py --- a/tools/maf/maf_stats.py Mon Sep 15 15:04:41 2008 -0400 +++ b/tools/maf/maf_stats.py Tue Sep 16 13:25:42 2008 -0400 @@ -64,19 +64,11 @@ for c in block.components: spec = c.src.split( '.' )[0] if spec not in coverage: coverage[spec] = zeros( region.end - region.start, dtype = bool ) - ref = block.get_component_by_src( src ) - #skip gap locations due to insertions in secondary species relative to primary species - start_offset = ref.start - region.start - num_gaps = 0 - for i in range( len( ref.text.rstrip().rstrip( "-" ) ) ): - if ref.text[i] in ["-"]: - num_gaps += 1 - continue - #Toggle base if covered - for comp in block.components: - spec = comp.src.split( '.' 
)[0] - if comp.text and comp.text[i] not in ['-']: - coverage[spec][start_offset + i - num_gaps] = True + start_offset, alignment = maf_utilities.reduce_block_by_primary_genome( block, dbkey, region.chrom, region.start ) + for i in range( len( alignment[dbkey] ) ): + for spec, text in alignment.items(): + if text[i] != '-': + coverage[spec][start_offset + i] = True if summary: #record summary for key in coverage.keys():
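The core of `reduce_block_by_primary_genome` in the diff above is dropping every alignment column where the primary species has a gap. A minimal sketch of that step on plain dicts, assuming no bx-python objects; the function name `drop_primary_gap_columns` and the sample sequences are hypothetical:

```python
def drop_primary_gap_columns(texts, primary):
    """Remove alignment columns where the primary species has a gap.

    texts   -- dict mapping species name to aligned text (equal lengths)
    primary -- key in texts naming the primary species
    """
    columns = {spec: list(text) for spec, text in texts.items()}
    # Walk from the downstream end so earlier indices stay valid after pop().
    for i in range(len(columns[primary]) - 1, -1, -1):
        if columns[primary][i] == '-':
            for col in columns.values():
                col.pop(i)
    return {spec: ''.join(col) for spec, col in columns.items()}


reduced = drop_primary_gap_columns({'hg18': 'AC-GT', 'mm9': 'ACTG-'}, 'hg18')
print(reduced)
```

After this reduction every remaining column corresponds to one base of the primary genome, which is why the new `maf_stats.py` loop can index coverage with `start_offset + i` and no longer needs to count gaps.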
[hg] galaxy 1510: Strip whitespace from columns in file for data...
by greg@scofield.bx.psu.edu
22 Sep '08
details:
http://www.bx.psu.edu/hg/galaxy/rev/f8e3770c23f6
changeset: 1510:f8e3770c23f6 user: Dan Blankenberg <dan(a)bx.psu.edu> date: Tue Sep 16 14:10:53 2008 -0400 description: Strip whitespace from columns in file for dataset_metadata_in_file validator. 1 file(s) affected in this change: lib/galaxy/tools/parameters/validation.py diffs (12 lines): diff -r ec547440ec97 -r f8e3770c23f6 lib/galaxy/tools/parameters/validation.py --- a/lib/galaxy/tools/parameters/validation.py Tue Sep 16 13:25:42 2008 -0400 +++ b/lib/galaxy/tools/parameters/validation.py Tue Sep 16 14:10:53 2008 -0400 @@ -247,7 +247,7 @@ if line_startswith is None or line.startswith( line_startswith ): fields = line.split( '\t' ) if metadata_column < len( fields ): - self.valid_values.append( fields[metadata_column] ) + self.valid_values.append( fields[metadata_column].strip() ) def validate( self, value, history = None ): if not value: return if hasattr( value, "metadata" ):
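To see why the `.strip()` matters: when the metadata column is the last field on a line, the raw value still carries the trailing newline (and any padding), so it would never compare equal to a dataset's metadata value. A minimal sketch of the parsing step, assuming a hypothetical standalone function `read_valid_values` rather than the validator class itself:

```python
def read_valid_values(lines, metadata_column, line_startswith=None):
    """Collect the allowed values from one column of a tab-separated file.

    Hypothetical helper mirroring the loop in the diff above.
    """
    values = []
    for line in lines:
        if line_startswith is None or line.startswith(line_startswith):
            fields = line.split('\t')
            if metadata_column < len(fields):
                # strip() drops the trailing newline and stray spaces
                values.append(fields[metadata_column].strip())
    return values


values = read_valid_values(["hg18\tHuman\n", "mm9\tMouse \n"], 1)
print(values)
```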
[hg] galaxy 1511: Merge local heads
by greg@scofield.bx.psu.edu
22 Sep '08
details:
http://www.bx.psu.edu/hg/galaxy/rev/c3ce08879473
changeset: 1511:c3ce08879473 user: Dan Blankenberg <dan(a)bx.psu.edu> date: Tue Sep 16 14:26:14 2008 -0400 description: Merge local heads 0 file(s) affected in this change: diffs (12 lines): diff -r eb941905fd70 -r c3ce08879473 lib/galaxy/tools/parameters/validation.py --- a/lib/galaxy/tools/parameters/validation.py Tue Sep 16 14:09:16 2008 -0400 +++ b/lib/galaxy/tools/parameters/validation.py Tue Sep 16 14:26:14 2008 -0400 @@ -247,7 +247,7 @@ if line_startswith is None or line.startswith( line_startswith ): fields = line.split( '\t' ) if metadata_column < len( fields ): - self.valid_values.append( fields[metadata_column] ) + self.valid_values.append( fields[metadata_column].strip() ) def validate( self, value, history = None ): if not value: return if hasattr( value, "metadata" ):
[hg] galaxy 1512: The MetadataCollection object is now created o...
by greg@scofield.bx.psu.edu
22 Sep '08
details:
http://www.bx.psu.edu/hg/galaxy/rev/1e408bab8941
changeset: 1512:1e408bab8941 user: Dan Blankenberg <dan(a)bx.psu.edu> date: Tue Sep 16 15:23:23 2008 -0400 description: The MetadataCollection object is now created only once per dataset object instance (and when datatype is changed), instead of each time dataset.metadata is called. The 'no_value' attribute for a metadata element's spec is returned when the metadata element's value is None. 2 file(s) affected in this change: lib/galaxy/datatypes/metadata.py lib/galaxy/model/__init__.py diffs (89 lines): diff -r c3ce08879473 -r 1e408bab8941 lib/galaxy/datatypes/metadata.py --- a/lib/galaxy/datatypes/metadata.py Tue Sep 16 14:26:14 2008 -0400 +++ b/lib/galaxy/datatypes/metadata.py Tue Sep 16 15:23:23 2008 -0400 @@ -151,9 +151,16 @@ """ def __init__(self, parent, spec): self.parent = parent - self.bunch = parent._metadata or dict() if spec is None: self.spec = MetadataSpecCollection() else: self.spec = spec + + #set default metadata values + if not self.parent._metadata: + self.parent._metadata = {} + for name, value in self.spec.items(): + if name not in self.bunch: + self.bunch[name] = value.default + def __iter__(self): return self.bunch.__iter__() def get( self, key, default=None ): @@ -168,19 +175,21 @@ def __nonzero__(self): return self.bunch.__nonzero__() def __getattr__(self, name): - if self.bunch.get( name ): - return self.bunch.get( name ) + if name == "bunch": + return self.parent._metadata + rval = self.bunch.get( name ) + if rval is None: + rval = self.spec.get( name, None ) + if rval: + rval = rval.no_value + return rval + def __setattr__(self, name, value): + if name in ["parent","spec"]: + self.__dict__[name] = value + elif name == "bunch": + self.parent._metadata = value else: - if self.spec.get(name, None): - return self.spec[name].default - else: - return None - def __setattr__(self, name, value): - if name in ["parent","bunch","spec"]: - self.__dict__[name] = value - else: - self.__dict__["bunch"][name] = value - self.bunch = self.parent._metadata 
= dict( self.bunch ) + self.bunch[name] = value MetadataElement = Statement(MetadataElementSpec) diff -r c3ce08879473 -r 1e408bab8941 lib/galaxy/model/__init__.py --- a/lib/galaxy/model/__init__.py Tue Sep 16 14:26:14 2008 -0400 +++ b/lib/galaxy/model/__init__.py Tue Sep 16 15:23:23 2008 -0400 @@ -113,7 +113,7 @@ self.peek = peek self.extension = extension self.designation = designation - self._metadata = metadata or dict() + self.metadata = metadata or dict() self.dbkey = dbkey self.deleted = deleted self.visible = visible @@ -159,9 +159,9 @@ return datatypes_registry.get_datatype_by_extension( self.extension ) def get_metadata( self ): - if not self._metadata: - self._metadata = dict() - return MetadataCollection( self, self.datatype.metadata_spec ) + if not hasattr( self, '_metadata_collection' ): + self._metadata_collection = MetadataCollection( self, self.datatype.metadata_spec ) + return self._metadata_collection def set_metadata( self, bunch ): # Needs to accept a MetadataCollection, a bunch, or a dict self._metadata = dict( bunch.items() ) @@ -191,6 +191,8 @@ def change_datatype( self, new_ext ): self.clear_associated_files() + if hasattr( self, '_metadata_collection' ): + del self._metadata_collection datatypes_registry.change_datatype( self, new_ext ) def get_size( self ): """Returns the size of the data on disk"""
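The pattern in the `lib/galaxy/model/__init__.py` half of this change is a per-instance cache that is invalidated when the datatype changes. A minimal sketch of that pattern, assuming simplified stand-in classes (`Collection` and `Dataset` here are hypothetical and are not Galaxy's classes):

```python
class Collection:
    """Stand-in for MetadataCollection; only remembers its parent."""
    def __init__(self, parent):
        self.parent = parent


class Dataset:
    """Stand-in for the dataset model class."""
    def __init__(self, extension):
        self.extension = extension
        self._metadata = {}

    @property
    def metadata(self):
        # Build the collection once per instance, then reuse it.
        if not hasattr(self, '_metadata_collection'):
            self._metadata_collection = Collection(self)
        return self._metadata_collection

    def change_datatype(self, new_ext):
        # A new datatype means a new spec, so drop the cached collection.
        if hasattr(self, '_metadata_collection'):
            del self._metadata_collection
        self.extension = new_ext


dataset = Dataset('bed')
first = dataset.metadata
same = dataset.metadata
dataset.change_datatype('interval')
rebuilt = dataset.metadata
```

Because the collection now outlives a single attribute access, it reads and writes through `parent._metadata` instead of holding its own copy of the dict, which is what the `bunch` handling in `__getattr__` and `__setattr__` above arranges.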
[hg] galaxy 1513: Quick 'n easy solution to the EMBOSS stage in/...
by greg@scofield.bx.psu.edu
22 Sep '08
details:
http://www.bx.psu.edu/hg/galaxy/rev/cf17b5a16eff
changeset: 1513:cf17b5a16eff user: Nate Coraor <nate(a)bx.psu.edu> date: Wed Sep 17 10:45:20 2008 -0400 description: Quick 'n easy solution to the EMBOSS stage in/out problem: read the outputs from the "real" galaxy path instead of the temp stage path. 1 file(s) affected in this change: lib/galaxy/jobs/runners/pbs.py diffs (31 lines): diff -r 1e408bab8941 -r cf17b5a16eff lib/galaxy/jobs/runners/pbs.py --- a/lib/galaxy/jobs/runners/pbs.py Tue Sep 16 15:23:23 2008 -0400 +++ b/lib/galaxy/jobs/runners/pbs.py Wed Sep 17 10:45:20 2008 -0400 @@ -146,7 +146,7 @@ if self.app.config.pbs_application_server: pbs_ofile = self.app.config.pbs_application_server + ':' + ofile pbs_efile = self.app.config.pbs_application_server + ':' + efile - stagein = self.get_stage_in_out( job_wrapper.get_input_fnames() + job_wrapper.get_output_fnames() ) + stagein = self.get_stage_in_out( job_wrapper.get_input_fnames() + job_wrapper.get_output_fnames(), symlink=True ) stageout = self.get_stage_in_out( job_wrapper.get_output_fnames() ) job_attrs = pbs.new_attropl(5) job_attrs[0].name = pbs.ATTR_o @@ -372,15 +372,15 @@ self.queue.put( self.STOP_SIGNAL ) log.info( "pbs job runner stopped" ) - def get_stage_in_out( self, fnames ): + def get_stage_in_out( self, fnames, symlink=False ): """Convenience function to create a stagein/stageout list""" stage = '' for fname in fnames: if os.access(fname, os.R_OK): - if stage != '': + if stage: stage += ',' # pathnames are now absolute - if self.app.config.pbs_stage_path != '': + if symlink and self.app.config.pbs_stage_path: stage_name = os.path.join(self.app.config.pbs_stage_path, os.path.split(fname)[1]) else: stage_name = fname
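The effect of the new `symlink` flag is limited to how the staged file name is chosen: only the stage-in list is rewritten into `pbs_stage_path`, while stage-out keeps the real path. A minimal sketch of that choice in isolation, assuming a hypothetical helper `stage_name_for` and made-up paths; the rest of `get_stage_in_out` is not shown in the diff and is not reproduced here:

```python
import os


def stage_name_for(fname, stage_path, symlink=False):
    """Pick the name a file gets on the execution host.

    Hypothetical helper mirroring the if/else in the diff above.
    """
    if symlink and stage_path:
        # Stage into the temp stage directory under the file's base name.
        return os.path.join(stage_path, os.path.split(fname)[1])
    # Otherwise keep the real (absolute) Galaxy path.
    return fname


real = '/galaxy/database/files/dataset_1.dat'
staged_in = stage_name_for(real, '/tmp/stage', symlink=True)
staged_out = stage_name_for(real, '/tmp/stage')
```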
[hg] galaxy 1515: Forgot to update tool_conf.sample with the new...
by greg@scofield.bx.psu.edu
22 Sep '08
details:
http://www.bx.psu.edu/hg/galaxy/rev/280e8b68f845
changeset: 1515:280e8b68f845 user: guru date: Wed Sep 17 17:14:59 2008 -0400 description: Forgot to update tool_conf.sample with the new tool details. 1 file(s) affected in this change: tool_conf.xml.sample diffs (10 lines): diff -r 33e06a98b6d8 -r 280e8b68f845 tool_conf.xml.sample --- a/tool_conf.xml.sample Wed Sep 17 16:42:08 2008 -0400 +++ b/tool_conf.xml.sample Wed Sep 17 17:14:59 2008 -0400 @@ -281,5 +281,6 @@ <tool file="metag_tools/megablast_wrapper.xml" /> <tool file="metag_tools/megablast_xml_parser.xml" /> <tool file="metag_tools/blat_wrapper.xml" /> + <tool file="metag_tools/mapping_to_ucsc.xml" /> </section> </toolbox>
[hg] galaxy 1514: New tool to format short read mapping data as ...
by greg@scofield.bx.psu.edu
22 Sep '08
details:
http://www.bx.psu.edu/hg/galaxy/rev/33e06a98b6d8
changeset: 1514:33e06a98b6d8 user: guru date: Wed Sep 17 16:42:08 2008 -0400 description: New tool to format short read mapping data as a UCSC custom track,. 2 file(s) affected in this change: tools/metag_tools/mapping_to_ucsc.py tools/metag_tools/mapping_to_ucsc.xml diffs (415 lines): diff -r cf17b5a16eff -r 33e06a98b6d8 tools/metag_tools/mapping_to_ucsc.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/metag_tools/mapping_to_ucsc.py Wed Sep 17 16:42:08 2008 -0400 @@ -0,0 +1,204 @@ +#! /usr/bin/python + +from galaxy import eggs +import sys, tempfile, os + +assert sys.version_info[:2] >= (2.4) + +def stop_err(msg): + sys.stderr.write(msg) + sys.exit() + +def main(): + + out_fname = sys.argv[1] + in_fname = sys.argv[2] + chr_col = int(sys.argv[3])-1 + coord_col = int(sys.argv[4])-1 + track_type = sys.argv[5] + if track_type == 'coverage' or track_type == 'both': + coverage_col = int(sys.argv[6])-1 + cname = sys.argv[7] + cdescription = sys.argv[8] + ccolor = sys.argv[9].replace('-',',') + cvisibility = sys.argv[10] + if track_type == 'snp' or track_type == 'both': + if track_type == 'both': + j = 5 + else: + j = 0 + #sname = sys.argv[7+j] + sdescription = sys.argv[6+j] + svisibility = sys.argv[7+j] + #ref_col = int(sys.argv[10+j])-1 + read_col = int(sys.argv[8+j])-1 + + + # Sort the input file based on chromosome (alphabetically) and start co-ordinates (numerically) + sorted_infile = tempfile.NamedTemporaryFile() + try: + os.system("sort -k %d,%d -k %dn -o %s %s" %(chr_col+1,chr_col+1,coord_col+1,sorted_infile.name,in_fname)) + except Exception, exc: + stop_err( 'Initialization error -> %s' %str(exc) ) + + #generate chr list + sorted_infile.seek(0) + chr_vals = [] + for line in file( sorted_infile.name ): + line = line.strip() + if not(line): + continue + try: + fields = line.split('\t') + chr = fields[chr_col] + if chr not in chr_vals: + chr_vals.append(chr) + except: + pass + if not(chr_vals): + stop_err("Skipped all lines as invalid.") + + if track_type 
== 'coverage' or track_type == 'both': + if track_type == 'coverage': + fout = open( out_fname, "w" ) + else: + fout = tempfile.NamedTemporaryFile() + fout.write('''track type=wiggle_0 name="%s" description="%s" color=%s visibility=%s\n''' \ + % ( cname, cdescription, ccolor, cvisibility )) + if track_type == 'snp' or track_type == 'both': + fout_a = tempfile.NamedTemporaryFile() + fout_t = tempfile.NamedTemporaryFile() + fout_g = tempfile.NamedTemporaryFile() + fout_c = tempfile.NamedTemporaryFile() + fout_ref = tempfile.NamedTemporaryFile() + + fout_a.write('''track type=wiggle_0 name="%s" description="%s" color=%s visibility=%s\n''' \ + % ( "Track A", sdescription, '255,0,0', svisibility )) + fout_t.write('''track type=wiggle_0 name="%s" description="%s" color=%s visibility=%s\n''' \ + % ( "Track T", sdescription, '0,255,0', svisibility )) + fout_g.write('''track type=wiggle_0 name="%s" description="%s" color=%s visibility=%s\n''' \ + % ( "Track G", sdescription, '0,0,255', svisibility )) + fout_c.write('''track type=wiggle_0 name="%s" description="%s" color=%s visibility=%s\n''' \ + % ( "Track C", sdescription, '255,0,255', svisibility )) + + + sorted_infile.seek(0) + for line in file( sorted_infile.name ): + line = line.strip() + if not(line): + continue + try: + fields = line.split('\t') + chr = fields[chr_col] + start = int(fields[coord_col]) + assert start > 0 + except: + continue + try: + ind = chr_vals.index(chr) #encountered chr for the 1st time + del chr_vals[ind] + prev_start = '' + header = "variableStep chrom=%s\n" %(chr) + if track_type == 'coverage' or track_type == 'both': + coverage = int(fields[coverage_col]) + line1 = "%s\t%s\n" %(start,coverage) + fout.write("%s%s" %(header,line1)) + if track_type == 'snp' or track_type == 'both': + a = t = g = c = 0 + fout_a.write("%s" %(header)) + fout_t.write("%s" %(header)) + fout_g.write("%s" %(header)) + fout_c.write("%s" %(header)) + try: + #ref_nt = fields[ref_col].capitalize() + read_nt = 
fields[read_col].capitalize() + try: + nt_ind = ['A','T','G','C'].index(read_nt) + if nt_ind == 0: + a+=1 + elif nt_ind == 1: + t+=1 + elif nt_ind == 2: + g+=1 + else: + c+=1 + except ValueError: + pass + except: + pass + prev_start = start + except ValueError: + if start != prev_start: + if track_type == 'coverage' or track_type == 'both': + coverage = int(fields[coverage_col]) + fout.write("%s\t%s\n" %(start,coverage)) + if track_type == 'snp' or track_type == 'both': + if a: + fout_a.write("%s\t%s\n" %(prev_start,a)) + if t: + fout_t.write("%s\t%s\n" %(prev_start,t)) + if g: + fout_g.write("%s\t%s\n" %(prev_start,g)) + if c: + fout_c.write("%s\t%s\n" %(prev_start,c)) + a = t = g = c = 0 + try: + #ref_nt = fields[ref_col].capitalize() + read_nt = fields[read_col].capitalize() + try: + nt_ind = ['A','T','G','C'].index(read_nt) + if nt_ind == 0: + a+=1 + elif nt_ind == 1: + t+=1 + elif nt_ind == 2: + g+=1 + else: + c+=1 + except ValueError: + pass + except: + pass + prev_start = start + else: + if track_type == 'snp' or track_type == 'both': + try: + #ref_nt = fields[ref_col].capitalize() + read_nt = fields[read_col].capitalize() + try: + nt_ind = ['A','T','G','C'].index(read_nt) + if nt_ind == 0: + a+=1 + elif nt_ind == 1: + t+=1 + elif nt_ind == 2: + g+=1 + else: + c+=1 + except ValueError: + pass + except: + pass + + if track_type == 'snp' or track_type == 'both': + if a: + fout_a.write("%s\t%s\n" %(prev_start,a)) + if t: + fout_t.write("%s\t%s\n" %(prev_start,t)) + if g: + fout_g.write("%s\t%s\n" %(prev_start,g)) + if c: + fout_c.write("%s\t%s\n" %(prev_start,c)) + + fout_a.seek(0) + fout_g.seek(0) + fout_t.seek(0) + fout_c.seek(0) + + if track_type == 'snp': + os.system("cat %s %s %s %s >> %s" %(fout_a.name,fout_t.name,fout_g.name,fout_c.name,out_fname)) + elif track_type == 'both': + fout.seek(0) + os.system("cat %s %s %s %s %s | cat > %s" %(fout.name,fout_a.name,fout_t.name,fout_g.name,fout_c.name,out_fname)) +if __name__ == "__main__": + main() \ No newline 
at end of file diff -r cf17b5a16eff -r 33e06a98b6d8 tools/metag_tools/mapping_to_ucsc.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/metag_tools/mapping_to_ucsc.xml Wed Sep 17 16:42:08 2008 -0400 @@ -0,0 +1,202 @@ +<tool id="mapToUCSC" name="Format mapping data" version="1.0.0"> + <description> as UCSC custom track</description> + <command interpreter="python"> + mapping_to_ucsc.py + $out_file1 + $input + $chr_col + $coord_col + $track.track_type + #if $track.track_type == "coverage" or $track.track_type == "both" + $track.coverage_col + "${track.cname}" + "${track.cdescription}" + "${track.ccolor}" + "${track.cvisibility}" + #end if + #if $track.track_type == "snp" or $track.track_type == "both" + "${track.sdescription}" + "${track.svisibility}" + $track.col2 + #end if + </command> + <inputs> + <param format="tabular" name="input" type="data" label="Select mapping data"/> + <param name="chr_col" type="data_column" data_ref="input" label="Column for reference chromosome" /> + <param name="coord_col" type="data_column" data_ref="input" numerical="True" label="Numerical column for reference co-ordinate" /> + <conditional name="track"> + <param name="track_type" type="select" label="Display"> + <option value="snp" selected="true">SNPs</option> + <option value="coverage">Read coverage</option> + <option value="both">Both</option> + </param> + <when value = "coverage"> + <param name="coverage_col" type="data_column" data_ref="input" numerical="True" label="Numerical column for read coverage" /> + <param name="cname" type="text" size="15" value="User Track" label="Coverage track name"> + <validator type="length" max="15"/> + </param> + <param name="cdescription" type="text" value="User Supplied Coverage Track (from Galaxy)" label="Coverage track description"> + <validator type="length" max="60" size="15"/> + </param> + <param label="Coverage track Color" name="ccolor" type="select"> + <option selected="yes" value="0-0-0">Black</option> + <option 
value="255-0-0">Red</option> + <option value="0-255-0">Green</option> + <option value="0-0-255">Blue</option> + <option value="255-0-255">Magenta</option> + <option value="0-255-255">Cyan</option> + <option value="255-215-0">Gold</option> + <option value="160-32-240">Purple</option> + <option value="255-140-0">Orange</option> + <option value="255-20-147">Pink</option> + <option value="92-51-23">Dark Chocolate</option> + <option value="85-107-47">Olive green</option> + </param> + <param label="Coverage track Visibility" name="cvisibility" type="select"> + <option selected="yes" value="1">Dense</option> + <option value="2">Full</option> + <option value="3">Pack</option> + <option value="4">Squish</option> + <option value="0">Hide</option> + </param> + </when> + + <when value = "snp"> + <!-- + <param name="col1" type="data_column" data_ref="input" label="Column containing the reference nucleotide" /> + --> + <param name="col2" type="data_column" data_ref="input" label="Column containing the read nucleotide" /> + <!-- + <param name="sname" type="text" size="15" value="User Track-2" label="SNP track name"> + <validator type="length" max="15"/> + </param> + --> + <param name="sdescription" type="text" value="User Supplied Track (from Galaxy)" label="SNP track description"> + <validator type="length" max="60" size="15"/> + </param> + <param label="SNP track Visibility" name="svisibility" type="select"> + <option selected="yes" value="1">Dense</option> + <option value="2">Full</option> + <option value="3">Pack</option> + <option value="4">Squish</option> + <option value="0">Hide</option> + </param> + </when> + + <when value = "both"> + <param name="coverage_col" type="data_column" data_ref="input" numerical="True" label="Numerical column for read coverage" /> + <param name="cname" type="text" size="15" value="User Track" label="Coverage track name"> + <validator type="length" max="15"/> + </param> + <param name="cdescription" type="text" size="15" value="User Supplied 
Track (from Galaxy)" label="Coverage track description"> + <validator type="length" max="60"/> + </param> + <param label="Coverage track Color" name="ccolor" type="select"> + <option selected="yes" value="0-0-0">Black</option> + <option value="255-0-0">Red</option> + <option value="0-255-0">Green</option> + <option value="0-0-255">Blue</option> + <option value="255-0-255">Magenta</option> + <option value="0-255-255">Cyan</option> + <option value="255-215-0">Gold</option> + <option value="160-32-240">Purple</option> + <option value="255-140-0">Orange</option> + <option value="255-20-147">Pink</option> + <option value="92-51-23">Dark Chocolate</option> + <option value="85-107-47">Olive green</option> + </param> + <param label="Coverage track Visibility" name="cvisibility" type="select"> + <option selected="yes" value="1">Dense</option> + <option value="2">Full</option> + <option value="3">Pack</option> + <option value="4">Squish</option> + <option value="0">Hide</option> + </param> + <!-- + <param name="col1" type="data_column" data_ref="input" label="Column containing the reference nucleotide" /> + --> + <param name="col2" type="data_column" data_ref="input" label="Column containing the read nucleotide" /> + <!-- + <param name="sname" type="text" size="15" value="User Track-2" label="SNP track name"> + <validator type="length" max="15"/> + </param> + --> + <param name="sdescription" type="text" size="15" value="User Supplied Track (from Galaxy)" label="SNP track description"> + <validator type="length" max="60"/> + </param> + <param label="SNP track Visibility" name="svisibility" type="select"> + <option selected="yes" value="1">Dense</option> + <option value="2">Full</option> + <option value="3">Pack</option> + <option value="4">Squish</option> + <option value="0">Hide</option> + </param> + </when> + </conditional> + </inputs> + <outputs> + <data format="customtrack" name="out_file1"/> + </outputs> + + + <help> + +.. 
class:: infomark + +**What it does** + +This tool formats mapping data generated by short read mappers, as a custom track that can be displayed at UCSC genome browser. + +----- + +.. class:: warningmark + +**Note** + +This tool requires the mapping data to contain at least the following information: + +chromosome, genome coordinate, read nucleotide (if option to display is SNPs), read coverage (if option to display is Read coverage). + +----- + +**Example** + +For the following Mapping data:: + + #chr g_start read_id read_coord g_nt read_nt qual read_coverage + chrM 1 1:29:1672:1127/1 11 G G 40 134 + chrM 1 1:32:93:933/1 4 G A 40 134 + chrM 1 1:34:116:2032/1 11 G A 40 134 + chrM 1 1:39:207:964/1 1 G G 40 134 + chrM 2 1:3:359:848/1 1 G C 40 234 + chrM 2 1:40:1435:1013/1 1 G G 40 234 + chrM 3 1:40:730:972/1 9 G G 40 334 + chrM 4 1:42:1712:921/2 31 G T 35 434 + chrM 4 1:44:1649:493/1 4 G G 40 434 + +running this tool to display both SNPs and Read coverage will return the following tracks, containing aggregated data per genome co-ordinate:: + + track type=wiggle_0 name="Coverage Track" description="User Supplied Track (from Galaxy)" color=0,0,0 visibility=1 + variableStep chrom=chrM + 1 134 + 2 234 + 3 334 + 4 434 + track type=wiggle_0 name="Track A" description="User Supplied SNP Track (from Galaxy)" color=255,0,0 visibility=1 + variableStep chrom=chrM + 1 2 + track type=wiggle_0 name="Track T" description="User Supplied SNP Track (from Galaxy)" color=0,255,0 visibility=1 + variableStep chrom=chrM + 4 1 + track type=wiggle_0 name="Track G" description="User Supplied SNP Track (from Galaxy)" color=0,0,255 visibility=1 + variableStep chrom=chrM + 1 2 + 2 1 + 3 1 + 4 1 + track type=wiggle_0 name="Track C" description="User Supplied SNP Track (from Galaxy)" color=255,0,255 visibility=1 + variableStep chrom=chrM + 2 1 + + </help> +</tool>
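The SNP tracks in the help example are per-coordinate counts of each read nucleotide. A minimal sketch of that aggregation on the example rows, assuming a hypothetical function `count_read_nucleotides` built on `collections.Counter`; the tool itself streams the sorted file and keeps four running counters instead:

```python
from collections import Counter


def count_read_nucleotides(rows):
    """Count read nucleotides per (chromosome, coordinate).

    rows -- iterable of (chrom, start, read_nt) tuples.
    Returns a dict with one Counter per nucleotide A, T, G, C.
    """
    counts = {nt: Counter() for nt in "ATGC"}
    for chrom, start, read_nt in rows:
        nt = read_nt.upper()
        if nt in counts:  # ignore anything that is not A, T, G or C
            counts[nt][(chrom, start)] += 1
    return counts


# (chr, g_start, read_nt) taken from the example table in the help text
rows = [
    ("chrM", 1, "G"), ("chrM", 1, "A"), ("chrM", 1, "A"), ("chrM", 1, "G"),
    ("chrM", 2, "C"), ("chrM", 2, "G"),
    ("chrM", 3, "G"),
    ("chrM", 4, "T"), ("chrM", 4, "G"),
]
counts = count_read_nucleotides(rows)
```

Each Counter corresponds to one of the "Track A", "Track T", "Track G" and "Track C" wiggle tracks; coordinates with a zero count are simply absent, matching the tool's behaviour of writing only non-zero values.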