Hi all,
If anyone is interested, I've just wrapped a little Python script I'd prepared earlier for use in Galaxy. It covers a fairly common need: "back translating" a protein alignment into a nucleotide alignment by threading the unaligned nucleotide sequences onto it.
We're testing this locally, and barring any major issues I would expect to release this to the main Tool Shed in a week or so - but if any of you want to have a play with it now, and pass on feedback, please do so:
Development repository: https://github.com/peterjc/pico_galaxy/tree/master/tools/align_back_trans
Test Tool Shed release: http://testtoolshed.g2.bx.psu.edu/view/peterjc/align_back_trans
Planned Tool Shed release: http://toolshed.g2.bx.psu.edu/view/peterjc/align_back_trans (pending)
This uses Biopython's AlignIO module, so it can handle a range of alignment file formats. For now I have restricted it to "fasta", "clustal" and "phylip", which I believe are all already in use on the Tool Shed. There is probably scope here for a more coordinated effort to define Galaxy datatypes in this area, including things like the PFAM/Stockholm format and the strict/relaxed variants of PHYLIP.
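For anyone unfamiliar with the idea, here is a minimal sketch of the "threading" step for a single sequence. It is not the code from the tool itself: the function name `thread_codons` and the example sequences are made up for illustration, it uses plain Python strings rather than Biopython objects, and it assumes the nucleotide sequence is exactly three times the ungapped protein length (so no trailing stop codon, and no check that the codons actually translate to the protein).

```python
def thread_codons(gapped_protein, nucleotides, gap="-"):
    """Thread an unaligned nucleotide sequence onto a gapped protein sequence.

    Hypothetical helper for illustration only. Each residue in the gapped
    protein takes the next codon (three nucleotides); each gap character
    becomes three gap characters in the nucleotide alignment.
    """
    ungapped_length = len(gapped_protein) - gapped_protein.count(gap)
    if len(nucleotides) != 3 * ungapped_length:
        raise ValueError(
            "Expected %i nucleotides for %i residues, got %i"
            % (3 * ungapped_length, ungapped_length, len(nucleotides))
        )
    codons = []
    position = 0  # index of the next unused nucleotide
    for residue in gapped_protein:
        if residue == gap:
            # A gap in the protein alignment spans a whole codon.
            codons.append(gap * 3)
        else:
            codons.append(nucleotides[position:position + 3])
            position += 3
    return "".join(codons)


# Made-up example: protein "MKL" aligned with one gap, plus its coding sequence.
print(thread_codons("MK-L", "ATGAAACTG"))  # prints ATGAAA---CTG
```

Applied to every record in a protein alignment, this gives a nucleotide alignment with the same gap pattern at codon resolution.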
Regards,
Peter
This new tool is now live on the main Tool Shed: http://toolshed.g2.bx.psu.edu/view/peterjc/align_back_trans
Peter
On Tue, Feb 18, 2014 at 2:10 PM, Peter Cock p.j.a.cock@googlemail.com wrote:
galaxy-dev@lists.galaxyproject.org