Changing fasta_to_tabular_converter.py
Hi all,

I was just surprised to find what I consider to be a major bug in
fasta_to_tabular_converter.py, the implicit converter used to turn FASTA
into tabular. Consider this toy example:

    >alpha
    ACGTAC
    >beta
    AGTGTA
    >gamma with some description
    AGGTACCA

What the converter gives is two columns (title line and sequence), but the
'>' is left in:

    >alpha (tab) ACGTAC
    >beta (tab) AGTGTA
    >gamma with some description (tab) AGGTACCA

Given just two columns, what I was expecting was:

    alpha (tab) ACGTAC
    beta (tab) AGTGTA
    gamma with some description (tab) AGGTACCA

I think this is a bug. In support of this view, I note that the user-facing
tool (now in the Tool Shed) removes the '>' symbol:

https://toolshed.g2.bx.psu.edu/view/devteam/fasta_to_tabular
https://github.com/galaxyproject/tools-devteam/tree/master/tools/fasta_to_ta...

I have submitted a pull request to address this:

https://github.com/galaxyproject/galaxy/pull/11

Note that what I really wanted was three columns: the ID, the comment and
the sequence:

    alpha (tab) (empty) (tab) ACGTAC
    beta (tab) (empty) (tab) AGTGTA
    gamma (tab) with some description (tab) AGGTACCA

The user-facing tool does support this. I appreciate that changing the
built-in implicit converter to give three-column output could be a problem
for backward compatibility (has anyone written a workflow that relies on
the '>' version of the implicit conversion?), so I can make this conversion
explicit in my workflow instead.

Regards,

Peter
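P.S. To make the two layouts concrete, here is a minimal Python sketch of the conversion described above. It is not the code in fasta_to_tabular_converter.py nor the code in the pull request; the function names fasta_to_tabular and _format_row and the split_description flag are hypothetical, chosen only for illustration. It assumes the ID is everything up to the first whitespace on the title line and the rest is the comment.

```python
def _format_row(title, seq_parts, split_description):
    """Build one tab-separated row from a title line and its sequence lines."""
    sequence = "".join(seq_parts)
    if split_description:
        # Assumption: the ID ends at the first run of whitespace.
        parts = title.split(None, 1)
        identifier = parts[0] if parts else ""
        description = parts[1] if len(parts) > 1 else ""
        return "\t".join([identifier, description, sequence])
    return "\t".join([title, sequence])


def fasta_to_tabular(lines, split_description=False):
    """Yield one row per FASTA record, with the leading '>' removed.

    Hypothetical helper: two columns by default (title, sequence),
    three columns (ID, comment, sequence) if split_description is True.
    """
    title = None
    seq_parts = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if title is not None:
                yield _format_row(title, seq_parts, split_description)
            title = line[1:]  # drop the '>' marker
            seq_parts = []
        elif title is not None and line.strip():
            # Sequence may be wrapped over several lines; join them later.
            seq_parts.append(line.strip())
    if title is not None:
        yield _format_row(title, seq_parts, split_description)


toy = [
    ">alpha",
    "ACGTAC",
    ">beta",
    "AGTGTA",
    ">gamma with some description",
    "AGGTACCA",
]

for row in fasta_to_tabular(toy, split_description=True):
    print(row)
```

Called with the default split_description=False, the same sketch gives the two-column output I was expecting, with the title line intact apart from the '>'.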
Peter Cock