On Tue, Sep 28, 2010 at 3:43 PM, Bossers, Alex Alex.Bossers@wur.nl wrote:
Hi Peter, Nice work. We are working on some general tool wrappers as well (blat, mummer, etc.), which have the same difficulty: there is a lot to maintain if something changes... So I am also following your thread on the shared XML parts with great interest.
Nice to know this use case (BLAST) isn't a special case.
Regarding the BLAST XML to table conversion: it's already in your distribution for megablast at metag_tools/megablast_xml_parser.xml. There is also a basic wrapper for megablast.
I'd seen the megablast wrapper (currently for legacy NCBI BLAST, not BLAST+, as discussed earlier in the thread).
The metag_tools/megablast_xml_parser.xml script is close to what I had in mind, but not quite the same: I wanted to reproduce the default 12-column tabular output of the BLAST+ tools from their XML output. My thinking was that we'd have lots of tools designed to work with the default tabular output from BLAST+, so an option to go to XML if needed for some steps and recover the tabular output later would be nice. [And similarly for the ASN.1 output.]
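
To make the idea concrete, here is a minimal Python sketch of such an XML to 12-column conversion. It is only an illustration under stated assumptions, not the existing Galaxy script: the function names (hsp_to_row, blast_xml_to_rows) are made up for this example, the query ID is taken as the first word of Iteration_query-def, the subject ID is taken from Hit_id, and the e-value and bit score are copied through as they appear in the XML, so the text may not be byte-identical to what the BLAST+ tools print in tabular mode.

```python
import re
import sys
import xml.etree.ElementTree as ET

# Default BLAST+ tabular columns, in order.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def hsp_to_row(query_id, subject_id, hsp):
    """Build one 12-column row (list of strings) from an <Hsp> element."""
    qseq = hsp.findtext("Hsp_qseq")
    hseq = hsp.findtext("Hsp_hseq")
    identity = int(hsp.findtext("Hsp_identity"))
    length = int(hsp.findtext("Hsp_align-len"))
    # Mismatches: aligned columns with different letters and no gap.
    mismatch = sum(1 for q, h in zip(qseq, hseq)
                   if q != h and q != "-" and h != "-")
    # Gap openings: number of runs of '-' in either aligned sequence.
    gapopen = len(re.findall(r"-+", qseq)) + len(re.findall(r"-+", hseq))
    return [query_id,
            subject_id,
            "%.2f" % (100.0 * identity / length),
            str(length),
            str(mismatch),
            str(gapopen),
            hsp.findtext("Hsp_query-from"),
            hsp.findtext("Hsp_query-to"),
            hsp.findtext("Hsp_hit-from"),
            hsp.findtext("Hsp_hit-to"),
            hsp.findtext("Hsp_evalue"),      # copied as-is (assumption)
            hsp.findtext("Hsp_bit-score")]   # copied as-is (assumption)


def blast_xml_to_rows(source):
    """Yield rows from a BLAST XML file name or binary file object."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "Iteration":
            continue
        # Assumption: the first word of the query definition is the ID.
        query_def = elem.findtext("Iteration_query-def") or "unknown"
        query_id = query_def.split(None, 1)[0]
        for hit in elem.findall("Iteration_hits/Hit"):
            subject_id = hit.findtext("Hit_id")  # assumption, see above
            for hsp in hit.findall("Hit_hsps/Hsp"):
                yield hsp_to_row(query_id, subject_id, hsp)
        elem.clear()  # free memory once a query has been processed


if __name__ == "__main__" and len(sys.argv) == 2:
    for row in blast_xml_to_rows(sys.argv[1]):
        print("\t".join(row))
```

Saved under a hypothetical name such as blastxml_to_tabular.py, it would be run with the XML file as its only argument and would print the table to stdout. Parsing one Iteration at a time keeps memory use flat for large multi-query outputs.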
I guess here (and in general) there is scope for supporting all the tab fields which the NCBI command line tools support... I see Galaxy has some clever metadata for tracking different columns in interval data types - we'd need to do something similar for different BLAST columns. That is going to be more work of course, and not something I need immediately.
Peter