On Wed, Nov 20, 2013 at 7:10 PM, Dooley, Damion <Damion.Dooley@bccdc.ca> wrote:
I'm doing a 1 step generic reporting tool along the lines of the "BLAST XML to tabular" script by Peter. I was just about to ask about this line, which looked pretty much like a bug:
sallseqid = ";".join(name.split(None,1)[0] for name in hit_def.split(" >"))
Then I found the patch from Nov 7th 2013:
https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/bl...
try:
    sallseqid = ";".join(name.split(None,1)[0] for name in hit_def.split(" >"))
except IndexError as e:
    stop_err("Problem splitting multuple hits?\n%r\n--> %s" % (hit_def, e))
Yay! But what I've seen in recent XML output reports is that the ">" content has been changed to "&gt;". E.g.
https://github.com/biopython/biopython/blob/master/Tests/Blast/mirna.xml
<Hit>
  <Hit_num>66</Hit_num>
  <Hit_id>gi|195029385|ref|XR_047134.1|</Hit_id>
  <Hit_def>Drosophila grimshawi miR-7-RA (Dgri\mir-7), ncRNA &gt;gi|195336156|ref|XR_048470.1| Drosophila sechellia miR-7-RA (Dsec\mir-7), ncRNA &gt;gi|195585143|ref|XR_050309.1| Drosophila simulans miR-7-RA (Dsim\mir-7), ncRNA</Hit_def>
  <Hit_accession>XR_047134</Hit_accession>
  ...
So perhaps a stop_err() could be avoided if the test is for "&gt;" instead? I assume that no variant of Python's ElementTree.iterparse() will unescape content when returned via the iterator?
Damion
On Wed, Nov 20, 2013 at 7:31 PM, Dooley, Damion <Damion.Dooley@bccdc.ca> wrote:
Whoops - I realize now findtext() must be unescaping all "&gt;", so Peter was trying to address other non-splitting occurrences of " >" as per his patch notes. But perhaps a stop_err() isn't merited in this case?
So ignore my comment about testing for "&gt;".
Regards,
Damion
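For anyone following along, here is a minimal sketch of the unescaping behaviour discussed above. It assumes a trimmed, made-up <Hit> fragment rather than a full BLAST XML report, and the variable names (xml_text, hit_defs, extra_ids) are illustrative only. The XML parser resolves "&gt;" before the text is handed back, so the string seen by the script contains a plain ">":

```python
import io
import xml.etree.ElementTree as ET

# Trimmed, illustrative fragment; not a complete BLAST XML report.
xml_text = (
    "<Hit>"
    "<Hit_id>gi|195029385|ref|XR_047134.1|</Hit_id>"
    "<Hit_def>Drosophila grimshawi miR-7-RA, ncRNA "
    "&gt;gi|195336156|ref|XR_048470.1| Drosophila sechellia miR-7-RA, ncRNA</Hit_def>"
    "</Hit>"
)

hit_defs = []
# iterparse yields each element once its end tag has been read.
for event, elem in ET.iterparse(io.BytesIO(xml_text.encode("utf-8")), events=("end",)):
    if elem.tag == "Hit":
        # The parser has already turned "&gt;" into ">" by this point.
        hit_defs.append(elem.findtext("Hit_def"))

hit_def = hit_defs[0]
has_raw_entity = "&gt;" in hit_def
# Every piece after the first " >" starts with the identifier of a merged hit.
extra_ids = [name.split(None, 1)[0] for name in hit_def.split(" >")[1:]]
```

So splitting on " >" is the right test after parsing; testing for "&gt;" would never match.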
OK - good. I was worried that there might be some inconsistency between different databases or versions of BLAST about how the > was encoded.
As to why I treat this as a fatal error (calling stop_err): the alternative would be to issue a warning to stderr and guess what the data ought to look like. That just seems like asking for trouble - a big red error should ensure I hear bug reports ;)
Zen of Python: Errors should never pass silently.
Peter
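A small sketch of the failure mode the patch guards against, under some assumptions: first_words is a hypothetical helper name, and it raises ValueError as a stand-in for the script's own stop_err(). The IndexError appears when a piece left over after splitting on " >" is empty or whitespace-only, because split(None, 1) then returns an empty list:

```python
def first_words(hit_def):
    """Join the first whitespace-delimited token of each " >"-separated piece."""
    try:
        return ";".join(name.split(None, 1)[0] for name in hit_def.split(" >"))
    except IndexError as e:
        # Stand-in for stop_err(): fail loudly rather than guess at the data.
        raise ValueError("Problem splitting multiple hits?\n%r\n--> %s" % (hit_def, e))


# Well-formed input: every piece has at least one token.
ok = first_words("gi|1| first >gi|2| second")

# Malformed input: the trailing " >" leaves an empty piece behind.
try:
    first_words("some description ending in >")
    failed = False
except ValueError:
    failed = True
```

Silently skipping the empty piece would also be possible, but that is exactly the guessing the reply above argues against.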