I'm writing a one-step generic reporting tool along the lines of the "BLAST XML to tabular" script by Peter. I was just about to ask about this line, which looked pretty much like a bug:

    sallseqid = ";".join(name.split(None,1)[0] for name in hit_def.split(" >"))

Then I found the patch from Nov 7th 2013:

https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/bl...

    try:
        sallseqid = ";".join(name.split(None,1)[0] for name in hit_def.split(" >"))
    except IndexError as e:
        stop_err("Problem splitting multuple hits?\n%r\n--> %s" % (hit_def, e))

Yay! But what I've seen in recent XML output reports is that the ">" content has been changed to "&gt;". E.g.

https://github.com/biopython/biopython/blob/master/Tests/Blast/mirna.xml

    <Hit>
      <Hit_num>66</Hit_num>
      <Hit_id>gi|195029385|ref|XR_047134.1|</Hit_id>
      <Hit_def>Drosophila grimshawi miR-7-RA (Dgri\mir-7), ncRNA &gt;gi|195336156|ref|XR_048470.1| Drosophila sechellia miR-7-RA (Dsec\mir-7), ncRNA &gt;gi|195585143|ref|XR_050309.1| Drosophila simulans miR-7-RA (Dsim\mir-7), ncRNA</Hit_def>
      <Hit_accession>XR_047134</Hit_accession>
      ...

So perhaps a stop_err() could be avoided if the test is for "&gt;" instead? I assume that no variant of Python's ElementTree.iterparse() will unescape content when it is returned via the iterator?

Damion
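
One quick way to check the unescaping assumption is to feed a small escaped fragment through iterparse() and look at what comes back. This is only a minimal sketch using the standard-library xml.etree.ElementTree; SAMPLE_XML is a hypothetical fragment shortened from the Hit_def quoted above, and hit_defs() is a made-up helper name, not part of the galaxy_blast script. As far as I can tell, an XML parser resolves the predefined "&gt;" entity before handing back the text, so the string seen in Python should contain a literal ">" and the existing split on " >" should still apply.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample, shortened from the Hit_def quoted above.
# The separator between the two hits is written escaped, as in the XML file.
SAMPLE_XML = (
    b"<Hit>"
    b"<Hit_def>Drosophila grimshawi miR-7-RA (Dgri\\mir-7), ncRNA "
    b"&gt;gi|195336156|ref|XR_048470.1| "
    b"Drosophila sechellia miR-7-RA (Dsec\\mir-7), ncRNA"
    b"</Hit_def>"
    b"</Hit>"
)


def hit_defs(xml_bytes):
    """Return the text of every Hit_def element as seen via iterparse()."""
    texts = []
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag == "Hit_def":
            texts.append(elem.text)
    return texts


hit_def = hit_defs(SAMPLE_XML)[0]

# Is the escaped form still present in the parsed text?
print("&gt;" in hit_def)

# Split the way the tabular script does and show the pieces.
print(hit_def.split(" >"))
```

If the first print shows False, the entity was already resolved by the parser, and testing for "&gt;" in the parsed text would never match.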