On Mar 25, 2010, at 1:09 PM, rsharris@bx.psu.edu wrote:
Howdy, Brant,
hi bob,
I'm at a conference this week and my internet connectivity is spotty.
no problem, thanks for the fast response.
Short answer is "no", lastz doesn't provide a best-only filter. I am not sure what your input files are (one-to-one, many-to-many, one-to-many?).
it's basically many to many right now, as long my RAM holds and i don't go one-to-many.
For one-to-one you could use sort and head to solve this (at command line-- not sure specifically how you would do this in galaxy). For many-to-one, such as mapping a large number of reads to a genome, it's a little more difficult because you want to sort to occur separately per read.
i've basically been parsing the lastz output to build a dictionary like so: { read1: { score1:match1, score2:match2 }, read2: { score1:match1, } } where score1…scoreN is an integer, then: for read in matches: m = matches[read] bestMatch = m[max(m.keys())] … do stuff … this seems fast enough with a few tens of thousands of lines returned from lastz.
It isn't an overly difficult thing to do though. ... laptop battery going down. Gotta sign off.
i basically just wanted to see if i had missed anything and maybe give the old +1 for --bestmatch as a potential filter parameter in the future ; ). (given enough time, i'll poke around in the source and see what i can do with my meager C skills) thanks again for the extremely rapid response! enjoy your conference... best, b < * ) (_ \\ _ ||