question about lastz parameters
Hi,

I've been using lastz to map some 454 reads to a reference sequence on galaxy. I'm not clear on what the mapping modes do for the analysis. If I am already setting a cut-off value for the percent identity for reporting matches, then what exactly does the mode setting do? In the tutorial you set the roche-454 98% mode for a re-sequencing example and then set the identity to 90%. What parameters in lastz does the mapping mode change?

thanks, H

--
Heather Kent
Bioinformatics Core Facility
Public Health Agency of Canada
National Microbiology Laboratories
(204)784-7503

Quote of the week: Most people are other people. Their thoughts are someone else's opinions, their lives a mimicry, their passions a quotation. Oscar Wilde (1854 - 1900), De Profundis, 1905
Howdy, Heather,

(speaking for lastz, not for galaxy)

The mapping mode in galaxy ( http://tinyurl.com/galaxy-mode ) corresponds to one of lastz's yasra shortcuts (yasra is a short read assembler from our lab). These shortcuts set up several seeding and scoring parameters, and were tuned on read data for which the expected/observed identity was at the indicated level. One reason for doing this is to make the sensitivity fit the data, which greatly improves speed when identity is high by eliminating many false positives at early stages in the alignment process.

The identity setting in lastz is a post-processing filter ( http://tinyurl.com/galaxy-filter ). After alignments have been discovered, the last step before writing them to output is to pass them through several filters (identity, coverage, etc.). The yasra shortcuts *do* include an identity filter setting, but I believe the galaxy setting is overriding that. Thus, you can set the mode to 98%, configuring scoring to be very specific, yet still set identity to 90%, allowing alignments lower than 98% to be reported.

The yasra shortcuts are probably too stringent in terms of the identity filter. If mean identity is, say, 98%, we should expect some meaningful alignments below the mean; typically a histogram of identity is similar to a normal distribution (as a crude approximation). This is something we have discussed changing locally, and it is possible that the specific yasra settings will change in the future. How to do so in a way that doesn't spoil existing pipelines is an issue.

Bob H

On Aug 4, 2010, at 5:25 PM, Heather Kent wrote:
Hi, I've been using lastz to map some 454 reads to a reference sequence on galaxy. I'm not clear on what the mapping modes do for the analysis. If I am already setting a cut-off value for the percent identity for reporting matches, then what exactly does the mode setting do? In the tutorial you set the roche-454 98% mode for a re-sequencing example and then set the identity to 90%. What parameters in lastz does the mapping mode change?
_______________________________________________
galaxy-user mailing list
galaxy-user@lists.bx.psu.edu
http://lists.bx.psu.edu/listinfo/galaxy-user
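The point in the reply above about the identity cutoff can be illustrated with a small sketch. It does not call lastz and does not reproduce any lastz or yasra settings. It only models the two ideas described: an identity cutoff applied as a post-processing filter to alignments that were already found, and a crude normal model of per-alignment identity. The mean of 98%, the standard deviation of 2 points, the alignment records, and the function names are all hypothetical values chosen for illustration.

```python
from statistics import NormalDist


def fraction_reported(mean_identity, sd, min_identity):
    """Fraction of alignments at or above the cutoff, under a crude
    normal model of per-alignment identity (all values in percent)."""
    return 1.0 - NormalDist(mu=mean_identity, sigma=sd).cdf(min_identity)


def identity_filter(alignments, min_identity):
    """Post-processing step: keep only alignments that meet the cutoff."""
    return [a for a in alignments if a["identity"] >= min_identity]


# Hypothetical model parameters (assumptions, not lastz defaults).
MEAN, SD = 98.0, 2.0
for cutoff in (98.0, 90.0):
    share = fraction_reported(MEAN, SD, cutoff)
    print(f"cutoff {cutoff:.0f}%: fraction reported = {share:.4f}")

# Hypothetical alignments that the aligner has already discovered.
found = [
    {"read": "r1", "identity": 99.1},
    {"read": "r2", "identity": 97.4},
    {"read": "r3", "identity": 92.0},
    {"read": "r4", "identity": 88.5},
]
kept_98 = [a["read"] for a in identity_filter(found, 98.0)]
kept_90 = [a["read"] for a in identity_filter(found, 90.0)]
print("kept at 98%:", kept_98)
print("kept at 90%:", kept_90)
```

Under this model, a cutoff placed exactly at the mean discards about half of the alignments, which is the sense in which a filter set at the nominal identity may be too stringent. The sketch covers only the filtering stage; which alignments are discovered in the first place still depends on the seeding and scoring parameters that the mapping mode sets.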