Hi Peter -

Regular expressions are a great, simple, and often fast solution for cleaning up sequences - glad that this is working for you. I am sure others will be interested in your tools, too, once you have them ready for the Tool Shed.

Thanks for all of your contributions to Galaxy!

Jen

On 2/9/11 7:42 AM, Peter Cock wrote:
On Wed, Feb 9, 2011 at 3:32 PM, Jennifer Jackson <jen@bx.psu.edu> wrote:
Hi Peter,
Sorry about that. I double-checked and you are right: the tool doesn't "screen" sequences. Maybe try BLAT itself to identify the primers, then clip? It depends on how long your primers are - primers shorter than 20 bp will need some tuning. Ask UCSC directly (Galt) about how to configure BLAT for this type of match: genome@ucsc.edu.
Best,
Jen
Galaxy team
Hi Jen,
Yes, my primer sequences are short - up to 22 bp, so I don't think BLAST is a good solution. I've been using regular expressions, and this seems to work nicely on my current 454 data (it would need testing on some large datasets before I was happy it could be used on, say, a full run of Illumina).
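For anyone following along, the general idea of regular-expression primer clipping can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the tools linked in this thread: the function names `primer_to_regex` and `clip_forward_primer` and the example primer are hypothetical, and it handles only exact matches (allowing IUPAC ambiguity codes) for a forward primer on plain sequence strings.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_to_regex(primer):
    """Compile a primer (possibly with ambiguity codes) into a regex.

    Hypothetical helper for illustration; raises KeyError on
    characters that are not IUPAC nucleotide codes.
    """
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def clip_forward_primer(seq, primer_re):
    """Remove everything up to and including the first primer match.

    Returns the clipped sequence (original case preserved), or None
    if the primer is not found, so the caller can filter the read out.
    """
    match = primer_re.search(seq.upper())
    if match is None:
        return None
    return seq[match.end():]


# Hypothetical primer and read, purely for demonstration.
primer_re = primer_to_regex("GTGYCAGC")
print(clip_forward_primer("ACGTGTCAGCTTTGA", primer_re))  # TTTGA
print(clip_forward_primer("AAAAAAAA", primer_re))         # None
```

Returning None for reads without the primer keeps filtering and clipping in one pass. Handling mismatches, reverse primers, or quality scores (for FASTQ and SFF) would need more than this sketch covers.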
For the work in progress, see: https://bitbucket.org/peterjc/galaxy-central/src/filter_fasta/tools/primers/
At the time of writing I have three tools, for FASTA, FASTQ and SFF files. As per my recent email, I am considering merging them into a single tool: http://lists.bx.psu.edu/pipermail/galaxy-dev/2011-February/004294.html
Peter
--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org