PE reads don't map after trimming
Hi,

I hoped that someone could shed some light on this. I am attempting to map a set of 2x150 Illumina PE data from a DNA resequencing project. The run had an issue where the quality of the last 50 or so bases of the second read tails off quite considerably, so I thought I would trim the second read to strip the poorer-quality bases from its end. I did this using the FASTQ quality trimmer and then mapped both reads using Bowtie. When I did this, however, I went from 34% of passing-filter reads aligned without trimming (which is not good to start with) to 0.18%. The trim command was left at its defaults:

Any ideas what I could be doing wrong?

********************************************************************************************************************
This message may contain confidential information. If you are not the intended recipient please inform the sender that you have received the message in error before deleting it. Please do not disclose, copy or distribute information in this e-mail or take any action in reliance on its contents: to do so is strictly prohibited and may be unlawful.
Thank you for your co-operation.
NHSmail is the secure email and directory service available for all NHS staff in England and Scotland NHSmail is approved for exchanging patient data and other sensitive information with NHSmail and GSi recipients NHSmail provides an email address for your career in the NHS and can be accessed anywhere For more information and to find out how you can switch, visit www.connectingforhealth.nhs.uk/nhsmail
********************************************************************************************************************
Hi Chris,

It may be that your reads no longer meet the default threshold parameters for Bowtie.

To start, running a tool such as FastQC will give you some information about the length, quality distribution, and other metrics of your data. Doing this before and after trimming would likely be helpful, and perhaps guide you to the optimal trim options.

Next, open the "Bowtie settings to use: Full parameter list" (default is "Commonly used"). Comparing the values to their meaning in the documentation and to your data will likely pinpoint the problem. For example, if you have trimmed most of your data down to 25 bases, but have a "Seed length (-l):" of 28 (bases), then most of your data will not align.

Short documentation is on the tool form for convenience, for example:

-l INT  Seed length. The number of bases on the high-quality end of the read to which the -n ceiling applies. Must be at least 5. [28]
-n INT  Mismatch seed. Maximum number of mismatches permitted in the seed (defined with seed length option). Can be 0, 1, 2, or 3. [2]

Best wishes for your project,

Jen
Galaxy team

On 3/1/12 8:57 AM, Buxton Chris (NORTH BRISTOL NHS TRUST) wrote:
Hi, I hoped that someone could shed some light on this. I am attempting to map a set of 2x150 Illumina PE data from a DNA resequencing project. The run had an issue where the quality of the last 50 or so bases of the second read tails off quite considerably, so I thought I would trim the second read to strip the poorer-quality bases from its end. I did this using the FASTQ quality trimmer and then mapped both reads using Bowtie. When I did this, however, I went from 34% of passing-filter reads aligned without trimming (which is not good to start with) to 0.18%. The trim command was left at its defaults: Any ideas what I could be doing wrong?
___________________________________________________________ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
-- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/wiki/Support
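The seed-length check suggested in the reply above can be sketched in a few lines of Python. This is only an illustration, assuming plain 4-line FASTQ records; the function names (`read_lengths`, `fraction_shorter_than`) and the example reads are made up for this sketch and are not part of Galaxy, FastQC, or Bowtie. It reports what fraction of reads fall below a chosen seed length (28 is the default quoted above).

```python
from typing import Iterable, Iterator


def read_lengths(fastq_lines: Iterable[str]) -> Iterator[int]:
    """Yield the sequence length of each record in 4-line FASTQ input."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the second line of every record holds the bases
            yield len(line.strip())


def fraction_shorter_than(fastq_lines: Iterable[str], seed_length: int = 28) -> float:
    """Return the fraction of reads with fewer bases than seed_length."""
    lengths = list(read_lengths(fastq_lines))
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n < seed_length) / len(lengths)


# Made-up example: one 30-base read and one 25-base read.
example = [
    "@read1", "A" * 30, "+", "I" * 30,
    "@read2", "C" * 25, "+", "I" * 25,
]
print(fraction_shorter_than(example, seed_length=28))  # 0.5
```

An open file handle can be passed in place of the list, so the same function could be run on the trimmed and untrimmed files to compare them.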
Hi, I'm wondering if there is a script or set of command lines that I can download, corresponding to the 'Compare two datasets' tool under the 'Join, Subtract and Group' set of tools in Galaxy. I want to run this locally on the command line on many files. I've searched the tool shed but can't seem to find it. Many Thanks, Ken
Hi Ken,

This tool is a wrapper around the unix utility called "join". In the Galaxy source, it has two components, the XML definition for the tool form:
http://bitbucket.org/galaxy/galaxy-central/src/c2a2ae70c051/tools/filters/co...

And the wrapped utility itself:
http://bitbucket.org/galaxy/galaxy-central/src/c2a2ae70c051/tools/filters/jo...

But if you have access to a unix command line, then you can just use the join command directly. Before using join, you must first use another utility called sort. The steps are:

1 - sort both files on the columns that you will be joining on
2 - join the files on those columns

A Google search will find many examples for these tools. These links are particularly friendly:
http://www.computerhope.com/unix/ujoin.htm
http://www.computerhope.com/unix/usort.htm

Best wishes for your project,

Jen
Galaxy team

On 3/6/12 3:30 PM, kenlee nakasugi wrote:
Hi,
I'm wondering if there is a script(s) or set of command lines to download corresponding to the 'Compare two datasets' under 'Join, Subtract and Group' set of tools in Galaxy. I want to run this locally on the command line on many files. I've searched under toolshed but can't seem to find it.
Many Thanks, Ken
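For anyone who would rather script the comparison than chain sort and join, here is a rough Python sketch of the same idea. It is an in-memory stand-in, not the Galaxy wrapper or the join utility itself: a set lookup replaces the sort step, and it assumes tab-separated lines and that the goal is to keep the lines of the first dataset whose key column does (or does not) appear in the second. The function name `compare_datasets` and its parameters are invented for this example.

```python
from typing import Iterable, List


def compare_datasets(first: Iterable[str], second: Iterable[str],
                     first_col: int = 0, second_col: int = 0,
                     matching: bool = True, sep: str = "\t") -> List[str]:
    """Keep lines of `first` whose key is (or is not) present in `second`."""
    # Collect the key column of the second dataset; a set lookup replaces
    # the sort step that the unix join utility needs.
    keys = set()
    for line in second:
        line = line.rstrip("\n")
        if line:
            keys.add(line.split(sep)[second_col])

    kept = []
    for line in first:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        found = line.split(sep)[first_col] in keys
        if found == matching:
            kept.append(line)
    return kept


# Made-up example data, keyed on the first column.
first = ["chr1\t10\tgeneA", "chr2\t20\tgeneB", "chr3\t30\tgeneC"]
second = ["chr1\tx", "chr3\ty"]
print(compare_datasets(first, second))                  # rows for chr1 and chr3
print(compare_datasets(first, second, matching=False))  # row for chr2
```

Both arguments accept open file handles, so the function can be called in a loop over many file pairs. Column numbers here are zero-based, unlike the one-based field numbers that join and sort use.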
participants (3)
- Buxton Chris (NORTH BRISTOL NHS TRUST)
- Jennifer Jackson
- kenlee nakasugi