details:   http://www.bx.psu.edu/hg/galaxy/rev/d6d156b04767
changeset: 3473:d6d156b04767
user:      Dan Blankenberg <dan@bx.psu.edu>
date:      Wed Mar 03 14:34:31 2010 -0500
description:
Changes for some FASTQ tools. Move several options of the Groomer tool under an advanced options conditional. Update help for groomer, filter and manipulation tools.

diffstat:

 tools/fastq/fastq_filter.xml       |   10 +-
 tools/fastq/fastq_groomer.xml      |  277 ++++++++++++++++++------------------
 tools/fastq/fastq_manipulation.xml |    2 +-
 3 files changed, 142 insertions(+), 147 deletions(-)

diffs (526 lines):

diff -r bfcf6a3249c7 -r d6d156b04767 tools/fastq/fastq_filter.xml
--- a/tools/fastq/fastq_filter.xml Wed Mar 03 14:30:32 2010 -0500
+++ b/tools/fastq/fastq_filter.xml Wed Mar 03 14:34:31 2010 -0500
@@ -16,7 +16,7 @@
     <param name="paired_end" label="This is paired end data" type="boolean" truevalue="paired_end" falsevalue="single_end" checked="False"/>
     <repeat name="fastq_filters" title="Quality Filter on a Range of Bases" help="The above settings do not apply to these filters.">
       <conditional name="offset_type">
-        <param name="base_offset_type" type="select" label="Define Base Offsets as">
+        <param name="base_offset_type" type="select" label="Define Base Offsets as" help="Absolute for e.g. fixed length reads.<br>Percentage for e.g. variable length reads.">
           <option value="offsets_absolute" selected="true">Absolute Values</option>
           <option value="offsets_percent">Percentage of Read Length</option>
         </param>
@@ -289,15 +289,15 @@
 <help>
 This tool allows you to build complex filters to be applied to each read in a FASTQ file.

-Basic Options:
+**Basic Options:**
   * You can specify a minimum and maximum read lengths.
   * You can specify minimum and maximum per base quality scores, with optionally specifying the number of bases that are allowed to deviate from this range (default of 0 deviant bases).
   * If your data is paired-end, select the proper checkbox; this will cause each read to be internally split down the middle and filters applied to each half using the offsets specified.

-Advance Options:
+**Advance Options:**
   * You can specify any number of advanced filters.
-  * Offsets are defined, starting at zero, increasing from the ends of the reads. For example, a quality string of "ABCDEFG", with offsets of 1 and 1 specified will yield "BCDEF".
-  * You can specify either absolute offset values, or percentage offset values. When using the percent-based method, offsets are rounded to the nearest integer.
+  * 5' and 3' offsets are defined, starting at zero, increasing from the respective end of the reads. For example, a quality string of "ABCDEFG", with 5' and 3' offsets of 1 and 1, respectively, specified will yield "BCDEF".
+  * You can specify either absolute offset values, or percentage offset values. *Absolute Values* based offsets are useful for fixed length reads (e.g. Illumina or SOLiD data). *Percentage of Read Length* based offsets are useful for variable length reads (e.g. 454 data). When using the percent-based method, offsets are rounded to the nearest integer.
   * The user specifies the aggregating action (min, max, sum, mean) to perform on the quality score values found between the specified offsets to be used with the user defined comparison operation and comparison value.
   * If a set of offsets is specified that causes the remaining quality score list to be of length zero, then the read will **pass** the quality filter unless the size range filter is used to remove these reads.
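
The 5' and 3' offset rule described in the fastq_filter help text above can be illustrated with a short Python sketch. This is not code from the changeset: the function name `slice_by_offsets` and its parameters are hypothetical, and the way percentage offsets are rounded is an assumption (the help text says only that they are rounded to the nearest integer).

```python
def slice_by_offsets(quality, left_offset, right_offset, percent=False):
    """Return the part of `quality` that lies between the 5' and 3' offsets.

    Hypothetical helper for illustration only. Offsets start at zero and
    count inward from their respective end of the read.
    """
    length = len(quality)
    if percent:
        # Assumption: percentage offsets are scaled by the read length and
        # rounded with Python's built-in round().
        left_offset = int(round(length * left_offset / 100.0))
        right_offset = int(round(length * right_offset / 100.0))
    end = length - right_offset
    if end <= left_offset:
        # The offsets consume the whole read; per the help text such a read
        # passes the quality filter because nothing is left to evaluate.
        return quality[:0]
    return quality[left_offset:end]


# The example from the help text: offsets of 1 and 1 on "ABCDEFG".
print(slice_by_offsets("ABCDEFG", 1, 1))  # BCDEF
```

With `percent=True`, the same call shape would express offsets as a percentage of the read length, which is the case the help text recommends for variable length reads.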
diff -r bfcf6a3249c7 -r d6d156b04767 tools/fastq/fastq_groomer.xml
--- a/tools/fastq/fastq_groomer.xml Wed Mar 03 14:30:32 2010 -0500
+++ b/tools/fastq/fastq_groomer.xml Wed Mar 03 14:34:31 2010 -0500
@@ -1,6 +1,17 @@
-<tool id="fastq_groomer" name="FASTQ Groomer" version="1.0.1">
+<tool id="fastq_groomer" name="FASTQ Groomer" version="1.0.2">
   <description>convert between various FASTQ quality formats</description>
-  <command interpreter="python">fastq_groomer.py '$input_file' '$input_type' '$output_file' '$output_type' '$force_quality_encoding' '$summarize_input'</command>
+  <command interpreter="python">fastq_groomer.py '$input_file' '$input_type' '$output_file'
+#if str( $options_type['options_type_selector'] ) == 'basic':
+#if str( $input_type ) == 'cssanger':
+'cssanger'
+#else:
+'sanger'
+#end if
+'ascii' 'summarize_input'
+#else:
+'${options_type.output_type}' '${options_type.force_quality_encoding}' '${options_type.summarize_input}'
+#end if
+</command>
   <inputs>
     <param name="input_file" type="data" format="fastq" label="File to groom" />
     <param name="input_type" type="select" label="Input FASTQ quality scores type">
@@ -9,63 +20,110 @@
       <option value="sanger" selected="True">Sanger</option>
       <option value="cssanger">Color Space Sanger</option>
     </param>
-    <param name="output_type" type="select" label="Output FASTQ quality scores type">
-      <option value="solexa">Solexa</option>
-      <option value="illumina">Illumina 1.3+</option>
-      <option value="sanger" selected="True">Sanger (recommended)</option>
-      <option value="cssanger">Color Space Sanger</option>
+    <conditional name="options_type">
+    <param name="options_type_selector" type="select" label="Advanced Options">
+      <option value="basic" selected="True">Hide Advanced Options</option>
+      <option value="advanced">Show Advanced Options</option>
     </param>
-    <param name="force_quality_encoding" type="select" label="Force Quality Score encoding">
-      <option value="None">Use Source Encoding</option>
-      <option value="ascii"
selected="True">ASCII</option>
-      <option value="decimal">Decimal</option>
-    </param>
-    <param name="summarize_input" type="select" label="Summarize input data">
-      <option value="summarize_input" selected="True">Summarize Input</option>
-      <option value="dont_summarize_input">Do not Summarize Input (faster)</option>
-    </param>
+    <when value="basic">
+      <!-- no options -->
+    </when>
+    <when value="advanced">
+      <param name="output_type" type="select" label="Output FASTQ quality scores type" help="Galaxy tools are designed to work with the Sanger Quality score format.">
+        <option value="solexa">Solexa</option>
+        <option value="illumina">Illumina 1.3+</option>
+        <option value="sanger" selected="True">Sanger (recommended)</option>
+        <option value="cssanger">Color Space Sanger</option>
+      </param>
+      <param name="force_quality_encoding" type="select" label="Force Quality Score encoding">
+        <option value="None">Use Source Encoding</option>
+        <option value="ascii" selected="True">ASCII</option>
+        <option value="decimal">Decimal</option>
+      </param>
+      <param name="summarize_input" type="select" label="Summarize input data">
+        <option value="summarize_input" selected="True">Summarize Input</option>
+        <option value="dont_summarize_input">Do not Summarize Input (faster)</option>
+      </param>
+    </when>
+  </conditional>
   </inputs>
   <outputs>
-    <data name="output_file" format="fastq">
+    <data name="output_file" format="fastqsanger">
       <change_format>
-        <when input="output_type" value="solexa" format="fastqsolexa" />
-        <when input="output_type" value="illumina" format="fastqillumina" />
-        <when input="output_type" value="sanger" format="fastqsanger" />
-        <when input="output_type" value="cssanger" format="fastqcssanger" />
+        <when input="input_type" value="cssanger" format="fastqcssanger" />
+        <when input="options_type.output_type" value="solexa" format="fastqsolexa" />
+        <when input="options_type.output_type" value="illumina" format="fastqillumina" />
+        <when input="options_type.output_type"
value="sanger" format="fastqsanger" />
+        <when input="options_type.output_type" value="cssanger" format="fastqcssanger" />
       </change_format>
     </data>
   </outputs>
   <tests>
     <!-- These tests include test files adapted from supplemental material in Cock PJ, Fields CJ, Goto N, Heuer ML, Rice PM. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Res. 2009 Dec 16. -->
     <!-- Unfortunately, cannot test for expected failures -->
+    <!-- Test basic options -->
+    <test>
+      <param name="input_file" value="sanger_full_range_original_sanger.fastqsanger" ftype="fastq" />
+      <param name="input_type" value="sanger" />
+      <param name="options_type_selector" value="basic" />
+      <output name="output_file" file="sanger_full_range_original_sanger.fastqsanger" />
+    </test>
+    <test>
+      <param name="input_file" value="sanger_full_range_as_cssanger.fastqcssanger" ftype="fastq" />
+      <param name="input_type" value="cssanger" />
+      <param name="options_type_selector" value="basic" />
+      <output name="output_file" file="sanger_full_range_as_cssanger.fastqcssanger" />
+    </test>
+    <test>
+      <param name="input_file" value="illumina_full_range_original_illumina.fastqillumina" ftype="fastq" />
+      <param name="input_type" value="illumina" />
+      <param name="options_type_selector" value="basic" />
+      <output name="output_file" file="illumina_full_range_as_sanger.fastqsanger" />
+    </test>
+    <test>
+      <param name="input_file" value="solexa_full_range_original_solexa.fastqsolexa" ftype="fastq" />
+      <param name="input_type" value="solexa" />
+      <param name="options_type_selector" value="basic" />
+      <output name="output_file" file="solexa_full_range_as_sanger.fastqsanger" />
+    </test>
+    <test>
+      <param name="input_file" value="sanger_full_range_as_illumina.fastqillumina" ftype="fastq" />
+      <param name="input_type" value="sanger" />
+      <param name="options_type_selector" value="basic" />
+      <output name="output_file"
file="sanger_full_range_as_illumina.fastqillumina" />
+    </test>
     <!-- Test grooming from illumina -->
     <test>
-      <param name="input_file" value="illumina_full_range_original_illumina.fastqillumina" ftype="fastqillumina" />
+      <param name="input_file" value="illumina_full_range_original_illumina.fastqillumina" ftype="fastq" />
       <param name="input_type" value="illumina" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="illumina" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
       <output name="output_file" file="illumina_full_range_original_illumina.fastqillumina" />
     </test>
     <test>
-      <param name="input_file" value="illumina_full_range_original_illumina.fastqillumina" ftype="fastqillumina" />
+      <param name="input_file" value="illumina_full_range_original_illumina.fastqillumina" ftype="fastq" />
       <param name="input_type" value="illumina" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="sanger" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
       <output name="output_file" file="illumina_full_range_as_sanger.fastqsanger" />
     </test>
     <test>
-      <param name="input_file" value="illumina_full_range_original_illumina.fastqillumina" ftype="fastqillumina" />
+      <param name="input_file" value="illumina_full_range_original_illumina.fastqillumina" ftype="fastq" />
       <param name="input_type" value="illumina" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="solexa" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
       <output name="output_file" file="illumina_full_range_as_solexa.fastqsolexa" />
     </test>
     <test>
-      <param name="input_file" value="illumina_full_range_original_illumina.fastqillumina" ftype="fastqillumina" />
+      <param name="input_file"
value="illumina_full_range_original_illumina.fastqillumina" ftype="fastq" />
       <param name="input_type" value="illumina" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="cssanger" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
@@ -73,32 +131,36 @@
     </test>
     <!-- Test grooming from sanger -->
     <test>
-      <param name="input_file" value="sanger_full_range_original_sanger.fastqsanger" ftype="fastqsanger" />
+      <param name="input_file" value="sanger_full_range_original_sanger.fastqsanger" ftype="fastq" />
       <param name="input_type" value="sanger" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="sanger" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
       <output name="output_file" file="sanger_full_range_original_sanger.fastqsanger" />
     </test>
     <test>
-      <param name="input_file" value="sanger_full_range_original_sanger.fastqsanger" ftype="fastqsanger" />
+      <param name="input_file" value="sanger_full_range_original_sanger.fastqsanger" ftype="fastq" />
       <param name="input_type" value="sanger" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="illumina" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
       <output name="output_file" file="sanger_full_range_as_illumina.fastqillumina" />
     </test>
     <test>
-      <param name="input_file" value="sanger_full_range_original_sanger.fastqsanger" ftype="fastqsanger" />
+      <param name="input_file" value="sanger_full_range_original_sanger.fastqsanger" ftype="fastq" />
       <param name="input_type" value="sanger" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="solexa" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
       <output
name="output_file" file="sanger_full_range_as_solexa.fastqsolexa" />
     </test>
     <test>
-      <param name="input_file" value="sanger_full_range_original_sanger.fastqsanger" ftype="fastqsanger" />
+      <param name="input_file" value="sanger_full_range_original_sanger.fastqsanger" ftype="fastq" />
       <param name="input_type" value="sanger" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="cssanger" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
@@ -106,32 +168,36 @@
     </test>
     <!-- Test grooming from solexa -->
     <test>
-      <param name="input_file" value="solexa_full_range_original_solexa.fastqsolexa" ftype="fastqsolexa" />
+      <param name="input_file" value="solexa_full_range_original_solexa.fastqsolexa" ftype="fastq" />
       <param name="input_type" value="solexa" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="solexa" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
       <output name="output_file" file="solexa_full_range_original_solexa.fastqsolexa" />
     </test>
     <test>
-      <param name="input_file" value="solexa_full_range_original_solexa.fastqsolexa" ftype="fastqsolexa" />
+      <param name="input_file" value="solexa_full_range_original_solexa.fastqsolexa" ftype="fastq" />
       <param name="input_type" value="solexa" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="illumina" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
       <output name="output_file" file="solexa_full_range_as_illumina.fastqillumina" />
     </test>
     <test>
-      <param name="input_file" value="solexa_full_range_original_solexa.fastqsolexa" ftype="fastqsolexa" />
+      <param name="input_file" value="solexa_full_range_original_solexa.fastqsolexa" ftype="fastq" />
       <param name="input_type" value="solexa" />
+      <param
name="options_type_selector" value="advanced" />
       <param name="output_type" value="sanger" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
       <output name="output_file" file="solexa_full_range_as_sanger.fastqsanger" />
     </test>
     <test>
-      <param name="input_file" value="solexa_full_range_original_solexa.fastqsolexa" ftype="fastqsolexa" />
+      <param name="input_file" value="solexa_full_range_original_solexa.fastqsolexa" ftype="fastq" />
       <param name="input_type" value="solexa" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="cssanger" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
@@ -139,32 +205,36 @@
     </test>
     <!-- Test grooming from cssanger -->
     <test>
-      <param name="input_file" value="sanger_full_range_as_cssanger.fastqcssanger" ftype="fastqcssanger" />
+      <param name="input_file" value="sanger_full_range_as_cssanger.fastqcssanger" ftype="fastq" />
       <param name="input_type" value="cssanger" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="cssanger" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
       <output name="output_file" file="sanger_full_range_as_cssanger.fastqcssanger" />
     </test>
     <test>
-      <param name="input_file" value="sanger_full_range_as_cssanger.fastqcssanger" ftype="fastqcssanger" />
+      <param name="input_file" value="sanger_full_range_as_cssanger.fastqcssanger" ftype="fastq" />
       <param name="input_type" value="cssanger" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="sanger" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
       <output name="output_file" file="sanger_full_range_original_sanger.fastqsanger" />
     </test>
     <test>
-      <param name="input_file"
value="sanger_full_range_as_cssanger.fastqcssanger" ftype="fastqcssanger" />
+      <param name="input_file" value="sanger_full_range_as_cssanger.fastqcssanger" ftype="fastq" />
       <param name="input_type" value="cssanger" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="illumina" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
       <output name="output_file" file="sanger_full_range_as_illumina.fastqillumina" />
     </test>
     <test>
-      <param name="input_file" value="sanger_full_range_as_cssanger.fastqcssanger" ftype="fastqcssanger" />
+      <param name="input_file" value="sanger_full_range_as_cssanger.fastqcssanger" ftype="fastq" />
       <param name="input_type" value="cssanger" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="solexa" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
@@ -172,24 +242,27 @@
     </test>
     <!-- Test fastq with line wrapping -->
     <test>
-      <param name="input_file" value="wrapping_original_sanger.fastqsanger" ftype="fastqsanger" />
+      <param name="input_file" value="wrapping_original_sanger.fastqsanger" ftype="fastq" />
       <param name="input_type" value="sanger" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="sanger" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
       <output name="output_file" file="wrapping_as_sanger.fastqsanger" />
     </test>
     <test>
-      <param name="input_file" value="wrapping_original_sanger.fastqsanger" ftype="fastqsanger" />
+      <param name="input_file" value="wrapping_original_sanger.fastqsanger" ftype="fastq" />
       <param name="input_type" value="sanger" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="illumina" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input"
value="summarize_input" />
       <output name="output_file" file="wrapping_as_illumina.fastqillumina" />
     </test>
     <test>
-      <param name="input_file" value="wrapping_original_sanger.fastqsanger" ftype="fastqsanger" />
+      <param name="input_file" value="wrapping_original_sanger.fastqsanger" ftype="fastq" />
       <param name="input_type" value="sanger" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="solexa" />
       <param name="force_quality_encoding" value="None" />
       <param name="summarize_input" value="summarize_input" />
@@ -198,16 +271,18 @@
     <!-- Test forcing quality score encoding -->
     <!-- Sanger, range 0 - 93 -->
     <test>
-      <param name="input_file" value="sanger_full_range_as_decimal_sanger.fastqsanger" ftype="fastqsanger" />
+      <param name="input_file" value="sanger_full_range_as_decimal_sanger.fastqsanger" ftype="fastq" />
       <param name="input_type" value="sanger" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="sanger" />
       <param name="force_quality_encoding" value="ascii" />
       <param name="summarize_input" value="summarize_input" />
       <output name="output_file" file="sanger_full_range_original_sanger.fastqsanger" />
     </test>
     <test>
-      <param name="input_file" value="sanger_full_range_original_sanger.fastqsanger" ftype="fastqsanger" />
+      <param name="input_file" value="sanger_full_range_original_sanger.fastqsanger" ftype="fastq" />
       <param name="input_type" value="sanger" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="sanger" />
       <param name="force_quality_encoding" value="decimal" />
       <param name="summarize_input" value="summarize_input" />
@@ -215,16 +290,18 @@
     </test>
     <!-- Solexa, range -5 - 62 -->
     <test>
-      <param name="input_file" value="solexa_full_range_as_decimal_solexa.fastqsolexa" ftype="fastqsolexa" />
+      <param name="input_file" value="solexa_full_range_as_decimal_solexa.fastqsolexa" ftype="fastq" />
       <param name="input_type" value="solexa" />
+
<param name="options_type_selector" value="advanced" />
       <param name="output_type" value="solexa" />
       <param name="force_quality_encoding" value="ascii" />
       <param name="summarize_input" value="summarize_input" />
       <output name="output_file" file="solexa_full_range_original_solexa.fastqsolexa" />
     </test>
     <test>
-      <param name="input_file" value="solexa_full_range_original_solexa.fastqsolexa" ftype="fastqsolexa" />
+      <param name="input_file" value="solexa_full_range_original_solexa.fastqsolexa" ftype="fastq" />
       <param name="input_type" value="solexa" />
+      <param name="options_type_selector" value="advanced" />
       <param name="output_type" value="solexa" />
       <param name="force_quality_encoding" value="decimal" />
       <param name="summarize_input" value="summarize_input" />
@@ -236,6 +313,8 @@

 This tool offers several conversions options relating to the FASTQ format.

+When using *Basic* options, the output will be *sanger* formatted or *cssanger* formatted (when the input is Color Space Sanger).
+
 When converting, if a quality score falls outside of the target score range, it will be coerced to the closest available value (i.e. the minimum or maximum).

 When converting between Solexa and the other formats, quality scores are mapped between Solexa and PHRED scales using the equations found in Cock PJ, Fields CJ, Goto N, Heuer ML, Rice PM. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Res. 2009 Dec 16.
@@ -244,106 +323,22 @@

 -----

-**Examples**
+**Quality Score Comparison**

-1.
Converting the Solexa FASTQ data::
+::

-    @Solexa scores from -5 to 62 inclusive (in that order)
-    ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT
-    +
-    ;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
-    @Solexa scores from 62 to -5 inclusive (in that order)
-    TGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA
-    +
-    ~}|{zyxwvutsrqponmlkjihgfedcba`_^]\[ZYXWVUTSRQPONMLKJIHGFEDCBA@?>=<;
+    SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
+    ...............................IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
+    ..........................XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
+    !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
+    |                         |    |        |                              |                     |
+   33                        59   64       73                            104                   126
+
+     S - Sanger       Phred+33,  93 values  (0, 93) (0 to 60 expected in raw reads)
+     I - Illumina 1.3 Phred+64,  62 values  (0, 62) (0 to 40 expected in raw reads)
+     X - Solexa       Solexa+64, 67 values (-5, 62) (-5 to 40 expected in raw reads)

-- will produce the following Sanger FASTQ data::
-
-    @Solexa scores from -5 to 62 inclusive (in that order)
-    ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT
-    +
-    ""##$$%%&&'()*++,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_
-    @Solexa scores from 62 to -5 inclusive (in that order)
-    TGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA
-    +
-    _^]\[ZYXWVUTSRQPONMLKJIHGFEDCBA@?>=<;:9876543210/.-,++*)('&&%%$$##""
-
-- will produce the following Illumina 1.3+ FASTQ data::
-
-    @Solexa scores from -5 to 62 inclusive (in that order)
-    ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT
-    +
-    AABBCCDDEEFGHIJJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
-    @Solexa scores from 62 to -5 inclusive (in that order)
-    TGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA
-    +
-
~}|{zyxwvutsrqponmlkjihgfedcba`_^]\[ZYXWVUTSRQPONMLKJJIHGFEEDDCCBBAA
-
-2. Converting the Illumina 1.3+ FASTQ data::
-
-    @Illumina PHRED scores from 0 to 62 inclusive (in that order)
-    ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACG
-    +
-    @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
-    @Illumina PHRED scores from 62 to 0 inclusive (in that order)
-    GCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA
-    +
-    ~}|{zyxwvutsrqponmlkjihgfedcba`_^]\[ZYXWVUTSRQPONMLKJIHGFEDCBA@
-
-- will produce the following Sanger FASTQ data::
-
-    @Illumina PHRED scores from 0 to 62 inclusive (in that order)
-    ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACG
-    +
-    !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_
-    @Illumina PHRED scores from 62 to 0 inclusive (in that order)
-    GCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA
-    +
-    _^]\[ZYXWVUTSRQPONMLKJIHGFEDCBA@?>=<;:9876543210/.-,+*)('&%$#"!
-
-- will produce the following Solexa FASTQ data::
-
-    @Illumina PHRED scores from 0 to 62 inclusive (in that order)
-    ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACG
-    +
-    ;;>@BCEFGHJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
-    @Illumina PHRED scores from 62 to 0 inclusive (in that order)
-    GCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA
-    +
-    ~}|{zyxwvutsrqponmlkjihgfedcba`_^]\[ZYXWVUTSRQPONMLKJHGFECB@>;;
-
-3. Converting standard Sanger FASTQ::
-
-    @Sanger PHRED scores from 0 to 93 inclusive (in that order)
-    ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTAC
-    +
-    !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
-    @Sanger PHRED scores from 93 to 0 inclusive (in that order)
-    CATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA
-    +
-    ~}|{zyxwvutsrqponmlkjihgfedcba`_^]\[ZYXWVUTSRQPONMLKJIHGFEDCBA@?>=<;:9876543210/.-,+*)('&%$#"!
-
-- will produce the following Solexa FASTQ data::
-
-    @Sanger PHRED scores from 0 to 93 inclusive (in that order)
-    ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTAC
-    +
-    ;;>@BCEFGHJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-    @Sanger PHRED scores from 93 to 0 inclusive (in that order)
-    CATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA
-    +
-    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~}|{zyxwvutsrqponmlkjihgfedcba`_^]\[ZYXWVUTSRQPONMLKJHGFECB@>;;
-
-- will produce the following Illumina 1.3+ FASTQ data::
-
-    @Sanger PHRED scores from 0 to 93 inclusive (in that order)
-    ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTAC
-    +
-    @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-    @Sanger PHRED scores from 93 to 0 inclusive (in that order)
-    CATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA
-    +
-    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~}|{zyxwvutsrqponmlkjihgfedcba`_^]\[ZYXWVUTSRQPONMLKJIHGFEDCBA@
+Diagram adapted from http://en.wikipedia.org/wiki/FASTQ_format

 </help>
 </tool>
diff -r bfcf6a3249c7 -r d6d156b04767 tools/fastq/fastq_manipulation.xml
--- a/tools/fastq/fastq_manipulation.xml Wed Mar 03 14:30:32 2010 -0500
+++ b/tools/fastq/fastq_manipulation.xml Wed Mar 03 14:34:31 2010 -0500
@@ -89,7 +89,7 @@
       </when>
       <when value="trim">
         <conditional name="offset_type">
-          <param name="base_offset_type" type="select" label="Define Base Offsets as">
+          <param name="base_offset_type" type="select" label="Define Base Offsets as" help="Absolute for e.g. fixed length reads.<br>Percentage for e.g. variable length reads.">
             <option value="offsets_absolute" selected="true">Absolute Values</option>
             <option value="offsets_percent">Percentage of Read Length</option>
           </param>
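
For readers who want to check the groomer help text by hand, the conversion it describes (offsets and ranges from the Quality Score Comparison table, coercion to the closest available value, and the Solexa/PHRED equations from Cock et al.) can be sketched in Python as below. This is an illustration only and is not taken from fastq_groomer.py: the names `FORMATS`, `solexa_to_phred`, `phred_to_solexa` and `convert_quality` are hypothetical, rounding with `round()` is an assumption, and mapping PHRED 0 to Solexa -5 is an assumption made because the equation is undefined at that point.

```python
import math

# (ASCII offset, minimum score, maximum score, scale), as listed in the
# Quality Score Comparison table of the groomer help.
FORMATS = {
    "sanger": (33, 0, 93, "phred"),
    "illumina": (64, 0, 62, "phred"),
    "solexa": (64, -5, 62, "solexa"),
}


def solexa_to_phred(score):
    """Q_phred = 10 * log10(10 ** (Q_solexa / 10) + 1), per Cock et al."""
    return 10 * math.log10(10 ** (score / 10.0) + 1)


def phred_to_solexa(score):
    """Q_solexa = 10 * log10(10 ** (Q_phred / 10) - 1), per Cock et al."""
    if score <= 0:
        # Assumption: the logarithm is undefined here, so use the lowest
        # Solexa score instead.
        return -5.0
    return 10 * math.log10(10 ** (score / 10.0) - 1)


def convert_quality(quality, source, target):
    """Convert an ASCII-encoded quality string between two formats."""
    src_offset, _, _, src_scale = FORMATS[source]
    dst_offset, dst_min, dst_max, dst_scale = FORMATS[target]
    converted = []
    for char in quality:
        score = ord(char) - src_offset
        if src_scale == "solexa" and dst_scale == "phred":
            score = solexa_to_phred(score)
        elif src_scale == "phred" and dst_scale == "solexa":
            score = phred_to_solexa(score)
        score = int(round(score))
        # Coerce out-of-range scores to the closest available value.
        score = max(dst_min, min(dst_max, score))
        converted.append(chr(score + dst_offset))
    return "".join(converted)


# Solexa scores -5..0 converted to Sanger, as in the first example that
# this changeset removes from the help.
print(convert_quality(";<=>?@", "solexa", "sanger"))  # ""##$$
```

The printed result agrees with the start of the Solexa-to-Sanger example removed from the help in this changeset. Converting a Sanger score of 93 to Illumina 1.3+ with the same function yields the maximum Illumina character, which is the coercion behaviour the help text describes.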