commit/galaxy-central: greg: Eliminate the non-functional PacBio tools from the distribution and the sample tool config.
1 new commit in galaxy-central: https://bitbucket.org/galaxy/galaxy-central/changeset/d9408bcb3cf4/ changeset: d9408bcb3cf4 user: greg date: 2011-11-10 18:09:45 summary: Eliminate the non-functional PacBio tools from the distribution and the sample tool config. affected #: 14 files diff -r 3565ae97160f4a70b3910b54bcd98c0f4c7d1796 -r d9408bcb3cf4bcb81e9dac4ddeda74cbc164e123 tool_conf.xml.sample --- a/tool_conf.xml.sample +++ b/tool_conf.xml.sample @@ -477,17 +477,6 @@ <tool file="vcf_tools/filter.xml" /><tool file="vcf_tools/extract.xml" /></section> - <section name="PacBio/Illumina Assembly" id="hybrid"> - <tool file="ilmn_pacbio/quake.xml"/> - <tool file="ilmn_pacbio/quake_pe.xml"/> - <tool file="ilmn_pacbio/soap_denovo.xml"/> - <!-- - Uncomment this tool when we support the HDF5 format - <tool file="ilmn_pacbio/smrtpipe_filter.xml"/> - --> - <tool file="ilmn_pacbio/smrtpipe_hybrid.xml"/> - <tool file="ilmn_pacbio/assembly_stats.xml"/> - </section><!-- TODO: uncomment the following EMBOSS section whenever moving to test, but comment it in .sample to eliminate diff -r 3565ae97160f4a70b3910b54bcd98c0f4c7d1796 -r d9408bcb3cf4bcb81e9dac4ddeda74cbc164e123 tools/ilmn_pacbio/abyss.xml --- a/tools/ilmn_pacbio/abyss.xml +++ /dev/null @@ -1,30 +0,0 @@ -<tool id="abyss" name="ABySS" version="1.0.0"> - <description>Short-read de Bruijn assembly</description> - <command interpreter="python"> - quake_wrapper.py -k $k -r $input1 -p 8 > $output1 - </command> - <inputs> - <param name="input1" format="fastq" type="data" label="Select FASTQ file to correct" /> - <param name="k" type="integer" value="16" label="Size of k-mers to correct" /> - </inputs> - <outputs> - <data format="fastq" name="output1" label="Error-corrected reads from ${on_string}" /> - </outputs> - <help> - -**What it does** - -TBD. Calls ABySS assembler - -**Parameter list** - -k - -**Output** - -Corrected reads - - </help> -</tool> - - diff -r 3565ae97160f4a70b3910b54bcd98c0f4c7d1796 -r d9408bcb3cf4bcb81e9dac4ddeda74cbc164e123 tools/ilmn_pacbio/assembly_stats.py --- a/tools/ilmn_pacbio/assembly_stats.py +++ /dev/null @@ -1,83 +0,0 @@ -#!/usr/bin/env python -# -#Copyright (c) 2011, Pacific Biosciences of California, Inc. -# -#All rights reserved. -# -#Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: -# * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. -# * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. -# * Neither the name of Pacific Biosciences nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. -# -#THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED -#WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL PACIFIC BIOSCIENCES OR ITS CONTRIBUTORS BE LIABLE FOR ANY -#DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; -#LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT -#(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -# -import sys, os -from optparse import OptionParser -from galaxy import eggs -import pkg_resources -pkg_resources.require( 'bx-python' ) -from bx.seq.fasta import FastaReader - -def getStats( fastaFile, genomeLength, minContigLength ): - lengths = [] - stats = { "Num" : 0, - "Sum" : 0, - "Max" : 0, - "Avg" : 0, - "N50" : 0, - "99%" : 0 } - fasta_reader = FastaReader( open( fastaFile, 'rb' ) ) - while True: - seq = fasta_reader.next() - if not seq: - break - if seq.length < minContigLength: - continue - lengths.append( seq.length ) - if lengths: - stats[ 'Num' ] = len( lengths ) - stats[ 'Sum' ] = sum( lengths ) - stats[ 'Max' ] = max( lengths ) - stats[ 'Avg' ] = int( sum( lengths ) / float( len( lengths ) ) ) - stats[ 'N50' ] = 0 - stats[ '99%' ] = 0 - if genomeLength == 0: - genomeLength = sum( lengths ) - lengths.sort() - lengths.reverse() - lenSum = 0 - stats[ "99%" ] = len( lengths ) - for idx, length in enumerate( lengths ): - lenSum += length - if ( lenSum > genomeLength / 2 ): - stats[ "N50" ] = length - break - lenSum = 0 - for idx, length in enumerate( lengths ): - lenSum += length - if lenSum > genomeLength * 0.99: - stats[ "99%" ] = idx + 1 - break - return stats - -def __main__(): - #Parse Command Line - usage = 'Usage: %prog input output --minContigLength' - parser = OptionParser( usage=usage ) - parser.add_option( "--minContigLength", dest="minContigLength", help="Minimum length of contigs to analyze" ) - parser.add_option( "--genomeLength", dest="genomeLength", help="Length of genome for which to calculate N50s" ) - parser.set_defaults( minContigLength=0, genomeLength=0 ) - options, args = parser.parse_args() - input_fasta_file = args[ 0 ] - output_tabular_file = args[ 1 ] - statKeys = "Num Sum Max Avg N50 99%".split( " " ) - stats = getStats( input_fasta_file, int( options.genomeLength ), int( options.minContigLength ) ) - fout = open( output_tabular_file, "w" ) - fout.write( "%s\n" % "\t".join( map( lambda key: str( stats[ key ] ), statKeys ) ) ) - fout.close() - -if __name__=="__main__": __main__() diff -r 3565ae97160f4a70b3910b54bcd98c0f4c7d1796 -r d9408bcb3cf4bcb81e9dac4ddeda74cbc164e123 tools/ilmn_pacbio/assembly_stats.xml --- a/tools/ilmn_pacbio/assembly_stats.xml +++ /dev/null @@ -1,54 +0,0 @@ -<tool id="assembly_stats" name="Assembly Statistics" version="1.0.0"> - <description>Calculate common measures of assembly quality</description> - <command interpreter="python"> - assembly_stats.py $input1 $output1 --minContigLength=${minLength} - </command> - <inputs> - <param name="input1" format="fasta" type="data" label="Select FASTA file containing contigs"/> - <param name="minLength" type="integer" value="0" label="Minimum length of contigs to consider"/> - </inputs> - <outputs> - <data name="output1" format="tabular" label="Assembly statistics for ${on_string}"/> - </outputs> - <tests> - <test> - <param name="input1" value="3.fasta" ftype="fasta"/> - <param name="minLength" value="100"/> - <output name="output1" ftype="tabular" 
file="assembly_stats.tabular" /> - </test> - </tests> - <help> - -**What it does** - -Reports standard measures of *de novo* assembly quality such as number of contigs, sum of contigs, mean contig length, and N50. - -**Parameter list** - -Minimum length - Only include contigs of this size or greater for calculating statistics. - -**Output** - -Num contigs - Total number of contigs in the assembly - -Sum of contig lengths - Total sum of contig lengths - -Maximum contig length - Maximum of the contig lengths - -Mean contig length - Average contig length - -N50 - Contig length at which 50% of the assembly is contained in contigs of this size or greater. - -99% - Number of contigs accounting for 99% of the observed assembly. - - </help> -</tool> - - diff -r 3565ae97160f4a70b3910b54bcd98c0f4c7d1796 -r d9408bcb3cf4bcb81e9dac4ddeda74cbc164e123 tools/ilmn_pacbio/cov_model.py --- a/tools/ilmn_pacbio/cov_model.py +++ /dev/null @@ -1,238 +0,0 @@ -#!/usr/bin/env python -from optparse import OptionParser, SUPPRESS_HELP -import os, random, quake - -############################################################ -# cov_model.py -# -# Given a file of kmer counts, reports the cutoff to use -# to separate trusted/untrusted kmers. -############################################################ - -############################################################ -# main -############################################################ -def main(): - usage = 'usage: %prog [options] <counts file>' - parser = OptionParser(usage) - parser.add_option('--int', dest='count_kmers', action='store_true', default=False, help='Kmers were counted as integers w/o the use of quality values [default: %default]') - parser.add_option('--ratio', dest='ratio', type='int', default=200, help='Likelihood ratio to set trusted/untrusted cutoff [default: %default]') - parser.add_option('--no_sample', dest='no_sample', action='store_true', default=False, help='Do not sample kmer coverages into kmers.txt because its already done [default: %default]') - # help='Model kmer coverage as a function of GC content of kmers [default: %default]' - parser.add_option('--gc', dest='model_gc', action='store_true', default=False, help=SUPPRESS_HELP) - (options, args) = parser.parse_args() - - if len(args) != 1: - parser.error('Must provide kmers counts file') - else: - ctsf = args[0] - - if options.count_kmers: - model_cutoff(ctsf, options.ratio) - print 'Cutoff: %s' % open('cutoff.txt').readline().rstrip() - - else: - if options.model_gc: - model_q_gc_cutoffs(ctsf, 25000, options.ratio) - else: - model_q_cutoff(ctsf, 50000, options.ratio, options.no_sample) - print 'Cutoff: %s' % open('cutoff.txt').readline().rstrip() - - -############################################################ -# model_cutoff -# -# Make a histogram of kmers to give to R to learn the cutoff -############################################################ -def model_cutoff(ctsf, ratio): - # make kmer histogram - cov_max = 0 - for line in open(ctsf): - cov = int(line.split()[1]) - if cov > cov_max: - cov_max = cov - - kmer_hist = [0]*cov_max - for line in open(ctsf): - cov = int(line.split()[1]) - kmer_hist[cov-1] += 1 - - cov_out = open('kmers.hist', 'w') - for cov in range(0,cov_max): - if kmer_hist[cov]: - print >> cov_out, '%d\t%d' % (cov+1,kmer_hist[cov]) - cov_out.close() - - os.system('R --slave --args %d < %s/cov_model.r 2> r.log' % (ratio,quake.quake_dir)) - - -############################################################ -# model_q_cutoff -# -# Sample kmers to give to R to learn the 
cutoff -# 'div100' is necessary when the number of kmers is too -# large for random.sample, so we only consider every 100th -# kmer. -############################################################ -def model_q_cutoff(ctsf, sample, ratio, no_sample=False): - if not no_sample: - # count number of kmer coverages - num_covs = 0 - for line in open(ctsf): - num_covs += 1 - - # choose random kmer coverages - div100 = False - if sample >= num_covs: - rand_covs = range(num_covs) - else: - if num_covs > 1000000000: - div100 = True - rand_covs = random.sample(xrange(num_covs/100), sample) - else: - rand_covs = random.sample(xrange(num_covs), sample) - rand_covs.sort() - - # print to file - out = open('kmers.txt', 'w') - kmer_i = 0 - rand_i = 0 - for line in open(ctsf): - if div100: - if kmer_i % 100 == 0 and kmer_i/100 == rand_covs[rand_i]: - print >> out, line.split()[1] - rand_i += 1 - if rand_i >= sample: - break - else: - if kmer_i == rand_covs[rand_i]: - print >> out, line.split()[1] - rand_i += 1 - if rand_i >= sample: - break - kmer_i += 1 - out.close() - - os.system('R --slave --args %d < %s/cov_model_qmer.r 2> r.log' % (ratio,quake.quake_dir)) - - -############################################################ -# model_q_gc_cutoffs -# -# Sample kmers to give to R to learn the cutoff for each -# GC value -############################################################ -def model_q_gc_cutoffs(ctsf, sample, ratio): - # count number of kmer coverages at each at - k = len(open(ctsf).readline().split()[0]) - num_covs_at = [0]*(k+1) - for line in open(ctsf): - kmer = line.split()[0] - num_covs_at[count_at(kmer)] += 1 - - # for each AT bin - at_cutoffs = [] - for at in range(1,k): - # sample covs - if sample >= num_covs_at[at]: - rand_covs = range(num_covs_at[at]) - else: - rand_covs = random.sample(xrange(num_covs_at[at]), sample) - rand_covs.sort() - - # print to file - out = open('kmers.txt', 'w') - kmer_i = 0 - rand_i = 0 - for line in open(ctsf): - (kmer,cov) = line.split() - if count_at(kmer) == at: - if kmer_i == rand_covs[rand_i]: - print >> out, cov - rand_i += 1 - if rand_i >= sample: - break - kmer_i += 1 - out.close() - - os.system('R --slave --args %d < %s/cov_model_qmer.r 2> r%d.log' % (ratio,quake.quake_dir,at)) - - at_cutoffs.append( open('cutoff.txt').readline().rstrip() ) - if at in [1,k-1]: # setting extremes to next closests - at_cutoffs.append( open('cutoff.txt').readline().rstrip() ) - - os.system('mv kmers.txt kmers.at%d.txt' % at) - os.system('mv cutoff.txt cutoff.at%d.txt' % at) - - out = open('cutoffs.gc.txt','w') - print >> out, '\n'.join(at_cutoffs) - out.close() - - -############################################################ -# model_q_gc_cutoffs_bigmem -# -# Sample kmers to give to R to learn the cutoff for each -# GC value -############################################################ -def model_q_gc_cutoffs_bigmem(ctsf, sample, ratio): - # input coverages - k = 0 - for line in open(ctsf): - (kmer,cov) = line.split() - if k == 0: - k = len(kmer) - at_covs = ['']*(k+1) - else: - at = count_at(kmer) - if at_covs[at]: - at_covs[at].append(cov) - else: - at_covs[at] = [cov] - - for at in range(1,k): - print '%d %d' % (at,len(at_covs[at])) - - # for each AT bin - at_cutoffs = [] - for at in range(1,k): - # sample covs - if sample >= len(at_covs[at]): - rand_covs = at_covs[at] - else: - rand_covs = random.sample(at_covs[at], sample) - - # print to file - out = open('kmers.txt', 'w') - for rc in rand_covs: - print >> out, rc - out.close() - - os.system('R --slave --args %d < 
%s/cov_model_qmer.r 2> r%d.log' % (ratio,quake.quake_dir,at)) - - at_cutoffs.append( open('cutoff.txt').readline().rstrip() ) - if at in [1,k-1]: # setting extremes to next closests - at_cutoffs.append( open('cutoff.txt').readline().rstrip() ) - - os.system('mv kmers.txt kmers.at%d.txt' % at) - os.system('mv cutoff.txt cutoff.at%d.txt' % at) - - out = open('cutoffs.gc.txt','w') - print >> out, '\n'.join(at_cutoffs) - out.close() - - -############################################################ -# count_at -# -# Count A's and T's in the given sequence -############################################################ -def count_at(seq): - return len([nt for nt in seq if nt in ['A','T']]) - - -############################################################ -# __main__ -############################################################ -if __name__ == '__main__': - main() diff -r 3565ae97160f4a70b3910b54bcd98c0f4c7d1796 -r d9408bcb3cf4bcb81e9dac4ddeda74cbc164e123 tools/ilmn_pacbio/quake.py --- a/tools/ilmn_pacbio/quake.py +++ /dev/null @@ -1,136 +0,0 @@ -#!/usr/bin/env python -from optparse import OptionParser, SUPPRESS_HELP -import os, random, sys -import cov_model - -############################################################ -# quake.py -# -# Launch pipeline to correct errors in Illumina sequencing -# reads. -############################################################ - -#r_dir = '/nfshomes/dakelley/research/error_correction/bin' -quake_dir = os.path.abspath(os.path.dirname(sys.argv[0])) - -############################################################ -# main -############################################################ -def main(): - usage = 'usage: %prog [options]' - parser = OptionParser(usage) - parser.add_option('-r', dest='readsf', help='Fastq file of reads') - parser.add_option('-f', dest='reads_listf', help='File containing fastq file names, one per line or two per line for paired end reads.') - parser.add_option('-k', dest='k', type='int', help='Size of k-mers to correct') - parser.add_option('-p', dest='proc', type='int', default=4, help='Number of processes [default: %default]') - parser.add_option('-q', dest='quality_scale', type='int', default=-1, help='Quality value ascii scale, generally 64 or 33. If not specified, it will guess.') - parser.add_option('--no_count', dest='no_count', action='store_true', default=False, help='Kmers are already counted and in expected file [reads file].qcts or [reads file].cts [default: %default]') - parser.add_option('--no_cut', dest='no_cut', action='store_true', default=False, help='Coverage model is optimized and cutoff was printed to expected file cutoff.txt [default: %default]') - parser.add_option('--int', dest='counted_kmers', action='store_true', default=False, help='Kmers were counted as integers w/o the use of quality values [default: %default]') - parser.add_option('--ratio', dest='ratio', type='int', default=200, help='Likelihood ratio to set trusted/untrusted cutoff. Generally set between 10-1000 with lower numbers suggesting a lower threshold. [default: %default]') - # help='Model kmer coverage as a function of GC content of kmers [default: %default]' - parser.add_option('--gc', dest='model_gc', action='store_true', default=False, help=SUPPRESS_HELP) - parser.add_option('--headers', action='store_true', default=False, help='Output original read headers (i.e. 
pass --headers to correct)' ) - (options, args) = parser.parse_args() - - if not options.readsf and not options.reads_listf: - parser.error('Must provide fastq file of reads with -r or file with list of fastq files of reads with -f') - if not options.k: - parser.error('Must provide k-mer size with -k') - if options.quality_scale == -1: - options.quality_scale = guess_quality_scale(options.readsf, options.reads_listf) - - if options.counted_kmers: - cts_suf = 'cts' - else: - cts_suf = 'qcts' - if options.readsf: - ctsf = '%s.%s' % (os.path.splitext( os.path.split(options.readsf)[1] )[0], cts_suf) - reads_str = '-r %s' % options.readsf - else: - ctsf = '%s.%s' % (os.path.split(options.reads_listf)[1], cts_suf) - reads_str = '-f %s' % options.reads_listf - - if not options.no_count and not options.no_cut: - count_kmers(options.readsf, options.reads_listf, options.k, ctsf, options.quality_scale) - - if not options.no_cut: - # model coverage - if options.counted_kmers: - cov_model.model_cutoff(ctsf, options.ratio) - else: - if options.model_gc: - cov_model.model_q_gc_cutoffs(ctsf, 10000, options.ratio) - else: - cov_model.model_q_cutoff(ctsf, 25000, options.ratio) - - - if options.model_gc: - # run correct C++ code - os.system('%s/correct %s -k %d -m %s -a cutoffs.gc.txt -p %d -q %d' % (quake_dir,reads_str, options.k, ctsf, options.proc, options.quality_scale)) - - else: - cutoff = open('cutoff.txt').readline().rstrip() - - # run correct C++ code - headers = '--headers' if options.headers else '' - os.system('%s/correct %s %s -k %d -m %s -c %s -p %d -q %d' % (quake_dir,headers, reads_str, options.k, ctsf, cutoff, options.proc, options.quality_scale)) - - -################################################################################ -# guess_quality_scale -# Guess at ascii scale of quality values by examining -# a bunch of reads and looking for quality values < 64, -# in which case we set it to 33. 
-################################################################################ -def guess_quality_scale(readsf, reads_listf): - reads_to_check = 1000 - if not readsf: - readsf = open(reads_listf).readline().split()[0] - - fqf = open(readsf) - reads_checked = 0 - header = fqf.readline() - while header and reads_checked < reads_to_check: - seq = fqf.readline() - mid = fqf.readline() - qual = fqf.readline().rstrip() - reads_checked += 1 - for q in qual: - if ord(q) < 64: - print 'Guessing quality values are on ascii 33 scale' - return 33 - header = fqf.readline() - - print 'Guessing quality values are on ascii 64 scale' - return 64 - - - -############################################################ -# count_kmers -# -# Count kmers in the reads file using AMOS count-kmers or -# count-qmers -############################################################ -def count_kmers(readsf, reads_listf, k, ctsf, quality_scale): - # find files - fq_files = [] - if readsf: - fq_files.append(readsf) - else: - for line in open(reads_listf): - for fqf in line.split(): - fq_files.append(fqf) - - if ctsf[-4:] == 'qcts': - os.system('cat %s | %s/count-qmers -k %d -q %d > %s' % (' '.join(fq_files), quake_dir, k, quality_scale, ctsf)) - else: - os.system('cat %s | %s/count-kmers -k %d > %s' % (' '.join(fq_files), quake_dir, k, ctsf)) - - -############################################################ -# __main__ -############################################################ -if __name__ == '__main__': - main() diff -r 3565ae97160f4a70b3910b54bcd98c0f4c7d1796 -r d9408bcb3cf4bcb81e9dac4ddeda74cbc164e123 tools/ilmn_pacbio/quake.xml --- a/tools/ilmn_pacbio/quake.xml +++ /dev/null @@ -1,44 +0,0 @@ -<tool id="quake" name="Quake" version="1.0.0"> - <description>Quality-aware error correction</description> - <command interpreter="python"> - quake_wrapper.py --default_cutoff=10 --headers -k $k -f $fofnfile -p 12 > $output1 - </command> - <inputs> - <param name="input1" format="fastq" type="data" label="Select FASTQ file to correct" /> - <param name="k" type="integer" value="16" label="Size of k-mers to correct" /> - </inputs> - <configfiles> - <configfile name="fofnfile"> -${input1.file_name} - </configfile> - </configfiles> - <outputs> - <data format="fastq" name="output1" label="Error-corrected reads from ${on_string}" /> - </outputs> - <help> - -**What it does** - -Applies the Quake_ algorithm for quality-aware correction of -substitution error in short reads. - -Kelley DR, Schatz MC, Salzberg SL. -"Quake: quality-aware detection and correction of sequencing errors." -*Genome Biol.* 2010;11(11):R116. - -.. _Quake: http://www.cbcb.umd.edu/software/quake - -**Parameter list** - -k - k-mer size for detecting spurious k-mers versus true k-mers from - the genome. Recommendations for choosing a value of k can be found - here_. - -.. _here: http://www.cbcb.umd.edu/software/quake/faq.html - -**Output** - -A FASTQ file of corrected and trimmed reads. 
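
For context on what is being removed along with this help text: quake.py delegated the trusted/untrusted k-mer cutoff to the deleted cov_model.py, which began by histogramming k-mer coverages before handing the histogram to R. A minimal sketch of that first step, assuming the same whitespace-separated "kmer count" input the deleted code read, might look like::

    # Build the k-mer coverage histogram that the removed
    # cov_model.model_cutoff() fed to R to learn a cutoff.
    # Assumes one "kmer count" pair per line, whitespace-separated.
    from collections import Counter

    def kmer_coverage_histogram(counts_path):
        hist = Counter()
        with open(counts_path) as counts:
            for line in counts:
                hist[int(line.split()[1])] += 1  # column 2 is the coverage
        return hist

    def write_histogram(hist, out_path='kmers.hist'):
        # Same "coverage<TAB>frequency" layout the deleted script
        # passed to cov_model.r via R --slave.
        with open(out_path, 'w') as out:
            for cov in sorted(hist):
                out.write('%d\t%d\n' % (cov, hist[cov]))
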
- </help> -</tool> diff -r 3565ae97160f4a70b3910b54bcd98c0f4c7d1796 -r d9408bcb3cf4bcb81e9dac4ddeda74cbc164e123 tools/ilmn_pacbio/quake_pe.xml --- a/tools/ilmn_pacbio/quake_pe.xml +++ /dev/null @@ -1,53 +0,0 @@ -<tool id="quake_pe" name="Quake PE" version="1.0.0"> - <description>Quality-aware error correction for paired-end reads</description> - <command interpreter="python"> - quake_wrapper.py --default_cutoff=$cutoff --headers -k $k -f $fofnfile -p 12 --output=$output1,$output2 - </command> - <inputs> - <param name="input1" format="fastq" type="data" label="FASTQ file for forward reads" /> - <param name="input2" format="fastq" type="data" label="FASTQ file for reverse reads" /> - <param name="k" type="integer" value="16" label="Size of k-mers to correct" /> - <param name="cutoff" type="integer" value="0" label="Default coverage cutoff if estimation fails"/> - </inputs> - <configfiles> - <configfile name="fofnfile">${input1.file_name} ${input2.file_name} - </configfile> - </configfiles> - <outputs> - <data format="fastq" name="output1" label="Error-corrected forward reads from ${on_string}" /> - <data format="fastq" name="output2" label="Error-corrected reverse reads from ${on_string}" /> - </outputs> - <help> - -**What it does** - -Applies the Quake_ algorithm for quality-aware correction of -substitution error in short reads. This form of the tool is customized -for correcting paired-end reads. - -Kelley DR, Schatz MC, Salzberg SL. -"Quake: quality-aware detection and correction of sequencing errors." -*Genome Biol.* 2010;11(11):R116. - -.. _Quake: http://www.cbcb.umd.edu/software/quake - -**Parameter list** - -K-mer size - k-mer size for detecting spurious k-mers versus true k-mers from - the genome. Recommendations for choosing a value of k can be found - here_. - -Default coverage cutoff - If the appropriate coverage cutoff can not be found then Quake can be - forced to proceed anyways with the supplied cutoff. In this case, - the optimal cutoff can be estimated by examining - the k-mer coverage histogram by eye. - -.. _here: http://www.cbcb.umd.edu/software/quake/faq.html - -**Output** - -A FASTQ file of corrected and trimmed reads. - </help> -</tool> diff -r 3565ae97160f4a70b3910b54bcd98c0f4c7d1796 -r d9408bcb3cf4bcb81e9dac4ddeda74cbc164e123 tools/ilmn_pacbio/quake_wrapper.py --- a/tools/ilmn_pacbio/quake_wrapper.py +++ /dev/null @@ -1,132 +0,0 @@ -#!/usr/bin/python -# -# Copyright (c) 2011, Pacific Biosciences of California, Inc. -# -# All rights reserved. -# -#Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: -# * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. -# * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. -# * Neither the name of Pacific Biosciences nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. -# -#THIS SOFTWARE IS PROVIDED BY PACIFIC BIOSCIENCES AND ITS CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED -#WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL PACIFIC BIOSCIENCES OR ITS CONTRIBUTORS BE LIABLE FOR ANY -#DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; -#LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT -#(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -# -import sys -import os -import subprocess - -QUAKE_EXE = os.path.join( os.path.dirname(os.path.abspath(sys.argv[0])), 'quake.py' ) -cmdLine = sys.argv -cmdLine.pop(0) - -# -# horribly not robust, but it was a pain to rewrite everything with -# optparse -# -j = -1 -cut = 0 -for i,arg in enumerate(cmdLine): - if '--default_cutoff' in arg: - j = i - cut = int(arg.split('=')[1]) -if j>=0: - cmdLine = cmdLine[:j] + cmdLine[j+1:] - -j = -1 -output='' -for i,arg in enumerate(cmdLine): - if '--output' in arg: - j = i - output = arg.split('=')[1] -if j>=0: - cmdLine = cmdLine[:j] + cmdLine[j+1:] - -def backticks( cmd, merge_stderr=True ): - """ - Simulates the perl backticks (``) command with error-handling support - Returns ( command output as sequence of strings, error code, error message ) - """ - if merge_stderr: - _stderr = subprocess.STDOUT - else: - _stderr = subprocess.PIPE - - p = subprocess.Popen( cmd, shell=True, stdin=subprocess.PIPE, - stdout=subprocess.PIPE, stderr=_stderr, - close_fds=True ) - - out = [ l[:-1] for l in p.stdout.readlines() ] - - p.stdout.close() - if not merge_stderr: - p.stderr.close() - - # need to allow process to terminate - p.wait() - - errCode = p.returncode and p.returncode or 0 - if p.returncode>0: - errorMessage = os.linesep.join(out) - output = [] - else: - errorMessage = '' - output = out - - return output, errCode, errorMessage - -def to_stdout(): - def toCorFastq(f): - stem, ext = os.path.splitext( os.path.basename(f) ) - dir = os.path.dirname(f) - corFastq = os.path.join(dir,'%s.cor%s' % (stem,ext) ) - if not os.path.exists(corFastq): - print >>sys.stderr, "Can't find path %s" % corFastq - sys.exit(1) - return corFastq - if '-r' in cmdLine: - fastqFile = cmdLine[ cmdLine.index('-r')+1 ] - corFastq = toCorFastq(fastqFile) - infile = open( corFastq, 'r' ) - for line in infile: - sys.stdout.write( line ) - infile.close() - else: - fofnFile = cmdLine[ cmdLine.index('-f')+1 ] - infile = open(fofnFile,'r') - for line in infile: - line = line.strip() - if len(line)>0: - fastqFiles = line.split() - break - infile.close() - outs = output.split(',') - for o,f in zip(outs,fastqFiles): - cf = toCorFastq(f) - os.system( 'cp %s %s' % ( cf, o ) ) - -def run(): - cmd = '%s %s' % ( QUAKE_EXE, " ".join(cmdLine) ) - output, errCode, errMsg = backticks( cmd ) - - if errCode==0: - to_stdout() - else: - # if Quake exits with an error in cutoff determination we - # can force correction if requested - if 'cutoff.txt' in errMsg and cut>0: - outfile = open( 'cutoff.txt', 'w' ) - print >>outfile, str(cut) - outfile.close() - cmd = '%s --no_count --no_cut %s' % ( QUAKE_EXE, " ".join(cmdLine) ) - output, errCode, errMsg = backticks( cmd ) - if errCode==0: - to_stdout() - else: - print >>sys.stderr, errMsg - sys.exit(1) - -if __name__=='__main__': run() diff -r 3565ae97160f4a70b3910b54bcd98c0f4c7d1796 -r d9408bcb3cf4bcb81e9dac4ddeda74cbc164e123 tools/ilmn_pacbio/smrtpipe.py --- a/tools/ilmn_pacbio/smrtpipe.py +++ /dev/null @@ -1,5 +0,0 @@ -#!/usr/bin/env python -# 
EASY-INSTALL-SCRIPT: 'pbpy==0.1','smrtpipe.py' -__requires__ = 'pbpy==0.1' -import pkg_resources -pkg_resources.run_script('pbpy==0.1', 'smrtpipe.py') diff -r 3565ae97160f4a70b3910b54bcd98c0f4c7d1796 -r d9408bcb3cf4bcb81e9dac4ddeda74cbc164e123 tools/ilmn_pacbio/smrtpipe_filter.xml --- a/tools/ilmn_pacbio/smrtpipe_filter.xml +++ /dev/null @@ -1,67 +0,0 @@ -<tool id="smrtpipe_filter" name="SMRTpipe Filter" version="1.0.0"> - <description>Produce filtered reads from a set of PacBio primary analysis outputs.</description> - <command interpreter="python"> - smrtpipe_galaxy.py --output=data/filtered_subreads.fasta --galaxy_output=${outfile} ${iniFile} - </command> - <inputs> - <conditional name="source"> - <param name="input_source" type="select" label="Choose the source for the analysis inputs"> - <option value="path">Path to fofn or multiple bas.h5 paths</option> - <option value="history">History</option> - </param> - <when value="path"> - <repeat name="inputFiles" title="Input files"> - <param name="path" type="text" label="File path" size="75"/> - </repeat> - </when> - <when value="history"> - <param name="input1" type="data" format="tabular" label="File containing input paths" /> - </when> - </conditional> - <param name="minimum_readlength" type="integer" value="50" label="Minimum raw readlength" /> - <param name="minimum_readscore" type="float" value="0.75" label="Minimum read quality" /> - </inputs> - <configfiles> - <configfile name="iniFile"> -[input] -#if $source.input_source=="history": -#for $l in open($source.input1.file_name,'r'): -$l -#end for -#else -#for $p in $source.inputFiles -${p.path} -#end for -#end if - -[S_Filter] -filters=MinRL=${minimum_readlength},MinReadScore=${minimum_readscore} - </configfile> - </configfiles> - <outputs> - <data name="outfile" format="fasta" label="Filtered subreads" /> - </outputs> - <help> - -**What it does** - -Filters PacBio bas.h5 files and produces a FASTA file of filtered subreads. - -In PacBio SMRT sequencing, the template format is a SMRTbell: a circular -molecule with adapters at two locations in the circle. The subreads are the -portions of the read between adapters. - -**Parameter list** - -Minimum readlength - Only keep reads from ZMWs that produced this many bases or more. - -Minimum read quality - Only keep reads with overall quality scores of this value or more. The read quality score is a *de novo* prediction of the accuracy of the read. - -**Output** - -FASTA file of filtered reads. 
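
The heavy lifting in this removed tool happened inside SMRTpipe's S_Filter module on bas.h5 inputs, which Galaxy could not yet read. Purely to illustrate the MinRL (minimum readlength) half of that filter, a toy FASTA-only version is sketched below; the function name and length-only behavior are illustrative, not the SMRTpipe implementation::

    def filter_fasta_by_length(in_path, out_path, min_len=50):
        # Keep only records totalling at least min_len bases; the real
        # tool also applied a MinReadScore quality cutoff per read.
        with open(in_path) as src, open(out_path, 'w') as dst:
            header, seq = None, []
            def flush():
                if header and sum(map(len, seq)) >= min_len:
                    dst.write(header + '\n' + ''.join(seq) + '\n')
            for line in src:
                line = line.rstrip()
                if line.startswith('>'):
                    flush()
                    header, seq = line, []
                else:
                    seq.append(line)
            flush()  # write the final record
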
- - </help> -</tool> diff -r 3565ae97160f4a70b3910b54bcd98c0f4c7d1796 -r d9408bcb3cf4bcb81e9dac4ddeda74cbc164e123 tools/ilmn_pacbio/smrtpipe_galaxy.py --- a/tools/ilmn_pacbio/smrtpipe_galaxy.py +++ /dev/null @@ -1,265 +0,0 @@ -#!/usr/bin/python -import sys -import os -import subprocess -import optparse as op -import xml.etree.cElementTree as et - -TRACE=False -# -# Turn on tracing to dump out __input__.xml and __settings__.xml somewhere -# -#TRACE=True -#TRACE_PATH='/home/UNIXHOME/jsorenson' - -class SmrtpipeGalaxy: - """Wrapper for running smrtpipe under galaxy""" - def __init__( self, argv ): - self.__parseOptions( argv ) - - def __parseOptions( self, argv ): - usage = 'Usage: %prog [--help] [options] smrtpipe.ini' - parser = op.OptionParser( usage=usage, description=SmrtpipeGalaxy.__doc__ ) - parser.add_option( "--output", - help="Designate a file generated by smrtpipe as the expected output for galaxy" ) - parser.add_option( "--nproc", type="int", - help="Number of processes to use (-D NPROC)" ) - parser.add_option( "--galaxy_output", - help="File name provided by galaxy where output should be placed" ) - parser.add_option( "--dry_run", action="store_true", - help="Create auxiliary XML files and exit" ) - parser.add_option( "--dat_extension", - help="Soft link .dat files to have this extension (some pipelines require certain extensions)" ) - - parser.set_defaults( output=None, dry_run=False, galaxy_output=None, - dat_extension=None, nproc=0 ) - self.options, self.args = parser.parse_args( argv ) - - if len(self.args)!=2: - parser.error( 'Expected 1 argument' ) - - self.configFile = self.args[1] - - def __parseConfig( self ): - infile = open( self.configFile, 'r' ) - section = None - sections = [] - for line in infile: - l = line.strip() - if len(l)==0 or line.startswith('#'): - continue - if l.startswith('[') and l.endswith(']'): - section = section_factory( l[1:-1] ) - sections.append(section) - continue - if section is None: - continue - if '=' in l: - section.addParameterLine(l) - else: - section.addLine(l) - infile.close() - return sections - - def transferOutput( self ): - if not self.options.output or not self.options.galaxy_output: - return True, '' - if not os.path.exists(self.options.output): - return False, "Can't find file %s (job error?)" % self.options.output - os.system( 'cp %s %s' % (self.options.output, self.options.galaxy_output )) - return True, '' - - def run( self ): - if not os.path.exists( self.configFile ): - print >>sys.stderr, "Can't find config file %s" % self.configFile - return 1 - - sections = self.__parseConfig() - - if len(sections)==0: - print >>sys.stderr, "No sections found in %s" % self.configFile - return 1 - if sections[0].name != 'input': - print >>sys.stderr, "No [input] section found in %s" % self.configFile - return 1 - - INPUT_FILE = '__input__.xml' - SETTINGS_FILE = '__settings__.xml' - - sections[0].softLinkDats( self.options.dat_extension ) - inputXml = sections[0].makeXmlElement() - write_xml_to_file( INPUT_FILE, inputXml ) - if TRACE: - write_xml_to_file( os.path.join(TRACE_PATH,INPUT_FILE), inputXml ) - - settings = et.Element( 'smrtpipeSettings' ) - for s in sections[1:]: - s.makeXmlElement( settings ) - - write_xml_to_file( SETTINGS_FILE, settings ) - if TRACE: - write_xml_to_file( os.path.join(TRACE_PATH,SETTINGS_FILE), settings ) - - nproc = '-D NPROC=%d' % self.options.nproc if self.options.nproc>0 else '' - cmd = 'smrtpipe.py %s --params=%s xml:%s > smrtpipe.err 2>1' % \ - ( nproc, SETTINGS_FILE, INPUT_FILE ) - - if 
self.options.dry_run: - print 'Command to run:' - print cmd - return 0 - - out, errCode, errMsg = backticks( cmd ) - if errCode!=0: - print >>sys.stderr, "error while running: %s" % cmd - print >>sys.stderr, errMsg - if os.path.exists('log/smrtpipe.log'): - print >>sys.stderr, 'Log:' - infile = open('log/smrtpipe.log','r') - for line in infile: sys.stderr.write(line) - infile.close() - return errCode - - success, errMsg = self.transferOutput() - if not success: - print >>sys.stderr, errMsg - return 1 - - return 0 - -def write_xml_to_file( fileName, root ): - outfile = open( fileName, 'w' ) - outfile.write( '<?xml version="1.0"?>\n' ) - outfile.write( et.tostring(root) + '\n' ) - outfile.close() - -def section_factory( name ): - if name=='input': - return InputSection(name) - else: - return Section(name) - -class Section: - def __init__( self, name ): - self._name = name - self._lines = [] - self._vars = {} - - @property - def name(self): - return self._name - - def addLine( self, line ): - self._lines.append(line) - - def addParameterLine( self, line ): - self.addLine(line) - i = line.find( '=' ) - key = line[:i].strip() - value = line[i+1:].strip() - self._vars[key] = value - - def makeXmlElement( self, settings ): - if self._name=='global': - root = et.SubElement( settings, "protocol", {'name':'generic'} ) - else: - root = et.SubElement( settings, "module", {'name':self._name} ) - for k,v in self._vars.iteritems(): - param = et.SubElement( root, 'param', {'name':k} ) - val = et.SubElement( param, 'value' ) - val.text = v - return None - - def __str__( self ): - "for debugging" - buffer = [ 'S { name=' ] - buffer.append(self._name) - buffer.append('; lines=%s' % ','.join(self._lines) ) - for k,v in self._vars.iteritems(): - buffer.append('; %s=%s' % (k,v) ) - buffer.append(' }') - return ''.join(buffer) - -class InputSection( Section ): - def __init__( self, name ): - Section.__init__(self,name) - - def softLinkDats( self, newExtension ): - if not newExtension: - return - newLines = [] - for l in self._lines: - if ':' in l: - protocol = l[:l.find(':')+1] - file = l[l.find(':')+1:] - else: - protocol = '' - file = l - if os.path.exists(file) and file.endswith('.dat'): - newFile = '%s.%s' % ( file, newExtension ) - if not os.path.exists(newFile): - os.system( 'ln -s %s %s' % ( file, newFile ) ) - newLines.append(protocol+newFile) - else: - newLines.append(l) - self._lines = newLines - - def makeXmlElement( self, parent=None ): - root = et.Element( "pacbioAnalysisInputs" ) - data = et.SubElement( root, 'dataReferences' ) - iRef = 0 - for l in self._lines: - def add(x,iRef): - if len(x)==0: return iRef - node = et.SubElement( data, 'url' ) - if ':' in x: - node.attrib[ 'ref' ] = x - else: - node.attrib[ 'ref' ] = 'run:0000000-%04d' % iRef - node2 = et.SubElement( node, 'location' ) - node2.text = x - return iRef+1 - if l.endswith('fofn') and os.path.exists(l): - infile = open(l,'r') - for j,line in enumerate(infile): iRef=add(line.strip(),iRef) - infile.close() - else: - iRef=add(l,iRef) - return root - -def backticks( cmd, merge_stderr=True ): - """ - Simulates the perl backticks (``) command with error-handling support - Returns ( command output as sequence of strings, error code, error message ) - """ - if merge_stderr: - _stderr = subprocess.STDOUT - else: - _stderr = subprocess.PIPE - - p = subprocess.Popen( cmd, shell=True, stdin=subprocess.PIPE, - stdout=subprocess.PIPE, stderr=_stderr, - close_fds=True ) - - out = [ l[:-1] for l in p.stdout.readlines() ] - - p.stdout.close() - if not 
merge_stderr: - p.stderr.close() - - # need to allow process to terminate - p.wait() - - errCode = p.returncode and p.returncode or 0 - if p.returncode>0: - errorMessage = os.linesep.join(out) - output = [] - else: - errorMessage = '' - output = out - - return output, errCode, errorMessage - -if __name__=='__main__': - app = SmrtpipeGalaxy( sys.argv ) - sys.exit( app.run() ) diff -r 3565ae97160f4a70b3910b54bcd98c0f4c7d1796 -r d9408bcb3cf4bcb81e9dac4ddeda74cbc164e123 tools/ilmn_pacbio/smrtpipe_hybrid.xml --- a/tools/ilmn_pacbio/smrtpipe_hybrid.xml +++ /dev/null @@ -1,59 +0,0 @@ -<tool id="smrtpipe_hybrid" name="AHA" version="1.0.0"> - <description>Assemble contigs from a set of contigs and PacBio reads.</description> - <command interpreter="python"> - smrtpipe_galaxy.py --nproc=24 --dat_extension=fasta --output=data/scaffold.fasta --galaxy_output=${outfile} ${iniFile} - </command> - <!-- - <command>cp ${iniFile} ${outfile}</command> - --> - <inputs> - <param name="contigs" format="fasta" type="data" label="Starting Contigs"/> - <param name="reads" format="fasta" type="data" label="PacBio Reads"/> - <param name="schedule" type="text" value="6,3,75;6,3,75;5,3,75;5,3,75;6,2,75;6,2,75;5,2,75;5,2,75" label="Parameter Schedule" size="60"/> - </inputs> - <configfiles> - <configfile name="iniFile"> -[input] -assembled_contigs:${contigs} -file:${reads} - -[HybridAssembly] -instrumentModel=RS -cleanup=False -untangler=pacbio -#set $schedule2 = $schedule.replace('X',';') -paramSchedule=${schedule2} -dontFillin=False -longReadsAsStrobe=True -exactQueryIds=True -rm4Opts=-minMatch 7 -minFrac 0.1 -minPctIdentity 65 -bestn 10 -noSplitSubreads -numberProcesses=16 -cluster=False -minRepeatLength=100000 - </configfile> - </configfiles> - <outputs> - <data name="outfile" format="fasta" label="Hybrid assembly contigs from ${on_string}"/> - </outputs> - <help> - -**What it does** - -The AHA assembly algorithm is an AMOS_-based pipeline -for finishing bacterial-sized -genomes using draft contigs and PacBio reads. - -.. _AMOS: http://sourceforge.net/apps/mediawiki/amos - -**Parameter list** - -Parameter schedule - The parameter schedule is a semi-colon delimited list of triples. Each triple represents an iteration of hybrid assembly (alignment/scaffolding/gap-filling). The three paremeters for each iteration are the Z-score, number of reads required to define a link, and the minimum length of subreads used in links. - -**Output** - -FASTA file containing scaffolded and gap-filled contigs resulting from the -hybrid assembly. 
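
Since the schedule syntax is easy to get wrong, here is a short sketch of how such a string decomposes into (Z-score, reads per link, minimum subread length) triples; the helper is hypothetical and was not part of the removed tool::

    def parse_schedule(schedule):
        # "6,3,75;5,2,75" -> [(6, 3, 75), (5, 2, 75)]
        triples = []
        for entry in schedule.split(';'):
            z, reads_per_link, min_subread = (int(v) for v in entry.split(','))
            triples.append((z, reads_per_link, min_subread))
        return triples
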
- - </help> -</tool> diff -r 3565ae97160f4a70b3910b54bcd98c0f4c7d1796 -r d9408bcb3cf4bcb81e9dac4ddeda74cbc164e123 tools/ilmn_pacbio/soap_denovo.xml --- a/tools/ilmn_pacbio/soap_denovo.xml +++ /dev/null @@ -1,73 +0,0 @@ -<tool id="soap_denovo" name="SOAPdenovo" version="1.0.0"> - <description>Short-read de novo assembly</description> - <!-- - # SOAPdenovo-127mer all -s ${soap_config} -o assembly -K ${k} -p 8 -d -D - # cat ${soap_config} > ${output1} - # cp ${soap_config} ${output1} && - --> - <command> - SOAPdenovo-127mer all -s ${soap_config} -o assembly -K ${k} -p 24 -d -D -R - </command> - <inputs> - <conditional name="inputs"> - <param name="read_type" type="select" label="Illumina read type"> - <option value="single">Single fragment</option> - <option value="paired">Paired-end</option> - </param> - <when value="single"> - <param name="input1" format="fastq" type="data" label="FASTQ file for reads"/> - </when> - <when value="paired"> - <param name="input1" format="fastq" type="data" label="FASTQ file for forward reads"/> - <param name="input2" format="fastq" type="data" label="FASTQ file for reverse reads"/> - <param name="d" type="integer" value="500" label="Estimated insert size for paired-end reads" /> - </when> - </conditional> - <param name="k" type="integer" value="23" label="Size of k for forming the de Bruijn overlap graph" /> - </inputs> - <configfiles> - <configfile name="soap_config">max_rd_len=105 -[LIB] -#if $inputs.read_type == "single" -q=${inputs.input1.file_name} -#else -avg_ins=${inputs.d} -asm_flags=3 -reverse_seq=0 -q1=${inputs.input1.file_name} -q2=${inputs.input2.file_name} -#end if - </configfile> - </configfiles> - <outputs> - <data name="assembled_contigs" format="fasta" from_work_dir="assembly.scafSeq" label="Assembled contigs from ${on_string}" /> - </outputs> - <help> - -**What it does** - -Runs SOAPdenovo_ to generate a genome assembly -using single-fragment or paired-end short reads. - -Li R, Zhu H, Ruan J, Qian W, Fang X, Shi Z, Li Y, Li S, Shan G, Kristiansen K, Li S, Yang H, Wang J, Wang J. -"De novo assembly of human genomes with massively parallel short read sequencing." -*Genome Res.* 2010 Feb;20(2):265-72. - -.. _SOAPdenovo: http://soap.genomics.org.cn/soapdenovo.html - -**Parameter list** - -k - k-mer size for constructing the de Bruijn graph. The appropriate size of k is genome and data set dependent, but a good starting choice might be 75% of the read length. - -Insert size - For paired-end libraries, the expected insert size. - -**Output** - -assembly - - </help> -</tool> - - Repository URL: https://bitbucket.org/galaxy/galaxy-central/ -- This is a commit notification from bitbucket.org. You are receiving this because you have the service enabled, addressing the recipient of this email.
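
A closing note for anyone rebuilding the removed assembly_stats tool elsewhere: its N50 followed the definition in its help text (the length L such that contigs of length L or greater contain at least half of the total assembly). A self-contained sketch of that definition, not the deleted script itself::

    def n50(lengths):
        # Smallest contig length among the largest contigs that
        # together cover at least 50% of the summed assembly length.
        total = sum(lengths)
        running = 0
        for length in sorted(lengths, reverse=True):
            running += length
            if 2 * running >= total:
                return length
        return 0

    # The two 8s alone cover half of the 32-base total, so N50 is 8.
    assert n50([2, 2, 2, 3, 3, 4, 8, 8]) == 8
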