galaxy-dev

09 Oct '09
details: http://www.bx.psu.edu/hg/galaxy/rev/ebe3e881ac25
changeset: 2843:ebe3e881ac25
user: Kelly Vincent <kpvincent(a)bx.psu.edu>
date: Wed Oct 07 15:31:18 2009 -0400
description:
Updated several sample index loc files to make it clearer how the actual loc files should appear
4 file(s) affected in this change:
tool-data/bowtie_indices.loc.sample
tool-data/sam_fa_indices.loc.sample
tool-data/sequence_index_base.loc.sample
tool-data/sequence_index_color.loc.sample
diffs (89 lines):
diff -r 31c577c6fd49 -r ebe3e881ac25 tool-data/bowtie_indices.loc.sample
--- a/tool-data/bowtie_indices.loc.sample Wed Oct 07 15:25:16 2009 -0400
+++ b/tool-data/bowtie_indices.loc.sample Wed Oct 07 15:31:18 2009 -0400
@@ -1,8 +1,8 @@
#This is a sample file distributed with Galaxy that enables tools
#to use a directory of Bowtie indexed sequences data files. You will need
#to create these data files and then create a bowtie_indices.loc file
-#similar to this one (store it in this directory ) that points to
-#the directories in which those files are stored. The bowtie_indices.loc
+#similar to this one (store it in this directory) that points to
+#the directories in which those files are stored. The bowtie_indices.loc
#file has this format (white space characters are TAB characters):
#
#<build> <file_base>
@@ -26,3 +26,4 @@
#exist, but it is the prefix for the actual index files. For example:
#
#hg18 /depot/data2/galaxy/bowtie/hg18/hg18
+#hg19 /depot/data2/galaxy/bowtie/hg19/hg19
diff -r 31c577c6fd49 -r ebe3e881ac25 tool-data/sam_fa_indices.loc.sample
--- a/tool-data/sam_fa_indices.loc.sample Wed Oct 07 15:25:16 2009 -0400
+++ b/tool-data/sam_fa_indices.loc.sample Wed Oct 07 15:31:18 2009 -0400
@@ -1,17 +1,17 @@
#This is a sample file distributed with Galaxy that enables tools
#to use a directory of Samtools indexed sequences data files. You will need
#to create these data files and then create a sam_fa_indices.loc file
-#similar to this one (store it in this directory ) that points to
-#the directories in which those files are stored. The sam_fa_indices.loc
+#similar to this one (store it in this directory) that points to
+#the directories in which those files are stored. The sam_fa_indices.loc
#file has this format (white space characters are TAB characters):
#
-#<index> <seq> <location>
+#index <seq> <location>
#
#So, for example, if you had hg18 indexed stored in
#/depot/data2/galaxy/sam/,
#then the sam_fa_indices.loc entry would look like this:
#
-#hg18 /depot/data2/galaxy/sam/hg18.fa
+#index hg18 /depot/data2/galaxy/sam/hg18.fa
#
#and your /depot/data2/galaxy/sam/ directory
#would contain hg18.fa and hg18.fa.fai files:
@@ -24,4 +24,5 @@
#exist, but it should never be directly used. Instead, the name serves
#as a prefix for the index file. For example:
#
-#hg18 /depot/data2/galaxy/sam/hg18.fa
+#index hg18 /depot/data2/galaxy/sam/hg18.fa
+#index hg19 /depot/data2/galaxy/sam/hg19.fa
diff -r 31c577c6fd49 -r ebe3e881ac25 tool-data/sequence_index_base.loc.sample
--- a/tool-data/sequence_index_base.loc.sample Wed Oct 07 15:25:16 2009 -0400
+++ b/tool-data/sequence_index_base.loc.sample Wed Oct 07 15:31:18 2009 -0400
@@ -1,8 +1,8 @@
#This is a sample file distributed with Galaxy that enables tools
#to use a directory of BWA indexed sequences data files. You will need
#to create these data files and then create a sequence_index_base.loc file
-#similar to this one (store it in this directory ) that points to
-#the directories in which those files are stored. The sequence_index_base.loc
+#similar to this one (store it in this directory) that points to
+#the directories in which those files are stored. The sequence_index_base.loc
#file has this format (white space characters are TAB characters):
#
#<build> <file_base>
@@ -26,3 +26,4 @@
#exist, but it is the prefix for the actual index files. For example:
#
#phiX /depot/data2/galaxy/phiX/base/phiX.fa
+#hg18 /depot/data2/galaxy/hg18/base/hg18.fa
diff -r 31c577c6fd49 -r ebe3e881ac25 tool-data/sequence_index_color.loc.sample
--- a/tool-data/sequence_index_color.loc.sample Wed Oct 07 15:25:16 2009 -0400
+++ b/tool-data/sequence_index_color.loc.sample Wed Oct 07 15:31:18 2009 -0400
@@ -1,8 +1,8 @@
#This is a sample file distributed with Galaxy that enables tools
#to use a directory of BWA indexed sequences data files. You will need
#to create these data files and then create a sequence_index_color.loc file
-#similar to this one (store it in this directory ) that points to
-#the directories in which those files are stored. The sequence_index_color.loc
+#similar to this one (store it in this directory) that points to
+#the directories in which those files are stored. The sequence_index_color.loc
#file has this format (white space characters are TAB characters):
#
#<build> <file_base>
@@ -26,3 +26,4 @@
#exist, but it is the prefix for the actual index files. For example:
#
#phiX /depot/data2/galaxy/phiX/color/phiX.fa
+#hg18 /depot/data2/galaxy/hg18/color/hg18.fa
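
For readers setting up their own loc files, the format described in the sample comments above (lines starting with # are comments, fields separated by TAB characters) can be read roughly as follows. This is only a minimal sketch, not Galaxy's own loader; the function name parse_loc_lines and the in-memory sample list are assumptions made for illustration.

```python
def parse_loc_lines(lines):
    """Yield the TAB-separated fields of each non-comment, non-blank line.

    Hypothetical helper for illustration; not part of Galaxy.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank lines and '#' comment lines, as in the .loc.sample files.
        if not line.strip() or line.startswith("#"):
            continue
        yield line.split("\t")


# Uncommented entries in the <build> TAB <file_base> layout of bowtie_indices.loc.
sample = [
    "#<build>\t<file_base>",
    "hg18\t/depot/data2/galaxy/bowtie/hg18/hg18",
    "hg19\t/depot/data2/galaxy/bowtie/hg19/hg19",
]
bowtie_indices = {build: file_base for build, file_base in parse_loc_lines(sample)}

# sam_fa_indices.loc entries carry a leading literal "index" field.
sam_entries = list(parse_loc_lines(["index\thg18\t/depot/data2/galaxy/sam/hg18.fa"]))
```

Note that the sam_fa_indices.loc entries have three fields, so a loader for that file would need to unpack the leading "index" column as well.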

09 Oct '09
details: http://www.bx.psu.edu/hg/galaxy/rev/e73efc9387ee
changeset: 2844:e73efc9387ee
user: Greg Von Kuster <greg(a)bx.psu.edu>
date: Wed Oct 07 16:37:48 2009 -0400
description:
Incorporate code contributed by Brad Chapman that provides UCSC and GBrowse integration for wiggle files; handles ticket #134.
3 file(s) affected in this change:
lib/galaxy/datatypes/genetics.py
lib/galaxy/datatypes/interval.py
lib/galaxy/datatypes/tabular.py
diffs (263 lines):
diff -r ebe3e881ac25 -r e73efc9387ee lib/galaxy/datatypes/genetics.py
--- a/lib/galaxy/datatypes/genetics.py Wed Oct 07 15:31:18 2009 -0400
+++ b/lib/galaxy/datatypes/genetics.py Wed Oct 07 16:37:48 2009 -0400
@@ -56,10 +56,6 @@
def get_estimated_display_viewport( self, dataset ):
"""Return a chrom, start, stop tuple for viewing a file."""
raise notImplemented
-
- def as_ucsc_display_file( self, dataset, **kwd ):
- """Returns file"""
- return file(dataset.file_name,'r')
def ucsc_links( self, dataset, type, app, base_url ):
""" from the ever-helpful angie hinrichs angie(a)soe.ucsc.edu
diff -r ebe3e881ac25 -r e73efc9387ee lib/galaxy/datatypes/interval.py
--- a/lib/galaxy/datatypes/interval.py Wed Oct 07 15:31:18 2009 -0400
+++ b/lib/galaxy/datatypes/interval.py Wed Oct 07 16:37:48 2009 -0400
@@ -493,7 +493,6 @@
"""Initialize datatype, by adding GBrowse display app"""
Tabular.__init__(self, **kwd)
self.add_display_app ( 'c_elegans', 'display in Wormbase', 'as_gbrowse_display_file', 'gbrowse_links' )
-
def set_meta( self, dataset, overwrite = True, **kwd ):
i = 0
for i, line in enumerate( file ( dataset.file_name ) ):
@@ -508,7 +507,6 @@
except:
pass
Tabular.set_meta( self, dataset, overwrite = overwrite, skip = i )
-
def make_html_table( self, dataset, skipchars=[] ):
"""Create HTML table, used for displaying peek"""
out = ['<table cellspacing="0" cellpadding="3">']
@@ -524,11 +522,6 @@
except Exception, exc:
out = "Can't create peek %s" % exc
return out
-
- def as_gbrowse_display_file( self, dataset, **kwd ):
- """Returns file contents that can be displayed in GBrowse apps."""
- return open( dataset.file_name )
-
def get_estimated_display_viewport( self, dataset ):
"""
Return a chrom, start, stop tuple for viewing a file. There are slight differences between gff 2 and gff 3
@@ -568,7 +561,6 @@
return ( seqid, str( start ), str( stop ) )
else:
return ( '', '', '' )
-
def gbrowse_links( self, dataset, type, app, base_url ):
ret_val = []
if dataset.has_data:
@@ -582,7 +574,6 @@
link = "%s?start=%s&stop=%s&ref=%s&dbkey=%s" % ( site_url, start, stop, seqid, dataset.dbkey )
ret_val.append( ( site_name, link ) )
return ret_val
-
def sniff( self, filename ):
"""
Determines whether the file is in gff format
@@ -639,7 +630,6 @@
def __init__(self, **kwd):
"""Initialize datatype, by adding GBrowse display app"""
Gff.__init__(self, **kwd)
-
def set_meta( self, dataset, overwrite = True, **kwd ):
i = 0
for i, line in enumerate( file ( dataset.file_name ) ):
@@ -666,7 +656,6 @@
if valid_start and valid_end and start < end and strand in self.valid_gff3_strand and phase in self.valid_gff3_phase:
break
Tabular.set_meta( self, dataset, overwrite = overwrite, skip = i )
-
def sniff( self, filename ):
"""
Determines whether the file is in gff version 3 format
@@ -740,9 +729,70 @@
MetadataElement( name="columns", default=3, desc="Number of columns", readonly=True, visible=False )
+ def __init__( self, **kwd ):
+ Tabular.__init__( self, **kwd )
+ self.add_display_app( 'ucsc', 'display at UCSC', 'as_ucsc_display_file', 'ucsc_links' )
+ self.add_display_app( 'gbrowse', 'display in Gbrowse', 'as_gbrowse_display_file', 'gbrowse_links' )
+ def get_estimated_display_viewport( self, dataset ):
+ value = ( "", "", "" )
+ num_check_lines = 100 # only check up to this many non empty lines
+ for i, line in enumerate( file( dataset.file_name ) ):
+ line = line.rstrip( '\r\n' )
+ if line and line.startswith( "browser" ):
+ chr_info = line.split()[-1]
+ wig_chr, coords = chr_info.split( ":" )
+ start, end = coords.split( "-" )
+ value = ( wig_chr, start, end )
+ break
+ if i > num_check_lines:
+ break
+ return value
+ def _get_remote_call_url( self, redirect_url, site_name, dataset, type, app, base_url ):
+ """Retrieve the URL to call out to an external site and retrieve data.
+ This routes our external URL through a local galaxy instance which makes
+ the data available, followed by redirecting to the remote site with a
+ link back to the available information.
+ """
+ internal_url = "%s" % url_for( controller='dataset', dataset_id=dataset.id, action='display_at', filename='%s_%s' % ( type, site_name ) )
+ base_url = app.config.get( "display_at_callback", base_url )
+ if base_url.startswith( 'https://' ):
+ base_url = base_url.replace( 'https', 'http', 1 )
+ display_url = urllib.quote_plus( "%s%s/display_as?id=%i&display_app=%s&authz_method=display_at" % \
+ ( base_url, url_for( controller='root' ), dataset.id, type ) )
+ link = '%s?redirect_url=%s&display_url=%s' % ( internal_url, redirect_url, display_url )
+ return link
+ def _get_viewer_range( self, dataset ):
+ """Retrieve the chromosome, start, end for an external viewer."""
+ if dataset.has_data:
+ viewport_tuple = self.get_estimated_display_viewport( dataset )
+ if viewport_tuple:
+ chrom = viewport_tuple[0]
+ start = viewport_tuple[1]
+ stop = viewport_tuple[2]
+ return ( chrom, start, stop )
+ return ( None, None, None )
+ def gbrowse_links( self, dataset, type, app, base_url ):
+ ret_val = []
+ chrom, start, stop = self._get_viewer_range( dataset )
+ if chrom is not None:
+ for site_name, site_url in util.get_gbrowse_sites_by_build( dataset.dbkey ):
+ if site_name in app.config.gbrowse_display_sites:
+ redirect_url = urllib.quote_plus( "%s%s/?ref=%s&start=%s&stop=%s&eurl=%%s" % ( site_url, dataset.dbkey, chrom, start, stop ) )
+ link = self._get_remote_call_url( redirect_url, site_name, dataset, type, app, base_url )
+ ret_val.append( ( site_name, link ) )
+ return ret_val
+ def ucsc_links( self, dataset, type, app, base_url ):
+ ret_val = []
+ chrom, start, stop = self._get_viewer_range( dataset )
+ if chrom is not None:
+ for site_name, site_url in util.get_ucsc_by_build( dataset.dbkey ):
+ if site_name in app.config.ucsc_display_sites:
+ redirect_url = urllib.quote_plus( "%sdb=%s&position=%s:%s-%s&hgt.customText=%%s" % ( site_url, dataset.dbkey, chrom, start, stop ) )
+ link = self._get_remote_call_url( redirect_url, site_name, dataset, type, app, base_url )
+ ret_val.append( ( site_name, link ) )
+ return ret_val
def make_html_table( self, dataset ):
return Tabular.make_html_table( self, dataset, skipchars=['track', '#'] )
-
def set_meta( self, dataset, overwrite = True, **kwd ):
i = 0
for i, line in enumerate( file ( dataset.file_name ) ):
@@ -761,7 +811,6 @@
if do_break:
break
Tabular.set_meta( self, dataset, overwrite = overwrite, skip = i )
-
def sniff( self, filename ):
"""
Determines wether the file is in wiggle format
@@ -792,7 +841,6 @@
return False
except:
return False
-
def get_track_window(self, dataset, data, start, end):
"""
Assumes we have a numpy file.
@@ -817,7 +865,6 @@
y = data[ t_start : t_end ]
return zip(x.tolist(), y.tolist())
-
def get_track_resolution( self, dataset, start, end):
range = end - start
# Determine appropriate resolution to plot ~1000 points
@@ -826,7 +873,6 @@
resolution = min( resolution, 100000 )
resolution = max( resolution, 1 )
return resolution
-
def get_track_type( self ):
return "LineTrack"
@@ -882,8 +928,6 @@
except:
#return "."
return ('', '', '')
- def as_ucsc_display_file( self, dataset ):
- return open(dataset.file_name)
def ucsc_links( self, dataset, type, app, base_url ):
ret_val = []
if dataset.has_data:
@@ -948,58 +992,6 @@
return False
return True
-class GBrowseTrack ( Tabular ):
- """GMOD GBrowseTrack"""
- file_ext = "gbrowsetrack"
-
- def __init__(self, **kwd):
- """Initialize datatype, by adding GBrowse display app"""
- Tabular.__init__(self, **kwd)
- self.add_display_app ('c_elegans', 'display in Wormbase', 'as_gbrowse_display_file', 'gbrowse_links' )
-
- def set_readonly_meta( self, dataset, skip=1, **kwd ):
- """Resets the values of readonly metadata elements."""
- Tabular.set_readonly_meta( self, dataset, skip = skip, **kwd )
-
- def set_meta( self, dataset, overwrite = True, **kwd ):
- Tabular.set_meta( self, dataset, overwrite = overwrite, skip = 1 )
-
- def make_html_table( self, dataset ):
- return Tabular.make_html_table( self, dataset, skipchars=['track', '#'] )
-
- def get_estimated_display_viewport( self, dataset ):
- #TODO: fix me...
- return ('', '', '')
-
- def gbrowse_links( self, dataset, type, app, base_url ):
- ret_val = []
- if dataset.has_data:
- viewport_tuple = self.get_estimated_display_viewport(dataset)
- if viewport_tuple:
- chrom = viewport_tuple[0]
- start = viewport_tuple[1]
- stop = viewport_tuple[2]
- for site_name, site_url in util.get_gbrowse_sites_by_build(dataset.dbkey):
- if site_name in app.config.gbrowse_display_sites:
- display_url = urllib.quote_plus( "%s%s/display_as?id=%i&display_app=%s" % (base_url, url_for( controller='root' ), dataset.id, type) )
- link = "%sname=%s&ref=%s:%s..%s&eurl=%s" % (site_url, dataset.dbkey, chrom, start, stop, display_url )
- ret_val.append( (site_name, link) )
- return ret_val
-
- def as_gbrowse_display_file( self, dataset, **kwd ):
- """Returns file contents that can be displayed in GBrowse apps."""
- #TODO: fix me...
- return open(dataset.file_name)
-
- def sniff( self, filename ):
- """
- Determines whether the file is in gbrowsetrack format.
-
- GBrowseTrack files are built within Galaxy.
- TODO: Not yet sure what this file will look like. Fix this sniffer and add some unit tests here as soon as we know.
- """
- return False
-
if __name__ == '__main__':
import doctest, sys
doctest.testmod(sys.modules[__name__])
diff -r ebe3e881ac25 -r e73efc9387ee lib/galaxy/datatypes/tabular.py
--- a/lib/galaxy/datatypes/tabular.py Wed Oct 07 15:31:18 2009 -0400
+++ b/lib/galaxy/datatypes/tabular.py Wed Oct 07 16:37:48 2009 -0400
@@ -205,6 +205,10 @@
def display_peek( self, dataset ):
"""Returns formatted html of peek"""
return self.make_html_table( dataset )
+ def as_gbrowse_display_file( self, dataset, **kwd ):
+ return open( dataset.file_name )
+ def as_ucsc_display_file( self, dataset, **kwd ):
+ return open( dataset.file_name )
class Taxonomy( Tabular ):
def __init__(self, **kwd):
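
The viewport logic added to the Wiggle datatype above looks for a "browser" line near the top of the file and splits its last field into chromosome, start and end. A standalone sketch of that idea is given below, assuming the input is a list of text lines rather than a Galaxy dataset. The function name estimate_viewport is hypothetical, and unlike the diff it only accepts "browser position" lines, so that other browser directives are skipped rather than split.

```python
def estimate_viewport(lines, num_check_lines=100):
    """Return (chrom, start, end) as strings from a 'browser position' line.

    Hypothetical standalone helper; returns ('', '', '') when no such
    line is found among the first lines that are checked.
    """
    value = ("", "", "")
    for i, line in enumerate(lines):
        fields = line.rstrip("\r\n").split()
        # e.g. "browser position chr19:59304200-59310700"
        if len(fields) >= 3 and fields[:2] == ["browser", "position"]:
            chrom, coords = fields[2].split(":")
            start, end = coords.split("-")
            value = (chrom, start, end)
            break
        # Stop scanning once the line index passes the limit.
        if i > num_check_lines:
            break
    return value


wiggle_header = [
    "track type=wiggle_0 name=example",
    "browser position chr19:59304200-59310700",
    "variableStep chrom=chr19 span=150",
]
viewport = estimate_viewport(wiggle_header)
```

Returning empty strings when nothing matches follows the default value used in the diff, so callers can treat the result the same way in either case.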

09 Oct '09
details: http://www.bx.psu.edu/hg/galaxy/rev/6252781aa157
changeset: 2841:6252781aa157
user: Kelly Vincent <kpvincent(a)bx.psu.edu>
date: Wed Oct 07 12:43:15 2009 -0400
description:
Added fastq (generic) datatype and deleted fastqsolexa datatype
8 file(s) affected in this change:
datatypes_conf.xml.sample
lib/galaxy/datatypes/registry.py
lib/galaxy/datatypes/sequence.py
lib/galaxy/datatypes/test/1.fastq
lib/galaxy/datatypes/test/2.fastq
test-data/1.fastq
test-data/2gen.fastq
test/functional/test_sniffing_and_metadata_settings.py
diffs (380 lines):
diff -r ecb6d86a5a9c -r 6252781aa157 datatypes_conf.xml.sample
--- a/datatypes_conf.xml.sample Wed Oct 07 11:48:30 2009 -0400
+++ b/datatypes_conf.xml.sample Wed Oct 07 12:43:15 2009 -0400
@@ -22,11 +22,8 @@
<datatype extension="fasta" type="galaxy.datatypes.sequence:Fasta" display_in_upload="true">
<converter file="fasta_to_tabular_converter.xml" target_datatype="tabular"/>
</datatype>
+ <datatype extension="fastq" type="galaxy.datatypes.sequence:Fastq" display_in_upload="true"/>
<datatype extension="fastqsanger" type="galaxy.datatypes.sequence:FastqSanger" display_in_upload="true"/>
- <datatype extension="fastqsolexa" type="galaxy.datatypes.sequence:FastqSolexa" display_in_upload="true">
- <converter file="fastqsolexa_to_fasta_converter.xml" target_datatype="fasta"/>
- <converter file="fastqsolexa_to_qual_converter.xml" target_datatype="qualsolexa"/>
- </datatype>
<datatype extension="genetrack" type="galaxy.datatypes.tracks:GeneTrack"/>
<datatype extension="gff" type="galaxy.datatypes.interval:Gff" display_in_upload="true">
<converter file="gff_to_bed_converter.xml" target_datatype="bed"/>
@@ -200,8 +197,8 @@
<sniffer type="galaxy.datatypes.qualityscore:QualityScoreSOLiD"/>
<sniffer type="galaxy.datatypes.qualityscore:QualityScore454"/>
<sniffer type="galaxy.datatypes.sequence:Fasta"/>
- <sniffer type="galaxy.datatypes.sequence:FastqSolexa"/>
<sniffer type="galaxy.datatypes.sequence:FastqSanger"/>
+ <sniffer type="galaxy.datatypes.sequence:Fastq"/>
<sniffer type="galaxy.datatypes.interval:Wiggle"/>
<sniffer type="galaxy.datatypes.images:Html"/>
<sniffer type="galaxy.datatypes.sequence:Axt"/>
diff -r ecb6d86a5a9c -r 6252781aa157 lib/galaxy/datatypes/registry.py
--- a/lib/galaxy/datatypes/registry.py Wed Oct 07 11:48:30 2009 -0400
+++ b/lib/galaxy/datatypes/registry.py Wed Oct 07 12:43:15 2009 -0400
@@ -119,8 +119,8 @@
'customtrack' : interval.CustomTrack(),
'csfasta' : sequence.csFasta(),
'fasta' : sequence.Fasta(),
+ 'fastq' : sequence.Fastq(),
'fastqsanger' : sequence.FastqSanger(),
- 'fastqsolexa' : sequence.FastqSolexa(),
'gff' : interval.Gff(),
'gff3' : interval.Gff3(),
'genetrack' : tracks.GeneTrack(),
@@ -149,8 +149,8 @@
'customtrack' : 'text/plain',
'csfasta' : 'text/plain',
'fasta' : 'text/plain',
+ 'fastq' : 'text/plain',
'fastqsanger' : 'text/plain',
- 'fastqsolexa' : 'text/plain',
'gff' : 'text/plain',
'gff3' : 'text/plain',
'interval' : 'text/plain',
@@ -179,8 +179,8 @@
qualityscore.QualityScoreSOLiD(),
qualityscore.QualityScore454(),
sequence.Fasta(),
- sequence.FastqSolexa(),
sequence.FastqSanger(),
+ sequence.Fastq(),
interval.Wiggle(),
images.Html(),
sequence.Axt(),
diff -r ecb6d86a5a9c -r 6252781aa157 lib/galaxy/datatypes/sequence.py
--- a/lib/galaxy/datatypes/sequence.py Wed Oct 07 11:48:30 2009 -0400
+++ b/lib/galaxy/datatypes/sequence.py Wed Oct 07 12:43:15 2009 -0400
@@ -1,5 +1,5 @@
"""
-Image classes
+Sequence classes
"""
import data
@@ -134,10 +134,10 @@
pass
return False
-class FastqSolexa( Sequence ):
- """Class representing a FASTQ sequence ( the Solexa variant )"""
- file_ext = "fastqsolexa"
-
+class Fastq ( Sequence ):
+ """Class representing a generic FASTQ sequence"""
+ file_ext = "fastq"
+
def set_peek( self, dataset ):
if not dataset.dataset.purged:
dataset.peek = data.get_file_peek( dataset.file_name )
@@ -145,102 +145,46 @@
else:
dataset.peek = 'file does not exist'
dataset.blurb = 'file purged from disk'
-
- def sniff( self, filename ):
+
+ def sniff ( self, filename ):
"""
- Determines whether the file is in fastqsolexa format (Solexa Variant)
+ Determines whether the file is in generic fastq format
For details, see http://maq.sourceforge.net/fastq.shtml
- Note: There are two kinds of FASTQ files, known as "Sanger" (sometimes called "Standard") and Solexa
+ Note: There are three kinds of FASTQ files, known as "Sanger" (sometimes called "Standard"), Solexa, and Illumina
These differ in the representation of the quality scores
- >>> fname = get_test_fname( '1.fastqsolexa' )
- >>> FastqSolexa().sniff( fname )
+ >>> fname = get_test_fname( '1.fastqsanger' )
+ >>> Fastq().sniff( fname )
True
- >>> fname = get_test_fname( '2.fastqsolexa' )
- >>> FastqSolexa().sniff( fname )
+ >>> fname = get_test_fname( '2.fastqsanger' )
+ >>> Fastq().sniff( fname )
True
"""
headers = get_headers( filename, None )
- bases_regexp = re.compile( "^[NGTAC]*$" )
+ bases_regexp = re.compile( "^[NGTAC]*" )
+ # check that first block looks like a fastq block
try:
if len( headers ) >= 4 and headers[0][0] and headers[0][0][0] == "@" and headers[2][0] and headers[2][0][0] == "+" and headers[1][0]:
# Check the sequence line, make sure it contains only G/C/A/T/N
if not bases_regexp.match( headers[1][0] ):
return False
-
- # Check quality score: integer or ascii char.
- try:
- check = int(headers[3][0])
- qscore_int = True
- except:
- qscore_int = False
-
- # check length and range of quality scores
- if qscore_int:
- if len( headers[3] ) != len( headers[1][0] ):
- return False
- if not self.check_qual_values_within_range(headers[3], 'int'):
- return False
- try:
- if not self.check_qual_values_within_range(headers[7], 'int'):
- return False
- try:
- if not self.check_qual_values_within_range(headers[11], 'int'):
- return False
- except IndexError:
- pass
- except IndexError:
- pass
- else:
- if len( headers[3][0] ) != len( headers[1][0] ):
- return False
- if not self.check_qual_values_within_range(headers[3][0], 'char'):
- return False
- try:
- if not self.check_qual_values_within_range(headers[7][0], 'char'):
- return False
- try:
- if not self.check_qual_values_within_range(headers[11][0], 'char'):
- return False
- except IndexError:
- pass
- except IndexError:
- pass
return True
return False
except:
return False
- def check_qual_values_within_range( self, qual_seq, score_type ):
- if score_type == 'char':
- for val in qual_seq:
- if ord(val) < 59 or ord(val) > 104:
- return False
- elif score_type == 'int':
- for val in qual_seq:
- if int(val) < -5 or int(val) > 40:
- return False
- return True
-
-class FastqSanger( Sequence ):
+
+class FastqSanger( Fastq ):
"""Class representing a FASTQ sequence ( the Sanger variant )"""
file_ext = "fastqsanger"
-
- def set_peek( self, dataset ):
- if not dataset.dataset.purged:
- dataset.peek = data.get_file_peek( dataset.file_name )
- dataset.blurb = data.nice_size( dataset.get_size() )
- else:
- dataset.peek = 'file does not exist'
- dataset.blurb = 'file purged from disk'
def sniff( self, filename ):
"""
Determines whether the file is in fastqsanger format (Sanger Variant)
For details, see http://maq.sourceforge.net/fastq.shtml
- Note: There are two kinds of FASTQ files, known as "Sanger" (sometimes called "Standard") and Solexa
+ Note: There are three kinds of FASTQ files, known as "Sanger" (sometimes called "Standard"), Solexa, and Illumina
These differ in the representation of the quality scores
>>> fname = get_test_fname( '1.fastqsanger' )
@@ -254,60 +198,33 @@
bases_regexp = re.compile( "^[NGTAC]*$" )
try:
if len( headers ) >= 4 and headers[0][0] and headers[0][0][0] == "@" and headers[2][0] and headers[2][0][0] == "+" and headers[1][0]:
- # Check the sequence line, make sure it contains only G/C/A/T/N
- if not bases_regexp.match( headers[1][0] ):
- return False
- # Check quality score: integer or ascii char.
- try:
- check = int(headers[3][0])
- qscore_int = True
- except:
- qscore_int = False
-
- # check length and range of quality scores
- if qscore_int:
- if len( headers[3] ) != len( headers[1][0] ):
- return False
- if not self.check_qual_values_within_range(headers[3], 'int'):
- return False
+ # look through first 20 blocks and make sure bases valid and qualities valid
+ for i in range( 1, 80, 4 ):
try:
- if not self.check_qual_values_within_range(headers[7], 'int'):
+ # check that bases are legitimate
+ if not bases_regexp.match( headers[i][0] ):
return False
- try:
- if not self.check_qual_values_within_range(headers[11], 'int'):
- return False
- except IndexError:
- pass
+ # check length of qualities (matching bases)
+ if len( headers[i+2][0] ) != len( headers[1][0] ):
+ return False
+ # check qualities within fastqsanger range
+ if not self.check_qual_values_within_range( headers[i+2][0] ):
+ return False
except IndexError:
pass
- else:
- if len( headers[3][0] ) != len( headers[1][0] ):
- return False
- if not self.check_qual_values_within_range(headers[3][0], 'char'):
- return False
- try:
- if not self.check_qual_values_within_range(headers[7][0], 'char'):
- return False
- try:
- if not self.check_qual_values_within_range(headers[11][0], 'char'):
- return False
- except IndexError:
- pass
- except IndexError:
- pass
- return True
- return False
+ return True
+ return False
except:
return False
- def check_qual_values_within_range( self, qual_seq, score_type ):
- if score_type == 'char':
- for val in qual_seq:
- if ord(val) >= 33 and ord(val) <= 126:
- return True
- elif score_type == 'int':
- for val in qual_seq:
- if int(val) >= 0 and int(val) <= 93:
- return True
+ def check_qual_values_within_range( self, qual_seq ):
+ under59 = False
+ for val in qual_seq:
+ if ord(val) < 33 or ord(val) > 126:
+ return False
+ if not under59 and ord(val) < 59:
+ under59 = True
+ if under59:
+ return True
return False
try:
@@ -521,7 +438,7 @@
>>> fname = get_test_fname( 'alignment.lav' )
>>> Axt().sniff( fname )
False
- """
+ """
headers = get_headers( filename, None )
if len(headers) < 4:
return False
diff -r ecb6d86a5a9c -r 6252781aa157 lib/galaxy/datatypes/test/1.fastq
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/lib/galaxy/datatypes/test/1.fastq Wed Oct 07 12:43:15 2009 -0400
@@ -0,0 +1,8 @@
+@HANNIBAL_1_FC302VTAAXX:2:1:228:167
+GAATTGATCAGGACATAGGACAACTGTAGGCACCAT
++HANNIBAL_1_FC302VTAAXX:2:1:228:167
+40 40 40 40 35 40 40 40 25 40 40 26 40 9 33 11 40 35 17 40 40 33 40 7 9 15 3 22 15 30 11 17 9 4 9 4
+@HANNIBAL_1_FC302VTAAXX:2:1:156:340
+GAGTTCTCGTCGCCTGTAGGCACCATCAATCGTATG
++HANNIBAL_1_FC302VTAAXX:2:1:156:340
+40 15 40 17 6 36 40 40 40 25 40 9 35 33 40 14 14 18 15 17 19 28 31 4 24 18 27 14 15 18 2 8 12 8 11 9
\ No newline at end of file
diff -r ecb6d86a5a9c -r 6252781aa157 lib/galaxy/datatypes/test/2.fastq
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/lib/galaxy/datatypes/test/2.fastq Wed Oct 07 12:43:15 2009 -0400
@@ -0,0 +1,8 @@
+@seq1
+GACAGCTTGGTTTTTAGTGAGTTGTTCCTTTCTTT
++seq1
+hhhhhhhhhhhhhhhhhhhhhhhhhhPW@hhhhhh
+@seq2
+GCAATGACGGCAGCAATAAACTCAACAGGTGCTGG
++seq2
+hhhhhhhhhhhhhhYhhahhhhWhAhFhSIJGChO
\ No newline at end of file
diff -r ecb6d86a5a9c -r 6252781aa157 test-data/1.fastq
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/1.fastq Wed Oct 07 12:43:15 2009 -0400
@@ -0,0 +1,8 @@
+@HANNIBAL_1_FC302VTAAXX:2:1:228:167
+GAATTGATCAGGACATAGGACAACTGTAGGCACCAT
++HANNIBAL_1_FC302VTAAXX:2:1:228:167
+40 40 40 40 35 40 40 40 25 40 40 26 40 9 33 11 40 35 17 40 40 33 40 7 9 15 3 22 15 30 11 17 9 4 9 4
+@HANNIBAL_1_FC302VTAAXX:2:1:156:340
+GAGTTCTCGTCGCCTGTAGGCACCATCAATCGTATG
++HANNIBAL_1_FC302VTAAXX:2:1:156:340
+40 15 40 17 6 36 40 40 40 25 40 9 35 33 40 14 14 18 15 17 19 28 31 4 24 18 27 14 15 18 2 8 12 8 11 9
\ No newline at end of file
diff -r ecb6d86a5a9c -r 6252781aa157 test-data/2gen.fastq
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/2gen.fastq Wed Oct 07 12:43:15 2009 -0400
@@ -0,0 +1,8 @@
+@seq1
+GACAGCTTGGTTTTTAGTGAGTTGTTCCTTTCTTT
++seq1
+hhhhhhhhhhhhhhhhhhhhhhhhhhPW@hhhhhh
+@seq2
+GCAATGACGGCAGCAATAAACTCAACAGGTGCTGG
++seq2
+hhhhhhhhhhhhhhYhhahhhhWhAhFhSIJGChO
\ No newline at end of file
diff -r ecb6d86a5a9c -r 6252781aa157 test/functional/test_sniffing_and_metadata_settings.py
--- a/test/functional/test_sniffing_and_metadata_settings.py Wed Oct 07 11:48:30 2009 -0400
+++ b/test/functional/test_sniffing_and_metadata_settings.py Wed Oct 07 12:43:15 2009 -0400
@@ -81,16 +81,6 @@
assert latest_hda is not None, "Problem retrieving fasta hda from the database"
if not latest_hda.name == '1.fasta' and not latest_hda.extension == 'fasta':
raise AssertionError, "fasta data type was not correctly sniffed."
- def test_030_fastqsolexa_datatype( self ):
- """Testing correctly sniffing fastqsolexa ( the Solexa variant ) data type upon upload"""
- self.upload_file( '1.fastqsolexa' )
- self.verify_dataset_correctness( '1.fastqsolexa' )
- self.check_history_for_string( '1.fastqsolexa format: <span class="fastqsolexa">fastqsolexa</span>, database: \? Info: uploaded fastqsolexa file' )
- latest_hda = galaxy.model.HistoryDatasetAssociation.query() \
- .order_by( desc( galaxy.model.HistoryDatasetAssociation.table.c.create_time ) ).first()
- assert latest_hda is not None, "Problem retrieving fastqsolexa hda from the database"
- if not latest_hda.name == '1.fastqsolexa' and not latest_hda.extension == 'fastqsolexa':
- raise AssertionError, "fastqsolexa data type was not correctly sniffed."
def test_035_gff_datatype( self ):
"""Testing correctly sniffing gff data type upon upload"""
self.upload_file( '5.gff' )
@@ -236,6 +226,16 @@
assert latest_hda is not None, "Problem retrieving sam hda from the database"
if not latest_hda.name == '1.sam' and not latest_hda.extension == 'sam':
raise AssertionError, "sam data type was not correctly sniffed."
+ def test_095_fastq_datatype( self ):
+ """Testing correctly sniffing fastq ( generic ) data type upon upload"""
+ self.upload_file( '2gen.fastq' )
+ self.verify_dataset_correctness( '2gen.fastq' )
+ self.check_history_for_string( '2gen.fastq format: <span class="fastq">fastq</span>, database: \? Info: uploaded fastq file' )
+ latest_hda = galaxy.model.HistoryDatasetAssociation.query() \
+ .order_by( desc( galaxy.model.HistoryDatasetAssociation.table.c.create_time ) ).first()
+ assert latest_hda is not None, "Problem retrieving fastq hda from the database"
+ if not latest_hda.name == '2gen.fastq' and not latest_hda.extension == 'fastq':
+ raise AssertionError, "fastq data type was not correctly sniffed."
def test_9999_clean_up( self ):
self.delete_history( id=self.security.encode_id( history1.id ) )
self.logout()
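
Note on the sniffing checks in the tests above: a condition of the form `not name == X and not extension == Y` raises only when both the name and the extension are wrong, so a dataset with the expected name but a mis-sniffed extension would still pass. Below is a minimal Python 3 sketch of a stricter check, assuming the intent is that both values must match. `check_sniffed` is a hypothetical helper for illustration and is not part of the Galaxy test suite.

```python
def check_sniffed(name, extension, expected_name, expected_ext):
    """Raise AssertionError unless both the name and the extension match."""
    # `or` makes either mismatch fatal; the `and` form used in the tests
    # above only fails when both values are wrong at the same time.
    if name != expected_name or extension != expected_ext:
        raise AssertionError("%s data type was not correctly sniffed." % expected_ext)
    return True
```

With this form, a dataset named `2gen.fastq` whose extension was sniffed as something other than `fastq` would be reported as a failure.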

09 Oct '09
details: http://www.bx.psu.edu/hg/galaxy/rev/ecb6d86a5a9c
changeset: 2840:ecb6d86a5a9c
user: jeremy goecks <jeremy.goecks at emory.edu>
date: Wed Oct 07 11:48:30 2009 -0400
description:
Reorganization and rewording of history Options menu.
1 file(s) affected in this change:
templates/root/index.mako
diffs (56 lines):
diff -r 3ad620871b25 -r ecb6d86a5a9c templates/root/index.mako
--- a/templates/root/index.mako Wed Oct 07 11:18:14 2009 -0400
+++ b/templates/root/index.mako Wed Oct 07 11:48:30 2009 -0400
@@ -6,12 +6,18 @@
$(function(){
$("#history-options-button").css( "position", "relative" );
make_popupmenu( $("#history-options-button"), {
- "List your histories": null,
- "Stored by you": function() {
+ "History Lists": null,
+ "My Saved Histories": function() {
galaxy_main.location = "${h.url_for( controller='history', action='list')}";
},
+ "My Shared Histories": function() {
+ galaxy_main.location = "${h.url_for( controller='history', action='list', operation='sharing' )}";
+ },
+ "Histories Shared with Me": function() {
+ galaxy_main.location = "${h.url_for( controller='history', action='list_shared')}";
+ },
"Current History": null,
- "Create new": function() {
+ "Create New": function() {
galaxy_history.location = "${h.url_for( controller='root', action='history_new' )}";
},
"Clone": function() {
@@ -20,13 +26,13 @@
"Share": function() {
galaxy_main.location = "${h.url_for( controller='history', action='share' )}";
},
- "Extract workflow": function() {
+ "Extract Workflow": function() {
galaxy_main.location = "${h.url_for( controller='workflow', action='build_from_current_history' )}";
},
- "Dataset security": function() {
+ "Dataset Security": function() {
galaxy_main.location = "${h.url_for( controller='root', action='history_set_default_permissions' )}";
},
- "Show deleted datasets": function() {
+ "Show Deleted Datasets": function() {
galaxy_history.location = "${h.url_for( controller='root', action='history', show_deleted=True)}";
},
"Delete": function()
@@ -35,13 +41,6 @@
{
galaxy_main.location = "${h.url_for( controller='history', action='delete_current' )}";
}
- },
- "Manage shared histories": null,
- "Shared by you": function() {
- galaxy_main.location = "${h.url_for( controller='history', action='list', operation='sharing' )}";
- },
- "Shared with you": function() {
- galaxy_main.location = "${h.url_for( controller='history', action='list_shared')}";
}
});
});

09 Oct '09
details: http://www.bx.psu.edu/hg/galaxy/rev/31c577c6fd49
changeset: 2842:31c577c6fd49
user: guru
date: Wed Oct 07 15:25:16 2009 -0400
description:
Fixed bug in bed_to_interval_index converter.
1 file(s) affected in this change:
lib/galaxy/datatypes/converters/bed_to_interval_index_converter.py
diffs (14 lines):
diff -r 6252781aa157 -r 31c577c6fd49 lib/galaxy/datatypes/converters/bed_to_interval_index_converter.py
--- a/lib/galaxy/datatypes/converters/bed_to_interval_index_converter.py Wed Oct 07 12:43:15 2009 -0400
+++ b/lib/galaxy/datatypes/converters/bed_to_interval_index_converter.py Wed Oct 07 15:25:16 2009 -0400
@@ -15,8 +15,8 @@
offset = 0
for line in open(input_fname, "r"):
- feature = line.split()
- if not feature or feature[0] == "track" or feature[0] == "#":
+ feature = line.strip().split()
+ if not feature or feature[0].startswith("track") or feature[0].startswith("#"):
offset += len(line)
continue
chrom = feature[0]
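
Note on why this fix matters: `line.split()` on a line such as `#comment` yields `"#comment"` as its first field, so the old equality test against `"#"` never matched it, while `startswith` does. The following is a minimal Python 3 sketch of the corrected header test, not the converter itself. The names `is_header_line` and `data_line_offsets` are hypothetical, and the sketch assumes that the offset advances by `len(line)` for data lines as well as skipped ones.

```python
def is_header_line(line):
    """Return True for blank lines, 'track' lines and '#' comments."""
    fields = line.strip().split()
    return (not fields
            or fields[0].startswith("track")
            or fields[0].startswith("#"))


def data_line_offsets(lines):
    """Yield (offset, chrom) for each data line, skipping header lines."""
    offset = 0
    for line in lines:
        if not is_header_line(line):
            yield offset, line.split()[0]
        # Assumption: the offset advances for every line, skipped or not.
        offset += len(line)
```

Because skipped lines still advance the offset, the recorded positions keep pointing at the start of each data line in the original file.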

09 Oct '09
details: http://www.bx.psu.edu/hg/galaxy/rev/cb842f737d46
changeset: 2837:cb842f737d46
user: Kanwei Li <kanwei(a)gmail.com>
date: Tue Oct 06 23:47:08 2009 -0400
description:
refactor trackster, better failure message
6 file(s) affected in this change:
lib/galaxy/datatypes/indexers/coverage.py
lib/galaxy/datatypes/indexers/wiggle.py
lib/galaxy/web/controllers/tracks.py
static/scripts/trackster.js
templates/tracks/browser.mako
templates/tracks/new_browser.mako
diffs (252 lines):
diff -r b25297e88f96 -r cb842f737d46 lib/galaxy/datatypes/indexers/coverage.py
--- a/lib/galaxy/datatypes/indexers/coverage.py Tue Oct 06 21:25:01 2009 -0400
+++ b/lib/galaxy/datatypes/indexers/coverage.py Tue Oct 06 23:47:08 2009 -0400
@@ -2,7 +2,7 @@
"""
Read a chromosome of coverage data, and write it as a npy array, as
-well as averages over regions of progessively larger size in powers of 10
+well as averages over regions of progressively larger size in powers of 10
"""
from __future__ import division
diff -r b25297e88f96 -r cb842f737d46 lib/galaxy/datatypes/indexers/wiggle.py
--- a/lib/galaxy/datatypes/indexers/wiggle.py Tue Oct 06 21:25:01 2009 -0400
+++ b/lib/galaxy/datatypes/indexers/wiggle.py Tue Oct 06 23:47:08 2009 -0400
@@ -2,7 +2,7 @@
"""
Read a chromosome of wiggle data, and write it as a npy array, as
-well as averages over regions of progessively larger size in powers of 10
+well as averages over regions of progressively larger size in powers of 10
"""
from __future__ import division
diff -r b25297e88f96 -r cb842f737d46 lib/galaxy/web/controllers/tracks.py
--- a/lib/galaxy/web/controllers/tracks.py Tue Oct 06 21:25:01 2009 -0400
+++ b/lib/galaxy/web/controllers/tracks.py Tue Oct 06 23:47:08 2009 -0400
@@ -49,7 +49,8 @@
# FIXME: hardcoding this for now, but it should be derived from the available
# converters
-browsable_types = set( ["wig", "bed" ] )
+browsable_types = ( "wig", "bed" )
+
class TracksController( BaseController ):
"""
@@ -92,7 +93,7 @@
if dataset.metadata.dbkey == dbkey and dataset.extension in browsable_types:
datasets[dataset.id] = (dataset.extension, dataset.name)
# Render the template
- return trans.fill_template( "tracks/new_browser.mako", dbkey=dbkey, dbkey_set=dbkey_set, datasets=datasets )
+ return trans.fill_template( "tracks/new_browser.mako", converters=browsable_types, dbkey=dbkey, dbkey_set=dbkey_set, datasets=datasets )
@web.expose
def browser(self, trans, dataset_ids, chrom=""):
diff -r b25297e88f96 -r cb842f737d46 static/scripts/trackster.js
--- a/static/scripts/trackster.js Tue Oct 06 21:25:01 2009 -0400
+++ b/static/scripts/trackster.js Tue Oct 06 23:47:08 2009 -0400
@@ -5,35 +5,6 @@
var DENSITY = 1000,
DATA_ERROR = "There was an error in indexing this dataset.",
DATA_NONE = "No data for this chrom/contig.";
-
-var DataCache = function( type, track ) {
- this.type = type;
- this.track = track;
- this.cache = Object();
-};
-$.extend( DataCache.prototype, {
- get: function( resolution, position ) {
- var cache = this.cache;
- if ( !( cache[resolution] && cache[resolution][position] ) ) {
- if ( !cache[resolution] ) {
- cache[resolution] = Object();
- }
- var low = position * DENSITY * resolution;
- var high = ( position + 1 ) * DENSITY * resolution;
- cache[resolution][position] = { state: "loading" };
-
- $.getJSON( data_url, { track_type: this.track.track_type, chrom: this.track.view.chrom, low: low, high: high, dataset_id: this.track.dataset_id }, function ( data ) {
- if( data == "pending" ) {
- setTimeout( fetcher, 5000 );
- } else {
- cache[resolution][position] = { state: "loaded", values: data };
- }
- $(document).trigger( "redraw" );
- });
- }
- return cache[resolution][position];
- }
-});
var View = function( chrom, max_length ) {
this.chrom = chrom;
@@ -234,7 +205,7 @@
this.container_div.addClass( "line-track" );
this.content_div.css( "height", this.height_px + "px" );
this.dataset_id = dataset_id;
- this.cache = new DataCache( "", this );
+ this.cache = new Cache(50);
};
$.extend( LineTrack.prototype, TiledTrack.prototype, {
init: function() {
@@ -254,6 +225,21 @@
}
});
},
+ get_data: function( resolution, position ) {
+ var key = resolution + '-' + position,
+ cache = this.cache;
+
+ if ( !cache[key] ) {
+ var low = position * DENSITY * resolution,
+ high = ( position + 1 ) * DENSITY * resolution;
+
+ $.getJSON( data_url, { track_type: this.track_type, chrom: this.view.chrom, low: low, high: high, dataset_id: this.dataset_id }, function ( data ) {
+ cache[key] = data;
+ $(document).trigger( "redraw" );
+ });
+ }
+ return cache[key];
+ },
draw_tile: function( resolution, tile_index, parent_element, w_scale, h_scale ) {
if (!this.vertical_range) // We don't have the necessary information yet
return;
@@ -261,13 +247,13 @@
var tile_low = tile_index * DENSITY * resolution,
tile_high = ( tile_index + 1 ) * DENSITY * resolution,
tile_length = DENSITY * resolution;
- var chunk = this.cache.get( resolution, tile_index );
- var element;
- if ( chunk.state == "loading" ) {
- element = $("<div class='loading tile'></div>");
- } else {
- element = $("<canvas class='tile'></canvas>");
+ var data = this.get_data( resolution, tile_index );
+ if ( !data ) {
+ in_path = false;
+ return null;
}
+ var element = $("<canvas class='tile'></canvas>");
+
element.css( {
position: "absolute",
top: 0,
@@ -275,18 +261,13 @@
});
parent_element.append( element );
// Chunk is still loading, do nothing
- if ( chunk.state == "loading" ) {
- in_path = false;
- return null;
- }
+
var canvas = element;
canvas.get(0).width = Math.ceil( tile_length * w_scale );
canvas.get(0).height = this.height_px;
var ctx = canvas.get(0).getContext("2d");
var in_path = false;
ctx.beginPath();
- var data = chunk.values;
- if (!data) return;
for ( var i = 0; i < data.length - 1; i++ ) {
var x = data[i][0] - tile_low;
var y = data[i][1];
diff -r b25297e88f96 -r cb842f737d46 templates/tracks/browser.mako
--- a/templates/tracks/browser.mako Tue Oct 06 21:25:01 2009 -0400
+++ b/templates/tracks/browser.mako Tue Oct 06 23:47:08 2009 -0400
@@ -7,7 +7,7 @@
<%def name="javascripts()">
${parent.javascripts()}
-${h.js( "jquery", "jquery.event.drag", "jquery.mousewheel", "trackster" )}
+${h.js( "jquery", "jquery.event.drag", "jquery.mousewheel", "lrucache", "trackster" )}
<script type="text/javascript">
diff -r b25297e88f96 -r cb842f737d46 templates/tracks/new_browser.mako
--- a/templates/tracks/new_browser.mako Tue Oct 06 21:25:01 2009 -0400
+++ b/templates/tracks/new_browser.mako Tue Oct 06 23:47:08 2009 -0400
@@ -11,39 +11,48 @@
</script>
</%def>
-<div class="form">
- <div class="form-title">Select datasets to include in browser</div>
- <div id="dbkey" class="form-body">
- <form id="form" method="POST">
- <div class="form-row">
- <label for="dbkey">Reference genome build (dbkey): </label>
- <div class="form-row-input">
- <select name="dbkey" id="dbkey" refresh_on_change="true">
- %for tmp_dbkey in dbkey_set:
- <option value="${tmp_dbkey}"
- %if tmp_dbkey == dbkey:
- selected="selected"
- %endif
- >${tmp_dbkey}</option>
- %endfor
- </select>
+% if not converters:
+ <div class="errormessagelarge">
+ There are no available converters needed for visualization. Please verify that your tool_conf.xml file contains
+ converters for datatypes (see tool_conf.xml.sample) for examples.
+ </div>
+
+% else:
+ <div class="form">
+ <div class="form-title">Select datasets to include in browser</div>
+
+ <div id="dbkey" class="form-body">
+ <form id="form" method="POST">
+ <div class="form-row">
+ <label for="dbkey">Reference genome build (dbkey): </label>
+ <div class="form-row-input">
+ <select name="dbkey" id="dbkey" refresh_on_change="true">
+ %for tmp_dbkey in dbkey_set:
+ <option value="${tmp_dbkey}"
+ %if tmp_dbkey == dbkey:
+ selected="selected"
+ %endif
+ >${tmp_dbkey}</option>
+ %endfor
+ </select>
+ </div>
+ <div style="clear: both;"></div>
</div>
- <div style="clear: both;"></div>
+ <div class="form-row">
+ <label for="dataset_ids">Datasets to include: </label>
+ %for dataset_id, (dataset_ext, dataset_name) in datasets.iteritems():
+ <div>
+ <input type="checkbox" id="${dataset_id}" name="dataset_ids" value="${dataset_id}" />
+ <label style="display:inline; font-weight: normal" for="${dataset_id}">[${dataset_ext}] ${dataset_name}</label>
+ </div>
+ %endfor
+
+ <div style="clear: both;"></div>
+ </div>
</div>
<div class="form-row">
- <label for="dataset_ids">Datasets to include: </label>
- %for dataset_id, (dataset_ext, dataset_name) in datasets.iteritems():
- <div>
- <input type="checkbox" id="${dataset_id}" name="dataset_ids" value="${dataset_id}" />
- <label style="display:inline; font-weight: normal" for="${dataset_id}">[${dataset_ext}] ${dataset_name}</label>
- </div>
- %endfor
-
- <div style="clear: both;"></div>
+ <input type="submit" name="browse" value="Browse"/>
</div>
- </div>
- <div class="form-row">
- <input type="submit" name="browse" value="Browse"/>
- </div>
- </form>
-</div>
+ </form>
+ </div>
+% endif
details: http://www.bx.psu.edu/hg/galaxy/rev/3ad620871b25
changeset: 2839:3ad620871b25
user: Anton Nekrutenko <anton(a)bx.psu.edu>
date: Wed Oct 07 11:18:14 2009 -0400
description:
ngs updates
6 file(s) affected in this change:
tool_conf.xml.sample
tools/fastx_toolkit/fastq_quality_converter.xml
tools/fastx_toolkit/fastq_to_fasta.xml
tools/fastx_toolkit/fastx_quality_statistics.xml
tools/metag_tools/split_paired_reads.xml
tools/next_gen_conversion/fastq_gen_conv.xml
diffs (251 lines):
diff -r 9a75d2428e21 -r 3ad620871b25 tool_conf.xml.sample
--- a/tool_conf.xml.sample Wed Oct 07 11:12:39 2009 -0400
+++ b/tool_conf.xml.sample Wed Oct 07 11:18:14 2009 -0400
@@ -72,10 +72,6 @@
<tool file="maf/maf_to_fasta.xml" />
<tool file="fasta_tools/tabular_to_fasta.xml" />
<tool file="fastx_toolkit/fastq_to_fasta.xml" />
- <tool file="next_gen_conversion/solid_to_fastq.xml" />
- <tool file="next_gen_conversion/fastq_conversions.xml" />
- <tool file="fastx_toolkit/fastq_quality_converter.xml" />
- <tool file="next_gen_conversion/fastq_gen_conv.xml" />
</section>
<section name="Extract Features" id="features">
<tool file="filters/ucsc_gene_bed_to_exon_bed.xml" />
@@ -175,32 +171,27 @@
</section>
<section name="NGS: QC and manipulation" id="cshl_library_information">
<label text="Generic FASTQ data" id="fastq" />
+ <tool file="next_gen_conversion/fastq_gen_conv.xml" />
+ <tool file="fastx_toolkit/fastq_quality_converter.xml" />
<tool file="fastx_toolkit/fastx_quality_statistics.xml" />
<tool file="fastx_toolkit/fastq_quality_boxplot.xml" />
<tool file="fastx_toolkit/fastx_nucleotides_distribution.xml" />
- <!-- <tool file="fastx_toolkit/fasta_clipping_histogram.xml" /> -->
- <!-- <tool file="fastx_toolkit/fastx_clipper.xml" /> -->
- <tool file="fastx_toolkit/fastx_trimmer.xml" />
- <tool file="fastx_toolkit/fastx_renamer.xml" />
- <tool file="fastx_toolkit/fastx_reverse_complement.xml" />
- <tool file="fastx_toolkit/fastx_artifacts_filter.xml" />
- <tool file="fastx_toolkit/fastq_quality_filter.xml" />
- <!--<tool file="fastx_toolkit/fastx_barcode_splitter.xml" />-->
<tool file="metag_tools/split_paired_reads.xml" />
<label text="Roche-454 data" id="454" />
<tool file="metag_tools/short_reads_figure_score.xml" />
<tool file="metag_tools/short_reads_trim_seq.xml" />
<label text="AB-SOLiD data" id="solid" />
+ <tool file="next_gen_conversion/solid_to_fastq.xml" />
<tool file="solid_tools/solid_qual_stats.xml" />
<tool file="solid_tools/solid_qual_boxplot.xml" />
</section>
<section name="NGS: Mapping" id="solexa_tools">
<!-- <tool file="sr_mapping/lastz_wrapper.xml" /> -->
+ <tool file="sr_mapping/bowtie_wrapper.xml" />
+ <tool file="sr_mapping/bwa_wrapper.xml" />
<tool file="metag_tools/megablast_wrapper.xml" />
<tool file="metag_tools/megablast_xml_parser.xml" />
- <tool file="sr_mapping/bowtie_wrapper.xml" />
- <tool file="sr_mapping/bwa_wrapper.xml" />
- </section>
+ </section>
<section name="NGS: SAM Tools" id="samtools">
<tool file="samtools/sam_bitwise_flag_filter.xml" />
<tool file="samtools/sam2interval.xml" />
diff -r 9a75d2428e21 -r 3ad620871b25 tools/fastx_toolkit/fastq_quality_converter.xml
--- a/tools/fastx_toolkit/fastq_quality_converter.xml Wed Oct 07 11:12:39 2009 -0400
+++ b/tools/fastx_toolkit/fastq_quality_converter.xml Wed Oct 07 11:18:14 2009 -0400
@@ -2,7 +2,7 @@
<description>(ASCII-Numeric)</description>
<command>zcat -f $input | fastq_quality_converter $QUAL_FORMAT -o $output -Q $offset</command>
<inputs>
- <param format="fastqsolexa,fastqsanger" name="input" type="data" label="Library to convert" />
+ <param format="fastq" name="input" type="data" label="Library to convert" />
<param name="QUAL_FORMAT" type="select" label="Desired output format">
<option value="-a">ASCII (letters) quality scores</option>
@@ -11,7 +11,7 @@
<param name="offset" type="select" label="FASTQ ASCII offset">
<option value="33">33</option>
- <option value="64">64</option>
+ <option selected="true" value="64">64</option>
</param>
</inputs>
@@ -47,7 +47,7 @@
</tests>
<outputs>
- <data format="fastqsolexa" name="output" metadata_source="input" />
+ <data format="fastq" name="output" metadata_source="input" />
</outputs>
<help>
diff -r 9a75d2428e21 -r 3ad620871b25 tools/fastx_toolkit/fastq_to_fasta.xml
--- a/tools/fastx_toolkit/fastq_to_fasta.xml Wed Oct 07 11:12:39 2009 -0400
+++ b/tools/fastx_toolkit/fastq_to_fasta.xml Wed Oct 07 11:18:14 2009 -0400
@@ -3,7 +3,7 @@
<command>gunzip -cf $input | fastq_to_fasta $SKIPN $RENAMESEQ -o $output -v </command>
<inputs>
- <param format="fastqsolexa,fastqsanger" name="input" type="data" label="FASTQ Library to convert" />
+ <param format="fastq" name="input" type="data" label="FASTQ Library to convert" />
<param name="SKIPN" type="select" label="Discard sequences with unknown (N) bases ">
<option value="">yes</option>
diff -r 9a75d2428e21 -r 3ad620871b25 tools/fastx_toolkit/fastx_quality_statistics.xml
--- a/tools/fastx_toolkit/fastx_quality_statistics.xml Wed Oct 07 11:12:39 2009 -0400
+++ b/tools/fastx_toolkit/fastx_quality_statistics.xml Wed Oct 07 11:18:14 2009 -0400
@@ -3,11 +3,8 @@
<command>zcat -f $input | fastx_quality_stats -o $output -Q $offset</command>
<inputs>
- <param format="fasta,fastqsolexa,fastqsanger" name="input" type="data" label="Library to analyse" />
- <param name="offset" type="select" label="FASTQ ASCII offset">
- <option value="33">33</option>
- <option value="64">64</option>
- </param>
+ <param format="fastqsanger" name="input" type="data" label="Library to analyse" />
+ <param name="offset" type="hidden" value="33"/>
</inputs>
<tests>
diff -r 9a75d2428e21 -r 3ad620871b25 tools/metag_tools/split_paired_reads.xml
--- a/tools/metag_tools/split_paired_reads.xml Wed Oct 07 11:12:39 2009 -0400
+++ b/tools/metag_tools/split_paired_reads.xml Wed Oct 07 11:18:14 2009 -0400
@@ -4,7 +4,7 @@
split_paired_reads.py $input $output1 $output2
</command>
<inputs>
- <param name="input" type="data" format="fastqsolexa,fastqsanger" label="Your paired-end file" />
+ <param name="input" type="data" format="fastqsanger" label="Your paired-end file" />
</inputs>
<outputs>
<data name="output1" format="input"/>
@@ -12,8 +12,8 @@
</outputs>
<tests>
<test>
- <param name="input" value="split_paired_reads_test1.fastq" ftype="fastqsolexa" />
- <output name="output1" file="split_paired_reads_test1.out1" fype="fastqsolexa" />
+ <param name="input" value="split_paired_reads_test1.fastq" ftype="fastqsanger"/>
+ <output name="output1" file="split_paired_reads_test1.out1" ftype="fastqsanger"/>
</test>
</tests>
<help>
diff -r 9a75d2428e21 -r 3ad620871b25 tools/next_gen_conversion/fastq_gen_conv.xml
--- a/tools/next_gen_conversion/fastq_gen_conv.xml Wed Oct 07 11:12:39 2009 -0400
+++ b/tools/next_gen_conversion/fastq_gen_conv.xml Wed Oct 07 11:18:14 2009 -0400
@@ -1,5 +1,5 @@
<tool id="fastq_gen_conv" name="FASTQ Groomer" version="1.0.0">
- <description>converts any type of FASTQ file to Sanger type and validates data</description>
+ <description>converts any FASTQ to Sanger</description>
<command interpreter="python">
fastq_gen_conv.py
--input=$input
@@ -18,24 +18,24 @@
--output=$output
</command>
<inputs>
- <param name="input" type="data" format="fastq" label="FASTQ file to check:" />
+ <param name="input" type="data" format="fastq" label="Groom this dataset" />
<conditional name="origTypeChoice">
- <param name="origType" type="select" label="What type of FASTQ do you think this is?">
- <option value="solexa">Solexa</option>
- <option value="illumina">Illumina</option>
- <option value="sanger">Sanger</option>
+ <param name="origType" type="select" label="How do you think quality values are scaled?" help="See below for explanation">
+ <option value="solexa">Solexa/Illumina 1.0</option>
+ <option value="illumina">Illumina 1.3+</option>
+ <option value="sanger">Sanger (validation only)</option>
</param>
<when value="solexa" />
<when value="illumina" />
<when value="sanger">
<conditional name="howManyBlocks">
- <param name="allOrNot" type="select" label="Do you want to do a subset of lines, or do the whole file?">
- <option value="all">Check all</option>
- <option value="not">Select blocks</option>
+ <param name="allOrNot" type="select" label="Since your fastq is already in Sanger format you can check it for consistency">
+ <option value="all">Check all (may take a while)</option>
+ <option selected="true" value="not">Check selected number of blocks</option>
</param>
<when value="all" />
<when value="not">
- <param name="blocks" type="integer" value="1000" label="How many blocks (four lines each) do you want to do?" />
+ <param name="blocks" type="integer" value="1000" label="How many blocks (four lines each) do you want to check?" />
</when>
</conditional>
</when>
@@ -62,39 +62,45 @@
**What it does**
-This tool takes a FASTQ file (Solexa or Illumina) and converts it to Sanger format. It only converts valid blocks. It also can confirm the validity of Sanger FASTQ.
+Galaxy pipeline for mapping of Illumina data requires data to be in fastq format with quality values conforming to so called "Sanger" format. Unfortunately there are many other types of fastq. Thus the main objective of this tool is to "groom" multiple types of fastq into Sanger-conforming fastq that can be used in downstream application such as mapping.
+
+.. class:: infomark
+
+**TIP**: If the input dataset is already in Sanger format the tool does not perform conversion. However validation (described below) is still performed.
-----
-**Example**
+**Types of fastq datasets**
-- Converting the following Solexa FASTQ file::
+A good description of fastq datasets can be found `here`__, while a description of Galaxy's fastq "logic" can be found `here`__. Because ranges of quality values within different types of fastq datasets overlap it very difficult to detect them automatically. This tool supports conversion of two commonly found types (Solexa/Illumina 1.0 and Illumina 1.3+) into fastq Sanger.
- @seq1
- AGTCGTGGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
- +
- ;>@BCEFGHJKLMNOPQRSTUVWXYZ[\]^_?abcdefghijklmno
- @seq2
- AGTCGTTGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
- +
- ;>@BCElcH@KLMNOPQ>STZVWbYu[\]^_?a=;d?fghijklmno
- @seq3
- AGTCGTCGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
- +
- 7>@BCEFGHJKLMNOPQRSTUVWXYZ[\]^_?abcdefghijklmno
+ .. __: http://en.wikipedia.org/wiki/FASTQ_format
+ .. __: http://bitbucket.org/galaxy/galaxy-central/wiki/NGS
-- will produce the following Sanger FASTQ data::
+.. class:: warningmark
- @seq1
- AGTCGTGGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
- +
- "#$%%''()+,-./0123456789:;<=>?@#BCDEFGHIJKLMNOP
- @seq2
- AGTCGTTGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
- +
- "#$%%'MD)$,-./012#45;78C:V%lt;=>?@#B""E#GHIJKLMNOP
-
-- Note that seq3 was not converted, because it contained an invalid Solexa quality value (7).
+**NOTE** that there is also a type of fastq format where quality values are represented by a list of space-delimited integers (e.g., 40 40 20 15 -5 20 ...). This tool **does not** handle such fastq. If you have such a dataset, it needs to be converted into ASCII-type fastq (where quality values are encoded by characters) by "Numeric-to-ASCII" utility before it can accepted by this tool.
+
+-----
+
+**Validation**
+
+In addition to converting quality values to Sanger format the tool also checks the input dataset for consistency. Specifically, it performs these four checks:
+
+- skips empty lines
+- checks that blocks are properly formed by making sure that:
+
+ #. there are four lines per block
+ #. the first line starts with "@"
+ #. the third line starts with "+"
+ #. lengths of second line (sequences) and the fourth line (quality string) are identical
+
+- checks that quality values are within range for the chosen fastq format (e.g., the format provided by the user in **How do you think quality values are scaled?** drop down.
+
+To see exactly what the tool does you can take a look at its source code `here`__.
+
+ .. __: http://bitbucket.org/galaxy/galaxy-central/src/tip/tools/next_gen_conversio…
+
</help>
</tool>
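
The validation steps listed in the help text above can be sketched in Python 3 as follows. This is an illustration under stated assumptions, not the code of `fastq_gen_conv.py`: the helper names are hypothetical, and the ASCII ranges in `QUALITY_RANGES` are taken from the bounds checked in the changeset 2836 version of the script (33-126 for Sanger, 59-126 for Solexa, 64-126 for Illumina 1.3+).

```python
# ASCII code ranges accepted per format (assumed from the script's checks).
QUALITY_RANGES = {
    "sanger": (33, 126),
    "solexa": (59, 126),
    "illumina": (64, 126),
}


def read_blocks(lines):
    """Skip empty lines and group the rest into four-line blocks."""
    kept = [ln.rstrip("\n") for ln in lines if ln.strip()]
    return [kept[i:i + 4] for i in range(0, len(kept), 4)]


def block_is_well_formed(block):
    """Four lines, '@' title, '+' separator, equal sequence/quality length."""
    return (len(block) == 4
            and block[0].startswith("@")
            and block[2].startswith("+")
            and len(block[1]) == len(block[3]))


def quality_in_range(quality, fastq_type):
    """Check every quality character against the range for fastq_type."""
    low, high = QUALITY_RANGES[fastq_type]
    return all(low <= ord(c) <= high for c in quality)
```

Under these ranges a quality string beginning with `7` (ASCII 55) is rejected as Solexa, which matches the `seq3` case described in the earlier help text.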

09 Oct '09
details: http://www.bx.psu.edu/hg/galaxy/rev/b25297e88f96
changeset: 2836:b25297e88f96
user: Kelly Vincent <kpvincent(a)bx.psu.edu>
date: Tue Oct 06 21:25:01 2009 -0400
description:
Added FASTQ "Groomer" tool to converters section. Relies on new datatype (fastq) which will be added later.
7 file(s) affected in this change:
test-data/fastq_gen_conv_in1.fastq
test-data/fastq_gen_conv_in2.fastq
test-data/fastq_gen_conv_out1.fastqsanger
test-data/fastq_gen_conv_out2.fastqsanger
tool_conf.xml.sample
tools/next_gen_conversion/fastq_gen_conv.py
tools/next_gen_conversion/fastq_gen_conv.xml
diffs (370 lines):
diff -r 2fb0a64c6aaa -r b25297e88f96 test-data/fastq_gen_conv_in1.fastq
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/fastq_gen_conv_in1.fastq Tue Oct 06 21:25:01 2009 -0400
@@ -0,0 +1,16 @@
+@seq1
+AGTCGTGGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
++
+;>@BCEFGHJKLMNOPQRSTUVWXYZ[\]^_?abcdefghijklmno
+@seq2
+AGTCGTTGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
++
+;>@BCElcH@KLMNOPQ>STZVWbYu[\]^_?a=;d?fghijklmno
+@seq3
+AGTCGTCGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
++
+7>@BCEFGHJKLMNOPQRSTUVWXYZ[\]^_?abcdefghijklmno
+@seq4
+AGTCGTAGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
++
+;>@BCEFGHJKLMNOPQRSTUVWXYZ[\]^_?abcdefghijklmno
diff -r 2fb0a64c6aaa -r b25297e88f96 test-data/fastq_gen_conv_in2.fastq
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/fastq_gen_conv_in2.fastq Tue Oct 06 21:25:01 2009 -0400
@@ -0,0 +1,24 @@
+@seq1
+AAAGGTTTCTCTTTTGGAAATATCTAAATCCC
++
+!"#$%&\'()*+,-./0123456789:;<=>.
+@seq2
+GGGTCTCCCAGAATGATTAGAGCCGTATAGGA
++
+?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]
+@seq3
+GCGGTTCAATACGATTACCACCATGATAAATA
++
+?Aa.1ghB2K!#lk(02GY[[II])Kwl+,5M
+@seq4
+AGTCTTTTCCTCTAAAATAACATAGGATACTA
++
+ghY)N375Nh.,Ol>==/<:2#i&d%#KdNII
+@seq5
+GAGGACTCATGGTAGGTATTTTACATGACATT
++
+IIgy%hf6#394bd&hNMWL$OPB63II*,+-
+@seq6
+GGCCTACATTCATTTACGAGACTAATTAGGGA
++
+IIIIIgd6#5%jKO&.,D+s3aW=cdGB#a1$
\ No newline at end of file
diff -r 2fb0a64c6aaa -r b25297e88f96 test-data/fastq_gen_conv_out1.fastqsanger
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/fastq_gen_conv_out1.fastqsanger Tue Oct 06 21:25:01 2009 -0400
@@ -0,0 +1,12 @@
+@seq1
+AGTCGTGGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
++
+"#$%%''()+,-./0123456789:;<=>?@#BCDEFGHIJKLMNOP
+@seq2
+AGTCGTTGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
++
+"#$%%'MD)$,-./012#45;78C:V<=>?@#B""E#GHIJKLMNOP
+@seq4
+AGTCGTAGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
++
+"#$%%''()+,-./0123456789:;<=>?@#BCDEFGHIJKLMNOP
diff -r 2fb0a64c6aaa -r b25297e88f96 test-data/fastq_gen_conv_out2.fastqsanger
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/fastq_gen_conv_out2.fastqsanger Tue Oct 06 21:25:01 2009 -0400
@@ -0,0 +1,12 @@
+@seq1
+AAAGGTTTCTCTTTTGGAAATATCTAAATCCC
++
+!"#$%&\'()*+,-./0123456789:;<=>.
+@seq2
+GGGTCTCCCAGAATGATTAGAGCCGTATAGGA
++
+?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]
+@seq3
+GCGGTTCAATACGATTACCACCATGATAAATA
++
+?Aa.1ghB2K!#lk(02GY[[II])Kwl+,5M
diff -r 2fb0a64c6aaa -r b25297e88f96 tool_conf.xml.sample
--- a/tool_conf.xml.sample Tue Oct 06 16:55:47 2009 -0400
+++ b/tool_conf.xml.sample Tue Oct 06 21:25:01 2009 -0400
@@ -75,6 +75,7 @@
<tool file="next_gen_conversion/solid_to_fastq.xml" />
<tool file="next_gen_conversion/fastq_conversions.xml" />
<tool file="fastx_toolkit/fastq_quality_converter.xml" />
+ <tool file="next_gen_conversion/fastq_gen_conv.xml" />
</section>
<section name="Extract Features" id="features">
<tool file="filters/ucsc_gene_bed_to_exon_bed.xml" />
diff -r 2fb0a64c6aaa -r b25297e88f96 tools/next_gen_conversion/fastq_gen_conv.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/next_gen_conversion/fastq_gen_conv.py Tue Oct 06 21:25:01 2009 -0400
@@ -0,0 +1,169 @@
+"""
+Converts any type of FASTQ file to Sanger type and makes small adjustments if necessary.
+
+usage: %prog [options]
+ -i, --input=i: Input FASTQ candidate file
+ -r, --origType=r: Original type
+ -a, --allOrNot=a: Whether or not to check all blocks
+ -b, --blocks=b: Number of blocks to check
+ -o, --output=o: Output file
+
+usage: %prog input_file oroutput_file
+"""
+
+import math, sys
+from galaxy import eggs
+import pkg_resources; pkg_resources.require( "bx-python" )
+from bx.cookbook import doc_optparse
+
+def stop_err( msg ):
+ sys.stderr.write( "%s\n" % msg )
+ sys.exit()
+
+def all_bases_valid(seq):
+ """Confirm that the sequence contains only bases"""
+ valid_bases = ['a', 'A', 'c', 'C', 'g', 'G', 't', 'T', 'N']
+ for base in seq:
+ if base not in valid_bases:
+ return False
+ return True
+
+def __main__():
+ #Parse Command Line
+ options, args = doc_optparse.parse( __doc__ )
+ orig_type = options.origType
+ if orig_type == 'sanger' and options.allOrNot == 'not':
+ max_blocks = int(options.blocks)
+ else:
+ max_blocks = -1
+ fin = file(options.input, 'r')
+ fout = file(options.output, 'w')
+ range_min = 1000
+ range_max = -5
+ block_num = 0
+ bad_blocks = 0
+ base_len = -1
+ line_count = 0
+ lines = []
+ line = fin.readline()
+ while line:
+ if max_blocks >= 0 and block_num > 0 and orig_type == 'sanger' and max_blocks < block_num:
+ print 'break'
+ break
+ if line.strip():
+ # the line that starts of a block, with a name
+ if line_count % 4 == 0 and line.startswith('@'):
+ lines.append(line)
+ block_num += 1
+ else:
+ # if we expect a sequence of bases
+ if line_count % 4 == 1 and all_bases_valid(line.strip()):
+ lines.append(line)
+ base_len = len(line.strip())
+ # if we expect the second name line
+ elif line_count % 4 == 2 and line.startswith('+'):
+ lines.append(line)
+ # if we expect a sequence of qualities and it's the expected length
+ elif line_count % 4 == 3:
+ split_line = line.strip().split()
+ # decimal qualities
+ if len(split_line) == base_len:
+ # convert
+ phred_list = []
+ for ch in split_line:
+ int_ch = int(ch)
+ if int_ch < range_min:
+ range_min = int_ch
+ if int_ch > range_max:
+ range_max = int_ch
+ if int_ch >= 0 and int_ch <= 93:
+ phred_list.append(chr(int_ch + 33))
+ # make sure we haven't lost any quality values
+ if len(phred_list) == base_len:
+ # print first three lines
+ for l in lines:
+ fout.write(l)
+ # print converted quality line
+ fout.write(''.join(phred_list))
+ # reset
+ lines = []
+ base_len = -1
+ # abort if so
+ else:
+ bad_blocks += 1
+ lines = []
+ base_len = -1
+ # ascii qualities
+ elif len(split_line[0]) == base_len:
+ qualities = []
+ # print converted quality line
+ if orig_type == 'illumina':
+ for c in line.strip():
+ if ord(c) - 64 < range_min:
+ range_min = ord(c) - 64
+ if ord(c) - 64 > range_max:
+ range_max = ord(c) - 64
+ if ord(c) < 64 or ord(c) > 126:
+ bad_blocks += 1
+ base_len = -1
+ lines = []
+ break
+ else:
+ qualities.append( chr( ord(c) - 31 ) )
+ quals = ''.join(qualities)
+ elif orig_type == 'solexa':
+ for c in line.strip():
+ if ord(c) - 64 < range_min:
+ range_min = ord(c) - 64
+ if ord(c) - 64 > range_max:
+ range_max = ord(c) - 64
+ if ord(c) < 59 or ord(c) > 126:
+ bad_blocks += 1
+ base_len = -1
+ lines = []
+ break
+ else:
+ p = 10.0**( ( ord(c) - 64 ) / -10.0 ) / ( 1 + 10.0**( ( ord(c) - 64 ) / -10.0 ) )
+ qualities.append( chr( int( -10.0*math.log10( p ) ) + 33 ) )
+ quals = ''.join(qualities)
+ else: # 'sanger'
+ for c in line.strip():
+ if ord(c) - 33 < range_min:
+ range_min = ord(c) - 33
+ if ord(c) - 33 > range_max:
+ range_max = ord(c) - 33
+ if ord(c) < 33 or ord(c) > 126:
+ bad_blocks += 1
+ base_len = -1
+ lines = []
+ break
+ else:
+ qualities.append(c)
+ quals = ''.join(qualities)
+ # make sure we don't have bad qualities
+ if len(quals) == base_len:
+ # print first three lines
+ for l in lines:
+ fout.write(l)
+ # print out quality line
+ fout.write(quals+'\n')
+ # reset
+ lines = []
+ base_len = -1
+ else:
+ bad_blocks += 1
+ base_len = -1
+ lines = []
+ line_count += 1
+ line = fin.readline()
+ fout.close()
+ fin.close()
+ if range_min != 1000 and range_min != -5:
+ outmsg = 'The range of quality values found were: %s to %s' % (range_min, range_max)
+ else:
+ outmsg = ''
+ if bad_blocks > 0:
+ outmsg += '\nThere were %s bad blocks skipped' % (bad_blocks)
+ sys.stdout.write(outmsg)
+
+if __name__=="__main__": __main__()
\ No newline at end of file
diff -r 2fb0a64c6aaa -r b25297e88f96 tools/next_gen_conversion/fastq_gen_conv.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/next_gen_conversion/fastq_gen_conv.xml Tue Oct 06 21:25:01 2009 -0400
@@ -0,0 +1,100 @@
+<tool id="fastq_gen_conv" name="FASTQ Groomer" version="1.0.0">
+ <description>converts any type of FASTQ file to Sanger type and validates data</description>
+ <command interpreter="python">
+ fastq_gen_conv.py
+ --input=$input
+ --origType=$origTypeChoice.origType
+ #if $origTypeChoice.origType == "sanger":
+ --allOrNot=$origTypeChoice.howManyBlocks.allOrNot
+ #if $origTypeChoice.howManyBlocks.allOrNot == "not":
+ --blocks=$origTypeChoice.howManyBlocks.blocks
+ #else:
+ --blocks="None"
+ #end if
+ #else:
+ --allOrNot="None"
+ --blocks="None"
+ #end if
+ --output=$output
+ </command>
+ <inputs>
+ <param name="input" type="data" format="fastq" label="FASTQ file to check:" />
+ <conditional name="origTypeChoice">
+ <param name="origType" type="select" label="What type of FASTQ do you think this is?">
+ <option value="solexa">Solexa</option>
+ <option value="illumina">Illumina</option>
+ <option value="sanger">Sanger</option>
+ </param>
+ <when value="solexa" />
+ <when value="illumina" />
+ <when value="sanger">
+ <conditional name="howManyBlocks">
+ <param name="allOrNot" type="select" label="Do you want to check a subset of blocks or the whole file?">
+ <option value="all">Check all</option>
+ <option value="not">Select blocks</option>
+ </param>
+ <when value="all" />
+ <when value="not">
+ <param name="blocks" type="integer" value="1000" label="How many blocks (four lines each) do you want to do?" />
+ </when>
+ </conditional>
+ </when>
+ </conditional>
+ </inputs>
+ <outputs>
+ <data name="output" format="fastqsanger"/>
+ </outputs>
+ <tests>
+ <test>
+ <param name="input" value="fastq_gen_conv_in1.fastq" ftype="fastq" />
+ <param name="origType" value="solexa" />
+ <output name="output" format="fastqsanger" file="fastq_gen_conv_out1.fastqsanger" />
+ </test>
+ <test>
+ <param name="input" value="fastq_gen_conv_in2.fastq" ftype="fastq" />
+ <param name="origType" value="sanger" />
+ <param name="allOrNot" value="not" />
+ <param name="blocks" value="3" />
+ <output name="output" format="fastqsanger" file="fastq_gen_conv_out2.fastqsanger" />
+ </test>
+ </tests>
+ <help>
+
+**What it does**
+
+This tool converts a Solexa or Illumina FASTQ file to Sanger format, writing only the blocks that pass validation. It can also be used to validate a file that is already in Sanger format.
+
+-----
+
+**Example**
+
+- Converting the following Solexa FASTQ file::
+
+ @seq1
+ AGTCGTGGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
+ +
+ ;>@BCEFGHJKLMNOPQRSTUVWXYZ[\]^_?abcdefghijklmno
+ @seq2
+ AGTCGTTGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
+ +
+ ;>@BCElcH@KLMNOPQ>STZVWbYu[\]^_?a=;d?fghijklmno
+ @seq3
+ AGTCGTCGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
+ +
+ 7>@BCEFGHJKLMNOPQRSTUVWXYZ[\]^_?abcdefghijklmno
+
+- will produce the following Sanger FASTQ data::
+
+ @seq1
+ AGTCGTGGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
+ +
+ "#$%%''()+,-./0123456789:;<=>?@#BCDEFGHIJKLMNOP
+ @seq2
+ AGTCGTTGTCATCGTGACTAGTCGATCTAGCTAGCTCTCTAGAGTGT
+ +
+ "#$%%'MD)$,-./012#45;78C:V<=>?@#B""E#GHIJKLMNOP
+
+- Note that seq3 was not converted because its quality line contains the character 7 (ASCII 55), which is below the lowest valid Solexa quality character ; (ASCII 59).
+
+ </help>
+</tool>
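
For readers following the quality-score arithmetic in fastq_gen_conv.py above, here is a minimal standalone sketch of the two ASCII conversions (Solexa to Sanger and Illumina to Sanger). It mirrors the formulas and range checks in the diff, but the function names solexa_to_sanger and illumina_to_sanger are hypothetical and not part of the changeset, and returning None for a bad quality line is an assumption standing in for the script's bad_blocks bookkeeping.

```python
import math


def solexa_to_sanger(quals):
    """Convert a Solexa quality string (offset 64) to Sanger (Phred, offset 33).

    Hypothetical helper: returns None if any character is outside the
    range the diff accepts for Solexa input (ASCII 59..126).
    """
    out = []
    for c in quals:
        if ord(c) < 59 or ord(c) > 126:
            return None
        q_solexa = ord(c) - 64
        # Solexa score -> error probability, same formula as the diff
        odds = 10.0 ** (q_solexa / -10.0)
        p = odds / (1 + odds)
        # error probability -> Phred score, truncated with int() as in the diff
        out.append(chr(int(-10.0 * math.log10(p)) + 33))
    return ''.join(out)


def illumina_to_sanger(quals):
    """Convert an Illumina quality string (Phred, offset 64) to Sanger (offset 33).

    Hypothetical helper: returns None if any character is outside ASCII 64..126.
    """
    out = []
    for c in quals:
        if ord(c) < 64 or ord(c) > 126:
            return None
        # Both encodings hold Phred scores, so only the offset changes (64 - 33 = 31)
        out.append(chr(ord(c) - 31))
    return ''.join(out)


if __name__ == "__main__":
    print(solexa_to_sanger(';>@BCEF'))   # first seven qualities of seq1 in the help example
    print(solexa_to_sanger('7>@'))       # contains an invalid Solexa character
    print(illumina_to_sanger('@h'))
```

The first call reproduces the opening characters of the converted seq1 quality line in the help text, and the second returns None for the same reason seq3 is skipped there.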

09 Oct '09
details: http://www.bx.psu.edu/hg/galaxy/rev/9a75d2428e21
changeset: 2838:9a75d2428e21
user: Greg Von Kuster <greg(a)bx.psu.edu>
date: Wed Oct 07 11:12:39 2009 -0400
description:
Add Arabidopsis integration at TAIR and UCLA patch provided by Brad Chapman ( handles # 135 ).
6 file(s) affected in this change:
lib/galaxy/config.py
tool-data/shared/gbrowse/gbrowse_build_sites.txt
tool-data/shared/ucsc/builds.txt
tool-data/shared/ucsc/manual_builds.txt
tool-data/shared/ucsc/ucsc_build_sites.txt
universe_wsgi.ini.sample
diffs (66 lines):
diff -r cb842f737d46 -r 9a75d2428e21 lib/galaxy/config.py
--- a/lib/galaxy/config.py Tue Oct 06 23:47:08 2009 -0400
+++ b/lib/galaxy/config.py Wed Oct 07 11:12:39 2009 -0400
@@ -73,8 +73,8 @@
self.use_memdump = string_as_bool( kwargs.get( 'use_memdump', 'False' ) )
self.log_memory_usage = string_as_bool( kwargs.get( 'log_memory_usage', 'False' ) )
self.log_events = string_as_bool( kwargs.get( 'log_events', 'False' ) )
- self.ucsc_display_sites = kwargs.get( 'ucsc_display_sites', "main,test,archaea" ).lower().split(",")
- self.gbrowse_display_sites = kwargs.get( 'gbrowse_display_sites', "main,test" ).lower().split(",")
+ self.ucsc_display_sites = kwargs.get( 'ucsc_display_sites', "main,test,archaea,ucla" ).lower().split(",")
+ self.gbrowse_display_sites = kwargs.get( 'gbrowse_display_sites', "main,test,tair" ).lower().split(",")
self.genetrack_display_sites = kwargs.get( 'genetrack_display_sites', "main,test" ).lower().split(",")
self.brand = kwargs.get( 'brand', None )
self.wiki_url = kwargs.get( 'wiki_url', 'http://g2.trac.bx.psu.edu/' )
diff -r cb842f737d46 -r 9a75d2428e21 tool-data/shared/gbrowse/gbrowse_build_sites.txt
--- a/tool-data/shared/gbrowse/gbrowse_build_sites.txt Tue Oct 06 23:47:08 2009 -0400
+++ b/tool-data/shared/gbrowse/gbrowse_build_sites.txt Wed Oct 07 11:12:39 2009 -0400
@@ -1,3 +1,4 @@
# wormbase sites / supported genomes
-main http://www.wormbase.org/db/seq/gbgff/c_elegans/ c_elegans,c_briggsae,c_remanei,c_brenneri,c_japonica,p_pristionchus,b_malayi
-test http://dev.wormbase.org/db/seq/gbrowse/c_elegans/ c_elegans,c_briggsae,c_remanei,c_brenneri,c_japonica,p_pristionchus,b_malayi
+main http://www.wormbase.org/db/seq/gbgff/c_elegans/ c_elegans,c_briggsae,c_remanei,c_brenneri,c_japonica,p_pristionchus,b_malayi
+test http://dev.wormbase.org/db/seq/gbrowse/c_elegans/ c_elegans,c_briggsae,c_remanei,c_brenneri,c_japonica,p_pristionchus,b_malayi
+tair http://arabidopsis.org/cgi-bin/gbrowse/ arabidopsis_tair8,arabidopsis
diff -r cb842f737d46 -r 9a75d2428e21 tool-data/shared/ucsc/builds.txt
--- a/tool-data/shared/ucsc/builds.txt Tue Oct 06 23:47:08 2009 -0400
+++ b/tool-data/shared/ucsc/builds.txt Wed Oct 07 11:12:39 2009 -0400
@@ -786,3 +786,6 @@
aeroHydr_ATCC7966 Aeromonas hydrophila subsp. hydrophila ATCC 7966 (aeroHydr_ATCC7966)
baciAnth_AMES Bacillus anthracis str. Ames (baciAnth_AMES)
shewOnei Shewanella oneidensis MR-1 (shewOnei)
+arabidopsis Arabidopsis thaliana TAIR9
+arabidopsis_tair8 Arabidopsis thaliana TAIR8
+araTha1 Arabidopsis thaliana TAIR7
diff -r cb842f737d46 -r 9a75d2428e21 tool-data/shared/ucsc/manual_builds.txt
--- a/tool-data/shared/ucsc/manual_builds.txt Tue Oct 06 23:47:08 2009 -0400
+++ b/tool-data/shared/ucsc/manual_builds.txt Wed Oct 07 11:12:39 2009 -0400
@@ -665,3 +665,6 @@
shewOnei Shewanella oneidensis MR-1 plasmid_pMR-1=161613,chr=4969803
15217 Human herpesvirus 1 NC_001806=152261
lMaj5 Leishmania major 2005 chr1=268984,chr2=355714,chr3=384518,chr4=441313,chr5=465823,chr6=516874,chr7=596348,chr8=574972,chr9=573441,chr10=570864,chr11=582575,chr12=675347,chr13=654604,chr14=622648,chr15=629514,chr16=714659,chr17=684831,chr18=739751,chr19=702212,chr20=742551,chr21=772974,chr22=716608,chr23=772567,chr24=840950,chr25=912849,chr26=1091579,chr27=1130447,chr28=1160128,chr29=1212674,chr30=1403454,chr31=1484336,chr32=1604650,chr33=1583673,chr34=1866754,chr35=2090491,chr36=2682183
+arabidopsis Arabidopsis thaliana TAIR9
+arabidopsis_tair8 Arabidopsis thaliana TAIR8
+araTha1 Arabidopsis thaliana TAIR7
diff -r cb842f737d46 -r 9a75d2428e21 tool-data/shared/ucsc/ucsc_build_sites.txt
--- a/tool-data/shared/ucsc/ucsc_build_sites.txt Tue Oct 06 23:47:08 2009 -0400
+++ b/tool-data/shared/ucsc/ucsc_build_sites.txt Wed Oct 07 11:12:39 2009 -0400
@@ -4,3 +4,4 @@
archaea http://archaea.ucsc.edu/cgi-bin/hgTracks? alkaEhrl_MLHE_1,shewW318,idioLoih_L2TR,sulSol1,erwiCaro_ATROSEPTICA,symbTher_IAM14863,moorTher_ATCC39073,therFusc_YX,methHung1,bradJapo,therElon,shewPutrCN32,pediPent_ATCC25745,mariMari_MCS10,nanEqu1,baciSubt,chlaTrac,magnMagn_AMB_1,chroViol,ralsSola,acidCryp_JF_5,erytLito_HTCC2594,desuVulg_HILDENBOROUG,pyrAer1,sulfToko1,shewANA3,paraSp_UWE25,geobKaus_HTA426,rhizEtli_CFN_42,uncuMeth_RCI,candBloc_FLORIDANUS,deinRadi,yersPest_CO92,saccEryt_NRRL_2338,rhodRHA1,candCars_RUDDII,burkMall_ATCC23344,eschColi_O157H7,burk383,psycIngr_37,rhodSpha_2_4_1,wolbEndo_OF_DROSOPHIL,burkViet_G4,propAcne_KPA171202,enteFaec_V583,campJeju_81_176,acidJS42,heliPylo_26695,pseuHalo_TAC125,chroSale_DSM3043,methVann1,archFulg1,neisMeni_Z2491_1,fusoNucl,vermEise_EF01_2,anabVari_ATCC29413,tropWhip_TW08_27,heliHepa,acinSp_ADP1,anapMarg_ST_MARIES,natrPhar1,haheChej_KCTC_2396,therPetr_RKU_1,neisGono_FA1090_1,colwPsyc_34H,desuPsyc_LSV54,hyphNept_ATCC15444,vibrChol1,deinGeot_DSM11300,strePyog_M1_GAS,franCcI3,salmTyph,metaSedu,lactSali_UCC118,trepPall,neisMeni_MC58_1,syntWolf_GOETTINGEN,flavJohn_UW101,methBoon1,haemSomn_129PT,shewLoihPV4,igniHosp1,haemInfl_KW20,haloHalo_SL1,ferrAcid1,sphiAlas_RB2256,candPela_UBIQUE_HTCC1,caldSacc_DSM8903,aerPer1,lactPlan,carbHydr_Z_2901,therTher_HB8,vibrVuln_YJ016_1,rhodPalu_CGA009,acidCell_11B,siliPome_DSS_3,therVolc1,haloWals1,rubrXyla_DSM9941,shewAmaz,nocaJS61,vibrVuln_CMCP6_1,sinoMeli,ureaUrea,baciHalo,bartHens_HOUSTON_1,nitrWino_NB_255,hypeButy1,methBurt2,polaJS66,mesoLoti,methMari_C7,caulCres,neisMeni_FAM18_1,acidBact_ELLIN345,caldMaqu1,salmEnte_PARATYPI_ATC,glucOxyd_621H,cytoHutc_ATCC33406,nitrEuro,therMari,coxiBurn,woliSucc,heliPylo_HPAG1,mesoFlor_L1,pyrHor1,methAeol1,procMari_CCMP1375,pyroArse1,oenoOeni_PSU_1,alcaBork_SK2,wiggBrev,actiPleu_L20,lactLact,methJann1,paraDeni_PD1222,borrBurg,pyroIsla1,orieTsut_BORYONG,shewMR4,methKand1,methCaps_BATH,onioYell_PHYTOPLASMA,bordBron,cenaSymb1,burkCeno_HI2424,franTula_TULARENSIS,pyrFur2,mariAqua_VT8,heliPylo_J99,psycArct_273_4,vibrChol_MO10_1,vibrPara1,rickBell_RML369_C,metAce1,buchSp,ehrlRumi_WELGEVONDEN,methLabrZ_1,chlaPneu_CWL029,thioCrun_XCL_2,pyroCali1,chloTepi_TLS,stapAure_MU50,novoArom_DSM12444,magnMC1,zymoMobi_ZM4,salmTyph_TY2,chloChlo_CAD3,azoaSp_EBN1,therTher_HB27,bifiLong,picrTorr1,listInno,bdelBact,gramFors_KT0803,sulfAcid1,geobTher_NG80_2,peloCarb,ralsEutr_JMP134,mannSucc_MBEL55E,syneSp_WH8102,methTherPT1,clavMich_NCPPB_382,therAcid1,syntAcid_SB,porpGing_W83,therNeut0,leifXyli_XYLI_CTCB0,shewFrig,photProf_SS9,thioDeni_ATCC25259,methMaze1,desuRedu_MI_1,burkThai_E264,campFetu_82_40,blocFlor,jannCCS1,nitrMult_ATCC25196,streCoel,soliUsit_ELLIN6076,pastMult,saliRube_DSM13855,methTher1,nostSp,shigFlex_2A,saccDegr_2_40,oceaIhey,dehaEthe_195,rhodRubr_ATCC11170,arthFB24,shewMR7,pireSp,anaeDeha_2CP_C,haloVolc1,dichNodo_VCS1703A,tricEryt_IMS101,mycoGeni,thioDeni_ATCC33889,methSmit1,geobUran_RF4,shewDeni,halMar1,desuHafn_Y51,methStad1,granBeth_CGDNIH1,therPend1,legiPneu_PHILADELPHIA,vibrChol_O395_1,nitrOcea_ATCC19707,campJeju_RM1221,methPetr_PM1,heliAcin_SHEEBA,eschColi_APEC_O1,peloTher_SI,haloHalo1,syntFuma_MPOB,xyleFast,gloeViol,leucMese_ATCC8293,bactThet_VPI_5482,xantCamp,sodaGlos_MORSITANS,geobSulf,roseDeni_OCH_114,coryEffi_YS_314,brucMeli,mycoTube_H37RV,vibrFisc_ES114_1,pyrAby1,burkXeno_LB400,polyQLWP,stapMari1,peloLute_DSM273,burkCeno_AU_1054,shewBalt,nocaFarc_IFM10152,ente638,mculMari1,saliTrop_CNB_440,neorSenn_MIYAYAMA,aquiAeol,dechArom_RCB,myxoXant_DK_1622,burkPseu_1106A,burkCepa_AMMD,methMari_C5_1,azorCaul2,methFlag_KT,leptInte,eschColi_K12,synePCC6,baumCica_HOMALODISCA,methBark1,pseuAeru,geobMeta_GS15,eschColi_CFT073,photLumi,metMar1,hermArse,campJeju,therKoda1,aeroHydr_ATCC7966,baciAnth_AMES,shewOnei,therTeng,lawsIntr_PHE_MN1_00
#Harvested from http://genome-test.cse.ucsc.edu/cgi-bin/das/dsn
test http://genome-test.cse.ucsc.edu/cgi-bin/hgTracks? anoCar1,ce4,ce3,ce2,ce1,loxAfr1,rn2,eschColi_O157H7_1,rn4,droYak1,heliPylo_J99_1,droYak2,dp3,dp2,caeRem2,caeRem1,oryLat1,eschColi_K12_1,homIni13,homIni14,droAna1,droAna2,oryCun1,sacCer1,heliHepa1,droGri1,sc1,dasNov1,choHof1,tupBel1,mm9,mm8,vibrChol1,mm5,mm4,mm7,mm6,mm3,mm2,rn3,venter1,galGal3,galGal2,ornAna1,equCab1,cioSav2,rheMac2,eutHer13,droPer1,droVir2,droVir1,heliPylo_26695_1,euaGli13,calJac1,campJeju1,droSim1,hg13,hg15,hg16,hg17,monDom1,monDom4,droMoj1,petMar1,droMoj2,vibrChol_MO10_1,vibrPara1,gliRes13,vibrVuln_YJ016_1,braFlo1,cioSav1,lauRas13,dm1,canFam1,canFam2,ci1,echTel1,ci2,caePb1,dm3,ponAbe2,falciparum,xenTro1,xenTro2,nonAfr13,fr2,fr1,gasAcu1,dm2,apiMel1,apiMel2,eschColi_O157H7EDL933_1,priPac1,panTro1,hg18,panTro2,campJeju_RM1221_1,canHg12,vibrChol_O395_1,vibrFisc_ES114_1,danRer5,danRer4,danRer3,danRer2,danRer1,tetNig1,afrOth13,bosTau1,eschColi_CFT073_1,bosTau3,bosTau2,bosTau4,rodEnt13,droEre1,priMat13,vibrVuln_CMCP6_1,cb2,cb3,cb1,borEut13,droSec1,felCat3,strPur1,strPur2,otoGar1,catArr1,anoGam1,triCas2
+ucla http://epigenomics.mcdb.ucla.edu/cgi-bin/hgTracks? araTha1
diff -r cb842f737d46 -r 9a75d2428e21 universe_wsgi.ini.sample
--- a/universe_wsgi.ini.sample Tue Oct 06 23:47:08 2009 -0400
+++ b/universe_wsgi.ini.sample Wed Oct 07 11:12:39 2009 -0400
@@ -91,8 +91,8 @@
use_new_layout = true
# Comma separated list of UCSC / gbrowse / GeneTrack browsers to use for viewing
-ucsc_display_sites = main,test,archaea
-gbrowse_display_sites = main,test
+ucsc_display_sites = main,test,archaea,ucla
+gbrowse_display_sites = main,test,tair
genetrack_display_sites = main,test
# Serving static files (needed if running standalone)
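
The *_display_sites settings changed above are plain comma-separated strings that lib/galaxy/config.py lowercases and splits. A minimal sketch of that parsing follows, assuming nothing beyond the expression visible in the diff; parse_display_sites is a hypothetical helper name, not a Galaxy function.

```python
def parse_display_sites(value):
    # Same expression as in the config.py hunk: lowercase, then split on commas.
    # Whitespace around the commas is NOT stripped, so "main, tair" would
    # yield the site key " tair" with a leading space.
    return value.lower().split(",")


if __name__ == "__main__":
    print(parse_display_sites("main,test,archaea,ucla"))
    print(parse_display_sites("main,test,tair"))
```

Since the keys are compared against the first column of the *_build_sites.txt files, an entry such as ucla or tair presumably needs a matching row there, which is what the other hunks in this changeset add.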
details: http://www.bx.psu.edu/hg/galaxy/rev/2fb0a64c6aaa
changeset: 2835:2fb0a64c6aaa
user: guru
date: Tue Oct 06 16:55:47 2009 -0400
description:
Fixing broken tool configs
3 file(s) affected in this change:
tools/fastx_toolkit/fastx_reverse_complement.xml
tools/fastx_toolkit/fastx_trimmer.xml
tools/filters/trimmer.xml
diffs (35 lines):
diff -r b14f99a4f736 -r 2fb0a64c6aaa tools/fastx_toolkit/fastx_reverse_complement.xml
--- a/tools/fastx_toolkit/fastx_reverse_complement.xml Tue Oct 06 15:00:32 2009 -0400
+++ b/tools/fastx_toolkit/fastx_reverse_complement.xml Tue Oct 06 16:55:47 2009 -0400
@@ -47,6 +47,7 @@
TACCNNCTTTGAATTACAAGGANGAGGCTACAGACA
+CSHL_1_FC42AGWWWXX:8:1:3:740
26 27 17 15 5 5 24 26 29 31 32 33 27 21 27 33 33 33 33 33 33 27 5 27 33 33 33 33 33 33 33 33 34 33 33 33
+
------
This tool is based on `FASTX-toolkit`__ by Assaf Gordon.
diff -r b14f99a4f736 -r 2fb0a64c6aaa tools/fastx_toolkit/fastx_trimmer.xml
--- a/tools/fastx_toolkit/fastx_trimmer.xml Tue Oct 06 15:00:32 2009 -0400
+++ b/tools/fastx_toolkit/fastx_trimmer.xml Tue Oct 06 16:55:47 2009 -0400
@@ -3,7 +3,7 @@
<command>zcat -f '$input' | fastx_trimmer -v -f $first -l $last -o $output</command>
<inputs>
- <param format="fasta,fastasanger" name="input" type="data" label="Library to clip" />
+ <param format="fasta,fastqsanger" name="input" type="data" label="Library to clip" />
<param name="first" size="4" type="integer" value="1">
<label>First base to keep</label>
diff -r b14f99a4f736 -r 2fb0a64c6aaa tools/filters/trimmer.xml
--- a/tools/filters/trimmer.xml Tue Oct 06 15:00:32 2009 -0400
+++ b/tools/filters/trimmer.xml Tue Oct 06 16:55:47 2009 -0400
@@ -4,7 +4,7 @@
trimmer.py -a -f $input1 -c $col -s $start -e $end -i $ignore $fastq > $out_file1
</command>
<inputs>
- <param format="tabular,text" name="input1" type="data" label="this dataset"/>
+ <param format="tabular,txt" name="input1" type="data" label="this dataset"/>
<param name="col" type="integer" value="0" label="Trim this column only" help="0 = process entire line" />
<param name="start" type="integer" size="10" value="1" label="Trim from the beginning to this position" help="1 = do not trim the beginning"/>
<param name="end" type="integer" size="10" value="0" label="Remove everything from this position to the end" help="0 = do not trim the end"/>