[hg] galaxy 3375: Introduce a new style of external display appl...
details:   http://www.bx.psu.edu/hg/galaxy/rev/049a86b8691d
changeset: 3375:049a86b8691d
user:      Dan Blankenberg <dan@bx.psu.edu>
date:      Fri Feb 12 11:01:30 2010 -0500
description:
Introduce a new style of external display applications.

Display applications can now be defined entirely in xml files, similar to how tools are integrated. Applications are assigned to specific datatypes (i.e. on an extension basis) via the datatypes_conf.xml file. See the sample display applications at /display_applications/[ucsc/]*.xml for examples of usage.

Provided sample display applications:
View BAM files (with bai indexes) at UCSC using BigDataUrl support.
ucsc interval as bed viewer - not enabled by default (the old style display app is still used by default; both can be used simultaneously, but this would likely be confusing).
GeneTrack viewer - any interval datatype can now be viewed at GeneTrack, if the application is enabled for a particular datatype; it is also a valid display application for the genetrack datatype.

Display applications can make full use of datatype converters, even allowing explicitly defined multi-step conversions, e.g. interval --> bed --> genetrack; the datatype conversion framework will need to be enhanced to natively support multi-step conversions before this can be done implicitly.

A new datatype, bedstrict, has been defined. The only way to obtain an item with this datatype is for a tool to create it; its metadata cannot be edited, and sniffing this datatype would require aggressively parsing the entire file. A bedstrict file must conform exactly to the BED specification (whereas Galaxy allows BED files to have non-standard columns). These files are suitable e.g. for display at the UCSC genome browser and are used by the new ucsc interval display application.

Add a bed to bedstrict converter; this is used by the ucsc interval display application.

Add a bed to genetrack converter; this is used by the new GeneTrack display application.
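
The strictness requirement described above for bedstrict can be illustrated with a small per-line check. This is only a minimal Python 3 sketch, loosely mirroring the tests that interval_to_bedstrict_converter.py in this changeset applies to columns 7 through 12; the function name is_strict_bed_line is hypothetical and not part of Galaxy, and the column names in the comments are taken from the UCSC BED format description rather than from this commit.

```python
def is_strict_bed_line(line):
    """Return True if the optional BED columns (7-12) of a tab-delimited line look valid.

    Hypothetical helper for illustration only; it does not check columns 1-6.
    """
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) > 12:
        # More than 12 columns means non-standard trailing fields.
        return False
    try:
        if len(fields) > 6:
            int(fields[6])  # thickStart must be an integer
        if len(fields) > 7:
            int(fields[7])  # thickEnd must be an integer
        if len(fields) > 8 and int(fields[8]) != 0:
            # itemRgb: like the converter, accept only a literal 0
            return False
        if len(fields) > 9:
            int(fields[9])  # blockCount must be an integer
        for index in (10, 11):  # blockSizes, blockStarts
            if len(fields) > index:
                # Drop a trailing comma, then require every item to be an integer.
                for item in fields[index].rstrip(",").split(","):
                    int(item)
    except ValueError:
        return False
    return True
```

A line that fails this check is the kind of input for which the converter falls back to writing a generic six-column BED record instead of copying the line through unchanged.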

TODO: If the GeneTrack indexer can be enhanced to accept column assignments, the bed to genetrack converter should become an interval to genetrack converter.

Several performance enhancements are available for the ucsc tools, such as bigurl support and a potential speed improvement over the old style when loading a user's history for certain displays. For example, the ucsc interval display no longer requires the viewport (position) to be calculated for each relevant history item in a user's history; this calculation now occurs on a separate page after the user clicks a view link. Non-strict BED files no longer have their content calculated on the fly and then streamed, etc. Refer to additional comments in the code.

diffstat:

 datatypes_conf.xml.sample | 15 +-
 display_applications/genetrack.xml | 18 +
 display_applications/ucsc/bam.xml | 8 +
 display_applications/ucsc/interval_as_bed.xml | 37 +
 lib/galaxy/datatypes/converters/bed_to_genetrack_converter.py | 44 ++
 lib/galaxy/datatypes/converters/bed_to_genetrack_converter.xml | 17 +
 lib/galaxy/datatypes/converters/interval_to_bedstrict_converter.py | 114 +++++
 lib/galaxy/datatypes/converters/interval_to_bedstrict_converter.xml | 15 +
 lib/galaxy/datatypes/data.py | 9 +-
 lib/galaxy/datatypes/display_applications/__init__.py | 1 +
 lib/galaxy/datatypes/display_applications/application.py | 116 +++++
 lib/galaxy/datatypes/display_applications/helpers.py | 31 +
 lib/galaxy/datatypes/display_applications/parameters.py | 195 ++++++++++
 lib/galaxy/datatypes/interval.py | 30 +
 lib/galaxy/datatypes/registry.py | 18 +-
 lib/galaxy/web/buildapp.py | 1 +
 lib/galaxy/web/controllers/dataset.py | 71 +++-
 templates/dataset/display_application/display.mako | 17 +
 templates/dataset/display_application/launch_display.mako | 15 +
 templates/root/history_common.mako | 6 +
 20 files changed, 773 insertions(+), 5 deletions(-)

diffs (981 lines):

diff -r 81d84a03f2ec -r 049a86b8691d datatypes_conf.xml.sample
--- a/datatypes_conf.xml.sample Thu Feb 11 16:18:45 2010 -0500
+++ b/datatypes_conf.xml.sample Fri
Feb 12 11:01:30 2010 -0500 @@ -5,11 +5,19 @@ <datatype extension="axt" type="galaxy.datatypes.sequence:Axt" display_in_upload="true"/> <datatype extension="bam" type="galaxy.datatypes.binary:Bam" mimetype="application/octet-stream" display_in_upload="true"> <converter file="bam_to_bai.xml" target_datatype="bai"/> + <display file="ucsc/bam.xml" /> </datatype> <datatype extension="bed" type="galaxy.datatypes.interval:Bed" display_in_upload="true"> <converter file="bed_to_gff_converter.xml" target_datatype="gff"/> <converter file="interval_to_coverage.xml" target_datatype="coverage"/> <converter file="bed_to_interval_index_converter.xml" target_datatype="interval_index"/> + <converter file="bed_to_genetrack_converter.xml" target_datatype="genetrack"/> + <!-- <display file="ucsc/interval_as_bed.xml" /> --> + <display file="genetrack.xml" /> + </datatype> + <datatype extension="bedstrict" type="galaxy.datatypes.interval:BedStrict"> + <display file="ucsc/interval_as_bed.xml" /> + <display file="genetrack.xml" /> </datatype> <datatype extension="binseq.zip" type="galaxy.datatypes.binary:Binseq" mimetype="application/zip" display_in_upload="true"/> <datatype extension="len" type="galaxy.datatypes.chrominfo:ChromInfo" display_in_upload="true"> @@ -26,7 +34,9 @@ </datatype> <datatype extension="fastq" type="galaxy.datatypes.sequence:Fastq" display_in_upload="true"/> <datatype extension="fastqsanger" type="galaxy.datatypes.sequence:FastqSanger" display_in_upload="true"/> - <datatype extension="genetrack" type="galaxy.datatypes.tracks:GeneTrack"/> + <datatype extension="genetrack" type="galaxy.datatypes.tracks:GeneTrack"> + <!-- <display file="genetrack.xml" /> --> + </datatype> <datatype extension="gff" type="galaxy.datatypes.interval:Gff" display_in_upload="true"> <converter file="gff_to_bed_converter.xml" target_datatype="bed"/> </datatype> @@ -36,7 +46,10 @@ <datatype extension="html" type="galaxy.datatypes.images:Html" mimetype="text/html"/> <datatype extension="interval" 
type="galaxy.datatypes.interval:Interval" display_in_upload="true"> <converter file="interval_to_bed_converter.xml" target_datatype="bed"/> + <converter file="interval_to_bedstrict_converter.xml" target_datatype="bedstrict"/> <indexer file="interval_awk.xml" /> + <!-- <display file="ucsc/interval_as_bed.xml" /> --> + <display file="genetrack.xml" /> </datatype> <datatype extension="jpg" type="galaxy.datatypes.images:Image" mimetype="image/jpeg"/> <datatype extension="laj" type="galaxy.datatypes.images:Laj"/> diff -r 81d84a03f2ec -r 049a86b8691d display_applications/genetrack.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/display_applications/genetrack.xml Fri Feb 12 11:01:30 2010 -0500 @@ -0,0 +1,18 @@ +<display id="genetrack_interval" version="1.0.0" name="view in"> + <link id="genetrack" name="GeneTrack"> + <url target_frame="galaxy_main">http://genetrack.g2.bx.psu.edu/galaxy?filename=${encoded_filename.qp}&hashkey=${hash_key.qp}&input=${qp(str($genetrack_file.id))}&GALAXY_URL=${galaxy_url.qp}</url> + <param type="data" name="bed_file" viewable="False" format="bed,genetrack"/> <!-- for now, we'll explicitly take care of the multi-step conversion; walk genetrack datatype down as a conversion of genetrack to genetrack doesn't exist and would likely be pointless --> + <param type="data" dataset="bed_file" name="genetrack_file" format="genetrack" viewable="False" /> + <param type="template" name="galaxy_url" strip="True" > + ${BASE_URL}/tool_runner?tool_id=predict2genetrack + </param> + <param type="template" name="hash_key" strip="True" > + #from galaxy.util.hash_util import hmac_new + ${hmac_new( $APP.config.tool_secret, $genetrack_file.file_name )} + </param> + <param type="template" name="encoded_filename" strip="True" > + #import binascii + ${binascii.hexlify( $genetrack_file.file_name )} + </param> + </link> +</display> diff -r 81d84a03f2ec -r 049a86b8691d display_applications/ucsc/bam.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ 
b/display_applications/ucsc/bam.xml Fri Feb 12 11:01:30 2010 -0500 @@ -0,0 +1,8 @@ +<display id="ucsc_bam" version="1.0.0" name="display at UCSC"> + <link id="main" name="main"> + <url>http://genome.ucsc.edu/cgi-bin/hgTracks?db=${qp($bam_file.dbkey)}&hgt.customText=${qp($track.url)}</url> + <param type="data" name="bam_file" url="galaxy.bam" strip_https="True" /> + <param type="data" name="bai_file" url="galaxy.bam.bai" metadata="bam_index" strip_https="True" /><!-- UCSC expects index file to exist as bam_file_name.bai --> + <param type="template" name="track" viewable="True" strip_https="True">track type=bam name="${bam_file.name}" bigDataUrl=${bam_file.url} db=${bam_file.dbkey}</param> + </link> +</display> diff -r 81d84a03f2ec -r 049a86b8691d display_applications/ucsc/interval_as_bed.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/display_applications/ucsc/interval_as_bed.xml Fri Feb 12 11:01:30 2010 -0500 @@ -0,0 +1,37 @@ +<display id="ucsc_interval_as_bed" version="1.0.0" name="display at UCSC"> + <link id="main" name="main"> + <url>http://genome.ucsc.edu/cgi-bin/hgTracks?db=${qp($bed_file.dbkey)}&position=${position.qp}&hgt.customText=${bed_file.qp}</url> + <param type="data" name="bed_file" url="galaxy.bed" format="bedstrict"/> <!-- Galaxy allows BED files to contain non-standard fields beyond the first 3 columns, UCSC does not: force use of converter which will make strict BED6+ file --> + <param type="template" name="position" strip="True" > +#set line_count = 0 +#set chrom = None +#set start = float( 'inf' ) +#set end = 0 +#for $line in open( $bed_file.file_name ): + #if $line_count > 10: ##10 max lines to check for view port + #break + #end if + #if not $line.startswith( "#" ): + #set $fields = $line.split( "\t" ) + #try: + #if len( $fields ) >= max( $bed_file.metadata.startCol, $bed_file.metadata.endCol, $bed_file.metadata.chromCol ): + #if $chrom is None or $fields[ $bed_file.metadata.chromCol - 1 ] == $chrom: + #set chrom = $fields[ 
$bed_file.metadata.chromCol - 1 ] + #set start = min( $start, int( $fields[ $bed_file.metadata.startCol - 1 ] ) ) + #set end = max( $end, int( $fields[ $bed_file.metadata.endCol - 1 ] ) ) + #end if + #end if + #except: + #pass + #end try + #end if + #set line_count += 1 +#end for +#if $chrom is not None: +${chrom}:${start}-${end + 1} +#else: +:- +#end if + </param> + </link> +</display> diff -r 81d84a03f2ec -r 049a86b8691d lib/galaxy/datatypes/converters/bed_to_genetrack_converter.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/lib/galaxy/datatypes/converters/bed_to_genetrack_converter.py Fri Feb 12 11:01:30 2010 -0500 @@ -0,0 +1,44 @@ +#!/usr/bin/env python + +#FIXME: THIS IS 1:1 COPY OF THE SAME FUNCTIONED TOOL - ALLOW REGULAR TOOLS TO MASCARADE AS CONVERTERS + +""" +Wraps genetrack.scripts.tabs2genetrack so the tool can be executed from Galaxy. + +usage: %prog input output shift +""" + +import sys, shutil, os +from galaxy import eggs +import pkg_resources +pkg_resources.require( "GeneTrack" ) + +from genetrack.scripts import tabs2genetrack +from genetrack import logger + +if __name__ == "__main__": + import os + os.environ[ 'LC_ALL' ] = 'C' + #os.system( 'export' ) + + parser = tabs2genetrack.option_parser() + + options, args = parser.parse_args() + + # uppercase the format + options.format = options.format.upper() + + if options.format not in ('BED', 'GFF'): + sys.stdout = sys.stderr + parser.print_help() + sys.exit(-1) + + logger.disable(options.verbosity) + + # missing file names + if not (options.inpname and options.outname and options.format): + parser.print_help() + sys.exit(-1) + else: + tabs2genetrack.transform(inpname=options.inpname, outname=options.outname,\ + format=options.format, shift=options.shift, index=options.index, options=options) diff -r 81d84a03f2ec -r 049a86b8691d lib/galaxy/datatypes/converters/bed_to_genetrack_converter.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ 
b/lib/galaxy/datatypes/converters/bed_to_genetrack_converter.xml Fri Feb 12 11:01:30 2010 -0500 @@ -0,0 +1,17 @@ +<tool id="CONVERTER_bed_to_genetrack_0" name="Convert BED to GeneTrack Index" version="1.0.0"> +<!-- FIXME: THIS IS ALMOST 1:1 COPY OF THE SAME FUNCTIONED TOOL - ALLOW REGULAR TOOLS TO MASCARADE AS CONVERTERS +Using a shift of 0, but tool allows specifying... +--> +<!-- <description>__NOT_USED_CURRENTLY_FOR_CONVERTERS__</description> --> + <command interpreter="python">bed_to_genetrack_converter.py -i $input1 -o $output1 -s 0 -v 0 -f BED -x</command> + <inputs> + <page> + <param format="bed" name="input1" type="data" label="Choose BED file"/> + </page> + </inputs> + <outputs> + <data format="genetrack" name="output1"/> + </outputs> + <help> + </help> +</tool> diff -r 81d84a03f2ec -r 049a86b8691d lib/galaxy/datatypes/converters/interval_to_bedstrict_converter.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/lib/galaxy/datatypes/converters/interval_to_bedstrict_converter.py Fri Feb 12 11:01:30 2010 -0500 @@ -0,0 +1,114 @@ +#!/usr/bin/env python +#Dan Blankenberg + +import sys +from galaxy import eggs +import pkg_resources; pkg_resources.require( "bx-python" ) +import bx.intervals.io + +assert sys.version_info[:2] >= ( 2, 4 ) + +def stop_err( msg ): + sys.stderr.write( msg ) + sys.exit() + +def __main__(): + output_name = sys.argv[1] + input_name = sys.argv[2] + try: + chromCol = int( sys.argv[3] ) - 1 + except: + stop_err( "'%s' is an invalid chrom column, correct the column settings before attempting to convert the data format." % str( sys.argv[3] ) ) + try: + startCol = int( sys.argv[4] ) - 1 + except: + stop_err( "'%s' is an invalid start column, correct the column settings before attempting to convert the data format." % str( sys.argv[4] ) ) + try: + endCol = int( sys.argv[5] ) - 1 + except: + stop_err( "'%s' is an invalid end column, correct the column settings before attempting to convert the data format." 
% str( sys.argv[5] ) ) + try: + strandCol = int( sys.argv[6] ) - 1 + except: + strandCol = -1 + try: + nameCol = int( sys.argv[7] ) - 1 + except: + nameCol = -1 + try: + extension = sys.argv[8] + except: + extension = 'interval' #default extension + + skipped_lines = 0 + first_skipped_line = None + out = open( output_name,'w' ) + count = 0 + #does file already conform to bed strict? + #if so, we want to keep extended columns, otherwise we'll create a generic 6 column bed file + strict_bed = True + if extension == 'bed' and ( chromCol, startCol, endCol, nameCol, strandCol ) == ( 0, 1, 2, 3, 5 ): + for count, line in enumerate( open( input_name ) ): + line = line.strip() + if line == "" or line.startswith("#"): + skipped_lines += 1 + if first_skipped_line is None: + first_skipped_line = count + 1 + continue + fields = line.split('\t') + try: + if len(fields) > 12: + strict_bed = False + break + if len(fields) > 6: + int(fields[6]) + if len(fields) > 7: + int(fields[7]) + if len(fields) > 8: + if int(fields[8]) != 0: + strict_bed = False + break + if len(fields) > 9: + int(fields[9]) + if len(fields) > 10: + fields2 = fields[10].rstrip(",").split(",") #remove trailing comma and split on comma + for field in fields2: + int(field) + if len(fields) > 11: + fields2 = fields[11].rstrip(",").split(",") #remove trailing comma and split on comma + for field in fields2: + int(field) + except: + strict_bed = False + break + out.write( "%s\n" % line ) + else: + strict_bed = False + out.close() + + if not strict_bed: + skipped_lines = 0 + first_skipped_line = None + out = open( output_name,'w' ) + count = 0 + for count, region in enumerate( bx.intervals.io.NiceReaderWrapper( open( input_name, 'r' ), chrom_col=chromCol, start_col=startCol, end_col=endCol, strand_col=strandCol, fix_strand=True, return_header=False, return_comments=False ) ): + try: + if nameCol >= 0: + name = region.fields[nameCol] + else: + raise IndexError + except: + name = "region_%i" % count + try: + + 
out.write( "%s\t%i\t%i\t%s\t%i\t%s\n" % ( region.chrom, region.start, region.end, name, 0, region.strand ) ) + except: + skipped_lines += 1 + if first_skipped_line is None: + first_skipped_line = count + 1 + out.close() + print "%i regions converted to BED." % ( count + 1 - skipped_lines ) + if skipped_lines > 0: + print "Skipped %d blank or invalid lines starting with line # %d." % ( skipped_lines, first_skipped_line ) + +if __name__ == "__main__": __main__() diff -r 81d84a03f2ec -r 049a86b8691d lib/galaxy/datatypes/converters/interval_to_bedstrict_converter.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/lib/galaxy/datatypes/converters/interval_to_bedstrict_converter.xml Fri Feb 12 11:01:30 2010 -0500 @@ -0,0 +1,15 @@ +<tool id="CONVERTER_interval_to_bedstrict_0" name="Convert Genomic Intervals To Strict BED"> + <!-- <description>__NOT_USED_CURRENTLY_FOR_CONVERTERS__</description> --> + <!-- Used on the metadata edit page. --> + <command interpreter="python">interval_to_bedstrict_converter.py $output1 $input1 ${input1.metadata.chromCol} ${input1.metadata.startCol} ${input1.metadata.endCol} ${input1.metadata.strandCol} ${input1.metadata.nameCol} ${input1.extension}</command> + <inputs> + <page> + <param format="interval" name="input1" type="data" label="Choose intervals"/> + </page> + </inputs> + <outputs> + <data format="bedstrict" name="output1"/> + </outputs> + <help> + </help> +</tool> diff -r 81d84a03f2ec -r 049a86b8691d lib/galaxy/datatypes/data.py --- a/lib/galaxy/datatypes/data.py Thu Feb 11 16:18:45 2010 -0500 +++ b/lib/galaxy/datatypes/data.py Fri Feb 12 11:01:30 2010 -0500 @@ -63,6 +63,7 @@ object.__init__(self, **kwd) self.supported_display_apps = self.supported_display_apps.copy() self.composite_files = self.composite_files.copy() + self.display_applications = odict() def write_from_stream(self, dataset, stream): """Writes data from a stream""" fd = open(dataset.file_name, 'wb') @@ -198,6 +199,12 @@ del self.supported_display_apps[app_id] 
except: log.exception('Tried to remove display app %s from datatype %s, but this display app is not declared.' % ( type, self.__class__.__name__ ) ) + def clear_display_apps( self ): + self.supported_display_apps = {} + def add_display_application( self, display_application ): + """New style display applications""" + assert display_application.id not in self.display_applications, 'Attempted to add a display application twice' + self.display_applications[ display_application.id ] = display_application def get_display_types(self): """Returns display types available""" return self.supported_display_apps.keys() @@ -239,7 +246,7 @@ """This function adds a job to the queue to convert a dataset to another type. Returns a message about success/failure.""" converter = trans.app.datatypes_registry.get_converter_by_target_type( original_dataset.ext, target_type ) if converter is None: - raise "A converter does not exist for %s to %s." % ( original_dataset.ext, target_type ) + raise Exception( "A converter does not exist for %s to %s." 
% ( original_dataset.ext, target_type ) ) #Generate parameter dictionary params = {} #determine input parameter name and add to params diff -r 81d84a03f2ec -r 049a86b8691d lib/galaxy/datatypes/display_applications/__init__.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/lib/galaxy/datatypes/display_applications/__init__.py Fri Feb 12 11:01:30 2010 -0500 @@ -0,0 +1,1 @@ + diff -r 81d84a03f2ec -r 049a86b8691d lib/galaxy/datatypes/display_applications/application.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/lib/galaxy/datatypes/display_applications/application.py Fri Feb 12 11:01:30 2010 -0500 @@ -0,0 +1,116 @@ +#Contains objects for using external display applications +from galaxy.util import parse_xml +from galaxy.util.odict import odict +from galaxy.util.template import fill_template +from galaxy.web import url_for +from parameters import DisplayApplicationParameter, DEFAULT_DATASET_NAME +from urllib import quote_plus +from helpers import encode_dataset_user + +#Any basic functions that we want to provide as a basic part of parameter dict should be added to this dict +BASE_PARAMS = { 'qp': quote_plus, 'url_for':url_for } #url_for has route memory... + +class DisplayApplicationLink( object ): + @classmethod + def from_elem( cls, elem, display_application ): + rval = DisplayApplicationLink( display_application ) + rval.id = elem.get( 'id', None ) + assert rval.id, 'Link elements require a id.' + rval.name = elem.get( 'name', rval.id ) + rval.url = elem.find( 'url' ) + assert rval.url is not None, 'A url element must be provided for link elements.' 
+ for param_elem in elem.findall( 'param' ): + param = DisplayApplicationParameter.from_elem( param_elem, rval ) + assert param, 'Unable to load parameter from element: %s' % param_elem + rval.parameters[ param.name ] = param + rval.url_param_name_map[ param.url ] = param.name + return rval + def __init__( self, display_application ): + self.display_application = display_application + self.parameters = odict() #parameters are populated in order, allowing lower listed ones to have values of higher listed ones + self.url_param_name_map = {} + self.url = None + self.id = None + self.name = None + def get_display_url( self, data, trans ): + dataset_hash, user_hash = encode_dataset_user( trans, data, trans.user ) + return url_for( controller = 'dataset', action = "display_application", dataset_id = dataset_hash, user_id = user_hash, app_name = self.display_application.id, link_name = self.id, app_action = 'display' ) + def get_inital_values( self, data, trans ): + rval = odict( { 'BASE_URL': trans.request.base, 'APP': trans.app } ) #trans automatically appears as a response, need to add properties of trans that we want here + for key, value in BASE_PARAMS.iteritems(): #add helper functions/variables + rval[ key ] = value + rval[ DEFAULT_DATASET_NAME ] = data #always have the display dataset name available + return rval + def build_parameter_dict( self, data, trans ): + other_values = self.get_inital_values( data, trans ) + for name, param in self.parameters.iteritems(): + assert name not in other_values, "The display parameter '%s' has been defined more than once." 
% name + if param.ready( other_values ): + other_values[ name ] = param.get_value( other_values, trans )#subsequent params can rely on this value + else: + other_values[ name ] = None + return False, other_values #need to stop here, next params may need this value + return True, other_values #we built other_values, lets provide it as well, or else we will likely regenerate it in the next step + +class PopulatedDisplayApplicationLink( object ): + def __init__( self, display_application_link, data, trans ): + self.link = display_application_link + self.data = data + self.trans = trans + self.ready, self.parameters = self.link.build_parameter_dict( self.data, trans ) + def display_ready( self ): + return self.ready + def get_param_value( self, name ): + value = None + if self.ready: + value = self.parameters.get( name, None ) + assert value, 'Unknown parameter requested' + return value + def preparing_display( self ): + if not self.ready: + return self.link.parameters[ self.parameters.keys()[ -1 ] ].is_preparing( self.parameters ) + return False + def prepare_display( self ): + if not self.ready and not self.preparing_display(): + other_values = self.parameters + for name, param in self.link.parameters.iteritems(): + if other_values.keys()[ -1 ] == name: #found last parameter to be populated + value = param.prepare( other_values, self.trans ) + if value is None: + return #we can go no further until we have a value for this parameter + other_values[ name ] = value + def display_url( self ): + assert self.display_ready(), 'Display is not yet ready, cannot generate display link' + return fill_template( self.link.url.text, context = self.parameters ) + def get_param_name_by_url( self, name ): + assert name in self.link.url_param_name_map, "Unknown URL parameter name provided: %s" % name + return self.link.url_param_name_map[ name ] + +class DisplayApplication( object ): + @classmethod + def from_file( cls, filename, datatypes_registry ): + return cls.from_elem( parse_xml( 
filename ).getroot(), datatypes_registry ) + @classmethod + def from_elem( cls, elem, datatypes_registry ): + display_id = elem.get( 'id', None ) + assert display_id, "ID tag is required for a Display Application" + name = elem.get( 'name', display_id ) + version = elem.get( 'version', None ) + rval = DisplayApplication( display_id, name, datatypes_registry, version ) + for link_elem in elem.findall( 'link' ): + link = DisplayApplicationLink.from_elem( link_elem, rval ) + if link: + rval.links[ link.id ] = link + return rval + def __init__( self, display_id, name, datatypes_registry, version = None ): + self.id = display_id + self.name = name + self.datatypes_registry = datatypes_registry + if version is None: + version = "1.0.0" + self.version = version + self.links = odict() + def get_link( self, link_name, data, trans ): + #returns a link object with data knowledge to generate links + return PopulatedDisplayApplicationLink( self.links[ link_name ], data, trans ) + diff -r 81d84a03f2ec -r 049a86b8691d lib/galaxy/datatypes/display_applications/helpers.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/lib/galaxy/datatypes/display_applications/helpers.py Fri Feb 12 11:01:30 2010 -0500 @@ -0,0 +1,31 @@ +import pkg_resources +pkg_resources.require( "pycrypto" ) +from Crypto.Cipher import Blowfish + +def encode_dataset_user( trans, dataset, user ): + #encode dataset id as usual + #encode user id using the dataset create time as the key + dataset_hash = trans.security.encode_id( dataset.id ) + if user is None: + user_id = 'None' + else: + user_id = str( user.id ) + # Pad to a multiple of 8 with leading "!" + user_id = ( "!" 
* ( 8 - len( user_id ) % 8 ) ) + user_id + cipher = Blowfish.new( str( dataset.create_time ) ) + return dataset_hash, cipher.encrypt( user_id ).encode( 'hex' ) + +def decode_dataset_user( trans, dataset_hash, user_hash ): + #decode dataset id as usual + #decode user id using the dataset create time as the key + dataset_id = trans.security.decode_id( dataset_hash ) + dataset = trans.sa_session.query( trans.app.model.HistoryDatasetAssociation ).get( dataset_id ) + assert dataset, "Bad Dataset id provided to decode_dataset_user" + cipher = Blowfish.new( str( dataset.create_time ) ) + user_id = cipher.decrypt( user_hash.decode( 'hex' ) ).lstrip( "!" ) + if user_id =='None': + user = None + else: + user = trans.sa_session.query( trans.app.model.User ).get( int( user_id ) ) + assert user, "A Bad user id was passed to decode_dataset_user" + return dataset, user diff -r 81d84a03f2ec -r 049a86b8691d lib/galaxy/datatypes/display_applications/parameters.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/lib/galaxy/datatypes/display_applications/parameters.py Fri Feb 12 11:01:30 2010 -0500 @@ -0,0 +1,195 @@ +#Contains parameters that are used in Display Applications +from helpers import encode_dataset_user +from galaxy.util import string_as_bool +from galaxy.util.bunch import Bunch +from galaxy.util.template import fill_template +from galaxy.web import url_for +import mimetypes + +DEFAULT_DATASET_NAME = 'dataset' + +class DisplayApplicationParameter( object ): + """ Abstract Class for Display Application Parameters """ + + type = None + + @classmethod + def from_elem( cls, elem, link ): + param_type = elem.get( 'type', None ) + assert param_type, 'DisplayApplicationParameter requires a type' + return parameter_type_to_class[ param_type ]( elem, link ) + def __init__( self, elem, link ): + self.name = elem.get( 'name', None ) + assert self.name, 'DisplayApplicationParameter requires a name' + self.link = link + self.url = elem.get( 'url', self.name ) #name used in url for 
display purposes defaults to name; e.g. want the form of file.ext, where a '.' is not allowed as python variable name/keyword + self.mime_type = elem.get( 'mimetype', None ) + self.guess_mime_type = string_as_bool( elem.get( 'guess_mimetype', 'False' ) ) + self.viewable = string_as_bool( elem.get( 'viewable', 'False' ) ) #only allow these to be viewed via direct url when explicitly set to viewable + self.strip = string_as_bool( elem.get( 'strip', 'False' ) ) + self.strip_https = string_as_bool( elem.get( 'strip_https', 'False' ) ) + def get_value( self, other_values, trans ): + raise Exception, 'Unimplemented' + def prepare( self, other_values, trans ): + return self.get_value( other_values, trans ) + def ready( self, other_values ): + return True + def is_preparing( self, other_values ): + return False + +class DisplayApplicationDataParameter( DisplayApplicationParameter ): + """ Parameter that returns a file_name containing the requested content """ + + type = 'data' + + def __init__( self, elem, link ): + DisplayApplicationParameter.__init__( self, elem, link ) + self.extensions = elem.get( 'format', None ) + if self.extensions: + self.extensions = self.extensions.split( "," ) + self.metadata = elem.get( 'metadata', None ) + self.dataset = elem.get( 'dataset', DEFAULT_DATASET_NAME ) # 'dataset' is default name assigned to dataset to be displayed + assert not ( self.extensions and self.metadata ), 'A format or a metadata can be defined for a DisplayApplicationParameter, but not both.' 
+ self.viewable = string_as_bool( elem.get( 'viewable', 'True' ) ) #data params should be viewable + self.force_url_param = string_as_bool( elem.get( 'force_url_param', 'False' ) ) + self.force_conversion = string_as_bool( elem.get( 'force_conversion', 'False' ) ) + @property + def formats( self ): + if self.extensions: + return tuple( map( type, map( self.link.display_application.datatypes_registry.get_datatype_by_extension, self.extensions ) ) ) + return None + def _get_dataset_like_object( self, other_values ): + #this returned object has file_name, state, and states attributes equivalent to a DatasetAssociation + data = other_values.get( self.dataset, None ) + assert data, 'Base dataset could not be found in values provided to DisplayApplicationDataParameter' + if isinstance( data, DisplayDataValueWrapper ): + data = data.value + if self.metadata: + rval = getattr( data.metadata, self.metadata, None ) + assert rval, 'Unknown metadata name (%s) provided for dataset type (%s).' % ( self.metadata, data.datatype.__class__.name ) + return Bunch( file_name = rval.file_name, state = data.state, states = data.states, extension='data' ) + elif self.extensions and ( self.force_conversion or not isinstance( data.datatype, self.formats ) ): + for ext in self.extensions: + rval = data.get_converted_files_by_type( ext ) + if rval: + return rval[0] + assert data.find_conversion_destination( self.formats )[0] is not None, "No conversion path found for data param: %s" % self.name + return None + return data + def get_value( self, other_values, trans ): + data = self._get_dataset_like_object( other_values ) + if data: + return DisplayDataValueWrapper( data, self, other_values, trans ) + return None + def prepare( self, other_values, trans ): + data = self._get_dataset_like_object( other_values ) + if not data and self.formats: + data = other_values.get( self.dataset, None ) + trans.sa_session.refresh( data ) + #start conversion + #FIXME: Much of this is copied (more than 
once...); should be some abstract method elsewhere called from here + #find target ext + target_ext, converted_dataset = data.find_conversion_destination( self.formats, converter_safe = True ) + if target_ext and not converted_dataset: + assoc = trans.app.model.ImplicitlyConvertedDatasetAssociation( parent = data, file_type = target_ext, metadata_safe = False ) + new_data = data.datatype.convert_dataset( trans, data, target_ext, return_output = True, visible = False ).values()[0] + new_data.hid = data.hid + new_data.name = data.name + trans.sa_session.add( new_data ) + trans.sa_session.flush() + assoc.dataset = new_data + trans.sa_session.add( assoc ) + trans.sa_session.flush() + elif converted_dataset and converted_dataset.state == converted_dataset.states.ERROR: + raise Exception, "Dataset conversion failed for data parameter: %s" % self.name + return self.get_value( other_values, trans ) + def is_preparing( self, other_values ): + value = self._get_dataset_like_object( other_values ) + if value and value.state in ( value.states.NEW, value.states.UPLOAD, value.states.QUEUED, value.states.RUNNING ): + return True + return False + def ready( self, other_values ): + value = self._get_dataset_like_object( other_values ) + if value: + if value.state == value.states.OK: + return True + elif value.state == value.states.ERROR: + raise Exception( 'A data display parameter is in the error state: %s' % ( self.name ) ) + return False + +class DisplayApplicationTemplateParameter( DisplayApplicationParameter ): + """ Parameter that returns a string containing the requested content """ + + type = 'template' + + def __init__( self, elem, link ): + DisplayApplicationParameter.__init__( self, elem, link ) + self.text = elem.text + def get_value( self, other_values, trans ): + value = fill_template( self.text, context = other_values ) + if self.strip: + value = value.strip() + return DisplayParameterValueWrapper( value, self, other_values, trans ) + +parameter_type_to_class = { 
DisplayApplicationDataParameter.type:DisplayApplicationDataParameter, DisplayApplicationTemplateParameter.type:DisplayApplicationTemplateParameter } + +class DisplayParameterValueWrapper( object ): + ACTION_NAME = 'param' + def __init__( self, value, parameter, other_values, trans ): + self.value = value + self.parameter = parameter + self.other_values = other_values + self.trans = trans + self._dataset_hash, self._user_hash = encode_dataset_user( trans, self.other_values[ DEFAULT_DATASET_NAME ], trans.user ) + def __str__( self ): + return str( self.value ) + def mime_type( self ): + if self.parameter.mime_type is not None: + return self.parameter.mime_type + if self.parameter.guess_mime_type: + mime, encoding = mimetypes.guess_type( self.parameter.url ) + if not mime: + mime = self.trans.app.datatypes_registry.get_mimetype_by_extension( self.parameter.url.split( "." )[ -1 ], None ) + if mime: + return mime + return 'text/plain' + @property + def url( self ): + base_url = self.trans.request.base + if self.parameter.strip_https and base_url[ : 5].lower() == 'https': + base_url = "http%s" % base_url[ 5: ] + return "%s%s" % ( base_url, url_for( controller = 'dataset', action = "display_application", dataset_id = self._dataset_hash, user_id = self._user_hash, app_name = self.parameter.link.display_application.id, link_name = self.parameter.link.id, app_action = self.action_name, action_param = self.parameter.url ) ) + @property + def action_name( self ): + return self.ACTION_NAME + @property + def qp( self ): + #returns quoted str contents + return self.other_values[ 'qp' ]( str( self ) ) + def __getattr__( self, key ): + return getattr( self.value, key ) + +class DisplayDataValueWrapper( DisplayParameterValueWrapper ): + ACTION_NAME = 'data' + def __str__( self ): + #string of data param is filename + return str( self.value.file_name ) + def mime_type( self ): + if self.parameter.mime_type is not None: + return self.parameter.mime_type + if
self.parameter.guess_mime_type: + mime, encoding = mimetypes.guess_type( self.parameter.url ) + if not mime: + mime = self.trans.app.datatypes_registry.get_mimetype_by_extension( self.parameter.url.split( "." )[ -1 ], None ) + if mime: + return mime + return self.other_values[ DEFAULT_DATASET_NAME ].get_mime() + @property + def action_name( self ): + if self.parameter.force_url_param: + return DisplayParameterValueWrapper.ACTION_NAME #forced to the parent's 'param' action + return self.ACTION_NAME + @property + def qp( self ): + #returns quoted url contents + return self.other_values[ 'qp' ]( self.url ) diff -r 81d84a03f2ec -r 049a86b8691d lib/galaxy/datatypes/interval.py --- a/lib/galaxy/datatypes/interval.py Thu Feb 11 16:18:45 2010 -0500 +++ b/lib/galaxy/datatypes/interval.py Fri Feb 12 11:01:30 2010 -0500 @@ -486,6 +486,36 @@ def get_track_type( self ): return "FeatureTrack", "interval_index" +class BedStrict( Bed ): + """Tab delimited data in strict BED format - no non-standard columns allowed""" + + file_ext = "bedstrict" + + #no user change of datatype allowed + allow_datatype_change = False + + #Read only metadata elements + MetadataElement( name="chromCol", default=1, desc="Chrom column", readonly=True, param=metadata.MetadataParameter ) + MetadataElement( name="startCol", default=2, desc="Start column", readonly=True, param=metadata.MetadataParameter ) #TODO: start and end should be able to be set to these or the proper thick[start/end]?
+ MetadataElement( name="endCol", default=3, desc="End column", readonly=True, param=metadata.MetadataParameter ) + MetadataElement( name="strandCol", desc="Strand column (click box & select)", readonly=True, param=metadata.MetadataParameter, no_value=0, optional=True ) + MetadataElement( name="nameCol", desc="Name/Identifier column (click box & select)", readonly=True, param=metadata.MetadataParameter, no_value=0, optional=True ) + MetadataElement( name="columns", default=3, desc="Number of columns", readonly=True, visible=False ) + + def __init__( self, **kwd ): + Tabular.__init__( self, **kwd ) + self.clear_display_apps() #only new style display applications for this datatype + + def set_meta( self, dataset, overwrite = True, **kwd ): + Tabular.set_meta( self, dataset, overwrite = overwrite, **kwd) #need column count first + if dataset.metadata.columns >= 4: + dataset.metadata.nameCol = 4 + if dataset.metadata.columns >= 6: + dataset.metadata.strandCol = 6 + + def sniff( self, filename ): + return False #NOTE: This would require aggressively validating the entire file + class _RemoteCallMixin: def _get_remote_call_url( self, redirect_url, site_name, dataset, type, app, base_url ): """Retrieve the URL to call out to an external site and retrieve data. 
diff -r 81d84a03f2ec -r 049a86b8691d lib/galaxy/datatypes/registry.py --- a/lib/galaxy/datatypes/registry.py Thu Feb 11 16:18:45 2010 -0500 +++ b/lib/galaxy/datatypes/registry.py Fri Feb 12 11:01:30 2010 -0500 @@ -6,6 +6,7 @@ import data, tabular, interval, images, sequence, qualityscore, genetics, xml, coverage, tracks, chrominfo, binary import galaxy.util from galaxy.util.odict import odict +from display_applications.application import DisplayApplication class ConfigurationError( Exception ): pass @@ -23,6 +24,7 @@ self.indexers = [] self.sniff_order = [] self.upload_file_formats = [] + self.display_applications = odict() #map a display application id to a display application if root_dir and config: # Parse datatypes_conf.xml tree = galaxy.util.parse_xml( config ) @@ -32,6 +34,7 @@ registration = root.find( 'registration' ) self.datatype_converters_path = os.path.join( root_dir, registration.get( 'converters_path', 'lib/galaxy/datatypes/converters' ) ) self.datatype_indexers_path = os.path.join( root_dir, registration.get( 'indexers_path', 'lib/galaxy/datatypes/indexers' ) ) + self.display_applications_path = os.path.join( root_dir, registration.get( 'display_path', 'display_applications' ) ) if not os.path.isdir( self.datatype_converters_path ): raise ConfigurationError( "Directory does not exist: %s" % self.datatype_converters_path ) if not os.path.isdir( self.datatype_indexers_path ): @@ -79,6 +82,17 @@ optional = composite_file.get( 'optional', False ) mimetype = composite_file.get( 'mimetype', None ) self.datatypes_by_extension[extension].add_composite_file( name, optional=optional, mimetype=mimetype ) + for display_app in elem.findall( 'display' ): + display_file = display_app.get( 'file', None ) + assert display_file is not None, "A file must be specified for a datatype display tag." 
+ display_app = DisplayApplication.from_file( os.path.join( self.display_applications_path, display_file ), self ) + if display_app: + if display_app.id in self.display_applications: + #if we already loaded this display application, we'll use the first one again + display_app = self.display_applications[ display_app.id ] + self.log.debug( "Loaded display application '%s' for datatype '%s'" % ( display_app.id, extension ) ) + self.display_applications[ display_app.id ] = display_app #Display app by id + self.datatypes_by_extension[ extension ].add_display_application( display_app ) except Exception, e: self.log.warning( 'Error loading datatype "%s", problem: %s' % ( extension, str( e ) ) ) @@ -213,13 +227,13 @@ def get_available_tracks(self): return self.available_tracks - def get_mimetype_by_extension(self, ext ): + def get_mimetype_by_extension(self, ext, default = 'application/octet-stream' ): """Returns a mimetype based on an extension""" try: mimetype = self.mimetypes_by_extension[ext] except KeyError: #datatype was never declared - mimetype = 'application/octet-stream' + mimetype = default self.log.warning('unknown mimetype in data factory %s' % ext) return mimetype diff -r 81d84a03f2ec -r 049a86b8691d lib/galaxy/web/buildapp.py --- a/lib/galaxy/web/buildapp.py Thu Feb 11 16:18:45 2010 -0500 +++ b/lib/galaxy/web/buildapp.py Fri Feb 12 11:01:30 2010 -0500 @@ -74,6 +74,7 @@ webapp.add_route( '/:controller/:action', action='index' ) webapp.add_route( '/:action', controller='root', action='index' ) webapp.add_route( '/datasets/:dataset_id/:action/:filename', controller='dataset', action='index', dataset_id=None, filename=None) + webapp.add_route( '/display_application/:dataset_id/:user_id/:app_name/:link_name/:app_action/:action_param', controller='dataset', action='display_application', dataset_id=None, user_id=None, app_name = None, link_name = None, app_action = None, action_param = None ) webapp.add_route( '/u/:username/d/:slug', controller='dataset', 
action='display_by_username_and_slug' ) webapp.add_route( '/u/:username/p/:slug', controller='page', action='display_by_username_and_slug' ) webapp.add_route( '/u/:username/h/:slug', controller='history', action='display_by_username_and_slug' ) diff -r 81d84a03f2ec -r 049a86b8691d lib/galaxy/web/controllers/dataset.py --- a/lib/galaxy/web/controllers/dataset.py Thu Feb 11 16:18:45 2010 -0500 +++ b/lib/galaxy/web/controllers/dataset.py Fri Feb 12 11:01:30 2010 -0500 @@ -4,6 +4,7 @@ from galaxy.web.framework.helpers import time_ago, iff, grids from galaxy import util, datatypes, jobs, web, model from cgi import escape, FieldStorage +from galaxy.datatypes.display_applications.helpers import decode_dataset_user from email.MIMEText import MIMEText @@ -188,7 +189,7 @@ dataset_id = int( dataset_id ) except ValueError: dataset_id = trans.security.decode_id( dataset_id ) - data = data = trans.sa_session.query( trans.app.model.HistoryDatasetAssociation ).get( dataset_id ) + data = trans.sa_session.query( trans.app.model.HistoryDatasetAssociation ).get( dataset_id ) if not data: raise paste.httpexceptions.HTTPRequestRangeNotSatisfiable( "Invalid reference dataset id: %s." % str( dataset_id ) ) current_user_roles = trans.get_current_user_roles() @@ -339,6 +340,74 @@ else: return trans.show_error_message( "You are not allowed to view this dataset at external sites. Please contact your Galaxy administrator to acquire management permissions for this dataset." ) + @web.expose + def display_application( self, trans, dataset_id=None, user_id=None, app_name = None, link_name = None, app_action = None, action_param = None ): + """Access to external display applications""" + #decode ids + data, user = decode_dataset_user( trans, dataset_id, user_id ) + if not data: + raise paste.httpexceptions.HTTPRequestRangeNotSatisfiable( "Invalid reference dataset id: %s." 
% str( dataset_id ) ) + if user: + user_roles = user.all_roles() + else: + user_roles = [] + if None in [ app_name, link_name ]: + return trans.show_error_message( "A display application name and link name must be provided." ) + + if app_action is None: + app_action = "display" # default action is display + + if trans.app.security_agent.can_access_dataset( user_roles, data.dataset ): + msg = [] + refresh = False + display_app = trans.app.datatypes_registry.display_applications.get( app_name ) + assert display_app, "Unknown display application has been requested: %s" % app_name + display_link = display_app.get_link( link_name, data, trans ) + assert display_link, "Unknown display link has been requested: %s" % link_name + if data.state == data.states.ERROR: + msg.append( ( 'This dataset is in an error state, you cannot view it at an external display application.', 'error' ) ) + elif data.deleted: + msg.append( ( 'This dataset has been deleted, you cannot view it at an external display application.', 'error' ) ) + elif data.state != data.states.OK: + msg.append( ( 'You must wait for this dataset to be created before you can view it at an external display application.', 'info' ) ) + refresh = True + else: + #We have permissions, dataset is not deleted and is in OK state, allow access + if display_link.display_ready(): + if app_action in [ 'data', 'param' ]: + assert action_param, "An action param must be provided for a data or param action" + #data is used for things with filenames that could be passed off to a proxy + #in case some display app wants all files to be in the same 'directory', + #data can be forced to param, but not the other way (no filename for other direction) + #get param name from url param name + action_param = display_link.get_param_name_by_url( action_param ) + value = display_link.get_param_value( action_param ) + assert value, "An invalid parameter name was provided: %s" % action_param + assert value.parameter.viewable, "This parameter is not 
viewable." + if value.parameter.type == 'data': + content_length = os.path.getsize( value.file_name ) + rval = open( value.file_name ) + else: + rval = str( value ) + content_length = len( rval ) + trans.response.set_content_type( value.mime_type() ) + trans.response.headers[ 'Content-Length' ] = content_length + return rval + elif app_action == "display": + return trans.fill_template_mako( "dataset/display_application/launch_display.mako", display_link = display_link ) + else: + msg.append( ( 'Invalid action provided: %s' % app_action, 'error' ) ) + else: + msg.append( ( 'This display application is being prepared.', 'info' ) ) + if app_action == "display": + refresh = True + if not display_link.preparing_display(): + display_link.prepare_display() + else: + raise Exception( 'Attempted a view action (%s) on a non-ready display application' % app_action ) + return trans.fill_template_mako( "dataset/display_application/display.mako", msg = msg, display_app = display_app, display_link = display_link, refresh = refresh ) + return trans.show_error_message( 'You do not have permission to view this dataset at an external display application.' 
) + def _undelete( self, trans, id ): try: id = int( id ) diff -r 81d84a03f2ec -r 049a86b8691d templates/dataset/display_application/display.mako --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/templates/dataset/display_application/display.mako Fri Feb 12 11:01:30 2010 -0500 @@ -0,0 +1,17 @@ +<%inherit file="/base.mako"/> +<%namespace file="/message.mako" import="render_msg" /> +<%def name="title()">Display Application: ${display_link.link.display_application.name} ${display_link.link.name}</%def> +<% refresh_rate = 10 %> +%if refresh: +<script type="text/javascript"> + setTimeout( "location.reload(true);", ${ refresh_rate * 1000 } ); +</script> +%endif +%for message, message_type in msg: + ${render_msg( message, message_type )} +%endfor +%if refresh: +<p> +This page will <a href="javascript:location.reload(true);">refresh</a> after ${refresh_rate} seconds. +</p> +%endif diff -r 81d84a03f2ec -r 049a86b8691d templates/dataset/display_application/launch_display.mako --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/templates/dataset/display_application/launch_display.mako Fri Feb 12 11:01:30 2010 -0500 @@ -0,0 +1,15 @@ +<%inherit file="/base.mako"/> +<%def name="title()">Launching Display Application: ${display_link.link.display_application.name} ${display_link.link.name}</%def> + +<script type="text/javascript"> + location.href = '${display_link.display_url()}'; +</script> +<p> +All data has been prepared for the external display application: ${display_link.link.display_application.name} ${display_link.link.name}. +</p> +<p> +You are now being automatically forwarded to the external application. +</p> +<p> +Click <a href="${display_link.display_url()}">here</a> if this redirect has failed. 
+</p> diff -r 81d84a03f2ec -r 049a86b8691d templates/root/history_common.mako --- a/templates/root/history_common.mako Thu Feb 11 16:18:45 2010 -0500 +++ b/templates/root/history_common.mako Fri Feb 12 11:01:30 2010 -0500 @@ -101,6 +101,12 @@ %endif %endfor %endif + %for display_app in data.datatype.display_applications.itervalues(): + | ${display_app.name} + %for link_app in display_app.links.itervalues(): + <a target="${link_app.url.get( 'target_frame', '_blank' )}" href="${link_app.get_display_url( data, trans )}">${_(link_app.name)}</a> + %endfor + %endfor </div> %if data.peek != "no peek": <div><pre id="peek${data.id}" class="peek">${_(data.display_peek())}</pre></div>
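For readers following the mime_type() methods in the value wrappers above, the fallback order can be sketched in isolation: an explicitly configured type wins, then a guess from the URL, then a lookup by the text after the last dot, then a default. This is only an illustrative sketch, not the changeset's code: `EXTENSION_MIMETYPES` and `guess_param_mime_type` are hypothetical names, the dict stands in for the datatypes registry, and the mimetype values in it are made up for the example.

```python
import mimetypes

# Hypothetical stand-in for the registry's extension -> mimetype table;
# the entries are illustrative assumptions, not Galaxy's real values.
EXTENSION_MIMETYPES = {
    "bedstrict": "text/tab-separated-values",
}


def guess_param_mime_type(url, explicit=None, default="text/plain"):
    """Resolve a mimetype: explicit value, then URL guess, then extension table, then default."""
    if explicit is not None:
        return explicit
    mime, _encoding = mimetypes.guess_type(url)
    if not mime:
        # Split the URL string on "." and take the last piece as the extension.
        extension = url.split(".")[-1]
        mime = EXTENSION_MIMETYPES.get(extension)
    return mime or default


if __name__ == "__main__":
    print(guess_param_mime_type("galaxy.bedstrict"))
    print(guess_param_mime_type("galaxy.html"))
```

Note that the string being split is the URL and the separator is the dot; with the two swapped, the result is always a one-element list containing "." and the extension lookup can never match.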