Hi all!
I want to load an R workspace (.rdat file, R Project) within a Galaxy
module and have therefore built a Galaxy .rdat datatype (binary).
.rdat files are gzipped and are only recognized by R if they are
still compressed.
However, as I found out using a hex editor, the corresponding .dat file
is an uncompressed version of the original .rdat file.
I couldn't find any documentation on how to change this behaviour, nor
answers to similar questions on this list.
I would be grateful for any answer that points me in the right direction.
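
The hex-editor comparison can also be reproduced with a short script.
This is only a minimal sketch in Python 3: it assumes that a gzipped
file can be recognized by the two gzip magic bytes (0x1f 0x8b) at its
start. The function names and the file names in demo() are hypothetical
and are not part of Galaxy.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def demo():
    """Write one gzipped and one plain file and report what is_gzipped sees."""
    # Placeholder content only; this is not a real R workspace.
    payload = b"RDX2\nplaceholder payload"
    with tempfile.TemporaryDirectory() as tmp:
        packed = os.path.join(tmp, "example.rdat")  # hypothetical name
        plain = os.path.join(tmp, "dataset_1.dat")  # hypothetical name
        with gzip.open(packed, "wb") as handle:
            handle.write(payload)
        with open(plain, "wb") as handle:
            handle.write(payload)
        return is_gzipped(packed), is_gzipped(plain)
```

Running is_gzipped() on the original .rdat file and on the .dat file
should show whether the compression was removed somewhere in between.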
Details
#####
datatypes_conf.xml:
-----------------------------------
<?xml version="1.0"?>
<datatypes>
  <registration converters_path="lib/galaxy/datatypes/converters"
                display_path="display_applications">
    [...]
    <datatype extension="rdat" type="galaxy.datatypes.binary:Rdat"
              mimetype="application/octet-stream" display_in_upload="true"/>
    [...]
  </registration>
  <sniffers>
    [...]
    <sniffer type="galaxy.datatypes.binary:Rdat"/>
    [...]
  </sniffers>
</datatypes>
binary.py:
------------------
[...]
class Rdat( Binary ):
    """Class describing an rdat binary file (R-workspace)"""
    file_ext = "rdat"
    #MetadataElement( name="Rdat", desc="R-workspace",
    #    param=metadata.FileParameter, readonly=True, no_value=None,
    #    visible=False, optional=True )
    """
    def __init__( self, **kwd ):
        Binary.__init__( self, **kwd )
        self._name = "Rdat"
    """
    def set_peek( self, dataset, is_multi_byte=False ):
        if not dataset.dataset.purged:
            dataset.peek = "Binary rdat file (R-workspace)"
            dataset.blurb = data.nice_size( dataset.get_size() )
        else:
            dataset.peek = 'file does not exist'
            dataset.blurb = 'file purged from disk'
    def display_peek( self, dataset ):
        try:
            return dataset.peek
        except:
            return "Binary rdat file (%s)" % ( data.nice_size( dataset.get_size() ) )
    def get_mime( self ):
        """Returns the mime type of the datatype"""
        return 'application/octet-stream'
    def sniff( self, filename ):
        # rdat is compressed in the gzip format and must not be
        # uncompressed in Galaxy.
        # The first 4 bytes of any rdat file are RDX2.
        try:
            header = gzip.open( filename ).read(4)  # read the first 4 bytes
            if binascii.b2a_hex( header ) == binascii.hexlify( 'RDX2' ):
                # the RDX2 signature is present in the gzipped file
                return True
        except:
            # not readable as gzip; fall through to the plain check
            pass
        try:
            header = open( filename ).read(4)  # read the first 4 bytes
            if binascii.b2a_hex( header ) == binascii.hexlify( 'RDX2' ):
                # the RDX2 signature is present in the uncompressed file
                return True
            return False
        except:
            return False
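
To try the signature check outside Galaxy, the sniffing logic can be
tested on its own. The following is only a sketch in Python 3 (the class
above is Python 2 code); sniff_rdat() and demo() are hypothetical names
and are not Galaxy functions. It reports whether the RDX2 signature was
found in gzipped or in plain data.

```python
import gzip
import os
import tempfile
import zlib

RDATA_MAGIC = b"RDX2"  # signature named in the sniffer comments


def sniff_rdat(filename):
    """Return 'gzip', 'plain', or None depending on where RDX2 is found."""
    try:
        with gzip.open(filename, "rb") as handle:
            if handle.read(4) == RDATA_MAGIC:
                return "gzip"
    except (OSError, EOFError, zlib.error):
        pass  # not a readable gzip stream; try the raw bytes instead
    with open(filename, "rb") as handle:
        if handle.read(4) == RDATA_MAGIC:
            return "plain"
    return None


def demo():
    """Sniff a gzipped file, a plain file, and an unrelated file."""
    # Placeholder content only; this is not a real R workspace.
    payload = RDATA_MAGIC + b"\nplaceholder bytes"
    with tempfile.TemporaryDirectory() as tmp:
        packed = os.path.join(tmp, "packed.rdat")
        plain = os.path.join(tmp, "plain.dat")
        other = os.path.join(tmp, "other.txt")
        with gzip.open(packed, "wb") as handle:
            handle.write(payload)
        with open(plain, "wb") as handle:
            handle.write(payload)
        with open(other, "wb") as handle:
            handle.write(b"hello world")
        return sniff_rdat(packed), sniff_rdat(plain), sniff_rdat(other)
```

Distinguishing the two cases, instead of returning a bare True, makes it
visible whether a given file still carries its gzip compression.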
--
Dr. Christian Hundsrucker
Institute for Functional Genomics
Computational Diagnostics Group
University of Regensburg
Josef Engertstr. 9
93053 Regensburg, Germany