Hello Leandro,

I believe this behavior is due to the make_library_uploaded_dataset() method in the ~/lib/galaxy/web/controllers/library_common controller.  The current method looks like this:

    def make_library_uploaded_dataset( self, trans, cntrller, params, name, path, type, library_bunch, in_folder=None ):
        library_bunch.replace_dataset = None # not valid for these types of upload
        uploaded_dataset = util.bunch.Bunch()
        # Remove compressed file extensions, if any
        new_name = name
        if new_name.endswith( '.gz' ):
            new_name = new_name.rstrip( '.gz' )
        elif new_name.endswith( '.zip' ):
            new_name = new_name.rstrip( '.zip' )
        uploaded_dataset.name = new_name
        uploaded_dataset.path = path
        uploaded_dataset.type = type
        uploaded_dataset.ext = None
        uploaded_dataset.file_type = params.file_type
        uploaded_dataset.dbkey = params.dbkey
        uploaded_dataset.space_to_tab = params.space_to_tab
        if in_folder:
            uploaded_dataset.in_folder = in_folder
        uploaded_dataset.data = upload_common.new_upload( trans, cntrller, uploaded_dataset, library_bunch )
        link_data_only = params.get( 'link_data_only', 'copy_files' )
        uploaded_dataset.link_data_only = link_data_only
        if link_data_only == 'link_to_files':
            uploaded_dataset.data.file_name = os.path.abspath( path )
            # Since we are not copying the file into Galaxy's managed
            # default file location, the dataset should never be purgable.
            uploaded_dataset.data.dataset.purgable = False
            trans.sa_session.add_all( ( uploaded_dataset.data, uploaded_dataset.data.dataset ) )
            trans.sa_session.flush()
        return uploaded_dataset

Here are the code changes that I believe will resolve the issue.  I have not tested them, so please let me know whether they work for you; if they do, I'll commit the changes to the central repo.

    def make_library_uploaded_dataset( self, trans, cntrller, params, name, path, type, library_bunch, in_folder=None ):
        link_data_only = params.get( 'link_data_only', 'copy_files' )
        library_bunch.replace_dataset = None # not valid for these types of upload
        uploaded_dataset = util.bunch.Bunch()
        new_name = name
        # Remove compressed file extensions, if any, but only if
        # we're copying files into Galaxy's file space.
        if link_data_only == 'copy_files':
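            if new_name.endswith( '.gz' ):
                # Slice the suffix off; rstrip() removes a set of
                # characters, not a suffix ( 'log.gz' would become 'lo' ).
                new_name = new_name[ :-len( '.gz' ) ]
            elif new_name.endswith( '.zip' ):
                new_name = new_name[ :-len( '.zip' ) ]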
        uploaded_dataset.name = new_name
        uploaded_dataset.path = path
        uploaded_dataset.type = type
        uploaded_dataset.ext = None
        uploaded_dataset.file_type = params.file_type
        uploaded_dataset.dbkey = params.dbkey
        uploaded_dataset.space_to_tab = params.space_to_tab
        if in_folder:
            uploaded_dataset.in_folder = in_folder
        uploaded_dataset.data = upload_common.new_upload( trans, cntrller, uploaded_dataset, library_bunch )
        uploaded_dataset.link_data_only = link_data_only
        if link_data_only == 'link_to_files':
            uploaded_dataset.data.file_name = os.path.abspath( path )
            # Since we are not copying the file into Galaxy's managed
            # default file location, the dataset should never be purgable.
            uploaded_dataset.data.dataset.purgable = False
            trans.sa_session.add_all( ( uploaded_dataset.data, uploaded_dataset.data.dataset ) )
            trans.sa_session.flush()
        return uploaded_dataset
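
A side note on removing the extension: Python's str.rstrip() treats its argument as a set of characters rather than a suffix, so a name like log.gz would lose more than the extension.  Slicing the matched suffix off avoids that.  Below is a minimal sketch of the naming logic on its own, not tested inside Galaxy.  The name strip_compressed_ext is a hypothetical helper, not an existing Galaxy function, and the sketch assumes link_data_only takes the values 'copy_files' and 'link_to_files' as in the method above.

```python
def strip_compressed_ext(name, link_data_only='copy_files'):
    """Hypothetical helper (not part of Galaxy).

    Drop a trailing '.gz' or '.zip' from name, but only when files are
    being copied into Galaxy's file space.  Linked files keep their name.
    """
    if link_data_only != 'copy_files':
        return name
    for ext in ('.gz', '.zip'):
        if name.endswith(ext):
            # Slice off exactly the matched suffix.
            return name[:-len(ext)]
    return name


# rstrip() removes any trailing '.', 'g' or 'z' characters, not the suffix:
print('log.gz'.rstrip('.gz'))
# The helper removes only the extension:
print(strip_compressed_ext('log.gz'))
# With linking, the name is left untouched:
print(strip_compressed_ext('reads.fastq.gz', 'link_to_files'))
```

The same slicing could be used directly in make_library_uploaded_dataset() in place of the rstrip() calls.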


Thanks!

Greg


On Jan 20, 2012, at 7:42 AM, Leandro Hermida wrote:

Hello,

We've created a new binary datatype for .fastq.gz files following the
same methodology as the BAM files since we don't want our fasta.gz
files to be gunzipped.  I added the appropriate code in upload.py to
make sure of this.  This new datatype and extension successfully does
not gunzip our files.  But when we upload a file into a data library via
the "Upload via filesystem paths" option, it for some reason
automatically strips the .gz part out. When we take the same .fastq.gz
file and upload it via Get Data -> Upload File it works fine; nothing
is stripped from the file name. Where is it doing this, and how can we
prevent it from stripping the .gz via the data library menus?

thanks,
Leandro

Greg Von Kuster
Galaxy Development Team
greg@bx.psu.edu