A converted dataset would be fine too. I'm working on an enhancement that would allow the metadata to be provided when the file is uploaded/registered via the API. So to do what you say, I'd need a way of providing that converted dataset.

The files I'm talking about are concatenated GZIP files, and the GZIP format specification doesn't contain any information about the size of the compressed data, only the uncompressed size (and then, modulo 2^32). AFAIK, anything in Galaxy that tried to create the auxiliary index would need to read and decompress all the data in the file to do that - easily an hour's worth of work for some of our full genome runs. We have all that information already when we make the file, so I'd prefer to just give it to Galaxy at the start.

I could place this information in a special section in the first GZIP header, but then this capability would not be as general-purpose as it could be. I also want to prevent unnecessary gzip decompression in Python, because serious decompression in versions before 2.7 is so slow as to be unusable for large datasets.

Is there a way to upload that converted dataset when I upload/register the main file? I'd also need to know how to write such a file.

John Duddy
Sr. Staff Software Engineer
Illumina, Inc.
9885 Towne Centre Drive
San Diego, CA 92121
Tel: 858-736-3584
E-mail: jduddy@illumina.com

-----Original Message-----
From: James Taylor [mailto:james@jamestaylor.org]
Sent: Friday, August 26, 2011 5:37 AM
To: Duddy, John
Cc: galaxy-dev
Subject: Re: [galaxy-dev] Storing a dict as metadata

Hey John, are you sure you don't want to use a "converted dataset" rather than a metadata element for this? This is how we handle most types of secondary indexes for visualization. If you do it this way, the converter that creates the offset index is just another tool (but registered in datatypes_conf.xml), and the index itself is another dataset that can be accessed through the converted datasets relationship.

On Aug 25, 2011, at 6:12 PM, Duddy, John wrote:
I'd like to have a datatype with a dict as metadata. This dict() would store file offsets to enable seeking around to process different sections of the file.
How do I add a dictionary metadata element?
John Duddy Sr. Staff Software Engineer Illumina, Inc. 9885 Towne Centre Drive San Diego, CA 92121 Tel: 858-736-3584 E-mail: jduddy@illumina.com
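For readers following the thread, the offset-dictionary idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Galaxy code or the format Illumina uses: the helper names `write_members` and `read_member`, the section names "chr1"/"chr2", and the JSON serialisation of the dict are all hypothetical. It uses the Python 3 standard library, so it does not reflect the pre-2.7 performance concern raised in the thread.

```python
import gzip
import io
import json


def write_members(stream, chunks):
    """Write each (name, payload) pair as its own gzip member.

    Returns a dict mapping name -> [byte offset, compressed length],
    recorded at write time so nothing has to be decompressed later
    just to find where a member starts.
    """
    index = {}
    for name, payload in chunks:
        start = stream.tell()
        stream.write(gzip.compress(payload))
        index[name] = [start, stream.tell() - start]
    return index


def read_member(stream, index, name):
    """Seek straight to one member and decompress only that member."""
    offset, length = index[name]
    stream.seek(offset)
    return gzip.decompress(stream.read(length))


# Hypothetical data: two sections written into one concatenated gzip stream.
buf = io.BytesIO()
index = write_members(buf, [("chr1", b"ACGT" * 1000), ("chr2", b"TTGA" * 500)])

# The dict survives a round trip through JSON, one possible way to hand it
# over separately from the data file (an assumption, not a Galaxy format).
restored = json.loads(json.dumps(index))
chr2 = read_member(buf, restored, "chr2")
```

Because the offsets are captured while the file is being produced, the producer can supply the index up front, which is the scenario described in the reply; the stream as a whole still remains a valid concatenated gzip file.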