
Edward Kirton wrote:
In your position I agree that is a pragmatic choice.
Thanks for helping me muddle through my options.
You might be able to modify the file upload code to gzip any FASTQ files... that would prevent uncompressed FASTQ getting into new histories.
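Roughly, something like this — just a standalone sketch of the idea, not actual Galaxy upload code, and the function name is made up:

    import gzip
    import os
    import shutil

    def gzip_fastq_on_upload(path):
        """Hypothetical upload hook: gzip a plain-text FASTQ file.

        Files that already start with the gzip magic bytes (0x1f 0x8b) are
        left alone; everything else is copied through gzip and the original
        removed, so only compressed FASTQ would land in the history.
        """
        with open(path, 'rb') as handle:
            if handle.read(2) == b'\x1f\x8b':
                return path  # already compressed
        gz_path = path + '.gz'
        with open(path, 'rb') as src, gzip.open(gz_path, 'wb') as dst:
            shutil.copyfileobj(src, dst)
        os.remove(path)
        return gz_path

The catch, of course, is that every downstream tool then has to be able to read the gzipped file, which is what the rest of this thread is about.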
Right!
I wonder if Galaxy would benefit from a new fastqsanger-gzip (etc.) datatype? However, this seems generally useful (not just for FASTQ), so perhaps a more general mechanism would be better, where tool XML files can say which file types they accept and which of those can/must be compressed (possibly not just in gzip format?).
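For the datatype half of that, the sniffer mostly just needs to look through the gzip layer; a rough sketch of the check (illustrative only, not a real Galaxy datatype class, and it doesn't verify the Sanger quality encoding):

    import gzip

    def sniff_fastqsanger_gz(path):
        """Return True if `path` looks like a gzip-compressed FASTQ file."""
        # First, the gzip magic bytes.
        with open(path, 'rb') as handle:
            if handle.read(2) != b'\x1f\x8b':
                return False
        # Then peek at the first record: '@' header on line 1, '+' on line 3.
        with gzip.open(path, 'rt') as handle:
            lines = [handle.readline() for _ in range(4)]
        return all(lines) and lines[0].startswith('@') and lines[2].startswith('+')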
Perhaps we can flesh out what more general solutions would look like...
Imagine the fastq datatypes were left alone and instead there's a mechanism by which files that haven't been used as input for x days get compressed by a cron job. The file server knows how to uncompress such files on the fly when needed. For the most part, files would stay uncompressed while they're being actively analyzed and get compressed once they just sit as an archive within Galaxy.
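In rough terms the cron sweep might look like this — the paths and the idle threshold are invented for illustration, and it assumes file access times are a usable proxy for "last used as input" (i.e. the filesystem isn't mounted noatime):

    import gzip
    import os
    import shutil
    import time

    DATASET_DIR = '/galaxy/database/files'   # illustrative path
    MAX_IDLE_DAYS = 30                        # the "x days" above

    def compress_idle_datasets(root=DATASET_DIR, max_idle_days=MAX_IDLE_DAYS):
        """Gzip dataset files that haven't been read for max_idle_days."""
        cutoff = time.time() - max_idle_days * 86400
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if name.endswith('.gz') or os.path.getatime(path) > cutoff:
                    continue
                with open(path, 'rb') as src, gzip.open(path + '.gz', 'wb') as dst:
                    shutil.copyfileobj(src, dst)
                os.remove(path)

    def open_dataset(path):
        """Serve a dataset, decompressing on the fly if it was swept."""
        if os.path.exists(path):
            return open(path, 'rb')
        return gzip.open(path + '.gz', 'rb')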
Ideally, there'd just be a column on the dataset table indicating whether the dataset is compressed or not, and then tools would get a new way to indicate whether they can read compressed inputs directly or need the input decompressed first. --nate
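To make that concrete, the per-job decision would be something like the following — the compressed column and the tool flag are hypothetical names, not real Galaxy tables or attributes:

    import gzip
    import os
    import shutil
    import tempfile

    def path_for_tool(dataset_path, dataset_compressed, tool_reads_gzip):
        """Pick a readable path for a job, decompressing only when necessary.

        `dataset_compressed` would come from the new column on the dataset
        table; `tool_reads_gzip` from a new flag in the tool's XML wrapper.
        """
        if not dataset_compressed or tool_reads_gzip:
            return dataset_path
        # Tool can't read gzip directly: stage an uncompressed temporary copy.
        fd, tmp_path = tempfile.mkstemp(suffix='.dat')
        with gzip.open(dataset_path, 'rb') as src, os.fdopen(fd, 'wb') as dst:
            shutil.copyfileobj(src, dst)
        return tmp_path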
An even simpler solution would be an archive/compress button which users could use when they're done with a history. Users could still copy (uncompressed) datasets into a new history for further analysis.
Of course there's also the solution mentioned at the 2010 Galaxy Developer Conference about automatic compression at the system level. It's not a possibility for me, but it is attractive.