On Tue, Feb 19, 2013 at 6:45 PM, Nicola Soranzo <soranzo@crs4.it> wrote:
On Tue, 19/02/2013 at 14:15 +0000, Peter Cock wrote:
On Tue, Feb 19, 2013 at 2:00 PM, James Taylor <james@jamestaylor.org> wrote:
On Tue, Feb 19, 2013 at 6:32 AM, Peter Cock wrote:
I think it could make sense to define generic 'asn1' and 'asn1-binary' formats in the Galaxy core (name suggestions welcome)
What about
extension="asn" type="galaxy.datatypes.data:GenericAsn1"
and
extension="asnb" type="galaxy.datatypes.binary:GenericAsn1Binary"
like the GenericXml class?
Those seem sensible to me as the class names, although I'm not so sure about the format names (aka 'extensions' in Galaxy terms). I'd prefer to see the '1' in the name for clarity. My suggestions of 'asn1' and 'asn1-binary' were based on NCBI usage. Perhaps the Galaxy team could comment on their views on conciseness versus clarity in file format names for Galaxy?
Would a pull request implementing this be acceptable?
Yes. My understanding is that ASN is a completely flexible metaformat, like XML, and so should be under either Text or Data, with appropriate subtypes defined for blast, et cetera.
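To make the proposed hierarchy concrete, here is a rough sketch of how the generic formats and a BLAST subtype might relate. The base classes `Data`, `Text` and `Binary` below are stand-ins written only so the snippet runs on its own; they are not Galaxy's real classes. `BlastAsn1` and its extension are purely illustrative names, and the 'asn1' / 'asn1-binary' extensions are one of the suggestions in this thread, not a settled choice.

```python
# Stand-ins for Galaxy's base datatype classes so the sketch is self-contained.
# These are assumptions for illustration, not the real Galaxy implementations.

class Data:
    """Hypothetical stand-in for a generic base datatype."""
    file_ext = "data"


class Text(Data):
    """Hypothetical stand-in for a plain-text base datatype."""
    file_ext = "txt"


class Binary(Data):
    """Hypothetical stand-in for a binary base datatype."""
    file_ext = "binary"


class GenericAsn1(Text):
    """Generic ASN.1 text format (class name as proposed in this thread)."""
    file_ext = "asn1"  # extension still under discussion


class GenericAsn1Binary(Binary):
    """Generic ASN.1 binary format (class name as proposed in this thread)."""
    file_ext = "asn1-binary"  # extension still under discussion


class BlastAsn1(GenericAsn1):
    """Hypothetical BLAST-specific subtype; name and extension are illustrative."""
    file_ext = "blast-asn1"
```

The point is only that tool-specific formats would subclass the generic ones, so a tool accepting generic ASN.1 text would also accept the BLAST variant.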
Thanks James,
Yes - very like XML, but with the subtlety that ASN.1 comes in text and binary flavours (which I presume applies to all the variants, although the binary versions may not be as commonly used for the smaller files).
Nicola - do you want to make a pull request to galaxy-central defining ASN.1 text and binary formats (which we can then subclass for the NCBI BLAST+ wrappers)? Or should I?
If you mean a minimal implementation, I can surely do that. If something more elaborate is needed, then you are probably more qualified than me!
The current minimal implementation you sent me for BLAST+ would be an excellent start. Things like data type sniffers etc. would be a nice-to-have feature, but can be added later I think. And the sooner this gets into the Galaxy core, the sooner we can use it in the BLAST+ wrappers :)
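For whenever a sniffer does get added, a minimal heuristic for the text flavour could look something like the sketch below. It assumes that an ASN.1 text file starts with a "TypeName ::=" header on its first non-blank line (for example "Seq-entry ::= {"). The function name `looks_like_asn1_text` and the `max_lines` parameter are made up for this example and are not part of Galaxy's sniffer interface.

```python
import re

# Assumption: a text ASN.1 value starts with "TypeName ::=" on its first
# non-blank line, e.g. "Seq-entry ::= {". This is a heuristic, not a parser.
_ASN1_HEADER = re.compile(r"^[A-Za-z][A-Za-z0-9-]*\s*::=")


def looks_like_asn1_text(filename, max_lines=20):
    """Return True if the first non-blank line resembles an ASN.1 text header."""
    with open(filename, "r", errors="replace") as handle:
        for _ in range(max_lines):
            line = handle.readline()
            if not line:  # end of file reached without finding content
                return False
            stripped = line.strip()
            if stripped:  # the first non-blank line decides the outcome
                return bool(_ASN1_HEADER.match(stripped))
    return False
```

This would not help with the binary flavour, which would need a separate check of its own.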
I think the mime-type for the base ASN.1 text format should probably be text/plain based on the NCBI usage patterns.
Ok.
I'm not sure what the mime-type for the base ASN.1 binary format should be (but I don't think it should be chemical/ncbi-asn1-binary).
application/octet-stream ?
Probably OK - we can/should test this by uploading a test binary ASN.1 file into Galaxy and downloading it out again. Thanks, Peter
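One simple way to check the upload/download round trip is to compare checksums of the file before upload and after download. The sketch below uses only the Python standard library; the function names `sha256_of` and `round_trip_intact` are made up for this example, and the file paths are whatever the tester chooses.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in binary chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def round_trip_intact(uploaded_path, downloaded_path):
    """Return True if both files have byte-for-byte identical content."""
    return sha256_of(uploaded_path) == sha256_of(downloaded_path)
```

If the digests differ for a binary ASN.1 file, something on the way in or out (for example line-ending conversion) altered the bytes.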