I understand that instead of having one dataset with multiple files you are planning to use existing datasets and combine them in a ‘collection’. My concerns are:
This needs to be fleshed out much more, but this is not exactly what we are thinking. The main change is to make it possible for a history to contain items other than datasets. Groups of datasets would be one such thing. Multifile datasets would be another. Workflow invocations would be a third (needed to support extensions to the workflow system we are proposing).
1. Our data consists of 200-8000 files, can you imagine how many datasets we’ll end up with? It will be a mess.
Yes, it would, which is why there does need to be the concept of a homogeneous dataset collection to support this.
5. We are already using the “m:xxx” type datasets (thanks John) in our project. I guess you don’t even have a timeframe for implementing the “collection” concept? I’m sure that for many projects, using multi-file datasets is a requirement now, not in ‘years’ time.
We recognize the need, but implementing these using the existing datasets with a prefix on the extension, and then special casing all over the place, is not a maintainable solution going forward. They should be implemented as their own entity.
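To make "their own entity" a bit more concrete, here is a rough sketch of the idea. Every name in it (`Dataset`, `DatasetCollection`, `History`, `build_example`) is made up for illustration and is not the actual data model; it only assumes that a history holds a list of items, and that a homogeneous collection is one kind of item alongside plain datasets.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Dataset:
    name: str
    ext: str  # plain extension, with no "m:" style prefix to special-case


@dataclass
class DatasetCollection:
    """Hypothetical first-class entity grouping datasets of a single type."""
    name: str
    ext: str
    members: List[Dataset] = field(default_factory=list)

    def add(self, dataset: Dataset) -> None:
        # Homogeneous: every member must share the collection's extension.
        if dataset.ext != self.ext:
            raise ValueError(f"expected {self.ext!r}, got {dataset.ext!r}")
        self.members.append(dataset)


# A history item is either a single dataset or a collection (assumption:
# other item kinds, such as workflow invocations, would be added here too).
HistoryItem = Union[Dataset, DatasetCollection]


@dataclass
class History:
    items: List[HistoryItem] = field(default_factory=list)


def build_example(n_files: int) -> History:
    """Put n_files datasets into one collection, shown as one history item."""
    reads = DatasetCollection(name="reads", ext="fastq")
    for i in range(n_files):
        reads.add(Dataset(name=f"sample_{i}.fastq", ext="fastq"))
    return History(items=[reads])
```

With something along these lines, 8000 files would show up as a single item in the history rather than 8000 separate datasets, and the type check lives in one place instead of being scattered wherever an extension prefix is tested.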