I understand that instead of having one dataset with multiple files you
are planning to use existing datasets and combine them in a ‘collection’.
My concerns are:
This needs to be fleshed out much more, but this is not exactly what
we are thinking. The main change is to make it possible for a history
to contain items other than datasets. Groups of datasets would be one
such thing. Multifile datasets would be another. Workflow invocations
a third (needed to support extensions to the workflow system we are
proposing).
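To make that a bit more concrete, here is a rough sketch of the shape we
have in mind. None of this is existing code: every class and field name
below (HistoryItem, DatasetGroup, MultifileDataset, WorkflowInvocation,
History) is hypothetical and only meant to illustrate a history that
holds more than one kind of item.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HistoryItem:
    """Hypothetical base class: anything a history can contain."""
    name: str


@dataclass
class Dataset(HistoryItem):
    """A plain single-file dataset, as histories hold today."""
    file_path: str = ""


@dataclass
class DatasetGroup(HistoryItem):
    """A group of existing datasets."""
    members: List[Dataset] = field(default_factory=list)


@dataclass
class MultifileDataset(HistoryItem):
    """One dataset backed by many files."""
    file_paths: List[str] = field(default_factory=list)


@dataclass
class WorkflowInvocation(HistoryItem):
    """A record of one run of a workflow."""
    workflow_name: str = ""


@dataclass
class History:
    """A history is an ordered list of items of any of the kinds above."""
    items: List[HistoryItem] = field(default_factory=list)

    def add(self, item: HistoryItem) -> None:
        self.items.append(item)


history = History()
history.add(Dataset(name="reads", file_path="/data/reads.txt"))
history.add(MultifileDataset(name="run 1", file_paths=["a.txt", "b.txt"]))
history.add(WorkflowInvocation(name="invocation 1", workflow_name="qc"))
```

The point is only that the history stops assuming every item is a
dataset; the details of each item type still need to be worked out.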
1. Our data consists of 200-8000 files, can you imagine how many datasets
we’ll end up with? It will be a mess.
Yes, it would, which is why there does need to be the concept of a
homogeneous dataset collection to support this.
5. We are already using the “m:xxx” type datasets (thanks John) in our
project, I guess you don’t even have a timeframe for implementing the
“collection” concept? I’m sure that for many projects using multi file
datasets is a requirement now, not in ‘years’ time.
We recognize the need, but implementing these using the existing
datasets with a prefix on the extension, and then special casing all
over the place, is not a maintainable solution going forward. They
should be implemented as their own entity.
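A small illustration of the difference, as a sketch only. The names here
(PrefixedDataset, MultifileEntity, file_count_prefixed) are made up for
this example, and "m:fastq" is just an assumed instance of the “m:xxx”
pattern; this is not how either approach is actually coded.

```python
from dataclasses import dataclass, field
from typing import List


# Approach 1: an ordinary dataset whose extension carries a prefix.
@dataclass
class PrefixedDataset:
    name: str
    extension: str  # e.g. "m:fastq" (assumed example of "m:xxx")


def file_count_prefixed(ds: PrefixedDataset, files_on_disk: List[str]) -> int:
    # Any code that touches a dataset has to repeat a string check like
    # this one; that is the special casing we want to avoid.
    if ds.extension.startswith("m:"):
        return len(files_on_disk)
    return 1


# Approach 2: the multi-file dataset is its own entity.
@dataclass
class MultifileEntity:
    name: str
    extension: str
    files: List[str] = field(default_factory=list)

    def file_count(self) -> int:
        # The behaviour lives with the type; callers need no prefix test.
        return len(self.files)
```

With the second form, code that does not care about multi-file datasets
never sees them, and code that does care asks the object directly.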