I'm working on unpacking a zip file into multiple datasets.
I think this is the code path:

Upload.py
  UploadToolAction
upload_common.py:
  get_uploaded_datasets
  new_upload
  new_history_upload or new_library_upload

Then a job gets spooled, which calls add_file in data_source/upload.py
and does the expansion of the zip.
I can unpack the zip and create files in the dataset's path there.
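Roughly what I'm doing for that step (a minimal sketch only; zip_path and
target_dir are placeholder names, and it's just the standard zipfile module,
not any Galaxy helper):

import os
import zipfile

def unpack_zip(zip_path, target_dir):
    # Extract every regular file in the uploaded zip into target_dir
    # (standing in for the dataset's path) and return the new file paths.
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)
    extracted = []
    with zipfile.ZipFile(zip_path) as zf:
        for member in zf.namelist():
            if member.endswith('/'):   # skip directory entries
                continue
            zf.extract(member, target_dir)
            extracted.append(os.path.join(target_dir, member))
    return extracted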
But I don't know how to create more dataset associations, and I'm not sure that it makes sense to create datasets on the fly in data_source/upload.py.
Should I pass some information along to data_source/upload.py about how to create the dataset objects and their library/history associations?
Or maybe I can pass in some kind of callback that handles the dataset expansion?
(I'm pretty new to Python, but it seems similar to Ruby.)
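To make the callback idea concrete, this is roughly the shape I have in mind
(hypothetical names throughout; create_dataset stands in for whatever would
actually build a dataset object and its history or library association):

import os

def datasets_from_files(paths, create_dataset):
    # create_dataset is supplied by the caller (the upload machinery) and is
    # invoked once per extracted file; it is responsible for creating the
    # dataset object and the history or library association for that file.
    new_datasets = []
    for path in paths:
        name = os.path.splitext(os.path.basename(path))[0]
        new_datasets.append(create_dataset(path, name))
    return new_datasets

# e.g. in add_file, after unpacking:
#   datasets_from_files(unpack_zip(zip_path, dataset_dir), create_dataset)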
I thought about a composite dataset, but that seems like overloading that concept. Really the files I'm thinking about uploading are 8 independent BAMs or fastqs or whatever – not a set of files that are related to each other.
Any suggestions?
Brad
--
Brad Langhorst
New England Biolabs
langhorst@neb.com