Hello Leandro,

I've forwarded your request to the galaxy-dev mailing list, as this is where issues like this are discussed. I want to make sure I'm clear on this issue. Can you provide some clarification?

On Apr 11, 2012, at 5:05 AM, Leandro Hermida wrote:
Based on our initial use of the sample tracking system I haven't found any additional bugs, but we did realize that one major piece of functionality is missing, which makes the system somewhat hard to use.
The current mechanism to link datasets to their corresponding samples is very cumbersome and slow: it has to be done one sample at a time with a lot of UI clicking, and there is a big potential for human error. I would say that without an easier way to do this, people will be deterred from using the sample tracking system. Do you have any ideas to change/enhance the way this is done?
The current UI is something that we have plans to improve - the underlying framework is flexible and allows for improvement in several areas. We will certainly consider any recommendations from the community, including yours below.
My initial suggestion would be that when you upload the samples using a CSV file, you could have a field like "DatasetsName" after FolderName containing a colon (:) separated list of file paths from the configured external service. I think that would solve it and make things much easier.
I've pasted an example screenshot below for reference.

At the point that you are importing samples from a CSV file, the sequence run has not yet started (generally speaking), so no sequence run datasets (file names) have been produced yet. So in order to create the csv file with the correct sequence run file names, the user will have to know beforehand which files the sequencer will produce. Will this always be possible with your "Illumina external service"? If so, what are the names of the files, and how are they distinguished between runs? Do they just go in different directories on the sequencer that are also known beforehand?

On the other hand, are you saying that your lab will perform the sequence run, wait until the run is complete and the datasets are produced, and then create the csv file, entering the known dataset file names to produce the sample line items for the sequencing request? The weakness of this process is that your lab's customers will not be able to use Galaxy sample tracking to view the status of their requests throughout their lifecycle, since the request's sample line items will not be created until the run is finished.

I'll work on implementing this enhancement, but I want to understand how your lab will use it. Any additional information you can provide will be helpful.

Thanks!

Greg

[Screenshot text]

Add samples to sequencing request "one"

Name | State | Data Library | Folder | History | Workflow
(required)

For each sample, select the data library and folder in which you would like the run datasets deposited. To automatically run a workflow on run datasets, select a history first and then the desired workflow.

Layout Grid1
Sample form layout 1

Copy samples from sample
Select the sample from which the new sample should be copied or leave selection as None to add a new "generic" sample.

Click the Add sample button for each new sample and click the Save button when you have finished adding samples.

Import samples from csv file
The csv file must be in the following format. The [:FieldValue] is optional; the named form field will contain the value after the ':' if included.

SampleName,DataLibraryName,FolderName,HistoryName,WorkflowName,Field1Name:Field1Value,Field2Name:Field2Value...
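
To make the csv format above concrete, here is a minimal Python sketch of how such a file could be read. It is only an illustration and not Galaxy's actual import code: the names `SampleRow` and `parse_sample_csv` are hypothetical, and the handling of blank lines and empty columns is an assumption, since the form text does not say how Galaxy treats them.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class SampleRow:
    """One sample line item (hypothetical structure, for illustration only)."""
    name: str
    data_library: str
    folder: str
    history: str
    workflow: str
    form_fields: dict = field(default_factory=dict)


def parse_sample_csv(text):
    """Parse rows shaped like:
    SampleName,DataLibraryName,FolderName,HistoryName,WorkflowName,Field1Name:Field1Value,...
    """
    samples = []
    for row in csv.reader(io.StringIO(text)):
        cells = [cell.strip() for cell in row]
        if not any(cells):
            continue  # assumption: blank lines are skipped
        if len(cells) < 5:
            raise ValueError(f"expected at least 5 columns, got {len(cells)}: {row!r}")
        form_fields = {}
        for cell in cells[5:]:
            if not cell:
                continue  # assumption: empty trailing columns are ignored
            # The ":FieldValue" part is optional. Split on the first ':' only,
            # so a value may itself contain ':' characters.
            field_name, _, field_value = cell.partition(":")
            form_fields[field_name] = field_value
        samples.append(SampleRow(*cells[:5], form_fields))
    return samples
```

Because the split happens on the first ':' only, a field value can itself contain colons. That matters for the proposal in this thread: a colon-separated list of dataset paths could not be told apart from a Name:Value pair by position alone, so it would need either its own fixed column or a different separator.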