Hello Leandro,

I've forwarded your request to the galaxy-dev mailing list, as this is where issues like this are discussed.

I want to make sure I'm clear on this issue.  Can you provide some clarification?

On Apr 11, 2012, at 5:05 AM, Leandro Hermida wrote:

Based on our initial use of the sample tracking system I haven't found
any additional bugs, but we did realize one big piece of functionality
is missing, which makes the system somewhat hard to use.

The current mechanism to link datasets to their corresponding samples
is very cumbersome and takes a long time as it has to be done one
sample at a time with a lot of UI clicking and there is a big
potential for human error.  I would say without an easier way to do
this it would deter people from using the sample tracking system. Do
you have any ideas to change/enhance the way this is done?

The current UI is something that we have plans to improve - the underlying framework is flexible and allows for improvement in several areas.  We will certainly consider any recommendations from the community, including yours below.


My initial
suggestion would be that when you upload the samples using a CSV file
that you can have a field like "DatasetsName" after FolderName where
you can have a colon (:) separated list of file paths from the
configured external service? That would solve it I think and make
things much easier.


I've pasted an example screenshot below for reference.  At the point that you are importing samples from a CSV file, the sequence run has (generally speaking) not yet started, so no sequence run datasets (file names) have been produced.  To create the CSV file with the correct sequence run file names, the user will therefore have to know beforehand which files the sequencer will produce.  Will this always be possible with your "Illumina external service"?  If so, what are the names of the files, and how are they distinguished between runs?  Do they just go in different directories on the sequencer that are also known beforehand?

On the other hand, are you saying that your lab will perform the sequence run, wait until the run is complete and the datasets are produced, and then create the CSV file, entering the known dataset file names to produce the sample line items for the sequencing request?  The weakness of this process is that your lab's customers will not be able to use Galaxy sample tracking to view the status of their requests throughout their lifecycle, since the request's sample line items will not be created until the run is finished.

I'll work on implementing this enhancement, but I want to understand how your lab will use it.  Any additional information you can provide will be helpful.

Thanks!

Greg

Add samples to sequencing request "one"

Name | State | Data Library | Folder | History | Workflow
(required)
For each sample, select the data library and folder in which you would like the run datasets deposited. To automatically run a workflow on run datasets, select a history first and then the desired workflow.


Select the sample from which the new sample should be copied or leave selection as None to add a new "generic" sample.

  
Click the Add sample button for each new sample and click the Save button when you have finished adding samples.

Import samples from csv file

 
The CSV file must be in the following format.
The [:FieldValue] part is optional; if included, the named form field will contain the value after the ':'.
SampleName,DataLibraryName,FolderName,HistoryName,WorkflowName,Field1Name:Field1Value,Field2Name:Field2Value...
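
To make the format above concrete, here is a minimal Python sketch of how one such line could be split into its five fixed columns and the optional form fields. This is not Galaxy's import code; the function names (parse_sample_line, parse_samples) and the dictionary keys are hypothetical, and treating an empty column as "not set" is an assumption.

```python
import csv
import io


def parse_sample_line(row):
    """Split one CSV row into the five fixed columns plus optional form fields."""
    if len(row) < 5:
        raise ValueError("expected at least 5 columns, got %d" % len(row))
    sample = {
        "name": row[0],
        "data_library": row[1],
        "folder": row[2],
        "history": row[3],   # assumption: an empty string means "not set"
        "workflow": row[4],  # assumption: an empty string means "not set"
        "fields": {},
    }
    for cell in row[5:]:
        # The ':FieldValue' part is optional; partition() yields an empty
        # value when the cell contains no ':'.
        field_name, _, field_value = cell.partition(":")
        sample["fields"][field_name.strip()] = field_value.strip()
    return sample


def parse_samples(text):
    """Parse CSV text, skipping blank lines, into a list of sample dicts."""
    reader = csv.reader(io.StringIO(text))
    return [parse_sample_line(row) for row in reader if row]


# Hypothetical input: one sample with no workflow and two form fields,
# the second of which has no value.
example = "sample1,Lib A,run_01,My history,,Concentration:12,Notes\n"
samples = parse_samples(example)
```

Because only the first ':' in a cell separates the field name from its value, a value that itself contains colons would be kept whole under this sketch.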