More on multiple inputs to a workflow
So we loaded the change set that was developed at the hackathon onto our instance of Galaxy. For the case where I have a directory of input files and want to run the same workflow on each, this change works great. There is one new issue I see that I'd be interested in input on, and an old issue that I'd like a status on.

First the new... I have a workflow that does analysis on paired-end mRNA-Seq data. My workflow for single end works great, but with paired end, I can see an obvious way to submit two lists of files (one for the forward and one for the reverse), which can then be submitted to a series of the same workflow. Once I've set the forward input to allow selection of multiple input files, the option to let the reverse input also select multiple input files becomes disabled. This makes perfect sense: unless you've implemented a way to associate pairs of files with each other, this probably wouldn't work anyway. Is there anyone out there working on a solution to this? If not, does anyone have suggestions as to how we could make it work?

Now the old... A while back I suggested that it would be useful, while editing a workflow, to have a way to grab the name of an input file as a variable and then use this variable in naming all later output files (this way I can propagate a sample name all the way through the workflow). This feature is especially important when we are using the option to select multiple files to run the same workflow. Is anyone working on this? If not, could someone point me in the right direction if I were to start trying to implement this feature?

Thanks,
Dave
On May 9, 2011, at 10:49 AM, Dave Walton wrote:
When we implemented multiple inputs at the hackathon, we did consider the case of paired reads, as well as the possibility of allowing for the product of all input dataset lists, but chose to focus on the most commonly requested use case of a single ranged input for simplicity. The enhancement that you want to make isn't something I currently have on my list to add, though you're more than welcome to have a go at it. To make it work, you'd have to modify the run workflow template (templates/workflow/run.mako), as well as the run workflow controller (lib/galaxy/web/controllers/workflow.py), to handle the new logic.
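To make the distinction between "paired" inputs and "the product of all input dataset lists" concrete, here is a minimal sketch in plain Python. It is not Galaxy code: the function names `paired_runs` and `product_runs` and the file names are hypothetical, and it assumes the forward and reverse lists are already in matching order.

```python
from itertools import product


def paired_runs(forward, reverse):
    """One workflow run per (forward, reverse) pair, matched by position.

    Assumes both lists are already in matching order (hypothetical helper,
    not part of Galaxy).
    """
    if len(forward) != len(reverse):
        raise ValueError("forward and reverse lists must have the same length")
    return list(zip(forward, reverse))


def product_runs(forward, reverse):
    """One workflow run per combination of a forward and a reverse file."""
    return list(product(forward, reverse))


if __name__ == "__main__":
    fwd = ["sampleA_1.fastq", "sampleB_1.fastq"]
    rev = ["sampleA_2.fastq", "sampleB_2.fastq"]
    print(paired_runs(fwd, rev))   # 2 runs, one per sample
    print(product_runs(fwd, rev))  # 4 runs, every forward with every reverse
```

For paired-end data only the first form is meaningful, which is why the pairing has to be explicit before two multi-file inputs can be enabled at once.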
Using an input file's name in later output names is something I've wanted to add as part of a larger set of enhancements to the workflow parameters interface. I believe we talked about this before, as it seems familiar to me, but even now you could approximate some of the functionality you're looking for with workflow parameters.

Another tool for managing the outputs of these large multi-input workflow runs is the "Send Results to New History" option in the run workflow interface. When used in combination with multiple inputs, this option actually generates a new numbered history for *each* separate workflow run, so "Sample Processing Run 1", "Sample Processing Run 2" and so on.

All that said, if you wanted to implement this, you'd want to look at the run workflow template as well as the editor template (templates/workflow/editor.mako).

Good luck, and definitely let us know if you end up writing something that you'd like to contribute back,
-Dannon
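As a rough illustration of the two naming ideas discussed above (carrying a sample name from the input file into output names, and numbering one history per run), here is a small standalone Python sketch. It is not how Galaxy implements either feature; `sample_name`, `output_name`, and `history_names` are hypothetical helpers, and taking the sample name to be everything before the first dot in the file name is an assumption.

```python
import os


def sample_name(input_path):
    """Derive a sample name from an input file name.

    Assumption: the sample name is the part of the base name before
    the first dot, e.g. 'sampleA.fastq.gz' gives 'sampleA'.
    """
    return os.path.basename(input_path).split(".", 1)[0]


def output_name(input_path, step_label, ext):
    """Build an output name that carries the sample name through a step."""
    return f"{sample_name(input_path)}_{step_label}.{ext}"


def history_names(base, n_runs):
    """Numbered history names, one per workflow run, starting at 1."""
    return [f"{base} {i}" for i in range(1, n_runs + 1)]


if __name__ == "__main__":
    print(output_name("/data/sampleA.fastq", "aligned", "bam"))
    print(history_names("Sample Processing Run", 2))
```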
Thanks very much Dannon. I will at least take a look at the code and see what is involved. The scientists here at the Lab consider this to be fairly important functionality (in both cases). I'll let you know if I move ahead and start to implement it.

Thanks again,
Dave

On 5/13/11 9:30 AM, "Dannon Baker" <dannonbaker@me.com> wrote:
participants (2)
- Dannon Baker
- Dave Walton