Cool, I got a tweet about this tool from @GalaxyProject[1].

Let me explain further what I'm trying to accomplish here, as I realized not everybody might know what using "Multiple Output Files", and specifically "Number of Output datasets cannot be determined until tool run"[2], entails.

The current Barcode Splitter available on Galaxy Main, based on the FASTX-toolkit by Assaf Gordon, makes all output files accessible through HTML links. This is not very convenient: if you want to use these outputs in a downstream analysis inside Galaxy, and you probably do, your only solution right now is to download the linked files from the HTML output and manually re-import them into Galaxy.

The tool I wrote includes the option of writing the output files (split FASTA or FASTQ files) with a naming convention that can be used with Galaxy's "Multiple Output Files", so all result files are automatically added to your history.

I believe you still can't easily use this tool upstream in a workflow. As far as I can tell, tools without a known number of outputs can't be used upstream in workflows. I do think you can accomplish some automation using the API, although I haven't tested this yet.

[1]https://twitter.com/galaxyproject/status/377497531745595392
[2]http://wiki.galaxyproject.org/Admin/Tools/Multiple%20Output%20Files#Number_o...

Best, Carlos

On Fri, Sep 6, 2013 at 1:58 PM, Carlos Borroto <carlos.borroto@gmail.com> wrote:
Hi,
I was wondering if I could get someone to test this new barcode splitter I wrote. The main reason for me to duplicate the already great FASTX-toolkit based splitter is so I can use Galaxy's multiple output capabilities.
You can find this tool in the testtoolshed for now (after some more testing I will move it to the main toolshed): http://testtoolshed.g2.bx.psu.edu/view/cjav/split_by_barcode
Hopefully I got Galaxy's tool dependency system right (it works on my box, not that this says much), so installing this tool should be quite easy.
I have to say a big thanks to Biopython and this[1] anonymous soul for making it quite easy to write the actual code doing the heavy lifting.
[1]https://gist.github.com/dgrtwo/3725741
Cheers, Carlos
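
P.S. For anyone unfamiliar with the idea, here is a minimal sketch of splitting reads by barcode and writing one file per barcode using names that Galaxy can pick up as extra outputs. This is not the code of the tool itself (which uses Biopython); it uses only the standard library and assumes well-formed four-line FASTQ records with the barcode at the start of each read. The function names are made up for illustration, and the filename pattern primary_<output id>_<designation>_visible_<ext> is my reading of the "Multiple Output Files" wiki page, so please check it against your Galaxy version.

```python
import os
from collections import defaultdict


def read_fastq(lines):
    """Yield (header, sequence, quality) from an iterable of FASTQ lines.

    Assumes well-formed records of exactly four lines each.
    """
    lines = iter(lines)
    for header in lines:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(lines).rstrip("\n")
        next(lines)  # the '+' separator line
        qual = next(lines).rstrip("\n")
        yield header, seq, qual


def split_by_barcode(records, barcodes):
    """Group records by the barcode found at the start of each read.

    barcodes maps a sample name to its barcode sequence. Reads that match
    no barcode are collected under the key 'unmatched'.
    """
    groups = defaultdict(list)
    for header, seq, qual in records:
        for name, barcode in barcodes.items():
            if seq.startswith(barcode):
                groups[name].append((header, seq, qual))
                break
        else:
            groups["unmatched"].append((header, seq, qual))
    return dict(groups)


def galaxy_output_name(output_id, designation, ext):
    """Build a filename following the assumed Galaxy naming pattern.

    Underscores separate the fields in the pattern, so any underscore in
    the designation is replaced with a dash.
    """
    safe = designation.replace("_", "-")
    return "primary_%s_%s_visible_%s" % (output_id, safe, ext)


def write_groups(groups, new_file_path, output_id, ext="fastqsanger"):
    """Write one FASTQ file per group and return a name -> path mapping."""
    paths = {}
    for name, records in groups.items():
        filename = galaxy_output_name(output_id, name, ext)
        path = os.path.join(new_file_path, filename)
        with open(path, "w") as handle:
            for header, seq, qual in records:
                handle.write("%s\n%s\n+\n%s\n" % (header, seq, qual))
        paths[name] = path
    return paths
```

In a real tool the directory and output id would come from Galaxy (passed in on the command line by the tool wrapper), and you would probably want to allow mismatches in the barcode instead of the exact prefix match used here.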