On Thu, Sep 22, 2011 at 4:36 PM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
How does your tool handle this at the command line (ignoring Galaxy)? Does it expect a directory name or pattern, or just a really long command line string with many many file names?
Originally I had a config text file which specifies a directory, and the scripts look in this directory for specific file name patterns. Since Galaxy assigns its own file names, the pattern matching would not work. I'm actually tailoring my tools for Galaxy because my original design is not flexible and just not well thought out. With Galaxy I'm pretty happy that I get to split my tools up to be more fine-grained, in an attempt to stick to the Unix philosophy of "Write programs that do one thing and do it well" (well, more to the "one" part than to the "well" part).

I am thinking of a few work-arounds.

1. Assuming that there is only one user, I could have the user specify the first file and then the number of additional input files, and have the tool figure out the file paths from the path of the first plus the number increments.

2. The tool prior to this one, which generates these files (actually an FTP download tool which downloads .tar.gz), could also unzip and untar them and then concatenate them.

3. If there is any way for the tool prior to this one to know which file names it is writing to, then it can output a text file which lists the paths of the files. The subsequent tool can take this single file as input. (As far as I know, it does not know them, at least not according to what's specified under "Number of Output datasets cannot be determined until tool run" ( http://wiki.g2.bx.psu.edu/Admin/Tools/Multiple%20Output%20Files ).)

I don't like 1 since it requires that the service is used by a single user (otherwise the numbering could get messed up). I don't like 2 since it violates the principle of Unix tools, and it doesn't seem like the design decision the Galaxy team would take. Furthermore, I think unzipping takes up disk space unnecessarily. My program just parses directly off the gzip, but without unzipping I don't know how to reasonably concat.
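To make 3 a bit more concrete, here is roughly what I have in mind, as an untested sketch in Python. The function names and the one-path-per-line manifest format are made up for illustration. It assumes the upstream tool can actually learn its own output paths (which is the part I'm unsure about), and that the files are plain gzipped text rather than .tar.gz.

```python
import gzip
import os


def write_manifest(manifest_path, data_paths):
    """Upstream tool (hypothetical): record one absolute path per line."""
    with open(manifest_path, "w") as handle:
        for path in data_paths:
            handle.write(os.path.abspath(path) + "\n")


def read_manifest(manifest_path):
    """Downstream tool (hypothetical): return the listed paths, skipping blank lines."""
    with open(manifest_path) as handle:
        return [line.strip() for line in handle if line.strip()]


def iter_lines(manifest_path):
    """Yield text lines from every listed gzip file, reading each one
    in compressed form so nothing is unpacked to disk."""
    for path in read_manifest(manifest_path):
        with gzip.open(path, "rt") as handle:
            for line in handle:
                yield line.rstrip("\n")
```

The downstream tool would then take only the manifest as its single input and loop over iter_lines(manifest) instead of scanning a directory for a file name pattern.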
I like 3 best, but I do not know the paths of the outputs, since it's Galaxy which is silently moving and renaming the files behind the scenes.

Any suggestions?

Timothy