Re: [galaxy-dev] Validating dynamic inputs stream of consciousness

6 Mar 2013

On Wed, Mar 6, 2013 at 12:39 PM, James Taylor <james@jamestaylor.org> wrote:
...
On Wed, Mar 6, 2013 at 12:22 PM, John Chilton <chilton@msi.umn.edu> wrote:
...
This whole concept puts a lot of onus on the tool developer. A
biologist who has taken a two-week course on Perl could probably write
a Galaxy tool, but they probably couldn't write a secure tool for a public
LWR. I think some experience in thinking about how to secure web
accessible applications and prevent injection style attacks is needed.
I will update the documentation urging additional caution with respect
to this.
What I'm trying to understand is whether this model for a public LWR
makes sense. It appears that your LWR will take a command line, and
then apply a series of validations to it. This set of validations
would need to be very comprehensive -- perhaps impossibly
comprehensive -- to be secure.
It would have to be impossibly comprehensive for many existing tool
XML files. The LWR documentation, however, has some advice on
reorganizing tools that makes it quite easy. The idea is to
move all of the option/argument handling logic into a config file,
pass the config file into your wrapper as the only argument, and then
use optparse/argparse on the contents of the config file.
optparse/argparse then handle all of the validation logic, so there is no
chance of an injection causing a new process to be spawned, etc. I
believe this model is pretty simple to implement and quite secure.
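
A minimal sketch of the config-file approach described above, in Python.
The tool name `mytool`, its options, and all function names are
hypothetical stand-ins, not anything taken from the LWR documentation;
the point is only that the wrapper receives a single config file path,
argparse validates the file's contents, and the final command is a list
that never passes through a shell.

```python
import argparse
import shlex
import subprocess
import sys


def build_parser():
    # Hypothetical options for an imaginary tool; a real wrapper defines its own.
    parser = argparse.ArgumentParser(prog="wrapper")
    parser.add_argument("--input", required=True)
    parser.add_argument("--threads", type=int, default=1)
    parser.add_argument("--mode", choices=["fast", "sensitive"], default="fast")
    return parser


def parse_config_text(text):
    # Tokenize the config contents without ever invoking a shell,
    # then let argparse enforce types and allowed choices.
    tokens = shlex.split(text, comments=True)
    return build_parser().parse_args(tokens)


def build_argv(options):
    # Each value stays a single list element, so shell metacharacters
    # inside a value are passed through as plain data.
    return ["mytool", "--in", options.input,
            "--threads", str(options.threads), "--mode", options.mode]


def main(argv):
    # The config file path is the wrapper's only argument.
    if len(argv) != 2:
        sys.exit("usage: wrapper.py CONFIG_FILE")
    with open(argv[1]) as handle:
        options = parse_config_text(handle.read())
    # A list argv with the default shell=False does not spawn a shell.
    return subprocess.call(build_argv(options))
```

A script would finish with `sys.exit(main(sys.argv))`. argparse exits
with an error on a bad type or an unknown option, which is the
validation behaviour the paragraph above relies on.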

I know some members of the Galaxy team would like to get away from
tool wrappers. I am a tool wrapper fan however. In this context
especially it seems a small price to pay for security.
...
To me it would make more sense that the LWR takes an input values dict
and constructs the command line itself after validating everything (it
already has the toolbox, so this should be possible).
This has a conceptual appeal and was my first thought, but it too
would reduce the expressiveness of Galaxy tools. The tool templates
have access to a lot of things - not least among them is app. That is
not something you can send over the wire. It would also mean
implementation-wise that public LWRs are even more different than
traditional LWRs and would need a rewritten client and server.
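
For comparison, the dict-based alternative discussed above could look
roughly like the following sketch. The schema, the parameter names, and
`mytool` are all hypothetical; this says nothing about how the LWR or
Galaxy actually represent tool inputs, and it deliberately ignores the
template expressiveness problem raised above.

```python
import re

# Hypothetical per-tool schema; a real one would come from the tool definition.
SCHEMA = {
    "input": {"type": str, "pattern": r"[A-Za-z0-9_][A-Za-z0-9_.-]*"},
    "threads": {"type": int, "min": 1, "max": 64},
    "mode": {"type": str, "choices": ("fast", "sensitive")},
}


def validate(values):
    """Return a checked copy of values, or raise ValueError."""
    if set(values) != set(SCHEMA):
        raise ValueError("unexpected or missing parameters")
    for name, rule in SCHEMA.items():
        value = values[name]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, rule["type"]):
            raise ValueError("wrong type for %s" % name)
        if "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            raise ValueError("illegal characters in %s" % name)
        if "choices" in rule and value not in rule["choices"]:
            raise ValueError("unsupported value for %s" % name)
        if "min" in rule and not rule["min"] <= value <= rule["max"]:
            raise ValueError("%s out of range" % name)
    return dict(values)


def build_command(values):
    # The server, not the client, decides what the command line looks like.
    clean = validate(values)
    return ["mytool", "--in", clean["input"],
            "--threads", str(clean["threads"]), "--mode", clean["mode"]]
```

Here the client only ever sends values, and anything that fails the
whitelist is rejected before a command line exists.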

Nonetheless, I am not opposed to this as an option. If the Galaxy team
wants to refactor all of the cheetah templating and wrapper stuff out
into an easy to use library you pass a dictionary into I would be
happy to integrate it into the LWR :). (In this fictitious world where
you are modularizing Galaxy just for me, a library to install toolshed
repositories with dependencies and env files would be really awesome
to augment the LWR with as well :).)
...
...
That said, there have been in the recent past multiple tools on public
Galaxy servers (main included) that were developed by serious
programmers that allowed arbitrary code execution. This is something
the whole community (or at least that subset hosting public servers)
needs to address and take more seriously.
I could not agree more, though I see this as a somewhat different
issue. There are a number of things that would be really helpful here:
- Some automatic validation of command line construction to look for
common exploits (again, impossible to do comprehensively)
- Some kind of sandboxing, through support for chroot, zones, jails,
or (dare I dream) running under native client.
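
As a rough illustration of the first item in the list above, a scanner
for a few well-known shell constructs might look like this. The pattern
list is hypothetical and deliberately incomplete, which is exactly the
limitation noted in that item; it is a sketch, not a defence.

```python
import re

# A deliberately incomplete, hypothetical list of suspicious shell constructs.
SUSPICIOUS = {
    "command separator": re.compile(r"[;&|\n]"),
    "command substitution": re.compile(r"`|\$\("),
    "redirection": re.compile(r"[<>]"),
}


def find_suspicious(command_line):
    # Return the labels of every pattern found in the command line string.
    return sorted(label for label, pattern in SUSPICIOUS.items()
                  if pattern.search(command_line))
```

A check like this can flag obvious mistakes during tool review, but an
empty result does not show that a command line is safe.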
...
there is some malicious command that could get through. I cannot
guarantee that it is secure, but I would be eager for counter examples
or specific issues I can address.
I think this is what concerned me as well. I'm always worried about
security through comprehensive screening; someone almost always finds
a way around it. This is why the original python sandbox failed.
Constructing the command line from validated inputs seems safer (as
long as you trust the template that builds the command line).
...
If we are honest and accept that there are going to be security problems
with the tools we wrap, one idea that might be worth pursuing for both
the LWR and Galaxy itself is running tools in chrooted environments or
at least as a different user than the webapp.
On this we completely agree.
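
A small sketch of the "different user" half of that idea, using Python's
subprocess module. The account name `galaxy_tools` and the helper names
are hypothetical; the `user` argument requires Python 3.9 or later on
POSIX, and the calling process needs enough privilege to switch users.
This is not a chroot and not how Galaxy or the LWR run jobs.

```python
import subprocess


def sandbox_kwargs(user, workdir):
    # Keyword arguments for subprocess.run: drop to another account,
    # start in a fixed directory, and pass only a minimal environment.
    return {
        "user": user,
        "cwd": workdir,
        "env": {"PATH": "/usr/bin:/bin"},
        "shell": False,
    }


def run_tool(argv, user="galaxy_tools", workdir="/tmp"):
    # argv is a list, so no shell is involved in starting the tool.
    return subprocess.run(argv, **sandbox_kwargs(user, workdir))
```

Even if a tool is exploited, the damage is then limited to what the
unprivileged account can reach rather than everything the webapp can.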
Thanks a bunch for the feedback, I really appreciate it.

-John