I think one of the aspects where Galaxy is a bit soft is the ability to do distributed tasks. The current system of split/replicate/merge tasks based on file type is a bit limited and hard for tool developers to expand upon. Distributed computing is a non-trival thing to implement and I think it would be a better use of our time to use an already existing framework. And it would also mean one less API for tool writers to have to develop for.
I was wondering if anybody has looked at Mesos ( http://mesos.apache.org/ ). You can see an overview of the Mesos architecture at https://github.com/apache/mesos/blob/master/docs/Mesos-Architecture.md
The important thing about Mesos is that it provides an API for C/C++, Java/Scala and Python to write distributed frameworks. There are already implementations of frameworks for common parallel programming systems such as:
 - Hadoop (https://github.com/mesos/hadoop)
 - MPI (https://github.com/apache/mesos/blob/master/docs/Running-torque-or-mpi-on-mesos.md)
 - Spark (http://spark-project.org)
And you can find example Python framework at https://github.com/apache/mesos/tree/master/src/examples/python

Integration with Galaxy would have three parts:
1) Add a system config variable to Galaxy called 'MESOS_URL' that is then passed to tool wrappers and allows them to contact the local mesos infrastructure (assuming the system has been configured) or pass a null if the system isn't available. 
2) Write a tool runner that works as a mesos framework to executes single cpu jobs on the distributed system.
3) For instances where mesos is not available at a system wide level (say they only have access to an SGE based cluster), but the user wants to run distributed jobs, write a wrapper that can create a mesos cluster using the existing queueing system. For example, right now I run a Mesos system under the SGE queue system.

I'm curious to see what other people think.

Kyle