Hi Luqman,
Were you planning on using Galaxy CloudMan (usegalaxy.org/cloud) and integrating your tool (i.e., your Java classes) into the Galaxy that it deploys, or simply starting a new EC2 instance and setting up a Galaxy instance from scratch?
Either way, I would suggest trying the process out on your local system first. Adding new tools to Galaxy is pretty straightforward once you have the tool installed on the system; see https://bitbucket.org/galaxy/galaxy-central/wiki/AddToolTutorial. That will also allow you to test the overall functionality offered by Galaxy in the context of your own tool before trying to deploy the whole thing on the cloud.
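
To give a rough idea of what the wrapper side can look like, here is a minimal sketch of a small Python 3 script that a Galaxy tool definition could call to run one of your Java classes on an input file and write an output file. Everything named here is an assumption for illustration: the function names, the classpath, the class name org.example.Step, and the convention that the class takes an input path and an output path as its two arguments. It is not taken from the tutorial above.

```python
import subprocess
from typing import List, Optional


def build_java_command(classpath: str, main_class: str,
                       input_path: str, output_path: str,
                       max_heap: Optional[str] = None) -> List[str]:
    """Build the argument list for running one Java class as a single step.

    Assumes (hypothetically) that the class reads its first argument as the
    input file and writes its result to the second argument.
    """
    cmd = ["java"]
    if max_heap:
        # e.g. max_heap="2g" becomes the JVM option -Xmx2g
        cmd.append("-Xmx" + max_heap)
    cmd += ["-cp", classpath, main_class, input_path, output_path]
    return cmd


def run_step(cmd: List[str]) -> int:
    """Run the command and return its exit code (non-zero means failure)."""
    completed = subprocess.run(cmd)
    return completed.returncode
```

For example, build_java_command("pipeline.jar", "org.example.Step", "in.dat", "out.dat") yields the list you would pass to run_step. Keeping command construction separate from execution makes the wrapper easy to check locally before Galaxy is involved.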

Once you transition to the cloud, though, you would have to repeat the installation of the tool on the new instance, just as you did on the local system, and then copy over the tool wrapper you created to integrate it with Galaxy. If you start with a clean instance (i.e., not Galaxy CloudMan), then after you've installed your tool and integrated it with Galaxy, you can use the AWS web console to create an AMI from the running instance. Then you would launch an instance from the newly created AMI, start Galaxy, and start processing your data. Note, though, that any data you upload to an instance will be lost once you terminate the instance unless you attach an EBS volume to it and have Galaxy store analysis data there (this is easily configured in Galaxy's universe_wsgi.ini file).
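
As a sketch of the universe_wsgi.ini change mentioned above, the following Python 3 snippet rewrites the data paths in the [app:main] section so they point at a mounted EBS volume. The assumptions here are mine: that the relevant options are file_path and new_file_path, that the volume is mounted at a path such as /mnt/galaxy_data (a hypothetical mount point), and that losing comments in the rewritten file is acceptable (configparser does not preserve them). Please check the option names against the sample config shipped with your Galaxy version; editing the two lines by hand works just as well.

```python
import configparser
import io


def point_galaxy_data_at(ini_text: str, ebs_mount: str) -> str:
    """Return ini_text with Galaxy's data paths moved under ebs_mount.

    Assumes the options live in the [app:main] section and are named
    file_path and new_file_path (verify against your own config file).
    """
    # interpolation=None so values containing '%' are left untouched
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section("app:main"):
        parser.add_section("app:main")
    parser.set("app:main", "file_path", ebs_mount + "/files")
    parser.set("app:main", "new_file_path", ebs_mount + "/tmp")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

You would read the existing file into a string, pass it through this function with your mount point, and write the result back before starting Galaxy.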

Alternatively, you could use CloudMan and add your tool to the set of already existing tools as described here: https://bitbucket.org/galaxy/galaxy-central/wiki/Cloud/CustomizeGalaxyCloud
If you use CloudMan, all of the details regarding data persistence and Galaxy setup are managed for you automatically (excluding the addition of your own tool).

Hope this helps,
Enis

On Mon, Mar 21, 2011 at 6:56 PM, Luqman Hodgkinson <luqman@berkeley.edu> wrote:

Dear Galaxy developers,
I have a collection of Java classes linked by a custom dataflow architecture. All classes are in a single project, but some of these classes call executables written in languages other than Java. I am investigating the possibility of transitioning to Galaxy. Essentially, my goals are to link these Java classes in a DAG representing the dataflow and to execute the dataflow in Amazon EC2. The data flowing along the edges are arbitrary custom Java classes. Additionally, it is important to cache intermediate results. The data is acquired from a few web services: iRefIndex, IntAct, UniProt, and Gene Ontology. There are complex software dependencies, so after setting up the dataflow I would like to save the entire system as an Amazon Machine Image (AMI). How difficult would this transition be, and would it be worth the effort?
               Sincerely, with best wishes,
               Luqman Hodgkinson,
               Ph.D. student, UC-Berkeley




