Re: [galaxy-user] Inquiry about Cloud Computing on Galaxy
Hi,
Unfortunately, Galaxy does not have a ready-made option for running on a Hadoop-based cluster. I am not familiar with methods of interacting with Hadoop clusters; however, if there is DRMAA support for a Hadoop manager, the interaction should be fairly straightforward. Distributing the job load, however, may be more involved. I'm cc-ing the galaxy-user list so this question gets more attention and possibly a more complete answer.
Enis
On Wed, Dec 1, 2010 at 4:07 PM, Xiandong Meng <xiandongmeng@lbl.gov> wrote:
Hi Enis, Thank you for your great work on Galaxy Cloud. I am running a Galaxy server for sequence analysis at Berkeley Lab. Galaxy works well with our SGE clusters. Due to the size of the data, I am going to move the computation to a cloud environment using a Hadoop cluster. So far I have only found information about the cloud solution on Amazon EC2. I was wondering whether Galaxy has a mechanism to support a general Hadoop cluster, just like running on an SGE or PBS cluster.
I would greatly appreciate it if you could give me some hints.
Thanks & Regards, X. Meng
While we are planning to make it possible to run tools in Galaxy that use a Hadoop cluster, Hadoop is a completely different model from a general-purpose cluster, and tools need to be written specifically for use with Hadoop. Only some analyses decompose well under the map-reduce model that Hadoop requires. Are you implementing your own analysis tools for use with Hadoop?
-- jt
We have several tools, such as Kmer filtering, that can be run on a Hadoop cluster. I saw the Galaxy Cloud project on your web page. It would be great if you could develop a solution to integrate a Hadoop option into Galaxy.
Thanks, X. Meng
On Wed, Dec 1, 2010 at 2:37 PM, James Taylor <james@jamestaylor.org> wrote:
While we are planning to make it possible to run tools in Galaxy that use a Hadoop cluster, Hadoop is a completely different model from a general-purpose cluster, and tools need to be written specifically for use with Hadoop. Only some analyses decompose well under the map-reduce model that Hadoop requires. Are you implementing your own analysis tools for use with Hadoop?
-- jt
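As a rough illustration of what "decomposing under the map-reduce model" means for a tool like the Kmer filtering mentioned in this thread, here is a minimal Python sketch. It runs locally and does not use Hadoop or Galaxy at all. The thread does not say what the Kmer filtering tool actually does, so this assumes it counts k-mers across reads and keeps those seen at least a given number of times. The names `map_kmers`, `reduce_counts`, `kmer_filter`, and `min_count` are hypothetical and chosen only for this example.

```python
from itertools import groupby
from operator import itemgetter


def map_kmers(read, k):
    """Map step: emit a (kmer, 1) pair for every k-length window of one read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k], 1


def reduce_counts(kmer, counts):
    """Reduce step: sum all the counts emitted for a single k-mer."""
    return kmer, sum(counts)


def kmer_filter(reads, k, min_count):
    """Local stand-in for a map-reduce job (assumed behavior, not the real tool).

    Returns a dict of k-mers that occur at least min_count times in total.
    """
    # Map: each read is processed independently, so this part parallelizes.
    pairs = [pair for read in reads for pair in map_kmers(read, k)]

    # Shuffle: sort by key so identical k-mers become adjacent. A map-reduce
    # framework performs this grouping between the map and reduce phases.
    pairs.sort(key=itemgetter(0))

    # Reduce: one call per distinct k-mer, given all of its counts.
    totals = dict(
        reduce_counts(kmer, (count for _, count in group))
        for kmer, group in groupby(pairs, key=itemgetter(0))
    )

    # Filter: keep only k-mers that meet the abundance threshold.
    return {kmer: n for kmer, n in totals.items() if n >= min_count}
```

For example, `kmer_filter(["ACGTA", "CGTAA"], k=3, min_count=2)` keeps only `CGT` and `GTA`, each with a count of 2. The analysis fits the model because the map step needs no information shared between reads, and the reduce step needs only the values for one key at a time; analyses without that property are the ones that do not decompose well.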
participants (3)
- Enis Afgan
- James Taylor
- Xiandong Meng