New subject: Reducing costs in Cloud Galaxy

19 Mar 2012


Greg,
Regarding the performance of different instance types, I came across
this and thought you might find it useful:
http://cloudharmony.com/benchmarks

Enis

On Mon, Mar 19, 2012 at 7:49 PM, Greg Edwards <gedwards2@gmail.com> wrote:
...
Enis,
Thanks. Will try that re the storage.
Greg E
On Mon, Mar 19, 2012 at 4:49 PM, Enis Afgan <eafgan@emory.edu> wrote:
...
Hi Greg,
On Mon, Mar 19, 2012 at 11:01 AM, Greg Edwards <gedwards2@gmail.com> wrote:
...
Hi,
I've got an implementation of some proteomics tools going well in Galaxy
on AWS EC2 under Cloudman. Thanks for the help along the way.
I need to drive the costs down a bit. I'm using an m1.large instance and it's
costing about $180 - $200 / month. This is about 55% storage and 45%
instance costs. That's peanuts in some senses but for now we need to get it
down so that it comes out of petty cash for the department, while the case
is proven for its use.
I have a few questions and would appreciate any insights.
1. AWS has just released an m1.medium and m1.small instance type, which
are 1/2 and 1/4 the cost of m1.large.
http://aws.amazon.com/ec2/instance-types/
http://aws.amazon.com/ec2/pricing/
I tried the m1.small and m1.medium with the latest Cloudman AMI,
galaxy-cloudman-2011-03-22 (ami-da58aab3).
All seemed to install ok, but the tools took up to 30 minutes to start
execution on m1.medium, and never started on m1.small.
m1.medium only added about 15% to run times compared with m1.large,
can't say for m1.small. t1.micro does run (and for free in my Free Tier
first year) but blows execution times out by a factor of about 3 which is
too much.
Has anyone tried these new Instance Types ? (m1.small/medium)
I have no real experience with these instance types yet either so maybe
someone else can chime in on this?
...
2. The vast majority of the storage costs are from the genome databases
in the 700GB /mnt/galaxyIndices, which I don't need. Can this be reduced to
the bare essentials ?
You can do this manually:
1. Start a new Galaxy cluster (ie, one you can easily delete later)
2. ssh into the master instance and delete whatever genomes you don't
need/want (these are all located under /mnt/galaxyIndices)
3. Create a new EBS volume large enough to fit whatever's left on the
original volume, attach it and mount it
4. Copy over the data from the original volume to the new one while
keeping the directory structure the same (rsync is probably the best tool
for this)
5. Unmount & detach the new volume; create a snapshot from it
6. For the cluster you want to keep around (while it is terminated), edit
persistent_data.yaml in its bucket on S3 and replace the existing snap ID
for the galaxyIndices with the snapshot ID you got in the previous step
7. Start that cluster and you should have a file system from the new
snapshot mounted.
8. Terminate & delete the cluster you created in step 1
If you don't want to have to do this the first time around on your custom
cluster, you can first try it with another temporary cluster and make sure
it all works as expected and then move on to the real cluster.
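As a rough illustration of steps 3 and 6 above, here is a minimal Python sketch. It is not part of Cloudman. The function names, the 10% headroom and the sample snapshot IDs are hypothetical, and the layout of persistent_data.yaml is not assumed beyond the old snapshot ID appearing somewhere in the file as plain text. Download the file from the bucket, run it through something like this, check the result by eye and upload it again.

```python
import math
import os
import re


def required_volume_gib(root, headroom=0.10):
    """Return a whole-GiB volume size that fits the files under `root`.

    `headroom` (an assumed 10% by default) leaves room for file system
    overhead. Symlinks are skipped so nothing is counted twice.
    """
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total_bytes += os.path.getsize(path)
    gib = total_bytes * (1 + headroom) / 1024 ** 3
    # EBS volumes are sized in whole GiB, so round up and never return 0.
    return max(1, math.ceil(gib))


# Shape of an EBS snapshot ID: "snap-" followed by hex digits.
SNAPSHOT_ID = re.compile(r"snap-[0-9a-f]+")


def replace_snapshot_id(yaml_text, old_id, new_id):
    """Swap `old_id` for `new_id` in the text of persistent_data.yaml.

    Works on plain text and makes no assumption about the YAML layout.
    Raises ValueError if `new_id` is malformed or `old_id` is absent.
    """
    if not SNAPSHOT_ID.fullmatch(new_id):
        raise ValueError("not a snapshot ID: %r" % new_id)
    # The lookahead stops a short ID from matching the start of a longer one.
    pattern = re.compile(re.escape(old_id) + r"(?![0-9a-f])")
    new_text, count = pattern.subn(new_id, yaml_text)
    if count == 0:
        raise ValueError("old snapshot ID not found: %r" % old_id)
    return new_text
```

For example, required_volume_gib("/mnt/galaxyIndices") run on the master instance after step 2 gives a size to use in step 3.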
Best,
Enis
...
Using m1.small/medium and getting rid of the 700GB would bring my costs
down to say $50 / month which is ok.
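The arithmetic behind that estimate can be sketched in a few lines of Python. The 55% / 45% split and the 1/2 and 1/4 price ratios are the figures quoted earlier in this message. The fraction of storage kept after trimming the indices (storage_kept) is not stated anywhere in the thread, so the 0.25 used here is purely an assumption for illustration, and the function name is hypothetical.

```python
def estimate_monthly_cost(current_total, storage_share, instance_factor, storage_kept):
    """Estimate a new monthly bill from the current one.

    current_total   -- current monthly cost in dollars
    storage_share   -- fraction of the bill that is storage (0.55 in the thread)
    instance_factor -- new instance price relative to m1.large (0.5 or 0.25)
    storage_kept    -- fraction of storage cost remaining (ASSUMED, not from the thread)
    """
    storage = current_total * storage_share
    instance = current_total * (1 - storage_share)
    return storage * storage_kept + instance * instance_factor


# Try both ends of the quoted $180 - $200 range with both instance types.
for total in (180, 200):
    for name, factor in (("m1.medium", 0.5), ("m1.small", 0.25)):
        estimate = estimate_monthly_cost(total, 0.55, factor, storage_kept=0.25)
        print("$%d now, %s: about $%.0f / month" % (total, name, estimate))
```

Under that assumed storage fraction, the m1.small case at $200 / month works out to about $50 / month; a different value for storage_kept shifts the result accordingly.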
Thanks !
Greg E
--
Greg Edwards,
Port Jackson Bioinformatics
gedwards2@gmail.com
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/
