Re: [galaxy-dev] Reducing costs in Cloud Galaxy
Greg,
Regarding the performance of different types of instances, I came across this and thought you might find it useful: http://cloudharmony.com/benchmarks
Enis
On Mon, Mar 19, 2012 at 7:49 PM, Greg Edwards <gedwards2@gmail.com> wrote:
Enis,
Thanks. Will try that re the storage.
Greg E
On Mon, Mar 19, 2012 at 4:49 PM, Enis Afgan <eafgan@emory.edu> wrote:
Hi Greg,
On Mon, Mar 19, 2012 at 11:01 AM, Greg Edwards <gedwards2@gmail.com> wrote:
Hi,
I've got an implementation of some proteomics tools going well in Galaxy on AWS EC2 under Cloudman. Thanks for the help along the way.
I need to drive the costs down a bit. I'm using an m1.large instance and it's costing about $180 - $200 / month. This is about 55% storage and 45% instance costs. That's peanuts in some senses but for now we need to get it down so that it comes out of petty cash for the department, while the case is proven for its use.
I have a few questions and would appreciate any insights.
1. AWS has just released an m1.medium and m1.small instance type, which are 1/2 and 1/4 the cost of m1.large.
http://aws.amazon.com/ec2/instance-types/ http://aws.amazon.com/ec2/pricing/
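As a rough check on what the smaller instance types could save, the sketch below applies the 1/2 and 1/4 price ratios to the instance share of the bill only and leaves storage untouched. The 45% instance share and the $180 to $200 range are the figures quoted above; the function name and the assumption that the instance runs the same number of hours each month are illustrative, not taken from AWS pricing.

```python
def monthly_cost_after_downsizing(total, instance_share, price_ratio):
    """Estimate the monthly bill when only the instance price changes.

    total: current monthly bill in dollars
    instance_share: fraction of the bill spent on the instance (0..1)
    price_ratio: new instance price relative to the current one
    """
    instance = total * instance_share
    storage = total - instance  # assumption: storage cost stays the same
    return storage + instance * price_ratio


if __name__ == "__main__":
    for total in (180, 200):
        for name, ratio in (("m1.medium", 0.5), ("m1.small", 0.25)):
            estimate = monthly_cost_after_downsizing(total, 0.45, ratio)
            print(f"${total}/month on m1.large -> about ${estimate:.2f} on {name}")
```

With these inputs storage remains the larger part of every estimate, so changing the instance type alone only goes so far.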
I tried the m1.small and m1.medium with the latest Cloudman AMI, galaxy-cloudman-2011-03-22 (ami-da58aab3). All seemed to install OK, but the tools took up to 30 minutes to start execution on m1.medium, and never started on m1.small.
m1.medium added only about 15% to run times compared with m1.large; I can't say for m1.small. t1.micro does run (and for free in my Free Tier first year) but blows execution times out by a factor of about 3, which is too much.
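One way to weigh a cheaper but slower instance is cost per job relative to m1.large: the price ratio multiplied by the run-time ratio. This is a minimal sketch that assumes you pay only while a job is running; for an instance left up all month the price ratio alone decides the bill. The ratios are the ones reported in this thread, and the function name is made up for illustration.

```python
def relative_cost_per_job(price_ratio, runtime_ratio):
    """Cost of one job relative to the baseline instance.

    Assumes billing is proportional to how long the job runs.
    """
    return price_ratio * runtime_ratio


if __name__ == "__main__":
    # m1.medium: half the price, about 15% longer run time than m1.large.
    ratio = relative_cost_per_job(0.5, 1.15)
    print(f"m1.medium costs about {ratio:.0%} of m1.large per job")
```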
Has anyone tried these new instance types (m1.small/medium)?
I have no real experience with these instance types yet either so maybe someone else can chime in on this?
2. The vast majority of the storage costs are for the genome databases in the 700GB /mnt/galaxyIndices, which I don't need. Can this be reduced to the bare essentials?
You can do this manually:
1. Start a new Galaxy cluster (i.e., one you can easily delete later).
2. ssh into the master instance and delete whatever genomes you don't need/want (these are all located under /mnt/galaxyIndices).
3. Create a new EBS volume of a size that'll fit whatever's left on the original volume, attach it and mount it.
4. Copy over the data from the original volume to the new one while keeping the directory structure the same (rsync is probably the best tool for this).
5. Unmount & detach the new volume; create a snapshot from it.
6. For the cluster you want to keep around (while it is terminated), edit persistent_data.yaml in its bucket on S3 and replace the existing snap ID for the galaxyIndices with the snapshot ID you got in the previous step.
7. Start that cluster and you should have a file system from the new snapshot mounted.
8. Terminate & delete the cluster you created in step 1.
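For step 3, the size of the new volume can be derived from what is left under /mnt/galaxyIndices once the unwanted genomes are deleted. The sketch below is one possible way to do that, not part of Cloudman: it sums the file sizes and rounds up to a whole number of GiB (the unit EBS volumes are sized in). The 20% default headroom and both function names are assumptions chosen for illustration.

```python
import math
import os


def directory_size_bytes(root):
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def volume_size_gib(used_bytes, headroom=0.2):
    """Smallest whole-GiB size holding used_bytes plus headroom (at least 1)."""
    gib = 1024 ** 3
    return max(1, math.ceil(used_bytes * (1 + headroom) / gib))


if __name__ == "__main__":
    used = directory_size_bytes("/mnt/galaxyIndices")
    size = volume_size_gib(used)
    print(f"{used / 1024 ** 3:.1f} GiB in use; request a {size} GiB volume")
```

Run it on the master instance after step 2 and before creating the volume.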
If you don't want to have to do this the first time around on your custom cluster, you can first try it with another temporary cluster and make sure it all works as expected and then move on to the real cluster.
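Step 6 comes down to swapping one snapshot ID for another in the text of persistent_data.yaml. The sketch below does that on a string and fails if the old ID is not present, so a mistyped ID cannot pass silently. The example file contents, key names and snapshot IDs are hypothetical; the real layout of persistent_data.yaml may differ, and downloading the file from S3 and uploading it again is not shown.

```python
import re


def replace_snapshot_id(yaml_text, old_snap_id, new_snap_id):
    """Return yaml_text with each whole-token old_snap_id replaced.

    Raises ValueError if old_snap_id does not occur in the text.
    """
    pattern = re.compile(r"(?<![\w-])" + re.escape(old_snap_id) + r"(?![\w-])")
    # A function is used as the replacement so new_snap_id is inserted literally.
    updated, count = pattern.subn(lambda _match: new_snap_id, yaml_text)
    if count == 0:
        raise ValueError(f"{old_snap_id} not found in the given text")
    return updated


if __name__ == "__main__":
    # Hypothetical contents and IDs, for illustration only.
    example = (
        "static_filesystems:\n"
        "- filesystem: galaxyIndices\n"
        "  snap_id: snap-11111111\n"
    )
    print(replace_snapshot_id(example, "snap-11111111", "snap-22222222"))
```

Keeping a copy of the original file before uploading the edited one makes it easy to go back to the old snapshot.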
Best, Enis
Using m1.small/medium and getting rid of the 700GB would bring my costs down to, say, $50 / month, which is OK.
Thanks ! Greg E
-- Greg Edwards, Port Jackson Bioinformatics gedwards2@gmail.com
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
Hi Enis, Greg,
I've taken material from this email thread and from previous conversations with Enis and put it in the wiki: http://wiki.g2.bx.psu.edu/Admin/Cloud/CapacityPlanning
Please feel free to update/correct/enhance.
Dave C.
On Mon, Mar 19, 2012 at 2:58 PM, Enis Afgan <eafgan@emory.edu> wrote:
-- http://galaxyproject.org/GCC2012 <http://galaxyproject.org/wiki/GCC2012> http://galaxyproject.org/ http://getgalaxy.org/ http://usegalaxy.org/ http://galaxyproject.org/wiki/
Just one extra thought on this: if you leave your instance up all the time, it may be worth looking into having a reserved micro instance up as the front end (cheap, or free with your intro tier) with SGE submission disabled. Then enable autoscaling (max 1) of m1.large/xlarge instances.
-Dannon
On Mar 19, 2012, at 7:20 PM, Dave Clements wrote:
participants (3)
- Dannon Baker
- Dave Clements
- Enis Afgan