CloudMan - Galaxy - Sharing Confusion

Hi guys,
I've asked some questions about sharing an instance, but it doesn't seem to be working the way I'm expecting. (Unfortunately I'm also new to Amazon EC2/S3, so that may be part of my difficulty.) I'm thinking maybe if I can explain what I'm trying to do, you guys could tell me the best way to do it:
My End Goal:
So I want to install a bioinformatics program* (and its dependencies, e.g., R, samtools, biopython, etc.) that will use SGE provided by CloudMan. Then once everything is configured and installed, I want to share it with other researchers.
I'm envisioning them launching CloudMan via http://biocloudcentral.herokuapp.com, having some method to get at my customizations, loading their own data via SCP and then running it.
What I've Tried So Far:
I created an instance of CloudMan and chose the data cluster option on the first dialog. Then I ssh'd in and installed stuff on /mnt/galaxyData. Then I clicked the share icon on the CloudMan front page. (But when I look in S3 I don't think I'm seeing the programs I installed, and I'm not sure how my /mnt/galaxyData volume can be shared with the share string.)
Let me know your thoughts.
Thanks again for the help,
-Greg
* I'm also hoping for an easy way to update the program while keeping it shared.

Greg;
I've asked some questions about sharing an instance but it doesn't seem to be working the way I'm expecting (Unfortunately I'm also new to Amazon EC2/S3 so that may be part of my difficulty). I'm thinking maybe if I can explain what I'm trying to do, you guys could tell me the best way to do it:
All of this will work fine the way you expect. It sounds like you might have to dig into the S3 buckets a bit more to get a sense of where everything is.
I created an instance of CloudMan and chose the data cluster option on the first dialog. Then I ssh'd in and installed stuff on /mnt/galaxyData. Then I clicked the share icon on the CloudMan front page.
To answer your question from the other thread, you can see the share string for any instances you've created by clicking on the share icon again. It will show you available shared clusters.
(But when I look in S3 I don't think I'm seeing the programs I installed, and I'm not sure how my /mnt/galaxyData volume can be shared with the sharestring.)
All of the high-level data for a shared instance will be in S3 folders in:
cm-aBigLongUniqueName/Shared/date--time
(which is also the share string).
You are right that the volume is not in S3. It is an EBS snapshot, which you can see in the EC2 console under the 'Snapshots' link. The description will start with 'CloudMan share-a-cluster.' In the S3 bucket, the file persistent_data.yaml holds the snapshot ID, which CloudMan uses to restart the exact cluster later with your updated volume.
In terms of costs, the S3 costs will be minimal since those are small files, but the EBS snapshot does cost $0.10/GB/month.
Hope this helps,
Brad
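To make the pieces above a bit more concrete, here is a minimal Python sketch, not CloudMan's own code. It pulls snapshot IDs out of the text of a persistent_data.yaml file with a regular expression, so it does not depend on the file's exact layout, and it gives a rough monthly storage estimate at the $0.10/GB price quoted above. The sample YAML contents, the key names in it, and the function names are all assumptions made up for illustration; check the real file in your own bucket and current AWS pricing.

```python
import re

# EC2 snapshot IDs are "snap-" followed by hex digits; this pattern accepts
# 8 to 17 of them.
SNAPSHOT_ID = re.compile(r"\bsnap-[0-9a-f]{8,17}\b")

# Price quoted in the thread; verify against current AWS pricing.
PRICE_PER_GB_MONTH = 0.10


def find_snapshot_ids(yaml_text):
    """Return the distinct snapshot IDs found in the text, in first-seen order."""
    seen = []
    for snap_id in SNAPSHOT_ID.findall(yaml_text):
        if snap_id not in seen:
            seen.append(snap_id)
    return seen


def monthly_snapshot_cost(size_gb, price_per_gb_month=PRICE_PER_GB_MONTH):
    """Rough monthly storage estimate in dollars for a snapshot of size_gb."""
    return size_gb * price_per_gb_month


# Hypothetical file contents: the real persistent_data.yaml layout may differ.
example = """
data_filesystems:
  galaxyData:
    - size: 20
      from_snap_id: snap-1a2b3c4d
"""

print(find_snapshot_ids(example))
print(round(monthly_snapshot_cost(20), 2))
```

Matching on the ID pattern rather than parsing the YAML keeps the sketch independent of whatever keys CloudMan actually writes.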

Hi guys,
Sorry if you get this twice. (I got an email back after sending it that said it was in moderation because it was too big, so I cut out the other text.)
-- orig message:
Thanks guys. That makes sense now. It sounds like I was almost there. Two follow-up questions:
1. Can I terminate my main instance, then start a new CloudMan instance later and enter this share string to get back to where I was? That would save some money vs "stopping" my main instance.
2. If I need to update files on what I've shared, would I just do the updates, create a new share instance and delete the old one? Is that the best way?
Thanks again,
Greg

Hi Greg,
Please see below.
Two follow up questions:
1. Can I terminate my main instance, then start a new Cloudman instance later and enter this share string to get back to where I was? That would save some money vs "stopping" my main instance.
You should always terminate a cluster. Stopping an instance is actually not supported within CloudMan...
2. If I need to update files on what I've shared, would I just do the updates, create a new share instance and delete the old one? Is that the best way?
Exactly.
Thanks again,
Greg
___________________________________________________________
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:

On Jan 12, 2012, at 1:29 PM, mailing list wrote:
Hi guys,
I've asked some questions about sharing an instance but it doesn't seem to be working the way I'm expecting (Unfortunately I'm also new to Amazon EC2/S3 so that may be part of my difficulty). I'm thinking maybe if I can explain what I'm trying to do, you guys could tell me the best way to do it:
My End Goal:
So I want to install a bioinformatics program* (and its dependencies, e.g., R, samtools, biopython, etc.) that will use SGE provided by CloudMan. Then once everything is configured and installed, I want to share it with other researchers.
For what it's worth, including Galaxy might make this all a little bit more accessible. That said, this should definitely work in the barebones scenario you're suggesting.
I'm envisioning them launching CloudMan via http://biocloudcentral.herokuapp.com, having some method to get at my customizations, loading their own data via SCP and then running it.
What I've Tried So Far:
I created an instance of CloudMan and chose the data cluster option on the first dialog. Then I ssh'd in and installed stuff on /mnt/galaxyData. Then I clicked the share icon on the CloudMan front page. (But when I look in S3 I don't think I'm seeing the programs I installed, and I'm not sure how my /mnt/galaxyData volume can be shared with the share string.)
You won't see the programs in S3. They're installed on an EBS volume, which you've taken a snapshot of as part of the sharing process. If you click 'Snapshots' in your AWS Management Console, you should see it.
The shared cluster string is the correct way to provide your customizations to another user. From what it sounds like, you were *almost* there. If you click the instance sharing icon again, you should see a list of share strings and snapshot IDs. Let me know if this isn't what you're seeing and I can try and troubleshoot.
Good luck!
-Dannon
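For anyone who would rather check this from a script than from the console, here is a small hedged Python sketch of the filtering step. It assumes snapshot records shaped like the entries in the "Snapshots" list that boto3's EC2 client returns from describe_snapshots (dictionaries with "SnapshotId", "Description" and "VolumeSize" keys) and keeps the ones whose description starts with the 'CloudMan share-a-cluster' text Brad mentions. The sample records and the function name are invented for illustration, and the exact description wording may differ between CloudMan versions.

```python
# Description prefix mentioned in the thread; treat the exact text as an assumption.
PREFIX = "CloudMan share-a-cluster"


def cloudman_share_snapshots(snapshots, prefix=PREFIX):
    """Keep only the snapshot records whose description starts with the prefix."""
    return [s for s in snapshots if s.get("Description", "").startswith(prefix)]


# Hypothetical records, shaped like entries of the "Snapshots" list returned by
# boto3's ec2_client.describe_snapshots(OwnerIds=["self"]).
records = [
    {
        "SnapshotId": "snap-1a2b3c4d",
        "Description": "CloudMan share-a-cluster example",
        "VolumeSize": 20,
    },
    {
        "SnapshotId": "snap-5e6f7a8b",
        "Description": "nightly backup",
        "VolumeSize": 100,
    },
]

for snap in cloudman_share_snapshots(records):
    print(snap["SnapshotId"], snap["VolumeSize"], "GB")
```

The filter works on plain dictionaries, so it can be tried without AWS credentials and then pointed at real describe_snapshots output.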

Thanks guys. That makes sense now. It sounds like I was almost there. Two follow up questions:
1. Can I terminate my main instance, then start a new Cloudman instance later and enter this share string to get back to where I was? That would save some money vs "stopping" my main instance.
2. If I need to update files on what I've shared, would I just do the updates, create a new share instance and delete the old one? Is that the best way?
Thanks again,
Greg
So one follow up question. Say I need to update something on my shared instance?

On Thu, Jan 12, 2012 at 2:46 PM, Dannon Baker <dannonbaker@me.com> wrote:
On Jan 12, 2012, at 1:29 PM, mailing list wrote:
My End Goal:
So I want to install a bioinformatics program* (and its dependencies, e.g., R, samtools, biopython, etc.) that will use SGE provided by CloudMan. Then once everything is configured and installed, I want to share it with other researchers.
For what it's worth, including Galaxy might make this all a little bit more accessible. That said, this should definitely work in the barebones scenario you're suggesting.
I'm curious, how would Galaxy make this more accessible? That might be worth looking into.
-Greg

Hi Greg,
Galaxy makes the tools more accessible by allowing web access to them. It automatically handles job submission and management, and it allows simpler integration with other tools to form complex analyses.
Enis
On Thu, Jan 12, 2012 at 8:59 PM, mailing list <margeemail@gmail.com> wrote:
I'm curious, how would Galaxy make this more accessible? That might be worth looking into.
-Greg
participants (4)
- Brad Chapman
- Dannon Baker
- Enis Afgan
- mailing list