CloudMan - Startup Options - Galaxy Cluster vs Data Cluster
Hi guys,
When I first go to my CloudMan page at <public dns>/cloud I get a dialog asking for some settings.
What does the Galaxy Cluster choice do? What does the Data Cluster choice do? Why can't I choose both? How much space should I allocate?
Thanks,
Greg
Text of dialog:
Galaxy Cluster: Galaxy application, available tools, reference datasets, SGE job manager, and a data volume. Specify the initial storage size (in Gigabytes): GB OK
Share-an-Instance Cluster: derive your cluster from someone else's cluster. Specify the provided cluster share-string (for example, cm-0011923649e9271f17c4f83ba6846db0/shared/2011-08-19--21-00): Cluster share-string
Data Cluster: a persistent data volume and SGE. Specify the initial storage size (in Gigabytes): GB
Test Cluster: SGE only. No persistent storage is created.
Data Cluster only configures a persistent data volume and SGE. Galaxy Cluster includes a running Galaxy instance in addition to the persistent data volume and SGE configuration components from the Data Cluster option. The Galaxy Cluster option is what most users will probably want to choose.
How much space to allocate depends entirely on your needs. Keep in mind that you can increase the size later if necessary via the admin UI. The maximum volume size (per an Amazon limit on EBS volume size) is currently 1 TB.
-Dannon
(In reply to mailing list, Jan 10, 2012, 2:42 PM.)
___________________________________________________________
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
Thanks. So if I just want to run my own program that uses SGE, it sounds like I shouldn't need a Galaxy instance. Is that right?
Does the storage space parameter control how much space is at /mnt/galaxyTools?
-Greg
(In reply to Dannon Baker, Tue, Jan 10, 2012, 2:50 PM.)
It's also odd: if I launch CloudMan from the web interface at https://biocloudcentral.herokuapp.com, I don't see that dialog when I log into <public dns>/cloud. Is it handling that aspect for me?
(In reply to mailing list, Tue, Jan 10, 2012, 2:53 PM.)
Correct on the first point.
The space available to /mnt/galaxyTools is not configurable through the interface, though you could manually resize the volume if you did need more room. /mnt/galaxyData is the volume referred to in the dialogs you're seeing.
-Dannon
(In reply to mailing list, Jan 10, 2012, 2:53 PM.)
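To see how much room each of the two volumes named in this thread has on a running instance, something like the following Python sketch could be used. It is only an illustration: the function name volume_report is made up for this example, the mount points are taken from the messages above, and sizes are reported in decimal gigabytes, which may not match how the CloudMan dialog counts them.

```python
import shutil

# Mount points named in the thread; adjust if your instance differs.
DEFAULT_MOUNTS = ("/mnt/galaxyData", "/mnt/galaxyTools")


def volume_report(mount_points=DEFAULT_MOUNTS):
    """Return {path: (total_gb, free_gb)} for every path that exists.

    Sizes use decimal gigabytes (1 GB = 1e9 bytes), an assumption of
    this sketch. Paths that do not exist are skipped silently.
    """
    report = {}
    for path in mount_points:
        try:
            usage = shutil.disk_usage(path)
        except OSError:
            # Path is missing (for example, on a Test Cluster).
            continue
        report[path] = (usage.total / 1e9, usage.free / 1e9)
    return report


if __name__ == "__main__":
    for path, (total_gb, free_gb) in volume_report().items():
        print(f"{path}: {free_gb:.1f} GB free of {total_gb:.1f} GB")
```

Note that shutil.disk_usage reports on whatever filesystem contains the given path, so if a volume is not mounted but the directory exists, the numbers describe the parent filesystem instead.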
Hi guys,
Another related question. So if I need a large data set for my program to run on, should that also be stored in galaxyData? What is the best way to get the data on there?
I saw something about FTP but it seemed kind of confusing. Also, I chose the Data Cluster option, so I don't think I have an instance of Galaxy running. Is that what would do the FTP?
Thanks again,
Greg
(In reply to Dannon Baker, Tue, Jan 10, 2012, 3:01 PM.)
Yes, galaxyData is the only persistent volume you have in your data-only cluster, so everything should go there. SCP should work for getting data there, unless the file is already hosted somewhere, in which case wget is probably a better choice.
And yes, FTP is configured specifically for Galaxy uploads and is unavailable on a non-Galaxy cluster. The usernames used for logging in, etc., come straight from the Galaxy database.
(In reply to mailing list, Jan 12, 2012, 12:56 PM.)
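As a rough illustration of the two transfer routes mentioned in this thread (SCP from your own machine, or a wget-style download run on the instance itself), here is a minimal Python sketch. The helper names scp_command and fetch_to_dir, the login user ubuntu, and the optional key file are assumptions made for the example, not details stated in the thread; adjust them to your instance.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

# Persistent volume named in the thread.
DATA_DIR = "/mnt/galaxyData"


def scp_command(local_path, host, user="ubuntu", key_file=None,
                remote_dir=DATA_DIR):
    """Build the argument list for copying a local file to the instance.

    The default user "ubuntu" is an assumption; use whatever account
    your instance accepts. The list can be passed to subprocess.run.
    """
    cmd = ["scp"]
    if key_file:
        cmd += ["-i", str(key_file)]  # private key for the instance
    cmd += [str(local_path), f"{user}@{host}:{remote_dir}/"]
    return cmd


def fetch_to_dir(url, dest_dir=DATA_DIR):
    """Download url into dest_dir, much like wget would, and return the path.

    Intended to be run on the instance itself, so that an already
    hosted file goes straight onto the data volume.
    """
    name = Path(urlparse(url).path).name or "download"
    target = Path(dest_dir) / name
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)  # stream to disk in chunks
    return target
```

From a local machine, the first helper would typically be used as subprocess.run(scp_command("reads.fastq", "<public dns>", key_file="mykey.pem"), check=True), where the file and key names are placeholders.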
Ok, yeah, SCP does make sense. Thanks.
-Greg
(In reply to Dannon Baker, Thu, Jan 12, 2012, 1:06 PM.)
participants (2)
-
Dannon Baker
-
mailing list