Hi Brad,
I used LVM2 to create the logical volume.
I re-launched a new Galaxy Cloudman instance, since I had already removed the previous one.
So I now have an LVM volume of 2 TB (1.8 TB net).
You can see this in the picture below: 1.5 TB available + 336 GB used ≈ 1.8 TB.
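For reference, the LVM2 setup was along these lines (a sketch only; /dev/sdf and /dev/sdg stand in for the two attached EBS volumes, and the volume-group/LV names are illustrative):

# initialize the attached EBS volumes as LVM physical volumes
pvcreate /dev/sdf /dev/sdg
# group them into one volume group
vgcreate galaxyVG /dev/sdf /dev/sdg
# create a single logical volume spanning all free space (~2 TB)
lvcreate -l 100%FREE -n galaxyData galaxyVG
# put a filesystem on it and mount it where Cloudman expects the data volume
mkfs.xfs /dev/galaxyVG/galaxyData
mount /dev/galaxyVG/galaxyData /mnt/galaxyData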
The error/warning is: “Did not find a volume attached to instance i-xxxx as device ‘None’, file system ‘galaxyData’ (vols=[])” — presumably Cloudman’s own record of attached volumes (vols=[]) no longer matches the hand-built LVM device.
If I launch an extra node, /mnt/galaxyData is mounted onto the node correctly:
ubuntu@ip-10-46-134-155:~$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              15G   12G  3.3G  79% /
devtmpfs              3.7G  116K  3.7G   1% /dev
none                  3.8G     0  3.8G   0% /dev/shm
none                  3.8G   96K  3.8G   1% /var/run
none                  3.8G     0  3.8G   0% /var/lock
none                  3.8G     0  3.8G   0% /lib/init/rw
/dev/sdb              414G  201M  393G   1% /mnt
domU-12-31-39-0A-62-12.compute-1.internal:/mnt/galaxyData
                      1.9T  336G  1.5T  19% /mnt/galaxyData
domU-12-31-39-0A-62-12.compute-1.internal:/mnt/galaxyTools
                       10G  1.7G  8.4G  17% /mnt/galaxyTools
domU-12-31-39-0A-62-12.compute-1.internal:/mnt/galaxyIndices
                      700G  654G   47G  94% /mnt/galaxyIndices
domU-12-31-39-0A-62-12.compute-1.internal:/opt/sge
                       15G   12G  3.3G  79% /opt/sge
Uploading a file is OK, but the “Grooming” results in the following error
(BTW, this grooming succeeds in a “normal” Galaxy Cloudman setup on the same file with the same parameters):
WARNING:galaxy.datatypes.registry:Overriding conflicting datatype with extension 'coverage', using datatype from /mnt/galaxyData/tmp/tmpGx9fsi.
I then moved /mnt/galaxyData/tmp/tmpGx9fsi to /mnt/galaxyData/tmp/tmpGx9fsi.old, but that didn't help.
I restarted all services (Galaxy, SGE, PostgreSQL) …
SGE log:
02/15/2012 11:22:08| main|domU-12-31-39-0A-62-12|I|read job database with 0 entries in 0 seconds
02/15/2012 11:22:08| main|domU-12-31-39-0A-62-12|E|error opening file "/opt/sge/default/common/./sched_configuration" for reading: No such file or directory
02/15/2012 11:22:08| main|domU-12-31-39-0A-62-12|E|error opening file "/opt/sge/default/spool/qmaster/./sharetree" for reading: No such file or directory
02/15/2012 11:22:08| main|domU-12-31-39-0A-62-12|I|qmaster hard descriptor limit is set to 8192
02/15/2012 11:22:08| main|domU-12-31-39-0A-62-12|I|qmaster soft descriptor limit is set to 8192
02/15/2012 11:22:08| main|domU-12-31-39-0A-62-12|I|qmaster will use max. 8172 file descriptors for communication
02/15/2012 11:22:08| main|domU-12-31-39-0A-62-12|I|qmaster will accept max. 99 dynamic event clients
02/15/2012 11:22:08| main|domU-12-31-39-0A-62-12|I|starting up GE 6.2u5 (lx24-amd64)
02/15/2012 11:22:08| main|domU-12-31-39-0A-62-12|W|can't open job sequence number file "jobseqnum": for reading: No such file or directory -- guessing next number
02/15/2012 11:22:08| main|domU-12-31-39-0A-62-12|W|can't open ar sequence number file "arseqnum": for reading: No such file or directory -- guessing next number
02/15/2012 11:22:12|worker|domU-12-31-39-0A-62-12|E|adminhost "domU-12-31-39-0A-62-12.compute-1.internal" already exists
02/15/2012 11:22:13|worker|domU-12-31-39-0A-62-12|E|adminhost "domU-12-31-39-0A-62-12.compute-1.internal" already exists
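The missing sched_configuration/sharetree files above suggest the qmaster spool under /opt/sge was recreated empty. To check whether any queue survived (the “no suitable queues” error further down points the same way), the standard SGE commands can be used; a sketch, assuming SGE_ROOT=/opt/sge:

export SGE_ROOT=/opt/sge
. $SGE_ROOT/default/common/settings.sh   # put qconf/qstat on the PATH
qconf -sql      # list configured cluster queues (expect at least all.q)
qhost           # list execution hosts known to the qmaster
qstat -f        # full queue status, including disabled/error states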
I uploaded my FASTQ file (OK) and tried to “Groom” it.
Galaxy log:
galaxy.jobs.runners.drmaa DEBUG 2012-02-15 11:30:53,425 (30) submitting file /mnt/galaxyTools/galaxy-central/database/pbs/galaxy_30.sh
galaxy.jobs.runners.drmaa DEBUG 2012-02-15 11:30:53,425 (30) command is: python /mnt/galaxyTools/galaxy-central/tools/fastq/fastq_groomer.py '/mnt/galaxyData/files/000/dataset_58.dat' 'illumina' '/mnt/galaxyData/files/000/dataset_59.dat' 'sanger' 'ascii' 'summarize_input'; cd /mnt/galaxyTools/galaxy-central; /mnt/galaxyTools/galaxy-central/set_metadata.sh /mnt/galaxyData/files /mnt/galaxyData/tmp/job_working_directory/000/30 . /mnt/galaxyTools/galaxy-central/universe_wsgi.ini /mnt/galaxyData/tmp/tmp2GBeCB /mnt/galaxyData/tmp/job_working_directory/000/30/galaxy.json /mnt/galaxyData/tmp/job_working_directory/000/30/metadata_in_HistoryDatasetAssociation_59_Q8oYiT,/mnt/galaxyData/tmp/job_working_directory/000/30/metadata_kwds_HistoryDatasetAssociation_59_UXjfqE,/mnt/galaxyData/tmp/job_working_directory/000/30/metadata_out_HistoryDatasetAssociation_59_qWHyc4,/mnt/galaxyData/tmp/job_working_directory/000/30/metadata_results_HistoryDatasetAssociation_59_zGJk7G,,/mnt/galaxyData/tmp/job_working_directory/000/30/metadata_override_HistoryDatasetAssociation_59_KjamX7
galaxy.jobs.runners.drmaa ERROR 2012-02-15 11:30:53,427 Uncaught exception queueing job
Traceback (most recent call last):
File "/mnt/galaxyTools/galaxy-central/lib/galaxy/jobs/runners/drmaa.py", line 133, in run_next
self.queue_job( obj )
File "/mnt/galaxyTools/galaxy-central/lib/galaxy/jobs/runners/drmaa.py", line 213, in queue_job
job_id = self.ds.runJob(jt)
File "/mnt/galaxyTools/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/__init__.py", line 331, in runJob
_h.c(_w.drmaa_run_job, jid, _ct.sizeof(jid), jobTemplate)
File "/mnt/galaxyTools/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/helpers.py", line 213, in c
return f(*(args + (error_buffer, sizeof(error_buffer))))
File "/mnt/galaxyTools/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/errors.py", line 90, in error_check
raise _ERRORS[code-1]("code %s: %s" % (code, error_buffer.value))
DeniedByDrmException: code 17: error: no suitable queues
148.177.129.210 - - [15/Feb/2012:11:30:56 +0000] "POST /root/history_item_updates HTTP/1.0" 200 - "http://ec2-23-20-77-195.compute-1.amazonaws.com/history" "Mozilla/5.0 (Windows NT 5.1; rv:10.0.1) Gecko/20100101 Firefox/10.0.1"
galaxy.web.framework DEBUG 2012-02-15 11:30:59,815 Error: this request returned None from get_history(): http://127.0.0.1:8080/
127.0.0.1 - - [15/Feb/2012:11:30:59 +0000] "GET / HTTP/1.1" 200 - "-" "Python-urllib/2.6"
148.177.129.210 - - [15/Feb/2012:11:31:00 +0000] "POST /root/history_item_updates HTTP/1.0" 200 - "http://ec2-23-20-77-195.compute-1.amazonaws.com/history" "Mozilla/5.0 (Windows NT 5.1; rv:10.0.1) Gecko/20100101 Firefox/10.0.1"
148.177.129.210 - - [15/Feb/2012:11:31:04 +0000] "POST /root/history_item_updates HTTP/1.0" 200 - "http://ec2-23-20-77-195.compute-1.amazonaws.com/history" "Mozilla/5.0 (Windows NT 5.1; rv:10.0.1) Gecko/20100101 Firefox/10.0.1"
148.177.129.210 - - [15/Feb/2012:11:31:08 +0000] "POST /root/history_item_updates HTTP/1.0" 200 - "http://ec2-23-20-77-195.compute-1.amazonaws.com/history" "Mozilla/5.0 (Windows NT 5.1; rv:10.0.1) Gecko/20100101 Firefox/10.0.1"
148.177.129.210 - - [15/Feb/2012:11:31:12 +0000] "POST /root/history_item_updates HTTP/1.0" 200 - "http://ec2-23-20-77-195.compute-1.amazonaws.com/history" "Mozilla/5.0 (Windows NT 5.1; rv:10.0.1) Gecko/20100101 Firefox/10.0.1"
148.177.129.210 - - [15/Feb/2012:11:31:17 +0000] "POST /root/history_item_updates HTTP/1.0" 200 - "http://ec2-23-20-77-195.compute-1.amazonaws.com/history" "Mozilla/5.0 (Windows NT 5.1; rv:10.0.1) Gecko/20100101 Firefox/10.0.1"
galaxy.web.framework DEBUG 2012-02-15 11:31:19,186 Error: this request returned None from get_history(): http://127.0.0.1:8080/
127.0.0.1 - - [15/Feb/2012:11:31:19 +0000] "GET / HTTP/1.1" 200 - "-" "Python-urllib/2.6"
148.177.129.210 - - [15/Feb/2012:11:31:21 +0000] "POST /root/history_item_updates HTTP/1.0" 200 - "http://ec2-23-20-77-195.compute-1.amazonaws.com/history" "Mozilla/5.0 (Windows NT 5.1; rv:10.0.1) Gecko/20100101 Firefox/10.0.1"
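If qconf -sql comes back empty, the queue configuration really was lost and would need to be recreated. A hypothetical recovery sketch (host name copied from the logs above; the all.q config file path is a placeholder, and its contents would have to match the original Cloudman settings):

# register the master as admin/submit host
# (the log shows the admin host already exists, so these may be no-ops)
qconf -ah domU-12-31-39-0A-62-12.compute-1.internal
qconf -as domU-12-31-39-0A-62-12.compute-1.internal
# recreate the default queue from a saved configuration file, then enable it
qconf -Aq /tmp/all.q.conf
qmod -e all.q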
Kind Regards
Yves
-----Original Message-----
From: Brad Chapman [mailto:chapmanb@50mail.com]
Sent: Wednesday, 15 February 2012 02:22
To: Wetzels, Yves [JRDBE Extern]; galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] Galaxy Cloudman - How to analyse > 1TB data ?
Yves;
> I am currently investigating if Galaxy Cloudman can help us in analyzing
> large NGS datasets.
>
> I was first impressed by the simple setup, the autoscaling and
> usability of Galaxy Cloudman but soon ran into the 1 TB EBS volume limit
>
> I thought to be clever and unmounted the /mnt/galaxyData EBS volume,
> created a logical volume of 2 TB and remounted this volume to
> /mnt/galaxyData.
How did you create this volume? I know there are some tricks to get
around the 1 TB limit:
http://alestic.com/2009/06/ec2-ebs-raid
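For concreteness, the approach in that article stripes several sub-1TB EBS volumes together with mdadm; a sketch (device names illustrative):

# stripe two attached EBS volumes into one ~2 TB RAID-0 device
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdf /dev/sdg
mkfs.xfs /dev/md0
mount /dev/md0 /mnt/galaxyData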
In the screenshot you sent it looks like Cloudman is a bit confused
about the disk size. The Disk Status lists 1.2 TB out of 668 GB, which
might be the source of your problems.
> All is green as you can see from the picture below, but running a tool is
> not possible, since Galaxy is not configured to work with a logical
> volume, I assume.
Can you describe what errors you are seeing?
> It is truly a waste having this fine setup (autoscaling), but it is not
> usable if there is not enough storage?
>
> Does anybody have experience with this? Tips, tricks ...
The more general answer is that folks do not normally use EBS this way,
since keeping large permanent EBS filesystems is expensive. S3 stores
larger data (individual objects up to 5 TB, buckets essentially unlimited)
at a more reasonable price. S3 files are then copied to a transient EBS
store, processed, and the results uploaded back to S3. This isn't as
automated, since it is highly dependent on your workflow and which files
you want to save, but it might be worth exploring in general when
using EC2.
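As a sketch of that staging pattern with s3cmd (bucket and file paths are placeholders):

# pull input data from S3 onto a transient EBS/ephemeral disk
s3cmd get s3://my-ngs-bucket/run1/sample.fastq /mnt/transient/sample.fastq
# ... run the analysis against /mnt/transient ...
# push only the results you want to keep back to S3
s3cmd put /mnt/transient/sample_groomed.fastq s3://my-ngs-bucket/run1/results/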
Hope this helps,
Brad