Hi all,

I am performing some tests to move my Galaxy database to ZFS. Does anybody have experience with ZFS on Linux, and any recommendations or experiences for optimizing performance? The purpose is to share the database over NFS with the Galaxy VM.

Thanks,
Joachim.

--
Joachim Jacob
Contact details: http://www.bits.vib.be/index.php/about/80-team
Are you moving the PostgreSQL/MySQL database over to ZFS, or the actual storage of your datasets? I am assuming the latter. Remember, ZFS is just a file system; you still need a protocol, such as NFS, to export the data to each of your machines. That is going to be your bottleneck. Luckily, the NFS client supports write caching, as described in our test here[1] (see the mount-option sketch below).

On our HPC cluster we run Gluster on top of XFS, and we have another filesystem running FraunhoferFS on top of XFS. I tested ZFS on Linux roughly a year ago, and in terms of read/write on a single machine it was slower than native ext4 and XFS. This was of course to be expected. However, the added benefits of ZFS may be more favorable in your case: snapshots, disk management, ZIL/SSD caching, etc. If you use a distributed filesystem like Gluster, FraunhoferFS, or even Lustre (the 2.x branch supports ZFS!), you will most certainly get some very good read/write speeds.

However, it sounds like you are using a single machine, so your read/write is going to be slower than native ext4 and XFS --- trust but verify: run your own read/write tests.

When I was using ZFS, the ZFSBuild[2] website was most helpful.

Let me know if you have any other questions,
-Adam

[1]: http://www.gluster.org/pipermail/gluster-users/2012-September/034295.html
[2]: http://www.zfsbuild.com/

--
Adam Brenner
Computer Science, Undergraduate Student
Donald Bren School of Information and Computer Sciences
Research Computing Support
Office of Information Technology
http://www.oit.uci.edu/rcs/
University of California, Irvine
www.ics.uci.edu/~aebrenne/
aebrenne@uci.edu
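[Editor's note: the client-side write caching Adam mentions is governed by NFS mount options on the Galaxy VM. A minimal sketch, with a hypothetical server name and illustrative transfer sizes - tune against your own benchmarks:]

    # /etc/fstab entry on the NFS client (server name 'zfshost' is hypothetical).
    # 'async' allows client-side write caching; rsize/wsize set transfer sizes.
    zfshost:/mnt/galaxydb  /mnt/galaxydb  nfs  rw,async,hard,rsize=65536,wsize=65536  0 0

    # Mount it and verify the negotiated options:
    mount /mnt/galaxydb
    nfsstat -m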
Hi Joachim,

Something that may help with benchmarking: at the July 2012 GalaxyAdmins meetup, Anne Black-Ziegelbein talked about how they evaluated filesystem options. She also included benchmarking scripts and data. See http://wiki.galaxyproject.org/Community/GalaxyAdmins/Meetups/2012_07_09

Dave C
--
http://galaxyproject.org/ http://getgalaxy.org/ http://usegalaxy.org/ http://wiki.galaxyproject.org/
Hi Joachim,

At AgResearch we are using ZFS for our HPC storage, which is used by our internal Galaxy instance. Currently we are running on FreeNAS (a FreeBSD derivative), but we are in transition to ZFS on Linux. We export the filesystem over NFS (10Gb Ethernet), but not the database (PostgreSQL). In general, you want block storage for a database, so I suggest you look for a solution other than NFS to host that.

Our experience with ZFS has been very positive. However, FreeNAS is not really suited to our needs - it's more of a storage appliance, probably great for a home NAS. Hence the planned transition.

I strongly recommend you follow the discussion on the ZFS discuss mailing list. There's a lot to learn about ZFS configuration, much more than you will glean from a few posts here. http://zfsonlinux.org/lists.html

cheers,
Simon
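[Editor's note: one way to give the database block storage while keeping the data on ZFS is a zvol - a dataset exposed as a block device that can be attached to the VM directly instead of going through NFS. A minimal sketch; the pool, name, and size are hypothetical:]

    # Create a volume whose block size matches PostgreSQL's 8 kB page size.
    zfs create -V 100G -b 8K tank/pgdata

    # The zvol appears as a block device on the host; attach it to the
    # Galaxy VM as a virtual disk (e.g. virtio) and format it in the guest.
    ls -l /dev/zvol/tank/pgdata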
Thank you all for the reactions.

Some details about my current ZFS and Galaxy setup:

- Galaxy runs as a single virtual machine, currently with 20 cores and 80GB RAM; this will soon be 32 cores and about 160GB RAM.
- The postgres database is on the virtual machine itself.
- The 'files' and 'job_working_dir' directories are on an NFS-exported directory, hosted on the host machine of the guest.
- The NFS-exported directory is a dataset on a raidz1 pool.
- The raidz1 runs on seven 550GB SAS disks, which are (unfortunately) behind a hardware RAID controller but passed through as single-disk RAID0 volumes (JBOD is not available). So raidz1 runs on seven RAID0 disks, with these settings in the hardware controller (a PERC H700): no read-ahead, write-through, 8KB stripe size, disk cache policy enabled.
- Compression and deduplication are enabled.
- The directory on which the ZFS dataset is mounted is exported to the Galaxy virtual machine using the native Linux NFS daemon (a sketch of such an export appears below). The 'zfs sharenfs' option did not work (ownerships were not set correctly - perhaps this needs more investigation, but I have found several reports that the sharenfs option of ZFS on Linux does not behave well...).

The numbers:

- My initial files database (ext4 on RAID5) is 3.0TB in size. On ZFS, with compression and deduplication, this database is *1.8TB* (-40%).
- I have not yet provided a SLOG to host the ZIL, or an L2ARC, so that I first get a clear picture of the performance I can achieve without them. Would you advise putting the ZIL on a SLOG first, or adding an SSD to host the L2ARC?
- The cost of this storage efficiency is RAM: ZFS is currently using *284GB RAM* continuously!
- The write and read speeds from the Galaxy VM over NFS are *~40MB/s and ~100MB/s* respectively (tested by simply copying with rsync - I still need to check the presentation and scripts of Anne Black-Ziegelbein). This is a 66% decrease from the write and read speeds previously achieved with ext4 on hardware RAID5, but I feel that the benefits (deduplication, backing up via snapshots, data integrity, ...) outweigh the IO performance loss.

(I am setting this ZFS up on a new server - well, actually two years old now; it has served another project well.)

Currently our Galaxy uses this ZFS setup with success!
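[Editor's note: the native-NFS-daemon export described above amounts to a one-line /etc/exports entry. A sketch, with the client spec echoing the sharenfs setting below but otherwise illustrative; note that 'async' on the server trades crash-safety for write speed, so treat it as a tunable, not a recommendation:]

    # /etc/exports on the ZFS host ('@galaxy' names a netgroup; illustrative)
    /mnt/galaxydb   @galaxy(rw,async,no_subtree_check,no_root_squash)

    # Re-read the exports table after editing:
    exportfs -ra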
For your interest, my settings on the 'galaxydb' ZFS dataset are below. (I was wondering whether some more wizardry can be applied here.)

************
[root@r910bits ~]# zfs get all tank/galaxydb
NAME           PROPERTY              VALUE                  SOURCE
tank/galaxydb  type                  filesystem             -
tank/galaxydb  creation              Mon Sep  9 12:44 2013  -
tank/galaxydb  used                  1.81T                  -
tank/galaxydb  available             1.66T                  -
tank/galaxydb  referenced            1.81T                  -
tank/galaxydb  compressratio         1.66x                  -
tank/galaxydb  mounted               yes                    -
tank/galaxydb  quota                 none                   default
tank/galaxydb  reservation           none                   default
tank/galaxydb  recordsize            128K                   default
tank/galaxydb  mountpoint            /mnt/galaxydb          local
tank/galaxydb  sharenfs              rw=@galaxy             local
tank/galaxydb  checksum              on                     default
tank/galaxydb  compression           lzjb                   local
tank/galaxydb  atime                 on                     default
tank/galaxydb  devices               on                     default
tank/galaxydb  exec                  on                     default
tank/galaxydb  setuid                on                     default
tank/galaxydb  readonly              off                    default
tank/galaxydb  zoned                 off                    default
tank/galaxydb  snapdir               hidden                 default
tank/galaxydb  aclinherit            restricted             default
tank/galaxydb  canmount              on                     default
tank/galaxydb  xattr                 on                     default
tank/galaxydb  copies                1                      default
tank/galaxydb  version               5                      -
tank/galaxydb  utf8only              off                    -
tank/galaxydb  normalization         none                   -
tank/galaxydb  casesensitivity       sensitive              -
tank/galaxydb  vscan                 off                    default
tank/galaxydb  nbmand                off                    default
tank/galaxydb  sharesmb              off                    default
tank/galaxydb  refquota              none                   default
tank/galaxydb  refreservation        none                   default
tank/galaxydb  primarycache          all                    default
tank/galaxydb  secondarycache        all                    default
tank/galaxydb  usedbysnapshots       0                      -
tank/galaxydb  usedbydataset         1.81T                  -
tank/galaxydb  usedbychildren        0                      -
tank/galaxydb  usedbyrefreservation  0                      -
tank/galaxydb  logbias               latency                default
tank/galaxydb  dedup                 on                     local
tank/galaxydb  mlslabel              none                   default
tank/galaxydb  sync                  standard               default
tank/galaxydb  refcompressratio      1.66x                  -
tank/galaxydb  written               1.81T                  -
tank/galaxydb  snapdev               hidden                 default
*****************

Cheers,
Joachim

Joachim Jacob
Contact details: http://www.bits.vib.be/index.php/about/80-team
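[Editor's note: a few property tweaks that often come up for this kind of workload - hedged suggestions to benchmark rather than settled advice, using the dataset name from the output above:]

    # Skip access-time writes on every read (atime is 'on' above):
    zfs set atime=off tank/galaxydb

    # On ZFS on Linux, store extended attributes in the inode (fewer IOPS):
    zfs set xattr=sa tank/galaxydb

    # The dedup table lives in ARC/L2ARC and drives much of the RAM usage
    # reported above; disabling dedup affects only newly written blocks:
    zfs set dedup=off tank/galaxydb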
Hi all,

I'm a big fan of ZFS; we have long used it behind Galaxy Main. Some of our older servers are (still) Solaris, and the newest is FreeBSD. I've lately been using SmartOS for virtualization, and while it has a drawback as a fileserver (currently the NFS server can only run in the global zone, which is not ideal on SmartOS), there are other illumos derivatives that would probably be great for this task (e.g. OmniOS). Native ZFS in the OS in which it was developed is a win for me, especially when you are serving via simple NFS. For more complex network filesystems, Linux is probably preferable.

I considered a separate ZIL and L2ARC for the latest ZFS server, but DTrace revealed that I probably would not see much of a performance benefit with our usage patterns. The memory usage you're seeing is to be expected - ZFS will pretty much consume whatever is available for caching, but that memory is available to be freed if needed for something else.

I wouldn't suggest rsync for performance testing. I typically do things like timed writes of blocks read from /dev/zero using dd, so that the source filesystem and checksumming algorithm are taken out of the equation (a sketch follows below). And dedup/compression will of course incur a significant write penalty. If you can accept the decreased space optimization, lzjb performs significantly better than gzip. gzip-1 is a nice compromise between the default gzip level and lzjb, as well.

--nate
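[Editor's note: a minimal version of the dd test Nate describes, with hypothetical paths and sizes. One caveat worth adding: on a compressed or deduplicated dataset, a stream of zeros collapses to almost nothing, so the zero-fill figure is best treated as an upper bound:]

    # Timed 4 GiB sequential write, flushed to disk before dd reports a rate:
    dd if=/dev/zero of=/mnt/galaxydb/ddtest bs=1M count=4096 conv=fdatasync

    # Incompressible input gives a more realistic figure on this dataset
    # (/dev/urandom itself may bottleneck; pre-generating a source file on
    # another filesystem avoids that):
    dd if=/dev/urandom of=/mnt/galaxydb/ddtest-rand bs=1M count=1024 conv=fdatasync

    # The compression trade-off mentioned above:
    zfs set compression=lzjb tank/galaxydb     # fast, lighter compression
    zfs set compression=gzip-1 tank/galaxydb   # compromise: gzip at level 1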
Thanks Nate. Compression is currently lzjb. I will do some testing with dd.

Joachim

Joachim Jacob
Contact details: http://www.bits.vib.be/index.php/about/80-team
participants (5)
- Adam Brenner
- Dave Clements
- Guest, Simon
- Joachim Jacob | VIB |
- Nate Coraor