ZFS setup with deduplication

delicatepc

New Member
Nov 4, 2011
Hi Folks,

Proxmox 2.0 beta is really neat software. I tried a couple of different KVM management solutions and none of them hits the spot like Proxmox.

So today I am wondering whether anyone is using ZFS in conjunction with Proxmox. ZFS has recently become available natively on Linux. Has anyone enabled the data deduplication option as well?

I am going to attempt to run this. Any suggestions or words of advice?

~
D
 
We are using ZFS on our Solaris/Nexenta SAN ;) (via iSCSI).

We don't use dedup; it uses a lot of RAM (75 GB of RAM for 1 TB of deduplicated data with 4 KB blocks). Disk space is cheaper than RAM.

Really? I was not aware of this. Do all commercial dedup-capable SANs work like this?
 
I don't know about other systems,

but with ZFS dedup you need to keep a hash entry of about 300 bytes in memory for each block you want to dedup.

When you write a new block, its hash is compared against the hash table of all other blocks,

so you need to keep all of the 300-byte entries in memory.

We are working with 4 KB blocks (small files), but to use less memory you can use blocks of up to 128 KB (so it takes 300 bytes per 128 KB block, a better ratio).

Also, dedup slows down writes, because for each write you need to compute and compare the hash.
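
The rule of thumb above (one hash entry of roughly 300 bytes per block) can be turned into a quick estimate. This is only a sketch under that assumption: the function name `ddt_ram_bytes` is made up for illustration, the 300-byte figure is the poster's approximation rather than an exact ZFS constant, and every block is assumed to be full-sized.

```python
# Rough dedup-table RAM estimate: one ~300-byte entry per block.
# The 300-byte entry size is the approximation quoted in this thread.
KIB = 1024
GIB = 1024 ** 3
TIB = 1024 ** 4


def ddt_ram_bytes(data_bytes: int, block_bytes: int, entry_bytes: int = 300) -> int:
    """Return the estimated RAM needed to hold one entry per block."""
    blocks = data_bytes // block_bytes
    return blocks * entry_bytes


for block_kib in (4, 128):
    ram = ddt_ram_bytes(1 * TIB, block_kib * KIB)
    print(f"1 TiB of data, {block_kib} KiB blocks: {ram / GIB:.1f} GiB of RAM")
# prints:
# 1 TiB of data, 4 KiB blocks: 75.0 GiB of RAM
# 1 TiB of data, 128 KiB blocks: 2.3 GiB of RAM
```

With 4 KiB blocks this reproduces the 75 GB per 1 TB figure mentioned earlier; going to 128 KiB blocks cuts the table by a factor of 32.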
 
Thanks for the clarification. Have you ever analyzed the deduplication capabilities of btrfs?
 
I spent a few days using ZFS in April 2011, but 3ware RAID is better for us because a ZFS RAID (raidz) cannot have disks added to it.

rsync backups took 3x longer.

I tested rsync with the drives attached directly to SATA ports on the motherboard, and with a 3ware card with the drives in 'SingleDrive' mode. The rsync runs reached 56.90 MB/s when the 3ware card controlled the drives [RAID-5, performance setting]. ZFS never got above 13 MB/s.

Also check these:
http://serverfault.com/questions/190207/how-can-i-add-one-disk-to-an-existing-raidz-zpool
http://lists.freebsd.org/pipermail/freebsd-fs/2009-June/006329.html
http://dtrace.org/blogs/ahl/2008/04/07/expand-o-matic-raid-z/

So as of 2011, ZFS on Linux is not ready for production use.
 
Thanks for all the replies.

Seems ZFS dedup is a no-go at this time. Shame. On paper it sounds good.

-
dpc
 
I have been using ZFS dedup (Nexenta) served to Proxmox over NFS since last year and it's cool, but it depends on your scenario, of course. Dedup is a good choice when you need to clone a lot of similar machines; it's incredible to copy a hundred Windows XP VMs with 50 GB disks and see that the space in use is only 50 GB.

The physical RAM needed is not trivial to calculate (http://en.wikipedia.org/wiki/ZFS#Deduplication). I have 16 GB of RAM for a 2 TB SCSI HBA LUN, and it works very well with a ratio of about 1.80.

Spirit, disk space cheaper than RAM... Hmm... In my experience it's more important to have SAS or SCSI disks than cheap and terrible SATA disks (we are talking about virtualization), and in that case disk space isn't cheaper. Inline dedup (ZFS) needs a lot of RAM and CPU, it's true, but it can be the best choice in some cases, and for me ZFS is the best file system. Some day btrfs could displace it, who knows.

Tom, commercial dedup like NetApp doesn't do inline deduplication; it's a batch process that you run when you want.

Jesús Feliz.
 
Hi janzun,

We are also using cloning for our virtual machines (cloning the OS partition, Windows or Linux), around 400 cloned VMs.
But cloning != dedup.
Cloning and snapshots don't take memory (a clone is just a pointer to a base system).

For me, dedup can be useful if you really have a big dedup ratio, so backups, for example, are a good fit.

If you really use dedup for your 50 Windows VMs, your dedup ratio should be much higher than 1.8 (more like 50.0).

We have 400 VMs; the OS base systems take around 30 GB for the Linux VMs and 50 GB for the Windows VMs, with a dedup ratio of 1 ;)

But be careful: if you really have activated dedup for your whole 2 TB space, you need 150 GB of RAM with 4 KB blocks or 5 GB of RAM with 128 KB blocks.
If you don't have enough memory, ZFS will put the hash tables on L2ARC, then on disk, and it can become very laggy and unresponsive.
 
Hi spirit,

We have a 1.8 ratio for now because I don't have 50 cloned VMs at the moment; with them the ratio would be much better, of course. My stats with a mixed environment:

nmc@labnas:/$ zpool list
NAME      SIZE   ALLOC  FREE   CAP  DEDUP  HEALTH  ALTROOT
ZFSLun    1.98T  684G   1.32T  33%  1.76x  ONLINE  -
syspool   68G    10.6G  57.4G  15%  1.00x  ONLINE  -

nmc@labnas:/$ zfs list
NAME            USED   AVAIL  REFER  MOUNTPOINT
ZFSLun          1.17T  1.28T  34.5K  /volumes/ZFSLun
ZFSLun/vmstore  1.17T  1.28T  1.17T  /volumes/ZFSLun/vmstore

So you can see that on a 2 TB LUN I'm using 1.17T logically, but physically it's only 684G (the ALLOC column of zpool list). That's cool for a mixed virtual environment (Debian, Ubuntu and RHEL) where I'm using a very expensive and fast SCSI disk pool.
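
As a sanity check on those numbers, the logical size divided by the dedup ratio should land close to what the pool reports as allocated. A minimal sketch, assuming that `zfs list` USED is the logical (pre-dedup) size and `zpool list` ALLOC is the physical (post-dedup) size; the helper name `physical_gib` is made up for illustration.

```python
# Estimate on-disk usage from the logical size and the pool's dedup ratio.
# Inputs are the values from the zpool/zfs listings in this post.
def physical_gib(logical_gib: float, dedup_ratio: float) -> float:
    """Logical size divided by the dedup ratio gives the approximate physical size."""
    return logical_gib / dedup_ratio


logical = 1.17 * 1024          # 1.17T USED, expressed in GiB
estimate = physical_gib(logical, 1.76)
saved = logical - estimate

print(f"estimated on disk: {estimate:.0f} GiB (zpool list shows 684G)")
print(f"space saved by dedup: {saved:.0f} GiB")
```

The estimate comes out within a few GiB of the reported 684G; the small gap is expected because both listings round their values.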

If I need a pool of 50 Windows XP machines, I can clone (by "clone" I mean copy, with the cp command) the same VM disk 50 times without worrying about free space. Yes, I know the dedup ratio will get worse as users begin to change their systems, but the effect is minimal in a homogeneous environment.

I use the default block size, 128 KB, and I know what you mean about performance with ZFS dedup, but in my case it can be covered with about 8 to 16 GB of RAM. The latest NexentaStor version, 3.1.x, has a DTrace trigger that sends you a warning if the dedup table cannot fit in RAM.

Lastly, I forgot to say in my first post that I have three local 32 GB SSDs: two in RAID for the ZIL log, and one for cache.

In short, dedup is not cheap for virtualization, but it can be a very good choice in some scenarios where you have a good but expensive disk pool.
 
Hi janzun,

You use 128 KB blocks, so you have enough memory.

(We use 4 KB blocks, so we need more memory, and we have big storage (64 x 600 GB 15k SAS + STEC ZeusRAM ZIL + 1 TB L2ARC SSD), so dedup is not an option for us ;)

Just one question: why not use the cloning feature of ZFS? No need to "cp"; it takes 1 second to create a new VM ;) (You can mix dedup and cloning.)
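
For readers unfamiliar with the cloning feature mentioned here, it comes down to two commands: `zfs snapshot` on the base dataset, then `zfs clone` from that snapshot. The sketch below only builds and prints the command lines; it does not run them. The dataset and snapshot names (`ZFSLun/vm-base`, `gold`, `ZFSLun/vm-101`) are hypothetical and not taken from anyone's setup in this thread.

```python
# Build (but do not execute) the zfs commands for a snapshot-based clone.
# All dataset and snapshot names below are hypothetical examples.
import shlex
from typing import List


def clone_commands(base_dataset: str, snapshot: str, new_dataset: str) -> List[List[str]]:
    """Return the two zfs invocations needed to clone a base dataset."""
    snap = f"{base_dataset}@{snapshot}"
    return [
        ["zfs", "snapshot", snap],            # freeze the base image
        ["zfs", "clone", snap, new_dataset],  # writable clone sharing its blocks
    ]


for cmd in clone_commands("ZFSLun/vm-base", "gold", "ZFSLun/vm-101"):
    print(shlex.join(cmd))
```

A clone has to live in the same pool as the snapshot it comes from, and it only consumes extra space for blocks that later diverge from the base, which is why it needs no dedup table.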

Best Regards,

SPiRiT
 
Wow, you have big and powerful storage!! And you'd need a ZFS server to match (for dedup). More storage and smaller block size == more RAM, of course ;)

About zfs clone... Hmm... I don't use it because I believe it only works on a filesystem (?), so it could be OK for a Proxmox disk (FC or iSCSI), but not for an NFS share. I can clone a filesystem, but I think I would have to change or add the NFS share in the PVE storage.

Am I right? How can I do it in my case (NFS)?

Thanks! :)
 
I don't personally use NFS (I use iSCSI via cloned zvols).
Isn't it possible to mount a main folder and clone the sub-directory folders (one folder per VM)?
I'm asking because I'm planning to work on Nexenta-Proxmox integration (iSCSI + NFS).
 
Mmm, I think there is no problem with iSCSI because it is a 'complete' disk on which you can do a snapshot clone, but an NFS export refers to a 'folder', and you would have to do a whole clone. I don't have the servers to hand now, and I don't know whether some kind of trick/hack could make it possible. Maybe in the Nexenta shell... I really don't know.

I'll tell you if I discover something about it.

Regards.
 
Hello

Did you get the Nexenta-Proxmox integration working? Nexenta is the next thing we'll try in order to use ZFS.

Also, yesterday I tried Debian GNU/kFreeBSD squeeze. ZFS works, but kFreeBSD does not have NFS yet; see http://pve.proxmox.com/wiki/ZFS#Debian_GNU.2FkFreeBSD . It'll be great when that is ready for NFS/iSCSI production use.

I just searched and see that wheezy has freebsd-nfs-server, so I may try that before Nexenta.
 
Hi, yes, it's working, but it's in beta stage for the moment, so it won't be ready for Proxmox 2.0.
I think it'll be ready for 2.1.

(zvol creation/deletion in Proxmox pve-manager through the Nexenta API ;)
 
