I'm testing out a new node for my cluster. It's mainly meant for backup and storage of everything.
Specs:
Motherboard: ASRock X570M Pro4 with Ryzen 3400G (onboard graphics)
LSI 9211-8i with 8x 5TB 2.5" Seagates (in x16 PCIe slot)
2x WD 500GB NVMe drives (log and cache devices for ZFS)
2x 1TB Samsung SSDs (ZFS RAID1 for OS/VMs)
Intel X550 10G Ethernet NIC (in x8 PCIe slot)
Running Proxmox 6.1
First, everything works.
Here's the ZFS config for the array:
Code:
root@crucible:~# zpool status storage -v
  pool: storage
 state: ONLINE
  scan: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        storage        ONLINE       0     0     0
          raidz2-0     ONLINE       0     0     0
            sda        ONLINE       0     0     0
            sdb        ONLINE       0     0     0
            sdc        ONLINE       0     0     0
            sdd        ONLINE       0     0     0
            sde        ONLINE       0     0     0
            sdf        ONLINE       0     0     0
            sdg        ONLINE       0     0     0
            sdh        ONLINE       0     0     0
        logs
          mirror-1     ONLINE       0     0     0
            nvme0n1p2  ONLINE       0     0     0
            nvme1n1p2  ONLINE       0     0     0
        cache
          nvme0n1p1    ONLINE       0     0     0
          nvme1n1p1    ONLINE       0     0     0

errors: No known data errors
And it works just fine (I haven't done any performance tuning yet):
Code:
root@crucible:~# zpool iostat storage -v
                 capacity     operations     bandwidth
pool           alloc   free   read  write   read  write
-------------  -----  -----  -----  -----  -----  -----
storage        2.54T  33.8T      0    294  1.82K  34.0M
  raidz2       2.54T  33.8T      0    294  1.65K  34.0M
    sda            -      -      0     36    212  4.25M
    sdb            -      -      0     36    225  4.25M
    sdc            -      -      0     36    211  4.25M
    sdd            -      -      0     36    211  4.25M
    sde            -      -      0     37    215  4.25M
    sdf            -      -      0     36    201  4.25M
    sdg            -      -      0     36    201  4.25M
    sdh            -      -      0     36    207  4.25M
logs               -      -      -      -      -      -
  mirror           0  9.50G      0      0    181    357
    nvme0n1p2      -      -      0      0     90    178
    nvme1n1p2      -      -      0      0     90    178
cache              -      -      -      -      -      -
  nvme0n1p1    55.4G   345G      0     85     44  10.7M
  nvme1n1p1    54.6G   345G      0     84     54  10.5M
-------------  -----  -----  -----  -----  -----  -----
I've been running some large transfers to make sure the node is stable before adding it to the cluster and moving all of my data over, but the network on this node goes down overnight. The node reports the network device as up, but no traffic gets through in either direction (the Proxmox UI won't load, and I can't ping out from the node). I have to reboot it to get the network working again. After the reboot, the storage zpool doesn't come back online, even though all of the disks show up. It takes 2 or 3 reboots before the storage pool comes back.
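To pin down exactly when the network drops, I'm thinking of leaving a small logger running on the node itself, since it can't be reached remotely once the problem starts. This is only a rough sketch: `log_net_state`, the `PROBE` variable, the target address and the log path are all names I made up for illustration, and the default probe assumes the usual Linux `ping` with `-c` and `-W`.

```shell
#!/bin/sh
# Hypothetical helper: append one timestamped "up" or "down" line per call.
# $1 = address to probe (e.g. the gateway), $2 = local log file.
# PROBE is an assumed override for the reachability command;
# by default a single ping with a 2 second timeout is used.
log_net_state() {
    target="$1"
    logfile="$2"
    # PROBE is left unquoted on purpose so "ping -c 1 -W 2" splits into words.
    if ${PROBE:-ping -c 1 -W 2} "$target" >/dev/null 2>&1; then
        state=up
    else
        state=down
    fi
    printf '%s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$target" "$state" >> "$logfile"
}

# Example (addresses and paths are placeholders), called once a minute from a loop or cron:
#   log_net_state 192.168.1.1 /root/netwatch.log
```

The first `down` line in the log should give a timestamp to compare against the system logs and against anything scheduled around that time.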
Since this only happens at night, I suspect some default scheduled process that runs on the node in the early morning hours. Is that a thing?
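To check what is actually scheduled, my plan is to dump the active cron entries and look at the systemd timers. The helper below is just a sketch: `show_cron_entries` is a made-up name, and the paths in the usage comment are the usual Debian locations, which I'm assuming apply to a stock Proxmox install.

```shell
#!/bin/sh
# Hypothetical helper: print the active (non-blank, non-comment) lines
# of each cron file given as an argument, prefixed with the file name.
# Arguments that are not regular files are skipped silently.
show_cron_entries() {
    for f in "$@"; do
        [ -f "$f" ] || continue
        grep -Ev '^[[:space:]]*(#|$)' "$f" | while IFS= read -r line; do
            printf '%s: %s\n' "$f" "$line"
        done
    done
}

# Example (assumed standard Debian paths):
#   show_cron_entries /etc/crontab /etc/cron.d/*
# systemd timers are worth a look as well:
#   systemctl list-timers --all
```

Anything whose schedule lines up with the time the network drops would be the first thing to look at.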
I'm not sure where to start debugging this. If anyone has any suggestions, that would be super helpful!
Cheers,
Jayson