Hi folks,
this scenario has happened to me twice now (the second time yesterday).
The first time I was pretty confused, but I shrugged it off because I was totally clueless about how this could happen.
Since yesterday's second occurrence I am baffled and need some help.
Hear me out!
My system is: ZFS 2.2.8-pve1 on Proxmox 8.4.14
- I have a ZFS pool with two namespaces on a DC-grade NVMe device (with PLP) serving as SLOG and cache.
- I copied 106G (in both cases) of data from dataset A to dataset B in the same pool (with rsync -ahr ...).
- The copy worked flawlessly and I am able to use the new data.
- From this point on the NVMe device gets hammered with writes.
- No other device in the pool shows any reads or writes.
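For completeness, here is a self-contained sketch of the copy step on throwaway temp directories. My real job ran between two dataset mountpoints in the same pool; the rsync flags are the ones from above, and the cp branch is just a fallback in case rsync is absent:

```shell
#!/bin/sh
# Sketch of the copy step on temp dirs (my real source/destination were
# dataset mountpoints in the same pool; these paths are throwaways).
src=$(mktemp -d); dst=$(mktemp -d)
dd if=/dev/zero of="$src/blob" bs=1M count=4 2>/dev/null

if command -v rsync >/dev/null; then
  # same flags as my real job: archive, human-readable, recursive
  rsync -ahr "$src/" "$dst/"
else
  cp -a "$src/." "$dst/"   # fallback if rsync is not installed
fi

ls -l "$dst/blob"
rm -rf "$src" "$dst"
```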
- Here is the strange thing:
- Yesterday I had a whopping 35T of writes on the NVMe device !!after!! the copy was finished (35T taken from smartctl -a /dev/nvme0).
- The first time this occurred I had a !!whopping!! 130T of writes on the NVMe device !!after!! the copy was finished (130T taken from smartctl -a /dev/nvme0).
- In both cases the writes were around 400-500M/s (info taken from iotop).
- In both cases I was not able to find the process responsible for these writes (with my knowledge of Linux).
- In both cases the writes settled to 0 the moment I rebooted the system.
I am aware of "write amplification", but I was not expecting 35T (let alone 130T!) of writes for a 106G copy job.
Furthermore, I would not expect to see writes hours (or, in the first case, even days) after the copy job.
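In case anyone wants to reproduce the measurement: a small sketch that samples the lifetime write counter via smartctl and prints the delta. The device name /dev/nvme0 matches my smartctl call above; the 512,000-byte size of one "data unit" is from the NVMe SMART log definition. The live part is guarded so it is a no-op on hosts without smartctl or that device:

```shell
#!/bin/sh
# Sketch: sample the NVMe lifetime write counter twice and print the delta.
dev=/dev/nvme0   # the SLOG/cache device in my setup; adjust as needed
interval=60      # seconds between samples

# Extract "Data Units Written" (one unit = 512,000 bytes) from smartctl.
written_units() {
  smartctl -a "$dev" | awk -F: '/Data Units Written/ {
    gsub(/[ ,]/, "", $2); sub(/\[.*/, "", $2); print $2 }'
}

# Convert lifetime data units to whole decimal terabytes.
units_to_tb() {
  echo $(( $1 * 512000 / 1000000000000 ))
}

if command -v smartctl >/dev/null && [ -e "$dev" ]; then
  before=$(written_units)
  sleep "$interval"
  after=$(written_units)
  echo "wrote $(( (after - before) * 512000 / 1000000 )) MB in ${interval}s"
  echo "lifetime: $(units_to_tb "$after") TB written"
else
  echo "smartctl or $dev not available on this host"
fi
```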
For now I will remove the L2ARC from the NVMe and rerun the copy, to check whether some mysterious L2ARC refresh loop of death causes all these writes.
I am not bound to this on-disk L2ARC, as I have plenty of RAM available on this system.
I plan to stick with the SLOG (on disk) to protect the other drives from all the sync writes (writes are currently flushed on a 5-second basis).
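For anyone following along with the rerun, this is roughly what I intend to do. The pool name "tank" and the cache namespace "nvme0n2" are placeholders for my real names, and the commands are guarded so the sketch is a no-op on hosts without such a pool:

```shell
#!/bin/sh
# Sketch: detach the L2ARC device and watch per-vdev I/O during the rerun.
# "tank" and "nvme0n2" are placeholders -- substitute your own names.

if zpool list tank >/dev/null 2>&1; then
  # Cache (L2ARC) vdevs can be removed from a live pool:
  zpool remove tank nvme0n2

  # Verify the cache section is gone from the pool layout:
  zpool status tank

  # Watch per-vdev read/write rates (5-second intervals, 6 samples):
  zpool iostat -v tank 5 6
else
  echo "pool 'tank' not present on this host"
fi
```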
I am grateful for any hints or advice on this situation.
If anybody needs more information, I can run some tests on this system, as it is still in the test phase.
Thanks in advance
Alex
P.S.
I will post an update as soon as I have re-run the test without the on-disk L2ARC.