Periodic I/O Delay Spike Every 5 Minutes

CPU load is constant. The blue spike shows the IO Delay.
The green straight line in the first graph shows CPU Usage. (Or am I wrong?)
You're not wrong; I was looking at the average load and used the wrong term.
However, as I said before, there isn't a single CT whose usage graph correlates with the IO Delay graph.
I don't see IO Delay graphs for CTs, and because many processes are waiting for I/O, it might not show up as high disk usage (the wait time lowers it).
Since all processes in CTs also show up on the host, can't you use some standard Linux tool to find out which processes are blocked on I/O every five minutes? I don't know which tool or command to use for this, sorry, but I'm sure one must exist.
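One idea (just a sketch, assuming standard procps tools on the host): sample for processes stuck in uninterruptible sleep ("D" state) around the time of the spike, since those are the ones blocked on I/O.

Code:
# Log any process in "D" (uninterruptible sleep, usually waiting on I/O) once per second
while true; do
    date
    ps -eo pid,stat,wchan:32,comm | awk '$2 ~ /^D/'
    sleep 1
done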
 
When the spike happens, most tasks get blocked because I/O is unavailable.
Is there a way to find out “what” is causing the I/O Delay spike rather than what's being blocked?
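One way to look at the cause rather than the victims (a sketch, assuming the sysstat package is installed on the host) is to log per-process disk throughput across a spike:

Code:
# One sample per second for 300 seconds (~one 5-minute cycle); the log path is just an example
pidstat -d 1 300 > /root/pidstat-io.log
# Afterwards look for processes with large kB_rd/s or kB_wr/s at the spike timestamps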
 
Just a guess out of the blue: any chance you have some monitoring tool installed in each container which updates itself every 5 minutes, effectively in the same second? Perhaps it runs something like "df" every five minutes.

Look for a cron entry like "*/5 * * * * root something" in /etc/cron.d and the other usual places.

If you find something like this, you can spread the executions out a bit by giving each instance a different minute offset.
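For example (a sketch; the script name is made up, use whatever the cron entries actually run):

Code:
# /etc/cron.d/monitoring inside each container, staggered by one minute per CT
# CT 101:
*/5    * * * * root /usr/local/bin/update-monitoring
# CT 102:
1-59/5 * * * * root /usr/local/bin/update-monitoring
# CT 103:
2-59/5 * * * * root /usr/local/bin/update-monitoring
# ...and so on with 3-59/5, 4-59/5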

Good luck
 
One important detail I forgot to add is this.

Before I formatted the 5x 8TB HDDs, I had them set up as a ZFS RAID-Z2 pool without using the hardware array (Add Storage option disabled), exported that via NFS and added it to the DC as NFS storage. I was seeing 80-90% IO Delay and realised the performance hit was too big, so I destroyed the ZFS pool, rebooted the server, created a hardware RAID 5, mounted it on a directory and added it to the node as storage. Things seemed to go back to normal at first, but then I started noticing this short spike every 5 minutes.

I am seeing the following cron jobs under /etc/cron.d/zfsutils-linux

Code:
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

# TRIM the first Sunday of every month.
24 0 1-7 * * root if [ $(date +\%w) -eq 0 ] && [ -x /usr/lib/zfs-linux/trim ]; then /usr/lib/zfs-linux/trim; fi

# Scrub the second Sunday of every month.
24 0 8-14 * * root if [ $(date +\%w) -eq 0 ] && [ -x /usr/lib/zfs-linux/scrub ]; then /usr/lib/zfs-linux/scrub; fi

Not sure if this is relevant.
 
Those are just the default ZFS maintenance tasks, and they only run once per month.

5x 8TB HDDs in RAID 5 sounds really terrible. That means you've got a 32TB storage that can only handle ~100 IOPS. Do random 4K reads/writes and you'd need roughly 2.5 years to read or fill that entire array ;). Nothing I would want to run multiple OSes on, given the massive amount of small random I/O they cause.
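For reference, the back-of-the-envelope arithmetic behind that estimate (assuming the whole array sustains about 100 random IOPS):

Code:
# 100 IOPS * 4 KiB  ≈ 400 KiB/s of random throughput
# 32 TB / 400 KiB/s ≈ 78,000,000 s ≈ 900 days
echo $(( 32 * 10**12 / (100 * 4096) / 86400 ))   # ≈ 904 days, i.e. about 2.5 years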
 
Update: I migrated the disks for all CTs to the boot disk (1 TB Samsung SSD) mounted as Thin LVM and the I/O delay spike is gone.
 
Good to know. I think this could be proof that it has something to do with the "failed" disk in the other storage pool.
 
Update 2: Replaced all disks, created a new RAID 10 array mounted as LVM-Thin, and moved the CTs back to the new storage. The periodic I/O spike is happening again, though it seems to last for a much shorter period now.
 
Okay, so I ran some more tests.
When I run yabs on a CT, I see an I/O Delay spike corresponding to each disk test:


R+W disk test with 4k block size -> I/O Delay: 0-3%
R+W disk test with 64k block size -> I/O Delay: 10-15%
R+W disk test with 512k block size -> I/O Delay: 95-99%

Looks like the 512k bs tests are completely saturating the disk bandwidth.

Here's a dd disk speed test run from the same CT:

Code:
$ dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=dsync

1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.27249 s, 472 MB/s

Now, is this expected behaviour (due to low IOPS) for 6x 8TB enterprise SAS HDDs @ 7200 RPM in a hardware RAID 10 mounted as LVM-Thin?
Does this indicate that the periodic load spike is happening because a CT is trying to perform 512k reads/writes every 5 minutes, or is there something wrong with my drives/array?
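Note that dd writing a single 1 GiB block with dsync only measures sequential throughput, so the 472 MB/s says little about random IOPS. To look at the random side directly, something like this fio run (a sketch; it writes a temporary 1 GiB test file inside the CT) is closer to what yabs measures:

Code:
# Random 4K mixed read/write, roughly what the yabs disk test does
fio --name=randrw4k --filename=/tmp/fiotest --size=1G --rw=randrw --rwmixread=50 \
    --bs=4k --ioengine=libaio --iodepth=64 --direct=1 --runtime=60 --time_based \
    --group_reporting
rm /tmp/fiotest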
 
The smaller the block size of your benchmark, the more IOPS you hit the array with and the higher your IO delay will become. It doesn't make sense to me, though, that the 64K block size causes more IO delay than the 4K block size.
 
What about disk bandwidth? Is it possible that the 512k block size tests are using up all the available bandwidth?
 
On the CT:

fio Disk Speed Tests (Mixed R/W 50/50):
Code:
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 3.33 MB/s      (834) | 53.74 MB/s     (839)
Write      | 3.36 MB/s      (841) | 54.25 MB/s     (847)
Total      | 6.70 MB/s     (1.6k) | 108.00 MB/s   (1.6k)
           |                      |                    
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 89.76 MB/s     (175) | 128.29 MB/s    (125)
Write      | 94.53 MB/s     (184) | 136.84 MB/s    (133)
Total      | 184.29 MB/s    (359) | 265.14 MB/s    (258)

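To put those numbers in perspective (rough arithmetic from the table above, not a new measurement):

Code:
# 4k test:   ~1.6k IOPS * 4 KiB   ≈   6.7 MB/s  -> seek-limited, IOPS are the bottleneck
# 512k test:  ~359 IOPS * 512 KiB ≈ 184   MB/s  -> bandwidth starts to matter
# 1m test:    ~258 IOPS * 1 MiB   ≈ 265   MB/s  -> roughly the array's mixed-load bandwidth ceiling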

Please note that there are other VMs running too. I've already decided to upgrade to SSDs; I'm just trying to evaluate whether the drives are still good and whether I can keep using this array for backups.


Also, is there an easy way to limit IOPS on all CTs to a certain value by default? It's impractical for us to limit IOPS on each CT (existing and new) manually.
 