Bug in PVE::Tools::df when adding petabyte scale storage

illustris

Active Member
Sep 14, 2018
I've been seeing these lines in syslog since I added HDFS over an NFS gateway as a storage in Proxmox.

Code:
Nov 12 12:08:40 hostname pvestatd[2220]: Use of uninitialized value $avail in int at /usr/share/perl5/PVE/Storage.pm line 1057.
Nov 12 12:08:40 hostname pvestatd[2220]: Use of uninitialized value $used in int at /usr/share/perl5/PVE/Storage.pm line 1058

I found a thread discussing something similar here. As in that thread, Filesys::Df::df reports my storage sizes accurately, but PVE::Tools::df reports undef for avail and used.

I have two shared storages on my cluster: a 2.2 TB CephFS and a 2 PB HDFS mounted over an NFS gateway. Here are the outputs of Filesys::Df::df and PVE::Tools::df for both.

Code:
# perl -w -e 'use Filesys::Df; use Data::Dumper; print Dumper Filesys::Df::df("/mnt/pve/hdfs",1);'
$VAR1 = {
          'su_blocks' => '2169720884166656',
          'ffree' => 2147483647,
          'bfree' => '265665460043776',
          'used' => '1904055424122880',
          'fper' => 0,
          'su_files' => 2147483647,
          'fused' => 0,
          'user_favail' => 2147483647,
          'blocks' => '2169720884166656',
          'user_fused' => 0,
          'files' => 2147483647,
          'su_bavail' => '265665460043776',
          'user_bavail' => '265665460043776',
          'favail' => 2147483647,
          'bavail' => '265665460043776',
          'user_used' => '1904055424122880',
          'user_files' => 2147483647,
          'user_blocks' => '2169720884166656',
          'per' => 88,
          'su_favail' => 2147483647
        };

Code:
# perl -e 'use PVE::Tools; use Data::Dumper; print Dumper PVE::Tools::df("/mnt/pve/hdfs");'
$VAR1 = {
          'total' => '265665460043776',
          'avail' => undef,
          'used' => undef
        };

Code:
# perl -w -e 'use Filesys::Df; use Data::Dumper; print Dumper Filesys::Df::df("/mnt/pve/cephfs",1);'
$VAR1 = {
          'bavail' => '2362701774848',
          'used' => 0,
          'su_blocks' => '2362701774848',
          'user_used' => undef,
          'per' => 0,
          'user_blocks' => undef,
          'su_bavail' => '2362701774848',
          'user_bavail' => '2362701774848',
          'bfree' => '2362701774848',
          'blocks' => '2362701774848'
        };

Code:
# perl -e 'use PVE::Tools; use Data::Dumper; print Dumper PVE::Tools::df("/mnt/pve/cephfs");'
$VAR1 = {
          'avail' => '2362701774848',
          'used' => '0',
          'total' => '2362701774848'
        };

Versions of packages:

Code:
proxmox-ve: 6.0-2 (running kernel: 5.0.21-3-pve)
pve-manager: 6.0-9 (running version: 6.0-9/508dcee0)
pve-kernel-5.0: 6.0-9
pve-kernel-helper: 6.0-9
pve-kernel-4.15: 5.4-8
pve-kernel-5.0.21-3-pve: 5.0.21-7
pve-kernel-5.0.21-2-pve: 5.0.21-7
pve-kernel-4.15.18-20-pve: 4.15.18-46
pve-kernel-4.15.18-10-pve: 4.15.18-32
ceph: 14.2.4-pve1
ceph-fuse: 14.2.4-pve1
corosync: 3.0.2-pve4
criu: 3.11-3
glusterfs-client: 5.5-3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.13-pve1
libpve-access-control: 6.0-2
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-5
libpve-guest-common-perl: 3.0-1
libpve-http-server-perl: 3.0-3
libpve-storage-perl: 6.0-9
libqb0: 1.0.5-1
lvm2: 2.03.02-pve3
lxc-pve: 3.1.0-65
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.0-8
pve-cluster: 6.0-7
pve-container: 3.0-7
pve-docs: 6.0-7
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-7
pve-firmware: 3.0-2
pve-ha-manager: 3.0-2
pve-i18n: 2.0-3
pve-qemu-kvm: 4.0.0-7
pve-xtermjs: 3.13.2-1
qemu-server: 6.0-9
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.2-pve1

Apart from the syslog spam, this bug makes Proxmox report incorrect values for capacity:
[Attachment: Screenshot from 2019-11-12 12-34-07.png]

The same issue happened in the other thread to someone using Google Drive through rclone. Rclone exposes Google Drive as a 1+ PB mount point. From this, my best guess is that PVE::Tools::df has a bug that breaks reporting once the storage exceeds a certain size.
 
mhmm... after looking at this again, i have a hunch what it may be...

can you try executing the following perl script (with perl -T) and post the output?
Code:
use strict;
use warnings;

use Filesys::Df;
use Data::Dumper;
my $df = Filesys::Df::df("/mnt/pve/hdfs",1);

for my $v (qw(blocks used bavail)) {
    if (defined($df->{$v})) {
        if ($df->{$v} =~ /^(\d+)$/){
            print "$v only numbers:  $1\n";
        } elsif ($df->{$v} =~ /^([\d\.e]+)$/) {
            print "$v scientific:  $1\n";
        } else {
            print "$v unknown: $df->{$v}\n";
        }
    } else {
        print "$v not defined\n";
    }
}
 
Code:
# perl -T test.pl
blocks only numbers:  2169720884166656
used only numbers:  1909127342194688
bavail only numbers:  260593541971968
 
mhm... ok this would indicate that our code should work as expected... i am really baffled as to what is happening here...
could you try one more thing?

can you replace the line
Code:
my $df = Filesys::Df::df("/mnt/pve/hdfs",1);

with
Code:
use PVE::Tools;
my $df = PVE::Tools::run_fork(sub {
    return Filesys::Df::df("/mnt/pve/hdfs",1);
});

and test again?
if that works too, i am pretty much out of ideas about what is happening here...
 
Code:
# perl -T test.pl
blocks unknown: 2.16972088416666e+15
used unknown: 1.90906153251635e+15
bavail only numbers:  260659351650304
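
A possible reading of the output above, offered as a sketch rather than a confirmed diagnosis: the two values that arrive in exponent form have 16 digits, while bavail has 15. Perl normally stringifies floating-point values with 15 significant digits (the equivalent of "%.15g"), so if the sizes are turned into floating-point numbers somewhere on the way back from the forked child, anything at or above 1e15 bytes (roughly a petabyte) would come out in scientific notation. The thread does not show where such a conversion would happen, and the helper name as_15g is made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: format a number with 15 significant digits,
# which is how Perl usually stringifies a floating-point value.
sub as_15g {
    my ($n) = @_;
    return sprintf("%.15g", $n);
}

# Values taken from the outputs posted in this thread.
my $blocks = as_15g(2169720884166656);   # 16 digits
my $bavail = as_15g(260659351650304);    # 15 digits

print "blocks: $blocks\n";   # blocks: 2.16972088416666e+15
print "bavail: $bavail\n";   # bavail: 260659351650304
```

The first result matches the "blocks unknown" line above character for character, which is why the 15-digit cutoff looks like a plausible source of the size threshold.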
 
Ah, the regex for scientific notation is wrong: its character class does not allow the "+" that appears in the exponent. It should be:
Code:
use strict;
use warnings;

use Filesys::Df;
use Data::Dumper;

use PVE::Tools;
my $df = PVE::Tools::run_fork(sub {
    return Filesys::Df::df("/mnt/pve/hdfs",1);
});

for my $v (qw(blocks used bavail)) {
    if (defined($df->{$v})) {
        if ($df->{$v} =~ /^(\d+)$/){
            print "$v only numbers:  $1\n";
        } elsif ($df->{$v} =~ /^([\d\.e+]+)$/) {
            print "$v scientific:  $1\n";
        } else {
            print "$v unknown: $df->{$v}\n";
        }
    } else {
        print "$v not defined\n";
    }
}

Code:
# perl -T test.pl
blocks scientific:  2.16972088416666e+15
used scientific:  1.90846092915507e+15
bavail only numbers:  261259955011584
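
To see why the one-character change matters, the two character classes from the scripts above can be run side by side against strings posted in this thread. A third, stricter pattern is included purely as an illustration; it is an assumption of mine, not taken from the thread or from any patch, and the names $strict and matches are made up here.

```perl
use strict;
use warnings;

# The two patterns used in the test scripts in this thread.
my $old = qr/^([\d\.e]+)$/;     # no '+' in the character class
my $new = qr/^([\d\.e+]+)$/;    # '+' allowed
# Illustrative stricter pattern (an assumption, not from the thread):
# digits, optional fraction, optional exponent with optional sign.
my $strict = qr/^(\d+(?:\.\d+)?(?:e[+-]?\d+)?)$/;

# Return 1 if the string matches the pattern, 0 otherwise.
sub matches {
    my ($re, $s) = @_;
    return $s =~ $re ? 1 : 0;
}

for my $s ('2.16972088416666e+15', '260659351650304', '1e+e.5') {
    printf "%-22s old=%d new=%d strict=%d\n",
        $s, matches($old, $s), matches($new, $s), matches($strict, $s);
}
```

The loose class accepts any mix of digits, dots, "e" and "+", so a malformed string such as "1e+e.5" also passes the corrected pattern; the stricter variant rejects it while still accepting both forms seen in the output.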
 
ah ok, yes this somehow makes sense... so we have to adapt the regex for scientific notation... thanks for helping debug this :)
i'll try to send a patch today... :)
 
Hi. I applied this patch on one server.

pvestatd still throws the same error
Code:
Nov 13 13:02:05 hostname pvestatd[3727]: Use of uninitialized value $avail in int at /usr/share/perl5/PVE/Storage.pm line 1057.
Nov 13 13:02:05 hostname pvestatd[3727]: Use of uninitialized value $used in int at /usr/share/perl5/PVE/Storage.pm line 1058.

PVE::Tools::df seems to be fine though

Code:
# perl -e 'use PVE::Tools; use Data::Dumper; print Dumper PVE::Tools::df("/mnt/pve/hdfs");'
$VAR1 = {
          'avail' => '238952455864320',
          'total' => '2.16972088416666e+15',
          'used' => '1.93076842830234e+15'
        };
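
In the output above, total and used come back as strings in exponent form. If a caller needs plain byte counts again, such a string can be converted, but the digits dropped by the 15-digit formatting cannot be recovered. The sketch below shows this using the total from the output above and the blocks value from the first post; sci_to_int is a hypothetical helper, not part of PVE::Tools, and it assumes a 64-bit Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a decimal or exponent-form string into an
# integer string. Assumes a 64-bit Perl and a value below 2**53.
sub sci_to_int {
    my ($s) = @_;
    return sprintf("%d", $s);
}

my $reported = sci_to_int('2.16972088416666e+15');  # 'total' from PVE::Tools::df
my $exact    = 2169720884166656;                    # 'blocks' from Filesys::Df

print "converted:  $reported\n";                       # 2169720884166660
print "exact:      $exact\n";                          # 2169720884166656
print "difference: ", $reported - $exact, " bytes\n";  # 4 bytes
```

A few bytes on a 2 PB filesystem hardly matter for a usage display, but it does mean the exponent-form strings are approximations of what Filesys::Df reported.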
 
did you restart the relevant daemons (pvestatd, pvedaemon, pveproxy) ?