Bug in PVE::Tools::df when adding petabyte-scale storage

illustris

I've been seeing these lines in syslog since I added HDFS, mounted through an NFS gateway, as a storage in Proxmox.

Code:
Nov 12 12:08:40 hostname pvestatd[2220]: Use of uninitialized value $avail in int at /usr/share/perl5/PVE/Storage.pm line 1057.
Nov 12 12:08:40 hostname pvestatd[2220]: Use of uninitialized value $used in int at /usr/share/perl5/PVE/Storage.pm line 1058

Found a thread discussing something similar here. Just like that thread, Filesys::Df::df reports my storage sizes accurately, but PVE::Tools::df reports undef for avail and used.

I have two shared storages on my cluster: a 2.2T Ceph and a 2PB HDFS mounted over an NFS gateway. Here is the output of Filesys::Df::df and PVE::Tools::df for each.

Code:
# perl -w -e 'use Filesys::Df; use Data::Dumper; print Dumper Filesys::Df::df("/mnt/pve/hdfs",1);'
$VAR1 = {
          'su_blocks' => '2169720884166656',
          'ffree' => 2147483647,
          'bfree' => '265665460043776',
          'used' => '1904055424122880',
          'fper' => 0,
          'su_files' => 2147483647,
          'fused' => 0,
          'user_favail' => 2147483647,
          'blocks' => '2169720884166656',
          'user_fused' => 0,
          'files' => 2147483647,
          'su_bavail' => '265665460043776',
          'user_bavail' => '265665460043776',
          'favail' => 2147483647,
          'bavail' => '265665460043776',
          'user_used' => '1904055424122880',
          'user_files' => 2147483647,
          'user_blocks' => '2169720884166656',
          'per' => 88,
          'su_favail' => 2147483647
        };

Code:
# perl -e 'use PVE::Tools; use Data::Dumper; print Dumper PVE::Tools::df("/mnt/pve/hdfs");'
$VAR1 = {
          'total' => '265665460043776',
          'avail' => undef,
          'used' => undef
        };

Code:
# perl -w -e 'use Filesys::Df; use Data::Dumper; print Dumper Filesys::Df::df("/mnt/pve/cephfs",1);'
$VAR1 = {
          'bavail' => '2362701774848',
          'used' => 0,
          'su_blocks' => '2362701774848',
          'user_used' => undef,
          'per' => 0,
          'user_blocks' => undef,
          'su_bavail' => '2362701774848',
          'user_bavail' => '2362701774848',
          'bfree' => '2362701774848',
          'blocks' => '2362701774848'
        };

Code:
# perl -e 'use PVE::Tools; use Data::Dumper; print Dumper PVE::Tools::df("/mnt/pve/cephfs");'
$VAR1 = {
          'avail' => '2362701774848',
          'used' => '0',
          'total' => '2362701774848'
        };

Versions of packages:

Code:
proxmox-ve: 6.0-2 (running kernel: 5.0.21-3-pve)
pve-manager: 6.0-9 (running version: 6.0-9/508dcee0)
pve-kernel-5.0: 6.0-9
pve-kernel-helper: 6.0-9
pve-kernel-4.15: 5.4-8
pve-kernel-5.0.21-3-pve: 5.0.21-7
pve-kernel-5.0.21-2-pve: 5.0.21-7
pve-kernel-4.15.18-20-pve: 4.15.18-46
pve-kernel-4.15.18-10-pve: 4.15.18-32
ceph: 14.2.4-pve1
ceph-fuse: 14.2.4-pve1
corosync: 3.0.2-pve4
criu: 3.11-3
glusterfs-client: 5.5-3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.13-pve1
libpve-access-control: 6.0-2
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-5
libpve-guest-common-perl: 3.0-1
libpve-http-server-perl: 3.0-3
libpve-storage-perl: 6.0-9
libqb0: 1.0.5-1
lvm2: 2.03.02-pve3
lxc-pve: 3.1.0-65
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.0-8
pve-cluster: 6.0-7
pve-container: 3.0-7
pve-docs: 6.0-7
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-7
pve-firmware: 3.0-2
pve-ha-manager: 3.0-2
pve-i18n: 2.0-3
pve-qemu-kvm: 4.0.0-7
pve-xtermjs: 3.13.2-1
qemu-server: 6.0-9
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.2-pve1

Other than the syslog spam, this bug makes Proxmox report incorrect values for the storage capacity:
Screenshot from 2019-11-12 12-34-07.png

The same issue happened in the other thread to someone using Google Drive through rclone, which exposes Google Drive as a mount point of 1 PB or more. From this, my best guess is that PVE::Tools::df has a bug that breaks reporting once the storage exceeds a certain size.
 
Hmm... after looking at this again, I have a hunch what it may be.

Can you run the following Perl script (with perl -T) and post the output?
Code:
use strict;
use warnings;

use Filesys::Df;
use Data::Dumper;
my $df = Filesys::Df::df("/mnt/pve/hdfs",1);

for my $v (qw(blocks used bavail)) {
    if (defined($df->{$v})) {
        if ($df->{$v} =~ /^(\d+)$/){
            print "$v only numbers:  $1\n";
        } elsif ($df->{$v} =~ /^([\d\.e]+)$/) {
            print "$v scientific:  $1\n";
        } else {
            print "$v unknown: $df->{$v}\n";
        }
    } else {
        print "$v not defined\n";
    }
}
 
Code:
# perl -T test.pl
blocks only numbers:  2169720884166656
used only numbers:  1909127342194688
bavail only numbers:  260593541971968
 
Hmm... OK, this would indicate that our code should work as expected. I am really baffled as to what is happening here.
Could you try one more thing?

Can you replace the line
Code:
my $df = Filesys::Df::df("/mnt/pve/hdfs",1);

with
Code:
use PVE::Tools;
my $df = PVE::Tools::run_fork(sub {
    return Filesys::Df::df("/mnt/pve/hdfs",1);
});

and test again?
If that works too, I am pretty much out of ideas about what is happening here.
 
Code:
# perl -T test.pl
blocks unknown: 2.16972088416666e+15
used unknown: 1.90906153251635e+15
bavail only numbers:  260659351650304
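
A possible explanation for why only blocks and used change format: both have 16 digits, while bavail has 15. If the values cross the fork boundary as floating-point numbers and get serialized with 15 significant digits (an assumption; the thread does not show how run_fork passes data back), this is the result one would expect. The sketch below reproduces the formatting with plain sprintf. format_15g is a hypothetical helper name, not something from PVE::Tools.

```perl
use strict;
use warnings;

# Hypothetical helper: format a number with 15 significant digits, the way
# many serializers print doubles. This is NOT taken from PVE::Tools.
sub format_15g {
    my ($n) = @_;
    return sprintf("%.15g", $n);
}

# Values from the first test run in this thread.
my %sizes = (
    blocks => 2169720884166656,   # 16 digits
    bavail => 260593541971968,    # 15 digits
);

for my $name (sort keys %sizes) {
    printf "%-6s %s\n", $name, format_15g($sizes{$name});
}
# prints:
# bavail 260593541971968
# blocks 2.16972088416666e+15
```

The 16-digit value no longer fits into 15 significant digits, so it is rounded and printed with an exponent, which a digits-only regex then rejects.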
 
Ah, the regex for scientific notation is wrong: its character class does not allow the + in e+15. With that added, the script becomes:
Code:
use strict;
use warnings;

use Filesys::Df;
use Data::Dumper;

use PVE::Tools;
my $df = PVE::Tools::run_fork(sub {
    return Filesys::Df::df("/mnt/pve/hdfs",1);
});

for my $v (qw(blocks used bavail)) {
    if (defined($df->{$v})) {
        if ($df->{$v} =~ /^(\d+)$/){
            print "$v only numbers:  $1\n";
        } elsif ($df->{$v} =~ /^([\d\.e+]+)$/) {
            print "$v scientific:  $1\n";
        } else {
            print "$v unknown: $df->{$v}\n";
        }
    } else {
        print "$v not defined\n";
    }
}

Code:
# perl -T test.pl
blocks scientific:  2.16972088416666e+15
used scientific:  1.90846092915507e+15
bavail only numbers:  261259955011584
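
The character class [\d\.e+]+ accepts the values above, but it would also accept strings such as +++ or e.e. A somewhat stricter pattern is sketched below; it also turns the captured value back into a plain integer string. parse_size is a hypothetical helper, not part of PVE::Tools, and the integer recovered from a rounded value like 2.16972088416666e+15 is only an approximation of the original byte count.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PVE::Tools): accept a plain integer or a
# number in decimal/scientific notation and return it as an integer string.
# Returns undef for anything else.
sub parse_size {
    my ($value) = @_;
    return undef if !defined($value);

    # Plain digits: return the captured string unchanged.
    return $1 if $value =~ /^(\d+)$/;

    # Digits, optional fraction, optional exponent such as e+15.
    if ($value =~ /^(\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)$/) {
        # "%.0f" prints the numeric value without an exponent.
        return sprintf("%.0f", $1);
    }

    return undef;
}

for my $raw ('261259955011584', '2.16972088416666e+15', '+++') {
    my $parsed = parse_size($raw);
    print "$raw -> ", (defined($parsed) ? $parsed : 'rejected'), "\n";
}
```

Returning the capture group rather than the input keeps the value usable under perl -T, in the same way as the test script does.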
 
ah ok, yes this somehow makes sense... so we have to adapt the regex for scientific notation... thanks for helping debug this :)
i'll try to send a patch today... :)
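
The first PVE::Tools::df output in this thread has total equal to the bavail value while avail and used are undefined. That pattern is what Perl produces when values are checked with a list-context match inside map: a failed match returns an empty list, so the surviving values shift to the left. Whether PVE::Tools::df is actually written this way is an assumption here; the snippet only illustrates the Perl behaviour, using values from the run_fork test in this thread.

```perl
use strict;
use warnings;

# Values as they looked after run_fork: two in scientific notation, one plain.
my %df = (
    blocks => '2.16972088416666e+15',
    used   => '1.90906153251635e+15',
    bavail => '260659351650304',
);

# Assumed untainting style (not verified against PVE::Tools): in list
# context a failed match returns an empty list, so map drops that element.
my ($total, $used, $avail) =
    map { defined($_) ? (/^(\d+)$/) : 0 } @df{qw(blocks used bavail)};

printf "total=%s used=%s avail=%s\n",
    map { defined($_) ? $_ : 'undef' } ($total, $used, $avail);
# prints: total=260659351650304 used=undef avail=undef
```

Writing the match as /^(\d+)$/ ? $1 : undef inside the map would keep each value in its own slot even when a match fails.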
 
Hi. I applied this patch on one server.

pvestatd still throws the same error:
Code:
Nov 13 13:02:05 hostname pvestatd[3727]: Use of uninitialized value $avail in int at /usr/share/perl5/PVE/Storage.pm line 1057.
Nov 13 13:02:05 hostname pvestatd[3727]: Use of uninitialized value $used in int at /usr/share/perl5/PVE/Storage.pm line 1058.

PVE::Tools::df seems to be fine though

Code:
# perl -e 'use PVE::Tools; use Data::Dumper; print Dumper PVE::Tools::df("/mnt/pve/hdfs");'
$VAR1 = {
          'avail' => '238952455864320',
          'total' => '2.16972088416666e+15',
          'used' => '1.93076842830234e+15'
        };
 
Did you restart the relevant daemons (pvestatd, pvedaemon, pveproxy)?
 
