Hey, can your script also handle namespaces?
Yes, I added just that. I will submit it once it is tested.
I modified the Python script to handle namespaces: https://github.com/BerndHanisch/pve/blob/patch-1/pbs/estiname-size.py
Doesn't work on nested namespaces, as described here.
I rewrote parts of the script (some standard library functions were shadowed, e.g. all) and added more features to it. For example, I added the option to pass vmids as comma-separated values as well as ranges, and I added functions to summarize and order the list by highest consumption.
It can be found here: https://gist.github.com/IamLunchbox/9002b1feb2ca501856b5661c3fe84315
@RolandK: The scripts already check whether chunks are used several times within a given vmid and only count individually referenced chunks. I think checking whether a specific chunk is used by only one VM is probably even more complex.
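To illustrate the vmid option mentioned above (comma-separated values as well as ranges), parsing could look roughly like this. It is only a minimal sketch and not the code from the gist; the function name parse_vmids and the input syntax "100,102-105" are assumptions.

```python
def parse_vmids(spec):
    """Expand a string such as "100,102-105" into a sorted list of vmids.

    Hypothetical helper: the name and the accepted syntax are assumptions,
    not taken from the gist.
    """
    vmids = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"invalid range: {part}")
            vmids.update(range(start, end + 1))
        else:
            vmids.add(int(part))
    return sorted(vmids)


if __name__ == "__main__":
    print(parse_vmids("100,102-105"))
```

Collecting into a set first means overlapping entries such as "100-102,101" only yield each vmid once.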
/mnt/datastore/internal/
└── ns
    ├── customer1
    │   └── ns
    │       ├── customer1-hw1
    │       ├── customer1-hw2
    │       └── customer1-hw3
    ├── customer2
    │   └── ns
    │       ├── customer2-hw1
    │       ├── customer2-hw2
    │       └── customer2-hw3
    ├── customer3
    │   └── ns
    │       ├── proxmox4
    │       ├── proxmox6
    │       ├── proxmox7
    │       └── proxmox8
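For a layout like the one above, where every child namespace sits in <parent>/ns/<name>, the nesting could in principle be walked recursively instead of being hardcoded. The following is only a sketch under that assumption; iter_namespaces is a hypothetical helper and not part of either script.

```python
import os


def iter_namespaces(datastore, prefix=()):
    """Yield (namespace_parts, path) for every namespace below `datastore`.

    Assumes child namespaces live in <parent>/ns/<name>, as in the tree
    above. Hypothetical helper, not part of either script.
    """
    ns_dir = os.path.join(datastore, "ns")
    if not os.path.isdir(ns_dir):
        return
    for name in sorted(os.listdir(ns_dir)):
        path = os.path.join(ns_dir, name)
        if os.path.isdir(path):
            parts = prefix + (name,)
            yield parts, path
            # Recurse: a namespace can itself contain an "ns" directory.
            yield from iter_namespaces(path, parts)


if __name__ == "__main__":
    # Datastore path taken from the tree above.
    for parts, path in iter_namespaces("/mnt/datastore/internal"):
        print("/".join(parts), path)
```

Each yielded path could then be passed to the script as the datastore path, in the same way as a hardcoded nested path.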
Did you try to hardcode your datastore path to the nested path, e.g. /mnt/datastore/internal/ns/customer? In that case it should work, I think, because it looks for $datastore/ns/$namespace.
You could then loop with bash through all the customers, since the script does not support this recursive lookup:
for customer in /mnt/datastore/internal/ns/*; do python3 [...] "$customer"; done
If you want to nest loops even further, you could additionally go for:
for customer in /mnt/datastore/internal/ns/*; do for namespace in "$customer"/ns/*; do python3 [...] -n "$(basename "$namespace")" "$customer"; done; done
Thanks for the info. I've done loops for my customers and did something nasty.
ESTIMATED DISK USAGE OCCUPIED ON PROXMOX-BACKUP (number of chunks * 4MB) FOR NAMESPACE customer1
customer1 / customer1-hw1 692.465 GB
-----------
TOTAL:
----------
692.465 GB
find -type f -name '*.fidx' -exec proxmox-backup-debug inspect file {} --decode - \; | tr -c -d "[:alnum:]\n" > customer1.chunks
then :sort u in vim, and
grep -E '[0-9a-f]{64}' customer1.chunks | sed -r 's/(.{4})(.{60})/\/mnt\/datastore\/internal\/.chunks\/\1\/\1\2/' | tr "\n" "\0" | du --files0-from - -schm
255.320 GB total
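The same idea (deduplicate the chunk digests, map each one to .chunks/<first 4 characters>/<digest>, and add up the file sizes) can be sketched in Python without the vim step. This assumes customer1.chunks holds one digest per line, as produced by the find command above. The function names are made up for illustration, and the sketch sums apparent file sizes, so the total will differ somewhat from what du reports.

```python
import os
import re

DIGEST_RE = re.compile(r"[0-9a-f]{64}")


def unique_digests(lines):
    """Return the set of distinct 64-character hex digests found in `lines`."""
    digests = set()
    for line in lines:
        match = DIGEST_RE.search(line)
        if match:
            digests.add(match.group(0))
    return digests


def chunk_path(chunk_dir, digest):
    """Chunk files are stored under <chunk_dir>/<first 4 hex chars>/<digest>."""
    return os.path.join(chunk_dir, digest[:4], digest)


def total_size(chunk_dir, digests):
    """Sum the apparent file sizes (bytes) of the chunks that exist on disk."""
    total = 0
    for digest in digests:
        try:
            total += os.path.getsize(chunk_path(chunk_dir, digest))
        except OSError:
            pass  # chunk missing or unreadable: skip it
    return total
```

Usage would be along the lines of total_size("/mnt/datastore/internal/.chunks", unique_digests(open("customer1.chunks"))).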
root@proxmox-backup:~/petr# python3 histo.py /mnt/datastore/internal/.chunks/
File Size Histogram:
0 KB - 999 KB : 1434266 files
1000 KB - 1999 KB : 556446 files
2000 KB - 2999 KB : 96721 files
3000 KB - 3999 KB : 60269 files
4000 KB - 4999 KB : 34895 files
5000 KB - 5999 KB : 0 files
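The histogram hints at why "number of chunks * 4MB" comes out so much higher than what du reports: most chunk files are well below 4 MB on disk. A rough back-of-the-envelope check with the counts above is sketched below. It assumes every file sits at the midpoint of its bucket and that 4 MB means 4096 KB, and it covers the whole datastore rather than customer1 alone, so it is only an approximation.

```python
# Bucket width (KB) and file counts copied from the histogram above.
BUCKET_KB = 1000
counts = [1434266, 556446, 96721, 60269, 34895, 0]

total_files = sum(counts)

# Assumption: every file sits at the midpoint of its bucket.
midpoint_kb = sum((i + 0.5) * BUCKET_KB * n for i, n in enumerate(counts))

# Estimate used in the script output: number of chunks * 4 MB (taken as 4096 KB).
nominal_kb = total_files * 4 * 1024

print(f"files:             {total_files}")
print(f"midpoint estimate: {midpoint_kb / 1024**2:.2f} GiB")
print(f"chunks * 4 MB:     {nominal_kb / 1024**2:.2f} GiB")
print(f"ratio:             {midpoint_kb / nominal_kb:.2f}")
```

Under these assumptions the on-disk size is only about a quarter of the chunks * 4 MB figure for the datastore as a whole.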
import os
import sys
import math

# Bucket size in bytes (500 KB). The sample output above, with 1000 KB wide
# buckets, corresponds to BUCKET_SIZE = 1000 * 1024.
BUCKET_SIZE = 500 * 1024


def get_file_sizes(directory):
    """Return the sizes in bytes of all files below `directory`."""
    file_sizes = []
    for foldername, subfolders, filenames in os.walk(directory):
        for filename in filenames:
            filepath = os.path.join(foldername, filename)
            try:
                file_size = os.path.getsize(filepath)
                file_sizes.append(file_size)
            except OSError as e:
                # Report unreadable files on stderr and keep going
                print(f"Error retrieving size for {filepath}: {e}", file=sys.stderr)
    return file_sizes


def print_histogram(file_sizes, bucket_size):
    """Print the number of files per size bucket of `bucket_size` bytes."""
    if not file_sizes:
        print("No files to display in histogram.")
        return

    # Create buckets (one extra so the largest file always has a slot)
    max_size = max(file_sizes)
    num_buckets = math.ceil(max_size / bucket_size)
    histogram = [0] * (num_buckets + 1)

    # Populate buckets
    for size in file_sizes:
        bucket_index = size // bucket_size
        histogram[bucket_index] += 1

    # Print histogram
    print("File Size Histogram:")
    for i, count in enumerate(histogram):
        lower_bound = i * bucket_size
        upper_bound = (i + 1) * bucket_size - 1
        print(f"{lower_bound // 1024} KB - {upper_bound // 1024} KB : {count} files")


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python3 histo.py <directory_path>")
        sys.exit(1)

    directory = sys.argv[1]
    if not os.path.isdir(directory):
        print(f"The path {directory} is not a valid directory.")
        sys.exit(1)

    file_sizes = get_file_sizes(directory)
    print_histogram(file_sizes, BUCKET_SIZE)