Getting tons of "broken pipe" errors now on backups

devinacosta

I have upgraded to the latest Proxmox 5.4-5 and now I'm seeing tons of broken pipes on backups. This only started happening after I upgraded to the 5.4 release. It seems to start copying the data and then dies. Any ideas on what is causing this?

pve-manager/5.4-5/c6fdb264 (running kernel: 4.15.18-14-pve)


Task viewer: VM/CT 176 - Backup
INFO: starting new backup job: vzdump 176 --mode snapshot --compress lzo --storage vProtect --remove 0
INFO: Starting Backup of VM 176 (qemu)
INFO: Backup started at 2019-05-18 20:38:48
INFO: status = running
INFO: update VM 176: -lock backup
INFO: VM Name: tgrouponline.com
INFO: include disk 'virtio0' 'rbd_vm:vm-176-disk-1' 20G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating archive '/mnt/pve/vProtect/dump/vzdump-qemu-176-2019_05_18-20_38_48.vma.lzo'
INFO: started backup task '54d4f1d6-a88b-4d8a-998c-738c39c21de5'
INFO: status: 0% (167772160/21474836480), sparse 0% (1310720), duration 3, read/write 55/55 MB/s
INFO: status: 5% (1103101952/21474836480), sparse 3% (783835136), duration 6, read/write 311/50 MB/s
INFO: status: 13% (2852126720/21474836480), sparse 11% (2447163392), duration 9, read/write 583/28 MB/s
INFO: status: 15% (3401580544/21474836480), sparse 13% (2835427328), duration 12, read/write 183/53 MB/s
INFO: status: 16% (3598712832/21474836480), sparse 13% (2835947520), duration 15, read/write 65/65 MB/s
INFO: status: 17% (3783262208/21474836480), sparse 13% (2836688896), duration 18, read/write 61/61 MB/s
INFO: status: 18% (4005560320/21474836480), sparse 13% (2836848640), duration 21, read/write 74/74 MB/s
INFO: status: 19% (4219469824/21474836480), sparse 13% (2837569536), duration 24, read/write 71/71 MB/s
INFO: status: 26% (5771362304/21474836480), sparse 19% (4242907136), duration 27, read/write 517/48 MB/s
INFO: status: 36% (7887257600/21474836480), sparse 29% (6278139904), duration 30, read/write 705/26 MB/s
INFO: status: 37% (8119123968/21474836480), sparse 29% (6278467584), duration 33, read/write 77/77 MB/s
INFO: status: 38% (8355053568/21474836480), sparse 29% (6282293248), duration 36, read/write 78/77 MB/s
INFO: status: 44% (9625927680/21474836480), sparse 34% (7424094208), duration 39, read/write 423/43 MB/s
INFO: status: 57% (12418154496/21474836480), sparse 47% (10209120256), duration 42, read/write 930/2 MB/s
INFO: status: 58% (12482248704/21474836480), sparse 47% (10209161216), duration 54, read/write 5/5 MB/s
INFO: status: 59% (12680429568/21474836480), sparse 47% (10213003264), duration 57, read/write 66/64 MB/s
INFO: status: 60% (12893290496/21474836480), sparse 47% (10225545216), duration 61, read/write 53/50 MB/s
INFO: status: 61% (13103005696/21474836480), sparse 47% (10245505024), duration 65, read/write 52/47 MB/s
INFO: status: 62% (13396606976/21474836480), sparse 48% (10338848768), duration 68, read/write 97/66 MB/s
INFO: status: 77% (16592666624/21474836480), sparse 63% (13534908416), duration 71, read/write 1065/0 MB/s
INFO: status: 79% (17089691648/21474836480), sparse 64% (13891485696), duration 74, read/write 165/46 MB/s
INFO: status: 80% (17280532480/21474836480), sparse 64% (13907001344), duration 77, read/write 63/58 MB/s
TASK ERROR: broken pipe
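For context on what the message itself means: vzdump streams the raw VMA archive into the compressor (lzop here), which writes the .vma.lzo file. If the process on the right-hand side of that pipe dies -- disk full, quota hit, share dropped -- the next write on the left-hand side fails with EPIPE, which is exactly what gets reported as "broken pipe". A minimal illustration of the mechanism:

```shell
# 'head' (the reader) exits after one line and closes its end of the pipe;
# the next write by 'yes' (the writer) then fails with EPIPE -- the same
# condition the backup task reports as "broken pipe".
yes | head -n 1
# → y
```

So "broken pipe" is usually only the symptom; the real failure is whatever killed the process on the other end of the pipe.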
 
Hi,

if the backup storage is a network share, please check the network and check whether the storage is overloaded.
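For example, a rough way to check both (a sketch only -- STORE and HOST are placeholders you need to point at your own backup mount and storage host):

```shell
# Placeholders: set STORE to the backup mount (e.g. /mnt/pve/vProtect)
# and HOST to the machine exporting the share.
STORE="${STORE:-/}"
HOST="${HOST:-127.0.0.1}"

df -h "$STORE"                            # a full share (or quota) shows up here
grep -c " $STORE " /proc/mounts || true   # 0 => the share has dropped off
command -v iostat >/dev/null && iostat -x 1 3 || true    # %util near 100 => storage overloaded
command -v ping >/dev/null && ping -c 3 "$HOST" || true  # loss/latency => network trouble
dmesg 2>/dev/null | tail -n 20            # NFS/CIFS timeouts and I/O errors land here
```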
 
We are experiencing a similar issue; the backup location is CephFS. When we run the individual VM backups they work fine. It is only when they run as part of the scheduled backup that some of them fail with:

INFO: status: 97% (33365688320/34359738368), sparse 73% (25101967360), duration 277, read/write 464/22 MB/s
lzop: No space left on device: <stdout>
INFO: status: 97% (33601617920/34359738368), sparse 73% (25108201472), duration 284, read/write 33/32 MB/s
ERROR: vma_queue_write: write error - Broken pipe
INFO: aborting backup job
ERROR: Backup of VM 200 failed - vma_queue_write: write error - Broken pipe
INFO: Failed at 2019-09-23 10:26:19
INFO: Backup job finished with errors

How would we tell if the issue is that the storage or network is overloaded?
 
Part of scheduled backup:

INFO: Starting Backup of VM 200 (qemu)
INFO: Backup started at 2019-09-23 10:21:32
INFO: status = running
INFO: update VM 200: -lock backup
INFO: VM Name: bills-desktop
INFO: include disk 'scsi0' 'VM:vm-200-disk-0' 32G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating archive '/mnt/pve/cephfs/backup-20-weekly/dump/vzdump-qemu-200-2019_09_23-10_21_32.vma.lzo'
INFO: started backup task '3e45921a-6839-4ab9-916c-4ab99792a05e'
INFO: status: 0% (180355072/34359738368), sparse 0% (104837120), duration 3, read/write 60/25 MB/s
INFO: status: 1% (352321536/34359738368), sparse 0% (110231552), duration 7, read/write 42/41 MB/s
INFO: status: 2% (726007808/34359738368), sparse 0% (147267584), duration 15, read/write 46/42 MB/s
INFO: status: 3% (1065353216/34359738368), sparse 0% (166731776), duration 26, read/write 30/29 MB/s
INFO: status: 4% (1413480448/34359738368), sparse 0% (201433088), duration 34, read/write 43/39 MB/s
<snip>
INFO: status: 93% (31970951168/34359738368), sparse 69% (23776178176), duration 274, read/write 28/24 MB/s
INFO: status: 97% (33365688320/34359738368), sparse 73% (25101967360), duration 277, read/write 464/22 MB/s
lzop: No space left on device: <stdout>
INFO: status: 97% (33601617920/34359738368), sparse 73% (25108201472), duration 284, read/write 33/32 MB/s
ERROR: vma_queue_write: write error - Broken pipe
INFO: aborting backup job
ERROR: Backup of VM 200 failed - vma_queue_write: write error - Broken pipe
INFO: Failed at 2019-09-23 10:26:19
INFO: Backup job finished with errors

Manual backup:

INFO: starting new backup job: vzdump 200 --storage backup-20-weekly --remove 0 --node pve4 --mode snapshot --compress lzo
INFO: Starting Backup of VM 200 (qemu)
INFO: Backup started at 2019-09-23 10:42:52
INFO: status = running
INFO: update VM 200: -lock backup
INFO: VM Name: bills-desktop
INFO: include disk 'scsi0' 'VM:vm-200-disk-0' 32G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating archive '/mnt/pve/cephfs/backup-20-weekly/dump/vzdump-qemu-200-2019_09_23-10_42_52.vma.lzo'
INFO: started backup task '52a12d68-024c-4ddb-88a1-4c9e3725ed0f'
INFO: status: 0% (218103808/34359738368), sparse 0% (108740608), duration 3, read/write 72/36 MB/s
INFO: status: 1% (369098752/34359738368), sparse 0% (118784000), duration 6, read/write 50/46 MB/s
INFO: status: 2% (692060160/34359738368), sparse 0% (145252352), duration 12, read/write 53/49 MB/s
<snip>
INFO: status: 97% (33374076928/34359738368), sparse 73% (25091760128), duration 233, read/write 469/27 MB/s
INFO: status: 98% (33999028224/34359738368), sparse 74% (25457483776), duration 239, read/write 104/43 MB/s
INFO: status: 100% (34359738368/34359738368), sparse 75% (25818193920), duration 240, read/write 360/0 MB/s
INFO: transferred 34359 MB in 240 seconds (143 MB/s)
INFO: archive file size: 4.79GB
INFO: Finished Backup of VM 200 (00:04:04)
INFO: Backup finished at 2019-09-23 10:46:56
INFO: Backup job finished successfully
TASK OK
 
Found the issue. There was enough physical space, but someone had put a quota on the parent folder. The exceeded quota did not show up until the next day.
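For anyone else who lands here: a CephFS directory quota lives in a virtual extended attribute on the directory itself, so `df` on the filesystem can look fine while the quota is already exhausted. A sketch of how to check (DIR is a placeholder for your dump directory; `getfattr` comes from the `attr` package):

```shell
# DIR is a placeholder -- point it at the backup dump directory.
DIR="${DIR:-.}"

df -h "$DIR"   # physical free space: can look fine even when the quota is hit
# On CephFS, a byte quota (if any) is stored in this virtual xattr:
getfattr -n ceph.quota.max_bytes "$DIR" 2>/dev/null \
  || echo "no ceph.quota.max_bytes set here (or not a CephFS mount)"
```

If the quota turns out to be the culprit, it can be cleared again by setting `ceph.quota.max_bytes` back to 0 with `setfattr`.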
 
