Hello!
I'm pretty lost. Does anyone have an idea about this problem?
I have a Proxmox VE 1.8 cluster of 3 nodes (not updated, sorry). The servers are HP DL360 G7 with SAS disks.
On one node ONLY, the OpenVZ CT backup generates errors on every scheduled backup. The OpenVZ CT is 40 GB.
The VM backup log contains megabytes of output like this:
Oct 23 13:16:02 INFO: tar: ./lib32/libresolv-2.11.3.so: Warning: Cannot stat: Input/output error
Oct 23 13:16:02 INFO: tar: ./lib32/libanl.so.1: Warning: Cannot stat: Input/output error
Oct 23 13:16:02 INFO: tar: ./lib32/ld-2.11.3.so: Warning: Cannot stat: Input/output error
Oct 23 13:16:02 INFO: tar: ./lib32/librt.so.1: Warning: Cannot stat: Input/output error
Oct 23 13:16:02 INFO: tar: ./lib32/libnss_dns-2.11.3.so: Warning: Cannot stat: Input/output error
Oct 23 13:16:02 INFO: tar: ./lib32/libnss_nis-2.11.3.so: Warning: Cannot stat: Input/output error
Oct 23 13:16:02 INFO: tar: ./lib32/libmemusage.so: Warning: Cannot stat: Input/output error
Oct 23 13:16:02 INFO: tar: ./lib32/libSegFault.so: Warning: Cannot stat: Input/output error
Oct 23 13:16:02 INFO: tar: ./lib32/libthread_db-1.0.so: Warning: Cannot stat: Input/output error
Oct 23 13:16:02 INFO: tar: ./lib32/libnss_hesiod-2.11.3.so: Warning: Cannot stat: Input/output error
Oct 23 13:16:02 INFO: tar: ./lib32/libnss_nisplus.so.2: Warning: Cannot stat: Input/output error
Oct 23 13:16:02 INFO: tar: ./lib32/libcrypt-2.11.3.so: Warning: Cannot stat: Input/output error
Oct 23 13:16:02 INFO: tar: ./proc/: Warning: Cannot savedir: Input/output error
Oct 23 13:16:02 INFO: tar: ./proc: Warning: Cannot close: Bad file descriptor
Oct 23 13:16:02 INFO: Total bytes written: 40080384000 (38GiB, 7.1MiB/s)
Oct 23 13:16:02 INFO: archive file size: 25.95GB
Oct 23 13:16:02 INFO: delete old backup '/mnt/pve/Backup-VZDump/vzdump-openvz-115-2012_10_23-06_12_27.tgz'
Oct 23 13:17:07 INFO: Logical volume "vzsnap-mama-vs004-0" successfully removed
Oct 23 13:17:07 INFO: Finished Backup of VM 115 (01:45:05)
I checked the file system with fsck /dev/mapper/pve-data and the files seem OK. I can access the files listed in the log and copy them to another system using rsync.
I found this in /var/log/syslog:
Oct 23 13:15:43 cerimes-vs004 kernel: device-mapper: snapshots: Invalidating snapshot: Unable to allocate exception.
Oct 23 13:15:45 cerimes-vs004 kernel: EXT3-fs error (device dm-3): ext3_find_entry: reading directory #2624111 offset 0
Oct 23 13:15:45 cerimes-vs004 kernel: Buffer I/O error on device dm-3, logical block 0
Oct 23 13:15:45 cerimes-vs004 kernel: lost page write due to I/O error on dm-3
Oct 23 13:15:45 cerimes-vs004 kernel: EXT3-fs error (device dm-3): ext3_find_entry: reading directory #2624111 offset 0
Oct 23 13:15:45 cerimes-vs004 kernel: ------------[ cut here ]------------
Oct 23 13:15:45 cerimes-vs004 kernel: WARNING: at fs/buffer.c:1164 mark_buffer_dirty+0x23/0x80()
Oct 23 13:15:45 cerimes-vs004 kernel: Hardware name: ProLiant DL360 G7
Oct 23 13:15:45 cerimes-vs004 kernel: Modules linked in: nfs lockd fscache nfs_acl auth_rpcgss sunrpc kvm_intel kvm vzethdev vznetdev simfs vzrst vzcpt vzdquota vzmon vzdev ip6t_REJECT ip6table_mangle ip6table_filter ip6_tables xt_tcpudp xt_length xt_hl xt_tcpmss xt_TCPMSS iptable_mangle iptable_filter xt_multiport xt_limit xt_dscp ipt_REJECT ip_tables x_tables vzevent ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bridge stp snd_pcm snd_timer snd soundcore snd_page_alloc psmouse evdev pcspkr serio_raw joydev hpilo container power_meter button processor ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot usbhid hid ata_piix ehci_hcd ata_generic uhci_hcd libata usbcore nls_base bnx2 cciss thermal fan thermal_sys [last unloaded: scsi_wait_scan]
Oct 23 13:15:45 cerimes-vs004 kernel: Pid: 2483, comm: tar Not tainted 2.6.32-4-pve #1
Oct 23 13:15:45 cerimes-vs004 kernel: Call Trace:
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff8111149d>] ? mark_buffer_dirty+0x23/0x80
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff8111149d>] ? mark_buffer_dirty+0x23/0x80
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff8104e21c>] ? warn_slowpath_common+0x77/0xa3
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff8111149d>] ? mark_buffer_dirty+0x23/0x80
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffffa01352de>] ? ext3_commit_super+0x4f/0x6f [ext3]
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffffa0136b55>] ? ext3_handle_error+0x83/0xaa [ext3]
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffffa0136c85>] ? ext3_error+0x83/0x90 [ext3]
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff81110a0e>] ? submit_bh+0x11c/0x123
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff811120ae>] ? ll_rw_block+0xb4/0xf8
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffffa0133119>] ? ext3_find_entry+0x3e1/0x560 [ext3]
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff81073da6>] ? charge_dcache+0x61/0xb9
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffffa0133ae2>] ? ext3_lookup+0x30/0xe4 [ext3]
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff810f9412>] ? do_lookup+0xf1/0x178
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff810f9eab>] ? __link_path_walk+0x689/0x811
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff810fa1bb>] ? path_walk+0x44/0x85
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff810fb4db>] ? do_path_lookup+0x20/0x77
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff810fc81f>] ? user_path_at+0x48/0x79
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff81066a16>] ? autoremove_wake_function+0x0/0x2e
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff81073d07>] ? do_uncharge_dcache+0x3d/0x51
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff810f4d50>] ? vfs_fstatat+0x2c/0x57
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff810f4dd1>] ? sys_newlstat+0x11/0x30
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff810f1ee8>] ? vfs_write+0xcd/0x102
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff810f2030>] ? sys_write+0x49/0xc1
Oct 23 13:15:45 cerimes-vs004 kernel: [<ffffffff81010c12>] ? system_call_fastpath+0x16/0x1b
Oct 23 13:15:45 cerimes-vs004 kernel: ---[ end trace 4c43195544452298 ]---
Oct 23 13:15:45 cerimes-vs004 kernel: Buffer I/O error on device dm-3, logical block 0
Oct 23 13:15:45 cerimes-vs004 kernel: lost page write due to I/O error on dm-3
Oct 23 13:15:45 cerimes-vs004 kernel: EXT3-fs error (device dm-3): ext3_find_entry: reading directory #2624111 offset 0
Oct 23 13:15:45 cerimes-vs004 kernel: Buffer I/O error on device dm-3, logical block 0
Oct 23 13:15:45 cerimes-vs004 kernel: lost page write due to I/O error on dm-3
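To see whether the I/O errors only start once the snapshot has been invalidated, the syslog lines can be split around that message. The following is only a rough sketch: the function name split_errors and the shortened SAMPLE list are my own illustrative choices, and it assumes the lines are in chronological order, as they are in syslog.

```python
import re

# A few lines copied from the syslog excerpt above (shortened for the example).
SAMPLE = """\
Oct 23 13:15:43 cerimes-vs004 kernel: device-mapper: snapshots: Invalidating snapshot: Unable to allocate exception.
Oct 23 13:15:45 cerimes-vs004 kernel: EXT3-fs error (device dm-3): ext3_find_entry: reading directory #2624111 offset 0
Oct 23 13:15:45 cerimes-vs004 kernel: Buffer I/O error on device dm-3, logical block 0
Oct 23 13:15:45 cerimes-vs004 kernel: lost page write due to I/O error on dm-3
"""

INVALIDATED = "Invalidating snapshot"
IO_ERROR = re.compile(r"I/O error|EXT3-fs error")


def split_errors(lines):
    """Return (invalidation_line, errors_before, errors_after).

    Hypothetical helper: lines are assumed to be in chronological order.
    """
    invalidation = None
    before, after = [], []
    for line in lines:
        if invalidation is None and INVALIDATED in line:
            # Remember the first invalidation message only.
            invalidation = line
        elif IO_ERROR.search(line):
            # File each error line before or after the invalidation.
            (before if invalidation is None else after).append(line)
    return invalidation, before, after


if __name__ == "__main__":
    inv, before, after = split_errors(SAMPLE.splitlines())
    # The first 15 characters of a syslog line are the timestamp.
    print("snapshot invalidated at:", inv[:15] if inv else "never")
    print("errors before:", len(before), "errors after:", len(after))
```

On the lines pasted above, every error comes after the "Invalidating snapshot" message at 13:15:43. That might mean the errors are on the snapshot device rather than on the real disk, but I have not confirmed this.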
# pveversion -v
pve-manager: 1.8-18 (pve-manager/1.8/6070)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.8-33
pve-kernel-2.6.32-3-pve: 2.6.32-13
pve-kernel-2.6.32-4-pve: 2.6.32-33
qemu-server: 1.1-30
pve-firmware: 1.0-11
libpve-storage-perl: 1.0-17
vncterm: 0.9-2
vzctl: 3.0.28-1pve1
vzdump: 1.2-14
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.14.1-1
ksm-control-daemon: 1.0-6