Hi all,
Another Proxmox noob here. Over the past few days my host has been ending up in a "halt" state. There is no fixed timing: sometimes it halts after 18 hours, sometimes after 2 days. After rebooting the system and checking the logs, I've noticed that the last lines logged before the hang are always related to journal rotation. When this happens, the only way to recover is to press the reset button manually. I'm only hosting KVM guests.
I've found some threads from other users with similar issues, and they also reported journal rotation entries as the last thing in their logs.
Here is the relevant portion of my syslog. The last line before the hang is the rrdcached "removing old journal" entry at Aug 12 06:46:24, and the first line after the reboot is the imklog entry at Aug 12 19:07:29:
Aug 12 00:17:01 pmx3host /USR/SBIN/CRON[94720]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Aug 12 00:46:24 pmx3host rrdcached[1817]: flushing old values
Aug 12 00:46:24 pmx3host rrdcached[1817]: rotating journals
Aug 12 00:46:24 pmx3host rrdcached[1817]: started new journal /var/lib/rrdcached/journal//rrd.journal.1376279184.400364
Aug 12 00:46:24 pmx3host rrdcached[1817]: removing old journal /var/lib/rrdcached/journal//rrd.journal.1376271984.400472
Aug 12 00:49:53 pmx3host pvedaemon[80670]: <root@pam> successful auth for user 'crioboo@pve'
Aug 12 00:58:06 pmx3host pvedaemon[80656]: <root@pam> successful auth for user 'crioboo@pve'
Aug 12 01:17:01 pmx3host /USR/SBIN/CRON[98222]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Aug 12 01:46:24 pmx3host rrdcached[1817]: flushing old values
Aug 12 01:46:24 pmx3host rrdcached[1817]: rotating journals
Aug 12 01:46:24 pmx3host rrdcached[1817]: started new journal /var/lib/rrdcached/journal//rrd.journal.1376282784.400572
Aug 12 01:46:24 pmx3host rrdcached[1817]: removing old journal /var/lib/rrdcached/journal//rrd.journal.1376275584.400502
Aug 12 02:17:01 pmx3host /USR/SBIN/CRON[101315]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Aug 12 02:41:38 pmx3host pvedaemon[80670]: <root@pam> successful auth for user 'crioboo@pve'
Aug 12 02:46:24 pmx3host rrdcached[1817]: flushing old values
Aug 12 02:46:24 pmx3host rrdcached[1817]: rotating journals
Aug 12 02:46:24 pmx3host rrdcached[1817]: started new journal /var/lib/rrdcached/journal//rrd.journal.1376286384.400474
Aug 12 02:46:24 pmx3host rrdcached[1817]: removing old journal /var/lib/rrdcached/journal//rrd.journal.1376279184.400364
Aug 12 02:51:25 pmx3host pvedaemon[80670]: <crioboo@pve> starting task UPID:pmx3host:0001940D:011E1FB7:520877DD:qmstop:101:crioboo@pve:
Aug 12 02:51:25 pmx3host pvedaemon[103437]: stop VM 101: UPID:pmx3host:0001940D:011E1FB7:520877DD:qmstop:101:crioboo@pve:
Aug 12 02:51:25 pmx3host kernel: vmbr0: port 2(tap101i0) entering disabled state
Aug 12 02:51:25 pmx3host kernel: vmbr0: port 2(tap101i0) entering disabled state
Aug 12 02:51:25 pmx3host ntpd[1776]: Deleting interface #11 tap101i0, fe80::9834:6ff:feee:ff20#123, interface stats: received=0, sent=0, dropped=0, active_time=186600 secs
Aug 12 02:51:26 pmx3host pvedaemon[80670]: <crioboo@pve> end task UPID:pmx3host:0001940D:011E1FB7:520877DD:qmstop:101:crioboo@pve: OK
Aug 12 02:51:37 pmx3host pvedaemon[103455]: start VM 101: UPID:pmx3host:0001941F:011E247C:520877E9:qmstart:101:crioboo@pve:
Aug 12 02:51:37 pmx3host pvedaemon[81161]: <crioboo@pve> starting task UPID:pmx3host:0001941F:011E247C:520877E9:qmstart:101:crioboo@pve:
Aug 12 02:51:38 pmx3host kernel: device tap101i0 entered promiscuous mode
Aug 12 02:51:38 pmx3host kernel: vmbr0: port 2(tap101i0) entering forwarding state
Aug 12 02:51:38 pmx3host pvedaemon[81161]: <crioboo@pve> end task UPID:pmx3host:0001941F:011E247C:520877E9:qmstart:101:crioboo@pve: OK
Aug 12 02:51:48 pmx3host kernel: tap101i0: no IPv6 routers present
Aug 12 02:56:25 pmx3host ntpd[1776]: Listen normally on 13 tap101i0 fe80::ac1c:d3ff:fece:89ed UDP 123
Aug 12 03:17:01 pmx3host /USR/SBIN/CRON[104871]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Aug 12 03:46:24 pmx3host rrdcached[1817]: flushing old values
Aug 12 03:46:24 pmx3host rrdcached[1817]: rotating journals
Aug 12 03:46:24 pmx3host rrdcached[1817]: started new journal /var/lib/rrdcached/journal//rrd.journal.1376289984.400500
Aug 12 03:46:24 pmx3host rrdcached[1817]: removing old journal /var/lib/rrdcached/journal//rrd.journal.1376282784.400572
Aug 12 04:17:01 pmx3host /USR/SBIN/CRON[107908]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Aug 12 04:46:24 pmx3host rrdcached[1817]: flushing old values
Aug 12 04:46:24 pmx3host rrdcached[1817]: rotating journals
Aug 12 04:46:24 pmx3host rrdcached[1817]: started new journal /var/lib/rrdcached/journal//rrd.journal.1376293584.424462
Aug 12 04:46:24 pmx3host rrdcached[1817]: removing old journal /var/lib/rrdcached/journal//rrd.journal.1376286384.400474
Aug 12 05:17:01 pmx3host /USR/SBIN/CRON[110882]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Aug 12 05:46:24 pmx3host rrdcached[1817]: flushing old values
Aug 12 05:46:24 pmx3host rrdcached[1817]: rotating journals
Aug 12 05:46:24 pmx3host rrdcached[1817]: started new journal /var/lib/rrdcached/journal//rrd.journal.1376297184.400686
Aug 12 05:46:24 pmx3host rrdcached[1817]: removing old journal /var/lib/rrdcached/journal//rrd.journal.1376289984.400500
Aug 12 06:17:01 pmx3host /USR/SBIN/CRON[113834]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Aug 12 06:25:01 pmx3host /USR/SBIN/CRON[114264]: (root) CMD (test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily ))
Aug 12 06:25:02 pmx3host rsyslogd: [origin software="rsyslogd" swVersion="4.6.4" x-pid="1672" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'lightweight'.
Aug 12 06:25:02 pmx3host rsyslogd: [origin software="rsyslogd" swVersion="4.6.4" x-pid="1672" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'lightweight'.
Aug 12 06:46:24 pmx3host rrdcached[1817]: flushing old values
Aug 12 06:46:24 pmx3host rrdcached[1817]: rotating journals
Aug 12 06:46:24 pmx3host rrdcached[1817]: started new journal /var/lib/rrdcached/journal//rrd.journal.1376300784.400659
Aug 12 06:46:24 pmx3host rrdcached[1817]: removing old journal /var/lib/rrdcached/journal//rrd.journal.1376293584.424462

Aug 12 19:07:29 pmx3host kernel: imklog 4.6.4, log source = /proc/kmsg started.
Aug 12 19:07:29 pmx3host rsyslogd: [origin software="rsyslogd" swVersion="4.6.4" x-pid="1835" x-info="http://www.rsyslog.com"] (re)start
Aug 12 19:07:29 pmx3host kernel: Initializing cgroup subsys cpuset
Aug 12 19:07:29 pmx3host kernel: Initializing cgroup subsys cpu
Aug 12 19:07:29 pmx3host kernel: Linux version 2.6.32-19-pve (root@maui) (gcc version 4.4.5 (Debian 4.4.5-8) ) #1 SMP Wed May 15 07:32:52 CEST 2013
Aug 12 19:07:29 pmx3host kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-2.6.32-19-pve root=UUID=ef09aef3-8fdd-4738-80df-f5eb4caec0a5 ro quiet
Aug 12 19:07:29 pmx3host kernel: KERNEL supported cpus:
Aug 12 19:07:29 pmx3host kernel: Intel GenuineIntel
Aug 12 19:07:29 pmx3host kernel: AMD AuthenticAMD
Aug 12 19:07:29 pmx3host kernel: Centaur CentaurHauls
Aug 12 19:07:29 pmx3host kernel: BIOS-provided physical RAM map:
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 0000000000000000 - 000000000009c000 (usable)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 000000000009c000 - 00000000000a0000 (reserved)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 0000000000100000 - 000000008c012000 (usable)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 000000008c012000 - 000000008c0f0000 (ACPI NVS)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 000000008c0f0000 - 000000008c4fb000 (ACPI data)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 000000008c4fb000 - 000000008d8fb000 (ACPI NVS)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 000000008d8fb000 - 000000008f602000 (ACPI data)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 000000008f602000 - 000000008f64f000 (reserved)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 000000008f64f000 - 000000008f6e4000 (ACPI data)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 000000008f6e4000 - 000000008f6ee000 (ACPI NVS)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 000000008f6ee000 - 000000008f6f1000 (ACPI data)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 000000008f6f1000 - 000000008f7cf000 (ACPI NVS)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 000000008f7cf000 - 000000008f800000 (ACPI data)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 000000008f800000 - 0000000090000000 (reserved)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 00000000a0000000 - 00000000b0000000 (reserved)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 00000000fc000000 - 00000000fd000000 (reserved)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 00000000fed1c000 - 00000000fed45000 (reserved)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)
Aug 12 19:07:29 pmx3host kernel: BIOS-e820: 0000000100000000 - 0000000c70000000 (usable)
Aug 12 19:07:29 pmx3host kernel: DMI 2.5 present.
Aug 12 19:07:29 pmx3host kernel: SMBIOS version 2.5 @ 0xF0440
Aug 12 19:07:29 pmx3host kernel: DMI: Intel Corporation S5520HC/S5520HC, BIOS S5500.86B.01.00.0060.090920111354 09/09/2011
Aug 12 19:07:29 pmx3host kernel: e820 update range: 0000000000000000 - 0000000000001000 (usable) ==> (reserved)
Aug 12 19:07:29 pmx3host kernel: e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
Aug 12 19:07:29 pmx3host kernel: last_pfn = 0xc70000 max_arch_pfn = 0x400000000
Aug 12 19:07:29 pmx3host kernel: MTRR default type: write-back
Aug 12 19:07:29 pmx3host kernel: MTRR fixed ranges enabled:
Aug 12 19:07:29 pmx3host kernel: 00000-9FFFF write-back
Aug 12 19:07:29 pmx3host kernel: A0000-BFFFF uncachable
Aug 12 19:07:29 pmx3host kernel: C0000-DFFFF write-through
Aug 12 19:07:29 pmx3host kernel: E0000-FFFFF write-protect
Aug 12 19:07:29 pmx3host kernel: MTRR variable ranges enabled:
Aug 12 19:07:29 pmx3host kernel: 0 base 00C0000000 mask FFC0000000 uncachable
Aug 12 19:07:29 pmx3host kernel: 1 base 00A0000000 mask FFE0000000 uncachable
Aug 12 19:07:29 pmx3host kernel: 2 base 0090000000 mask FFF0000000 uncachable
Aug 12 19:07:29 pmx3host kernel: 3 base 00B0000000 mask FFFF000000 write-combining
Aug 12 19:07:29 pmx3host kernel: 4 disabled
Aug 12 19:07:29 pmx3host kernel: 5 disabled
Aug 12 19:07:29 pmx3host kernel: 6 disabled
Aug 12 19:07:29 pmx3host kernel: 7 disabled
Aug 12 19:07:29 pmx3host kernel: x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
Aug 12 19:07:29 pmx3host kernel: last_pfn = 0x8c012 max_arch_pfn = 0x400000000
Aug 12 19:07:29 pmx3host kernel: initial memory mapped : 0 - 20000000
Aug 12 19:07:29 pmx3host kernel: init_memory_mapping: 0000000000000000-000000008c012000
Aug 12 19:07:29 pmx3host kernel: 0000000000 - 008c000000 page 2M
Aug 12 19:07:29 pmx3host kernel: 008c000000 - 008c012000 page 4k
Aug 12 19:07:29 pmx3host kernel: kernel direct mapping tables up to 8c012000 @ 8000-d000
Aug 12 19:07:29 pmx3host kernel: init_memory_mapping: 0000000100000000-0000000c70000000
Aug 12 19:07:29 pmx3host kernel: 0100000000 - 0c70000000 page 2M
Aug 12 19:07:29 pmx3host kernel: kernel direct mapping tables up to c70000000 @ b000-3e000
Aug 12 19:07:29 pmx3host kernel: RAMDISK: 3703f000 - 37fef1f6
Aug 12 19:07:29 pmx3host kernel: ACPI: RSDP 00000000000f0410 00024 (v02 INTEL )
Aug 12 19:07:29 pmx3host kernel: ACPI: XSDT 000000008f7fd120 0009C (v01 INTEL S5520HC 00000000 01000013)
Aug 12 19:07:29 pmx3host kernel: ACPI: FACP 000000008f7fb000 000F4 (v04 INTEL S5520HC 00000000 MSFT 0100000D)
Aug 12 19:07:29 pmx3host kernel: ACPI: DSDT 000000008f7f4000 06531 (v02 INTEL S5520HC 00000003 MSFT 0100000D)
Aug 12 19:07:29 pmx3host kernel: ACPI: FACS 000000008f6f1000 00040
Aug 12 19:07:29 pmx3host kernel: ACPI: APIC 000000008f7f3000 001A8 (v02 INTEL S5520HC 00000000 MSFT 0100000D)
Aug 12 19:07:29 pmx3host kernel: ACPI: MCFG 000000008f7f2000 0003C (v01 INTEL S5520HC 00000001 MSFT 0100000D)
Aug 12 19:07:29 pmx3host kernel: ACPI: HPET 000000008f7f1000 00038 (v01 INTEL S5520HC 00000001 MSFT 0100000D)
Aug 12 19:07:29 pmx3host kernel: ACPI: SLIT 000000008f7f0000 00030 (v01 INTEL S5520HC 00000001 MSFT 0100000D)
Aug 12 19:07:29 pmx3host kernel: ACPI: SRAT 000000008f7ef000 00430 (v02 INTEL S5520HC 00000001 MSFT 0100000D)
Aug 12 19:07:29 pmx3host kernel: ACPI: SPCR 000000008f7ee000 00050 (v01 INTEL S5520HC 00000000 MSFT 0100000D)
Aug 12 19:07:29 pmx3host kernel: ACPI: WDDT 000000008f7ed000 00040 (v01 INTEL S5520HC 00000000 MSFT 0100000D)
Aug 12 19:07:29 pmx3host kernel: ACPI: SSDT 000000008f7d2000 1AFC4 (v02 INTEL SSDT PM 00004000 INTL 20061109)
Aug 12 19:07:29 pmx3host kernel: ACPI: SSDT 000000008f7d1000 001D8 (v02 INTEL IPMI 00004000 INTL 20061109)
Aug 12 19:07:29 pmx3host kernel: ACPI: HEST 000000008f7d0000 000A8 (v01 INTEL S5520HC 00000001 INTL 00000001)
Aug 12 19:07:29 pmx3host kernel: ACPI: BERT 000000008f7cf000 00030 (v01 INTEL S5520HC 00000001 INTL 00000001)
Aug 12 19:07:29 pmx3host kernel: ACPI: ERST 000000008f6f0000 00230 (v01 INTEL S5520HC 00000001 INTL 00000001)
Aug 12 19:07:29 pmx3host kernel: ACPI: EINJ 000000008f6ef000 00130 (v01 INTEL S5520HC 00000001 INTL 00000001)
Aug 12 19:07:29 pmx3host kernel: ACPI: DMAR 000000008f6ee000 001C8 (v01 INTEL S5520HC 00000001 MSFT 0100000D)
Aug 12 19:07:29 pmx3host kernel: ACPI: Local APIC address 0xfee00000
Aug 12 19:07:29 pmx3host kernel: SRAT: PXM 0 -> APIC 0 -> Node 0
Aug 12 19:07:29 pmx3host kernel: SRAT: PXM 1 -> APIC 16 -> Node 1
Aug 12 19:07:29 pmx3host kernel: SRAT: PXM 0 -> APIC 2 -> Node 0
Aug 12 19:07:29 pmx3host kernel: SRAT: PXM 1 -> APIC 18 -> Node 1
Aug 12 19:07:29 pmx3host kernel: SRAT: PXM 0 -> APIC 4 -> Node 0
Aug 12 19:07:29 pmx3host kernel: SRAT: PXM 1 -> APIC 20 -> Node 1
Aug 12 19:07:29 pmx3host kernel: SRAT: PXM 0 -> APIC 6 -> Node 0
Aug 12 19:07:29 pmx3host kernel: SRAT: PXM 1 -> APIC 22 -> Node 1
Aug 12 19:07:29 pmx3host kernel: SRAT: PXM 0 -> APIC 1 -> Node 0
Aug 12 19:07:29 pmx3host kernel: SRAT: PXM 1 -> APIC 17 -> Node 1
Aug 12 19:07:29 pmx3host kernel: SRAT: PXM 0 -> APIC 3 -> Node 0
Aug 12 19:07:29 pmx3host kernel: SRAT: PXM 1 -> APIC 19 -> Node 1
Aug 12 19:07:29 pmx3host kernel: SRAT: PXM 0 -> APIC 5 -> Node 0
Aug 12 19:07:29 pmx3host kernel: SRAT: PXM 1 -> APIC 21 -> Node 1
Aug 12 19:07:29 pmx3host kernel: SRAT: PXM 0 -> APIC 7 -> Node 0
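In case anyone wants to locate the same kind of gap in their own syslog, here is a rough Python sketch of how the hang window could be found automatically. It is only an illustration, not a Proxmox tool: the function name find_gaps, the one-hour threshold, and the hard-coded year are assumptions of mine (classic syslog timestamps carry no year), and month names are parsed with the default English locale.

```python
import re
from datetime import datetime, timedelta

# Classic syslog prefix, e.g. "Aug 12 06:46:24 " (the day may be space-padded).
STAMP = re.compile(r"^([A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) ")


def find_gaps(lines, min_gap=timedelta(hours=1), year=2013):
    """Return (last_line_before, first_line_after, gap) for every silence
    of at least min_gap between consecutive timestamped lines.

    find_gaps, min_gap and year are hypothetical names chosen for this
    sketch; the year is assumed because syslog does not record it.
    """
    gaps = []
    prev_time = None
    prev_line = None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue  # skip lines without a leading timestamp
        stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        if prev_time is not None and stamp - prev_time >= min_gap:
            gaps.append((prev_line.rstrip(), line.rstrip(), stamp - prev_time))
        prev_time, prev_line = stamp, line
    return gaps


# Possible usage on a real file (path is an assumption):
#     with open("/var/log/syslog") as fh:
#         for before, after, gap in find_gaps(fh):
#             print(gap, "|", before, "->", after)
```

On my host the normal entries (hourly cron plus hourly rrdcached rotation) are never more than about 30 minutes apart, so a one-hour threshold would only flag the period where the machine was frozen.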
Here is the output of pveversion -v:
# pveversion -v
pve-manager: 2.3-13 (pve-manager/2.3/7946f1f1)
running kernel: 2.6.32-19-pve
proxmox-ve-2.6.32: 2.3-96
pve-kernel-2.6.32-19-pve: 2.6.32-96
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-4
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-36
qemu-server: 2.3-20
pve-firmware: 1.0-21
libpve-common-perl: 1.0-49
libpve-access-control: 1.0-26
libpve-storage-perl: 2.3-7
vncterm: 1.0-4
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.4-10
ksm-control-daemon: 1.1-1
#
Any ideas about what could be happening?
Regards,