Odd entries in log file

James Richmond

Active Member
Oct 27, 2018
Hi All,

I am hoping someone can shed some light on what this means. First, why am I looking at the logs? The other day a client called me saying their server was down and that they had switched to a backup system. When I logged in, everything seemed to be fine, but when I had them switch back to Proxmox I could not get to the server.

Things I have not done yet (can't until the weekend):
I did not reboot the VM, as they are running on a different system now (I have the VM shut down).

All I have done so far is look for anything odd in the logs, and the entries below are what I think I found. Maybe it is nothing, maybe it is something. Please let me know.

I did check Ceph and everything seems fine right now. I think it will work once we boot back up, but why did I lose the connection in the first place?

I am sorry if this is not enough info. Please let me know if you need any other info.
-------------------------

Nov 19 06:25:01 pve1 CRON[1038648]: pam_unix(cron:session): session opened for user root by (uid=0)

Nov 19 06:25:01 pve1 CRON[1038649]: (root) CMD (test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily ))

Nov 19 06:25:01 pve1 ceph-mgr[2349]: 2018-11-19 06:25:01.204371 7f5c52381700 -1 received signal: Hangup from PID: 1038711 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw UID: 0

Nov 19 06:25:01 pve1 ceph-mon[2359]: 2018-11-19 06:25:01.204404 7f2190034700 -1 received signal: Hangup from PID: 1038711 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw UID: 0

Nov 19 06:25:01 pve1 ceph-osd[2701]: 2018-11-19 06:25:01.204876 7f1fd6ae2700 -1 received signal: Hangup from PID: 1038711 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw UID: 0

Nov 19 06:25:01 pve1 systemd[1]: Started Proxmox VE replication runner.

Nov 19 06:25:01 pve1 systemd[1]: Stopping Proxmox VE firewall logger...

Nov 19 06:25:01 pve1 pvefw-logger[673118]: received terminate request (signal)

Nov 19 06:25:01 pve1 pvefw-logger[673118]: stopping pvefw logger

Nov 19 06:25:01 pve1 systemd[1]: Stopped Proxmox VE firewall logger.

Nov 19 06:25:01 pve1 systemd[1]: Starting Proxmox VE firewall logger...

Nov 19 06:25:01 pve1 pvefw-logger[1038763]: starting pvefw logger

Nov 19 06:25:01 pve1 systemd[1]: Started Proxmox VE firewall logger.

Nov 19 06:25:01 pve1 liblogging-stdlog[1133]: [origin software="rsyslogd" swVersion="8.24.0" x-pid="1133" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
 
It looks like there is a cron job in /etc/cron.daily which kills the Ceph processes. Nothing like that exists in our default setup (the processes shouldn't be killed).

I think it will work once we boot back up but why did I lose connection in the first place.

Probably because the processes were killed.
 

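Never mind, it looks like a logrotate snippet is what runs the killall command. The -1 sends SIGHUP (the "Hangup" in the log), which asks the daemons to reopen their log files rather than stopping them. The problem probably lies somewhere else.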
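To double-check that reading of the log, here is a minimal Python sketch that pulls the signal name, sender PID, and sending command out of the Ceph "received signal" lines quoted above. The regular expression and the `TERMINATING` set are my own assumptions about the line format and about which signal names would indicate a real stop; they are not taken from Ceph or Proxmox tooling.

```python
import re

# Assumed pattern for the Ceph "received signal" lines pasted in this thread.
SIGNAL_RE = re.compile(
    r"(?P<daemon>ceph-\w+)\[(?P<daemon_pid>\d+)\]: .*?"
    r"received signal: (?P<signal>\w+) from PID: (?P<sender_pid>\d+) "
    r"task name: (?P<command>.+?) UID: (?P<uid>\d+)"
)

# Hypothetical list of signal names that would normally stop a daemon.
# Hangup (SIGHUP) is conventionally a "reopen logs / reload" request instead.
TERMINATING = {"Terminated", "Killed", "Interrupt", "Quit"}


def parse_signal_line(line):
    """Return a dict describing a 'received signal' line, or None if no match."""
    match = SIGNAL_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["terminating"] = info["signal"] in TERMINATING
    return info


sample = (
    "Nov 19 06:25:01 pve1 ceph-mon[2359]: 2018-11-19 06:25:01.204404 "
    "7f2190034700 -1 received signal: Hangup from PID: 1038711 task name: "
    "killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw UID: 0"
)
info = parse_signal_line(sample)
print(info["daemon"], info["signal"], info["terminating"])
```

For the sample line this reports a Hangup sent by the killall command, which fits a log-rotation hook rather than something stopping Ceph.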

When I logged in everything seemed to be fine but when I had them switch back to Proxmox I could not get to the server.

What do you mean exactly by not being able to reach the server?
 


At this point I am not sure... What I meant is that when I type the IP of the VM, I cannot get to the VM. Proxmox itself seemed fine. But I also have a second VM (a container) that is acting odd. Ever since whatever happened, I get this from Observium when going to its IP: DB Error 2002: No such file or directory. I am not concerned about getting that VM running, as I am sure I will fix it. But that makes two different VMs that seem to have lost something, and I don't know what that something is.

I am sorry I can't be more direct about what the issue is. The fact is I really don't understand what happened here. Everything was working perfectly and nothing changed, then one day both VMs (note that at this time we only have 2 VMs) could no longer be reached via their IPs. That said, if I go to the console in Proxmox for those VMs, they look fine.

Is there any other log I can look at? I am about to restart Proxmox to see if that helps, but I am trying to avoid that as we have this set up for high availability. We have 3 servers running in this cluster, and it was working fine and still seems to be(?).
 

I think there are two different problems. Maybe I was just on a witch hunt... not sure yet. I was able to fix Observium pretty easily: I just reinstalled MySQL, since it seemed to have disappeared, and everything started to work for that.
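For anyone who hits the same Observium message: as far as I know, error 2002 with "No such file or directory" usually means the client could not find the MySQL Unix socket file, which fits MySQL being gone. A minimal sketch to check for the socket is below. The path is an assumption (a common Debian-style default), so check your own MySQL configuration for the real location.

```python
import os
import stat

# Assumed Debian-style default socket path; your my.cnf may say otherwise.
DEFAULT_SOCKET = "/var/run/mysqld/mysqld.sock"


def socket_status(path=DEFAULT_SOCKET):
    """Classify a path as 'missing', 'not-a-socket', or 'socket'."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        # Matches the "No such file or directory" part of the error.
        return "missing"
    return "socket" if stat.S_ISSOCK(mode) else "not-a-socket"


print(DEFAULT_SOCKET, "->", socket_status())
```

A result of "missing" only says the socket file is not there; it does not say whether the server is stopped, uninstalled, or configured with a different path.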

I can't test my other VM until tomorrow, but something caused it to not be accessible through its IP/interface. I am not sure how to find out what happened there. I am going to guess that it will be fine, since it will have been rebooted.

Is there anything you would look at to determine why you could no longer connect to your VM even though it seemed to be up and running fine?
I know one step to get it up and running would be to reboot, but that is something I would like to avoid needing to do all the time in the future.
 

It sounds like a network problem to me. Monitor your network for a while for weird behavior.
 
