Odd entries in log file

James Richmond · Nov 21, 2018

Hi All,

I am hoping someone can shed some light on what this means. First, why am I looking at the logs? The other day a client called to say their server was down and that they had switched to a backup system. When I logged in, everything seemed to be fine, but when I had them switch back to the Proxmox system I could not reach the server.

Things I have not done yet (can't until the weekend):
I have not rebooted the VM, as they are running on a different system now (I have the VM shut down).

All I have done so far is look for anything odd in the logs, and this is what I think I found. Maybe it is nothing, maybe it is something. Please let me know.

I did check Ceph and everything seems fine right now. I think it will work once we boot back up, but why did I lose the connection in the first place?

I am sorry if this is not enough info. Please let me know if you need any other info.
-------------------------

Nov 19 06:25:01 pve1 CRON[1038648]: pam_unix(cron:session): session opened for user root by (uid=0)

Nov 19 06:25:01 pve1 CRON[1038649]: (root) CMD (test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily ))

Nov 19 06:25:01 pve1 ceph-mgr[2349]: 2018-11-19 06:25:01.204371 7f5c52381700 -1 received signal: Hangup from PID: 1038711 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw UID: 0

Nov 19 06:25:01 pve1 ceph-mon[2359]: 2018-11-19 06:25:01.204404 7f2190034700 -1 received signal: Hangup from PID: 1038711 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw UID: 0

Nov 19 06:25:01 pve1 ceph-osd[2701]: 2018-11-19 06:25:01.204876 7f1fd6ae2700 -1 received signal: Hangup from PID: 1038711 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw UID: 0

Nov 19 06:25:01 pve1 systemd[1]: Started Proxmox VE replication runner.

Nov 19 06:25:01 pve1 systemd[1]: Stopping Proxmox VE firewall logger...

Nov 19 06:25:01 pve1 pvefw-logger[673118]: received terminate request (signal)

Nov 19 06:25:01 pve1 pvefw-logger[673118]: stopping pvefw logger

Nov 19 06:25:01 pve1 systemd[1]: Stopped Proxmox VE firewall logger.

Nov 19 06:25:01 pve1 systemd[1]: Starting Proxmox VE firewall logger...

Nov 19 06:25:01 pve1 pvefw-logger[1038763]: starting pvefw logger

Nov 19 06:25:01 pve1 systemd[1]: Started Proxmox VE firewall logger.

Nov 19 06:25:01 pve1 liblogging-stdlog[1133]: [origin software="rsyslogd" swVersion="8.24.0" x-pid="1133" x-info="http://www.rsyslog.com"] rsyslogd was HUPed

oguz · Nov 22, 2018

It looks like there is a cron job in /etc/cron.daily which kills the Ceph processes. Nothing like that exists in our default setup (the processes shouldn't be killed).

James Richmond said:
I think it will work once we boot back up, but why did I lose the connection in the first place?

Probably because the processes were killed.

oguz · Nov 22, 2018

oguz said:
It looks like there is a cron job in /etc/cron.daily which kills the Ceph processes. Nothing like that exists in our default setup (the processes shouldn't be killed).

Probably because the processes were killed.

Never mind, it looks like there's a logrotate snippet that runs the killall command. The problem probably lies somewhere else.
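For anyone reading the log later: `killall -q -1` sends signal number 1 (SIGHUP, shown as "Hangup" in the log), which daemons conventionally treat as a request to reopen their log files, not as a request to exit. A minimal Python sketch for pulling those entries out of a syslog excerpt follows. The function names and the regular expression are illustrative assumptions written against the lines in the first post; they are not part of any Proxmox or Ceph tool.

```python
import re
import signal

# Assumed pattern, written against the "received signal" lines in the first post.
SIGNAL_LINE = re.compile(
    r"(?P<daemon>ceph-\w+)\[(?P<pid>\d+)\]:.*received signal: (?P<name>\w+) "
    r"from PID: (?P<sender>\d+) task name: (?P<task>.*?) UID: (?P<uid>\d+)"
)


def find_signal_events(lines):
    """Return one dict per log line that reports a received signal."""
    events = []
    for line in lines:
        match = SIGNAL_LINE.search(line)
        if match:
            events.append(match.groupdict())
    return events


def killall_signal_number(task):
    """Return the numeric signal from a 'killall ... -N ...' command, or None."""
    for token in task.split():
        if re.fullmatch(r"-\d+", token):
            return int(token[1:])
    return None


sample = [
    "Nov 19 06:25:01 pve1 ceph-mgr[2349]: 2018-11-19 06:25:01.204371 7f5c52381700 -1 "
    "received signal: Hangup from PID: 1038711 task name: killall -q -1 ceph-mon "
    "ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw UID: 0",
    "Nov 19 06:25:01 pve1 systemd[1]: Started Proxmox VE replication runner.",
]

for event in find_signal_events(sample):
    number = killall_signal_number(event["task"])
    # signal.Signals maps the number back to its symbolic name on this platform.
    symbolic = signal.Signals(number).name if number is not None else "unknown"
    print(f"{event['daemon']} pid={event['pid']} got {event['name']} "
          f"from PID {event['sender']} (killall signal {symbolic})")
```

If every match reports SIGHUP from the same sender PID at the daily cron time, that is consistent with routine log rotation and does not by itself indicate that the daemons were stopped.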

James Richmond said:
When I logged in, everything seemed to be fine, but when I had them switch back to the Proxmox system I could not reach the server.

What do you mean exactly by not being able to reach the server?

James Richmond · Nov 23, 2018

oguz said:
Never mind, it looks like there's a logrotate snippet that runs the killall command. The problem probably lies somewhere else.

What do you mean exactly by not being able to reach the server?

At this point I am not sure... What I meant is that when I type in the IP of the VM, I cannot get to the VM. Proxmox itself seemed fine. But I also have a second VM (a container) that is acting odd. Ever since whatever happened, I now get this with Observium when going to its IP: DB Error 2002: No such file or directory. I am not concerned about getting that VM running, as I am sure I will fix it. But that is two different VMs that seem to have lost something, and I don't know what that something is.

I am sorry I can't be more direct about what the issue is. The fact is I really don't understand what happened here. Everything was working perfectly and nothing changed, then one day both VMs (note that at this time we only have 2 VMs) could no longer be reached via their IPs. That said, if I go to the console in Proxmox for those VMs, they look fine.

Is there any other log I can look at? I am about to restart Proxmox to see if that helps, but I am trying to avoid that as we have this set up for high availability. We have 3 servers running in this cluster, and it was working fine and still seems to be(?).

James Richmond · Nov 23, 2018

I think there are two different problems. Maybe I was just on a witch hunt... not sure yet. I was able to fix Observium pretty easily: I just reinstalled MySQL, since it seemed to have disappeared, and everything started to work again.

I can't test my other VM until tomorrow, but something caused it to become inaccessible through its IP/interface. I am not sure how to find out what happened there. I am going to guess that it will be fine, since it will have been rebooted.

Is there anything you would look at to determine why you could no longer connect to your VM even though it seemed to be up and running fine?
I know that one step to get it up and running again would be a reboot, but that is something I would like to avoid having to do all the time in the future.

oguz · Nov 23, 2018

James Richmond said:
I can't test my other VM until tomorrow, but something caused it to become inaccessible through its IP/interface. I am not sure how to find out what happened there. I am going to guess that it will be fine, since it will have been rebooted.

Is there anything you would look at to determine why you could no longer connect to your VM even though it seemed to be up and running fine?

It sounds like a network problem to me. Monitor your network for a while for weird behavior.
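One rough way to do that monitoring is a small script that periodically attempts a TCP connection to the VM and records the result with a timestamp, so an outage can later be lined up against the host logs. The Python sketch below is only an illustration under stated assumptions: `check_tcp` is a hypothetical helper, and the address and port in the demo are placeholders to be replaced with the VM's IP and a port its service listens on.

```python
import socket
from datetime import datetime


def check_tcp(host, port, timeout=2.0):
    """Try one TCP connection; return (True, None) or (False, error text)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:  # covers refused, unreachable, and timed-out connects
        return False, str(exc)


def log_check(host, port, timeout=2.0):
    """Run one check and return a timestamped, human-readable result line."""
    ok, error = check_tcp(host, port, timeout)
    stamp = datetime.now().isoformat(timespec="seconds")
    status = "reachable" if ok else f"UNREACHABLE ({error})"
    return f"{stamp} {host}:{port} {status}"


if __name__ == "__main__":
    # Placeholder target: replace with the VM's IP and service port.
    print(log_check("127.0.0.1", 22, timeout=1.0))
```

Running it from cron or a loop on a machine outside the cluster, and again from the Proxmox host itself, would help show whether a failure is inside the VM, on the host bridge, or somewhere on the network in between.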
