Search results

  1. Designing my network(s)

    Hi all, I'm doing some upgrades to my network. I've got my servers ready, but I'm not sure what traffic, or how much, goes through each piece of the puzzle. I have 4 Proxmox boxes and 2 FreeNAS boxes. On the Proxmox boxes I was thinking of setting up the following (all GbE): (When I say +x it indicates...
  2. Bridge Bond0 - Lost Network

    Unless you really know what you're doing and have a specific reason not to use the bridge, it's very likely you need it. It's how your virtual machines connect to a network. So you will end up with: VM virtual NICs -> bridge -> bond -> physical NICs -> physical switch. It sounds like you should...
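
    A minimal /etc/network/interfaces sketch of that layout, assuming two physical NICs (eth0/eth1), an active-backup bond, and the usual Proxmox vmbr0 naming (addresses are placeholders):

        auto bond0
        iface bond0 inet manual
            slaves eth0 eth1
            bond_mode active-backup
            bond_miimon 100

        auto vmbr0
        iface vmbr0 inet static
            address 10.0.0.101
            netmask 255.255.255.0
            gateway 10.0.0.1
            bridge_ports bond0
            bridge_stp off
            bridge_fd 0
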
  3. Bridge Bond0 - Lost Network

    You can assign an IP address directly to bond0 if you don't need the bridge. The bridge is there so your VMs can connect to the network. Think of the bridge as a network switch. If you don't need that, then you can eliminate it.
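
    If no VMs need that link, a hedged sketch of the no-bridge variant, with the address moved onto bond0 (same assumed NIC names and placeholder addresses as the sketch above):

        auto bond0
        iface bond0 inet static
            address 10.0.0.101
            netmask 255.255.255.0
            gateway 10.0.0.1
            slaves eth0 eth1
            bond_mode active-backup
            bond_miimon 100
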
  4. Can't keep cluster together after removing node

    Going back and looking at the graphs reveals that corosync is buggy in this Proxmox version. We removed that node from the cluster on Saturday and it's been a nightmare ever since. Those graphs are from the 2 nodes that were still clustered together. Corosync clearly has a memory leak that happens...
  5. Bridge Bond0 - Lost Network

    I fixed some bonding issues by adding the following to vmbr0:

        bridge_maxage 0
        bridge_ageing 0
        bridge_maxwait 0
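
    In context, those options sit inside the vmbr0 stanza of /etc/network/interfaces, after the bridge_ports line; a sketch based on the bridge from earlier in the thread (addresses are placeholders):

        auto vmbr0
        iface vmbr0 inet static
            address 10.0.0.101
            netmask 255.255.255.0
            bridge_ports bond0
            bridge_maxage 0
            bridge_ageing 0
            bridge_maxwait 0
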
  6. Can't keep cluster together after removing node

    This just keeps getting better.

        kernel: Out of memory: OOM killed process 597850 (corosync) score 0 vm:53331892kB, rss:53129772kB, swap:0kB

    Can anyone explain why corosync needs 53 GB of RAM?
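
    If you need to confirm which process is ballooning before the OOM killer fires, plain procps is enough (nothing Proxmox-specific; the interval is arbitrary):

        # log corosync's virtual and resident memory once a minute
        while true; do ps -o pid,vsz,rss,comm -C corosync; sleep 60; done
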
  7. Can't keep cluster together after removing node

    This is run from Node 1:

        # omping 10.0.0.105 10.0.0.102 10.0.0.103
        10.0.0.105 : waiting for response msg
        10.0.0.102 : waiting for response msg
        10.0.0.105 : joined (S,G) = (*, 232.43.211.234), pinging
        10.0.0.102 : joined (S,G) = (*, 232.43.211.234), pinging
        10.0.0.105 : unicast, seq=1, size=69...
  8. Can't keep cluster together after removing node

    To your first question, yes, I switched it off using the PDU at the appropriate time to make sure. Multicast is working, verified by omping just now. Syslog from node 1 after reboot - the first line is repeated many, many times:

        Oct 10 15:54:27 t1000 corosync[3293]: [TOTEM ] Retransmit List...
  9. Can't keep cluster together after removing node

    I have a cluster of 4 machines that is temporarily 1 short because I need to test the RAM on that node. After shutting it down and removing it from the cluster using the instructions from the wiki, my cluster splits. Node 1 is by itself (no quorum; expected 3, quorum 2, total 1). Node 2 was removed...
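
    For anyone hitting a similar split, the stock Proxmox cluster commands for this (not quoted from the thread) are:

        # show membership, votes and quorum state; run on each node
        pvecm status
        # temporary workaround: let an isolated node act with one expected vote
        pvecm expected 1
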
  10. Hard drives blank after migration

    No one seems to know what the cause of this is. I'm posting my experience in case it helps a Googler down the road. It has now happened on an NFS share running on top of a hardware RAID, so MD is not to blame either. I now suspect it's possibly a kernel or fsutil bug inside the client VM, as...
  11. Hard drives blank after migration

    I'm not sure if this is a bug in NFS or MD, but this looks very similar: https://bugzilla.redhat.com/show_bug.cgi?id=725166#c22 I have switched my storage over to FreeNAS...so far so good.
  12. Hard drives blank after migration

    Update: It's happened again. This time it's on a brand-new VM on an iSCSI LVM store on a different physical machine, and the VM has never been migrated between nodes since creation. So it's not NFS, it's not specific to a node, and it's nothing to do with...
  13. Hard drives blank after migration

    I created a test VM with HDD settings identical to the ones that failed. I then repeated the steps from my original post and hit no issues. I left it running for a while and it was fine. I tried every available caching option on both the original failed VM and the test VM, to no avail...
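
    For reference, the cache mode being cycled through here can also be set from the CLI; a sketch assuming VM 100 with a virtio disk on storage "local" (IDs and volume name are placeholders):

        # re-specify the drive with a different cache mode (none, writethrough, writeback, unsafe)
        qm set 100 --virtio0 local:vm-100-disk-1,cache=writethrough
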
  14. Hard drives blank after migration

    Sorry for the delayed reply; I've been busy restoring backups. I restored them onto fresh VMs so I can try to figure out what happened. I moved the VMs back to node 2; it no longer says "not bootable" but hangs at "booting from hard disk". I can access the disk itself when I boot from rescue, it just...
  15. Hard drives blank after migration

    I added a new server to my cluster today. It went from 2 nodes to 3, with a few hiccups, but overall it seemed OK at first. One of my nodes was having random reboots, so I wanted to be able to pull it and do RAM tests. I moved a couple of hard drives that were on local storage to an NFS volume. I...
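
    Moves like that can also be done from the CLI with qm move_disk in newer Proxmox releases; a sketch with placeholder IDs (VM 100, disk virtio0, NFS storage "nfs1"):

        # move the disk image from local storage to the NFS storage
        qm move_disk 100 virtio0 nfs1
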
  16. 401 Errors on enterprise repos with active registration

    Must have been a time delay. It's working now...
  17. 401 Errors on enterprise repos with active registration

    Hi there, my server is running Proxmox 3.1-3, recently upgraded from 2.x. I am unable to do apt-get update without receiving a 401 from the enterprise repos. I have other machines in the same cluster that work fine. Is something wrong, or is there perhaps a time delay? I've tried rebooting to no...
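
    A quick check that the node actually holds a valid subscription (stock Proxmox CLI; run on the affected node):

        # print the cached subscription key and its status
        pvesubscription get
        # force a fresh check against the activation servers
        pvesubscription update
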
  18. Clocksource tsc unstable with CentOS 6.4 on Proxmox 2.3 / KVM

    1. Make sure /boot is mounted (if necessary)
    2. In /boot/grub/grub.conf, add "clocksource_failover=acpi_pm" to the end of the kernel line
    3. Reboot

    Alternatively, try "clocksource=acpi_pm".
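
    For example, a CentOS 6 kernel line in grub.conf would end up looking something like this (kernel version and root device are placeholders):

        kernel /vmlinuz-2.6.32-358.el6.x86_64 ro root=/dev/mapper/vg_root-lv_root rhgb quiet clocksource_failover=acpi_pm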