VPN not working inside virtual machine (KVM)

NoSum

New Member
May 15, 2021
I have the following network config:

Code:
auto lo
iface lo inet loopback


auto enp193s0f0
iface enp193s0f0 inet manual


auto enp133s0f0
iface enp133s0f0 inet manual


auto enp133s0f1
iface enp133s0f1 inet manual


auto enp193s0f1
iface enp193s0f1 inet manual


iface enp9s0f3u2u2c2 inet manual


auto bond0
iface bond0 inet manual
        bond-slaves enp193s0f0 enp193s0f1
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3


#VLAN
auto bond1
iface bond1 inet static
        address 192.168.0.120/16
        bond-slaves enp133s0f0 enp133s0f1
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3


#PUBLIC NODE IP
auto vmbr0
iface vmbr0 inet static
        address IPV4/32
        gateway 51.195.234.254
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        post-up ip route add IPBLOCK/27 dev vmbr0
        post-up echo 1 >/proc/sys/net/ipv4/ip_forward


iface vmbr0 inet6 static
        address 2001:41d0:802:3f00::/56
        gateway fe80::1

For some reason, VPNs are not working inside the VM. Could this be due to these two lines?

post-up ip route add IPBLOCK/27 dev vmbr0
post-up echo 1 >/proc/sys/net/ipv4/ip_forward

If so, what workarounds are there?
 
This is a very broad problem description...

There are many things that could prevent "VPNs" from working.
  1. What type of VPN are you using? (IPsec, OpenVPN, ...)
  2. Is there L3 connectivity between the two VPN endpoints? (e.g. can they ping each other)
  3. What's in the logs of both endpoints regarding establishing the connection?
  4. Why is there a /32 address and a static route defined? Could it not simply be a /27 address?
  5. Are any firewalls blocking the VPN connection?
  6. Maybe try disabling the "Firewall" option on the VM's NIC.
 
1. OpenVPN
2. Yes, they can ping each other before connecting.
3. Below is the connection log from the client; I can't get the log from the other end.

Code:
Fri May 21 10:32:24 2021 WARNING: file '/etc/openvpn/wfvpn/keys/wfvpn.key' is group or others accessible
Fri May 21 10:32:24 2021 WARNING: file '/etc/openvpn/wfvpn/keys/ta.key' is group or others accessible
Fri May 21 10:32:24 2021 OpenVPN 2.4.11 x86_64-redhat-linux-gnu [Fedora EPEL patched] [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Apr 21 2021
Fri May 21 10:32:24 2021 library versions: OpenSSL 1.0.2k-fips  26 Jan 2017, LZO 2.06
Fri May 21 10:32:24 2021 Outgoing Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication
Fri May 21 10:32:24 2021 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication
Fri May 21 10:32:24 2021 TCP/UDP: Preserving recently used remote address: [AF_INET]VPNIP:443
Fri May 21 10:32:24 2021 Socket Buffers: R=[212992->212992] S=[212992->212992]
Fri May 21 10:32:24 2021 UDP link local: (not bound)
Fri May 21 10:32:24 2021 UDP link remote: [AF_INET]VPNIP:443
Fri May 21 10:32:24 2021 TLS: Initial packet from [AF_INET]VPNIP:443, sid=e0277a0b 44a0faac
Fri May 21 10:32:24 2021 VERIFY OK: depth=1, CN=Easy-RSA CA
Fri May 21 10:32:24 2021 VERIFY KU OK
Fri May 21 10:32:24 2021 Validating certificate extended key usage
Fri May 21 10:32:24 2021 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication
Fri May 21 10:32:24 2021 VERIFY EKU OK
Fri May 21 10:32:24 2021 VERIFY OK: depth=0, CN=servername
Fri May 21 10:32:24 2021 Control Channel: TLSv1.2, cipher TLSv1/SSLv3 ECDHE-RSA-AES256-GCM-SHA384, 2048 bit RSA
Fri May 21 10:32:24 2021 [servername] Peer Connection Initiated with [AF_INET]VPNIP:443
Fri May 21 10:32:25 2021 SENT CONTROL [servername]: 'PUSH_REQUEST' (status=1)
Fri May 21 10:32:25 2021 PUSH: Received control message: 'PUSH_REPLY,redirect-gateway,block-outside-dns,dhcp-option DNS 8.8.8.8,dhcp-option DNS 8.8.4.4,block-outside-dns,route 10.8.0.0 255.255.255.0,topology net30,ping 10,ping-restart 120,ifconfig 10.8.0.10 10.8.0.9,peer-id 1,cipher AES-256-GCM'
Fri May 21 10:32:25 2021 Options error: Unrecognized option or missing or extra parameter(s) in [PUSH-OPTIONS]:2: block-outside-dns (2.4.11)
Fri May 21 10:32:25 2021 Options error: Unrecognized option or missing or extra parameter(s) in [PUSH-OPTIONS]:5: block-outside-dns (2.4.11)
Fri May 21 10:32:25 2021 OPTIONS IMPORT: timers and/or timeouts modified
Fri May 21 10:32:25 2021 OPTIONS IMPORT: --ifconfig/up options modified
Fri May 21 10:32:25 2021 OPTIONS IMPORT: route options modified
Fri May 21 10:32:25 2021 OPTIONS IMPORT: --ip-win32 and/or --dhcp-option options modified
Fri May 21 10:32:25 2021 OPTIONS IMPORT: peer-id set
Fri May 21 10:32:25 2021 OPTIONS IMPORT: adjusting link_mtu to 1625
Fri May 21 10:32:25 2021 OPTIONS IMPORT: data channel crypto options modified
Fri May 21 10:32:25 2021 Data Channel: using negotiated cipher 'AES-256-GCM'
Fri May 21 10:32:25 2021 Outgoing Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key
Fri May 21 10:32:25 2021 Incoming Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key
Fri May 21 10:32:25 2021 ROUTE_GATEWAY 100.64.0.1
Fri May 21 10:32:25 2021 TUN/TAP device tun0 opened
Fri May 21 10:32:25 2021 TUN/TAP TX queue length set to 100
Fri May 21 10:32:25 2021 /sbin/ip link set dev tun0 up mtu 1500
Fri May 21 10:32:25 2021 /sbin/ip addr add dev tun0 local 10.8.0.10 peer 10.8.0.9
Fri May 21 10:32:25 2021 /sbin/ip route add VPNIP/32 via 100.64.0.1
Fri May 21 10:32:25 2021 /sbin/ip route add 0.0.0.0/1 via 10.8.0.9
Fri May 21 10:32:25 2021 /sbin/ip route add 128.0.0.0/1 via 10.8.0.9
Fri May 21 10:32:25 2021 /sbin/ip route add 10.8.0.0/24 via 10.8.0.9
Fri May 21 10:32:25 2021 WARNING: this configuration may cache passwords in memory -- use the auth-nocache option to prevent this
Fri May 21 10:32:25 2021 Initialization Sequence Completed
Fri May 21 10:33:08 2021 event_wait : Interrupted system call (code=4)
Fri May 21 10:33:08 2021 /sbin/ip route del 10.8.0.0/24
Fri May 21 10:33:08 2021 /sbin/ip route del VPNIP/32
Fri May 21 10:33:08 2021 /sbin/ip route del 0.0.0.0/1
Fri May 21 10:33:08 2021 /sbin/ip route del 128.0.0.0/1
Fri May 21 10:33:08 2021 Closing TUN/TAP interface
Fri May 21 10:33:08 2021 /sbin/ip addr del dev tun0 local 10.8.0.10 peer 10.8.0.9
Fri May 21 10:33:08 2021 SIGTERM[hard,] received, process exiting


4. The /32 is the main IP of my OVH node; the /27 is an additional IP block.
5. No.
6. I have done so. I thought it might be a firewall issue, as it appears to connect for a brief moment, judging by the increase in ping: https://i.gyazo.com/d720352cb3d3d9b0044d785ad97de35c.png
 
From what I can tell, the VPN endpoints are able to communicate with each other and establish a connection :)
After about 40 seconds the service terminates (I assume you stopped it?)
  1. What is the problem now? The VM is not able to communicate via the VPN?
  2. Are there any other suspicious system logs regarding IP forwarding or TUN/TAP devices? (on the VM or the host)
 
1. I did stop it, but 5-15 seconds after the VPN is established it stops working, with no mention of why in the logs.
2. The only other thing is that this is shown when a VM boots up (VMID 100):
f9d91ab2ed53245976d7ebdcb874f273.png
 
These state changes are completely normal as long as the last one is "forwarding state".

So the VM is able to communicate via the VPN for a few seconds and then it stops working? That would be very strange; I see no reason for the VPN to stop working. Does a bare-metal client behave the same?
 
I will test the VPN on the node itself now.

Does this mean anything? Seen some threads but I think it's LXC only?

https://pastebin.com/KwGv7PNQ
 
No, the VPN client fails on the main node too.

d8fd65c4d8d895eef1f6ac246004fa11.png


tun0 is the VPN

Code:
Fri May 21 12:47:31 2021 WARNING: file '/etc/openvpn/vpn/keys/vpn.key' is group or others accessible
Fri May 21 12:47:31 2021 WARNING: file '/etc/openvpn/vpn/keys/ta.key' is group or others accessible
Fri May 21 12:47:31 2021 OpenVPN 2.4.7 x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Feb 20 2019
Fri May 21 12:47:31 2021 library versions: OpenSSL 1.1.1d  10 Sep 2019, LZO 2.10
Fri May 21 12:47:31 2021 Outgoing Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication
Fri May 21 12:47:31 2021 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication
Fri May 21 12:47:31 2021 TCP/UDP: Preserving recently used remote address: [AF_INET]VPNIP:443
Fri May 21 12:47:31 2021 Socket Buffers: R=[212992->212992] S=[212992->212992]
Fri May 21 12:47:31 2021 UDP link local: (not bound)
Fri May 21 12:47:31 2021 UDP link remote: [AF_INET]VPNIP:443
Fri May 21 12:47:31 2021 TLS: Initial packet from [AF_INET]VPNIP:443, sid=0cbf97ec 15b190fc
Fri May 21 12:47:32 2021 VERIFY OK: depth=1, CN=Easy-RSA CA
Fri May 21 12:47:32 2021 VERIFY KU OK
Fri May 21 12:47:32 2021 Validating certificate extended key usage
Fri May 21 12:47:32 2021 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication
Fri May 21 12:47:32 2021 VERIFY EKU OK
Fri May 21 12:47:32 2021 VERIFY OK: depth=0, CN=servername
Fri May 21 12:47:32 2021 Control Channel: TLSv1.2, cipher TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384, 2048 bit RSA
Fri May 21 12:47:32 2021 [servername] Peer Connection Initiated with [AF_INET]VPNIP:443
Fri May 21 12:47:33 2021 SENT CONTROL [servername]: 'PUSH_REQUEST' (status=1)
Fri May 21 12:47:33 2021 PUSH: Received control message: 'PUSH_REPLY,redirect-gateway,block-outside-dns,dhcp-option DNS 8.8.8.8,dhcp-option DNS 8.8.4.4,block-outside-dns,route 10.8.0.0 255.255.255.0,topology net30,ping 10,ping-restart 120,ifconfig 10.8.0.10 10.8.0.9,peer-id 1,cipher AES-256-GCM'
Fri May 21 12:47:33 2021 Options error: Unrecognized option or missing or extra parameter(s) in [PUSH-OPTIONS]:2: block-outside-dns (2.4.7)
Fri May 21 12:47:33 2021 Options error: Unrecognized option or missing or extra parameter(s) in [PUSH-OPTIONS]:5: block-outside-dns (2.4.7)
Fri May 21 12:47:33 2021 OPTIONS IMPORT: timers and/or timeouts modified
Fri May 21 12:47:33 2021 OPTIONS IMPORT: --ifconfig/up options modified
Fri May 21 12:47:33 2021 OPTIONS IMPORT: route options modified
Fri May 21 12:47:33 2021 OPTIONS IMPORT: --ip-win32 and/or --dhcp-option options modified
Fri May 21 12:47:33 2021 OPTIONS IMPORT: peer-id set
Fri May 21 12:47:33 2021 OPTIONS IMPORT: adjusting link_mtu to 1625
Fri May 21 12:47:33 2021 OPTIONS IMPORT: data channel crypto options modified
Fri May 21 12:47:33 2021 Data Channel: using negotiated cipher 'AES-256-GCM'
Fri May 21 12:47:33 2021 Outgoing Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key
Fri May 21 12:47:33 2021 Incoming Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key
Fri May 21 12:47:33 2021 ROUTE_GATEWAY 100.64.0.1
Fri May 21 12:47:33 2021 TUN/TAP device tun0 opened
Fri May 21 12:47:33 2021 TUN/TAP TX queue length set to 100
Fri May 21 12:47:33 2021 /sbin/ip link set dev tun0 up mtu 1500
Fri May 21 12:47:33 2021 /sbin/ip addr add dev tun0 local 10.8.0.10 peer 10.8.0.9
Fri May 21 12:47:33 2021 /sbin/ip route add VPNIP/32 via 100.64.0.1
Error: Nexthop has invalid gateway.
Fri May 21 12:47:33 2021 ERROR: Linux route add command failed: external program exited with error status: 2
Fri May 21 12:47:33 2021 /sbin/ip route add 0.0.0.0/1 via 10.8.0.9
Fri May 21 12:47:33 2021 /sbin/ip route add 128.0.0.0/1 via 10.8.0.9
Fri May 21 12:47:33 2021 /sbin/ip route add 10.8.0.0/24 via 10.8.0.9
Fri May 21 12:47:33 2021 WARNING: this configuration may cache passwords in memory -- use the auth-nocache option to prevent this
Fri May 21 12:47:33 2021 Initialization Sequence Completed
Fri May 21 12:49:22 2021 event_wait : Interrupted system call (code=4)
Fri May 21 12:49:22 2021 /sbin/ip route del 10.8.0.0/24
Fri May 21 12:49:22 2021 /sbin/ip route del VPNIP/32
RTNETLINK answers: No such process
Fri May 21 12:49:22 2021 ERROR: Linux route delete command failed: external program exited with error status: 2
Fri May 21 12:49:22 2021 /sbin/ip route del 0.0.0.0/1
Fri May 21 12:49:22 2021 /sbin/ip route del 128.0.0.0/1
Fri May 21 12:49:22 2021 Closing TUN/TAP interface
Fri May 21 12:49:22 2021 /sbin/ip addr del dev tun0 local 10.8.0.10 peer 10.8.0.9
Fri May 21 12:49:22 2021 SIGTERM[hard,] received, process exiting
Fri May 21 12:49:42 2021 WARNING: file '/etc/openvpn/vpn/keys/vpn.key' is group or others accessible
Fri May 21 12:49:42 2021 WARNING: file '/etc/openvpn/vpn/keys/ta.key' is group or others accessible
Fri May 21 12:49:42 2021 OpenVPN 2.4.7 x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Feb 20 2019
Fri May 21 12:49:42 2021 library versions: OpenSSL 1.1.1d  10 Sep 2019, LZO 2.10
Fri May 21 12:49:42 2021 Outgoing Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication
Fri May 21 12:49:42 2021 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication
Fri May 21 12:49:42 2021 TCP/UDP: Preserving recently used remote address: [AF_INET]VPNIP:443
Fri May 21 12:49:42 2021 Socket Buffers: R=[212992->212992] S=[212992->212992]
Fri May 21 12:49:42 2021 UDP link local: (not bound)
Fri May 21 12:49:42 2021 UDP link remote: [AF_INET]VPNIP:443
Fri May 21 12:49:42 2021 TLS: Initial packet from [AF_INET]VPNIP:443, sid=b459c061 b4484dcd
Fri May 21 12:49:42 2021 VERIFY OK: depth=1, CN=Easy-RSA CA
Fri May 21 12:49:42 2021 VERIFY KU OK
Fri May 21 12:49:42 2021 Validating certificate extended key usage
Fri May 21 12:49:42 2021 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication
Fri May 21 12:49:42 2021 VERIFY EKU OK
Fri May 21 12:49:42 2021 VERIFY OK: depth=0, CN=servername
Fri May 21 12:49:42 2021 Control Channel: TLSv1.2, cipher TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384, 2048 bit RSA
Fri May 21 12:49:42 2021 [servername] Peer Connection Initiated with [AF_INET]VPNIP:443
Fri May 21 12:49:43 2021 SENT CONTROL [servername]: 'PUSH_REQUEST' (status=1)
Fri May 21 12:49:44 2021 PUSH: Received control message: 'PUSH_REPLY,redirect-gateway,block-outside-dns,dhcp-option DNS 8.8.8.8,dhcp-option DNS 8.8.4.4,block-outside-dns,route 10.8.0.0 255.255.255.0,topology net30,ping 10,ping-restart 120,ifconfig 10.8.0.14 10.8.0.13,peer-id 2,cipher AES-256-GCM'
Fri May 21 12:49:44 2021 Options error: Unrecognized option or missing or extra parameter(s) in [PUSH-OPTIONS]:2: block-outside-dns (2.4.7)
Fri May 21 12:49:44 2021 Options error: Unrecognized option or missing or extra parameter(s) in [PUSH-OPTIONS]:5: block-outside-dns (2.4.7)
Fri May 21 12:49:44 2021 OPTIONS IMPORT: timers and/or timeouts modified
Fri May 21 12:49:44 2021 OPTIONS IMPORT: --ifconfig/up options modified
Fri May 21 12:49:44 2021 OPTIONS IMPORT: route options modified
Fri May 21 12:49:44 2021 OPTIONS IMPORT: --ip-win32 and/or --dhcp-option options modified
Fri May 21 12:49:44 2021 OPTIONS IMPORT: peer-id set
Fri May 21 12:49:44 2021 OPTIONS IMPORT: adjusting link_mtu to 1625
Fri May 21 12:49:44 2021 OPTIONS IMPORT: data channel crypto options modified
Fri May 21 12:49:44 2021 Data Channel: using negotiated cipher 'AES-256-GCM'
Fri May 21 12:49:44 2021 Outgoing Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key
Fri May 21 12:49:44 2021 Incoming Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key
Fri May 21 12:49:44 2021 ROUTE_GATEWAY 100.64.0.1
Fri May 21 12:49:44 2021 TUN/TAP device tun0 opened
Fri May 21 12:49:44 2021 TUN/TAP TX queue length set to 100
Fri May 21 12:49:44 2021 /sbin/ip link set dev tun0 up mtu 1500
Fri May 21 12:49:44 2021 /sbin/ip addr add dev tun0 local 10.8.0.14 peer 10.8.0.13
Fri May 21 12:49:44 2021 /sbin/ip route add VPNIP/32 via 100.64.0.1
Error: Nexthop has invalid gateway.
Fri May 21 12:49:44 2021 ERROR: Linux route add command failed: external program exited with error status: 2
Fri May 21 12:49:44 2021 /sbin/ip route add 0.0.0.0/1 via 10.8.0.13
Fri May 21 12:49:44 2021 /sbin/ip route add 128.0.0.0/1 via 10.8.0.13
Fri May 21 12:49:44 2021 /sbin/ip route add 10.8.0.0/24 via 10.8.0.13
Fri May 21 12:49:44 2021 WARNING: this configuration may cache passwords in memory -- use the auth-nocache option to prevent this
Fri May 21 12:49:44 2021 Initialization Sequence Completed
Fri May 21 12:49:47 2021 event_wait : Interrupted system call (code=4)
Fri May 21 12:49:47 2021 /sbin/ip route del 10.8.0.0/24
Fri May 21 12:49:47 2021 /sbin/ip route del VPNIP/32
RTNETLINK answers: No such process
Fri May 21 12:49:47 2021 ERROR: Linux route delete command failed: external program exited with error status: 2
Fri May 21 12:49:47 2021 /sbin/ip route del 0.0.0.0/1
Fri May 21 12:49:47 2021 /sbin/ip route del 128.0.0.0/1
Fri May 21 12:49:47 2021 Closing TUN/TAP interface
Fri May 21 12:49:47 2021 /sbin/ip addr del dev tun0 local 10.8.0.14 peer 10.8.0.13
Fri May 21 12:49:48 2021 SIGTERM[hard,] received, process exiting
 
I'm missing the `ip r` output from inside the VM (and from the node).

Hosting a Proxmox cluster at OVH myself, I'd advise putting the /27 on the vRack and giving the VPN an IP on that side; that solves a few other issues.

I noticed this in the logs:

Code:
Fri May 21 12:49:44 2021 /sbin/ip route add VPNIP/32 via 100.64.0.1
Error: Nexthop has invalid gateway.


Also, I'm missing the typical `ip route add OVHGW/32 dev vmbr0` that is used with the /32 host IP.
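
For what it's worth, "Nexthop has invalid gateway" generally means the kernel has no connected or link-scope route covering the address given after `via`. A minimal sketch of the host route mentioned above, assuming the gateway is the 100.64.0.1 that the log reports as ROUTE_GATEWAY and the bridge is vmbr0 (the function names are made up for illustration):

```shell
#!/bin/sh
# Sketch only: the gateway address and bridge name are assumptions
# taken from the log and config posted in this thread.

# Print a post-up line for the vmbr0 stanza of /etc/network/interfaces
# that makes the gateway reachable on-link.
gw_post_up_line() {
    printf 'post-up ip route add %s/32 dev %s\n' "$1" "$2"
}

# Apply the same route right away (needs root). "replace" is used so
# that an already existing route does not cause an error.
add_gw_host_route() {
    ip route replace "$1/32" dev "$2"
}
```

Running `add_gw_host_route 100.64.0.1 vmbr0` as root is enough for a one-off test; `gw_post_up_line 100.64.0.1 vmbr0` prints the line to add to make it persistent. Whether this alone fixes the tunnel on this server is not something the thread confirms.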

Other than that, fire up byobu/screen/tmux inside the VM and start a ping to the VPN peer. Then go to the Proxmox node and start a tcpdump on the tap interface of the VM; at this point you should see the ICMP request and response packets. Open another window in the VM's byobu/screen/tmux session and start the VPN.

(1) Check that the ping is still running. If not, you'll need to force-route that IP directly out of the VM's interface.
(2) If things still don't work (i.e. the ping is running and OpenVPN has started), add another tmux/screen/byobu window and ping the TUN link's local IP and, if that works, the remote IP of the TUN link. If that fails, keep it pinging and go back to the tcpdump of the tap interface on the Proxmox node. You should see:
a) the ping request/response packets of the VPN peer
b) the encrypted packets of OpenVPN
Q: Are the packets in b) going in one direction only, or are they both sent and received?
If one direction only, fix the NAT/routing on the remote peer.
If you see both directions, run a tcpdump on the tun interface inside the VM in yet another tmux/byobu/screen window. If you see the packets there, you'll need to fix the routing/NAT inside the VM.

I run VPN VMs using tinc, OpenVPN, IPsec and ZeroTier (on FortiGate VMs and Debian VMs) on top of Proxmox servers without glitches. The biggest thing is to get the routing in and out right; then things just work(TM).

PS: Remember that with OVH on the public interface you need the right MAC addresses. Remember to assign one for the VM in the API/console; otherwise, use the vRack interface.
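
The capture steps above can be sketched as a few shell helpers. This is only a sketch under assumptions: `tap100i0` follows Proxmox's `tap<VMID>i<N>` naming for the VMID 100 mentioned earlier, `VPNIP` stands for the redacted peer address, UDP port 443 is taken from the posted logs, and the helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; names and defaults are illustrative only.

# Build a capture filter that matches pings to/from the peer plus the
# OpenVPN transport packets (UDP port 443 in the posted logs).
vpn_filter() {
    peer=$1
    port=${2:-443}
    printf 'host %s and (icmp or udp port %s)' "$peer" "$port"
}

# On the Proxmox node: watch the VM's tap interface.
# -n disables name resolution, -i selects the interface.
capture_on_tap() {
    tap=$1
    peer=$2
    tcpdump -n -i "$tap" "$(vpn_filter "$peer")"
}

# Inside the VM: watch the unencrypted side of the tunnel.
capture_on_tun() {
    tcpdump -n -i "${1:-tun0}"
}
```

For example, `capture_on_tap tap100i0 VPNIP` on the node and `capture_on_tun` inside the VM, each in its own tmux window, both run as root. Seeing UDP packets in only one direction on the tap interface corresponds to the "one direction" case above.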
 
Thank you for the reply, but the OVH vRack is not an option for me right now: most of my IP blocks are /30s, and wouldn't the vRack reserve two of those IPs?


ip r
Code:
default via 100.64.0.1 dev vmbr0 proto kernel onlink
IPBLOCK/27 dev vmbr0 scope link
IPBLOCK/30 dev vmbr0 scope link
192.168.0.0/16 dev bond1 proto kernel scope link src 192.168.0.120

Inside VM:
Code:
default via 100.64.0.1 dev eth0
100.64.0.1 dev eth0 scope link

I will work on getting the other data requested
 
Don't be stingy; just fork out the money (once-off) for a /27 or /26. My consulting invoice would come to more than the 3 IPs (broadcast, network and gateway) are worth.

default via 100.64.0.1 dev vmbr0 proto kernel onlink

100.64.0.1 - where is that?

You should have the OVH gateway in there AFAIK, unless you are doing MAC bouncing. In that case I'd again advise using the console/API to assign a virtual MAC instead; then you get "direct" routing rather than MAC bouncing.
 
It's not about being cheap; I am using these IP blocks already. It also doesn't explain why the VPN is not working on the node itself.

100.64.0.1 is the IP I got from booting the server into rescue mode. These new Scale range servers are different; IPv6, for example, only works using gateway fe80::1. I have also tried 51.195.234.254 as the gateway (the node IP is 51.195.234.x), without success.
 
Okay, Scale - that's new. I guess they had to set up new/different networks and, to save a few extra IPs, use an "internal network" with that 100.64.0.1.

I notice an IP on bond0 - I would've expected that bond0 would've been attached to the vmbr0 without an IP as the IP should be vmbr0, shouldn't it?

But back to the problem: first confirm that the full networking stack works before adding the VPN. After that, tcpdump will be your friend to confirm/deny/debug the traffic that is leaving, along with its MACs. On the host/node, you might need to snoop on both member interfaces as well as bond0 to see all the traffic.
Good luck. I'm awaiting some client upgrades, then I'll be looking at a couple of Scale-2s myself.
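
A rough sketch of that capture, assuming the interface names from the config at the top of the thread (bond0 with members enp193s0f0 and enp193s0f1); `VPNIP` is the redacted peer address and the helper names are hypothetical. With 802.3ad, the outgoing and incoming packets of one flow may use different member NICs, which is why each member is listed.

```shell
#!/bin/sh
# Sketch: interface names come from the /etc/network/interfaces posted
# in this thread; the helper names are hypothetical.

# Print the bond and its member NICs, one per line.
bond_capture_ifaces() {
    bond=$1
    shift
    printf '%s\n' "$bond" "$@"
}

# Capture a limited number of packets (default 20) to or from the peer
# on one interface. -e prints the link-level header, so the source and
# destination MAC addresses are visible.
capture_with_macs() {
    iface=$1
    peer=$2
    tcpdump -e -n -c "${3:-20}" -i "$iface" host "$peer"
}
```

As root on the node, something like `for i in $(bond_capture_ifaces bond0 enp193s0f0 enp193s0f1); do capture_with_macs "$i" VPNIP; done` walks through the interfaces one after another while the ping and the VPN are running.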
 
Some progress. This seems to be an issue with CentOS on OVH. After I noticed it was working on a Windows VM using DHCP, I tried the same on both CentOS 7 and Debian 10. Debian 10 and Windows both work flawlessly; CentOS, however, still has the problem.

Debian 10:

ip a
Code:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 02:00:00:ff:3b:ea brd ff:ff:ff:ff:ff:ff
    inet 145.239.xxx.xxx/32 brd 145.239.xxx.xxx scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::ff:feff:3bea/64 scope link
       valid_lft forever preferred_lft forever

ip r
Code:
default via 51.195.234.254 dev eth0
51.195.234.254 dev eth0 scope link

After connecting to VPN:
ip r
Code:
0.0.0.0/1 via 10.8.0.9 dev tun0
default via 51.195.234.254 dev eth0
10.8.0.0/24 via 10.8.0.9 dev tun0
10.8.0.9 dev tun0 proto kernel scope link src 10.8.0.10
VPNIP via 51.195.234.254 dev eth0
51.195.234.254 dev eth0 scope link
128.0.0.0/1 via 10.8.0.9 dev tun0


CentOS 7:

ip a
Code:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 02:00:00:ff:3b:ea brd ff:ff:ff:ff:ff:ff
    inet 145.239.xxx.xxx/32 brd 145.239.xxx.xxx scope global dynamic eth0
       valid_lft 21599969sec preferred_lft 21599969sec
    inet6 fe80::ff:feff:3bea/64 scope link
       valid_lft forever preferred_lft forever

ip r
Code:
default via 51.195.234.254 dev eth0
51.195.234.254 dev eth0 scope link

After connecting to VPN:
ip r
Code:
0.0.0.0/1 via 10.8.0.9 dev tun0
default via 51.195.234.254 dev eth0
10.8.0.0/24 via 10.8.0.9 dev tun0
10.8.0.9 dev tun0 proto kernel scope link src 10.8.0.10
VPNIP via 51.195.234.254 dev eth0
51.195.234.254 dev eth0 scope link
128.0.0.0/1 via 10.8.0.9 dev tun0

They are both using DHCP.

OVH guides say to add the following to /etc/sysconfig/network-scripts/route-eth0, but it does not help, nor do the routes show up in 'ip r':

Code:
51.195.234.254 - 255.255.255.255 eth0
51.195.234.0 - 255.255.255.0 eth0
default 51.195.234.254

The following also had no success:

Code:
51.195.234.254 dev eth0
default via 51.195.234.254
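
Since the two routing tables look identical, one further check (not something tried in this thread) is to ask the kernel which interface it would actually pick for a given destination. A small sketch with hypothetical helper names:

```shell
#!/bin/sh
# Sketch with hypothetical helper names.

# Read `ip route get` output on stdin and print the egress interface,
# i.e. the word that follows "dev" on the first line containing it.
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Report which interface the kernel would use for a destination.
egress_for() {
    ip route get "$1" | route_dev
}
```

With the VPN up, `egress_for VPNIP` should print eth0 according to the tables above, and `egress_for 8.8.8.8` should print tun0, since 8.8.8.8 falls inside 0.0.0.0/1. A different answer on CentOS than on Debian would point at policy routing or at a route that `ip r` does not show, since it lists only the main table.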
 