Hello everyone!
I am trying to set up a PVE cluster at home with 2 machines. A couple of months ago I managed to get a basic setup going with 8.0-2, and I am now trying with 8.1-1, but there seems to be a new networking problem that I did not encounter before.
After the cluster is created, I add an EVPN network to it. If I understand correctly, this solution will allow my VMs (I used this image) to talk to each other even if they are running on separate nodes. In addition, I add a BGP controller that automatically gives my laptop a route to these VMs. After this I verified that I can ssh to the VMs from the laptop, and that the VMs can ping each other while they are scheduled on different nodes (but I did not try anything else).
I do not know if it is important, but I set up static IP addresses on the VMs via Cloud-Init.
Then I tried to install some packages on the VMs, and they reported that DNS is not working, like this:
Bash:
# host google.com
;; communications error to 8.8.8.8#53: timed out
;; communications error to 8.8.8.8#53: timed out
;; communications error to 8.8.4.4#53: timed out
;; no servers could be reached
I launched tcpdump on the node hosting the VM and verified that these requests do get a response (I named the zone "evpn1"):
Bash:
# tcpdump -i any src host dns.google and dst host 10.15.0.30 and port 53
21:46:53.065892 xvrf_evpn1 Out IP dns.google.domain > 10.15.0.30.36548: 59133 1/0/0 A 142.250.186.174 (44)
21:46:53.065895 xvrfp_evpn1 In IP dns.google.domain > 10.15.0.30.36548: 59133 1/0/0 A 142.250.186.174 (44)
I do not know much about VRF, so I am not sure what to check next. I also added a "Simple" zone to SDN and attached the VM to this network, and DNS and everything else started to work through this network. Because of that, I kept the configuration with 2 networks.
Then I needed my VMs to talk to each other over EVPN (an HTTPS request), and again they could not (curl hangs for a while and then fails):
Bash:
# curl https://10.15.0.30:6443 -v
* Trying 10.15.0.30:6443...
* Connected to 10.15.0.30 (10.15.0.30) port 6443 (#0)
* ALPN: offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* CAfile: /etc/ssl/certs/ca-certificates.crt
* CApath: /etc/ssl/certs
* Recv failure: Connection reset by peer
* OpenSSL SSL_connect: Connection reset by peer in connection to 10.15.0.30:6443
* Closing connection 0
curl: (35) Recv failure: Connection reset by peer
I tried lowering the MTU on the VMs' network adapters from the default 1500 to 1450 (this is what VXLAN uses, though I am not sure this change makes sense), and the VMs were then able to communicate with each other. I then disconnected the "Simple" network to check whether communication with the outside world would work ("host google.com"), but it did not.
I also verified that the VMs can talk to each other with MTU 1500 when they run on the same node. This is probably a useless check though.
I do not remember this issue on 8.0-2, but of course maybe I did something differently this time (though I cannot figure out what, haha). Sorry for the long post!
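For reference, this is where I believe the 1450 figure comes from. It is only a rough sketch, assuming an IPv4 underlay and physical NICs left at the default MTU of 1500 (both are assumptions about my setup, and the variable names are just for illustration). Traffic between VMs on the same node should not be VXLAN-encapsulated at all, which may be why MTU 1500 works there.

```shell
# VXLAN encapsulation overhead on an IPv4 underlay, in bytes.
outer_ip=20      # outer IPv4 header
outer_udp=8      # outer UDP header
vxlan_hdr=8      # VXLAN header
inner_eth=14     # inner Ethernet header carried inside the tunnel

underlay_mtu=1500   # assumption: physical NICs use the default MTU
overhead=$((outer_ip + outer_udp + vxlan_hdr + inner_eth))
inner_mtu=$((underlay_mtu - overhead))

# Largest ICMP payload that still fits in inner_mtu:
# subtract the inner IPv4 header (20) and the ICMP header (8).
ping_payload=$((inner_mtu - 20 - 8))

echo "overhead=$overhead inner_mtu=$inner_mtu ping_payload=$ping_payload"

# Possible probe from one VM to the other (not run here):
#   ping -M do -s "$ping_payload" 10.15.0.30
```

With the "do not fragment" probe in the last comment, a payload of that size should pass between nodes if the path really carries 1450-byte inner packets, while a larger one (for example 1472, which fills a 1500-byte packet) would be expected to fail.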