NAME
gyptazy.ch

OPTIONS

CONTENT
Howto Create a Cheap Multi Site High Availability Setup with a Wireguard Tunnel (2024-05-12):

In today's interconnected world, high availability (HA) across multiple sites is crucial for businesses that need continuity and reliability. While traditional solutions often rely on more complex setups such as BGP or GRE tunnels, a cost-effective HA setup built on WireGuard VPN tunnels is a simpler yet robust alternative. It uses WireGuard to create secure connections between multiple sites, is easy to set up and manage, and keeps the overall costs low.

Within this solution, all traffic terminates on public IPv4 and IPv6 addresses and is routed or NATed (depending on the operator's preference) to the desired endpoint. Any endpoint can be used, even one behind a (CG)NAT, because the site endpoint initiates the VPN connection to the main WireGuard VPN server, which holds the public IPs. In this example, a simple webserver is made available across multiple sites and availability zones around the world.

An alternative would be to update the domain's DNS records dynamically. However, that brings additional problems with DNS caching, which can remain an issue even with a very low TTL. The solution described here takes effect immediately and may also benefit from the TCP retransmission window, so that even switching a site will often look seamless from a client's perspective. Why I'm saying often here and not always will be explained later.

High Level Description
Clients request web content from gyptazy.ch on ports tcp/80 and tcp/443. The matching records are looked up via DNS, and a resolver obtains the following records:

    IPv4: gyptazy.ch. IN    A 94.247.42.35
    IPv6: gyptazy.ch. IN AAAA 2a0b:7140:4::1337

Those IPs are handled by a small pair of redundant FreeBSD servers using pf, pfsync, CARP and WireGuard, or by a typical Linux system with heartbeat and WireGuard (optionally with conntrack sync). In this setup all requests are NATed, but other approaches can be used to avoid NAT for both IPv4 and IPv6. In this scenario, all requests for tcp/80 and tcp/443 on the public IPs are forwarded to internal addresses: RFC 1918 private addresses for IPv4 and RFC 4193 Unique Local Addresses (ULA) for IPv6.

    94.247.42.35 -> 10.0.0.2
    2a0b:7140:4::1337 -> fc00::2
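
A forwarding table like this is easy to get wrong when it is copied between files, so a small sanity check can help. The following is only a sketch: the `FORWARDS` table and the function names are hypothetical, and the internal targets are assumed to be the tunnel addresses used in the WireGuard configuration of this article (10.0.0.2 and fc00::2).

```python
import ipaddress

# RFC 1918 private IPv4 ranges and the RFC 4193 ULA range.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("fc00::/7"),
]

# Public address -> internal target (hypothetical table for illustration).
FORWARDS = {
    "94.247.42.35": "10.0.0.2",
    "2a0b:7140:4::1337": "fc00::2",
}


def is_internal(address):
    """Return True if the address lies in an RFC 1918 or ULA range."""
    ip = ipaddress.ip_address(address)
    return any(ip.version == net.version and ip in net for net in INTERNAL_NETWORKS)


def check_forwards(forwards):
    """Return a list of problems; an empty list means the table looks sane."""
    problems = []
    for public, internal in forwards.items():
        pub = ipaddress.ip_address(public)
        inner = ipaddress.ip_address(internal)
        if pub.version != inner.version:
            problems.append(f"{public} -> {internal}: address families differ")
        if is_internal(public):
            problems.append(f"{public}: expected a public address")
        if not is_internal(internal):
            problems.append(f"{internal}: expected an RFC 1918 or ULA address")
    return problems
```

Calling `check_forwards(FORWARDS)` returns an empty list for the mapping above; a mixed-up entry such as an IPv4 address forwarded to an IPv6 target would be reported.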

The WireGuard client VPNs run directly on each webserver system (it is also easily possible to use a dedicated router system to reach all systems within a site). This means that any webserver that initiates the VPN connection to the HA router holding the public IPs will start serving the content. To automate the switches and elect a leading system, solutions like CARP, keepalived, heartbeat, corosync or others can be used. Which one fits depends on the underlying network infrastructure, and it will mostly require a mesh VPN between all instances or another way to elect the leading system. Solutions and details will be explained later.

High Level Network Plan

Gyptazy Wireguard Multi Site HA Diagram

Wireguard Configuration
The most important part is the WireGuard configuration, which lets us tunnel the traffic securely and encrypted. WireGuard generally provides better throughput than OpenVPN while still relying on modern, strong cryptography. While this setup can also be built with OpenVPN, WireGuard is the recommended solution.

Server:
The server configuration is pretty simple. Make sure IPv6 is in place, and define the right WireGuard tunnel interface, the correct network interface and the NAT rules. This example configuration is for Linux-based systems using iptables and ip6tables:

    [Interface]
    Address = 10.0.0.1/24
    Address = fc00::1/64
    SaveConfig = true
    PostUp = ufw route allow in on wg0 out on eth0
    PostUp = iptables -t nat -I POSTROUTING -o eth0 -j MASQUERADE
    PostUp = ip6tables -t nat -I POSTROUTING -o eth0 -j MASQUERADE
    PostUp = iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -d 94.247.42.35 -j DNAT --to-destination 10.0.0.2
    PostUp = iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 443 -d 94.247.42.35 -j DNAT --to-destination 10.0.0.2
    PostUp = ip6tables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -d 2a0b:7140:4::1337 -j DNAT --to-destination fc00::2
    PostUp = ip6tables -t nat -A PREROUTING -i eth0 -p tcp --dport 443 -d 2a0b:7140:4::1337 -j DNAT --to-destination fc00::2
    PreDown = ufw route delete allow in on wg0 out on eth0
    PreDown = iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE
    PreDown = ip6tables -t nat -D POSTROUTING -o eth0 -j MASQUERADE
    PreDown = iptables -t nat -D PREROUTING -i eth0 -p tcp --dport 80 -d 94.247.42.35 -j DNAT --to-destination 10.0.0.2
    PreDown = iptables -t nat -D PREROUTING -i eth0 -p tcp --dport 443 -d 94.247.42.35 -j DNAT --to-destination 10.0.0.2
    PreDown = ip6tables -t nat -D PREROUTING -i eth0 -p tcp --dport 80 -d 2a0b:7140:4::1337 -j DNAT --to-destination fc00::2
    PreDown = ip6tables -t nat -D PREROUTING -i eth0 -p tcp --dport 443 -d 2a0b:7140:4::1337 -j DNAT --to-destination fc00::2
    ListenPort = 1337
    PrivateKey = $PRIVKEY

    [Peer]
    PublicKey = $PUBKEY
    AllowedIPs = 10.0.0.0/24, fc00::/64
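
Every DNAT rule in this configuration appears twice, once as PostUp with -A and once as PreDown with -D, and the two must stay identical or rules are left behind when the tunnel goes down. One way to keep them in sync is to generate both from a single definition. This is a minimal sketch, not part of the original setup; the helper name `dnat_lines` is hypothetical, and the interface name eth0 is taken from the configuration above.

```python
def dnat_lines(public_ip, target_ip, ports, iface="eth0"):
    """Build matching PostUp/PreDown DNAT lines for a wg-quick config."""
    # Use ip6tables for IPv6 addresses, iptables otherwise.
    tool = "ip6tables" if ":" in public_ip else "iptables"
    lines = []
    # -A appends the rule when the tunnel comes up, -D deletes it on shutdown.
    for hook, flag in (("PostUp", "-A"), ("PreDown", "-D")):
        for port in ports:
            lines.append(
                f"{hook} = {tool} -t nat {flag} PREROUTING -i {iface} -p tcp "
                f"--dport {port} -d {public_ip} -j DNAT --to-destination {target_ip}"
            )
    return lines


if __name__ == "__main__":
    for line in dnat_lines("94.247.42.35", "10.0.0.2", [80, 443]):
        print(line)
    for line in dnat_lines("2a0b:7140:4::1337", "fc00::2", [80, 443]):
        print(line)
```

The printed lines can be pasted into the [Interface] section; adding another port or another public IP then only requires changing one call.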


Client(s)
All the clients will receive the same configuration for their VPN connection to the root system holding the public IPs:

    [Interface]
    PrivateKey = $PRIVKEY
    Address = 10.0.0.2/24
    Address = fc00::2/64
    # Every PostUp rule has a matching PreDown rule so that nothing is left behind.
    PostUp = ufw route allow in on wg1 out on eth0
    PostUp = iptables -t nat -I POSTROUTING -o eth0 -j MASQUERADE
    PostUp = ip6tables -t nat -I POSTROUTING -o eth0 -j MASQUERADE
    PreDown = ufw route delete allow in on wg1 out on eth0
    PreDown = iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE
    PreDown = ip6tables -t nat -D POSTROUTING -o eth0 -j MASQUERADE

    [Peer]
    PublicKey = $PUBKEY
    AllowedIPs = 0.0.0.0/0, ::/0
    Endpoint = fw02-gw04.ch01.gyptazy.ch:1337
    PersistentKeepalive = 25
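
Because every site uses the same peer configuration, the HA router only ever sees one peer, and it can be useful to know whether any site is currently connected. The `wg show <interface> latest-handshakes` command prints one line per peer with the public key and the Unix timestamp of the last handshake (0 if there has never been one). The sketch below parses that output; the function names and the 180 second threshold are assumptions chosen for illustration, not values from the original setup.

```python
import time


def parse_latest_handshakes(output):
    """Map each peer public key to its last handshake time (Unix seconds)."""
    handshakes = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip empty or unexpected lines
        public_key, timestamp = parts
        handshakes[public_key] = int(timestamp)
    return handshakes


def peer_is_connected(last_handshake, now=None, max_age=180):
    """Treat a peer as connected if its last handshake is recent enough.

    max_age is an assumed threshold; a timestamp of 0 means the peer
    has never completed a handshake.
    """
    if last_handshake == 0:
        return False
    now = time.time() if now is None else now
    return now - last_handshake <= max_age
```

On the server, the input could come from `subprocess.run(["wg", "show", "wg0", "latest-handshakes"], capture_output=True, text=True, check=True).stdout`, which requires sufficient privileges to query the interface.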


Site Selection
At this point, every site is able to connect on its own, take the lead and serve the content behind the public IPs. This can be managed manually (which might fit some use cases, such as a cold standby setup), but it can also be automated. Quorum solutions are mechanisms for ensuring consistency and reliability in distributed systems, particularly where multiple nodes need to agree on a course of action. In a distributed environment it is crucial to maintain agreement among nodes even in the presence of failures or network partitions, where a system in a different site needs to initiate the connection to take the lead. As already mentioned, there are many ways to do this: a self-written shell or Python script is enough, but keepalived, heartbeat, corosync and others can be used as well.
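
As an illustration of the self-written script approach, a standby site could probe the public address and bring up its own tunnel after several consecutive failures. The following is a minimal sketch under stated assumptions: the class and function names are hypothetical, the threshold of three failures is arbitrary, and the interface name wg1 is taken from the client configuration. It does not implement a quorum, so two standby sites running it independently could both try to take over at the same time.

```python
import socket
import subprocess


def site_is_reachable(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class TakeoverDecider:
    """Signal a takeover after `threshold` consecutive failed probes."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0

    def observe(self, reachable):
        """Record one probe result; return True once takeover is warranted."""
        # A single successful probe resets the failure counter.
        self.failures = 0 if reachable else self.failures + 1
        return self.failures >= self.threshold


def take_over(interface="wg1"):
    """Bring up the local WireGuard tunnel so this site starts serving."""
    subprocess.run(["wg-quick", "up", interface], check=True)
```

A standby site would call `observe(site_is_reachable("gyptazy.ch"))` in a loop with a pause between probes and call `take_over()` once `observe` returns True. Requiring several consecutive failures avoids switching sites because of a single lost packet.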