Installing WireGuard

James F. Carter <jimc@jfcarter.net>, 2021-10-07

Some info about WireGuard, a new VPN:

I have just gone through yet another audit of my VPNs, making sure that they work for all relevant clients and that the vpn-tester program can competently report if they are or aren't working. Currently (2021) my two servers run StrongS/WAN IPSec (strongswan-ipsec-5.9.3 on SuSE Tumbleweed) and OpenVPN (openvpn-2.5.3 on SuSE Tumbleweed). The clients have Linux (same versions) and Android: strongSwan VPN Client (version 2.3.3, org.strongswan.android) and OpenVPN for Android (version 0.7.25, de.blinkt.openvpn). Both VPNs work well when properly configured, but they have a number of less than wonderful features:

Responding to shortcomings in existing VPN software, Jason A. Donenfeld in 2015 began to develop WireGuard, a new VPN. The project website describes these features; whether they're scored as good or bad depends on the user's goals.

Some details of the Elliptic Curve Diffie-Hellman (ECDH) key establishment procedure are interesting. See this Wikipedia article about ECDH, which I've summarized. See also the EdDSA (Edwards Curve Digital Signature Algorithm) article, the section on Ed25519.

What are my goals for the VPNs, and how much hassle will it be to make WireGuard deliver what I need, so I can add it to my collection?

There's an issue that makes a lot of trouble for designing a net with VPNs: some clients always use the VPN and some don't. I'm implicitly assuming a central server that all the clients work through. For the always VPN case the right routing setup is to assign the client's hostname to a fixed address on the VPN tunnel device. The same address is on the client's LAN interface, and by policy routing its peer(s) send bearer packets to that interface. Normally only the server would have that policy route, but there might also be multiple peers. The server always has, and advertises, a route to this client through its own VPN tunnel endpoint, and the rest of the LAN (non-peers) sends to the client via the server. My laptop and my cloud server operate this way.

The harder case is when the client sometimes uses the VPN, and sometimes doesn't, like my VPN tester and my cellphone. It's a total can of worms to set up a route via the server when the client connects, and to make this route go away when it disconnects, particularly when other LAN members need to originate connections to the VPN client, directly or via the server, depending. The way I'm handling this on the other VPNs is, the client's name is assigned to a fixed IP on its egress interface: Wi-Fi or Ethernet. Peers on the LAN connect to this non-VPN address, except that isn't possible if the cellphone is roaming (because peers don't know the cellular assigned address and I'm not going to mess with dynamic DNS on my cellphone). When the client turns on the VPN, it puts a separate IP address on the VPN endpoint, and the server has a permanent route to this address (or pool) via its own VPN endpoint, which it advertises all the time. Other LAN hosts can originate connections to the client's VPN address, but only when the client and its peer have the VPN turned on.

Let's make this design into something a little more concrete that I can turn into a WireGuard conf file.

For the symmetric cipher on the main channel, WireGuard uses only ChaCha20Poly1305, for which hardware acceleration is very rare. On the Intel Core® i5-10210U, jimc's tests score it as half as fast as hardware accelerated AES-256 (Rijndael), and twice as fast as software AES-256. This difference would only be significant for a server with thousands of clients.

https://www.wireguard.com/quickstart/

ip link add dev wg0 type wireguard      # Pick a name for the tunnel device
ip address add dev wg0 192.168.2.1/24   # If there is only 1 peer, you could instead use:
                                        #   ip address add dev wg0 192.168.2.1 peer 192.168.2.2
wg setconf wg0 myconfig.conf            # The wg utility is provided. --or--
# In wg set, the peer argument is the peer's base64 public key.
wg set wg0 listen-port 51820 private-key /path/to/private-key peer $PEER_PUBKEY \
    allowed-ips 192.168.88.0/24 endpoint 209.202.254.14:8172
ip link set up dev wg0

wg with no arguments is equivalent to wg show for all interfaces (e.g. wg0). The companion command is invoked as wg-quick [up|down|etc] ctlfile.

WireGuard wants Curve25519 ECDH (Elliptic Curve Diffie-Hellman) private and public keys; each is 32 bytes long, or 44 characters base64 encoded (43 plus a trailing = pad). The configuration file contains the base64 key itself, whereas wg set on the command line takes the name of a file containing it. The provided wg utility can generate them for you, like this:
wg genkey | tee privatekey | wg pubkey > publickey
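
A minimal sketch of wrapping that command so the key files are created with owner-only permissions, plus a sanity check on the result. The helper names gen_keypair and is_wg_key and the directory argument are illustrative assumptions, not part of wireguard-tools. The check relies only on the fact that a 32-byte key encodes to 43 base64 characters followed by one = pad.

```shell
#!/bin/sh
# Sketch only: gen_keypair and is_wg_key are hypothetical helper names.

# Create privatekey and publickey in directory $1, readable by owner only.
gen_keypair() {
    (
        umask 077                      # new files get mode 600
        cd "$1" || exit 1
        wg genkey | tee privatekey | wg pubkey > publickey
    )
}

# Succeed if $1 looks like a WireGuard key: 43 base64 characters plus "=".
is_wg_key() {
    body=${1%=}                        # strip the single trailing pad
    [ "${#1}" -eq 44 ] || return 1
    [ "${#body}" -eq 43 ] || return 1
    case $body in
        *[!A-Za-z0-9+/]*) return 1 ;;  # character outside the base64 alphabet
    esac
    return 0
}
```

For example, gen_keypair /etc/wireguard followed by is_wg_key "$(cat /etc/wireguard/publickey)" would catch a truncated or mangled key before it is pasted into a conf file.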

WireGuard does not use X.509 certificates to authenticate or authorize the peers; authorized public keys are preinstalled for each client-server pair, though wg can also install them on the fly.

You may test with their demo server.

So let's try to set something up. For testing, I'm starting this at 2021-10-07 18:00. I'm going to use these basic steps:

Android Client

Make sure there's a client for Android. Install it first but don't try to use it yet. Yes there is one, called WireGuard, with the serpent logo (®). Inception 2019-10-13, most recent update 11 days ago, 5e5 downloads, offered by WireGuard Development Team. You could import a configuration from a file, or a QR code (!), or create it by hand. I looked at the required info but didn't create my connection. 7 mins including reading the product info.

How to get the QR code that the Android client can import. This is from the Arch Linux wiki article about WireGuard (https://wiki.archlinux.org/title/WireGuard). On the Linux desktop host that has the conf file:
qrencode -o outfile -t ansiutf8 -r client.conf
If you omit -o outfile or specify -o - the result is on standard output, and if this is a terminal that can display ANSI UTF-8 characters (see the -t option), the QR code itself becomes visible. You may need to make the window wider and/or higher to avoid wrapping lines. Suppress long comments; the maximum size is 4000 characters. qrencode is from package qrencode on OpenSuSE Tumbleweed.
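
One way to suppress the comments before encoding, as a sketch. The helper name strip_conf_comments is an assumption of this example. It treats everything from # to the end of the line as a comment, which is safe for keys and addresses since base64 and IP syntax never contain that character. The result is piped to qrencode on standard input rather than read with -r.

```shell
#!/bin/sh
# Sketch only: strip_conf_comments is a hypothetical helper name.

# Remove "#" comments and blank lines from a conf file read on stdin.
strip_conf_comments() {
    sed -e 's/[[:space:]]*#.*$//' -e '/^[[:space:]]*$/d'
}

# If a conf file is named on the command line and qrencode is installed,
# show the QR code for the stripped file on the terminal.
if [ -n "${1:-}" ] && command -v qrencode >/dev/null 2>&1; then
    strip_conf_comments < "$1" | qrencode -t ansiutf8
fi
```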

Install on Surya and Petra Xena

The required kernel module is called wireguard.ko and it is in the standard kernel, version 5.14.11 and quite a bit earlier (WireGuard was merged into the mainline kernel in 5.6). To pass configuration information to it (plus displaying connection info and generating keys) you need wireguard-tools (current version as this is written is 1.0.20210914) from the OpenSuSE Tumbleweed main distro. Older versions are available for Leap 15.3 and 15.2. 72 KB to download, 145 KB installed. No dependent packages; it only requires systemd and libc. The package only contains the wg and wg-quick commands, and documentation.

wg-quick is a wrapper around wg for simple configurations. When wg-quick is given just an interface name such as wg0, the corresponding configuration file is sought in /etc/wireguard/wg0.conf, whereas if an absolute pathname is given the interface is inferred from the basename of the conf file. The interface name may be up to 15 bytes of [a-zA-Z0-9_=+.-] . (You don't specify the interface name inside the conf file.)
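
The naming rule can be checked before a conf file is installed. This is a sketch only; valid_ifname and conf_to_ifname are hypothetical helper names, and they mirror the rule as stated above rather than calling wg-quick itself.

```shell
#!/bin/sh
# Sketch only: valid_ifname and conf_to_ifname are hypothetical helper names.

# Succeed if $1 is 1 to 15 characters drawn from [a-zA-Z0-9_=+.-].
valid_ifname() {
    [ -n "$1" ] && [ "${#1}" -le 15 ] || return 1
    case $1 in
        *[!a-zA-Z0-9_=+.-]*) return 1 ;;
    esac
    return 0
}

# Print the interface name implied by a conf file path such as /tmp/wg0.conf.
conf_to_ifname() {
    name=$(basename "$1" .conf)
    valid_ifname "$name" && printf '%s\n' "$name"
}
```

A per-peer name such as wg-jacinth passes, while a name longer than 15 characters is rejected before wg-quick ever sees it.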

On Xena I also installed NetworkManager-wireguard plus NetworkManager-wireguard-gnome (you need both for the GUI). These are experimental packages, not in the main distro. Find them with the SuSE package searcher. Depends on wireguard-tools. Most likely you don't have the developer's package signing public key; either get it, or ignore Zypper's security warning.

About 20min to install the packages and read the man pages.

Configuration Files and Key Pairs

A prerequisite is, what port am I going to use? WireGuard doesn't have an IANA port assignment, but the documentation often shows 51820, and forum posts and howtos usually show this one too. But this port range (all above 32768) is for aleatory (ephemeral) ports, and a collision could occur. The BSD Daemon whispered in my ear that since OpenVPN has 1194 assigned, WireGuard should use xx96. Unassigned and stealable port numbers are 2196 4196 4296 4496 4696 4796 4896 4996 5096 and most candidates above this. 42xx is completely vacant and appears to be intended for private use, and I have a local policy to put nonstandard ports in this range, so 4296 is what I will use. I will need to set my firewall to pass 4296/udp in the same cases as it passes 1194/udp.
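
A small sketch of the collision check described above. The helper names port_in_range and port_is_ephemeral are assumptions of this example; the /proc file is where Linux publishes its ephemeral port range, which is usually 32768 to 60999.

```shell
#!/bin/sh
# Sketch only: port_in_range and port_is_ephemeral are hypothetical helpers.

# Succeed if port $1 lies within [$2, $3].
port_in_range() {
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Succeed if port $1 is inside this kernel's ephemeral port range.
# Returns 2 if the range cannot be read (e.g. not on Linux).
port_is_ephemeral() {
    read -r lo hi < /proc/sys/net/ipv4/ip_local_port_range || return 2
    port_in_range "$1" "$lo" "$hi"
}
```

With the usual Linux defaults, port_is_ephemeral 51820 succeeds (a collision is possible) and port_is_ephemeral 4296 fails (the port is outside the range).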

On the other hand, for the initial tests (that might fail) I don't want to mess with the firewall, so I'll use 4886, the unofficial wakeup port for Android, which my firewall passes from+to the local LAN so the Android hosts can wake each other up.

Here is the client's configuration file for testing. See the genkey subcommand of wg for producing your keys. The conf file contains your private key (not encrypted), so it should have appropriately restrictive permissions, mode 600. /etc/wireguard is installed with mode 700, but I set the individual conf files to 600 anyway. See the man page for wg for a small number of additional configurable parameters such as the keepalive interval, if your net needs it.

[Interface]
PrivateKey = qwerty...=		# 43 base64 bytes, about 256 bits.  Keep the =.
ListenPort = 4886		# Android wakeup port, which my firewall 
				# allows, but I'll have to change this later.

[Peer]
PublicKey = asdfgh...=		# 43 base64 bytes, about 256 bits.
Endpoint = [2600:3c01:e000:306::8:1]:4886	# IPv6 in [], port after colon
AllowedIPs = 147.75.79.213/32,2604:1380:1:4d00::5/128	# www.zx2c4.com.
# There can be multiple peers.  

About 25min + to figure out the conf file.

Xena and Surya to Test Server

Starting about 16:10

The SuSE package wireguard-tools does not include the scripts mentioned in the quick start guide for contacting the demo server.

When wg is used to bring up the connection, it loads the wireguard kernel module, nine crypto modules (that the documentation says it actually uses), udp_tunnel and ip6_udp_tunnel.

Debugging Petra's networking took extra time, but once I switched to test on Xena it took about 10 minutes to turn on WireGuard and do the tests.

I repeated these steps on Surya. The two test activities succeeded.

Everywhere, Install Software, Keys and Conf File

Given how my VPN tester is designed, it's a whole lot easier if every host has WireGuard installed, specifically wireguard-tools. Doing that now.

OpenVPN and StrongS/WAN assign the client an IP address from a pool, similar to DHCP. But my tunnels are very predictable, so my first plan was to pre-assign IPs to potential WireGuard participants, all on the same subnet. Instead, I'm making new address ranges for WireGuard tunnel endpoints: 192.9.200.112/28 (16 addresses) and 2600:3c01:e000:306::9:0/112. The addresses are assigned according to a pattern, but most likely I will get them into /etc/hosts soon.
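
To see what a /28 actually provides, here is a sketch that enumerates an IPv4 block. The helper name list_block is hypothetical, and for simplicity it assumes a prefix length of 24 to 32 so that only the last octet varies.

```shell
#!/bin/sh
# Sketch only: list_block is a hypothetical helper name.

# Print every address in an IPv4 block given as a.b.c.d/len, len 24..32.
list_block() {
    addr=${1%/*}
    len=${1#*/}
    base=${addr%.*}                    # first three octets
    last=${addr##*.}                   # fourth octet
    size=$(( 1 << (32 - len) ))        # number of addresses in the block
    start=$(( last / size * size ))    # align down to the block boundary
    i=0
    while [ "$i" -lt "$size" ]; do
        printf '%s.%d\n' "$base" $(( start + i ))
        i=$(( i + 1 ))
    done
}
```

Running list_block 192.9.200.112/28 prints the 16 addresses from 192.9.200.112 through 192.9.200.127, which could be pasted into /etc/hosts alongside host names.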

Each host gets a key pair and a generic conf file with Jacinth as its peer (server) (except Jacinth itself).
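
A generic conf file of this kind can be produced from a handful of parameters. The following is only a sketch of the idea, not the script actually used on these hosts; gen_client_conf and its argument order are hypothetical.

```shell
#!/bin/sh
# Sketch only: gen_client_conf and its argument order are hypothetical.

# Print a client conf on stdout.
#   $1 private key   $2 listen port   $3 server public key
#   $4 server endpoint (host:port)    $5 AllowedIPs list
gen_client_conf() {
    cat <<EOF
[Interface]
PrivateKey = $1
ListenPort = $2

[Peer]
PublicKey = $3
Endpoint = $4
AllowedIPs = $5
EOF
}
```

Since the output contains the private key, it should be redirected into the conf file under umask 077, for example ( umask 077; gen_client_conf ... > /etc/wireguard/wg0.conf ).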

Connect Xena to Surya

This turned into a long and time-consuming learning experience. I'm condensing a lot of failures and listing the high points:

Install on Jacinth

I wrote a script to generate conf files and up scripts on each host. It follows the design plans for the special features on particular hosts. This way, issues are not forgotten and chewed-up configurations can be regenerated at will. All hosts now have their proper keys, configurations and up scripts.

  • Petra to Jacinth: no response.
  • Claude to Jacinth: Routes: 192/26 dev en0; 128/25 dev wg0. To Surya, pings to $pfx::8:2 are answered but not to $pfx::8:1.
  • Xena to Claude: IPv6 only. Ditto Surya.
  • Jacinth + Iris to Claude: pings IPv4+6. Can't tell if offsite connections are dnatted to Claude via WG or vnet0.
  • Holly to Jacinth: pinging claude diamond iris jacinth via main LAN: works. Pinging petra xena surya via WireGuard: no answer.
  • xena->holly trcr -6: ov_u_j.cft.ca.us (1:1), holly (i.e. via WG).
  • xena->holly trcr -4: ov_u_j.cft.ca.us (129), nothing thereafter. IPv4 on Jacinth sends this via br0.

Got to implement "if client is using WG, route to it; if not, route via br0". Method 1: every bearer packet on the WG port of type 1 (content inspection) is cloned with mirred to some netlink socket.

Oso to Jacinth VPN

I have two types of clients: those that always use WireGuard, and those that sometimes use WireGuard. To deal with routing issues, Xena <-> Jacinth and Jacinth <-> Surya always need the VPN, whereas Selen (Android) uses it only when roaming (and when access to the local LAN is wanted). The latter scenario is the natural one for OpenVPN and IPSec, so I've been focused on that so far, but making it work is going to be hard with WireGuard, so I've decided to switch over to the always on paradigm, at least at first. Xena, Jacinth and Surya are the most important hosts on my net, and it's not acceptable to knock them out with VPN experiments. Among my other VM's, Claude (the webserver) is also mission-critical, and Petra is hosted on Xena and is affected by its networking. So to get this project moving, I revived a disused VM called Oso, hosted on Iris (a leaf node) with bridge networking, so it is effectively an independent leaf node.

For the first try I'm going to have, for each client, an individual interface (wg-$PEER) with individual addresses from 192.9.200.96/28 and 2600:3c01:e000:306::10:0/112. Later I'll try doing the tunnels on a shared interface like I originally planned.

For the first try on Oso I set up Oso with AllowedIPs = 192.9.200.106, 2600:3c01:e000:306::10:10 (just Jacinth's WireGuard interface addresses for Oso), and Jacinth had AllowedIPs = 192.9.200.122, 2600:3c01:e000:306::9:10 (Oso's WireGuard interface addresses). Oso's firewall was rejecting bearer packets on 4296/udp. This fixed, I could ping the peer's interface addresses, both families, both directions.

Next try is to add Oso's own addresses to AllowedIPs on Jacinth, and just Xena's subnet on Oso. For reconfiguring I'm going to take down WireGuard on both ends first, rather than trying to run wg-quick with a running configuration, since I'm expecting trouble on this one. Yes, Jacinth and Oso can't ping each other, because Jacinth tries to send the bearer packets to Oso via the tunnel that they're bearing. wg-quick has a limited ability to activate policy routing for the bearer packets, but this configuration is not recognized as needing it.

Next try: Jacinth AllowedIPs = Oso WG addresses + 2600:3c01:e000:306::d4/128 (Oso's own IP); Oso is unchanged with Jacinth's WG addresses + Xena subnet. Jacinth can ping all the Oso AllowedIPs mentioned, and so can Oso. Xena and Petra can ping Oso's IPv4+6 WG address, but Xena needs to specify its public IP in the -I option of ping (source address) because that's what's in the AllowedIPs on Oso, vs. the endpoint of Xena's tunnel to Jacinth. For traceroute this would be the -s option.

Next try: a script that implements the WireGuard Evolution item for bearer packets down the tunnel. Trying it first on Oso. It works, but didn't solve my problems.

Here are the key principles that I finally worked out, for making a configuration file that gets the packets through.

Xena to Jacinth VPN

Using the newly installed NetworkManager plugin for WireGuard. Get Xena back on the net.

What's required by the WireGuard plugin:

Let's think about a design that will apply to all hosts. As set up now on Xena only, /etc/NetworkManager/dispatcher.d/ includes my script that starts OpenVPN. All these scripts are run whenever interfaces go up or down and the scripts decide if they need to do something about it. This script would have to change to bring up WireGuard. Let's concentrate on a design for the other hosts, then adapt it to work with NetworkManager.

What files and scripts do we need to make this all work?

WireGuard has a big problem if the client sometimes has WireGuard running, and sometimes expects to be contacted on the local LAN. Think of the backup host collecting changed files from the client. Basically, the sysop needs to direct the backup collector to the client's WireGuard or non-WireGuard address, depending on whether or not it's off-site and using the VPN.
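
One possible way for the backup collector to choose between the two addresses is to look at the age of the peer's last handshake, which wg show wg0 latest-handshakes reports on the server as an epoch time per public key. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the 180 second threshold is a guess to be tuned, and it only works if the client sends traffic or keepalives often enough to keep the handshake fresh.

```shell
#!/bin/sh
# Sketch only: handshake_fresh and choose_addr are hypothetical helper names,
# and the 180-second default threshold is an assumption.

# Succeed if a handshake at epoch time $1 is recent relative to "now" ($2).
# An epoch of 0 means the peer has never completed a handshake.
handshake_fresh() {
    [ "$1" -gt 0 ] && [ $(( $2 - $1 )) -le "${3:-180}" ]
}

# Print the address to use: $3 (WireGuard) if the handshake is fresh,
# otherwise $4 (the client's LAN address).
choose_addr() {
    if handshake_fresh "$1" "$2"; then
        printf '%s\n' "$3"
    else
        printf '%s\n' "$4"
    fi
}
```

The caller would pass the epoch from the wg output as the first argument and the output of date +%s as the second.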

Configure and Test Android Client

Vpn-Tester to Test WireGuard

Segment Tunnel

This is the tunnel from Jacinth to Surya, currently operated by OpenVPN. Jacinth originates it and Surya acts as a listening server or responder. But if bearer packets go down the tunnel this is a chicken and egg issue and you end up with an omelet. So OpenVPN has an anti-iroute so any packet on the initiator's end addressed to the interface used by the OpenVPN listener will go out by normal routing, not the tunnel, and Surya's firewall will reject it. The anti-iroute is not restricted to OpenVPN's port number; all ports are blocked, such as traffic to the webserver listening on Surya's wild side (www.jfcarter.net).

The WireGuard deployment campaign in 2021 got preempted, but an issue has arisen which returns WireGuard to the front burner. Specifically, my family is moving to Washington state to be closer to our son, and the master site Jacinth is going to be in a packing box for an unknown time. But Surya, our cloud server, won't be in any packing box. Therefore I'm going to transplant a lot of the server software, specifically our public webserver and site, to Surya. But before and after the move when the local LAN is functioning, LAN hosts still need to get to the webserver and other services on Surya. (Yes, there are slave servers too.) I was solving this by creating an ipip or Geneve tunnel within the segment tunnel, but adding kludge upon kludge is not the way to go; the right solution is to finish the WireGuard deployment and to divert bearer packets off the segment tunnel and onto the default route.

Further design and details are below.

Redesigning the WireGuard Deployment

The design requirements are:

After a rather aggressive struggle to get WireGuard working, I learned these points:

Translating these into implementation issues:

What's already written: this is all in /etc/wireguard and I'm describing what's currently on Jacinth.

  • Fly in ointment; I'm testing the WireGuard configuration on one responder and one initiator. The responder starts first. The initiator starts 2 seconds later. (Both have Peer stanzas with Endpoints.) The responder sends a bearer packet, probably key establishment. The initiator responds ICMP UDP port 4296 unreachable. The responder never sends another bearer packet, despite payload packets coming in to wg0, and the initiator never sends any bearer packet.

    Investigation #1: Why didn't the Initiator get any payload packets routed to wg0? Because the AllowedIPs is to Jacinth's wild side and all the payloads go to Jacinth's LAN address.

    Fix #1: Force the AllowedIPs to the LAN address. Program couldn't handle a FQDN for a peer. Now it can handle the FQDN. Functional test passed (curl to webserver, both directions).
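
    The failure in Investigation #1 comes down to a membership test: a payload packet is routed into the tunnel only if its destination falls inside some peer's AllowedIPs. A sketch of that test for IPv4 follows; ip4_to_int and ip4_in_cidr are hypothetical helper names, and the code assumes plain decimal octets and 64-bit shell arithmetic.

```shell
#!/bin/sh
# Sketch only: ip4_to_int and ip4_in_cidr are hypothetical helper names.

# Convert dotted-quad $1 to a 32-bit integer on stdout.
ip4_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1                          # split the address into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if address $1 falls inside CIDR block $2 (a.b.c.d/len).
ip4_in_cidr() {
    addr=$(ip4_to_int "$1")
    net=$(ip4_to_int "${2%/*}")
    len=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}
```

    Checking the address that clients actually connect to against each AllowedIPs entry shows at once whether payload packets have any chance of being sent to wg0.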

    The working endpoints are:

    Anubis, Your Guide to the Underworld

    The StrongS/WAN IPSec suite includes a daemon called Charon, formerly Pluto. The initiator starts a VPN-type connection by signalling their own Charon to establish a Security Association with the peer's Charon (authentication credential required) and to send to the peer connection parameters like which address ranges should go over the tunnel. In OpenVPN the connection setup module isn't a separate daemon but it performs similar functions, including selecting affected traffic (its equivalent of AllowedIPs is called an iroute).

    WireGuard needs a similar gatekeeper which, following the underworld theme, I'm calling Anubis. Its functions are just about one to one equivalent to Charon's, but WireGuard has advantages in simplicity. Here are the basic design points:

    Final (I Hope) WireGuard Design, Version 3

    Ideas for WireGuard Evolution

    Official Multi-Client Support

    Jacinth's role on OpenVPN and IPSec is as a generic server: potentially a variety of clients could connect at the same time, authenticating with an X.509 certificate with an acceptable trust chain. This isn't going to fly with WireGuard, since the server has to know the client's public key before it can accept a connection from the client.

    Brainwave:

    Explicit Exit Notify

    WireGuard needs the equivalent of OpenVPN's explicit-exit-notify. When the kernel module detects that a connection is going down (e.g. ip link del dev wg0) it should notify the peer. The rekey timeout seems to be short, under 1 minute, but the rekey attempt only occurs if the non-dead peer sends a packet, and it's not clear how much state it's keeping for the dead peer and how significant that is. It just seems neater to notify the surviving peer if you're closing the connection.

    Dealing With a Compromised Crypto Algorithm

    Cryptographic algorithms can't be relied on to last forever, although Rijndael (AES) has lasted with only minimally effective attacks up to 2021 since 1997 (inception, or 2001, anointment in FIPS pub. 197), and ChaCha20 has been widely deployed from 2008 to 2021. It would be a very smart move to add algo negotiation, with the needed info in the dummy payload in the initial handshake packet.

    Bearer Packets Down the Tunnel

    In this scenario you have a chicken and egg situation that results in an omelet. wg-quick already recognizes when the default route is sent through the tunnel and puts in a policy route to divert bearer packets to their original (presumably default) route. But a more limited omelet route is not recognized, nor is the case where such a policy route has already been set up.

    The very first step for wg-quick should be to do ip route get $EndpointIP, with the IP it's actually going to use (IPv4 or 6). This route should lead to the peer's non-tunnel address. When wg-quick finishes setting up routes, including running PostUp and PreDown scripts that might set routes, it should again do ip route get $EndpointIP, and if the route goes through the WireGuard interface, it should do the policy routing thing that diverts bearer packets via the route that it initially discovered. As much as possible of this route should be preserved, specifically the metric and the source address, if available.

    On a server with multiple peers you may need an individual diversion route for some or all of the peers.
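
    The detection half of this proposal can be sketched in a few lines. This is an illustration of the idea, not something wg-quick does today; route_field and bearer_uses_tunnel are hypothetical helper names, and the parsing assumes the usual single-line "ADDR via GW dev IF src ADDR" layout of ip route get output.

```shell
#!/bin/sh
# Sketch only: route_field and bearer_uses_tunnel are hypothetical helpers.

# From "ip route get" output on stdin, print the word that follows
# keyword $1 (for example "dev", "via" or "src").
route_field() {
    awk -v key="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == key) { print $(i + 1); exit }
    }'
}

# Succeed if the route to endpoint $1 currently leaves via interface $2,
# meaning bearer packets would be sent down their own tunnel.
bearer_uses_tunnel() {
    dev=$(ip route get "$1" | route_field dev)
    [ "$dev" = "$2" ]
}
```

    A wrapper could save the via, dev and src fields before bringing the interface up, call bearer_uses_tunnel afterward, and if it succeeds add a host route for the endpoint using the saved values.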

    Re-planning Routes

    I'm looking carefully again at the network design on my net. I think I need to refactor routes to/via the VPNs (with WireGuard added). In the table below, leaves means all the hosts not explicitly mentioned. $pfx represents the first three octets of the IPv4 address range. See below for Xena's default route, indicated by *. There are analogous addresses and routes for IPv6.

    Host    | VPN or Route          | Presently           | Change To
    — Address Ranges —
    Vacant  |                       | $pfx.0/25           | $pfx.0/26+64/27
    Jacinth | OpenVPN 1194/udp      | $pfx.128/29         | $pfx.96/29
    Jacinth | OpenVPN 443/tcp       | $pfx.144/29         | $pfx.104/29
    Jacinth | IPSec                 | $pfx.160/29         | $pfx.112/29
    Jacinth | WireGuard             | (none)              | $pfx.120/29
    Surya   | OpenVPN 1194/udp      | $pfx.136/29         | $pfx.128/29
    Surya   | OpenVPN 443/tcp       | $pfx.152/29         | $pfx.136/29
    Surya   | IPSec                 | $pfx.168/29         | $pfx.144/29
    Surya   | WireGuard             | (none)              | $pfx.152/29
    Surya   | Segment tunnel        | $pfx.184/29         | $pfx.160/29
    Xena    | Xena+Petra subnet     | $pfx.176/29         | $pfx.168/29
    Vacant  |                       | (none)              | $pfx.176/28
    Leaves  | Main LAN              | $pfx.192/26         | $pfx.192/26 (same)
    DHCP    | In main LAN           | $pfx.240..254       | No change
    — Routes —
    Leaves  | Default route         | Jacinth $pfx.193    | (Same)
    Jacinth | Default route IPv4    | Its wild side (en1) | (Same)
    Jacinth | Default route IPv6    | Surya $pfx.185      | Surya $pfx.161
    Surya   | Default route both    | Its wild side (en0) | (Same)
    Xena    | Default route         | Jacinth $pfx.193*   | (Same)
    Petra   | Default route         | Xena $pfx.177       | Xena $pfx.169
    Jacinth | Main LAN              | dev br0             | (Same)
    Jacinth | Jacinth OV 1194/udp   | dev tun0            | (Same)
    Jacinth | Jacinth OV 443/tcp    | dev tun1            | (Same)
    Jacinth | Jacinth IPSec         | Already on Jacinth  | (Same)
    Jacinth | Jacinth WireGuard     | (none)              | dev wg0
    Jacinth | Surya VPNs+subnets    | (Combined)          | dev tun9/wg9 to Surya
    Jacinth | Surya OV 1194/udp     | Surya $pfx.185      | (Combined)
    Jacinth | Surya OV 443/tcp      | Surya $pfx.185      | (Combined)
    Jacinth | Surya IPSec           | Surya $pfx.185      | (Combined)
    Jacinth | Surya (segment tnl)   | dev tun9 (to surya) | (Combined)
    Jacinth | Xena + Petra          | VPN(Xena) $pfx.130  | VPN(Xena) $pfx.106
    Surya   | Jacinth VPNs+subnets  | (Combined)          | dev tun9/wg9 to Jacinth
    Surya   | Jacinth OV 1194/udp   | Jacinth $pfx.186    | (Combined)
    Surya   | Jacinth OV 443/tcp    | Jacinth $pfx.186    | (Combined)
    Surya   | Jacinth (segment tnl) | dev tun9 to Jacinth | (Combined)
    Surya   | Jacinth IPSec         | Jacinth $pfx.186    | (Combined)
    Surya   | Surya OV 1194/udp     | dev tun0            | (Same)
    Surya   | Surya OV 443/tcp      | dev tun1            | (Same)
    Surya   | Surya IPSec           | Already on Surya    | (Same)
    Surya   | Xena + Petra          | Jacinth $pfx.186    | (Combined)
    Surya   | Main LAN              | Jacinth $pfx.186    | (Combined)
    Xena    | (finish this)

    Picking Up Again

    WireGuard is not IPSec (StrongS/WAN) or OpenVPN; it has no key agreement mechanisms analogous to StrongS/WAN's Charon. The correct way to handle WireGuard is purely with pre-installed public keys. The server always loads all the authorized keys, and each client has its own key and the server's key. For true point to point links, the server would have only one authorized peer.
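
    Loading all the authorized keys on the server can be driven from one flat list. The sketch below assumes a made-up three-column format (name, public key, AllowedIPs) and a hypothetical helper peers_to_stanzas; neither comes from wireguard-tools.

```shell
#!/bin/sh
# Sketch only: peers_to_stanzas and the three-column list format are
# hypothetical; nothing in wireguard-tools defines them.

# Read "name public-key allowed-ips" lines on stdin and print one [Peer]
# stanza per line. Lines starting with "#" and blank lines are skipped.
peers_to_stanzas() {
    while read -r name key ips; do
        case $name in
            ''|'#'*) continue ;;
        esac
        printf '[Peer]\n# %s\nPublicKey = %s\nAllowedIPs = %s\n\n' \
            "$name" "$key" "$ips"
    done
}
```

    The output would be appended after the server's [Interface] section. For a true point to point link the list simply has one line.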

    So let's set this up and be done with it!