I have a laptop (called Xena) with a Wi-Fi network connection, which hosts a virtual machine (called Petra) using libvirtd, qemu and KVM; a second VM is possible. The normal way (for me) to put a VM on the net is to create a bridge on the host, which becomes the host's main NIC, with the wired Ethernet NIC as a member, as well as a virtual network connection from each of the VMs. This does not work when the egress port is Wi-Fi.
Wi-Fi emulation of Ethernet is not complete, and in particular, there are corner cases which don't work. Therefore the Wi-Fi developers have flatly forbidden membership of a Wi-Fi NIC in a bridge or equivalent constructs. Specific problem areas are:
When the Wi-Fi client authenticates to the access point, the Wi-Fi specs interpret the resulting promise as meaning that traffic will be disclosed to or originate from only the entity that authenticated, which is certainly not the case if the client NIC is a member of a bridge. However much we may know about trust relations on our own LAN, the developers have not provided any way to relax this restriction. (The hack to sabotage it is simple.)
Broadcast and particularly multicast packets are emulated poorly in Wi-Fi: they have to be sent individually from the AP to each client, and the AP is not going to take into account multiple clients behind a bridge.
The Wi-Fi packet layer has space for four addresses, two of which are the MAC addresses of the client's Wi-Fi NIC and of the sending station, for use in a mesh or in AP-to-AP forwarding. Normally these two are identical, and (as an optimization?) the sending station's address is normally left blank. The driver can be put into four-address mode, and a four-address NIC will be accepted as a member of a bridge, but it doesn't work well enough to be used in production.
So I'm going to have to put together another network solution for the VM. Here are my requirements, in approximate order of importance.
My net uses both IPv4 and IPv6 (dual stack). Permanent hosts are assigned fixed IPs, which are available by DHCP, and there are DHCP pools for transients (guests). Router advertisements are sent and RFC 4862 addresses get configured in parallel with the fixed IPv6 addresses.
Except for special cases, each host has its own firewall and is safe from infected neighbors on the local LAN. The gateway has additional features to keep wild side peers in their cages.
The guest must be able to originate a (unicast) connection to the KVM host, to other hosts on the local LAN, and to the wild side.
The KVM host and other hosts on the local LAN must be able to originate a (unicast) connection to the guest. Peers on the wild side must also be able to do so, if they would have been able to reach a non-virtual host; this would generally involve a VPN.
The guest should be able to use broadcast and multicast services such as DHCP and mDNS, including when the host is on foreign LANs.
The guest's networking should function when the KVM host is on the local net, and also when it is on a foreign (wild side) net.
Access to the local net from the guest on the wild side should follow the same security and authentication requirements on the guest as on the host.
The KVM host uses NetworkManager, and I want to keep it because of its flexibility and nice GUI support. The guest uses whatever the rest of the hosts use, currently Wicked. I need the guest to be flexible and to have an environment that looks normal, so I can test new network infrastructure such as systemd-networkd.
When the KVM host is on the wild side and has a VPN to home, as it usually will, it would be a nice feature if the guest had automatic access to the local LAN, but this is not required.
I tried several networking schemes without success, detailed below. Now I'm reverting to an earlier concept: Xena as a router.
KVM creates on the VM host a network interface called vnet0 which bridges packets to or from the guest's eth0. In a simple setup the number 0 is fairly consistent, but it can vary randomly at boot time, and you need a setup procedure that works with any interface name.
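For example, a startup script can discover the actual interface name rather than hard-coding vnet0. A minimal sketch, assuming a guest named petra with a single NIC:

virsh domiflist petra
VNET=$(virsh domiflist petra | awk 'NR>2 && NF {print $1; exit}')   # first interface name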
KVM and libvirt have several styles to give network access to the guests.
One is bridged networking; this is what I normally use on non-wireless VM hosts. It actually basically works without a bridge, but libvirt is going to be a lot happier if you create, before starting the guest, the bridge that your VM XML file says vnet0 should be put into.
Off is the default.
brctl is deprecated and its facilities have been added to the ip
command from the iproute2 package.
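For reference, the rough correspondence between the old and new commands, as I understand it:

brctl addbr br0                    # old
ip link add name br0 type bridge   # new
brctl addif br0 vnet0              # old
ip link set dev vnet0 master br0   # new
brctl show                         # old
bridge link show                   # new (the bridge command, also from iproute2)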
NetworkManager has a connection type of bridge, and in the GUI you can configure the bridge to be created at boot. Gotcha: if NetworkManager gets restarted, vnet0 will be evicted from the bridge and will not be reinstated automatically; you will need to add it by hand using one of the above commands.
Normal networking is designed to work with subnets, not individual hosts. I was able to get networking mostly working with an independent vnet0, not in a bridge, but the results are a lot cleaner if I create the bridge.
At present, CouchNet has these subnets. The default gateway for all of them is Jacinth, except as noted.
To this collection I'm going to add:
Do this to activate forwarding. The equivalents of these commands could go in /etc/sysctl.conf or /etc/sysctl.d/01-whatever.conf. When changed, the forwarding switches reset most per-interface parameters to their defaults, so they should be executed early; sysctl.d fragments are executed in lexical order.
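The usual switches, as runtime commands:

sysctl -w net.ipv4.ip_forward=1
sysctl -w net.ipv6.conf.all.forwarding=1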
The bridge on the VM host (or the naked vnet0 device) needs these addresses and routes. Although it's legal to use the same IP address for both the bridge and the egress interface (wlan0), I have found that fewer strange things happen and it's less confusing if the bridge has its own IP address (called xenavm). IPv4 is shown but these need to be duplicated for IPv6.
Add the bridge's address with noprefixroute; then make the above route explicitly. If you gave the bridge its own subnet and its own address therein, all that would happen automatically.
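A sketch of what I mean, with hypothetical addresses (192.168.1.9 standing in for xenavm and 192.168.1.10 for the guest):

ip addr add 192.168.1.9/24 dev br0 noprefixroute   # no automatic subnet route
ip route add 192.168.1.10/32 dev br0               # explicit route to the guest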
Xena, my VM host, uses NetworkManager, and I created a NM connection which
creates the bridge automatically with these parameters as fixed IP addresses.
One minor detail, I want to use a fixed MAC address for the bridge, because
the guest's firewall requires that its peer's MAC be registered as trusted.
Also I want to be able to recognize bridge traffic
when seeing tcpdump output or error messages that contain the EUI-64.
As a local convention, for my
VMs I use 52:54:00, which is the assigned OUI (MAC range) for KVM, followed by
the last 3 octets of the interface's fixed IPv4 address. I'm using the same
convention for the bridge. The NetworkManager GUI for Edit Connections
does not have a text box for overriding the MAC address. But see this
blog post about making NetworkManager set a fixed or random MAC address
by Thomas Haller (2016-08-26). He shows how to use nmcli
to modify the
connection file (/etc/NetworkManager/system-connections/br0, on Xena), as well
as how to display the available parameters and the connection name.
For example:
nmcli connection modify br0 bridge.mac-address 52:54:00:09:c8:a9
All other hosts on the local LAN need a route via the VM host to the bridge's subnet:
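Something like this, with hypothetical addresses (192.168.2.0/28 for the bridge's subnet, 192.168.1.9 for Xena on the LAN):

ip route add 192.168.2.0/28 via 192.168.1.9
ip -6 route add fd00:2::/64 via fd00:1::9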
The guest needs these addresses and routes. The goal here is to have this machine be as normal as possible. One component of being normal is to have the address and routes appear by DHCP and/or IPv6 Router Discovery.
At this point, bidirectional communication is achieved between the guest and these peers: VM host, LAN gateway, other LAN neighbor, offsite peer. This is on both IPv4 and IPv6. (The IPv6 commands are analogous but are not shown.)
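A quick check from the guest, with hypothetical addresses:

ping -c3 192.168.1.9    # VM host
ping -c3 192.168.1.1    # LAN gateway
ping -c3 8.8.8.8        # offsite peer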
Normal networking does not require a bunch of manual commands every time you start up your VM. I'm setting up dnsmasq and radvd to provide the required address and routes by DHCPv4, DHCPv6 and IPv6 Router Discovery. Global issues about dnsmasq:
Features for /etc/dnsmasq.d/*.conf (a sketch follows this list):
…guests on this subnet.
…see the Catch-22 above for the reason.
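The sort of fragment I mean, with made-up names and addresses throughout:

# /etc/dnsmasq.d/vmnet.conf -- hypothetical values
interface=br0
dhcp-range=192.168.2.2,192.168.2.14,12h           # pool for transient guests
dhcp-host=52:54:00:09:c8:aa,petra,192.168.2.10    # fixed address for Petra
enable-ra                                         # send IPv6 router advertisements
dhcp-range=::,constructor:br0,ra-names            # IPv6 range from br0's prefix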
The hosts on the local LAN need to be told to send traffic for the VM guest (Petra) via the VM host (Xena). Dnsmasq on the local LAN's gateway sends out this route for IPv4, and I'm using radvd on the VM host to send it for IPv6. Key features in /etc/radvd.conf (a sketch follows):
…the via parameter of the route.
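A sketch, assuming a hypothetical prefix fd00:2::/64 for the bridge's subnet; in a Router Advertisement the sender is implicitly the via of the route, which is why radvd has to run on the VM host:

interface wlan0
{
    AdvSendAdvert on;
    route fd00:2::/64
    {
        AdvRouteLifetime 1800;
    };
};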
Additional flies in the ointment that had to be fixed:
…nmcli.
…solved.
With those miscellaneous fixes, Petra can configure its network autonomously at boot, and can make and accept the connections listed in the requirements. So this project has come to a successful conclusion, if we're not too picky about various kludges.
Avahi-daemon gives endless trouble, losing its addresses for no obvious reason. I've thought about just not having a mDNS service. But systemd-resolved is a new entry in this area and I'm going to try to get it to work before giving up the whole service. Here's what systemd-resolved does. Most of these sub-services can be turned on or off in the configuration file, /etc/systemd/resolved.conf .
DNS stub resolver: it can't be authoritative for an entire zone, but its main purpose is to keep aware of real DNS servers on the local net and to forward queries to them, or to wild-side DNS servers. Signed responses are validated with DNSSEC; unsigned responses can be rejected or accepted without validation. Responses are cached locally until their TTL expires.
LLMNR: Link Local Multicast Name Resolution. This is a Microsoft-ish protocol (RFC 4795) to elicit A and PTR records from hosts using link-local addresses, possibly exclusively. Systemd-resolved can make and respond to such queries.
Named link local addresses are not a major part of my operation, and to avoid waking sleeping dragons, I have turned off this feature.
Systemd-resolved can respond to queries over dbus, the recommended mode of use.
There is an NSS module for the hosts map in glibc that submits such dbus queries. Glibc subroutines like getaddrinfo and gethostbyname can reformat the results and deliver them to the caller.
On my system this module is in use and works fine.
Systemd-resolved listens on 127.0.0.53 port 53 for unicast DNS queries, and it maintains a file similar to /etc/resolv.conf with this server IP and port. It's recommended (but not required) to make a symlink from /etc/resolv.conf to this file.
Hosts on my net have this link, and the content is delivered reliably.
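On current systemd versions the maintained file is /run/systemd/resolve/stub-resolv.conf, so the link would be made like this:

ln -sf /run/systemd/resolve/stub-resolv.conf /etc/resolv.conf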
For mDNS (Multicast Domain Name Service), systemd-resolved listens on port 5353. Clients can send to this port on the mDNS multicast addresses 224.0.0.251 and ff02::fb, or unicast to the server's own address, and systemd-resolved will get the queries (verified by strace). Whether it will respond is another matter. Responding can be disabled globally, and also has to be enabled explicitly on each desired interface (e.g. not on the wild side) in /etc/systemd/network/xxx.network . Systemd-networkd has to be running to pass this configuration information to systemd-resolved.
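The configuration amounts to a few lines; the .network file name here is hypothetical, and the matched interface is whichever one faces the local LAN:

# /etc/systemd/resolved.conf
[Resolve]
MulticastDNS=yes

# /etc/systemd/network/lan.network
[Match]
Name=eth0

[Network]
MulticastDNS=yes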
However, I was never able to elicit a mDNS response. (The same tester succeeds with avahi-daemon.) Comments in changelogs suggest that OpenSuSE Tumbleweed (systemd-237) may have mDNS responses disabled but I don't see where (or why) this is done in the spec file.
mDNS is only used to query the zone .local., and .local. is only available via mDNS or by the dbus interface, not by conventional DNS on port 53.
Avahi-daemon and systemd-resolved can coexist, with Avahi (and not systemd-resolved) responding to mDNS queries on port 5353. Systemd-resolved still receives registration information so it can serve the .local. domain from its dbus interface. This is the mode I am operating in, successfully.
Systemd-resolved responds to A, AAAA and PTR queries for local addresses or names. If an address is known locally this information is preferred and a trans-net query is not made. When the information sources change, such as /etc/hosts, systemd-resolved updates its cache. These are the addresses handled as local information:
localhost and localhost.localdomain.
_gateway, which is the target(s) of the default route(s).
On my net, all of this local information is available.
As for 1-component names, on my net they are all in /etc/hosts and systemd-resolved can send that authoritative information. But if they weren't in /etc/hosts, systemd-resolved would resolve them by a multicast LLMNR query, if it were enabled, which it isn't on my net.
Queries for multi-component names and IP addresses (not otherwise known) are forwarded to DNS.
DNS Service Discovery (DNS-SD) records are maintained and delivered by systemd-resolved, for services that register themselves with DNS-SD. Presumably registration happens by a dbus protocol. The DNS-SD records are these, where $SERVICE represents e.g. ssh, $PROTO is the protocol used for that service (normally udp or tcp), and $HOST is the host that provides this service. Examples are shown for ssh/tcp, which I use for (successful) testing because all my hosts provide it.
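For illustration, following the RFC 6763 conventions, the records for ssh/tcp on a host jacinth would look roughly like:

_ssh._tcp.local.              PTR  jacinth._ssh._tcp.local.
jacinth._ssh._tcp.local.      SRV  0 0 22 jacinth.local.
jacinth._ssh._tcp.local.      TXT  ""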
In conclusion, I did not succeed in my main goal of replacing Avahi with systemd-resolved. However, I think I have made progress in cleaning up an area of my network infrastructure that was making a lot of trouble on my VM, that I didn't want to deal with while debugging VM networking.
One possible solution involves proxy ARP. (2009-06-24, OP bodhi zazen.) Remember that ARP (specifically proxy ARP) is only defined for IPv4. Here's a summary of his howto:
<interface type='ethernet'>
  <mac address='52:54:00:19:b2:bf'/>
  <script path='no'/>
  <target dev='tap0'/>
  <model type='rtl8139'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
Configure the guest's network normally, e.g. with yast2 lan. The guest should now be on your net.
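As I understand the howto, the host-side commands amount to something like this (the guest address 192.168.1.10 is hypothetical):

sysctl -w net.ipv4.ip_forward=1
sysctl -w net.ipv4.conf.all.proxy_arp=1
ip route add 192.168.1.10/32 dev tap0   # host route; the host answers ARP for the guest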
It looks like the proxy ARP method won't fly because IPv6 is impossible.
Another family of solutions involves tunnels. The basic design would go like this:
The emulator uses bridge networking. Its emulated NIC appears on the host as a tap device which libvirt creates, and libvirt also puts it into a bridge on the host. (The guest thinks it was given a normal Ethernet NIC.)
The host has a tunnel endpoint which it has also put into the bridge. Thus packets from the guest are copied down the tunnel, and packets emerging from the tunnel are made available to the guest. There are several varieties of tunnel.
Bearer packets are ordinary IP traffic and leave or enter the host via its default route. There are no quibbles about the Wi-Fi NIC being tangled up with bridges.
The other end of the tunnel is going to be on my gateway. It is part of the bridge which includes the wired Ethernet to the local LAN, as well as the gateway's own VM. Thus all kinds of traffic on the local LAN are manifest to the guest, and all the guest's traffic goes out to the local LAN. (The bridge has optimization rules so packets are omitted if they are irrelevant to a particular bridge member. Basically, the guest gets a packet if it's broadcast, or multicast to a group the guest has registered in, or unicast to the guest's MAC address.)
How the bearer packets get to the other end is irrelevant. The route could be direct to the Wi-Fi access point on the gateway, or to a foreign AP and from there, likely after NAT, through the global Internet to my gateway's wild side interface, or through a VPN from the host to the gateway. All three paths require authentication which is not part of the tunnel mechanism.
Some tunnels provide cryptographic privacy and integrity. I'm assuming that the guest tunnel will provide the same kind of protection (i.e. none) that the host's own traffic receives. For complete protection for the host's traffic as well as guest traffic on the tunnel, the host should use a VPN. Another mode is to use application-level security such as TLS or DNSSEC (integrity only) on both the host and the guest. On my net, generic wild-side packets are admitted only after authentication.
Fly in ointment: Both the gateway and the host, if on the wild side, have dynamic DHCP addresses. Fortunately these change rarely, lasting days to months. But when either one changes the tunnel will be broken and will have to be reestablished. Payload packets may be lost, but this is a fact of life on the Internet, and assuming quick reconnection, the payload datastreams will recover with no fuss.
What kind of tunnel might I want to use? These are some key aspects of the tunnel:
Authentication means that the remote end expects the local end (laptop) to prove who it is (so it can decide if the client is authorized), and the local end expects the gateway to prove who it is, versus some Black Hat in the middle.
On the wild side the traffic generally passes through various equipment and agencies that are not chosen by the user and may not even be ascertainable. Any of them may be, or are already known to be, infested by Black Hats. Privacy means that the Black Hats cannot obtain the payload information being transmitted. (But generally the Black Hats can see the IP addresses of the endpoints; ways of obfuscating these, such as TOR, are out of scope.) Impossible is a relative term, and generally if the Black Hats can crack the encryption using a few billion dollars of equipment working together for a year, that is considered to be adequate privacy.
Integrity means that the tunnel is aware if data arrives that was different from what was sent. This can happen due to noise on the communication line, or software errors, or attempts at fraud. There is a range of strength choices for integrity checksums. Not all of them can resist a competent Black Hat.
On the local LAN I can use whatever protocols I please, but on the wild side many hotel nets are very restrictive, e.g. blocking IPSec, or all use of UDP. For this case my laptop uses OpenVPN on TCP port 443: this is far from ideal, but that port is normally used for HTTPS and if it were blocked the hotel's network would be useless. I plan to choose the protocol freely, but if it's blocked I will use the VPN, same as if application protocols on the host were blocked.
I require authentication, so only the authorized laptop can connect to the gateway's endpoint and so it can be sure that the intended endpoint is being connected to. I'm providing for the guest the same privacy and integrity that the host gets, i.e. none. However, some protocols like SSH cannot turn off those features, and I won't reject them just for that reason.
Here is a promiscuous list of varieties of tunnel, with evaluations.
SSH: while normally it is used as a point-to-point link for a single client session, it can be switched to generic tunneling at layer 2 or 3 (link or network, i.e. IP packets).
I'm familiar with SSH tunneling, it is widely trusted including by me, and I already have the authentication infrastructure (public and private keys) in place. But you can't turn off encryption. SSH remains one of the front runners. But beware of TCP meltdown!
OpenSuSE Tumbleweed is using package openssh-7.7p1 at the time of writing.
SSTP (not to be confused with Simple Symmetric Transport Protocol) runs PPP (point to point protocol) over TLS (Transport Layer Security). Authentication is required for both layers.
Assuming PPP can be made to defer to TLS for authentication, this looks like a possible winner. I already have the X.509 certificates needed for authentication. But beware of TCP meltdown!
There is a package NetworkManager-sstp for OpenSuSE Tumbleweed, community contributed (several instances), but it has a missing dependency (libnm-gtk.so.0()(64bit)) that I can't find.
OpenVPN: on CouchNet it is normally used at the network layer (3), but it can be switched to the link layer (2).
I'm familiar with OpenVPN and use it regularly on the host. That's both good and bad: it would have to continue to run at the network layer. I'm a little worried about committing to have OpenVPN running on the host at all times. A more serious complaint is that I still need a tunnel protocol for the guest: OpenVPN provides a tunnel from the host's whole network stack to the local LAN, but I need part of that stack to be a tunnel that carries the guest's traffic to the gateway, and OpenVPN can carry that tunnel but would have trouble being the tunnel itself at the same time.
Another possibility is an OpenVPN tunnel from the guest itself to the gateway, but this would mean that I could not test or develop generic networking on the guest.
IPSec does privacy and integrity, and to establish the Security Association the peers need to mutually authenticate. In IPv6, IPSec is implemented just as another packet header identifying the Security Association and giving the Message Integrity Code (HMAC); all that follows is encrypted, and as part of removing and obeying the header the kernel decrypts subsequent headers and the payload. But IPv4 headers aren't so flexible, and a separate IP protocol (ESP and/or AH) is used. In Linux the payload is considered to be received on the same interface as the bearer was, so for routing purposes IPSec isn't really a tunnel.
Wikipedia article about PPP. It's a link layer (2) protocol designed to work over alien links including ATM, in addition to IP. PPP includes (or could include) authentication, privacy and integrity (just a CRC, not cryptographic). Also compression. It is very modular and includes setup modules (Network Control Protocols) for most known protocol families.
Generally PPP is not an independent tunnel but is used as the stuffing for another tunnel protocol.
Wikipedia article about GRE, q.v. for relevant RFCs. It can do encryption using RC4 which is deprecated. Can do integrity, but it's not too clear how cryptographically robust the checksums are. PPTP uses slightly modified GRE packet headers. GRE was developed by Cisco and they have appliances that use it. GRE is popular in Windows shops.
Given the questionable security and the hassle of PPP/PPTP authentication, I'm not going to waste time trying to set up GRE.
Wikipedia article about L2TP. It's a hybrid of Cisco's L2F and Microsoft's PPTP. It encapsulates PPP (point to point protocol), but L2TPv3 can bear other link-level protocols. The outermost bearer packets use UDP. L2TP doesn't do authentication, privacy or integrity, but prepended IPSec headers can do so, and the interior protocols also can do so.
This protocol looks viable, but probably it has various frustrating issues which will turn up when I try to actually implement it. I will investigate it if none of the front runners pan out.
Wikipedia article about VXLAN (Virtual Extensible LAN), RFC 7348. Its goals are being scalable to large cloud nets. It has the equivalent of VLANs. A lot of vendors and software support it. Open vSwitch is one of these. Its main target is cloud isolation within a multi-tenant datacenter. Bearer packets use UDP. They contain the entire (almost) Ethernet frame that the guest would have sent on a wired connection.
It would be a big commitment to learn how to make this work. I don't see a whole lot of support for authentication. I doubt I will be using this one.
Encapsulates IP packets; does not handle non-IP packets.
The lack of authentication makes this protocol unsuitable.
Two IPv6 transition mechanisms that transmit IPv6 payload packets over an IPv4 network. No IPv6 bearer packets.
I need to handle mixed IPv4 and IPv6 traffic and I already have IPv6 payload capability; these transition solutions will not be helpful.
I wasn't able to find much information about this mode.
So the front runners are SSH and SSTP. I think I'm going to try SSH first.
One way to create the tunnel is by
ip tunnel add NAME mode any? remote ADDR local petraguest pmtudisc dev br0
However, this doesn't include SSTP and so this approach is useless.
There are a lot of companies that publish setup guides for SSTP. The idea apparently is, your client establishes an SSTP tunnel to their server, which is not free, and the result is a VPN. According to the ExpressVPN docs, SSTP is owned directly by Microsoft and is available for Windows only, which doesn't exactly match with jimc's experience.
On Github there exists sstp-server by sorz (Shell Chen). A package is apparently available on Arch Linux but not OpenSuSE. pppd is a prerequisite.
I think that I'm going to follow my original plan and try ssh first.
ssh -w any -o Tunnel=ethernet
-w any means to create a tunnel device (with any available number) on the client host and similarly on the server. You can also specify fixed tap numbers.
It also needs in the server's /etc/ssh/sshd_config:
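Presumably the relevant directive is PermitTunnel, whose default is no:

PermitTunnel ethernet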
The infrastructure manager will want to create a tun/tap device and put it in the relevant bridge. Here's a tutorial on doing this by waldner (2010-03-26). He has discovered an undocumented feature of iproute2 (the ip command); do ip tuntap help for a usage summary.
ip tuntap [add|del|show|list|help] mode [tun|tap] [user U] [group G] [name itsname]
ip tuntap add mode tap name tap8
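Putting the pieces together, the sequence would be roughly this, assuming a persistent tap8 on both ends, br0 as each end's bridge, and root access on the server (jacinth):

ip tuntap add mode tap name tap8             # on both ends
ip link set tap8 up
ip link set tap8 master br0
ssh -w 8:8 -o Tunnel=ethernet root@jacinth   # from the client; attaches both tap8s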
Cutting off the tunnel idea. I got a lot of it working, but one issue killed it: Xena communicates over Wi-Fi and the normal channel ends in Jacinth's br0. Xena also creates a tunnel whose endpoints are in Xena's br0 and Jacinth's br0. This creates a loop. Various maneuvers were used to keep Petra's traffic in the tunnel and Xena's traffic on Wi-Fi (including the tunnel's bearer packets), including turning on the Spanning Tree Protocol on one or the other bridge, but they were either not effective enough or too effective, killing transport between various endpoints.