The goals of this project are:
To provide SSHFP DNS records (RFC 4255, 6594, 7479). This record type gives a hash (fingerprint) of a host's SSH public key, so that the client can know authoritatively that the host key proffered by the server actually belongs to that host. For the client to believe in it, the record must be signed with DNSSEC. SSHFP supplements the known_hosts file, which is hard to keep in sync.
To sign CouchNet's DNS records with DNS Security Extensions (DNSSEC). By relying on real cryptographic trust rather than wishful thinking, we avoid a whole class of exploits that would direct us to a fraudulent server.
To switch from the venerable Berkeley Bind (named) DNS server, to the new one, Unbound. Bind is still alive and functional, but it's kind of heavyweight for my needs.
To de-kludge my current multi-layer DNS implementation for client apps. Making four DNS sources work together is kind of getting out of hand, particularly for this foundation service.
I have been using Berkeley Bind (named, the name daemon) as a DNS (Domain Name System) server for 32 years (since 1987). It has performed reliably (except for occasional exploits like the great Internet meltdown of 1988) and has grown to meet present-day challenges. However, hostname issues have grown up in parallel with the global DNS system, and integrating them has turned into a multi-layer kludge. Specifically on CouchNet, a DNS query goes through these stages (summarized):
The real DNS server comes next, but systemd-resolved is incapable of forwarding to a nonstandard port, so instead I have a proxy to handle forwarding from systemd-resolved. There is also a fallback to a generic public recursive DNS server (Google's 8.8.8.8) for my laptop when roaming.
So I decided to scrap the whole pile and redesign from the beginning. Basically this means that I'm giving up DNS from dnsmasq, losing the _gateway DNS name from systemd-resolved, and relying on avahi-daemon rather than systemd-resolved for multicast DNS (names like $HOST.local). I rarely if ever use these features.
The decommissioned services were systemd-resolved, dns-forward.J, and dnsmasq (DNS only).
My distro, openSUSE Tumbleweed, has recently introduced Unbound, which is a completely new implementation written from scratch rather than a derivative of Bind. According to the product hype it has these good features:
The following configuration issues had to be dealt with. I have a locally written configuration management system, and when its control files are mentioned they are tagged with (LCM). The Unbound configuration directory is referred to as /etc/unbound, which is actually a symbolic link into the chroot jail, /var/lib/unbound/etc/unbound.
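For concreteness, setting up that layout might look like this (a minimal sketch using the jail path from the text; removal of the package-provided /etc/unbound is covered in the conversion procedure below):

    mkdir -p /var/lib/unbound/etc/unbound
    rm -r /etc/unbound                               # remove the package-provided directory first
    ln -s /var/lib/unbound/etc/unbound /etc/unbound  # /etc/unbound now points into the jail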
Formerly /etc/resolv.conf was a symbolic link to the one provided dynamically by systemd-resolved, attracting queries to that service. It was changed to refer directly to localhost port 53, where Unbound is listening.
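The resulting resolv.conf would look roughly like this (a sketch; the search domain is my guess based on the zone names used later in this document):

    nameserver 127.0.0.1
    nameserver ::1
    search cft.ca.us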
dnsmasq runs on Jacinth (master site) and Xena (laptop). Its DNS configuration was changed to suppress DNS entirely; port=0 was all that was needed.
These are self-signed certificates and private keys, to be used by unbound-control to authenticate to the server. Create them with unbound-control-setup.
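Assuming the keys should land in the configuration directory used throughout this document, the setup command is:

    unbound-control-setup -d /etc/unbound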
This is the trust anchor for the whole global Domain Name Service. Every DNS resolver that verifies DNSSEC needs to know this trust anchor by some means other than DNS that it is going to verify. The program unbound-anchor is this "other means"; it uses the procedures from RFC 5011 and RFC 7958 to get the trust anchor. When a trust anchor is revoked or replaced (e.g. expiry), RFC 5011 gives a procedure by which the client can obtain the new trust anchor and can propagate trust from an old anchor to the new one. RFC 7958 gives the URL at which a XML file (and other formats) can be downloaded, the procedure to verify its authenticity, and the format of the contained data, from which a DS record could be constructed that can be used to verify the DNSKEY that signed the root zone (see RFC 5011) without reference to an existing trust anchor that is actually trusted.
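A typical invocation, writing the anchor to the file used in this document, is:

    unbound-anchor -a /etc/unbound/root.key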
This is a concatenated set of three certificates including the ICANN Root CA, used to sign the XML file that contains the DS record that can verify the root DNSKEY record saved in /etc/unbound/root.key . It can be downloaded from data.iana.org and the certs can be verified online by normal means from the IANA Certificate Authority. It is included with the unbound-anchor package.
This is a set of NS (nameserver) records giving the names and IP addresses of the root nameservers. When Unbound starts up, with CouchNet configuration it would have recursive forwarders to which it could send queries for arbitrary DNS objects like the list of root server NS's, but if all the forwarder(s) were inoperative, Unbound would need to know where to send that query by other means, like the root.hints file. It can be downloaded from ftp.internic.net. (In plan B, root.hints is moved into the config directory.)
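If you need to refresh it by hand, one way (assuming curl is available; the URL is InterNIC's standard location for the root hints) is:

    curl -o /etc/unbound/root.hints https://www.internic.net/domain/named.root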
This is the trust anchor for DNSSEC Lookaside Validation (DLV) (RFC 5074). It comes with the unbound-anchor package. But now that the root zone and most TLDs are signed, DLV is no longer necessary, and it has been deprecated since 2017. Just ignore this file and leave it alone until it is removed from the unbound-anchor package.
These were specific configuration issues in /etc/unbound/unbound.conf:
Where the configuration files are. Relative pathnames are relative to this directory. If you are doing chroot (recommended), the pathname should have the chroot jail path prepended, and since other programs look for configuration in /etc/unbound, that should be a symbolic link to the directory in the jail.
Path of the chroot jail directory. Mode 755 owned by unbound:unbound.
The user to drop privileges into, "" to not drop privileges. This is orthogonal to chroot; both are recommended.
1 produces startup messages (and errors) only. 0 = errors only, 2 = more operational details, 3 = per query reports, and on upward.
Listen for queries on the interfaces having these addresses. You need separate lines for IPv4 and IPv6. The values shown cause listening on all interfaces: if a query can get through the firewall, Unbound will hear it, which I need for monitoring if both Unbound and the firewall are working properly. The default is to listen only on localhost.
Anything that can get through the firewall can query recursively or can get local (authoritative) data; only localhost can query the cache.
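Together, the listening and access policy just described might be expressed like this (a sketch; the actual lines from my unbound.conf are not reproduced here):

    interface: 0.0.0.0
    interface: ::0
    access-control: 0.0.0.0/0 allow          # recursion and local data for anyone the firewall admits
    access-control: ::0/0 allow
    access-control: 127.0.0.0/8 allow_snoop  # only localhost may also snoop the cache
    access-control: ::1 allow_snoop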
UDP replies are made from the address to which the query was sent. Usually this is good practice, and is important with interface: 0.0.0.0 .
Listen port; this is the default. You can only have one.
Origin ports for outgoing recursive queries. These are what's in the distro's provided unbound.conf; the default is 1024-infinity.
See the discussion above of the root.hints file.
See the man page for unbound.conf, for what these defensive measures do and for the tradeoffs if you enable them. Mostly these are kept at either the compile-time default or the distro-provided setting.
Enforce special use rules for these address ranges per RFC 2606 and numerous supplements. The major rule is that such addresses must never be seen by off-site queriers, since the remote client would route the addresses to its own LAN, not our internal host. And if an off-site recursive query returns such an address, we will toss it to avoid sending possibly dangerous connections to our internal hosts. See the distro's unbound.conf for the recommended list.
Allows private addresses in these zones.
This one is very touchy. CouchNet uses ULA-type addresses in these ranges and has corresponding zone files. local-zone allows them to be sent out despite RFC 2606 special use rules; transparent means that the zone content should be looked up and sent out in the normal way. domain-insecure turned out to be essential also, because the d.f.ip6.arpa zone cannot be signed, and Unbound needs permission to serve sub-zones without DNSSEC verification.
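A sketch combining these directives (the RFC 1918/ULA ranges are the standard ones; the zone names are inferred from the text and should be taken as illustrative):

    private-address: 10.0.0.0/8
    private-address: 172.16.0.0/12
    private-address: 192.168.0.0/16
    private-address: fd00::/8
    private-domain: "cft.ca.us"
    local-zone: "d.f.ip6.arpa." transparent
    domain-insecure: "d.f.ip6.arpa."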
Unbound will fill the Additional section of its response with everything it knows about the query, some of which the client may not ever use. This obviates re-queries (for the RR's that are used), but it takes CPU time and net bandwidth to send the stuff: a tradeoff. The default is yes (make them re-query).
The trust anchors required for DNSSEC verification. Pick one or the other directive for the root key; the unbound-anchor program looks for auto-trust-anchor-file, so use that. trusted-keys-file would hold the trust anchors for my own zones' signatures (if I used it). Instead I include a concatenated list of DS records (as trust-anchor "text" statements) generated from the DNSKEYs.
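In outline (the key tag and digest are placeholders, not my real values; algorithm 15 is ED25519 and digest type 2 is SHA-256):

    auto-trust-anchor-file: "/etc/unbound/root.key"
    trust-anchor: "cft.ca.us. DS 12345 15 2 <hex digest of the KSK>"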
Makes it log an error message if it receives a response which fails DNSSEC verification. 0 = don't log; 1 = one line report; 2 = includes the reason and the bad IP.
Makes Unbound listen to the unbound-control program, for reloading Unbound, or for realtime jiggering of configuration parameters like the verbosity. See above for the required keys and certificates to authenticate.
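The corresponding stanza, using the key and certificate files created by unbound-control-setup above, would look something like this (a sketch):

    remote-control:
        control-enable: yes
        control-interface: 127.0.0.1
        server-key-file: "/etc/unbound/unbound_server.key"
        server-cert-file: "/etc/unbound/unbound_server.pem"
        control-key-file: "/etc/unbound/unbound_control.key"
        control-cert-file: "/etc/unbound/unbound_control.pem"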
In Plan A, CouchNet has local zone definitions in conf files in this directory, formerly different for the master site, slave DNS servers, and leaf nodes. In plan B I've simplified the design so the leaf servers have the same forwarders to the authoritative server instances, written in the main unbound.conf, and the authoritative servers are all slaves and have the local zone stanzas also in the main unbound.conf.
(Plan A is doomed to failure. Mitigations in plan B are noted briefly.)
A local zone definition includes a zone type keyword, the zone's name in DNS, and information telling where Unbound should get data (Resource Records) for that zone. The type keyword is auth-zone:, forward-zone: or stub-zone: (a section title, with no value). The name parameter always has the form name: its.name.tld (ending dot not required). The data source varies with the type:
Master site: keyword is auth-zone
zonefile: "/var/lib/unbound/master/cft.zone"
This is an absolute path to the original zone file. A relative
path might be feasible also with a '..'. But it has to be inside
the chroot jail so Unbound can read it.
Since Unbound is not capable of sending AXFR/IXFR zone updates, it's kind of useless to have a master site, and in plan B all the dirsvrs are slaves.
Slave servers: keyword is also auth-zone
master: 192.9.200.193
url: http://192.9.200.193/unbound-master/cft.zone
zonefile: "/var/lib/unbound/slave/cft.zone"
Unbound listens for notify messages from the master, and sucks
new versions of the zonefile from it. The result becomes or replaces
the listed zonefile. There can be multiple masters (not on CouchNet).
Gotcha: if the master site runs Unbound (v1.9.6 on my system), it is not capable of emitting AXFR or IXFR responses. You need to set up a webserver (simplest if on the master site) that can serve the zone files. Zone file content (RRsets) is public record and is served to anyone who asks (subject to access control restrictions), but it is probably not best practice to offer the complete zone file to the global hacking community; i.e. mind the access control on the webserver too.
Unbound checks on the master for an updated zone file upon receiving a NOTIFY request or periodically per the times in the SOA. If given master:IP, Unbound retrieves the master's SOA and compares serial numbers, and exits the procedure if the master is not newer. If newer or if no master:IP, it then attempts each URL (if any) and then each master:IP (AXFR/IXFR) until one of them delivers the zone. If none of the transfers succeed, Unbound tries again according to the times in the SOA. Failures in this process do not produce error messages in the logs.
In the typical case, if you put the master's hostname in the URL, Unbound will need to resolve it to an IP using RRs in the zonefile that it is going to download from that server. So use an IP in the URL. It's not clear how you make this work with TLS (HTTPS), but the data transport path on my net is either local 802.3, or from my laptop on a VPN, so I don't need TLS.
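Putting the pieces together, a slave stanza along these lines would implement the above (the zone name is inferred from this document; the other values are the ones quoted above):

    auth-zone:
        name: "cft.ca.us."
        master: 192.9.200.193
        url: "http://192.9.200.193/unbound-master/cft.zone"
        zonefile: "/var/lib/unbound/slave/cft.zone"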
Leaf node: keyword is stub-zone
stub-addr: 192.9.200.193
stub-prime: yes
Queries in this zone are forwarded to the DNS server on the given address; there can be several. stub-prime means that at startup Unbound will ask at the given address for the list of NS (nameservers) for the zone, and subsequently will use that instead of the configured address.
It turns out that a stub zone is intended to copy (and cache) data from an authoritative server. This means that the leaf Unbound will not validate the data, which is fatal when you try to use a SSHFP to validate a server. In plan B, the leaf nodes forward to the authoritative dirsvr slaves and then validate the response.
Forwarders: keyword is forward-zone
name: "."
forward-addr: 192.9.200.193
forward-first: yes
Although Unbound is capable of looking up any domain name by itself, and of verifying DNSSEC for the answer if it's signed, it is best practice in an enterprise deployment for the leaf and slave nodes to forward queries for off-site data to a trusted master site, because it can cache the results, including the DNSSEC records and outcome, speeding up service for leaf nodes that repeat someone else's query and reducing the load on external servers.
There can be several forward-addr's, though this dilutes the effectiveness of forwarding. forward-first means that if a SERVFAIL response comes back, Unbound should attempt to do the query by itself in the normal way. Thus DNSSEC verification failures can be logged locally, or a defective forwarder doesn't kill your DNS. Empirically, a timeout also triggers self-verification.
Should the master site have a forwarder? There are tradeoffs here:
Ability to reach authoritative servers is the same for the master site and for its hypothetical forwarder, because (at least in my case) the master site has a generic Internet connection. So the issue of reachability is not relevant for deciding if a forwarder should be used.
The mantra is, see to your own security. You trust the stub resolver on localhost to do DNSSEC verification honestly, and you don't trust outside servers. But the forwarder is not verifying (and you would ignore the AD bit if set, which it isn't), it is sending you the RRSIGs by which the resolver on localhost can verify and can be aware if the payload (or the RRSIGs) are corrupt due to fraud or accident.
The forwarder has the answers ready to go out promptly and with no effort required by the target's DNS server (except for the first query for that site after the TTL expires). If a zillion individuals and enterprises use the forwarder, the effort saving could be significant.
If the master site (or any node) has no forwarder, when it starts up or reloads or the TTL expires, it needs to retrieve the NS, DS, DNSKEY and RRSIG records for the root, the TLDs, etc, directly from the root servers. Every possibility of not talking to the root servers should be implemented, to avoid loading them up.
Some forwarders have special features like parental controls and blacklists of fraud sites. A truly paranoid security professional would not tolerate having the outside forwarder make these kinds of judgments, but for the rest of the population, this kind of forwarder could be attractive.
You are feeding to the forwarder a continuous stream of domain names that you are using, and it is not feasible for you to ascertain what (beyond DNS replies) the forwarder is doing with this data. Google's public forwarders are often mentioned on this point. A truly paranoid sysadmin would be very nervous about such an exposure.
Jimc's conclusion: I'm not paranoid enough to refuse to use a forwarder. I'm already trusting Hurricane Electric to host my public DNS data, and advertising is not the foundation of their business model. So I'm going to use their public forwarder.
All Unbound instances (except the master site itself) will forward to my master site. The three directory servers will also forward to Hurricane Electric. These are the master site, the laptop (which needs it when roaming), and the other dirsvr, which will quickly learn that the master site gives the best service. I'm trying to minimize the number of configurations that I need to maintain.
Some of this stuff is modified or deleted in plan B.
Contains the master site's zone files. Don't make a symlink elsewhere; this has to be in the chroot jail so Unbound can read it.
In plan B, all the dirsvrs are slaves, so this directory is gone.
Unbound on the slave servers writes copies of the zone files here, so they can persist across restarts.
See /var/lib/unbound/dev/log below, which is a symlink to a socket in /run. Also, when exiting, the chrooted Unbound will not be able to remove the PID file unless /run is accessible, which makes trouble when you start Unbound again. So I bind-mount the real /run into the jail.
Back up everything except ./run
Startup script; pretty simple but still too much stuff for systemd to deal with in unbound.service. It does these steps:
Random number source (blocking). There's a lot of discussion about whether /dev/urandom is just as good as /dev/random, on modern hardware. Unbound will use whichever is available. These are independent instances of character device 1,8 and 1,9, not bind-mounted from /dev.
A symlink to the socket /run/systemd/journal/dev-log, which is another reason why /run is bind-mounted in the jail.
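A sketch of what such a pre-start script might do, under the jail layout described above (paths and modes are assumptions, not the actual script):

    #!/bin/sh
    JAIL=/var/lib/unbound
    # Make /run visible inside the jail (syslog socket, PID file removal on exit)
    mountpoint -q $JAIL/run || mount --bind /run $JAIL/run
    # Independent random device nodes (char 1,8 blocking and 1,9 non-blocking)
    [ -c $JAIL/dev/random ]  || mknod -m 666 $JAIL/dev/random  c 1 8
    [ -c $JAIL/dev/urandom ] || mknod -m 666 $JAIL/dev/urandom c 1 9
    # Syslog socket, reached through the bind-mounted /run
    ln -sfn /run/systemd/journal/dev-log $JAIL/dev/log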
Once the LCM files and master configuration storage were correct, the procedure to convert a host to Unbound went like this:
zypper install unbound — 7 packages to install.
rm -r /etc/unbound — so it can be replaced by a symlink into the chroot jail. Some programs like unbound-control look for configuration information and/or keys in this directory.
On Diamond (LCM master): /home/post_jump/sync_jump -p -C -a $HOST — Check results, then re-run with -c (install) instead of -C (compare). This LCM script installs the standard configuration (not just Unbound) onto $HOST with rsync doing most of the work.
audit-scripts -v -k -n — Check results, then re-run with -c (install) instead of -n. This LCM script enables and disables (un)wanted services. -k lets it kill services (like bind) that are no longer in the list of wanted services.
systemctl stop unbound-anchor.timer dns-forward.J.service systemd-resolved.service
systemctl start unbound #Also unbound2 on dirsvrs in Plan B
systemctl status unbound |& less
/usr/diklo/lib/functest/unbound -v -t -e 0 — I have a collection of about 90 functional test scripts for most services that I use. This one checks if Unbound is actually listening on port 53; maps 6 random hostnames to their IPv4+6 addresses and back to the FQDN; and checks the SOA record of each locally configured zone. Test passed.
checkout.sh > /tmp/check.out 2>&1 ; less /tmp/check.out — Runs all the functional tests to detect baleful effects of the switchover. Tests all passed.
Unbound will do DNSSEC out of the box, unless you sabotage it in the configuration. For testing DNSSEC: with dig, +dnssec is needed to display the RRSIG. internetsociety.org (and lots of others) have validly signed data. dnssec-failed.org has an invalid signature. When you dig its 'A' record through a DNSSEC-verifying server you get SERVFAIL (with or without +dnssec). If you specify +cdflag (checking disabled) then the server will deliver this site's 'A' record. Tidbit: if you specify +multi, dig will wrap long lines more readably and will show an interpretation of some of the arcane fields in the DNSSEC records.
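For example, the tests just described boil down to commands like these (query names from the text):

    dig @localhost +dnssec +multi internetsociety.org A
    dig @localhost dnssec-failed.org A            # expect status: SERVFAIL
    dig @localhost +cdflag dnssec-failed.org A    # checking disabled: the A record is delivered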
Try at internet.nl to test IPv6 support. Conclusion: Hurricane Electric's forwarder does validate the test site's domain name signatures using IPv6 transport.
rootcanary.org tests most or all algorithm variants that are representable in DS and DNSKEY records. Our outcome: pairing each DS algorithm from SHA-1, 256 and 384, but not GOST, with each signing algorithm from DSA, RSA variants, ED25519, ED448, ECDSA variants, but not ECC-GOST or RSA-MD5, Hurricane Electric's forwarder can verify a RRSIG with each combination of algos. GOST is the Russian suite of crypto algorithms. The MD5 algo is deprecated due to known weaknesses.
To test the local Unbound itself, I temporarily turned off all forwarding (by changing the forward-zone's name to "su.") and reloading. Outcome: local Unbound can verify internet.nl's signatures by itself, gives SERVFAIL as it should on dnssec-failed.org, and can verify using all the algorithms for which tests are provided on rootcanary.org, except not GOST or RSA-MD5.
Tutorials on zone signing: an executive summary of DNSSEC, titled DNSSEC — What Is It and Why Is It Important?.
Zone signing steps:
You will generate two keys for each zone. The Zone Signing Key (ZSK) is used to sign each of the sets of records (RRsets) in your zonefile, while the Key Signing Key (KSK) signs the ZSK, and is used to produce a DS record containing this signature which you can send to your parent zone for inclusion, thus establishing a link in your chain of trust from your parent to your own zone.
It's recommended, and not too burdensome, to use a separate set of keys for each zone.
You need to start by choosing the algorithm and key length of your two zone keys. It's apparently not too easy to roll over to a new key with a different algorithm.
From the Sinatra tutorial (2012), one example shows RSASHA256 as the algo, with a key length of 1024 for the ZSK and 2048 for the KSK. Currently (2020), NSA recommends, since quantum computing may become practical during the lifetime of keys created presently, that 2048 bits be the minimum for a RSA key. Also, an elliptic curve over a prime field of modulus 2^255 − 19, here labelled ED25519, seems to be winning the confidence of the cryptographic community, and this algo is getting widespread deployment. As elliptic curve algos are substantially more efficient and more compact than RSA, I'm inclined to use ED25519 for my keys. (A large-scale quantum computer running Shor's algorithm would break elliptic-curve keys as well as RSA keys, so efficiency and compactness are the stronger arguments here; against classical brute force, ED25519 is very strong, as discussed below.)
An authoritative introduction to ED25519 may be found in Ed25519: high-speed high-security signatures by Daniel J. Bernstein (2017-01-22). A brute force attack takes about 2^128 trials, so ED25519 has equivalent strength to a 3000-bit RSA key, or a 128-bit symmetric block cipher such as Rijndael (AES). Signatures are 64 bytes (512 bits) long, and public keys are half that size.
See also this Wikipedia article about Curve25519. See also RFC 7748. Daniel J. Bernstein first released this elliptic curve in 2005, but later paranoia about the NSA's recommended elliptic curves led to a lot of interest in this curve and widespread deployment and adoption in standards.
ldns-keygen generates a key pair. There are three output files: the one with extension .key contains the (public) DNSKEY record; the one ending in .private contains the private key; and the one ending in .ds contains a DS record which could be sent to and inserted in your parent zone; this is a hash of the DNSKEY. (DS is only put out when you generate a KSK.) The basename of the files is K${name}+${algo}+${keyID} .
The private keys should go in a directory outside Unbound's chroot jail. The directory has to be backed up and you need to take the usual precautions to protect the secret key in the backup files. It would normally be on the master site, i.e. the one with the master nameserver, and the master site would be on a secure subnet of the local LAN that is not accessible to the global hacking community. For me it's file://jacinth/home/hostdata/dnssec.
Key generation is instantaneous, unlike a RSA key. The files are written in the current directory. The records are printable; the payload of the DS record is a hex string, while the DNSKEY and the private key are base64 encoded. The DNSKEY file has mode 644 (everyone can read) and the private key is 600 (only owner can read-write). The KSK and ZSK have different key IDs (avoiding overwriting). The KSK has a distinguishing flag: the 1 bit in the first integer (flags) in the text representation. (The ZSK has a flags field of 256 while the KSK has 257.) Also, only the KSK has an accompanying DS file. The program can be run as a non-root user; make sure that the user unbound can read the files afterward.
Jimc's command line (done manually at setup or rollover):
ldns-keygen -a ED25519 [-k] cft.ca.us
Option interpretation: -a gives the signing algorithm; -k makes this a Key Signing Key (omit it to make a Zone Signing Key); the final argument is the zone name, which becomes the name substring in the output filename. I successfully gave my zone with an ending dot, but I get the impression from documentation that the ending dot is usually omitted.
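The output files then look like this (the key ID 12345 is hypothetical; 015 is the DNSSEC algorithm number for ED25519):

    Kcft.ca.us+015+12345.key       # public DNSKEY record, mode 644
    Kcft.ca.us+015+12345.private   # private key, mode 600
    Kcft.ca.us+015+12345.ds        # DS record for the parent zone (KSK only)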
To sign a zone (recommended to put these in a script):
ldns-signzone -b -e YYYYMMDD -i YYYYMMDD -n -f ../master/cft.ca.us.zone /home/hostdata/cft.ca.us.zone /home/hostdata/Kcft.ca.us+015+12345 …
Option interpretation: -b writes a readable layout with comments on the DNSSEC records; -e and -i set the signature validity interval (expiration and inception); -n uses NSEC3 instead of NSEC; -f names the output (signed) zone file; then come the input zone file and the base names of the key files (ZSK and KSK).
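A sketch of a signing script built around that command (dates computed on the fly; the key IDs are hypothetical):

    #!/bin/sh
    E=$(date -d '+3 months' +%Y%m%d)   # signature expiration
    I=$(date -d '-1 day' +%Y%m%d)      # signature inception, backdated a little for clock skew
    ldns-signzone -b -e $E -i $I -n -f ../master/cft.ca.us.zone \
        /home/hostdata/cft.ca.us.zone \
        /home/hostdata/Kcft.ca.us+015+12345 /home/hostdata/Kcft.ca.us+015+54321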
This command line, executed on my host Petra, makes a simple test query for Petra's IPv4 address on Petra's leaf server, on which cft.ca.us. (plan A) is a stub zone forwarding to the master site Jacinth. (+multi folds long lines for easier reading.)
dig @localhost +dnssec +multi petra.cft.ca.us. A |& less
An 'A' record and a RRSIG (Resource Record set Signature) are returned. You will see in the header flags the AD bit, which means that Petra's leaf server certifies to the client (dig) that the RRSIG was made with the ZSK for that zone (DNSKEY record, not included), the ZSK was signed with the KSK, and there is a chain of trust from one of the various trust anchors which the leaf server has, to the KSK. In this case the trust anchor is the DS record for the KSK which I installed with the leaf server, but normally the chain of trust would start with the root key in /etc/unbound/root.key.
This outcome is a success. Now let's change localhost to Jacinth. The flags now include AA because Jacinth truly is authoritative for this zone, but AD is gone. The authoritative data will not be accepted as authentic, and in particular, non-authentic SSHFP records are not accepted for authenticating the server being connected to. This is a showstopper.
So why is the authoritative data not authentic? I don't have a good reference to a discussion of this point, but I can provide some of my own hot air. The difference comes from the trust relation between the client (the one making the DNS query) and whichever DNS server turned on the AD bit. The main use case for DNSSEC is for the client to obtain DNS data that it can actually trust, such as the IP address of a bank, a brokerage, a mail server, or a VPN endpoint. The client has no trust relation with the foreign DNS server that provides this data, nor with the various forwarders that may provide the data out of their caches, so if any of those servers alleges that the data is authentic by turning on the AD bit, the client's software should not believe and should turn off AD again. The client needs software that it trusts, in my case an instance of Unbound running on my own machine and set up and supervised by me or by my I.T. staff who I trust to not have gone over to the Dark Side. So only my own recursive and validating DNS resolver should do the work to determine the validity of DNS data.
It is not normal for a local resolver to have authoritative data for anything. It is also not too easy for the resolver to know for sure that a particular query is coming from a local user who is going to trust an AD bit, or from elsewhere where the AD bit will be considered an attempt at fraud. I'm guessing here, but I would say that the Unbound developers (and similarly for other software) decided that these issues are a can of policy worms that they didn't want to deal with, given the rarity of the situation. Therefore authoritative (AA) data is always sent with the AD bit turned off.
How can I recover from this design problem? By doing what the offsite servers do: I'll provide the authoritative data from a server separate from the recursive and validating resolver. Leaf nodes will continue to have only that server (on port 53), and directory servers will have the same configuration, again on port 53. The dirsvrs will have a second DNS server on port 4253, configured as authoritative but never used recursively. Instead of stub zones, the leaf servers will forward queries for local data to all three dirsvrs on port 4253. Offsite queries will be forwarded to the master site's leaf server, so we get a local cache of all such queries.
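A sketch of the plan-B wiring (the zone name and the one dirsvr address come from this document; the other dirsvr addresses and the exact stanza placement are assumptions):

    # Leaf nodes (and the dirsvrs' own port-53 resolvers): forward local zones to the
    # authoritative instances on port 4253, and everything else to the master site's resolver.
    forward-zone:
        name: "cft.ca.us."
        forward-addr: 192.9.200.193@4253    # repeat for the other two dirsvrs
    forward-zone:
        name: "."
        forward-addr: 192.9.200.193         # master site's leaf server, port 53

    # Second Unbound instance on each dirsvr: authoritative only, listening on port 4253.
    server:
        port: 4253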
The leaf server has trust anchor(s) configured by me by which it can validate data it receives from otherwise untrusted foreign (or local) servers. One of these trust anchors is the public KSK of the root server; all validating resolvers need this, and there is a procedure (RFC 5011 and RFC 7958) by which they can obtain and validate the root key, implemented in Unbound's unbound-anchor helper program. In addition, the local server gets public keys (actually DS records) for zones in my island of trust, served by my organization's authoritative servers, which cannot be validated by reference to the global root. (The DNSSEC Lookaside Validation (DLV) service of RFC 5074 is an alternative, but it has been deprecated since 2017.)
Clients send their queries with the DO bit on, signifying that the local recursive server should validate the requested data working from the provided trust anchors, and the client hopes that the AD bit will be set in the response, meaning that all the signatures matched the payload data. The leaf server requests the data from untrusted sources without the DO bit, and it does not expect the AD bit, which it would not trust, in their responses.
What software should I use for the authoritative servers? I'm relying on this Wikipedia article on Comparison of DNS Server Software. I limited
the software to those that are authoritative, slave-capable, with DNSSEC, with
IPv6, and free software. Then I read the detailed descriptions to determine
whether they could emit AXFR and IXFR, and other noteworthy aspects. All
packages that are slave-capable can read both AXFR and IXFR. Several of these
packages can rely on a backend database, e.g. MySQL, with its own replication,
so AXFR/IXFR are not used (though available).
Now let's compare Unbound with the rest of them in a pro&con format.
Unbound gets a big plus because I've already installed it and have learned how to configure it, and where some of the skeletons are buried.
Comparing IXFR vs. AXFR, how bad is it to not emit IXFR? With my small zones, IXFR's benefit is microscopic, whereas with complex and active zones like a TLD, there would be much more benefit for the slave, but the design and execution difficulties for the master to create the IXFR are incredible, and at least one package (NSD) known to be used on some TLD servers even so emits AXFR only.
Unbound has an alternative distribution method over HTTP. What's wrong with using that?
The objection is that it isn't real DNS (zone transfer by AXFR/IXFR). But we should be picking mechanisms on technical merit, not purity or political correctness.
My conclusion then is to use Unbound. I will put a slave server on all the directory server machines, and the collection of zone files via HTTP(S) will be the actual master. The slaves will not be told the master's IP (because there is no real DNS master site), so they will have to download the zones on every NOTIFY, but with my operating procedures the notifies are sent only when there actually is a new zone. To avoid chicken-and-egg issues, I will use the master's IP in the distribution URLs.
This design is working out: all nodes can retrieve authentic (AD, validated) SSHFP records for a host that has them, and the off-site test domains do or don't deliver authentic data as appropriate.
The SSHFP record is governed by RFC 4255, 6594, and 7479. See SSHFP on Wikipedia for a non-normative description of the record. Its text representation is two integers and a hex string, as follows:
Dig has a bug causing it to split this field into a field of 56 hex digits and one of 8 digits, for algo 4 (ED25519).
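The two integers are the key algorithm (1 = RSA, 2 = DSA, 3 = ECDSA, 4 = ED25519) and the fingerprint type (1 = SHA-1, 2 = SHA-256), followed by the hex hash of the public key. A record would look like this (hostname from this document, fingerprint invented and truncated):

    petra.cft.ca.us. IN SSHFP 4 2 0123456789abcdef...   ; ED25519 key, SHA-256 fingerprint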
To extract from a running SSH server a SSHFP record that you can put in your DNS zone file (remember that the zone file has to be signed for the SSH client to believe in it):
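The exact command isn't reproduced here; one way to do it is ssh-keyscan, whose -D option prints the keys it collects as SSHFP records (hostname assumed):

    ssh-keyscan -D petra.cft.ca.us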
Alternatively use your backup of the host keys:
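Again as a sketch, ssh-keygen -r prints SSHFP records from key files on disk; point -f at the backed-up public key (paths assumed):

    ssh-keygen -r petra.cft.ca.us -f /backups/petra/etc/ssh/ssh_host_ed25519_key.pub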
It takes some reconfiguration to get SSH to actually use the SSHFP records.
First, the SSHFP record belongs to the target's FQDN, not to the target's 1-component name. If you intend to use 1-component names (and who doesn't?) you need CanonicalizeHostname always early in the applicable Host section of /etc/ssh/ssh_config (client configuration).
You also need the CanonicalDomains statement. Its value is a space separated list of domains to append to the non-canonical name; SSH doesn't use the searchlist from /etc/resolv.conf. I put ending dots on my domains.
CanonicalizePermittedCNAMEs *:* is also important if the 1-component name plus the domain ends up at a CNAME; e.g. you do ssh backup burn-the-cd where backup.cft.ca.us is a CNAME to the actual backup server, Diamond. *:* means all CNAMEs are allowed; see the man page if you want to be more paranoid.
CanonicalizeFallbackLocal yes means that if the hostname cannot be canonicalized, SSH should continue with the non-canonical name (and without being able to find SSHFP records). No would mean that a botched canonicalization kills the session.
Following the canonical control statements, it is tempting to put Match canonical, meaning to process the rest of the configuration only when the canonicalization is finished. But suppose it has to CanonicalizeFallbackLocal? The given hostname would not be canonical, would not match, and the session would not get the important settings that follow.
Finally, set VerifyHostKeyDNS yes so SSH will look for SSHFP records, and if found will treat them as equivalent to entries in known_hosts.
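Collected into one place, the client settings discussed above would look roughly like this in /etc/ssh/ssh_config (the Host pattern and domain are my assumptions):

    Host *
        CanonicalizeHostname always
        CanonicalDomains cft.ca.us.
        CanonicalizePermittedCNAMEs *:*
        CanonicalizeFallbackLocal yes
        VerifyHostKeyDNS yes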
In addition I made these changes not related to SSHFP:
StrictHostKeyChecking accept-new means to save a host key automatically in known_hosts (with a warning) if not there before. But if it's in known_hosts and is different, the session is killed with a lurid message.

Formerly I had this at ask, requiring user interaction for any additions to known_hosts.
Since I want to encourage the client and the server to use ED25519,
I moved these algorithms to the beginning ('^') of the respective lists.
HostKeyAlgorithms ^ssh-ed25519-cert-v01@openssh.com,ssh-ed25519,ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-sha2-nistp256
PubkeyAcceptedKeyTypes ^ssh-ed25519-cert-v01@openssh.com,ssh-ed25519,ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-sha2-nistp256
There's a nice new feature: you can have SSH solicit a list of all host keys that the server has, and missing ones will be added to known_hosts. Format: UpdateHostKeys yes. However, in the likely case that the server always sends the same key, there's no real saving from having the other keys, and I'd like to keep known_hosts empty, so I did not turn this on.
Now a user, with an empty or nonexistent known_hosts file, can successfully connect with SSH using these hostnames:
So this project has ended in success: known_hosts is no longer needed to validate a SSH server that has SSHFP records. The server's host key is not added to known_hosts if the SSHFP record was used to validate the server.
If you connect to an IP address, SSH does not look for a PTR record to canonicalize it, and per StrictHostKeyChecking=accept-new, the host key is added to known_hosts under the IP address with no user interaction but with a warning on stderr.
If UpdateHostKeys=yes, and you connect to a server not in known_hosts, client SSH asks for all the server's host keys, and adds all of them to known_hosts under the canonical hostname plus the IP used to connect, with no user interaction and no warning message. (For most of my use cases, this is very helpful to keep garbage out of log files.)
If UpdateHostKeys=yes and you connect to an IP address and it is not in known-hosts, the host key is added to known_hosts under the IP address with no user interaction but with a warning on stderr, same as without UpdateHostKeys. But the second time you connect, i.e. if the IP and key are already in known_hosts, the client will request all the server's host keys and will add them to known_hosts (except the one already known) under the IP address.