Prev: GPG Operation Cancelled	Next: Building a SuSE RPM Package
	(Index)

Jim Carter's Bugfixes

/dev/random Blocks Forever

James F. Carter

2017-02-15

Symptom:

You are creating a secret (private) key for SSH, GPG or TLS (host key), or are connecting to another host using TLS, and the operation takes ridiculously long. It will finish eventually but is much slower than usual.

What's happening:

The crypto software reads /dev/random to get some truly (you hope) random numbers, but the machine's entropy pool is depleted and /dev/random blocks until more entropy is captured. On a less active machine, particularly one where the user is just sitting there waiting for an operation to finish, entropy is produced slowly.

You are supposed to be running a daemon which collects entropy from a variety of sources not practical for the kernel to tap into, but it has gone catatonic or has died.

See the end of this page for a discussion of entropy.

How to fix:

First, does your host have the kernel's hardware-specific driver that collects entropy? Do grep '_rng_driver' /proc/kallsyms and look for $ARCH_rng_driver where on x86 (Intel, AMD, Via) processors you may see ARCH = amd intel tpm via virtio; on Broadcom 2835 ARM (Raspberry Pi) look for ARCH = bcm-2835.

If the kernel has no trace of a random driver, look in /lib/modules/$VERS/kernel/drivers/char/hw_random/ (get $VERS from uname -r) and see which one(s) will load. If you have no modules, look for your architecture in the kernel source directory /usr/src/linux/drivers/char/hw_random. Details for some modules:

On Intel and AMD not every processor has the random generator that the module is looking for, so it won't load. On recent processors the RDRAND instruction is used instead. There is no kernel support to use RDRAND; see below for the politics on this.
tpm uses the random generator that comes with the Trusted Platform Module. The kernel module loads but often provides no entropy. The machine may look like it has a TPM but it doesn't; or the TPM may not be enabled in BIOS; or it may not be initialized as part of secure booting.
In a virtual machine (which has particularly slow entropy production), the virtio driver steals entropy from the host. It will load, but to provide any entropy it needs a stanza in the guest definition XML like this:
<rng model='virtio'>
<backend model='random'>/dev/random</backend>
<rate bytes='4096' period='1000'/>
</rng>
To protect the host from a runaway guest, you limit the number of bytes that can be stolen every so many milliseconds.

If you do have a functioning hardware random number generator then you should have a device /dev/hwrng and it should produce random bytes at a reasonable rate. To test: dd if=/dev/hwrng of=/dev/null count=8 iflag=fullblock. (Default blocksize for dd is 512 bytes.) On one of my virtual machines this stole 4096 bytes from the host in 0.14 seconds.

On a x86 (Intel or AMD) CPU, you also need to know if it has the RDRAND instruction. Look in /proc/cpuinfo, and among the CPU flags look for rdrand.

Now you need to inject this entropy into the kernel's pool. Currently there are two entropy daemons that do this. If you have either /dev/hwrng or the RDRAND instruction, you should run rngd from the rng-tools package. (When RDRAND is present it reports the DRNG entropy source.) It is aware when the kernel's entropy pool drops below a limit. Then it reads enough random bytes from its sources to fill the pool, configured to 3700 bits (not 4096) on a SuSE system. Under heavy load it can refill the pool very fast; I've never seen the pool under 3600 bits.

If you have no official hardware entropy source, as on my three older AMD machines, you need to run haveged from the haveged package. See this page for where haveged gets entropy. It has to work harder to get entropy, and is therefore slower at refilling the pool, but it does serve the purpose.

It is not harmful to run both daemons, but haveged uses more CPU (actually not too much) for no real added benefit, and testing the random number generator with two entropy daemons can get confusing, so it's recommended to pick one or the other.

What Is Entropy? In thermodynamics it is the logarithm of the number of states a system (e.g. an engine) can be in, which are assumed to be equally probable and unknown to the observer; hence the choice of a state is random or unpredictable. In computing it is the number of bits of random or unpredictable state information collected in the kernel's entropy pool. This is the same definition because the number of bits equals the logarithm of the number of states (each times a scale factor). The essential feature of a random number, that counts as entropy, is that an attacker cannot predict it. Thus the number, however many bits you need, can be used as a secret key in a cryptographic algorithm, and the attacker can only capture the payload by trying every possible value for the key; enough bits are used that a brute force attack like that requires impractically much resources or run time. When numbers are unpredictable they have certain statistical characteristics, e.g. the probability is 1/2 that a bit will be 1 or 0, the autocorrelation function is zero, and various more complex properties. To be accepted as truly random, the output of the hardware generator must pass these tests. Remember that truly random data will fail tests in a certain fraction of the trials, inversely proportional to the quantity of random numbers tested. Random numbers that never fail tests have been tampered with and have lost some of their unpredictability. A pseudo-random generator is a deterministic algorithm (a program) designed to produce output that is indistinguishable from truly random data. However, the pseudo-random generator needs to start from some initial state, called its seed, and if the attacker discovers the seed he can predict exactly what the pseudo-random generator will produce. But if the seed is truly random, so is the pseudo-random output, within limits. But the pseudo-random generator necessarily repeats eventually, and the more of the generator's output is available to the attacker, the better he can deduce the algorithm and the seed. With weaker algorithms the needed history is surprisingly short (and you need to use a cryptographically secure algorithm). Thus it's prudent to re-seed the generator from time to time. See this blog post about RDRAND and RDSEED by John M (Intel) (2012-11-17) as well as Intel's implementation guide for the Digital Random Number Generator (DRNG). Intel's RDRAND instruction delivers the result of a pseudo-random generator with a long repetition period, that is re-seeded from a noise-based source after up to 511 outputs (random). RDSEED delivers, more slowly, direct from the source, skipping seeds used for RDRAND. Manufacturing constraints limit what kinds of random sources a CPU chip can include easily; thermal (Johnson) noise is most often used, and shot noise is also practical. With today's emphasis on security, modern chips generally include a noise-based bit source, and its output can be transferred by the entropy daemon to the entropy pool. In addition, the haveged daemon uses variations in the speed of instruction execution, caused by competing processes, as a source of randomness, and the kernel drivers for the disc, network, keyboard and mouse similarly report unpredictable timings to the entropy pool. Hyper-paranoid cryptography experts point out that the noise-based bit generator is closed source, so we cannot be sure just what it is doing, and it is possible (jimc says, possible but very unlikely) that adversaries have induced the manufacturer to design it so its output is predictable if a procedure is used that is known only to the adversaries. For example the adversary might be able to set the seed and turn off re-seeding. Thus if very difficult conditions are met, the adversary could steal a secret key and decrypt an encrypted message. The conditions include: The random generator is not the only source of entropy. The adversary must predict all the other sources too. The adversary must exactly emulate the kernel's procedure in adding to and extracting from the entropy pool, to predict what /dev/random would have sent out. The adversary must be aware of exactly when the victim process reads /dev/random, so as to know which bits it receives out of the whole random bitstream. So jimc (and many others) does not take seriously the possibility of a predictable random generator. Even so, a political decision was made to not include the RDRAND instruction among the kernel's own entropy sources, relying on rngd to use it. Paranoids can instead use another entropy source that suits their security rules. How does the Linux kernel convert incoming entropy pool bits into the output of /dev/random? See this blog post about the Linux random generator by Aaron Toponce (2014-07-21). In summary, the various entropy sources are added to the input pool using a fast mixing algorithm. When random numbers are needed, the pool is hashed using SHA-1 and the result is used as the seed for a pseudorandom number generator (PRNG) producing 32bit integers. These are cached in two pools of 1024 bits (32 ints) each, one for /dev/random and one for dev/urandom. When random data is sent out by /dev/random, the entropy estimate of the input pool is decreased by the number of bits sent, and if it reaches zero, /dev/random blocks until more entropy is added to the input pool. /dev/urandom does not block, and its PRNG is re-seeded not more frequently than once every 60 seconds, decreasing the entropy count unless already 0. This means that Linux random numbers are cryptographically secure, meaning that they are truly random and unpredictable, but for regulatory compliance they cannot be considered independent, having been produced deterministically by a PRNG (between re-seeds). Jimc says (following the advice of others): For bulk random data, e.g. for shredding documents, /dev/urandom is the right source, since it can safely deliver large amounts of random data. For session and host keys that are produced separately and asychronously, the relatedness of PRNG output is not exploitable to compromise the key. Thus /dev/urandom is good enough. /dev/random is less deterministic, but is not 100% nondeterministic because a PRNG is still involved, so there is no real benefit in using it. On x86 CPUs, the RDRAND instruction also uses a PRNG and has the same strengths and weaknesses. It would be a good move for Linux to emulate Intel's RDSEED instruction: deliver one random output direct from the entropy pool. Running it once through the PRNG would be helpful to whiten any peculiarities in the entropy pool format, since the fast mixing algorithm is not intended to be cryptographically secure. As Intel recommends in their DRNG documentation, a random number of this level of non-determinism is mainly useful to create a seed for an external PRNG. Such non-determinism is irrelevant for cryptographic keys.

Prev: GPG Operation Cancelled	Next: Building a SuSE RPM Package
	(Index)