Jim Carter's Bugfixes

Home Network Dies on Cable Due To Small MTU

James F. Carter

You change ISPs. (I replaced Verizon DSL at 0.768 Mbit/s with Time Warner cable at 15 Mbit/s. This is in Los Angeles, and technical details might be different in other cities.) You can ping your favorite wild-side sites, but you can't view a web page, make an SSH connection, retrieve mail with IMAP, etc. If your router, the machine to which the wild-side connection is attached, is a Linux or Windows box, it can make connections itself, but hosts on your internal network are unusable.

The same issue sometimes happens when you use a VPN. It bit me in the past when I used IPsec (OpenS/WAN), and the fix described below worked there too. (OpenVPN, however, has been trouble-free.)

What's happening:

There are various issues that can prevent connecting, but in my case the problem was the MTU (Maximum Transmission Unit). Each physical network has a limit on the maximum size of packet that it can handle. RFC 791 requires every host to accept datagrams of at least 576 octets (8-bit bytes), and that figure is commonly treated as the minimum MTU. The normal Ethernet MTU is 1500 bytes, and larger values can be used for special purposes on a local network; for example, performance of the Network File System on SunOS-3 is significantly improved if you raise the MTU to 4096 bytes.

My home network uses an MTU of 1500, and so did Verizon DSL, but Time Warner cable has an MTU of 576, the minimum value mentioned above. When my machines send 1500-byte packets which have to be squeezed onto the 576-byte default route, two outcomes are possible. If an IPv4 packet doesn't have the do-not-fragment bit set, the router is supposed to send it in multiple fragments which the recipient will reassemble. If it does have this bit set, then following RFC 1191 the router is supposed to toss the original packet and send back an ICMP destination unreachable, fragmentation needed response (often described as packet too large) carrying the actual MTU. (This is ICMP type 3 code 4.) IPv6 routers never fragment, so for IPv6 the router always sends the equivalent ICMPv6 packet too big message. The origin host should then resend the data in packets limited to the reported MTU (for TCP this means lowering the MSS (Maximum Segment Size), which is the MTU minus the size of the IP and TCP headers).
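The MSS arithmetic can be checked with a few lines of shell. This is only a sketch: it assumes headers without options (20 bytes for IPv4, 40 for IPv6, 20 for TCP), and the function name mss_for_mtu is made up for this illustration.

```shell
# mss_for_mtu MTU [IPVERSION]
# Print the largest TCP segment that fits in one packet of the given MTU,
# assuming no IP or TCP options (IPv4 header 20, IPv6 header 40, TCP header 20).
mss_for_mtu() {
    mtu=$1
    ipver=${2:-4}
    case "$ipver" in
        4) ip_header=20 ;;
        6) ip_header=40 ;;
        *) echo "unknown IP version: $ipver" >&2; return 1 ;;
    esac
    echo $(( mtu - ip_header - 20 ))
}

mss_for_mtu 1500      # prints 1460
mss_for_mtu 576       # prints 536
mss_for_mtu 1500 6    # prints 1440
```

So a host behind the 576-byte link should be sending and receiving TCP segments of at most 536 bytes over IPv4.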

My router sends the ICMP response to internal hosts, so they send sufficiently small packets, but the various outside partners never receive the packet too large report. Since I can't connect to any of a broad range of reputable and well-maintained sites such as Google, Amazon, Fidelity Investments, Hotmail, and others, the most likely explanation is that Time Warner doesn't send it, in violation of RFC 1191.
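One way to see where large packets stop getting through is to ping with the do-not-fragment bit set, at sizes that fill various candidate MTUs. The sketch below assumes the Linux iputils ping (its -M do option sets the bit); the function names, the list of candidate MTUs, and the host www.example.com are placeholders of mine. A missing reply is only a hint, since some sites do not answer large pings at all.

```shell
# ICMP echo payload that exactly fills an IPv4 packet of the given MTU:
# MTU - 20 (IPv4 header) - 8 (ICMP header).
ping_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# probe_path_mtu HOST [MTU ...]
# Ping HOST with do-not-fragment set, once for each candidate MTU.
probe_path_mtu() {
    host=$1
    shift
    [ $# -gt 0 ] || set -- 1500 1400 1000 576
    for mtu in "$@"; do
        size=$(ping_payload_for_mtu "$mtu")
        if ping -M do -c 2 -W 2 -s "$size" "$host" >/dev/null 2>&1; then
            echo "MTU $mtu: ok"
        else
            echo "MTU $mtu: no reply"
        fi
    done
}

# Example (placeholder host):
# probe_path_mtu www.example.com
```

Ordinary pings are tiny, which is why they succeed even when every full-size TCP segment is being lost.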

Peter Holland, in his blog post about MTU discovery dated 2011-08-23, gives a lot of interesting information on this syndrome, which is commonly referred to as the PMTUD black hole. Apparently violations of RFC 1191 are all too common. RFC 4821 tells how to actively probe for the path MTU, and Linux implements this, though it is not turned on by default. Windows has its own black hole mitigation strategy, which is on by default in Server 2008, XP, Vista, and Windows 7. However, these strategies have to be used at the remote (server) site, since that is the end which is not getting the ICMP report, and that site is not under the client's administrative control.
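For reference, on a Linux host the RFC 4821 probing is controlled by the net.ipv4.tcp_mtu_probing sysctl. It would have to be set on the sending server to help in my situation, so this is only a sketch of what the remote administrator could do.

```shell
# Show the current setting:
# 0 = disabled (the default), 1 = probe only after a black hole is suspected, 2 = always probe
sysctl net.ipv4.tcp_mtu_probing

# Enable probing on black hole detection. Needs root, and is lost at reboot
# unless it is also put in /etc/sysctl.conf or a file under /etc/sysctl.d/.
sysctl -w net.ipv4.tcp_mtu_probing=1
```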

Holland recommends that the client use the clamp MSS technique described below.

In the VPN case, the traffic is contained within a tunnel whose headers take away from the MTU that the sender could otherwise use.

How to fix:

If you have a commercial router, get into its configuration and make sure that MSS clamping (sometimes labeled clamp MSS to PMTU) is turned on.

I have a comprehensive firewall on my router machine, running Linux, and in the IPv4 filter FORWARD chain I added this, where it will affect all thru traffic, specifically every connection-beginning SYN packet:

iptables -t filter -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu

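This rewrites the MSS option in each SYN packet passing through the router, lowering it to fit the path MTU that the router knows about. The remote server then limits its responses to that size, and since the server's SYN-ACK reply also matches the rule, the internal client is normally limited in the same way. Any other router on the path that clamps the MSS can lower the value further, so the smallest MTU governs.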
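The --clamp-mss-to-pmtu option relies on the router knowing the small MTU, typically from the MTU of its outgoing interface. If it does not, the TCPMSS target can set a fixed value instead. A sketch, assuming the 576-byte MTU described above (so an MSS of 536 for IPv4); I did not need this variant myself.

```shell
# Force the MSS of forwarded connections to 536 bytes (576-byte MTU minus 40 bytes of headers)
iptables -t filter -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 536
```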

I also put a similar rule in the filter OUTPUT chain (for packets originating on the router) just for paranoia, although it is supposed to get the MTU right by itself, and empirically it seems to do so. Also I added similar rules to the IPv6 firewall section.
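Those additional rules are not reproduced verbatim here; a sketch modeled on the FORWARD rule above would look like this, assuming the TCPMSS target is available to ip6tables in your kernel.

```shell
# IPv4, packets originating on the router itself
iptables -t filter -A OUTPUT -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu

# IPv6, forwarded and locally originated packets
ip6tables -t filter -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
ip6tables -t filter -A OUTPUT -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
```

Many published examples put these rules in the mangle table rather than filter. In the filter table the rule has to come before any rule that accepts the SYN packet, since an ACCEPT ends traversal of the chain and the clamp would never be reached.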

This done, my internal hosts are happily connecting to external TCP servers.
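To check that the rule is actually being hit, something like the following can be used. This is a sketch: eth0 is a placeholder for your wild-side interface, and both commands need root.

```shell
# Packet and byte counters on the TCPMSS rule should increase as connections are opened
iptables -t filter -L FORWARD -v -n

# Watch SYN packets on the outside interface; the mss value among the
# TCP options shows what is being advertised after clamping
tcpdump -n -i eth0 'tcp[tcpflags] & tcp-syn != 0'
```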
