On 2016-10-21 an unknown perpetrator, for an unknown purpose, used a botnet marshalled by the Mirai malware to mount a Distributed Denial of Service (DDoS) attack on dyn.com's DNS servers worldwide. Here are some web resources about the attack.
Dyn Statement on 10/21/2016 DDoS Attack by Kyle York (Dyn Chief Strategy Officer), 2016-10-22. The attack came in three waves targeting sets of servers in different geographic locations. At least 1e7 bots participated in the attack (actually considerably more than 1e7). Mirai was used to organize them. Defense measures mitigated the first wave, taking about 2 hours to restore service. The second wave was mitigated in about one hour. The third wave was mitigated with no effect on service to customers.
Source Code for IoT Botnet 'Mirai' Released by Brian Krebs, 2016-10-01. He quotes the announcement on Hackforums and analyzes what the bot can do. The malware author says he/she can use it to assemble a botnet of 3e5 to 3.8e5 bots. Comment poster Nicholas Lim is the founder and CEO of AthenaLayer, a cloud-based DDoS protection service. His own website (protected, of course, by his own company) was DDoS'ed, and he gives some statistics on that attack: 5e4 requests per second, 2.8e5 packets per second, 2.2e9 bytes per second. He has a map of the geographical locations of the bots' IP addresses. (But unfortunately the given links are dead, with unusual symptoms suggesting an administrative ban by the webserver.)
Hacked Cameras, DVRs Powered Today's Massive Internet Outage by Brian Krebs, 2016-10-21. He reports that many (most?) of the bots for this attack were Internet of Things (IoT) devices with telnet or ssh administrative access; the password is hardwired in the firmware and cannot be changed. One comment poster translates IoT as Internet of Targets. One particular Chinese vendor is pointed out as the maker of widely used firmware of this type.
Although the sheer number of incoming packets can be a problem in a modern DDoS, the major issue is to avoid performing expensive services or spending outgoing bandwidth (like sending a web page) for the bad guys: you want to just toss their packets. So how do you distinguish good clients from evil ones? After the attack has started you can look for common features of the evil traffic and install a firewall rule that does packet inspection and tosses it. But my goal is to defend my own net, which requires the method to be generic and to work without a lot of handholding by the sysop.
This necessarily means that you need to cue on the source address of the incoming packets -- which for a lot of attack techniques might or must be spoofed. Spoofing is hard to detect at the victim end, but much easier at the source. But do source ISPs actually suppress spoofed source addresses?
Fake source addresses are useful for some but not all attack styles. A TCP connection attempt from a non-answering (spoofed) address ties up resources on the victim until the half-open connection times out -- a SYN-flood attack. Packet-oriented services like DNS or VPNs are best attacked from a fake source, so the reply packets (an expensive reply, in the case of DNS) vanish without interfering with the bot's own outgoing traffic. But an attack on a webserver is most effective if the bot reads the entire requested web page, making the victim do maximal work.
What protections do ISPs place on spoofed source addresses? RFC 791 (the IP specification) assumes there will be no such checking. RFC 2827 (BCP 38) recommends that retail ISPs, i.e. those that can easily distinguish client subnets from their egress route(s), drop every client packet whose source address is outside the range assigned to that client -- in other words, an address that would itself be routed out an egress route. But do they comply?
Addressing the Challenge of IP Spoofing, Internet Society white paper (2015-09-09). Very little progress has been made on improving the situation. Leaf routers, connected directly to customers, should reject packets sourced from an address or range other than the one assigned to that customer. Mobile (cellular) networks normally follow BCP 38 since serving bots costs them a lot. Cable TV (DOCSIS 3.0) carriers usually follow BCP 38; the white paper doesn't give an uptake metric for DSL or fiber nets. For IPv6, RFC 6092 requires that BCP 38 be followed, but the author is unsure how widely this is obeyed. However, one apparently popular IPv6 Ready logo and certification test requires BCP 38. The author believes that in colo hosting, adherence to BCP 38 is spotty, and he says most spoofed traffic comes from colos. (Jimc says: obviously the majority of the dyn.com DDoS traffic did not come from colos.)
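For concreteness, the kind of filter BCP 38 asks for might look like the following on a Linux-based leaf router. This is only a sketch: the interface name eth1 and the customer prefix 192.0.2.0/24 are made up, and real carrier gear generally does the same thing with vendor-specific features such as unicast reverse-path forwarding.

    # Per-customer version: drop anything from the customer-facing interface
    # whose source address is outside the prefix assigned to that customer.
    iptables -A FORWARD -i eth1 ! -s 192.0.2.0/24 -j DROP

    # Generic version: strict reverse-path filtering -- drop packets whose
    # source address would not be routed back out the interface they came in on.
    sysctl -w net.ipv4.conf.all.rp_filter=1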
A significant class of attacks still works when the fake source address is in the same subnet as the real source, evading BCP 38 applied at the subnet level. But in a reflection attack the attacker sends a packet, sourced from the victim's address, to a server, typically DNS, which then sends the victim a large reply packet. BCP 38 is particularly effective against this kind of attack, since the victim is (almost) never on the server's subnet.
An important point that is not discussed: do the spoofed addresses repeat, or are they randomized on every connection? Repeating addresses are too easy to defend against, so it is much more likely that they are randomized.
The firewall on each CouchNet host junks packets sourced from an address outside the local net. However, a host is still allowed to send packets sourced from a different or nonexistent local machine, which is not best practice.
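A minimal sketch of that kind of rule, assuming it applies to locally generated (outgoing) packets; the addresses and interface name are placeholders (local net 192.0.2.0/24, this host 192.0.2.10, interface eth0), not CouchNet's actual numbers:

    # Drop outgoing packets whose source address is outside the local net.
    iptables -A OUTPUT -o eth0 ! -s 192.0.2.0/24 -j DROP

    # The stricter, best-practice version would permit only this host's own address:
    # iptables -A OUTPUT -o eth0 ! -s 192.0.2.10 -j DROP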
Conclusion: while some spoofed addresses are being suppressed at the source, large numbers of ISPs are allowing their clients to send packets to victims with spoofed source addresses.
I had a brainwave for a method to resist and defend against a DDoS. The goal is to provide normal service to legitimate clients despite massive traffic from attacking bots. The basic plan is to use a fq_codel traffic control queue discipline (Fair Queuing with Controlled Delay), or possibly the hashlimit or recent firewall modules, to split up the packets by source address. Then a limit is put on the rate of packets coming from particular IP addresses or address ranges. Clients that make connections at a modest rate are considered legitimate; those sending lots of packets get tossed.
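As a rough illustration (not a tested configuration: the port, rates, burst, and table size are invented, and eth0 stands for the interface facing the attackers), the hashlimit variant might look something like this:

    # Drop new web connections from any /16 source block exceeding 10 per second.
    # --hashlimit-srcmask 16 aggregates sources into 16-bit (/16) blocks;
    # --hashlimit-htable-max bounds the memory used by the tracking table.
    iptables -A INPUT -i eth0 -p tcp --dport 80 --syn \
        -m hashlimit --hashlimit-name ddos --hashlimit-mode srcip \
        --hashlimit-srcmask 16 --hashlimit-above 10/second --hashlimit-burst 20 \
        --hashlimit-htable-max 65536 -j DROP

But this scheme has some problems.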
If the source address is spoofed and is randomized on every connection, it becomes useless for recognizing bad packets.
It isn't practical to handle hundreds of thousands of attackers individually, because each one needs memory to record its rate of ingress. I would need to aggregate clients into subnets, probably keeping only 16 bits of the address. That means a legitimate client with a neighboring bot will get blocked; all early-acting mitigation strategies have this problem. IPv6 is worse because the address is longer.
The conntrack system, and in particular the rate estimator for traffic control, distinguishes flows. In principle these are individual per peer IP and port (and local port). I wonder whether memory might be used up by a very large botnet making multiple connections per bot. Or is some kind of hashing used, which could commingle good and bad flows? It's got to be the latter.
Some legitimate clients connect frequently, specifically web crawlers. If the total load is below a limit, the traffic control scheme should handle all packets; but when the load is high, the most frequent requesters' packets should be dropped. The criterion for a high load should be that packets stay in the traffic control queue too long. A codel or fq_codel queue discipline is designed for exactly this kind of control.
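For the queue-delay part, here is a sketch of how fq_codel could be applied to incoming traffic on Linux. Queue disciplines normally act on egress, so the ingress traffic has to be redirected through an ifb pseudo-device first; the device names and parameters are illustrative, not a tested setup.

    # Create an ifb device and redirect ingress traffic from eth0 to it.
    ip link add ifb0 type ifb
    ip link set ifb0 up
    tc qdisc add dev eth0 handle ffff: ingress
    tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0 \
        action mirred egress redirect dev ifb0

    # Queue the redirected traffic through fq_codel: flows are hashed and served
    # fairly, and packets that linger past the target delay start getting dropped.
    tc qdisc add dev ifb0 root fq_codel limit 10240 flows 1024 target 5ms interval 100ms

Note that fq_codel's flow hashing keys on the full connection 5-tuple, not on source subnets, so by itself it does not give the /16 aggregation discussed above.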
Conclusion: This method is useless when the attackers spoof their source addresses (except against a reflection attack, where the packets reaching the victim carry the reflectors' real addresses). But it could be useful to protect a webserver, where the attacker has to actually interact with the server to do damage. Comparing the (low) likelihood that I will be attacked against the amount of work needed to create the defense and the fraction of attacks it would repel, I think it's not a good idea for me to go forward with this defense method.