Yesterday, when I mentioned that Starlink would be of limited use in a true emergency situation in which there may be infrastructural issues with parts of the internet, many people demonstrated they don’t quite understand how the internet works.
So, this morning, we’re going to dive into that a bit. Buckle up. A 🧵.
First, your primary interaction with the internet is likely through phone apps or web browsers. To you, it’s simple as can be: you launch the app, or type in a domain in the address bar of your browser (or, if you’re like most people, just use a search engine and click on the first link that looks reasonable), and you’re taken to the content and/or functionality you want.
It’s nowhere near that simple. And I’m going to try to avoid analogies so people don’t get confused, while also trying to be as detailed as possible but still keeping it simple for the layperson.
To begin, you need to understand how you get to what you want on the internet. And that begins with addressing. Everything connected to the internet has an address. In fact, they have multiple addresses, but we’ll get to that in a bit. For now, I’m discussing Internet Protocol, or IP, addresses.
You generally don’t interact with or even see these. They look like a dotted quad: 192.168.37.42, for example.
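If you want to poke at one of these yourself, here's a minimal Python sketch using the standard library's ipaddress module. All it shows is that a dotted quad is four one-byte numbers, which together make up a single 32-bit value. The variable names are mine, purely for illustration.

```python
import ipaddress

# Parse the dotted quad from the example above.
addr = ipaddress.ip_address("192.168.37.42")

# The four numbers between the dots are the four bytes of the address.
octets = list(addr.packed)

# The same address expressed as one 32-bit integer.
as_int = int(addr)

print(octets)
print(as_int == (192 << 24) + (168 << 16) + (37 << 8) + 42)
print(addr.is_private)  # 192.168.x.x is set aside for private networks
```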
What you generally see and interact with are domain names. Fully-qualified domain names, to be specific. A domain name looks like: example.com.
In example.com, there are actually 3 domains, and a subdomain:
- www: subdomain
- example: second level domain
- com: top level domain
- .: root domain
You’re probably wondering where that final period (".") came from that I labeled "root domain". It’s implied at the end of every domain, after the top level domain.
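For the code-minded, here's a rough Python sketch of that breakdown. The function name split_fqdn and the ROLES list are made up for this illustration, and the role labels only line up for a four-part name like the one used here.

```python
def split_fqdn(name: str) -> list[str]:
    """Split a domain name into its labels and append the implied root ('.')."""
    # Drop one explicit trailing dot if present, so "www.example.com" and
    # "www.example.com." are treated the same.
    labels = name.removesuffix(".").split(".")
    return labels + ["."]


# Hypothetical role names, matching the list above from left to right.
ROLES = ["subdomain", "second level domain", "top level domain", "root domain"]

parts = split_fqdn("www.example.com")
for label, role in zip(parts, ROLES):
    print(f"{label}: {role}")
```

Note that the root shows up whether or not you typed the final period, which is the whole point: it's always there, you just never see it.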
Every domain name registered on the internet is purchased through either a registrar, or an agent acting on behalf of a registrar. A registrar is a company that has permission to sell entries in a database held by a registry. Registries are companies given authority over the data for top level domains.
So, for example, there is a registry for .com. That registry company sells the rights to create entries in the .com database to various registrars, who in turn sell domain names to the public.
When someone buys a domain name, they give their money to a registrar, and that registrar puts the domain name in the top level domain’s registry database.
So, let’s say I buy example.com. I pay my money, and my personal details (or those of a privacy agent I pay to mask my personal details) are entered into the registry database for .com, stating that I own that domain. Along with that are some unique identifiers, an expiry date for the domain, and two or more domain names or IP addresses for authoritative nameservers.
The authoritative nameservers become very important in a minute. They are the location of the database that contains all the data for the subdomains of example.com. So, for example, it’ll contain the IP address for example.com, if I put it in there. And I have to, because, since I own example.com, I’m responsible for managing the addressing for anything under it.
The root domain servers know the IP addresses for all the registries for all the top level domains in the world.
So! When you type example.com into your web browser, or a phone app tries to access example.com, your computer starts a whole series of questions going out into the internet, trying to find out the IP address for example.com.
This is generally done using a caching recursive resolver (we’re going to skip over things like stub resolvers for the purpose of this thread). A caching recursive resolver is the closest most of you will come to interacting with DNS infrastructure. Many people use popular ones like 8.8.8.8, or 1.1.1.1. Other people are content to use the ones assigned to them and run by their ISP.
So, when you try to connect to example.com, your computer reaches out to whichever caching recursive resolver it knows about, and says, "Hey! What’s the IP address for example.com?"
The resolver will either know the answer because it has already done the research and saved ("cached") it, in which case it answers immediately with the IP address, or it’ll begin the process of recursive resolution.
In recursive resolution, the caching recursive resolver looks up the IP address for one of the root DNS servers, which is stored locally on the caching recursive resolver as part of the configuration process in what’s called a "root hints" file, and goes and asks the root server "Hey! What’s the IP address for example.com?"
The root server doesn’t know. Because that’s not its job. Its job is to know the IP addresses of the top level domain registry databases. So the root server responds, "Beats me. Go ask Bob. Bob’s responsible for .com." and gives the caching recursive resolver the IP address for the .com registry.
The caching recursive resolver goes and asks the .com registry, "Hey, what’s the IP address for example.com?" And it responds, "Who knows? Go ask Bill; he’s responsible for example.com!" and gives it the names for the authoritative nameservers I put in when I registered example.com.
Now the caching recursive resolver has to start another entire process of recursion to look up the IP addresses for the names I put in as the authoritative nameservers for example.com. Once the caching recursive resolver has those IP addresses, it goes to whichever one it wants (not really, but we’re not addressing nameserver priorities, round-robining, and such here) and says, "Everyone tells me you’re responsible for example.com. What’s the IP address for example.com?"
And, all else being equal (there are multiple further wrinkles I’m avoiding to keep this simple), your caching recursive resolver is told the IP address for example.com.
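If it helps to see that chain of "go ask someone else" laid out, here's a toy Python sketch. Nothing here touches the network: the SERVERS table, the server names in it, and the address 192.0.2.10 (from a range reserved for documentation) are all invented for illustration. It also skips the extra round of lookups needed to find the nameservers' own IP addresses, and every other wrinkle I skipped above.

```python
# Hypothetical stand-ins for real servers. Each one either knows the answer
# or knows who to refer you to next.
SERVERS = {
    "root": {"referrals": {"com": "com-registry"}, "answers": {}},
    "com-registry": {"referrals": {"example.com": "example-auth"}, "answers": {}},
    "example-auth": {"referrals": {}, "answers": {"example.com": "192.0.2.10"}},
}

MAX_HOPS = 10  # guard against referral loops


def resolve(name: str) -> tuple[str, list[str]]:
    """Follow referrals from the root until some server answers for `name`."""
    server = "root"  # a real resolver finds this via its root hints file
    asked = []
    for _ in range(MAX_HOPS):
        asked.append(server)
        zone = SERVERS[server]
        if name in zone["answers"]:
            return zone["answers"][name], asked
        # Pick the most specific referral whose domain covers the name.
        covering = [d for d in zone["referrals"]
                    if name == d or name.endswith("." + d)]
        if not covering:
            raise LookupError(f"{server} has no answer or referral for {name}")
        server = zone["referrals"][max(covering, key=len)]
    raise LookupError(f"gave up on {name} after {MAX_HOPS} referrals")


address, asked = resolve("example.com")
print(address, asked)
```

Delete any one entry from SERVERS and the lookup raises an error instead of returning an address, which is the toy version of "you can't get there from here."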
It then caches that response, which it’ll keep until the TTL (time to live) value assigned to the answer expires (it won’t, but that’s an entirely other story and has to do with a whole lot of politics, comp sci, and DNS inside baseball; I used to work for the guy who invented DNS, so I got to see the sausage get made routinely), and it gives your computer the IP address for example.com.
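Here's a minimal Python sketch of the caching idea, taking the TTL at face value (which, as I said, real resolvers don't always do). The TTLCache class and its method names are my own invention, and the clock is passed in so the example can fake the passage of time instead of sleeping.

```python
import time


class TTLCache:
    """Remember answers until their time to live runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, time at which it expires)

    def put(self, name, address, ttl_seconds):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None  # never looked up: the resolver has to recurse
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]
            return None  # expired: the resolver has to recurse again
        return address


# Fake clock so the example runs instantly.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])
cache.put("example.com", "192.0.2.10", ttl_seconds=30)

fresh = cache.get("example.com")   # asked within the TTL
now[0] = 31.0                      # pretend 31 seconds have passed
stale = cache.get("example.com")   # asked after the TTL ran out
print(fresh, stale)
```

The shorter the TTL, the more often get() comes back empty and the whole recursive process has to run again.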
That’s before your computer or phone has sent a single bit of data to example.com.
As you can see, there’s a whole lot that can go wrong just in this process alone. If any point along any of this fails, or provides incorrect information, you can’t reach example.com.
And there are myriad ways in which any of that I just described could fail. We’ll be discussing some of those further along.
One spectacular way it can fail is human stupidity. Back in 2016, a journalist named Brian Krebs was investigating some criminals pulling scams on unsuspecting businesses, selling them DDoS protection while the criminals were the ones perpetrating the DDoS attacks.
He embarrassed them pretty good. In response, the criminals executed what was then the world’s largest DDoS attack against his website.
His website was hosted by a company called Akamai. Akamai’s a big deal in the internet world. They’re responsible for a whole lot of stuff. One of the things they provide is DNS hosting – that is, providing authoritative DNS services for individuals and companies. As I described above, when I bought example.com, I said I was responsible for managing the data for example.com. I’m also responsible for making that data available to the internet. I could run my own authoritative nameserver, but common practice these days is to pay someone else.
Akamai is one of those someone elses. And tons of people pay them for exactly this. Very large companies, like Microsoft, and Amazon, and Zoom, and a bunch of other companies you’d instantly recognize.
When those criminals launched that massive DDoS, it revealed to the world how dumb most network folks are these days: they ignored IANA’s nameserver requirements (iana.org/help/nameserve…). IANA is the organization responsible for all the rules and standards regarding how the internet works. There have always been requirements for how DNS should be run. The ones we’re concerned with here are that there should always be at least two authoritative nameservers, they should not be the same, and they should not be run by the same company, on the same infrastructure, in the same geographic or logical area.
That last one totally f*cked the entire East Coast of the US, and thus, much of the world, that day. Because everyone and their dog decided they’d just make all their authoritative nameservers be Akamai servers.
So, when Akamai got attacked, none of those authoritative nameservers could be reached.
And these days, the TTL for most names is 0 or very close to it. Which means answers for name queries only get cached for a few seconds.
Which, in turn, means recursive resolvers are constantly having to recurse to get the latest IP for a name. Which means they have to be able to talk to the authoritative nameservers responsible for the names they need IPs for.
That day, no one could, because everyone ignored IANA.
And they still do. To this very day. Not a single lesson was learned. Everyone just makes both their authoritative nameservers different names within the same company, usually within the same network, in the same geographic region.
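To make that concrete, here's a deliberately naive Python sketch of the kind of sanity check I mean. The function names and the nameserver names are hypothetical, and comparing the last two labels of each name is only a crude stand-in for "run by the same company": it says nothing about whether the servers share a network or a geographic region, which you'd have to check separately.

```python
def operator_domains(nameservers: list[str]) -> set[str]:
    """Naively reduce each nameserver name to its last two labels."""
    domains = set()
    for ns in nameservers:
        labels = ns.removesuffix(".").lower().split(".")
        domains.add(".".join(labels[-2:]))
    return domains


def looks_diverse(nameservers: list[str]) -> bool:
    """True if the nameservers appear to span at least two operators."""
    return len(operator_domains(nameservers)) >= 2


# Hypothetical nameserver sets for illustration.
same_company = ["ns1.dns-host.example", "ns2.dns-host.example"]
split_up = ["ns1.dns-host.example", "ns1.other-host.example"]

print(looks_diverse(same_company))
print(looks_diverse(split_up))
```

Two different names at the same provider fail this check, which is exactly the setup I'm complaining about.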
A handful of p*ssed off script kiddies took Amazon and many other of the largest internet companies offline that day, and weren’t even trying to.
So, when I say the internet is fragile and there are tons of ways you can effectively take it down, this is but one example of what I meant.
And all I’ve covered so far is how to get an IP address for a domain name. We haven’t even gotten to the good stuff yet.
I’ve also avoided topics like DNS cache poisoning, NXDOMAIN redirection, typosquatting, domain sniping, bitsquatting, and other ways to manipulate DNS results, because I’m – you’ll chuckle at this point – trying to keep it simple.
Next, we’ll discuss a bit about how data flows over the internet, and what your computer does with that IP address. But first, I need another cup of coffee, so don’t get impatient.
I just noticed that, when I said "fully-qualified domain name", I gave an example that X decided to turn into a URL, and shorten a bit.
I actually typed in the correct thing: www[.]example[.]com (the brackets are to prevent X from doing it again), but in the text, it made it look like example.com.
So, when you read that bit, please mentally put the "www" in front of example.com for that one statement.