Name Resolution
Why do we need DNS?
Computers speak to each other in numbers. At the very lowest levels, all computers really understand are 1s and 0s. Reading binary numbers isn't the easiest for humans, so most binary numbers are represented in lots of different forms. This is especially true in the realm of networking. Remember that an IP address is really just a 32-bit binary number, but it's normally written out as 4 octets in decimal form since that's easier for humans to read. You might also remember that MAC addresses are just 48-bit binary numbers that are normally written out in 6 groupings of 2 hexadecimal digits each. While remembering 192.168.1.100 might be easier than remembering a long string of 1s and 0s, it still doesn't do a very good job when you have to remember more than just a few addresses. Imagine having to remember the four octets of an IP address for every website you visit. It's just not a thing that the human brain is normally good at.
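To make the point about representations concrete, here's a minimal sketch that turns the dotted-decimal and hexadecimal forms back into the raw bits they stand for. The helper names and the sample MAC address are made up for illustration.

```python
import ipaddress


def ip_to_binary(dotted: str) -> str:
    """Return the 32-bit binary form of a dotted-decimal IPv4 address."""
    return format(int(ipaddress.IPv4Address(dotted)), "032b")


def mac_to_binary(mac: str) -> str:
    """Return the 48-bit binary form of a colon-separated MAC address."""
    return format(int(mac.replace(":", ""), 16), "048b")


# 192.168.1.100 is the address used in the text; the MAC is an arbitrary example.
ip_bits = ip_to_binary("192.168.1.100")
mac_bits = mac_to_binary("00:1a:2b:3c:4d:5e")

print(ip_bits)        # 11000000101010000000000101100100
print(len(mac_bits))  # 48
```

Even the "friendly" forms are just shorthand for long strings of bits, which is exactly why names are easier to live with.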
Humans are much better at remembering words. That's where DNS, or the Domain Name System, comes into play.
DNS is a global and highly distributed network service that resolves strings of letters into IP addresses for you.
Let's say you wanted to check a weather website to see what the temperature is going to be like. It's much easier to type www.weather.com into a web browser than it is to remember that one of the IP addresses for this site is 184.29.131.121. The IP address for a domain name can also change all the time for a lot of different reasons. A domain name is just the term we use for something that can be resolved by DNS. In the example we just used, www.weather.com would be the domain name, and the IP it resolves to could change, depending on a variety of factors.
Let's say that weather.com was moving their web server to a new data center. Maybe they signed a new contract, or the old data center was shutting down. By using DNS, an organization can just change what IP a domain name resolves to, and the end user would never even know. So, not only does DNS make it easier for humans to remember how to get to a website, it also lets administrative changes happen behind the scenes without an end user having to change their behavior. Try to imagine a world where you'd have to remember every IP for every website you visit, while also having to memorize new ones if something changed. We'd spend our whole day memorizing numbers. The importance of DNS for how the Internet operates today can't be overstated.
Domain names might also resolve to different IP addresses depending on where in the world you are. While most Internet communications travel at close to the speed of light, the further you have to route data, the slower things will become. In almost all situations, it's going to be quicker to transmit a certain amount of data between places that are geographically close to each other. If you're a global web company, you'd want people from all over the world to have a great experience accessing your website. So instead of keeping all of your web servers in one place, you could distribute them among data centers across the globe. This way, someone in New York visiting a website might get served by a web server close to New York, while someone in New Delhi might get served by a web server closer to New Delhi. Again, DNS helps provide this functionality. Because of its global structure, DNS lets organizations decide: if you're in this region, resolve the domain name to this IP. If you're in this other region, resolve this domain to this other IP.
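Here's a rough sketch of that "this region gets this IP" decision. Everything in it is hypothetical: the domain, the region labels, and the addresses (taken from ranges reserved for documentation) are placeholders, and real DNS providers make this decision in their own ways.

```python
# Hypothetical per-region records for a single domain name.
REGIONAL_RECORDS = {
    "www.example.com": {
        "new-york": "192.0.2.10",
        "new-delhi": "198.51.100.10",
    }
}

# Region to fall back on when the client's region has no dedicated server.
DEFAULT_REGION = "new-york"


def resolve_for_region(name: str, region: str) -> str:
    """Return the IP a client in `region` would be given for `name`."""
    records = REGIONAL_RECORDS[name]
    return records.get(region, records[DEFAULT_REGION])


print(resolve_for_region("www.example.com", "new-york"))   # 192.0.2.10
print(resolve_for_region("www.example.com", "new-delhi"))  # 198.51.100.10
```

The same name gives different answers, and the person typing it never has to know.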
DNS serves lots of purposes and might be one of the most important technologies to understand, as an IT support specialist, so you can effectively troubleshoot networking issues.
Many Steps of Name Resolution
At its most basic, DNS is a system that converts domain names into IP addresses. It takes the way humans are likely to remember and categorize things and resolves it into the way computers prefer to think of things. This process of using DNS to turn a domain name into an IP address is known as name resolution.
Let's take a closer look at exactly how this works. The first thing that's important to know is that DNS servers are one of the things that need to be specifically configured at a node on a network. For a computer to operate on a modern network, it needs to have a certain number of things configured. Remember that MAC addresses are hard-coded and tied to specific pieces of hardware. But we've also covered that the IP address, subnet mask, and gateway for a host must be specifically configured. A DNS server is the fourth and final part of the standard modern network configuration. These are almost always the four things that must be configured for a host to operate on a network in an expected way. I should call out that a computer can operate just fine without DNS or without a DNS server being configured, but as we covered in the last section, this makes things difficult for any human that might be using that computer.
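As a small illustration, the four settings can be pictured as one record per host. This is only a sketch: the class name, field names, and addresses are invented for the example, and the subnet check is just one sanity test you might run on such a record.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class HostNetworkConfig:
    """The four settings a host typically needs (names are illustrative)."""
    ip_address: str
    subnet_mask: str
    gateway: str
    dns_server: str

    def gateway_is_on_local_subnet(self) -> bool:
        # Build the host's network from its IP and mask, then check the gateway.
        network = ipaddress.ip_network(
            f"{self.ip_address}/{self.subnet_mask}", strict=False
        )
        return ipaddress.ip_address(self.gateway) in network


config = HostNetworkConfig(
    ip_address="192.168.1.100",
    subnet_mask="255.255.255.0",
    gateway="192.168.1.1",
    dns_server="192.168.1.1",  # assumed: the local router also answers DNS
)
print(config.gateway_is_on_local_subnet())  # True
```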
There are five primary types of DNS servers:
- Caching name servers
- Recursive name servers
- Root name servers
- TLD name servers
- Authoritative name servers
As we dive deeper into these, it's important to note that any given DNS server can fulfill many of these roles at once. Caching and recursive name servers are generally provided by an ISP or your local network. Their purpose is to store domain name lookups for a certain amount of time. As you'll see in a moment, there are lots of steps in order to perform a fully qualified resolution of a domain name. In order to prevent this from happening every single time a new TCP connection is established, your ISP or local network will generally have a caching name server available. Most caching name servers are also recursive name servers. Recursive name servers are ones that perform full DNS resolution requests. In most cases, your local name server will perform the duties of both, but it's definitely possible for a name server to be either just caching or just recursive.
Let's introduce an example to better explain how this works. You and your friend are both connected to the same network, and you both want to check out facebook.com. Your friend enters www.facebook.com into a web browser, which means that their computer now needs to know the IP of www.facebook.com in order to establish a connection. Both of your computers are on the same network, which usually means that they've both been configured with the same name server. So your friend's computer asks the name server for the IP of www.facebook.com, which it doesn't know. This name server now performs a fully recursive resolution to discover the correct IP for www.facebook.com. This involves a bunch of steps we'll cover in just a moment. This IP is then both delivered to your friend's computer and stored locally in a cache.
A few minutes later, you enter www.facebook.com into a web browser. Again, your computer needs to know the IP for this domain, so your computer asks the local name server it's been configured with, which is the same one your friend's computer was just talking to. Since the domain name www.facebook.com had just been looked up, the local name server still has the IP that it resolved to stored and is able to deliver that back to your computer without having to perform a full lookup. This is how name servers act as caching servers.
All domain names in the global DNS system have a TTL, or time to live.
This is a value in seconds that can be configured by the owner of a domain name for how long a name server is allowed to cache an entry before it should discard it and perform a full resolution again.
Several years ago, it was normal for these TTLs to be really long, sometimes a full day or more. This is because the general bandwidth available on the Internet was just much less, so network administrators didn't want to waste what bandwidth was available to them by constantly performing full DNS lookups. As the Internet has grown and gotten faster, these TTLs for most domains have dropped to anywhere from a few minutes to a few hours. But it's important to know that sometimes you'll still run into domain names with very lengthy TTLs. This means that it can take up to the length of a full TTL for a change in a DNS record to be known to the entire Internet.
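The caching and TTL behavior described above can be sketched in a few lines. This is a toy model, not how any particular name server is built: the class name, the `resolve_fully` callback, and the injectable clock are all assumptions made so the example can run without a network.

```python
import time


class CachingNameServer:
    """Toy cache: answers from memory until an entry's TTL runs out."""

    def __init__(self, resolve_fully, clock=time.monotonic):
        # resolve_fully is a stand-in for a full lookup: name -> (ip, ttl_seconds)
        self._resolve_fully = resolve_fully
        self._clock = clock
        self._cache = {}  # name -> (ip, expires_at)
        self.full_lookups = 0

    def lookup(self, name: str) -> str:
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]  # still fresh: no full resolution needed
        ip, ttl = self._resolve_fully(name)
        self.full_lookups += 1
        self._cache[name] = (ip, now + ttl)
        return ip


# A fake clock lets us "wait" without sleeping.
fake_now = [0.0]
server = CachingNameServer(
    resolve_fully=lambda name: ("192.0.2.1", 300),  # made-up IP, 5-minute TTL
    clock=lambda: fake_now[0],
)

server.lookup("www.example.com")  # your friend's request: full lookup
server.lookup("www.example.com")  # your request: served from the cache
print(server.full_lookups)        # 1

fake_now[0] = 301                 # the TTL has now expired
server.lookup("www.example.com")
print(server.full_lookups)        # 2
```

Notice that a change to the record during those first 300 seconds would go unseen by anyone using this cache, which is the TTL trade-off in miniature.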
Now, let's look at what happens when your local recursive server needs to perform a full recursive resolution. The first step is always to contact a root name server. There are 13 total root name servers, and they're responsible for directing queries toward the appropriate TLD name server. In the past, these 13 root servers were distributed to very specific geographic regions, but today, they're mostly distributed across the globe via anycast.
Anycast is a technique that's used to route traffic to different destinations depending on factors like location, congestion, or link health.
Using anycast, a computer can send a datagram to a specific IP but could see it routed to one of many different actual destinations depending on a few factors. This should also make it clear that there aren't really only 13 physical root name servers anymore. It's better to think of them as 13 authorities that provide root name lookups as a service. The root servers will respond to a DNS lookup with the TLD name server that should be queried. TLD stands for top-level domain and represents the top of the hierarchical DNS name resolution system.
A TLD is the last part of any domain name. Using www.facebook.com as an example again, the .com portion should be thought of as the TLD. We'll go into more details about the different components of a domain name in an upcoming lesson. For each TLD in existence, there is a TLD name server, but just like with root servers, this doesn't mean there's only physically one server in question. It's most likely a global distribution of anycast-accessible servers responsible for each TLD. The TLD name servers will respond again with a redirect, this time informing the computer performing the name lookup which authoritative name server to contact. Authoritative name servers are responsible for the last two parts of any domain name, which is the resolution at which a single organization may be responsible for DNS lookups. Using www.weather.com as an example, the TLD name server would point a lookup at the authoritative server for weather.com, which would likely be controlled by the Weather Channel, the organization itself that runs the site. Finally, the DNS lookup would be directed at the authoritative server for weather.com, which would finally provide the actual IP of the server in question.
This strict hierarchy is very important to the stability of the Internet. Making sure that all full DNS resolutions go through a strictly regulated and controlled series of lookups to get the correct responses is the best way to protect against malicious parties redirecting traffic. Your computer will blindly send traffic to whatever IP it's told to. So by using a hierarchical system controlled by trusted entities in the way DNS does, we can better ensure that the responses to DNS lookups are accurate. Now that you see how many steps are involved, it should make sense why we trust our local name servers to cache DNS lookups. It's so that the full lookup path doesn't have to happen for every single TCP connection. In fact, your local device, from a phone to a desktop, will generally have its own temporary DNS cache as well. That way, it doesn't have to bother its local name server for every TCP connection either.
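The root, then TLD, then authoritative chain can be walked through with a tiny in-memory model. To be clear, nothing here talks to real servers: the server labels and the final IP address are invented, and real referrals carry name server records rather than simple strings.

```python
# Each "server" is just a dictionary in this sketch.
ROOT = {"com": "tld-server-for-com"}
TLD_SERVERS = {"tld-server-for-com": {"weather.com": "auth-server-for-weather"}}
AUTHORITATIVE = {"auth-server-for-weather": {"www.weather.com": "192.0.2.80"}}


def resolve_recursively(name: str):
    """Follow root -> TLD -> authoritative and return (ip, steps taken)."""
    name = name.lower().rstrip(".")
    labels = name.split(".")
    tld = labels[-1]                      # e.g. "com"
    last_two = ".".join(labels[-2:])      # e.g. "weather.com"
    steps = []

    tld_server = ROOT[tld]
    steps.append(f"root says: ask {tld_server}")

    auth_server = TLD_SERVERS[tld_server][last_two]
    steps.append(f"TLD says: ask {auth_server}")

    ip = AUTHORITATIVE[auth_server][name]
    steps.append(f"authoritative says: {name} is {ip}")
    return ip, steps


ip, steps = resolve_recursively("www.weather.com")
for step in steps:
    print(step)
```

Each level only knows enough to point at the next one, and only the last one actually knows the answer.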
DNS and UDP
DNS is a great example of an application layer service that uses UDP for the transport layer instead of TCP. This can be broken down into a few simple reasons. Remember that the biggest difference between TCP and UDP is that UDP is connectionless. This means there is no setup or teardown of a connection, so much less traffic needs to be transmitted overall. A single DNS request and its response can usually fit inside of a single UDP datagram, making it an ideal candidate for a connectionless protocol. It's also worth calling out that DNS can generate a lot of traffic. It's true that caches of DNS entries are stored both on local machines and caching name servers, but it's also true that if the full resolution needs to be processed, we're talking about a lot more traffic. Let's see what it would look like for a full DNS lookup to take place via TCP.
First, the host that's making the DNS resolution request would send a SYN packet to the local name server on port 53, which is the port that DNS listens on. This name server would then need to respond with a SYN-ACK packet, which means the original host would have to respond with an ACK in order to complete the three-way handshake. That's three packets. Now that the connection has been established, the original host would have to send the actual request: "I'd like the IP address for food.com." When it receives this request, the name server would have to respond with another ACK: "I got your request for food.com." We're up to five packets sent now. In our scenario, the first caching name server doesn't have anything cached for food.com, so it needs to talk to a root name server to find out who's responsible for the .com TLD. This would require a three-way handshake, the actual request, the ACK of the request, the response, and then the ACK of the response. Finally, the connection would have to be closed via a four-way handshake. That's 11 more packets, or 16 total. Now that the recursive name server has the correct TLD name server, it needs to repeat that entire process to discover the proper authoritative name server. That's 11 more packets, bringing us up to 27 so far. Finally, the recursive name server would have to repeat the entire process one more time while talking to the authoritative name server in order to actually get the IP of food.com. This is 11 more packets for a running total of 38.
Now that the local name server finally has the IP address of food.com, it can finally respond to the initial request. It sends a response to the DNS resolver that originally made the request, and then this computer sends an ACK back to confirm that it received the response. That's two more packets, putting us at 40. Finally, the TCP connection needs to be closed via a four-way handshake. This brings us to a grand total of 44 packets at the minimum in order for a fully recursive DNS request to be fulfilled via TCP.
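If you'd like to double-check that running total, the arithmetic is easy to write down. The constant and function names below are just labels for the steps in the walk-through, which assumes the minimum case with no lost or retransmitted packets.

```python
HANDSHAKE = 3         # SYN, SYN-ACK, ACK
REQUEST_AND_ACK = 2   # the DNS request plus the ACK for it
RESPONSE_AND_ACK = 2  # the DNS response plus the ACK for it
TEARDOWN = 4          # four-way handshake to close the connection


def tcp_exchange_packets() -> int:
    """Packets for one complete request/response over its own TCP connection."""
    return HANDSHAKE + REQUEST_AND_ACK + RESPONSE_AND_ACK + TEARDOWN


def tcp_full_lookup_packets(upstream_servers: int = 3) -> int:
    """Host <-> local name server, plus one exchange per upstream server
    (root, TLD, and authoritative in the walk-through)."""
    return tcp_exchange_packets() * (1 + upstream_servers)


print(tcp_exchange_packets())     # 11
print(tcp_full_lookup_packets())  # 44
```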
44 packets isn't really a huge number in terms of how fast modern networks operate. But it adds up fast as you can see. Remember that DNS traffic is just a precursor to actual traffic. A computer almost always performs a DNS lookup because it needs to know the IP of the domain name in order to send additional data, not just because it's curious.
Now, let's check out how this would look with UDP.
Spoiler alert, it doesn't take as many packets. The original computer sends a UDP packet to its local name server on port 53 asking for the IP for food.com. That's one packet. The local name server acts as a recursive server and sends a UDP packet to the root server, which sends a response containing the proper TLD name server. That's three packets. The recursive name server sends a packet to the TLD server and receives back a response containing the correct authoritative server. We're now at five packets. Next, the recursive name server sends its final request to the authoritative name server, which sends a response containing the IP for food.com. That's seven packets. Finally, the local name server responds to the DNS resolver that made the request in the first place with the IP for food.com. That brings us to a grand total of eight packets.
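Here's the same kind of back-of-the-envelope count for UDP, set next to the 44-packet TCP minimum from earlier. The names are only labels for this sketch, and it assumes no request or response gets lost along the way.

```python
TCP_MINIMUM_FROM_WALKTHROUGH = 44  # the TCP total worked out earlier


def udp_full_lookup_packets(upstream_servers: int = 3) -> int:
    """One request and one response per exchange, with no setup or teardown.

    Exchanges: host <-> local name server, plus one per upstream server
    (root, TLD, and authoritative).
    """
    packets_per_exchange = 2
    return packets_per_exchange * (1 + upstream_servers)


udp_total = udp_full_lookup_packets()
print(udp_total)                                 # 8
print(TCP_MINIMUM_FROM_WALKTHROUGH / udp_total)  # 5.5
```

So in this scenario, TCP needs five and a half times as many packets to deliver the same answer.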
See, way fewer packets. You can see now how much overhead TCP really requires. And for something as simple as DNS, it's just not needed. It's the perfect example of why protocols like UDP exist in addition to the more robust TCP. You might be wondering how error recovery plays into this, since UDP doesn't have any. The answer is pretty simple. The DNS resolver just asks again if it doesn't get a response. Basically, the same functionality that TCP provides at the transport layer is provided by DNS at the application layer in the simplest manner possible. A DNS server never needs to care about doing anything but responding to incoming lookups, and a DNS resolver simply needs to perform lookups and repeat them if they don't succeed. It's a real showcase of the simplicity of both DNS and UDP. I should call out that DNS over TCP does in fact exist and is also in use all over. As the Web has gotten more complex, it's no longer the case that all DNS lookup responses can fit in a single UDP datagram. In these situations, a DNS name server would respond with a packet explaining that the response is too large. The DNS client would then establish a TCP connection in order to perform the lookup.
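To tie these ideas together, here's a hedged sketch of a resolver's side of things: building a small query, asking again when nothing comes back, and switching to TCP when the reply is flagged as too large. The header layout and the truncation flag follow the standard DNS message format, but `send_udp` and `send_tcp` are hypothetical callables standing in for real sockets, and the retry count is an arbitrary choice.

```python
import struct

RD_FLAG = 0x0100  # "recursion desired" bit in the header flags
TC_FLAG = 0x0200  # "truncated" bit: the response didn't fit in the datagram


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS query for the A record of `name`."""
    # Header: ID, flags, then question/answer/authority/additional counts.
    header = struct.pack("!HHHHHH", query_id, RD_FLAG, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending in a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # type A, class IN
    return header + question


def is_truncated(response: bytes) -> bool:
    """Check the truncation flag in a response header."""
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & TC_FLAG)


def lookup(name, send_udp, send_tcp, attempts=3):
    """send_udp returns response bytes, or None if nothing came back.
    send_tcp is assumed to handle TCP's own framing and return the response."""
    query = build_query(name)
    for _ in range(attempts):
        response = send_udp(query)
        if response is None:
            continue  # no answer: just ask again
        if is_truncated(response):
            return send_tcp(query)  # too big for UDP: retry over TCP
        return response
    raise TimeoutError(f"no response for {name} after {attempts} attempts")


print(len(build_query("food.com")))  # 26
```

A 26-byte query is a long way from filling a datagram, which is why UDP is such a comfortable fit for the common case.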