Anatomy of the Internet: DNS

This makes it possible for bad actors to send spam under someone else’s name and to forge messages from banks (they look genuine, and the links in them lead to a site with a login form that looks like the real one but is not).

To rein in all this freedom, mechanisms have appeared in DNS that verify that the actual sender of a letter is really related to the domain specified in the sender field.

Letters that fail this check will not necessarily be dropped outright, but any sane e-mail provider will not hesitate to send them to the Spam folder.

Unfortunately, the most these checks can guarantee is that the computer from which the letter came is really authorized to send letters from the domain specified in the letter. In particular, they do not protect against domain name spoofing in which some letters are replaced with similar-looking non-Latin ones.
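To illustrate the last point, here is a small Python sketch that lists the non-ASCII characters in a domain name and shows the ASCII (Punycode) form that is actually looked up in DNS. The function names and the sample domain are made up for this example, and real mail software applies far more elaborate rules than "contains a non-ASCII letter".

```python
import unicodedata


def non_ascii_letters(domain: str) -> list[tuple[str, str]]:
    """Return (character, Unicode name) for every non-ASCII character."""
    return [(ch, unicodedata.name(ch, "UNKNOWN"))
            for ch in domain if not ch.isascii()]


def dns_form(domain: str) -> bytes:
    """Return the ASCII form of the name using Python's built-in idna codec."""
    return domain.encode("idna")


# Hypothetical look-alike: the second letter is CYRILLIC SMALL LETTER A.
lookalike = "p\u0430ypal.example"
print(non_ascii_letters(lookalike))
print(dns_form(lookalike))
```

On screen the look-alike is hard to tell from the all-Latin name, but its DNS form starts with xn--, and SPF, DKIM and DMARC will happily validate mail for it, because the attacker really does own that other domain.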

Reverse lookup

A simple and naive check: the receiving mail server does a reverse lookup on the IP address from which the sending server connected to it. The host name returned by the reverse lookup is then compared with the domain specified in the sender field of the letter.

This method has its drawbacks: it is not flexible enough. In particular, it does not work well when the sending server is an intermediary. For example, the servers of mail providers such as gmail.com or mail.ru can serve not only users with addresses on the provider’s own domain, but also entire domains that belong to their customers.
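The check described above can be sketched in Python roughly as follows. This is only an illustration: the function names are invented here, the PTR lookup itself is left out (only the name to query is built), and the comparison rule "equal to or a subdomain of the sender's domain" is an assumption, since real servers differ in how strictly they compare.

```python
import ipaddress


def reverse_lookup_name(ip: str) -> str:
    """Build the name whose PTR record a reverse lookup of `ip` asks for."""
    return ipaddress.ip_address(ip).reverse_pointer


def sender_matches(ptr_hostname: str, sender_address: str) -> bool:
    """Naive check: the PTR host name is the sender's domain or a subdomain."""
    sender_domain = sender_address.rsplit("@", 1)[-1].lower().rstrip(".")
    host = ptr_hostname.lower().rstrip(".")
    return host == sender_domain or host.endswith("." + sender_domain)


print(reverse_lookup_name("192.0.2.25"))
print(sender_matches("mail.example.com.", "alice@example.com"))
print(sender_matches("out.provider.example.", "alice@example.com"))
```

The last call shows the inflexibility: a letter relayed through a provider's server fails the check even when the provider legitimately handles mail for that domain.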

SPF

SPF allows the recipient to verify that mail on behalf of a domain came from an IP address that has the right to send letters on behalf of this domain. Thus, the domain owner takes some responsibility for these letters, which increases the recipient’s confidence in them.

An SPF record is added to the domain and lists the IP addresses of the computers from which mail for this domain may come.

Technically, this is a TXT record in a specific format:

$ dig habr.com txt | grep spf
habr.com.                  244   IN  TXT   "v=spf1 include:spf.habramail.net ~all"

$ dig spf.habramail.net txt | grep spf
spf.habramail.net.         5735  IN  TXT   "v=spf1 include:_mxservers.habramail.net ~all"

$ dig _mxservers.habramail.net txt | grep spf
_mxservers.habramail.net.  3600  IN  TXT   "v=spf1 ip4:95.47.173.0/27 ip6:2001:678:5e0::/64 ip6:2001:678:5e0:1::/64 ip4:95.211.146.161 ip4:82.192.95.169 ip4:176.112.221.59 ip4:82.192.95.171 ip6:2001:1AF8:4010:A087:22::146:161 ip6:2001:1af8:4010:a087:22::169 ip6:2001:1af8:4010:a087:22:0:95:171 -all"

Note that an SPF record can refer to records in other domains (the include: mechanism in the output above), which allows policies to be managed centrally across administratively related domains.
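As a rough illustration of what the recipient does with such a record, here is a Python sketch that checks a sender's IP address against the literal ip4: and ip6: entries only. It is deliberately incomplete: include:, a, mx, qualifiers and the final "all" term are ignored, and the record is a shortened copy of the one shown above. The function name is made up for this example.

```python
import ipaddress

# Shortened copy of the _mxservers.habramail.net record shown above.
RECORD = ("v=spf1 ip4:95.47.173.0/27 ip6:2001:678:5e0::/64 "
          "ip4:95.211.146.161 -all")


def spf_ip_match(record: str, sender_ip: str) -> bool:
    """Return True if sender_ip falls into any ip4:/ip6: entry of the record."""
    ip = ipaddress.ip_address(sender_ip)
    for term in record.split()[1:]:          # skip the "v=spf1" version tag
        kind, _, value = term.partition(":")
        if kind.lower() not in ("ip4", "ip6"):
            continue                         # include:, all, etc. are ignored
        network = ipaddress.ip_network(value, strict=False)
        if ip.version == network.version and ip in network:
            return True
    return False


print(spf_ip_match(RECORD, "95.47.173.5"))
print(spf_ip_match(RECORD, "203.0.113.7"))
```

A real implementation would also follow include: references recursively, which is exactly how the three-step dig session above arrives at the final list of addresses.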

DKIM

DKIM is described in RFC 6376.

DKIM adds a digital signature to emails. The public key is published in the domain zone, the signature itself is added to the email header.

The signature can cover different parts of the letter – from a subset of the header to the entire body of the letter. This is controlled in the sender settings and displayed in the headers of signed messages.

Thus, DKIM not only binds the sender address in the email to the actual domain, but also confirms to some extent that the email was not altered along the way.
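Without going into cryptographic detail, here is a small Python sketch of how a recipient can read the DKIM-Signature header to learn which domain signed the letter, which header fields are covered, and where in DNS the public key lives (a TXT record at selector._domainkey.domain). The header value, the selector name and the function names are invented for this example, and no signature is actually verified.

```python
def parse_dkim_signature(header_value: str) -> dict[str, str]:
    """Split a DKIM-Signature header value into its tag=value pairs."""
    tags = {}
    for part in header_value.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            # Whitespace inside values is only line folding, so drop it.
            tags[name.strip()] = "".join(value.split())
    return tags


def dkim_key_name(tags: dict[str, str]) -> str:
    """DNS name of the TXT record that holds the signer's public key."""
    return f"{tags['s']}._domainkey.{tags['d']}"


# Hypothetical header value; bh= and b= are placeholders, not real data.
header = ("v=1; a=rsa-sha256; d=example.com; s=mail2024; "
          "h=from:to:subject:date; bh=PLACEHOLDER=; b=PLACEHOLDER==")
tags = parse_dkim_signature(header)
print(dkim_key_name(tags))
print(tags["h"].split(":"))
```

The h= tag is where the "which parts are signed" setting becomes visible to the recipient.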

DKIM is not a substitute for a PGP signature, because there is no guarantee that the entire email is signed, and the recipient’s user interface usually says little about how DKIM was applied. In addition, the signing is done not by the sender himself but by his outgoing mail server (otherwise the sender would have to be given access to the domain’s private key).

The use of DKIM has the side effect that it can be difficult for the sender to later claim that they did not send such a letter (to the point that a signed letter can be presented as evidence in court).

I will not give low-level technical details so as not to overload the article.

DMARC

DMARC is described in RFC 7489.

DMARC nicely complements SPF and DKIM by allowing a domain owner to tell recipients of emails coming from that domain how to validate them and what to do with emails that fail validation.

As usual, DMARC records are specially formatted TXT records that are published in the zone of the sending domain.

In general, the point of DMARC is to separate method (SPF/DKIM) from policy.
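To make the "policy" part concrete, here is a Python sketch that parses a DMARC record and picks the action requested for failing mail. The record is published as a TXT record at _dmarc.domain, and the p= tag takes the values none, quarantine or reject. The sample record, the report address and the function names are made up for this example.

```python
def dmarc_query_name(domain: str) -> str:
    """Name of the TXT record that carries the DMARC policy for a domain."""
    return "_dmarc." + domain.rstrip(".")


def parse_dmarc(record: str) -> dict[str, str]:
    """Split a DMARC record into its tag=value pairs."""
    tags = {}
    for part in record.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            tags[name.strip().lower()] = value.strip()
    return tags


def requested_action(tags: dict[str, str]) -> str:
    """What the domain owner asks recipients to do with failing mail."""
    return tags.get("p", "none")


# Hypothetical record for illustration only.
record = "v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@example.com"
print(dmarc_query_name("example.com"))
print(requested_action(parse_dmarc(record)))
```

The SPF and DKIM checks produce pass or fail; the DMARC record only tells the recipient what the domain owner would like done with the result.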

Public servers

Google invites everyone to use its servers at 8.8.8.8 and 8.8.4.4 as public DNS servers. Several other companies offer similar services. For example, Cloudflare’s public servers are at 1.1.1.1 and 1.0.0.1.

Public servers act as recursive resolvers, but they are owned not by your ISP but by some organization independent of it.

Using public servers can be convenient, but we should not forget that the statistics of DNS queries with a large sample is very valuable information, and if the source of the request can be personalized, the value of this information increases many times (knowing which addresses you resolve, you can understand which sites you visit and how often).

The author of these lines had access to a daily sample of DNS queries from a major ISP in a major Russian city. I learned a lot about my fellow citizens 🙂

DoT and DoH: DNS over TLS/HTTPS

DoT is described in RFC 7858.
DoH is described in RFC 8484.

These two protocols are quite similar. In both cases, a specific server is specified in the client settings, which acts as a recursive resolver.

DoT opens a TLS session and exchanges requests / responses with the server, much like DNS over TCP does, and DoH wraps them into HTTP requests.
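The difference in framing can be shown with a short Python sketch. It builds an ordinary DNS query in wire format and then prepares it in two ways: with the two-byte length prefix used by DNS over TCP and DoT, and as the base64url "dns" parameter of a DoH GET request. Nothing is sent over the network; the server URL is a placeholder and the function names are made up for this example.

```python
import base64
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0) -> bytes:
    """Build a minimal DNS query (one question, class IN, recursion desired)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").encode("ascii").split(b".")
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def dot_frame(query: bytes) -> bytes:
    """DoT (like DNS over TCP) prefixes each message with its 16-bit length."""
    return struct.pack("!H", len(query)) + query


def doh_get_url(server_url: str, query: bytes) -> str:
    """DoH GET puts the message into ?dns= as unpadded base64url."""
    encoded = base64.urlsafe_b64encode(query).rstrip(b"=").decode("ascii")
    return f"{server_url}?dns={encoded}"


query = build_query("www.example.com")
print(dot_frame(query).hex())
# "https://dns.example/dns-query" is a placeholder, not a real service.
print(doh_get_url("https://dns.example/dns-query", query))
```

In both cases the DNS message itself is unchanged; only the wrapping around it differs.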

In fact, this is an analogue of a public DNS server, but TLS or HTTPS is used as a transport.

DoT/DoH protects your DNS requests from snooping by your ISP and neighbors and protects the responses from spoofing, but it sends all of them to a single DoT/DoH server (which can itself spy on requests and spoof answers; the question is, whom do you trust more?).

It usually comes built into the web browser, and more and more often it is configured and enabled by default.

If the browser is called Chrome, guess who will own the default DNS server? 🙂 (the situation is similar with other browsers).

Other Uses of DNS

Server load balancing

If you have a lot of interchangeable servers, you can set up your DNS server in such a way that different addresses are given to different clients (with some system or by chance). This will distribute the load across the servers more or less evenly.
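One simple "system" is round-robin rotation: the server returns the same set of addresses every time, but rotates the list by one position per query, so successive clients try a different server first. Below is a minimal Python sketch of that idea. The class name is invented and the addresses come from a documentation range; real DNS servers implement this internally and with more nuance.

```python
from itertools import count


class RoundRobinAnswers:
    """Hand out the same address set, rotated by one position per query."""

    def __init__(self, addresses):
        self._addresses = list(addresses)
        self._queries = count()

    def answer(self) -> list[str]:
        shift = next(self._queries) % len(self._addresses)
        return self._addresses[shift:] + self._addresses[:shift]


pool = RoundRobinAnswers(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
for _ in range(4):
    print(pool.answer())
```

Since most clients connect to the first address in the answer, the load spreads across the servers; caching in intermediate resolvers makes the spread only approximately even, which matches the "more or less" above.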

Another use case for this approach is to distribute clients geographically. The name google.com resolves to different addresses in different places, and most likely the server it points to is somewhere nearby. Thus, Google serves clients from local servers and does not force them to go halfway around the world for a search answer.

Finding neighbors in the cloud

If your service consists of a large number of microservices distributed across the cloud, the question inevitably arises of how the parts will find each other.

DNS gives a good answer to this question: parts can find each other by name, which is fixed, depending on the purpose of the part. This is much better than fixing addresses:

  • addresses can be dynamic

  • addresses are managed by the owner of the cloud, and names are managed by you

The cloud infrastructure typically provides access to a dedicated DNS server where services can dynamically publish their presence, and clients that depend on them can find them using normal DNS queries.

Multicast DNS for Local Area Networks (mDNS)

mDNS is described in RFC 6762.

Apple equipment was famous for the fact that you could bring it into one room, connect it to a network, and, without any extra settings, everything would immediately start working. Computers see each other and the printers connected to the network. No need to configure addresses or a name server; everything works by itself. Apple had its own networking stack, AppleTalk, and this ability was built into its protocols.

However, TCP/IP networks won because they were adapted to work not only on the local network, but also on the big Internet.

Now happiness is gradually coming to us, ordinary users of TCP / IP.

To search for devices on the network, good old DNS came in handy, only not in its usual form but in a multicast one.

To find out the address of a device, the corresponding DNS queries are not sent to a server but multicast to the local network. The multicast group address is 224.0.0.251 (ff02::fb for IPv6), and the port number is 5353, to avoid confusion with regular DNS.

Here’s how it works:

$ ./mcdig KM7B6A91.local. A

; QUESTION PSEUDOSECTION:
;KM7B6A91.local.   IN   A

;; ANSWER SECTION:
KM7B6A91.local.    120   IN   A     192.168.1.102

KM7B6A91 is the name of my printer, which it came up with by itself, and 192.168.1.102 is its IP address, which it received via DHCP.
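A query like the one above can be approximated in a few lines of Python. This is only a sketch of a one-shot query under simplifying assumptions: IPv4 only, a single question, the name is split on dots (so instance names that contain dots are not handled), and the raw response is returned without parsing. The function names are made up, and this is not a description of how mcdig works internally.

```python
import socket
import struct

MDNS_GROUP = "224.0.0.251"
MDNS_PORT = 5353


def build_mdns_query(name: str, qtype: int = 1) -> bytes:
    """Build a one-question query; mDNS queries use ID 0 and no flags."""
    header = struct.pack("!HHHHHH", 0, 0, 1, 0, 0, 0)
    labels = [part.encode("utf-8") for part in name.rstrip(".").split(".")]
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def send_mdns_query(name: str, timeout: float = 2.0):
    """Send the query to the multicast group; return raw reply bytes or None."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_mdns_query(name), (MDNS_GROUP, MDNS_PORT))
        try:
            data, _sender = sock.recvfrom(9000)
        except socket.timeout:
            return None
        return data


print(build_mdns_query("KM7B6A91.local.").hex())
```

Calling send_mdns_query("KM7B6A91.local.") on a network where such a device exists should return its answer; on any other network it simply times out and returns None.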

By the way, if there was no DHCP, all devices on the local network would still receive addresses. These are the so-called link-local addresses; a special range is allocated for them. This works especially well in IPv6.

Apple calls all of this together Rendezvous and Bonjour, and in the vendor-neutral world it goes by ZeroConf, mDNS and DNS-SD. This technology has been fortunate enough to have many names.

For names that are found in this way, the top-level domain .local is reserved. But in practice, devices with ordinary names also respond to mDNS queries on a LAN.

Unlike classical DNS, multicast DNS does not impose strict restrictions on label syntax. A label may contain spaces, punctuation marks and so on: anything that is allowed in UTF-8. That’s why "Kyocera ECOSYS M2040dn._printer._tcp.local." is a perfectly normal domain name:

$ ./mcdig "Kyocera ECOSYS M2040dn._printer._tcp.local." SRV

;; QUESTION PSEUDOSECTION:
;Kyocera\ ECOSYS\ M2040dn._printer._tcp.local.	IN	 SRV

;; ANSWER SECTION:
Kyocera\ ECOSYS\ M2040dn._printer._tcp.local. 4500	IN  SRV	0 0 515 KM7B6A91.local.

In real life, everything is somewhat more complicated than in this simple description. For example, when a device joins a network, it must find out (by talking to neighbors) that the name is not taken by anyone. And in case of a conflict, come up with something else (and remember it, this is the requirement of the RFC).

In addition, multicasts can be lost, and sending them like a machine gun is somewhat impolite. Therefore, mDNS works better if the system has a local daemon that constantly monitors neighbors, and the “search” in the multicast DNS is to look in the daemon’s preheated cache.

By the way, responses to queries in mDNS are also recommended to be sent by multicasts – so that passive listeners can warm up their cache with useful information flying by.

mDNS is a trust based system. If someone is lying on the local network (or making a mistake in good faith), there is no way to catch him for this activity.

Another problem arises when a device is connected to several unrelated local networks (for example, WiFi and wired Ethernet): with low probability, the same name may occur in several networks but refer to different devices. Strictly speaking, once a device has been found via mDNS, a connection to it should be specified by both the address and the network interface through which the connection is to be established. In real life, however, this has not been carried through, and it works more or less correctly only with IPv6 link-local addresses (they allow a “zone” to be specified in the address, which is in fact a reference to the network interface).
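In text form, the zone is written after a percent sign, as in fe80::1%eth0. The Python sketch below splits such an address into its parts and reports whether the address is link-local, that is, whether it is meaningful only together with an interface. The function name and the sample addresses are made up for this example.

```python
import ipaddress


def split_zone(address: str):
    """Split 'addr%zone' into (IP address object, zone name or None)."""
    host, _, zone = address.partition("%")
    return ipaddress.ip_address(host), (zone or None)


for text in ("fe80::1%eth0", "fe80::1", "192.168.1.102"):
    ip, zone = split_zone(text)
    print(text, "link-local:", ip.is_link_local, "zone:", zone)
```

A link-local address without a zone, like the second one, is exactly the ambiguous case: the same fe80::1 may exist on both the WiFi and the Ethernet side.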

DNS-SD: automatic search for printers and more

DNS-SD is described in RFC 6763.

mDNS is good, but how is a normal person supposed to know that his printer is called KM7B6A91.local.? A normal person would like to get a list of devices with human-readable names and choose from it.

And the program, before connecting to a device, would like to know what this device is (for example, a printer, scanner, etc.) and what, in general, it can do.

The protocol that solves these problems is called DNS-SD, and is built on top of mDNS.

DNS-SD stands for DNS Service Discovery, and strictly speaking, it looks not for devices but for “services”. But if the service is a print service, then what we find is precisely a printer (or a print server that announces itself through DNS-SD).

This technology works just fine in a small network. You bring the printer home, connect it to the network, and it immediately appears in the menus of programs that can print, under a clear name. Cell phones also understand this technology, which lets you print documents directly from your phone.

In a large corporate network, this does not work so well. If there are fifty printers on the list and you have to wade through it every time, you start wanting to turn all this automation off for good.

DNS-SD works as follows. Let’s say we need a printer that uses the IPP protocol. First, we will search for all printers on the network:

$ ./mcdig _ipp._tcp.local. PTR
;; QUESTION PSEUDOSECTION:
;_ipp._tcp.local.   IN   PTR

;; ANSWER SECTION:
_ipp._tcp.local.   4500   IN   PTR   Kyocera\ ECOSYS\ M2040dn._ipp._tcp.local.

Here the PTR record is used in an unusual (compared to normal DNS) way: the name being queried is not a reversed IP address but a service type, and the answer is the name of a service instance rather than a host name. Moreover, the very first label of that name is in a human-readable format. In RFC 6763 terminology, this is called the “Service Instance Name”; it’s not quite the same as the host name, but it is also unique within the network, although you can’t tell by the look.

Now we want to know a little more about this device so that we can work with it.

The SRV record request brings us the host name and TCP port:

$ ./mcdig "Kyocera ECOSYS M2040dn._ipp._tcp.local." SRV
;; QUESTION PSEUDOSECTION:
;Kyocera\ ECOSYS\ M2040dn._ipp._tcp.local.  IN  SRV

;; ANSWER SECTION:
Kyocera\ ECOSYS\ M2040dn._ipp._tcp.local. 4500 IN  SRV  0 0 631 KM7B6A91.local.

The TXT record contains a lot of useful information about the printer:

$ ./mcdig "Kyocera ECOSYS M2040dn._ipp._tcp.local." TXT
;; ANSWER SECTION:
Kyocera\ ECOSYS\ M2040dn._ipp._tcp.local.  4500  CLASS32769	TXT	"txtvers=1" "pdl=image/pwg-raster,application/octet-stream,application/pdf,image/tiff,image/jpeg,image/urf,application/postscript,application/vnd.hp-PCL,application/vnd.hp-PCLXL,application/vnd.xpsdocument" "product=(ECOSYS M2040dn)" "ty=Kyocera ECOSYS M2040dn" "qtotal=1" "usb_MFG=Kyocera" "usb_MDL=Kyocera ECOSYS M2040dn (KPDL)" "note=" "adminurl=https://KM7B6A91.local/airprint" "Duplex=T" "Fax=F" "Scan=T" "Color=F" "UUID=4509a320-00a0-008f-00b6-002507510eca" "URF=CP255,DM4,IFU0,IS19-20,OB1-10,PQ4,RS600,V1.4,W8" "PaperMax=legal-A4" "kind=document,envelope" "priority=48" "rp=ipp/print" "print_wfds=T" "mopria-certified=1.2" "air=none" "TLS=1.2"
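The three steps (PTR, then SRV, then TXT, plus the address lookup shown earlier) can be tied together in a short Python sketch. To keep it runnable without a network, the records are copied from the outputs above into an in-memory table, with the TXT record shortened; the lookup function is a hypothetical stand-in for real mDNS queries, and all function names are made up for this example.

```python
# Records copied from the outputs in this article (TXT shortened).
RECORDS = {
    ("_ipp._tcp.local.", "PTR"): ["Kyocera ECOSYS M2040dn._ipp._tcp.local."],
    ("Kyocera ECOSYS M2040dn._ipp._tcp.local.", "SRV"):
        ["0 0 631 KM7B6A91.local."],
    ("Kyocera ECOSYS M2040dn._ipp._tcp.local.", "TXT"):
        [["txtvers=1", "ty=Kyocera ECOSYS M2040dn", "rp=ipp/print",
          "Duplex=T", "Color=F", "note="]],
    ("KM7B6A91.local.", "A"): ["192.168.1.102"],
}


def lookup(name: str, rtype: str) -> list:
    """Stand-in for a real mDNS query: returns the record data, if any."""
    return RECORDS.get((name, rtype), [])


def discover(service_type: str) -> list[dict]:
    """Walk PTR -> SRV/TXT -> A and collect what a client needs to connect."""
    found = []
    for instance in lookup(service_type, "PTR"):
        # SRV data is "priority weight port target".
        _priority, _weight, port, host = lookup(instance, "SRV")[0].split()
        # Each TXT string is "key=value"; an empty value is allowed.
        txt = dict(item.partition("=")[::2]
                   for item in lookup(instance, "TXT")[0])
        found.append({
            "name": instance.removesuffix("." + service_type),
            "host": host,
            "port": int(port),
            "addresses": lookup(host, "A"),
            "txt": txt,
        })
    return found


for service in discover("_ipp._tcp.local."):
    print(service["name"], service["host"], service["port"],
          service["addresses"], service["txt"]["rp"])
```

The human-readable name goes into the menu, the host, port and "rp" path are enough to build the printer's IPP URL, and the remaining TXT keys tell the program what the device can do.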

Finding Local Devices (USB)

The combination of DNS-SD for finding devices with the vendors’ drift towards standard rather than proprietary protocols has made possible such a thing as “driverless” printing and scanning of documents. If a device supports this technology, no vendor drivers are needed: the standard driver included in the operating system works with any printer that supports IPP and any scanner that supports eSCL.

However, USB devices that do not have a network connection were not covered.

To remedy this, the USB standardization organization invented the IPP over USB protocol, which would be more accurately called HTTP over USB, because that’s exactly what it is. By pushing HTTP traffic through USB, this protocol has made driverless printing and scanning possible over USB, and, amazingly, even the printer’s web console now works over USB as well.

This construct turns a USB device into a pseudo-network device reachable on localhost, and uses DNS-SD to advertise its existence there.

Here it should be noted that no one sends multicasts through the 127.0.0.1 interface; they are not supported there. Communication goes through a local mDNS daemon (Avahi in the case of Linux). The IPP over USB daemon announces devices via Avahi and clients find them there, but all this happens within the same machine, and Avahi does not talk to itself in multicasts.

This architecture keeps the IPP over USB daemon and the print server isolated from each other: DNS-SD inside the machine serves as a signaling bus, and all communication goes through network sockets.

Conclusion

As Kozma Prutkov said, “no one will embrace the immensity.” But it was worth trying 🙂

If I missed something important, welcome to the comments!
