#+TITLE: Dynamic host configuration, please
#+DATE: 2023-03-03
* Prologue
The minimal viable product for an OpenBSD laptop has the following
features:
1. It has a real time clock (RTC).
2. It runs Emacs.
3. It can suspend *and* resume.
4. It has working Wi-Fi.
With those things available we can start to improve the user
experience.

A smart phone is basically always online in urban areas and even in
rural areas[fn:: My phone automatically connected to the Wi-Fi at Elk
Lakes Cabin. Never mind that we had to drag the satellite dish over
the pass.]. Nearly seven years ago at a hackathon in Cambridge, UK, we
set out to have a similar experience for our laptops. We will look at
how OpenBSD configures Wi-Fi networks, deals with network
auto-configuration for IPv4 and IPv6, and handles DNS resolution. We
will show how it does this in a reasonably secure way with minimal
manual configuration.
* Join the Wi-Fi.
The reader might recognize this conversation when arriving at a new
location and taking out their phone:
#+begin_quote
Me: Hey, what's the Wi-Fi password?

Them: We are in the middle of nowhere, there is no Wi-Fi.

Me: All lower-case, one word?
#+end_quote
On the phone, we need to select the Wi-Fi and enter the password only
once. The phone then remembers it indefinitely and auto-connects to
it whenever the Wi-Fi is in range.

On OpenBSD, network interfaces are configured by [[https://man.openbsd.org/ifconfig.8][ifconfig(8)]], or
persistently in [[https://man.openbsd.org/hostname.if.5][/etc/hostname.IF]][fn::IF denotes a specific network
interface. For example for iwm0 the file is =/etc/hostname.iwm0=],
which is read by [[https://man.openbsd.org/netstart.8][netstart(8)]] during boot. netstart(8) calls ifconfig(8)
internally to handle the network configuration.

For a long time, we could only configure one SSID:
#+begin_src shell
$ cat /etc/hostname.iwm0
nwid home wpakey "trivial password"
inet autoconf
inet6 autoconf
up
#+end_src

This configures a Wi-Fi network named "home" with the password
"trivial password". IPv4 and IPv6 auto-configuration are enabled.
Whenever the network is in range the kernel automatically connects to
it.

That is not a good user experience (UX). We typically take our laptops
with us and connect to different Wi-Fi networks, like our phones. We
have a Wi-Fi at home, at work, there are open Wi-Fis at hotels, and so
on.

People came up with all kinds of weird shell scripts that would run in
the background or be triggered by [[https://man.openbsd.org/cron.8][cron(8)]] to notice when the laptop
moved to a different Wi-Fi. The script would then call ifconfig(8) to
reconfigure Wi-Fi from a list of networks it knew about. This was all
incredibly fragile and not the OpenBSD way.

Peter Hessler (phessler@), with the help of Stefan Sperling (stsp@),
went ahead and tackled this problem: What if we could pass multiple
=(name, password)= tuples to the kernel and the kernel would choose
the right one?

#+begin_src shell
$ cat /etc/hostname.iwm0
join home wpakey "trivial password"
join work wpakey zUDciIezevfySqam
join "Airport Wi-Fi"
join ""
inet autoconf
inet6 autoconf
up
#+end_src
=join= implements exactly this. The argument to =join= is the name of
the network and the following =wpakey= is the password for that
network. If we leave out the =wpakey=, the Wi-Fi is open and does not
require a password. Using =join= with the empty string (~join ""~)
means the kernel will try to connect to any open Wi-Fi if no Wi-Fi
from the join list is found first.

We still need to configure the name and password by editing a file
in =/etc/= and running netstart(8) when we encounter a new Wi-Fi. This
is probably not the best UI[fn::As far as I am concerned ed(1) is the
pinnacle of UI design, but YMMV.] but the UX is pretty good and on par
with a smart phone. Once the Wi-Fi is configured by adding a =join=
line, the kernel will automatically re-connect to a known Wi-Fi
whenever it comes into range.
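
Adding a network can be scripted. The following is a minimal sketch
and not part of OpenBSD: =add_join= is a hypothetical helper that puts
a new =join= line at the top of a hostname.if(5)-style file. It
assumes that neither the network name nor the password contains a
double quote.
#+begin_src shell
# add_join FILE SSID [WPAKEY]
# Hypothetical helper: put a new join line at the top of a
# hostname.if(5)-style file and keep all existing lines.
add_join() {
    file=$1
    ssid=$2
    key=${3-}
    tmp=$(mktemp) || return 1
    if [ -n "$key" ]; then
        printf 'join "%s" wpakey "%s"\n' "$ssid" "$key"
    else
        printf 'join "%s"\n' "$ssid"
    fi > "$tmp"
    # Append the old contents, then write back in place so that the
    # owner and mode of the original file are preserved.
    cat "$file" >> "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}
#+end_src
On the laptop this would be run as root, for example as
~add_join /etc/hostname.iwm0 "Hotel Wi-Fi" "the password"~, followed
by ~sh /etc/netstart iwm0~ to apply the change.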
* Stop slacking.
Now that we are connected to the Wi-Fi, we need to configure IP
addresses.

We started our efforts to improve the network configuration user
experience with IPv6 for two reasons. Even in this day and age
IPv6 is a technology for early adopters[fn::Which is quite sad.], and
they are used to pain. When we break IPv4, people tend to complain.
With IPv6 they are eager to help debug the problem.

The other reason was that OpenBSD got IPv6 support from the KAME
project in the late 1990s and early 2000s, and not a lot of work was
done on it afterwards. The network configuration was handled mostly in
the kernel, so there was no isolation from malicious input. For the
most part it assumed a stationary workstation that tried to acquire an
IPv6 prefix for stateless address auto-configuration during boot by
sending three router solicitations. It then listened for router
advertisements to create auto-configuration addresses and renewed
their lifetimes when a new advertisement flew by. There was some
rudimentary code in rtsold(8) to handle movement between networks, but
nobody was using it because it was optional. rtsold(8) was used in
one-shot mode, where it would send at most three router solicitations
when an interface connected to the network and then exit.

We started to write [[https://man.openbsd.org/slaacd.8][slaacd(8)]][fn:name_things:I should not be allowed
to name things.] and once that was working we could delete rtsold(8)
and remove a lot of code from the kernel.

slaacd(8) is a privilege-separated network daemon that builds on
previous experience with privilege separation in OpenBSD. It uses
three processes: the /parent/ process to configure the system, the
/frontend/ process to talk to the outside world, and the /engine/
process to handle untrusted data and run a state machine for the
stateless address auto-configuration protocol.

pledge(2) restricts what a process is allowed to do and this is
enforced by the kernel. Enforcement means that the kernel will
terminate processes that violate what they pledged they would do. The
pledges themselves are in broad strokes: we do not concern ourselves
with single system calls but with groups of system calls. For example,
the process is allowed to interact with open file descriptors
(="stdio"=), it is allowed to open connections to hosts on the
Internet (="inet"=), or it is allowed to open files for reading
(="rpath"=).

The /parent/ process pledges that it will only open new network
sockets, send those to other processes and reconfigure the routing
table (="stdio inet sendfd wroute"=). The /frontend/ process pledges
to only receive file descriptors, open Unix domain sockets and check
the state of the routing table (="stdio unix recvfd route"=). Checking
the routing table includes seeing which flags are configured per
interface. The /engine/ process pledges to only read and write to
already open file descriptors (="stdio"=). The /engine/ process is
very restricted in what it is allowed to do. This is important because
it handles untrusted data coming from the network. While the
/frontend/ process talks to the network, it never looks at the data.
An attacker will not be able to confuse the /frontend/ process with
data they send. They can and did [[https://ftp.openbsd.org/pub/OpenBSD/patches/7.0/common/014_slaacd.patch.sig][confuse]] the /engine/ process.

For more details see [[file:privsep.org]["Privilege drop, privilege separation, and
restricted-service operating mode in OpenBSD"]].

slaacd(8) is enabled by default on all OpenBSD installations.

IPv6 stateless address auto-configuration is enabled on an interface
by setting the =AUTOCONF6= flag using [[https://man.openbsd.org/ifconfig.8][ifconfig(8)]]: =ifconfig iwm0 inet6
autoconf=. The kernel announces this changed interface flag to the
whole system using a broadcast route message. slaacd(8) reads those
messages using a [[https://man.openbsd.org/route.4][route(4)]] socket.

slaacd(8) handles all aspects of stateless address
auto-configuration. It sends router solicitations when needed, either
multi-cast or uni-cast, depending on which is appropriate. It waits
for router advertisements, parses them, configures default routes as
well as global and temporary IPv6 addresses, and passes name server
information via a route message to the rest of the system. It keeps
track of the lifetimes of addresses, default routes, and name server
information, and removes them from the system when they expire because
no router advertisement arrived to extend their lifetime.

slaacd(8) also monitors when network interfaces regain their
connection to a network, for example because the laptop woke up from
suspend or because it was moved out of range of a Wi-Fi network and
back into range. It then needs to find out if it connected to the same
network as before or if it is now in a new network. If it is a new
network, it needs to replace the old addresses, default route, and
name servers. If there is no IPv6 available, it needs to remove the
old information.

The stateless address auto-configuration specification allows multiple
default routers to be present on the same layer two network,
announcing the same or different network information. slaacd(8) tries
to handle this, but this has not been extensively tested in all
possible cases. There are still open questions being discussed at the
IETF on how to run networks with different network prefixes in the
same layer two network. Hic sunt dracones...

slaacd(8) does handle multiple interfaces just fine and we will show
later how we pick the right source address when multiple are available
to choose from.

* Dynamic host configuration, please.
With IPv6 address configuration mostly solved, it was time to look at
IPv4 again. We used a fork of ISC's dhclient(8). Henning Brauer
(henning@) added privilege separation to it and in recent years
Kenneth Westerback (krw@) heroically maintained it. It was showing its
age though. The privilege separation was never quite right. This
became more visible with the integration of pledge(2), and it would
have been difficult to integrate some of the features we developed in
slaacd(8).

It was time to write a new daemon. Otto Moerbeek (otto@) solved the
most pressing problem by suggesting a name for it: dhcpleased(8). We
try to be polite towards the computer. It is pronounced "dynamic host
configuration, please". The "d" is silent.

On a very high level IPv4 DHCP and IPv6 stateless address
auto-configuration are very similar. We request some information from
the router[fn::In IPv6 we might not need to request the information,
it might just show up unannounced.], we use it to configure the system
and we make sure that information does not expire. When we move
networks we need to probe if our information is still up to date and
if not, reconfigure the system.

The obvious solution is to copy =sbin/slaacd= to =sbin/dhcpleased= and
replace the IPv6 specific bits with IPv4 specific bits. And that is
exactly what we did.

On paper DHCP looks more complicated than IPv6 stateless address
auto-configuration because it negotiates with the server and there is
a complicated state machine to implement.

In practice it is the other way around. The "stateless" part in IPv6
does not apply to the client. The client must keep state and implement
a state machine to keep track of which routers are available and when
various information expires. In IPv4 we talk to one server and all
information expires at the same time.

We will talk about a few differences between slaacd(8) and
dhcpleased(8) in a moment, but from the user perspective both behave
the same. They make sure that the address configuration and default
gateway are always up to date and they pay attention when the machine
moves between networks, either while awake or while sleeping.

Because dhcpleased(8) has to use [[https://man.openbsd.org/bpf.4][bpf(4)]] instead of regular sockets for
some of the network packets it needs to send, the /parent/ process
cannot use pledge(2). There is nothing it could pledge that would
allow the usage of bpf(4) at the moment. To protect the system and
prevent exfiltration of sensitive data we use [[https://man.openbsd.org/unveil.2][unveil(2)]] to restrict
the /parent/ process' view of the file system. dhcpleased(8) can only
read its configuration file, read and write =/dev/bpf=, and read,
write and create files underneath =/var/db/dhcpleased/= to store
information about received leases.

While we could get away with not implementing a config file for
slaacd(8), we were not this lucky with dhcpleased(8). Some systems out
there will only give us a DHCP lease if we send the correct /client
id/, for example.
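
As an illustration, a client identifier could be configured like
this. It is only a sketch: the interface name and the identifier are
made up, and the exact syntax should be checked against
dhcpleased.conf(5).
#+begin_src shell
$ cat /etc/dhcpleased.conf
# hypothetical example values
interface iwm0 {
	send client id "01:00:11:22:33:44:55"
}
#+end_src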

There are a lot of DHCP options specified in RFC 2132. We only
implement the bare minimum, only the options we need and can
handle. We do not need a swap server or a cookie server to get the
quote of the day.

Like slaacd(8), dhcpleased(8) is enabled on all OpenBSD
installations.
* Route priorities.
dhcpleased(8) and slaacd(8) can handle multiple interfaces at the same
time. The routing table might look like this:
#+begin_src shell
$ netstat -nrf inet
Routing tables

Internet:
Destination        Gateway            Flags   Refs      Use   Mtu  Prio Iface
default            192.168.1.1        UGS        4      110     -     8 em0
default            192.168.178.1      UGS        0        0     -    12 iwm0
[...]
#+end_src
We end up with two default routes, one gateway is reachable via the
/em0/ interface with priority value 8 and the other gateway is
reachable via the /iwm0/ interface with priority value 12. A route has
higher priority when its priority value is lower. /em0/ is an Ethernet
interface and it gets higher priority over the Wi-Fi interface
/iwm0/. All things being equal, the kernel will pick the address from
/em0/ as source address when making a new connection to the internet
and route traffic over the Ethernet interface, which is presumably
faster.
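
The "lowest priority value wins" rule can be sketched in a few lines
of shell. This is only an illustration and not what the kernel does
internally: =best_default= is a made-up helper that reads
~netstat -nr~ output and assumes the column layout shown above, with
the priority value in column seven and the interface in column
eight. It does not check whether a route is actually usable.
#+begin_src shell
# best_default: read a routing table on stdin and print the interface
# of the default route with the lowest priority value.
best_default() {
    awk '
    $1 == "default" && (iface == "" || $7 + 0 < best) {
        best = $7 + 0
        iface = $8
    }
    END { if (iface != "") print iface }'
}
#+end_src
It would be used as ~netstat -nrf inet | best_default~.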

If we pick up the laptop and unplug the Ethernet interface, all things
are no longer equal: the route over /em0/ is no longer usable and
existing connections using it will stall and time out. New connections
will instead use /iwm0/.

If we plug /em0/ back in again, sessions might come alive again and
new connections will use /em0/. Connections that are running over
/iwm0/ will continue working, because the interface is still connected
to the Wi-Fi.

Applications like web browsers, email clients or even video
conferencing systems will automatically establish a new connection
when they notice the old one is dead.

Unfortunately [[https://man.openbsd.org/ssh.1][ssh(1)]] is not one of them. If switching between wired
and wireless happens rarely, [[https://man.openbsd.org/tmux.1][tmux(1)]] on the remote system might help
with ssh(1) disconnects. Or maybe a [[https://man.openbsd.org/wg.4][wg(4)]] tunnel can be used so that
the source address does not change when switching between wired and
wireless.
* Cellular networks.
In addition to Ethernet and Wi-Fi networks, OpenBSD supports "Mobile
Broadband Interface Model" devices using the [[https://man.openbsd.org/umb.4][umb(4)]] driver. These can
be used to connect to UMTS or LTE networks. They require a SIM card
and after being configured using a PIN they will connect to cellular
networks and automatically configure an IP address and default
route. The default route has an even lower route priority than Wi-Fi
so it will only be used when Ethernet and Wi-Fi are not connected.
* It is always DNS.[fn::In my line of work that is certainly true, but that is just sample bias.]
We need to talk about DNS next. Humans are not particularly good at
remembering =2606:2800:220:1:248:1893:25c8:1946=, we are much better
with names like /example.com/. When we run ~ping6 example.com~ we
sooner or later end up in [[https://man.openbsd.org/asr_run.3][libc's stub resolver]]. It will open
=/etc/resolv.conf=, and look for /nameserver/ lines to use for DNS
resolution.
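
To see what the stub resolver has to work with, we can list the
/nameserver/ lines ourselves. This is a minimal sketch: =nameservers=
is a made-up helper, and the real resolver understands more of
resolv.conf(5) than this.
#+begin_src shell
# nameservers [FILE]: print the address of every nameserver line in
# file order. FILE defaults to /etc/resolv.conf.
nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}
#+end_src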

We can learn name servers from dhcpleased(8), slaacd(8), umb(4),
and [[https://man.openbsd.org/iked.8][iked(8)]]. Historically dhclient(8) owned =/etc/resolv.conf=, which
means that no other process could add name servers to it. dhclient(8)
would just overwrite whatever was in there whenever it renewed its
lease. This sometimes made it impossible to move to an IPv6-only
network: slaacd(8) could not configure name servers and the left-over
IPv4 name servers were not reachable.

We can either teach all name server sources to cooperate, to not
scribble over each other, and to share responsibility for
=/etc/resolv.conf=, or we can run an arbitrator that collects name
servers from diverse sources and handles the contents of
=/etc/resolv.conf=.

[[https://man.openbsd.org/resolvd.8][resolvd(8)]] is such an arbitrator. It is another always-enabled
daemon. It collects name servers from all the mentioned sources and
adds them to =/etc/resolv.conf=.

It also monitors if =/etc/resolv.conf= gets edited, in which case it
re-reads the file and makes sure that the learned name servers are at
the beginning of the file. This is useful when the administrator of
the machine decides to add options to =/etc/resolv.conf=. For example,
we can edit the file and add =family inet6 inet= to prefer IPv6 over
IPv4 and resolvd(8) will cope. There is no need for an extra
configuration file, =/etc/resolv.conf= is the configuration file.

Name servers are announced using route messages and resolvd(8) listens
for them using a route(4) socket. They can also be observed using the
[[https://man.openbsd.org/route.8][route(8)]] tool: ~$ route monitor~.

resolvd(8) can also request that name servers are re-announced by their
sources. This is useful when resolvd(8) gets restarted.
* Let us unwind[fn:: See [fn:name_things].] a bit.
Good old plain DNS is not a secure protocol. It exchanges
un-authenticated UDP packets without any integrity protection. This
makes it easy for an attacker to spoof answer packets.

DNS answer packets are untrusted data, they come from the
network. However, the process that sends DNS queries and parses the
answer using the libc functions is almost always the single main
process of the tool. When we run ~ping example.com~, DNS packets are
parsed by a process running as our user. An attacker who can spoof a
DNS answer might be able to trigger a bug in libc and gain code
execution that way.

On OpenBSD ping(8) pledges ="stdio dns"=, so the attacker will not get
very far, but there are many more programs in ports that are not
pledged that might want to resolve names.

It would be worthwhile to have some sort of proxy running on localhost
so that DNS packets from the outside need to traverse a well locked
down process running in a different address space and as a different
user than the program that needs to resolve a name.

An early experiment was rebound(8), written by Ted Unangst (tedu@). It
was simplistic and did not understand DNS at all, it would just
forward packets, but it would sit between the Internet and the
program.

An alternative is to run a full recursive resolver like [[https://man.openbsd.org/unbound.8][unbound(8)]] on
the laptop, but this leads to problems, too. unbound(8) expects a well
working network where nobody interferes with DNS. This is true in data
centres and can be achieved in well maintained home networks, but it
is not something we find when moving laptops to arbitrary networks
like free Wi-Fi in a hotel or airport.

We can either give up and move to a different hotel[fn::Which is not
realistic.], or we need to adjust our expectations, figure out what we
have and work with that.

It turns out that often the quality of the network changes over
time. When we first connect to a hotel Wi-Fi we may find ourselves in
what is referred to as a /captive portal/. Everything is blocked, DNS
gets intercepted, and we are redirected to a web site where we need to
agree to the terms and conditions. Maybe we also have to provide our
name and room number. Once we are past that, network quality improves
considerably and we are mostly free to talk to the outside world.

This is where [[https://man.openbsd.org/unwind.8][unwind(8)]] comes in. It is another privilege-separated
network daemon that provides a recursive name server for the local
machine. resolvd(8) detects when it is running and automatically
rewrites =/etc/resolv.conf= to have only =nameserver 127.0.0.1= listed
as name server.

With that we have solved the first problem, or at least improved the
situation. Programs that need DNS resolution are insulated from
the Internet. An attacker needs to get past unwind(8) first before
they can try to attack the libc stub resolver.

unwind(8) understands and speaks DNS and it actively observes the
network quality.

We did not write our own recursive name server. That would be
difficult, it would be unlikely we would get it right on the first
try[fn:: Or second or third try for that matter.], and DNS is
constantly evolving, so it is a lot of effort to keep up. Instead we
are standing on the shoulders of giants and use libunbound, which is
part of [[https://man.openbsd.org/unbound.8][unbound(8)]]. It is developed under a BSD license by [[https://www.nlnetlabs.nl/][NLnet Labs]].

The resolver process pledges ="stdio inet dns rpath"= and
restricts access to the file system using unveil(2) to
=/etc/ssl/cert.pem=. This is the process that is exposed to the
Internet and handles untrusted data. It would be preferable to have
one process exposed to the Internet and another to parse untrusted
data but that is not possible to do with libunbound.

Since we are using a real recursive name server, that gives us a lot
of options on how we can resolve names:
+ We can do our own recursion, walk down from the root zone using
  qname minimization to improve privacy.
+ We can use the name server we learned from dhcpleased(8) and
  slaacd(8) as forwarders, so we do not need to do our own recursion,
  which might be faster.
+ We can try to opportunistically speak DNS over TLS (DoT) to the
  learned name servers to prevent eavesdroppers from listening in.
+ We can configure forwarders manually to not depend on the network
  provided name servers. Those might be more trustworthy. They can
  also be DoT forwarders to prevent eavesdropping.
+ As a last resort, unwind(8) can behave exactly like the libc stub
  resolver[fn::I call this the Dutch train problem. The free Wi-Fi on
  Dutch trains does not like DNS queries with an /EDNS0/ option. It
  intercepts them, does not understand them, and answers /NXDOMAIN/.
  There are other free Wi-Fi networks that are similarly broken.].
We call these resolving strategies and unwind(8) actively probes if
they are usable by sending test queries when it notices that the
network changed, for example because we moved to a different Wi-Fi
network or woke up from suspend. It then orders them by quality and
picks the best one.

There is an implicit skew in the strategies for finding the best one:
A manually configured DoT name server is always considered better than
a name server provided by the local network, as long as it is
available and not atrociously slow.

unwind(8) is not too concerned about preserving privacy. It is
pragmatic and tries to resolve names the best way it can. If that
means using the local name servers provided by the network because
they are the only ones available, it will use them.

Since unwind(8) uses libunbound it also supports DNSSEC. DNSSEC
provides data integrity and cryptographic authenticity, it does not
provide confidentiality.

unwind(8) is pragmatic about DNSSEC. When it tests the quality of a
resolving strategy it also tries to find out if DNSSEC is
available. There are many reasons why DNSSEC might not be available:
the network is misconfigured, DNSSEC is flat out blocked, or the
laptop does not (yet) have the correct time. If DNSSEC does not work,
unwind(8) does not insist on using it.

Of course this makes it susceptible to a downgrade attack. To mitigate
this, unwind(8) will insist on DNSSEC working once it has discovered
that DNSSEC works in the local network. This means that an
attacker needs to be able to block DNSSEC from the moment we connect
to a network. They cannot show up later and try to downgrade
us. unwind(8) will only become lenient again when we connect to a new
network.
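
The policy described above fits in a few lines of shell. This is a
toy model for illustration only and not code from unwind(8); all
names in it are made up.
#+begin_src shell
dnssec_seen=0   # 1 once DNSSEC was seen working on the current network

# Called when we connect to a new network: become lenient again.
on_network_change() {
    dnssec_seen=0
}

# Called with the result of a DNSSEC probe, "ok" or "fail".
# A failed probe never resets the flag, only a network change does.
on_probe() {
    if [ "$1" = "ok" ]; then
        dnssec_seen=1
    fi
}

# Called with the validation result of an answer, "valid" or "bogus".
# Prints "accept" or "reject".
decide() {
    if [ "$1" = "valid" ] || [ "$dnssec_seen" -eq 0 ]; then
        echo accept
    else
        echo reject
    fi
}
#+end_src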

This is not a strong mitigation of course, but DNSSEC is not a silver
bullet that fixes everything at the resolver. Applications also need
to do their part and decide how much they are willing to trust
DNS. For example ssh(1)'s /VerifyHostKeyDNS/ feature will only trust
host key fingerprints it obtained from DNS if they were validated
using DNSSEC and the validator runs on the local
machine[fn::Technically not entirely true, ssh(1) trusts what libc
indicates and libc automatically trusts localhost. See /trust-ad/ in
[[https://man.openbsd.org/resolv.conf.5][resolv.conf(5)]].]. Otherwise it will ask the user what to do.

A worst case scenario when joining a somewhat broken Wi-Fi network
with captive portal and a manually configured DoT name server might
look like this:
1. We connect to the network, we cannot reach the DoT name server and
   cannot do our own recursion.
2. unwind(8) will choose the name server provided by the
   network. It also notes that we just connected to a new network so
   it is lenient with respect to DNSSEC validation. In effect it will
   ignore validation errors.
3. We try to access a web site and the captive portal
   detection in the browser triggers. We click the buttons and fill in
   the forms until we are allowed on the internet.
4. unwind(8) notices that it can do its own recursion.
5. At the same time, unwind(8) notices that the DoT name server is
   also reachable now and starts using it.

unwind(8) does not natively support DNS over HTTPS (DoH) and we
sometimes find ourselves in networks that block everything except for
TCP port 443. One way around this is to use dnscrypt-proxy from ports
which does support DoH. We can point unwind(8) at it by manually
configuring a plain DNS forwarder in addition to a DoT forwarder:
#+begin_src shell
$ cat /etc/unwind.conf
forwarder "9.9.9.9" port 853 authentication name "dns.quad9.net" DoT
forwarder "2620:fe::9" port 853 authentication name "dns.quad9.net" DoT
forwarder "127.0.0.1" port 5353 # dnscrypt-proxy for DoH
#+end_src
* Time for gelato.[fn:: Again, see [fn:name_things].]
People from the future might encounter networks without any IPv4. If
they are not too far in the future they might still need to talk to
IPv4 hosts on the Internet.

There are various transition technologies that get us from an
IPv4-only Internet to an IPv6-only Internet. We will only look at
/NAT64/, /DNS64/, and /464XLAT/.

/NAT64/ allows us to reach IPv4 hosts from an IPv6 only network by
pretending that the hosts are IPv6 enabled. IPv6 addresses are so big
that we can easily encode all of IPv4 in an IPv6 /64 prefix, which is
the usual size of an IPv6 prefix we see per layer two network. In fact
we don't need the whole /64; a /96 is enough to encode the whole IPv4
Internet.

Let us pretend we know the /96 prefix used for /NAT64/ and the IPv4
address we want to reach. Forming an IPv6 address for the host is then
simply a bitwise-or operation of the IPv4 address with the /96 prefix:
the IPv4 address fills in the lower 32 bits that the prefix leaves
free. This is called address synthesis.

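The bitwise-or can be tried out in a few lines of shell. This is only
a sketch: =synth_nat64= is a made-up helper name, the prefix is assumed
to be a /96 written in compressed form ending in =::= (like the
well-known prefix =64:ff9b::/96=), and =192.0.2.33= is a documentation
address standing in for a real host.
#+begin_src shell
# Hypothetical helper: embed an IPv4 address in the low 32 bits of a
# /96 prefix. $1 is the prefix without its length, e.g. "64:ff9b::",
# $2 is the dotted-quad IPv4 address.
synth_nat64() {
	prefix=$1
	oldifs=$IFS
	IFS=.
	set -- $2	# split the dotted quad into $1..$4
	IFS=$oldifs
	# Two octets form one 16 bit group of the IPv6 address.
	printf '%s%x:%x\n' "$prefix" $(( ($1 << 8) | $2 )) $(( ($3 << 8) | $4 ))
}

synth_nat64 64:ff9b:: 192.0.2.33	# prints 64:ff9b::c000:221
#+end_src
On a network whose /NAT64/ gateway uses that prefix, the printed
address could be handed to any IPv6 capable program in place of the
IPv4 address.
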
We can then use this address to connect to the IPv4-only
host. Somewhere on the network path is the /NAT64/ gateway that is
dual stacked. It knows that our packets are using /NAT64/ because it
is configured with the /96 prefix. It intercepts the packets and forms
IPv4 packets and sends them on their way. The gateway needs to be
stateful to be able to /NAT/ the return traffic back to us.

To find out the IPv4 address we want to connect to we of course use
DNS. The local name servers that slaacd(8) learned about would know
about the /NAT64/ prefix used in the network and do the address
synthesis for us. This is called /DNS64/. The problem with this is that
the name servers spoof DNS answers, something that DNSSEC tries very
hard to prevent. unwind(8) will detect this and generate an error, or
unwind(8) might not even talk to the designated name servers at all.

To get around this unwind(8) can itself detect the presence of /DNS64/
on a network by asking the local name servers for the /AAAA/ record,
i.e. the IPv6 address, for something that is guaranteed to never have
one: /ipv4only.arpa/. If it gets an answer, it can reverse the address
synthesis and learn the /NAT64/ prefix. With that information it can
do /DNS64/ itself and there is no longer a problem with DNSSEC.

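Reversing the synthesis can be sketched in shell as well. RFC 7050
gives /ipv4only.arpa/ the two well-known IPv4 addresses =192.0.0.170=
and =192.0.0.171=, so a synthesized answer ends in =c000:aa= or
=c000:ab=. The helper name =nat64_prefix= is made up, and the sketch
assumes a /96 prefix and a lowercase, compressed textual address. It
illustrates the idea and is not how unwind(8) implements it.
#+begin_src shell
# Hypothetical helper: given one AAAA answer for ipv4only.arpa, print
# the NAT64 prefix. Fails if the address does not look synthesized.
nat64_prefix() {
	case $1 in
	*:c000:aa|*:c000:ab)	# 192.0.0.170 or 192.0.0.171 in the low 32 bits
		p=${1%:*:*}	# strip the embedded IPv4 address
		case $p in
		*:)	printf '%s:/96\n' "$p" ;;	# "64:ff9b:" -> "64:ff9b::/96"
		*)	printf '%s::/96\n' "$p" ;;	# no zero compression in the prefix
		esac ;;
	*)
		return 1 ;;	# not a synthesized answer, so no DNS64 here
	esac
}

# On a live system the argument would come from something like:
#   dig +short AAAA ipv4only.arpa
nat64_prefix 64:ff9b::c000:aa	# prints 64:ff9b::/96
#+end_src
A network without /DNS64/ returns no /AAAA/ record at all, which is
the case the failing branch stands in for.
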
The downsides of this mechanism are that it is quite complicated, it
messes around with DNS, and it does not work with IPv4 address
literals. It also does not work with programs that are fundamentally
IPv4 only: =ping example.com= will never work in an IPv6 only network
with only /NAT64/ and /DNS64/.

Instead of pretending the IPv4 host we want to reach has IPv6, we can
pretend to have working IPv4 if a /NAT64/ gateway is present. We ask
the kernel via the [[https://man.openbsd.org/pf.4][pf(4)]] firewall to do the IPv4 to IPv6 translation
for us. The /NAT64/ gateway will then do the reverse translation and
send an IPv4 packet on its way. This is called /464XLAT/.

We first need an IPv4 address. RFC 7335 reserved =192.0.0.0/29= for
this purpose:
#+begin_src shell
ifconfig pair1 inet 192.0.0.4/29
#+end_src
We then need a default gateway:
#+begin_src shell
ifconfig pair2 rdomain 1
ifconfig pair2 inet 192.0.0.1/29
#+end_src
Because pf(4) will only do address family translation on inbound rules,
we need a different /rdomain/ and use [[https://man.openbsd.org/pair.4][pair(4)]] interfaces. We need to
connect them:
#+begin_src shell
ifconfig pair1 patch pair2
#+end_src
And then we can configure our default route:
#+begin_src shell
route add -host -inet default 192.0.0.1 -priority 48
#+end_src
We set it to a very low priority[fn:: Remember, a high priority
*value* means low priority.] so that it does not interfere with routes
dhcpleased(8) configures when we move to an IPv4 enabled network.

We then need to configure address family translation in pf(4) when we
detect /NAT64/ being present. This is where [[https://github.com/fobser/gelatod/][gelatod(8)]] comes in. It is
a Customer-side transLATor (/CLAT/) configuration daemon[fn::If you
squint just right, gelato kinda sounds like clat[fn::Again, I really
really should be prohibited from naming things.].]. /CLAT/ is what
/464XLAT/ calls the address translation happening on the laptop.

gelatod(8) is yet another privilege separated daemon[fn::At this point
you should believe me that that is a good thing and I will not go into
pledge details.] that checks for the presence of a /NAT64/
gateway whenever we change networks. It does so either via the
/ipv4only.arpa/ trick or explicitly via router advertisements. RFC
8781 specifies how a network can signal the presence of a /NAT64/
gateway.

gelatod(8) needs a pf(4) anchor into which it adds rules that are
similar to this example:
#+begin_src
pass in log quick on pair2 inet af-to inet6 \
from 2001:db8::da68:f613:4573:4ed0 to 64:ff9b::/96 \
rtable 0
#+end_src
The rule is doing address family translation to IPv6 on incoming
packets on =pair2=. In this example it uses
=2001:db8::da68:f613:4573:4ed0= as the IPv6 source address; gelatod(8)
learned this from the system when slaacd(8) configured
it. =64:ff9b::/96= is the learned /NAT64/ prefix and we are moving
traffic back to =rtable 0=. Remember =pair2= is in rdomain 1[fn::Do
not ask me about the difference between an rdomain and an rtable, I do
not know either.].

While this is all cute and works rather well, it is also completely
horribly complicated to set up. And that is why gelatod(8) is not in
OpenBSD base but lives in ports. We believe in good defaults in
OpenBSD and try to keep the buttons a user has to push to get
something working to an absolute minimum.
* Future work.
Which brings us to future work.

We want the functionality of gelatod(8) in OpenBSD base. gelatod(8)
was mostly a proof of concept. We imagine that a new network device
like clat(4) takes over the role of client side address family
translation. It could be always present and gelatod(8) just enables
and disables it. At that point we can move the functionality into
slaacd(8) and delete gelatod(8). /CLAT/ is defined as a stateless
mechanism so it does not need the full pf(4) machinery for address
family translation.

It would be nice to have DNS over HTTPS (DoH) and DNS over QUIC (DoQ)
natively in unwind(8). We are mostly waiting on upstream to implement
support in unbound(8).

And then there is some ongoing maintenance, little things that could
be improved:
+ The captive portal detection in unwind(8) is not perfect and it will
  probably never be.
+ dhcpleased(8) and slaacd(8) should remember IP addresses from
  networks they have been connected to before to be able to quickly
  re-establish connectivity by probing if we are connecting to a
  previous network while the lifetime of our addresses did not expire
  yet. RFC 4436 "Detecting Network Attachment in IPv4 (DNAv4)" and RFC
  6059 "Simple Procedures for Detecting Network Attachment in IPv6"
  have the details.
+ It would be nice if the dhcpleased(8) parent process could be
  pledged. This is not currently possible because of bpf(4). Things to
  investigate here are changes to the network stack that would allow
  us to use raw sockets instead of bpf(4) sockets or the ability to
  [[https://man.openbsd.org/dup.2][dup(2)]] an existing bpf(4) socket and re-program the interface it is
  using.
* Epilogue
Writing all this software over the last six to seven years was a lot
of fun. Combined with all the other features OpenBSD has to offer,
like the /join/ feature, working suspend and resume, and accelerated
video on /amd/ and /intel/ graphics cards, it makes OpenBSD a pleasure
to use on a laptop as a daily driver. Things just work. Mostly. And
if they do not, you have something to fix!