First stab at VerifyHostKeyDNS

2023-01-14 19:32:00 +01:00
parent ef9dea95ae
commit cdaa16020f
1 changed files with 123 additions and 0 deletions
--- a/VerifyHostKeyDNS.org
+++ b/VerifyHostKeyDNS.org
@@ -0,0 +1,123 @@
+#+TITLE: VerifyHostKeyDNS
+#+SUBTITLE: ... or how I enroll new hosts into my infrastructure.
+#+DATE: 2023-12-14
+* Prologue
+I run my own infrastructure. I self-host my email, DNS, this website,
+a [[https://git.tlakh.xyz/explore/repos][git server]], [[https://restic.net/][backups]], and probably a bunch of other stuff that I
+forgot about. Ah yes, [[https://icinga.com/][monitoring]], Ubiquiti UniFi for my Wi-Fi access
+points at home and probably even more stuff.
+
+All of it runs [[https://openbsd.org/][OpenBSD]], except for one machine running [[https://debian.org/][Debian]]. It's
+all tied together with [[https://www.ansible.com/][ansible]][fn:: I started out with ansible,
+switched to SaltStack, and moved back to ansible. Because reasons.].
+
+So far it's eight machines. While I was reinstalling and consolidating
+some VMs and physical machines, hooking up new machines became
+annoying.
+* StrictHostKeyChecking
+My ansible orchestration host needs to be able to talk to new machines
+over ssh. New machines need to talk to the backup server over ssh and
+submit passive check results over ssh to the monitoring server. The
+monitoring server needs to talk to new hosts over ssh[fn:: I don't
+trust nrpe. I have seen the code. Instead I use ~by_ssh~ to monitor
+hosts. Ansible adds an ssh public-key to a monitoring user with a
+force-command. The force-command is a shell-script switching over
+~${SSH_ORIGINAL_COMMAND}~ to run specific check_commands. It does not
+trust the remote ssh at all.].
+
+So we have the issue of existing infrastructure needing to verify
+host-keys of new hosts and new hosts needing to verify host-keys of
+existing infrastructure. One way to deal with this is to run a [[https://www.lorier.net/docs/ssh-ca.html][CA,
+sign host-keys with it and roll certificates out]].
+
+I, on the other hand, prefer to use DNS[fn:: I have a laptop sticker and
+travel mug with "We reject kings, presidents and voting. We believe in
+rough consensus and running code." crossed out with "Fuck that! Just
+put it in DNS." I also have a RUN DNS sticker. I am biased.]. [[https://www.rfc-editor.org/rfc/rfc4255][RFC 4255]] provides
+facilities to store host-keys in SSHFP resource records in DNS and we
+can secure those with DNSSEC.
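+
+As a rough sketch of what such a record looks like in a zone file
+(the name =host.example.com= and the digest placeholder are made up
+for illustration): algorithm 4 is Ed25519 and fingerprint type 2 is
+SHA-256.
+#+begin_example
+host.example.com. IN SSHFP 4 2 <64 hex digits of the SHA-256 fingerprint>
+#+end_example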
+
+* VerifyHostKeyDNS
+[[https://man.openbsd.org/ssh_config.5#VerifyHostKeyDNS][ssh_config(5)]] explains how [[https://man.openbsd.org/ssh.1][ssh(1)]] can use SSHFP records to verify
+host-keys:
+#+begin_example
+     VerifyHostKeyDNS
+             Specifies whether to verify the remote key using DNS and SSHFP
+             resource records.  If this option is set to yes, the client will
+             implicitly trust keys that match a secure fingerprint from DNS.
+             Insecure fingerprints will be handled as if this option was set
+             to ask.  If this option is set to ask, information on fingerprint
+             match will be displayed, but the user will still need to confirm
+             new host keys according to the StrictHostKeyChecking option.  The
+             default is no.
+
+#+end_example
+
+One problem with this is that if you just put
+#+begin_example
+Host *
+    VerifyHostKeyDNS yes
+#+end_example
+into your =.ssh/config=, it will not work. The magic words are /secure
+fingerprint/. What the man page means is that a DNS answer for SSHFP
+needs to have the /Authentic Data (AD)/ flag set. The flag gets set
+when a validating name-server is asked for the SSHFP record, finds
+it, and can validate the answer using DNSSEC.
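+
+One way to see the flag, assuming a validating resolver listens on
+127.0.0.1 and using the made-up name =host.example.com=, is to run
+~dig @127.0.0.1 host.example.com SSHFP~ and look at the header. With a
+validated answer the flags line should contain =ad=, roughly like
+this (the counts will differ):
+#+begin_example
+;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
+#+end_example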
+
+But when the libc stub resolver[fn:: The thingy that ssh uses to talk
+to the validating name-server. On OpenBSD that is [[https://man.openbsd.org/man3/asr_run.3][asr]].] gets that
+answer, it strips the AD flag for security reasons. You see, it
+does not know whether it can trust the validating name-server. One way
+to have a trustworthy validating name-server is to run one on localhost.
+
+[[http://man.openbsd.org/resolv.conf#trust-ad][resolv.conf(5)]] explains the *trust-ad* option:
+#+begin_example
+     trust-ad   A name server indicating that it performed DNSSEC
+                validation by setting the Authentic Data (AD) flag
+                in the answer can only be trusted if the name
+                server itself is trusted and the network path is
+                trusted.  Generally this is not the case and the
+                AD flag is cleared in the answer.  The trust-ad
+                option lets the system administrator indicate that
+                the name server and the network path are trusted.
+                This option is automatically enabled if
+                resolv.conf only lists name servers on localhost.
+#+end_example
+The easiest way is to run [[https://man.openbsd.org/unwind.8][unwind(8)]]. [[https://man.openbsd.org/resolvd.8][resolvd(8)]] will then add
+=nameserver 127.0.0.1= to =/etc/resolv.conf= and comment out all
+other dynamically learned name servers. Just make sure that you are
+not using any statically configured name servers[fn:: I use ~! route
+nameserver $if 149.112.112.9 2620:fe::9 9.9.9.9 2620:fe::fe:9~ in my
+main [[http://man.openbsd.org/hostname.if.5][hostname.if(5)]] to add some static name servers in case unwind(8)
+crashes[fn:: Not sure why it would do that though. Sounds
+unpleasant.].] because you really want to have only =nameserver
+127.0.0.1= in there.
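+
+For illustration, with unwind(8) running the file should end up
+looking something like the following. The interface name =em0= and
+the address 192.0.2.53 are placeholders, and the exact comments
+resolvd(8) writes may differ:
+#+begin_example
+nameserver 127.0.0.1 # resolvd: unwind
+#nameserver 192.0.2.53 # resolvd: em0
+#+end_example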
+* Putting it all together
+When I install a new host I have out-of-band access in one way or
+another. It might be a serial console, a fake HTML5 console or some
+KVM contraption. Heck, I even used [[https://blog.eulinux.org/2018/07/hetzner-install.html][qemu]] to get OpenBSD running on some
+Hetzner physical machine.
+
+On the installed machine I use said out-of-band access to run
+#+begin_src shell
+  ssh-keygen -l -f /etc/ssh/ssh_host_ed25519_key.pub
+#+end_src
+This gives me the fingerprint of one ssh host-key. I can then log in
+over ssh and compare it with the fingerprint ssh shows me.
+
+I then run
+#+begin_src shell
+  ls /etc/ssh/*.pub | xargs -n1 ssh-keygen -r $(hostname) -f
+#+end_src
+and copy & paste the result into my DNS zone file alongside the A and
+AAAA records for legacy IP and IPv6. I use [[https://www.powerdns.com/][PowerDNS]] as a hidden DNSSEC
+signer, so I paste into the editor that ~pdnsutil edit-zone~
+opens.
+
+While still logged in I install python3 and add an ssh-key for
+ansible. I then add the host to the ansible inventory. The ansible
+orchestrator can now finish the installation of the host over ssh
+while trusting the SSHFP records it finds in DNS.
+
+The newly installed host, in turn, can verify that it is talking to my
+backup and monitoring servers using their published SSHFP records.