#+TITLE: Fuzzing ping(8)
#+SUBTITLE: ... and finding a 24 year old bug.
#+DATE: 2022-12-01

* Prologue
[[https://freebsd.org][FreeBSD]] had a [[https://www.freebsd.org/security/advisories/FreeBSD-SA-22:15.ping.asc][security fluctuation]] in their implementation of =ping(8)= the other day. As someone who has done a lot of work on [[https://man.openbsd.org/man/ping.8][=ping(8)=]] in [[https://openbsd.org][OpenBSD]], this piqued my interest.

* What about OpenBSD?
=ping(8)= is ancient:
#+begin_example
 *			Author -
 *	Mike Muuss
 *	U. S. Army Ballistic Research Laboratory
 *	December, 1983
#+end_example

What we know today as =ping(8)= started to become recognizable in 1986, for example see this [[https://github.com/csrg/csrg/commit/962056110ebf62ed8d4368964c7e82ac7434ea82][csrg commit]].

FreeBSD identified a stack overflow in the =pr_pack()= function and I expected a lot of similarity between the BSDs. This code has not changed much since the csrg days.

Step one: Does this affect us? Turns out, it does not. FreeBSD rewrote =pr_pack()= in [[https://github.com/freebsd/freebsd-src/commit/d9cacf605e2ac0f704e1ce76357cbfbe6cb63d52][2019]], citing alignment problems. Now we could join the punters on the Internet and point and laugh. But that's just rude, uncalled for, and generally boring and pointless. Technically I'm on vacation and I had resolved to only do fun things this week. So let's have some fun.

Step two: Did we mess something else up? FreeBSD had a problem in =pr_pack()= because that function handles data from the network. The data is untrusted and needs to be validated. Now is as good a time as any to check OpenBSD's implementation of =pr_pack()=.

I have wanted to try fuzzing something, anything, with [[https://en.wikipedia.org/wiki/American_fuzzy_lop_(fuzzer)][afl]] for a few years, but never got around to it. I thought I might as well do it now, might be fun.

* Make sure you are not holding it wrong.
I installed =afl++= from packages and glanced at "[[https://aflplus.plus/docs/tutorials/libxml2_tutorial/][Fuzzing libxml2 with AFL++]]". Here is what we need:
+ A program to test. Something with a known bug so that we can tell the fuzzing works.
+ An input file that does not trigger the bug.
+ Compile the program with =afl-clang-fast=.
+ Run =afl-fuzz=.

[[file:fuzzing-ping/test.c][=test.c=]]:
#+begin_src C
/* Written by Florian Obser, Public Domain */
#include <err.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(int argc, char **argv)
{
	FILE	*f;
	size_t	 fsize;
	uint8_t	*buf, len, *dbuf;

	/* Read the whole input file into buf. */
	f = fopen(argv[1], "rb");
	fseek(f, 0, SEEK_END);
	fsize = ftell(f);
	rewind(f);

	buf = malloc(fsize + 1);
	if (buf == NULL)
		err(1, NULL);

	fread(buf, fsize, 1, f);
	fclose(f);

	buf[fsize] = 0;

	/* The first byte claims how long the payload is. */
	len = buf[0];
	dbuf = malloc(len);
	if (dbuf == NULL)
		err(1, NULL);

	/* Deliberate bug: copies fsize - 1 bytes into a len byte buffer. */
	memcpy(dbuf, buf + 1, fsize - 1);

	warnx("len: %d", len);
	return 0;
}
#+end_src

This program has a trivial buffer overflow. It figures out how big a file is on disk and stores this in =fsize=. It allocates a buffer of this size and then reads the whole file into it. It interprets the first byte as the length of the data (=len=) and allocates a new buffer (=dbuf=) of this size. It skips the length byte and copies =fsize - 1= bytes into the new buffer. So it trusts that the amount of data it read from disk is the same as indicated by the length byte. While this might seem silly, this is what real world buffer overflows look like.

Here is a file where the length byte and file size agree. Create folders =in= and =out= and place =test.txt= into =in/test.txt=. Don't forget the newline.

[[file:fuzzing-ping/test.txt][=test.txt=]]:
#+begin_example
ABBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
#+end_example

Compile =test.c=:
#+begin_src shell
CC=/usr/local/bin/afl-clang-fast make test
#+end_src

and run =afl-fuzz=:
#+begin_src shell
afl-fuzz -i in/ -o out -- ./test @@
#+end_src

It more or less immediately finds a crash.
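
For contrast, here is a minimal sketch of the same copy with the length byte checked against the amount of data actually read. The helper name =copy_payload()= and its return convention are made up for illustration; they are not part of =test.c=.

#+begin_src C
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of test.c: copy the payload that follows
 * the length byte, but only if the length byte agrees with the amount of
 * data we actually have. Returns a freshly allocated buffer, or NULL if
 * the input is inconsistent or the allocation fails.
 */
uint8_t *
copy_payload(const uint8_t *buf, size_t fsize)
{
	uint8_t	*dbuf;
	size_t	 len;

	if (fsize < 1)
		return NULL;		/* not even a length byte */
	len = buf[0];
	if (len == 0 || len != fsize - 1)
		return NULL;		/* length byte and size disagree */

	dbuf = malloc(len);
	if (dbuf == NULL)
		return NULL;
	memcpy(dbuf, buf + 1, len);	/* exactly len bytes, never more */
	return dbuf;
}
#+end_src

With a check like this the mismatch is rejected up front, so the fuzzer would have nothing to find in this code path.
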
The reproducer(s) are in =out/default/crashes/=.

* Fuzzing =ping(8)=
At this point we are facing a few problems. What does it mean to fuzz =ping(8)=? Where do we get the sample input from? And how do we feed it to =ping(8)=?

From a high-level point of view =ping(8)= parses arguments, initializes a bunch of stuff and then enters an infinite loop sending ICMP echo request packets and waiting for a reply. It parses and prints each reply.

Parsing the reply is the interesting thing. The reply comes from the network and is untrusted. This is where things can go wrong. The parsing is handled by =pr_pack()=, so that's what we should fuzz.

** =in/= for =ping(8)=
We need some sample data. An ICMP packet is binary data on the wire. Crafting it by hand is annoying. So let's just hack =ping(8)= to dump the packet to disk.

[[file:fuzzing-ping/ping_output_hack.diff][=ping_output_hack.diff=]]:
#+begin_src diff
diff --git sbin/ping/ping.c sbin/ping/ping.c
index a3b3d650eb5..78b571b95b4 100644
--- sbin/ping/ping.c
+++ sbin/ping/ping.c
@@ -79,6 +79,7 @@
 #include
 #include
+#include
 #include
 #include
@@ -95,6 +96,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -217,6 +219,8 @@ const char *pr_addr(struct sockaddr *, socklen_t);
 void pr_pack(u_char *, int, struct msghdr *);
 __dead void usage(void);
 
+void output(char *, u_char *, int);
+
 /* IPv4 specific functions */
 void pr_ipopt(int, u_char *);
 int in_cksum(u_short *, int);
@@ -255,7 +259,7 @@ main(int argc, char *argv[])
 	int df = 0, tos = 0, bufspace = IP_MAXPACKET, hoplimit = -1, mflag = 0;
 	u_char *datap, *packet;
 	u_char ttl = MAXTTL;
-	char *e, *target, hbuf[NI_MAXHOST], *source = NULL;
+	char *e, *target, hbuf[NI_MAXHOST], *source = NULL, *output_path = NULL;
 	char rspace[3 + 4 * NROUTES + 1];	/* record route space */
 	const char *errstr;
 	double fraction, integral, seconds;
@@ -264,11 +268,13 @@ main(int argc, char *argv[])
 	u_int rtableid = 0;
 	extern char *__progname;
 
+#if 0
 	/* Cannot pledge due to special setsockopt()s below */
 	if (unveil("/", "r") == -1)
 		err(1, "unveil /");
 	if (unveil(NULL, NULL) == -1)
 		err(1, "unveil");
+#endif
 
 	if (strcmp("ping6", __progname) == 0) {
 		v6flag = 1;
@@ -297,8 +303,8 @@ main(int argc, char *argv[])
 	preload = 0;
 	datap = &outpack[ECHOLEN + ECHOTMLEN];
 	while ((ch = getopt(argc, argv, v6flag ?
-	    "c:DdEefgHh:I:i:Ll:mNnp:qS:s:T:V:vw:" :
-	    "DEI:LRS:c:defgHi:l:np:qs:T:t:V:vw:")) != -1) {
+	    "c:DdEefgHh:I:i:Ll:mNno:p:qS:s:T:V:vw:" :
+	    "DEI:LRS:c:defgHi:l:no:p:qs:T:t:V:vw:")) != -1) {
 		switch(ch) {
 		case 'c':
 			npackets = strtonum(optarg, 0, INT64_MAX, &errstr);
@@ -375,6 +381,9 @@ main(int argc, char *argv[])
 		case 'n':
 			options &= ~F_HOSTNAME;
 			break;
+		case 'o':
+			output_path = optarg;
+			break;
 		case 'p':		/* fill buffer with user pattern */
 			options |= F_PINGFILLED;
 			fill((char *)datap, optarg);
@@ -768,10 +777,10 @@ main(int argc, char *argv[])
 	}
 
 	if (options & F_HOSTNAME) {
-		if (pledge("stdio inet dns", NULL) == -1)
+		if (pledge("stdio inet dns wpath cpath", NULL) == -1)
 			err(1, "pledge");
 	} else {
-		if (pledge("stdio inet", NULL) == -1)
+		if (pledge("stdio inet wpath cpath", NULL) == -1)
 			err(1, "pledge");
 	}
 
@@ -960,8 +969,11 @@ main(int argc, char *argv[])
 				}
 			}
 			continue;
-		} else
+		} else {
+			if (output_path != NULL)
+				output(output_path, packet, cc);
 			pr_pack(packet, cc, &m);
+		}
 
 		if (npackets && nreceived >= npackets)
 			break;
@@ -2274,3 +2286,29 @@ usage(void)
 	}
 	exit(1);
 }
+
+void
+output(char *path, u_char *pack, int len)
+{
+	size_t bsz, off;
+	ssize_t nw;
+	int fd;
+	char *fname;
+
+	bsz = len;
+	if (asprintf(&fname, "%s/ping_%lld_%d.out", path, time(NULL),
+	    getpid()) == -1)
+		err(1, NULL);
+
+	fd = open(fname, O_WRONLY | O_CREAT, S_IRUSR | S_IWUSR | S_IRGRP |
+	    S_IROTH);
+	free(fname);
+
+	if (fd == -1)
+		err(1, "open");
+
+	for (off = 0; off < bsz; off += nw)
+		if ((nw = write(fd, pack + off, bsz - off)) == 0 || nw == -1)
+			err(1, "write");
+	close(fd);
+}
#+end_src

After building and installing our hacked version of =ping(8)= we can create sample
input data for afl thusly:
#+begin_src shell
while :; do
	ping -o ./in/ -w 1 -c 1 \
	    $(jot -r 0 0 255 | head -4 | tr '\n' '.' | sed 's/.$//')
done
#+end_src

=jot= creates a stream of random numbers between 0 and 255. We take the first four, join them with '.' and cut off the trailing dot. Voilà, we have a bunch of random IPv4 addresses. We then send a single ping and wait for one second. The ICMP reply is written to =./in/=.

** Fuzzing =pr_pack()=
At this point I wrote a =main()= function that accepts a file name as argument and reads it into a buffer. I then ripped =pr_pack()= out of =ping(8)= and fed it the file contents.

Of course compiling fails quite spectacularly at this point. So I added a bunch of missing functions, defines and global variables.

It gets pretty close now. We don't have the =msghdr= from =recvmsg(2)= so we need to =#if 0= some code. We also need to get rid of the validation of the data packet using =SipHash=, because the whole point is that the data does not validate and =SipHash= would short circuit the parser before it reaches the interesting code. Oh yeah, and the thing is legacy IP only at this point.

So [[file:fuzzing-ping/afl_ping.c][here (=afl_ping.c=)]] it is, and it is quite terrible. It would probably make more sense to copy all of =ping(8)= and slap on a new =main()= function. Maybe.

Anyway, at this point I was 30 minutes in, from reading about afl for the first time until firing up =afl-fuzz= on my hacked =pr_pack()=. Not too bad. It was time for dinner and I left the thing running.

** The promised bug
I came back after dinner and afl had found zero crashes. That's disappointing. Or good. Depending on how you look at it.

But it found hangs. Running =afl_ping= on one of the reproducers, it printed "=unknown option 20=" forever. The problem is in this part of the code:
#+begin_src C
for (; hlen > (int)sizeof(struct ip); --hlen, ++cp) {
	/* [...] */
	switch (*cp) {
	/* [...] */
	default:
		printf("\nunknown option %x", *cp);
		hlen = hlen - (cp[IPOPT_OLEN] - 1);
		cp = cp + (cp[IPOPT_OLEN] - 1);
		break;
	}
}
#+end_src

=cp= points to untrusted data. If =cp[IPOPT_OLEN]= is zero, the =default= case increases =hlen= by one and the =for= loop then subtracts one again. The same happens to =cp=: it moves back by one and the loop moves it forward by one. We never make any progress and spin forever.

The diff is fairly simple:
#+begin_src diff
diff --git ping.c ping.c
index fb31365ad31..6019c87d8db 100644
--- ping.c
+++ ping.c
@@ -1525,8 +1525,11 @@ pr_ipopt(int hlen, u_char *buf)
 			break;
 		default:
 			printf("\nunknown option %x", *cp);
-			hlen = hlen - (cp[IPOPT_OLEN] - 1);
-			cp = cp + (cp[IPOPT_OLEN] - 1);
+			if (cp[IPOPT_OLEN] > 0 && (cp[IPOPT_OLEN] - 1) <= hlen) {
+				hlen = hlen - (cp[IPOPT_OLEN] - 1);
+				cp = cp + (cp[IPOPT_OLEN] - 1);
+			} else
+				hlen = 0;
 			break;
 		}
 	}
#+end_src

I foolishly tweaked the diff after collecting OKs and of course the tweak was wrong. Note to self: Never do this. So it's spread out over two commits: [[https://cvsweb.openbsd.org/src/sbin/ping/ping.c#rev1.247][ping.c, Revision 1.247]] and [[https://cvsweb.openbsd.org/src/sbin/ping/ping.c#rev1.248][ping.c, Revision 1.248]].

This bug was introduced April 3rd, 1998 in [[https://cvsweb.openbsd.org/src/sbin/ping/ping.c#rev1.30][revision 1.30]], over 24 years ago.

* Epilogue
Afl uses files to feed data to programs to get them to crash or otherwise misbehave. I had wondered for a few years how I could use afl with things that talk to the network, because that's what I mostly work on.

In hindsight it's quite obvious. You identify the main parsing function, wrap it in a new =main()= function and Robert is your father's nearest male relative.

The two main takeaways from this are: One, if someone messes up somewhere, go look if you messed up in the same or similar way somewhere else. Two, afl is pretty easy to use, even for network programs. 30 minutes from reading about afl for the first time to finding a bug in a real world program is pretty neat.
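
The wrapping step can be sketched roughly as follows. This is a minimal sketch and not the code in =afl_ping.c=: =parse_packet()= is a stand-in for whatever parser is under test (=pr_pack()= in this post), and =read_file()= and =fuzz_one()= are made-up helper names. A real =main()= would do little more than call =fuzz_one(argv[1])=, where =argv[1]= is the file name =afl-fuzz= substitutes for =@@=.

#+begin_src C
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the parser under test; the real one would be pr_pack(). */
int
parse_packet(const uint8_t *buf, size_t len)
{
	/* Pretend to parse: accept anything with at least one byte. */
	return (buf != NULL && len > 0) ? 0 : -1;
}

/* Read a whole file into a malloc'd buffer; returns NULL on any error. */
uint8_t *
read_file(const char *path, size_t *lenp)
{
	FILE	*f;
	uint8_t	*buf;
	long	 sz;

	if ((f = fopen(path, "rb")) == NULL)
		return NULL;
	if (fseek(f, 0, SEEK_END) != 0 || (sz = ftell(f)) < 0) {
		fclose(f);
		return NULL;
	}
	rewind(f);
	/* Allocate at least one byte so an empty file is not malloc(0). */
	if ((buf = malloc(sz > 0 ? (size_t)sz : 1)) == NULL) {
		fclose(f);
		return NULL;
	}
	if (sz > 0 && fread(buf, (size_t)sz, 1, f) != 1) {
		free(buf);
		fclose(f);
		return NULL;
	}
	fclose(f);
	*lenp = (size_t)sz;
	return buf;
}

/* What main() would do with argv[1]: load the file, hand it to the parser. */
int
fuzz_one(const char *path)
{
	uint8_t	*buf;
	size_t	 len;
	int	 rc;

	if ((buf = read_file(path, &len)) == NULL)
		return -1;
	rc = parse_packet(buf, len);
	free(buf);
	return rc;
}
#+end_src

The point of keeping the file handling separate from the parser is that the network never enters the picture: whatever bytes afl writes to the file are exactly what the parser sees.
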