AFL-fast vs Erlang

Looking at Elixir and Phoenix, I started to wonder how robust the Erlang VM really is. After finding 1800+ crashes in Python 3.5 using the method shown here: http://tomforb.es/segfaulting-python-with-afl-fuzz I decided to try the same for Erlang. I trawled through the Internet trying to find out how to do it properly but didn't find anything useful.

TL;DR Erlang is pretty robust code, but fuzzing it is currently a bit of a pain. This is mainly due to a complex, legacy system of wrappers around the VM itself, and a bulky VM. Regardless, I've found and submitted some interesting bugs.

Understanding how Erlang works (more or less)

So, the code for the newest version of Erlang can be seen here: https://github.com/erlang/otp
It's fairly easy to compile; you basically do:

./otp_build setup

This gives us a full ERTS environment as well as full OTP. Now, in the ./bin dir we have the following executables:

sokoow@phalanx:/home/erlang/otp/bin$ ls -al  
total 720  
drwxr-xr-x  3 sokoow pharaoh   4096 Aug 22 17:57 .  
drwxr-xr-x 13 sokoow pharaoh   4096 Aug 22 17:24 ..  
-rwxr-xr-x  1 sokoow pharaoh   9843 Aug 22 17:49 cerl
-rwxr-xr-x  1 sokoow pharaoh 129344 Aug 22 17:49 ct_run
-rwxr-xr-x  1 sokoow pharaoh 126136 Aug 22 17:49 dialyzer
-rwxr-xr-x  1 sokoow pharaoh    865 Aug 22 17:49 erl
-rwxr-xr-x  1 sokoow pharaoh 125936 Aug 22 17:49 erlc
-rwxr-xr-x  1 sokoow pharaoh 132296 Aug 22 17:49 escript
-rw-r--r--  1 sokoow pharaoh      0 Aug 22 17:21 .gitignore
-rw-r--r--  1 sokoow pharaoh   5367 Aug 22 17:49 no_dot_erlang.boot
-rw-r--r--  1 sokoow pharaoh   6326 Aug 22 17:49 no_dot_erlang.script
-rw-r--r--  1 sokoow pharaoh   5395 Aug 22 17:49 start.boot
-rw-r--r--  1 sokoow pharaoh   5395 Aug 22 17:49 start_clean.boot
-rw-r--r--  1 sokoow pharaoh   6356 Aug 22 17:49 start_clean.script
-rw-r--r--  1 sokoow pharaoh   6476 Aug 22 17:49 start_sasl.boot
-rw-r--r--  1 sokoow pharaoh   7741 Aug 22 17:49 start_sasl.script
-rw-r--r--  1 sokoow pharaoh   6356 Aug 22 17:49 start.script
-rwxr-xr-x  1 sokoow pharaoh 120136 Aug 22 17:49 typer
drwxr-xr-x  2 sokoow pharaoh   4096 Aug 22 19:30 x86_64-unknown-linux-gnu  

The two basic things we'll use are the erlc and erl commands. erlc compiles our code, and erl starts the interactive shell or executes our code.

All right, so now it's about time to write a hello world in Erlang, which looks as follows:

% hello world program
-module(helloworld).
-export([start/0]).

start() ->  
    io:fwrite("Hello, world!\n").

Code for a simple loop printing some numbers is also very easy:

-module(helloworld).
-export([while/1,while/2, start/0]).

while(L) -> while(L,0).  
while([], Acc) -> Acc;

while([_|T], Acc) ->  
   io:fwrite("~w~n",[Acc]),
   while(T,Acc+1).

start() ->
   X = [1,2,3,4],
   while(X).

I'll explain later on why both of them have the helloworld symbol exported.

Compiling Erlang code

Cool, so we're ready to compile our first programs, let's do that:

./erlc helloworld.erl

and we get an output file helloworld.beam in exchange.

Good, so let's run it, possibly in unattended mode, so it prints our Hello World message and exits:

./erl helloworld.beam

Erlang/OTP 19 [erts-8.0.3] [source-7b1cda1] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V8.0.3  (abort with ^G)  
1>  

That didn't go very well, so how do we run it so it starts, prints and ends? It turns out that it's not that simple and we need to specify some extra arguments, here's what I found out after some digging:

./erl helloworld.beam -noshell -s helloworld start -s init stop

Hello, world!  

That's way better. It appears that -s tells it which exported function to run. Cool, so we can compile to a binary and run it unattended. We have almost everything we need to fuzz it. But...
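The unattended invocation above can be wrapped in a small shell helper. This is only a sketch: the function name run_module and the ERL variable are made up for illustration, and it assumes the compiled .beam file sits in the current directory.

```shell
# Hypothetical helper: run MODULE:FUNCTION() once, then stop the VM.
# ERL is an assumed variable pointing at the erl wrapper (default: ./erl).
run_module() {
    module="$1"
    func="${2:-start}"    # fall back to start when no function is given
    "${ERL:-./erl}" "$module.beam" -noshell -s "$module" "$func" -s init stop
}
```

Calling run_module helloworld start should then behave like the command shown above.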

Behind the stage

So let's look closer at what the erl and erlc commands do. First, a glance at file sizes:

sokoow@phalanx:/home/erlang/otp/bin$ ls -al erl*  
-rwxr-xr-x 1 sokoow pharaoh    865 Aug 22 17:49 erl
-rwxr-xr-x 1 sokoow pharaoh 125936 Aug 22 17:49 erlc

So erl is very small, and erlc looks like a binary, but it's too small to be a full VM. Both are actually wrappers around further commands. erl is a shell script that calls $BINDIR/erlexec, and erlc uses an execvp call to fire away something else:

sokoow@phalanx:/home/erlang/otp/bin$ nm erlc | grep exec  
                 U execvp@@GLIBC_2.2.5

Cool, so this configuration is not very fuzz-friendly: we can't fuzz shell scripts, and fuzzing wrappers will just slow us down. How do we find out what actually gets called to run the program?

We could use strace, but it will only show us this:

sokoow@phalanx:/home/erlang/otp/bin$ strace ./erlc helloworld.erl  2>&1 | grep exec  
execve("./erlc", ["./erlc", "helloworld.erl"], [/* 23 vars */]) = 0  
execve("./erl", ["./erl", "+sbtu", "+A0", "-noinput", "-mode", "minimal", "-boot", "start_clean", "-s", "erl_compile", "compile_cmdline", "-extra", "helloworld.erl"], [/* 23 vars */]) = 0  
execve("/home/erlang/otp/bin/x86_64-unknown-linux-gnu/erlexec", ["/home/erlang/otp/bin/x86_64-unkn"..., "+sbtu", "+A0", "-noinput", "-mode", "minimal", "-boot", "start_clean", "-s", "erl_compile", "compile_cmdline", "-extra", "helloworld.erl"], [/* 27 vars */]) = 0  
execve("/home/erlang/otp/bin/x86_64-unknown-linux-gnu/beam.smp", ["/home/erlang/otp/bin/x86_64-unkn"..., "-sbtu", "-A0", "--", "-root", "/home/erlang/otp", "-progname", "erl", "--", "-home", "/root", "--", "-noshell", "-noinput", "-mode", "minimal", ...], [/* 27 vars */]) = 0  

The output is truncated: we've now discovered that bin/x86_64-unknown-linux-gnu/beam.smp is what ends up being called, but we still don't have the full command line.
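For what it's worth, the truncation comes from strace's defaults: as far as I know it cuts strings at 32 characters and abbreviates long argument lists. A sketch of a helper that asks for the full output is below. The name show_execs is made up; the strace options themselves (-f, -v, -s, -e) are standard.

```shell
# Hypothetical helper: list every execve() made by a command and its children.
#   -f               follow forked child processes
#   -v               print unabbreviated argument and environment lists
#   -s 4096          allow strings up to 4096 characters before truncating
#   -e trace=execve  report execve calls only
show_execs() {
    strace -f -v -s 4096 -e trace=execve "$@" 2>&1 | grep 'execve('
}
```

Something like show_execs ./erlc helloworld.erl should then print each command line in full. Note that -v also dumps the whole environment of every call, so the output gets long.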

Digging even deeper

Not thinking too much, I started to trawl through the code and found an interesting line in the erlexec.c file:

static int verbose = 0;         /* If non-zero, print some extra information. */  

After setting it to 1 and recompiling, I started seeing this:

sokoow@phalanx:/home/erlang/otp/bin$ ./erl helloworld.beam -noshell -s helloworld start -s init stop  
Executing: /home/erlang/otp/bin/x86_64-unknown-linux-gnu/beam.smp /home/erlang/otp/bin/x86_64-unknown-linux-gnu/beam.smp -- -root /home/erlang/otp -progname erl -- -home /home/sokoow -- helloworld.beam -noshell -s helloworld start -s init stop

Hello, world!  

Yay! We're getting somewhere. beam.smp appears to be a fairly large binary (12 MB), so it's our VM.

It turned out that in order to compile, we can reduce this to:

beam.smp -sbtu -A0 -- -root /home/erlang/otp -- -home /home/sokoow -- -noshell -noinput -mode minimal -boot start_clean -s erl_compile compile_cmdline -extra helloworld.erl  

and to execute:

beam.smp -- -root /home/erlang/otp -- -home /home/sokoow -- helloworld.beam -noshell -s helloworld -s init stop  

Now, here's one catch. Because you're no longer calling it through the wrapper scripts, it won't know the context it has to run in (directories etc.). You have to set it all up manually:

export ROOTDIR="/home/erlang/otp"  
export BINDIR="$ROOTDIR/bin/x86_64-unknown-linux-gnu"  
export EMU="beam"  
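Put together, the environment setup and the direct call could look roughly like this. Treat it as a sketch: beam_direct and BEAM_BIN are names I made up, the default paths are the ones from this post, and $HOME stands in for the hard-coded home directory.

```shell
# Paths are the ones used in this post; override them for your checkout.
export ROOTDIR="${ROOTDIR:-/home/erlang/otp}"
export BINDIR="${BINDIR:-$ROOTDIR/bin/x86_64-unknown-linux-gnu}"
export EMU="${EMU:-beam}"

# Hypothetical helper: run a compiled module directly on the VM binary,
# skipping the erl and erlexec wrappers.
# BEAM_BIN is an assumed variable naming the emulator binary (default beam.smp).
beam_direct() {
    module="$1"
    "$BINDIR/${BEAM_BIN:-beam.smp}" -- -root "$ROOTDIR" -- -home "$HOME" -- \
        "$module.beam" -noshell -s "$module" -s init stop
}
```

With helloworld.beam in the current directory, beam_direct helloworld should print the greeting and exit.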

Cool, looks like we have something to fuzz finally.

Erlang on AFLFast

AFLFast is a cool fork of AFL-Fuzz that makes it much easier to find interesting inputs. Its authors have added a couple of new exploration heuristics: https://github.com/mboehme/aflfast

Just follow the README to set it all up. We'll also need to go into the llvm_mode dir and run make there; this builds the afl-clang-fast compiler wrappers, which are supposed to produce faster instrumented binaries. Remember to install clang and llvm first.

Now the part where we instrument the Erlang VM code with afl:

CC=afl-clang-fast CXX=afl-clang-fast++ ./otp_build setup  
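Before fuzzing, it may be worth checking that the build really picked up the AFL compiler. As far as I know, afl-fuzz itself looks for the string __AFL_SHM_ID inside the target binary to decide whether it is instrumented, so the same check can be sketched in shell (is_afl_instrumented is a made-up name):

```shell
# Hypothetical helper: succeed if the file contains AFL's shared-memory
# marker string, which instrumented binaries are expected to carry.
is_afl_instrumented() {
    grep -q '__AFL_SHM_ID' "$1"
}
```

For example: is_afl_instrumented bin/x86_64-unknown-linux-gnu/beam.smp && echo instrumented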

Now that we have the binary part built, we can do some first fuzzing. First, create a dir called input in your x86_64-unknown-linux-gnu location, place your helloworld.beam file there, and start fuzzing:

afl-fuzz -i input -o out -f helloworld.beam -- ./beam.smp -- -root /home/erlang/otp -- -home /home/sokoow -- helloworld.beam -noshell -s helloworld -s init stop  

And... this is not going to look good:

[Screenshot: AFL no progress]

Looks like the instrumentation isn't picking up any changes at all. To see what's going on, we can run a single execution under instrumentation and look at the program's output, using the afl-showmap command:

afl-showmap -o my_trace -- ./beam.smp -- -root /home/erlang/otp -- -home /home/sokoow -- helloworld.beam -noshell -s helloworld -s init stop  
afl-showmap 2.30b by <lcamtuf@google.com>  
[*] Executing './beam.smp'...

-- Program output begins --
erts_mmap: Failed to create super carrier of size 512 MB  
-- Program output ends --
[+] Captured 482 tuples in 'my_trace'.

Right, so it doesn't have much memory and it wants more. After some experimentation, raising AFL's memory limit to 2 GB (the -m 2000 option used in the final afl-fuzz command) was enough:

afl-showmap -o my_trace -- ./beam.smp -- -root /home/erlang/otp -- -home /home/sokoow -- helloworld.beam -noshell -s helloworld -s init stop  
afl-showmap 2.30b by <lcamtuf@google.com>  
[*] Executing './beam.smp'...

-- Program output begins --
Failed to create aux thread

Crash dump is being written to: erl_crash.dump...

A bit better, it appears to run... something. We still get Failed to create aux thread, and it hangs at the crash dump. After a bit of digging, a way to get rid of the crash dump hangs is the following:

export ERL_CRASH_DUMP=/dev/null  

But how do we get rid of the aux thread message? Again, after some experimentation, it turns out that the only way it will work is to use the beam binary, not the beam.smp one:

afl-showmap -o my_trace -- ./beam -- -root /home/erlang/otp -- -home /home/sokoow -- helloworld.beam -noshell -s helloworld -s init stop  
afl-showmap 2.30b by <lcamtuf@google.com>  
[*] Executing './beam'...

-- Program output begins --
Hello, world!  
-- Program output ends --
[+] Captured 12517 tuples in 'my_trace'.
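The tuple count is a handy signal of whether the VM actually got as far as running our code: 482 tuples for the failed start versus 12517 here. In its default text mode, afl-showmap writes one line per tuple to the trace file, so the count can be read back with a small helper (tuple_count is a made-up name, and the one-line-per-tuple format is my assumption about this afl version):

```shell
# Hypothetical helper: count the tuples recorded in an afl-showmap trace file.
tuple_count() {
    # wc -l may pad its output with spaces on some systems; tr strips them.
    wc -l < "$1" | tr -d ' '
}
```

Running tuple_count my_trace after each experiment makes it easy to compare runs without scrolling through the program output.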

Woohoo! Now let's just go with it:

afl-fuzz -i input -o out -f helloworld.beam -m 2000 -t 5000 -- ./beam -- -root /home/erlang/otp -- -home /home/sokoow -- helloworld.beam -noshell -s helloworld -s init stop  

And you will hopefully get something like this:

[Screenshot: AFL erlang crashes]

Now here's the catch: it's dog slow. I was able to get 2 execs/s at most. One way to improve this would be to use the method described here: https://www.invincealabs.com/blog/2016/08/fuzzing-nginx-with-afl/ and place __AFL_INIT(); somewhere in a nice place. Or just talk to the Erlang devs to see whether we can launch it in a different way.

Regardless, even at that turtle speed it managed to crash a couple of times. 9 of the crashes appeared to be unique, and I've submitted them to http://bugs.erlang.org/.
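To double-check crashes before reporting them, each saved input can be replayed through the VM outside of AFL. AFL keeps crashing inputs in the crashes subdirectory of the output dir, with file names starting with id. The loop below is a sketch rather than the exact procedure I used; replay_crashes is a made-up name.

```shell
# Hypothetical helper: replay each saved crash through a command.
# Usage: replay_crashes CRASH_DIR INPUT_FILE COMMAND [ARGS...]
replay_crashes() {
    crash_dir="$1"
    input_file="$2"
    shift 2
    for crash in "$crash_dir"/id*; do
        [ -e "$crash" ] || continue      # directory had no crash files
        cp "$crash" "$input_file"        # put the input where the target reads it
        "$@" > /dev/null 2>&1
        status=$?
        # The shell reports death by signal as an exit status above 128.
        if [ "$status" -gt 128 ]; then
            echo "$crash: signal $((status - 128))"
        else
            echo "$crash: exit $status"
        fi
    done
}
```

For example: replay_crashes out/crashes helloworld.beam ./beam -- -root /home/erlang/otp -- -home /home/sokoow -- helloworld.beam -noshell -s helloworld -s init stop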

Post me a comment if you want more or have any questions.
