Hopper
Distributed Fuzzer inspired by AFL++
Hopper aims to improve the performance of fuzzing in large-scale distributed environments; it is not meant to replace AFL++ in most cases.

(Screenshot: Hopper Master TUI)
Usage
Pre-Reqs:
Instrumentation:
- The compile script adds all the flags required to compile the target program with clang++.
Ex: ./compile target.c
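The exact flags are defined by the compile script itself; as a rough sketch, an AFL++-style clang++ instrumentation command generally looks something like the following (illustrative flags only, not necessarily what the script passes):

clang++ -g -O2 -fsanitize=address -fsanitize-coverage=trace-pc-guard target.c -o target

Here -fsanitize-coverage supplies the edge-coverage feedback and -fsanitize=address surfaces memory-corruption crashes.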
Env:
- HOPPER_OUT: Where to save Hopper output (sanitizer reports, crash inputs, and Hopper reports), defaults to . (the current directory)
- HOPPER_LOG: Enable logging on the Master, defaults to HOPPER_LOG=0. Master only.
- HOPPER_LOG_INTERVAL: Logging interval in minutes, defaults to HOPPER_LOG_INTERVAL=30 (ignored if HOPPER_LOG is not set). Master only.
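For example, to run the Master with output written to ./hopper_out and logging every 15 minutes (illustrative values, assuming a non-zero value such as HOPPER_LOG=1 enables logging):

HOPPER_OUT=./hopper_out HOPPER_LOG=1 HOPPER_LOG_INTERVAL=15 ./hopper-master -I test/in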
Master:
- -I: Path to input corpus, a directory containing files, each being a seed
- -H: Havoc level to use in the mutator, defaults to 1 (recommended: increase havoc for larger seeds)
- -P: Port to host the Master on, defaults to 6969
Ex: go build ./cmd/hopper-master && ./hopper-master -H 5 -I test/in
Node:
- -I: Node ID, usually just a unique int
- -T: Path to the instrumented target binary
- -M: IP/address of the Master, defaults to localhost
- -P: Port of the Master, defaults to 6969
- --raw: Whether the seed should be fed directly in the run command, defaults to false. By default, Hopper will put the bytes in a file and feed that file to the target.
- --args: Args to use against the target, ex: --depth=1 @@
- --env: Env variables for the target, separated by ;, ex: ENV1=foo;ENV2=bar;
- --stdin: Whether the seed should be fed via stdin or as an argument, defaults to false
Ex:
Args: go build ./cmd/hopper-node; ./hopper-node -I 1 -T target --args "--depth=2 @@"
Stdin: go build ./cmd/hopper-node; ./hopper-node -I 1 -T target --stdin
Containerized Demos:
You can see all the running Hopper containers with: docker ps -f "name=hopper"
Fuzzing simple parse util:
Hopper running locally with 10 fuzzing Nodes, fuzzing a test application with a known vulnerability:
- Clone project: git clone https://github.com/Cybergenik/hopper.git && cd hopper
- Build base Hopper image: docker build -t hopper-node .
- Run Master: ./examples/parse/docker/master_docker.sh
- Run Nodes: ./examples/parse/docker/node_docker.sh 1 10 (I'd recommend no more than 1x the number of logical cores on your machine; any more Nodes on one system just get throttled and compete for CPU time)
- Look at the nice TUI :>
Fuzzing Readelf:
Hopper running locally with 10 fuzzing Nodes, fuzzing GNU binutils-2.40 readelf:
- Clone project: git clone https://github.com/Cybergenik/hopper.git && cd hopper
- Build base Hopper image: docker build -t hopper-node .
- Build readelf image: docker build -t hopper-readelf ./examples/binutils/
- Run Master: ./examples/binutils/readelf/master_docker.sh
- Run Nodes: ./examples/binutils/readelf/node_docker.sh 1 10 (same recommendation as above: no more than 1x the number of logical cores on your machine)
- Look at the nice TUI :>
Design & Implementation
Overview
Master
The Masters job is to schedule fuzz tasks on Nodes in the cluster, keep track of
coverage, mutate seeds, and produce reports. The Master handles all these
responsibilities concurrently. There are two main processes running concurrently
on the Master, an RPC server and the Mutation Engine.
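As a minimal sketch of the RPC half, here is roughly what a Go net/rpc server on the default port could look like. The Master type, CrashReport struct, and ReportCrash method are hypothetical stand-ins for illustration, not Hopper's actual API:

    package main

    import (
        "fmt"
        "net"
        "net/http"
        "net/rpc"
    )

    // CrashReport is a hypothetical payload a Node might send to the Master.
    type CrashReport struct {
        NodeID int
        Seed   []byte
    }

    type Master struct{}

    // ReportCrash is a hypothetical RPC method a Node could call to report a crash.
    func (m *Master) ReportCrash(r CrashReport, ack *bool) error {
        fmt.Printf("crash from node %d (%d-byte seed)\n", r.NodeID, len(r.Seed))
        *ack = true
        return nil
    }

    func main() {
        rpc.Register(new(Master))
        rpc.HandleHTTP()
        l, err := net.Listen("tcp", ":6969") // default Master port from the Usage section
        if err != nil {
            panic(err)
        }
        http.Serve(l, nil) // runs alongside the Mutation Engine (not shown)
    }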
Coverage:
Hopper uses a bloom filter to keep track of coverage and to deduplicate seeds based on coverage and content.
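To illustrate the idea (not Hopper's actual implementation), a tiny bloom filter that reports whether a coverage/content fingerprint has been seen before might look like this in Go:

    package main

    import (
        "fmt"
        "hash/fnv"
    )

    // Bloom is a toy bloom filter; Hopper's real parameters and hashing differ.
    type Bloom struct {
        bits []uint64
        k    int // number of hash functions
    }

    func NewBloom(m, k int) *Bloom {
        return &Bloom{bits: make([]uint64, (m+63)/64), k: k}
    }

    // hash derives the i-th hash by salting a single FNV-1a base hash.
    func (b *Bloom) hash(data []byte, i int) uint {
        h := fnv.New64a()
        h.Write([]byte{byte(i)})
        h.Write(data)
        return uint(h.Sum64() % uint64(len(b.bits)*64))
    }

    // AddIfNew sets the k bits for data and reports whether any bit was
    // previously unset, i.e. whether this fingerprint looks new.
    func (b *Bloom) AddIfNew(data []byte) bool {
        isNew := false
        for i := 0; i < b.k; i++ {
            p := b.hash(data, i)
            w, m := p/64, uint64(1)<<(p%64)
            if b.bits[w]&m == 0 {
                isNew = true
                b.bits[w] |= m
            }
        }
        return isNew
    }

    func main() {
        bf := NewBloom(1<<20, 4)
        fmt.Println(bf.AddIfNew([]byte("cov:0x4010->0x4050|seed"))) // true: unseen
        fmt.Println(bf.AddIfNew([]byte("cov:0x4010->0x4050|seed"))) // false: duplicate
    }

The trade-off is that a bloom filter can produce false positives (discarding a genuinely new seed) but never false negatives, in exchange for constant-time membership checks and a small, fixed memory footprint.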
Mutation Engine:
The Mutation Engine acts as a load balancer by popping energized seeds from the energy priority queue (EPQ), mutating them, and feeding the newly formed seeds into the task queue. It only mutates when there is enough space in the task queue for more tasks; otherwise it stalls. Because a single energized seed can turn into tens of new seeds, this can be seen as an inverse funnel, and the Mutation Engine thus exerts some control over the flow through the system (see the sketch below).
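A self-contained sketch of that flow, using a bounded channel as the task queue so that sends naturally stall when it is full; all names and the byte-flip mutator are illustrative, not Hopper's actual code:

    package main

    import (
        "container/heap"
        "fmt"
        "math/rand"
    )

    // Seed carries input bytes plus the energy assigned to them.
    type Seed struct {
        Data   []byte
        Energy int
    }

    // EPQ is a max-heap of seeds ordered by energy.
    type EPQ []*Seed

    func (q EPQ) Len() int           { return len(q) }
    func (q EPQ) Less(i, j int) bool { return q[i].Energy > q[j].Energy }
    func (q EPQ) Swap(i, j int)      { q[i], q[j] = q[j], q[i] }
    func (q *EPQ) Push(x any)        { *q = append(*q, x.(*Seed)) }
    func (q *EPQ) Pop() any {
        old := *q
        s := old[len(old)-1]
        *q = old[:len(old)-1]
        return s
    }

    // mutate fans one energized seed out into Energy byte-flip variants.
    func mutate(s *Seed) [][]byte {
        out := make([][]byte, s.Energy)
        for i := range out {
            m := append([]byte(nil), s.Data...)
            m[rand.Intn(len(m))] ^= 0xFF
            out[i] = m
        }
        return out
    }

    func main() {
        tasks := make(chan []byte, 8) // bounded task queue: sends stall when full
        epq := &EPQ{{Data: []byte("hello"), Energy: 4}, {Data: []byte("world"), Energy: 2}}
        heap.Init(epq)

        go func() { // Mutation Engine: pop energized seeds, refill the task queue
            for epq.Len() > 0 {
                s := heap.Pop(epq).(*Seed)
                for _, m := range mutate(s) {
                    tasks <- m // blocks (stalls) while the task queue is full
                }
            }
            close(tasks)
        }()

        for t := range tasks { // stand-in for Nodes pulling tasks over RPC
            fmt.Printf("task: %q\n", t)
        }
    }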
Node
A Hopper Node's job is to run the PUT (program under test), gather and parse coverage, and report coverage/crash data to the Master. Each Node runs a main fuzz loop. Nodes are fairly synchronous, with a few sections of parallelism for logging crashes and clean-up, but generally each instantiation of a Node is kept synchronous so that it can more easily be reasoned about as a discrete unit of computation.
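A stripped-down sketch of that loop in Go, covering the default file-based seed delivery described under --raw; the target path, crash heuristic, and inline task list are stand-ins for the real RPC-driven loop:

    package main

    import (
        "fmt"
        "os"
        "os/exec"
    )

    // runPUT writes the seed to a temp file and runs the instrumented target
    // on it, mirroring the default (non --raw, non --stdin) behavior above.
    func runPUT(target string, seed []byte) (crashed bool, err error) {
        f, err := os.CreateTemp("", "hopper-seed-*")
        if err != nil {
            return false, err
        }
        defer os.Remove(f.Name())
        if _, err := f.Write(seed); err != nil {
            return false, err
        }
        f.Close()

        cmd := exec.Command(target, f.Name()) // "@@" in --args stands for this file path
        err = cmd.Run()
        if exitErr, ok := err.(*exec.ExitError); ok {
            // On Unix, ExitCode() is -1 when the process was killed by a
            // signal (e.g. SIGSEGV), which we treat as a crash here.
            return exitErr.ExitCode() == -1, nil
        }
        return false, err
    }

    func main() {
        // Main fuzz loop, kept synchronous: take a task, run the PUT, report.
        seeds := [][]byte{[]byte("AAAA"), []byte("BBBB")} // stand-in for tasks from the Master
        for _, seed := range seeds {
            crashed, err := runPUT("./target", seed) // "./target" is a placeholder path
            if err != nil {
                fmt.Fprintln(os.Stderr, "run error:", err)
                continue
            }
            fmt.Printf("seed %q crashed=%v\n", seed, crashed) // stand-in for the RPC report
        }
    }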