Repository: gamozolabs/fzero_fuzzer
Branch: master
Commit: 712a566d7acf
Files: 6
Total size: 37.6 KB
Directory structure:
gitextract_mzee96e8/
├── .gitignore
├── Cargo.toml
├── README.md
├── html.json
├── json.json
└── src/
    └── main.rs
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
/*
!/*.json
!/src
!/Cargo.toml
!/.gitignore
!/README.md
================================================
FILE: Cargo.toml
================================================
[package]
name = "fzero"
version = "0.1.0"
authors = ["Brandon Falk <bfalk@gamozolabs.com>"]
edition = "2018"
license = "MIT"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
================================================
FILE: README.md
================================================
# Intro
`fzero` is a grammar-based fuzzer that generates a Rust application inspired
by the paper "Building Fast Fuzzers" by Rahul Gopinath and Andreas Zeller.
https://arxiv.org/pdf/1911.07707.pdf
You can find the F1 fuzzer here:
https://github.com/vrthra/F1
# Usage
Currently this only generates an application that does benchmarking, but with
some quick hacks you could easily get the input out and feed it to an
application.
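
One way to apply the "quick hacks" mentioned above is to replace the benchmarking loop in the generated `main` with a loop that hands each generated buffer to the code under test. The sketch below is an assumption, not part of this project: `Fuzzer` is a stripped-down stand-in for the generated struct with a single hand-written `fragment_0` that does not use the RNG, and `target` is a hypothetical function representing the application being fuzzed.

```rust
/// Stand-in for the generated `Fuzzer` struct. The real one is emitted by
/// `fzero` with one `fragment_N` method per grammar fragment.
struct Fuzzer {
    buf: Vec<u8>,
    calls: usize,
}

impl Fuzzer {
    /// Placeholder for the generated start fragment. It alternates between
    /// two outputs instead of making a random choice.
    fn fragment_0(&mut self, _depth: usize) {
        self.calls += 1;
        if self.calls % 2 == 0 {
            self.buf.extend_from_slice(b"true");
        } else {
            self.buf.extend_from_slice(b"false");
        }
    }
}

/// Hypothetical code under test: returns the number of bytes it consumed.
fn target(input: &[u8]) -> usize {
    input.len()
}

/// Generate `cases` inputs and pass each one to `target` instead of only
/// counting bytes for the benchmark.
fn run(cases: usize) -> usize {
    let mut fuzzer = Fuzzer { buf: Vec::new(), calls: 0 };
    let mut total = 0;
    for _ in 0..cases {
        fuzzer.buf.clear();
        fuzzer.fragment_0(0);
        total += target(&fuzzer.buf);
    }
    total
}

fn main() {
    println!("fed {} bytes to the target", run(1000));
}
```

In a real setup the body of `target` would call into the parser or library being tested, or write the buffer to a file or pipe.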
## Example usage
```
D:\dev\fzero_fuzz>cargo run --release html.json test.rs test.exe 8
Finished release [optimized] target(s) in 0.02s
Running `target\release\fzero.exe html.json test.rs test.exe 8`
Loaded grammar json
Converted grammar to binary format
Optimized grammar
Generated Rust source file
Created Rust binary!
D:\dev\fzero_fuzz>test.exe
MiB/sec: 1773.3719
MiB/sec: 1763.8357
MiB/sec: 1756.8917
MiB/sec: 1757.1934
MiB/sec: 1758.9417
MiB/sec: 1758.9122
MiB/sec: 1758.7352
```
# Concept
This program takes an input grammar specified by a JSON file. The JSON
grammar representation is converted to a binary-style grammar that is intended
for interpretation and optimization. A Rust source file is then generated from
the shape of the input grammar and compiled with `rustc` into an application
for the local machine.
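
To make the conversion step more concrete, here is a simplified sketch of how a JSON-style grammar could be flattened into an indexed fragment list. It loosely mirrors `GrammarRust::new` in `src/main.rs`, but it is an illustration under assumptions: it skips Serde and takes the grammar as an in-memory map, uses plain `usize` indices instead of `FragmentId`, and the `convert` name is hypothetical.

```rust
use std::collections::BTreeMap;

/// Simplified fragment kinds; indices refer to positions in the fragment list.
#[derive(Debug, Clone, PartialEq)]
enum Fragment {
    /// Pick one of the listed fragments at random
    NonTerminal(Vec<usize>),
    /// Expand every listed fragment in order
    Expression(Vec<usize>),
    /// Emit these bytes directly
    Terminal(Vec<u8>),
}

/// Convert a grammar into a flat fragment list and return it together with
/// the index of `<start>`.
fn convert<'a>(grammar: &BTreeMap<&'a str, Vec<Vec<&'a str>>>) -> (Vec<Fragment>, usize) {
    let mut fragments = Vec::new();
    let mut names: BTreeMap<&'a str, usize> = BTreeMap::new();

    // First pass: reserve one placeholder slot per rule so names resolve
    for name in grammar.keys() {
        names.insert(*name, fragments.len());
        fragments.push(Fragment::NonTerminal(Vec::new()));
    }

    // Second pass: build an `Expression` per variant of each rule
    for (name, variants) in grammar {
        let mut variant_ids = Vec::new();
        for variant in variants {
            let mut items = Vec::new();
            for token in variant {
                // Known names become references, anything else is a terminal
                let fragment = match names.get(token) {
                    Some(&id) => Fragment::NonTerminal(vec![id]),
                    None => Fragment::Terminal(token.as_bytes().to_vec()),
                };
                items.push(fragments.len());
                fragments.push(fragment);
            }
            variant_ids.push(fragments.len());
            fragments.push(Fragment::Expression(items));
        }
        // Fill in the placeholder reserved during the first pass
        fragments[names[name]] = Fragment::NonTerminal(variant_ids);
    }

    (fragments, names["<start>"])
}

fn main() {
    let mut grammar = BTreeMap::new();
    grammar.insert("<start>", vec![vec!["<greeting>", "!"]]);
    grammar.insert("<greeting>", vec![vec!["hi"], vec!["hello"]]);

    let (fragments, start) = convert(&grammar);
    println!("start fragment: {}", start);
    for (id, fragment) in fragments.iter().enumerate() {
        println!("{}: {:?}", id, fragment);
    }
}
```

Each entry in the resulting list then corresponds to one generated `fragment_N` function in the output program.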
This doesn't have any constraints on the random number generation as it uses an
infinite supply of random numbers. There is no limitation on the output size
and the buffer will dynamically grow as the input is created.
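
The random supply is a xorshift-style generator whose state lives in the fuzzer struct, and the output buffer is an ordinary `Vec<u8>` that grows on demand. The sketch below restates that idea in isolation. It uses the same seed and shift constants that `src/main.rs` emits, but it fixes the state to `u64` (the generated code uses `usize`), and the `Rng` and `fill` names exist only for this illustration.

```rust
/// Xorshift-style generator using the shift constants emitted by `fzero`.
struct Rng(u64);

impl Rng {
    /// Advance the state and return it. Each step is an invertible XOR-shift,
    /// so a non-zero state never collapses to zero.
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 17;
        self.0 ^= self.0 << 43;
        self.0
    }
}

/// Append `count` randomly chosen ASCII digits to `buf`. The vector grows as
/// needed, so there is no fixed limit on the output size.
fn fill(buf: &mut Vec<u8>, rng: &mut Rng, count: usize) {
    for _ in 0..count {
        buf.push(b'0' + (rng.next_u64() % 10) as u8);
    }
}

fn main() {
    let mut rng = Rng(0x34cc028e11b4f89c);
    let mut buf = Vec::new();
    fill(&mut buf, &mut rng, 32);
    println!("{}", String::from_utf8_lossy(&buf));
}
```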
# Benchmarks
All tests were run on a single core of an `Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz` with a turbo clock rate of 4.3 GHz.
All numbers in `MiB/second`.
| Benchmark | fzero fuzzer | F1 fuzzer | Speedup |
|--------------------|--------------|-----------|---------|
| html.json depth=4 | 5330 | 1295 | 4.11x |
| html.json depth=8 | 1760 | 348 | 5.05x |
| html.json depth=16 | 338 | 195 | 1.73x |
| html.json depth=32 | 218 | 175 | 1.25x |
| html.json depth=64 | 201 | 175 | 1.14x |
| json.json depth=4 | 97 | 97 | 1.00x |
| json.json depth=8 | 79 | 93 | 0.84x |
| json.json depth=16 | 83 | 89 | 0.93x |
| json.json depth=32 | 85 | 88 | 0.97x |
| json.json depth=64 | 85 | 90 | 0.94x |
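
For reference, the `MiB/sec` figures are total generated bytes divided by elapsed seconds and then by 1024 twice, as in the benchmark loop emitted by `src/main.rs`. A minimal sketch of that arithmetic follows. The `mib_per_sec` helper and the fixed HTML string are assumptions made for this illustration and are not part of the generated program.

```rust
use std::time::Instant;

/// Convert a byte count and an elapsed time in seconds into MiB per second.
fn mib_per_sec(bytes: usize, elapsed_secs: f64) -> f64 {
    bytes as f64 / elapsed_secs / 1024.0 / 1024.0
}

fn main() {
    let start = Instant::now();
    let mut buf: Vec<u8> = Vec::new();
    let mut generated = 0usize;

    // Stand-in for the generation loop: produce the same input repeatedly
    for _ in 0..100_000 {
        buf.clear();
        buf.extend_from_slice(b"<html></html>");
        generated += buf.len();
    }

    let elapsed = start.elapsed().as_secs_f64();
    println!("MiB/sec: {:12.4}", mib_per_sec(generated, elapsed));
}
```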
# Unsafe code
This project uses a small amount of `unsafe` code to provide the same semantics
as `extend_from_slice` but in a much faster way (over 4x faster). Not quite
sure why it's much faster, but if you are uncomfortable with `unsafe` code,
feel free to set `SAFE_ONLY` to `true` at the top of `src/main.rs`. This will
restrict this fuzzer to only generate safe code. I don't think this is
necessary but who knows :)
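
The two append strategies can be compared side by side. The sketch below wraps each in a helper function (`append_safe` and `append_unsafe` are names chosen for this example; the generated code inlines the equivalent statements per terminal). It only shows that both produce the same buffer contents and makes no claim about relative speed.

```rust
/// Safe variant, equivalent to what is emitted when `SAFE_ONLY` is `true`.
fn append_safe(buf: &mut Vec<u8>, bytes: &[u8]) {
    buf.extend_from_slice(bytes);
}

/// Unsafe variant following the shape of the code emitted when `SAFE_ONLY`
/// is `false`: reserve space, copy the raw bytes, then fix up the length.
fn append_unsafe(buf: &mut Vec<u8>, bytes: &[u8]) {
    let old_size = buf.len();
    let new_size = old_size + bytes.len();

    // Make sure the allocation can hold the appended bytes
    if new_size > buf.capacity() {
        buf.reserve(new_size - old_size);
    }

    unsafe {
        // SAFETY: capacity is at least `new_size`, `bytes` cannot overlap the
        // exclusively borrowed `buf`, and every byte in `old_size..new_size`
        // is written before the length is updated.
        std::ptr::copy_nonoverlapping(
            bytes.as_ptr(),
            buf.as_mut_ptr().add(old_size),
            bytes.len(),
        );
        buf.set_len(new_size);
    }
}

fn main() {
    let mut a = b"<html>".to_vec();
    let mut b = a.clone();
    append_safe(&mut a, b"</html>");
    append_unsafe(&mut b, b"</html>");
    assert_eq!(a, b);
    println!("{}", String::from_utf8_lossy(&b));
}
```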
# Performance
The performance of this tool falls into two categories. One is code
generation: how long it takes for the JSON to be compiled into a Rust
application. The other is code execution: how fast the produced application
can generate inputs.
## Code Generation
Code generation vastly outperforms the "Building Fast Fuzzers" paper. For
example when generating the code based on the `html.json` grammar, the F1
fuzzer took over 25 minutes to produce the code. This fuzzer is capable of
producing a Rust application in under 10 seconds.
## Code execution
On some performance metrics this project is about 20-30% slower than the F1
fuzzer, but these scenarios are rare. In most situations we've been able to
out-perform F1 by about 30-50%, and in extreme cases (html.json depth=8) we've
observed over a 4x speedup.
# Differences from the F1 fuzzer
The F1 fuzzer mentions a technique that will resolve to the nearest terminal
tokens when stack depth is exceeded. We haven't implemented this technique, but
I don't think it has a huge impact on the generated inputs. This is something I
will look into in the future.
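
To illustrate what the plain depth cutoff does, here is a hand-written sketch for a hypothetical rule `<list> ::= "(" <list> ")" | "x"`. As in the generated code, a fragment returns without emitting anything once the depth limit is reached. The `list` and `generate` functions and the fixed `choice` parameter (used instead of a random pick to keep the output predictable) are assumptions made for this example.

```rust
/// Expand `<list> ::= "(" <list> ")" | "x"` with a hard depth cutoff.
/// `pick` supplies the choice of alternative at each step.
fn list(buf: &mut Vec<u8>, depth: usize, max_depth: usize, pick: &mut dyn FnMut() -> usize) {
    // Depth exhausted: stop expanding and emit nothing
    if depth >= max_depth {
        return;
    }
    match pick() % 2 {
        0 => {
            buf.push(b'(');
            list(buf, depth + 1, max_depth, pick);
            buf.push(b')');
        }
        _ => buf.push(b'x'),
    }
}

/// Generate one output, always taking alternative `choice`.
fn generate(max_depth: usize, choice: usize) -> String {
    let mut buf = Vec::new();
    let mut pick = || choice;
    list(&mut buf, 0, max_depth, &mut pick);
    String::from_utf8(buf).expect("only ASCII is emitted")
}

fn main() {
    // Always recursing hits the cutoff, so the innermost `x` is missing
    println!("{}", generate(3, 0));
    // Picking the terminal alternative finishes before the cutoff
    println!("{}", generate(3, 1));
}
```

When the cutoff is hit, the output can fall outside the grammar (here, parentheses with nothing inside), which is the case the nearest-terminal technique is meant to handle.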
Due to not using globals this can easily be scaled out to multiple threads as
all random state and input generation are done in a structure.
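
Because every piece of state sits inside the fuzzer struct, each thread can own its own instance. The following is a minimal sketch of that scaling approach under assumptions: `Fuzzer` here is a simplified stand-in with a placeholder `generate` method, the per-thread seed derivation is arbitrary, and `run_threads` is a hypothetical helper.

```rust
use std::thread;

/// Simplified stand-in for the generated struct: all state is per instance.
struct Fuzzer {
    seed: u64,
    buf: Vec<u8>,
}

impl Fuzzer {
    /// Xorshift step with the same shift constants as the generated code.
    fn rand(&mut self) -> u64 {
        self.seed ^= self.seed << 13;
        self.seed ^= self.seed >> 17;
        self.seed ^= self.seed << 43;
        self.seed
    }

    /// Placeholder for a generated start fragment: emits 1 to 4 bytes.
    fn generate(&mut self) {
        let n = 1 + (self.rand() % 4) as usize;
        self.buf.extend(std::iter::repeat(b'a').take(n));
    }
}

/// Run `threads` independent fuzzers for `iters` iterations each and return
/// the number of bytes each one generated.
fn run_threads(threads: usize, iters: usize) -> Vec<usize> {
    let handles: Vec<thread::JoinHandle<usize>> = (0..threads)
        .map(|i| {
            thread::spawn(move || {
                // Give every thread a distinct, non-zero seed (arbitrary scheme)
                let mut fuzzer = Fuzzer {
                    seed: 0x34cc028e11b4f89c ^ (i as u64 + 1),
                    buf: Vec::new(),
                };
                let mut generated = 0usize;
                for _ in 0..iters {
                    fuzzer.buf.clear();
                    fuzzer.generate();
                    generated += fuzzer.buf.len();
                }
                generated
            })
        })
        .collect();

    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}

fn main() {
    for (i, bytes) in run_threads(4, 1_000).iter().enumerate() {
        println!("thread {} generated {} bytes", i, bytes);
    }
}
```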
There is no use of assembly in this project, and thus it can produce
highly-performant fuzzers for any architecture or environment that Rust can
compile against (pretty much identical to LLVM's target list).
================================================
FILE: html.json
================================================
{
"<start>": [["<_l_>", "!DOCTYPE html", "<_r_>", "<html_document>"]],
"<_l_>": [["<"]],
"<_r_>": [[">"]],
"<_cl_>": [["</"]],
"<a_tag>": [["<_l_>", "a", "<d>", "<_r_>", "<a_content-1>", "<_cl_>", "a", "<_r_>"]],
"<a_content>": [["<heading>"], ["<text>"]],
"<abbr_tag>": [["<_l_>", "abbr", "<d>", "<_r_>", "<text>", "<_cl_>", "abbr", "<_r_>"]],
"<acronym_tag>": [["<_l_>", "acronym", "<d>", "<_r_>", "<text>", "<_cl_>", "acronym", "<_r_>"]],
"<address_tag>": [["<_l_>", "address", "<d>", "<_r_>", "<address_content-1>", "<_cl_>", "address", "<_r_>"]],
"<address_content>": [["<p_tag>"], ["<text>"]],
"<applet_content>": [["<param-1>", "<body_content>"]],
"<area>": [["<_l_>", "area", "<d>", "<_r_>"]],
"<applet_tag>": [["<_l_>", "applet", "<d>", "<_r_>", "<applet_content>", "<_cl_>", "applet", "<_r_>"]],
"<b_tag>": [["<_l_>", "b", "<d>", "<_r_>", "<text>", "<_cl_>", "b", "<_r_>"]],
"<basefont_tag>": [["<_l_>", "basefront", "<d>", "<_r_>", "<body_content>", "<_cl_>", "basefront", "<_r_>"]],
"<bdo_tag>": [["<_l_>", "bdo", "<d>", "<_r_>", "<text>", "<_cl_>", "bdo", "<_r_>"]],
"<big_tag>": [["<_l_>", "big", "<d>", "<_r_>", "<text>", "<_cl_>", "big", "<_r_>"]],
"<blink_tag>": [["<_l_>", "blink", "<d>", "<_r_>", "<text>", "<_cl_>", "blink", "<_r_>"]],
"<block>": [["<block_content-1>"]],
"<block_content>": [["<basefont_tag>"], ["<blockquote_tag>"], ["<center_tag>"], ["<dir_tag>"], ["<div_tag>"], ["<dl_tag>"], ["<form_tag>"], ["<listing_tag>"], ["<menu_tag>"], ["<multicol_tag>"], ["<nobr_tag>"], ["<ol_tag>"], ["<p_tag>"], ["<pre_tag>"], ["<table_tag>"], ["<ul_tag>"], ["<xmp_tag>"]],
"<blockquote_tag>": [["<_l_>", "blockquote", "<d>", "<_r_>", "<body_content>", "<_cl_>", "blockquote", "<_r_>"]],
"<body_content>": [["<_l_>", "bgsound", "<d>", "<_r_>"], ["<_l_>", "hr", "<_r_>"], ["<address_tag>"], ["<block>"], ["<del_tag>"], ["<heading>"], ["<ins_tag>"], ["<layer_tag>"], ["<map_tag>"], ["<marquee_tag>"], ["<text>"]],
"<body_tag>": [["<_l_>", "body", "<d>", "<_r_>", "<body_content-1>", "<_cl_>", "body", "<_r_>"]],
"<caption_tag>": [["<_l_>", "caption", "<d>", "<_r_>", "<body_content-2>", "<_cl_>", "caption", "<_r_>"]],
"<center_tag>": [["<_l_>", "center", "<d>", "<_r_>", "<body_content-3>", "<_cl_>", "center", "<_r_>"]],
"<cite_tag>": [["<_l_>", "cite", "<d>", "<_r_>", "<text>", "<_cl_>", "cite", "<_r_>"]],
"<code_tag>": [["<_l_>", "code", "<d>", "<_r_>", "<text>", "<_cl_>", "code", "<_r_>"]],
"<colgroup_content>": [["<_l_>", "col", "<d>", "<_r_-1>"]],
"<colgroup_tag>": [["<_l_>", "colgroup", "<d>", "<_r_>", "<colgroup_content>"]],
"<content_style>": [["<abbr_tag>"], ["<acronym_tag>"], ["<cite_tag>"], ["<code_tag>"], ["<dfn_tag>"], ["<em_tag>"], ["<kbd_tag>"], ["<q_tag>"], ["<strong_tag>"], ["<var_tag>"]],
"<dd_tag>": [["<_l_>", "dd", "<d>", "<_r_>", "<flow>", "<_cl_>", "dd", "<_r_>"]],
"<del_tag>": [["<_l_>", "del", "<d>", "<_r_>", "<flow>", "<_cl_>", "del", "<_r_>"]],
"<dfn_tag>": [["<_l_>", "dfn", "<d>", "<_r_>", "<text>", "<_cl_>", "dfn", "<_r_>"]],
"<dir_tag>": [["<_l_>", "dir", "<d>", "<_r_>", "<li_tag-1>", "<_cl_>", "dir", "<_r_>"]],
"<div_tag>": [["<_l_>", "div", "<d>", "<_r_>", "<body_content>", "<_cl_>", "div", "<_r_>"]],
"<dl_content>": [["<dt_tag>", "<dd_tag>"]],
"<dl_tag>": [["<_l_>", "dl", "<d>", "<_r_>", "<dl_content-1>", "<_cl_>", "dl", "<_r_>"]],
"<dt_tag>": [["<_l_>", "dt", "<d>", "<_r_>", "<text>", "<_cl_>", "dt", "<_r_>"]],
"<em_tag>": [["<_l_>", "em", "<d>", "<_r_>", "<text>", "<_cl_>", "em", "<_r_>"]],
"<fieldset_tag>": [["<_l_>", "fieldset", "<d>", "<_r_>", "<legend_tag-1>", "<form_content-1>", "<_cl_>", "fieldset", "<_r_>"]],
"<flow>": [["<flow_content-1>"]],
"<flow_content>": [["<block>"], ["<text>"]],
"<font_tag>": [["<_l_>", "font", "<d>", "<_r_>", "<style_text>", "<_cl_>", "font", "<_r_>"]],
"<form_content>": [["<_l_>", "input", "<d>", "<_r_>"], ["<_l_>", "keygen", "<d>", "<_r_>"], ["<body_content>"], ["<fieldset_tag>"], ["<label_tag>"], ["<select_tag>"], ["<textarea_tag>"]],
"<form_tag>": [["<_l_>", "form", "<d>", "<_r_>", "<form_content-2>", "<_cl_>", "form", "<_r_>"]],
"<frameset_content>": [["<_l_>", "frame", "<d>", "<_r_>"], ["<noframes_tag>"]],
"<frameset_tag>": [["<_l_>", "frameset", "<d>", "<_r_>", "<frameset_content-1>", "<_cl_>", "frameset", "<_r_>"]],
"<h1_tag>": [["<_l_>", "h1", "<d>", "<_r_>", "<text>", "<_cl_>", "h1", "<_r_>"]],
"<h2_tag>": [["<_l_>", "h2", "<d>", "<_r_>", "<text>", "<_cl_>", "h2", "<_r_>"]],
"<h3_tag>": [["<_l_>", "h3", "<d>", "<_r_>", "<text>", "<_cl_>", "h3", "<_r_>"]],
"<h4_tag>": [["<_l_>", "h4", "<d>", "<_r_>", "<text>", "<_cl_>", "h4", "<_r_>"]],
"<h5_tag>": [["<_l_>", "h5", "<d>", "<_r_>", "<text>", "<_cl_>", "h5", "<_r_>"]],
"<h6_tag>": [["<_l_>", "h6", "<d>", "<_r_>", "<text>", "<_cl_>", "h6", "<_r_>"]],
"<head_content>": [["<_l_>", "base", "<d>", "<_r_>"], ["<_l_>", "link", "<d>", "<_r_>"], ["<_l_>", "meta", "<d>", "<_r_>"], ["<style_tag>"], ["<title_tag>"], ["<script_tag>"]],
"<head_tag>": [["<_l_>", "head", "<d>", "<_r_>", "<head_content-1>", "<_cl_>", "head", "<_r_>"]],
"<heading>": [["<h1_tag>"], ["<h2_tag>"], ["<h3_tag>"], ["<h4_tag>"], ["<h5_tag>"], ["<h6_tag>"]],
"<html_content>": [["<head_tag>", "<body_tag>"], ["<head_tag>", "<frameset_tag>"]],
"<html_document>": [["<html_tag>"]],
"<html_tag>": [["<_l_>", "html", "<_r_>", "<html_content>", "<_cl_>", "html", "<_r_>"]],
"<i_tag>": [["<_l_>", "i", "<d>", "<_r_>", "<text>", "<_cl_>", "i", "<_r_>"]],
"<ilayer_tag>": [["<_l_>", "ilayer", "<d>", "<_r_>", "<body_content>", "<_cl_>", "ilayer", "<_r_>"]],
"<ins_tag>": [["<_l_>", "ins", "<d>", "<_r_>", "<flow>", "<_cl_>", "ins", "<_r_>"]],
"<kbd_tag>": [["<_l_>", "kbd", "<d>", "<_r_>", "<text>", "<_cl_>", "kbd", "<_r_>"]],
"<label_content>": [["<_l_>", "input", "<d>", "<_r_>"], ["<body_content>"], ["<select_tag>"], ["<textarea_tag>"]],
"<label_tag>": [["<_l_>", "label", "<d>", "<_r_>", "<label_content-1>", "<_cl_>", "label", "<_r_>"]],
"<layer_tag>": [["<_l_>", "layer", "<d>", "<_r_>", "<body_content>", "<_cl_>", "layer", "<_r_>"]],
"<legend_tag>": [["<_l_>", "legend", "<d>", "<_r_>", "<text>", "<_cl_>", "legend", "<_r_>"]],
"<li_tag>": [["<_l_>", "li", "<d>", "<_r_>", "<flow>", "<_cl_>", "li", "<_r_>"]],
"<literal_text>": [["<plain_text>"]],
"<listing_tag>": [["<_l_>", "listing", "<d>", "<_r_>", "<literal_text>", "<_cl_>", "listing", "<_r_>"]],
"<map_content>": [["<area-1>"]],
"<map_tag>": [["<_l_>", "map", "<d>", "<_r_>", "<map_content>", "<_cl_>", "map", "<_r_>"]],
"<marquee_tag>": [["<_l_>", "marquee", "<d>", "<_r_>", "<style_text>", "<_cl_>", "marquee", "<_r_>"]],
"<menu_tag>": [["<_l_>", "menu", "<d>", "<_r_>", "<li_tag-2>", "<_cl_>", "menu", "<_r_>"]],
"<multicol_tag>": [["<_l_>", "multicol", "<d>", "<_r_>", "<body_content>", "<_cl_>", "multicol", "<_r_>"]],
"<nobr_tag>": [["<_l_>", "nobr", "<d>", "<_r_>", "<text>", "<_cl_>", "nobr", "<_r_>"]],
"<noembed_tag>": [["<_l_>", "noembed", "<d>", "<_r_>", "<text>", "<_cl_>", "noembed", "<_r_>"]],
"<noframes_tag>": [["<_l_>", "noframes", "<d>", "<_r_>", "<body_content-4>", "<_cl_>", "noframes", "<_r_>"]],
"<noscript_tag>": [["<_l_>", "noscript", "<d>", "<_r_>", "<text>", "<_cl_>", "noscript", "<_r_>"]],
"<object_content>": [["<param-2>", "<body_content>"]],
"<object_tag>": [["<_l_>", "object", "<d>", "<_r_>", "<object_content>", "<_cl_>", "object", "<_r_>"]],
"<ol_tag>": [["<_l_>", "ol", "<d>", "<_r_>", "<li_tag-3>", "<_cl_>", "ol", "<_r_>"]],
"<optgroup_tag>": [["<_l_>", "optgroup", "<d>", "<_r_>", "<option_tag-1>", "<_cl_>", "optgroup", "<_r_>"]],
"<option_tag>": [["<_l_>", "option", "<d>", "<_r_>", "<plain_text-1>", "<_cl_>", "option", "<_r_>"]],
"<p_tag>": [["<_l_>", "p", "<_r_>", "<text>", "<_cl_>", "p", "<_r_>"]],
"<param>": [["<_l_>", "param", "<_r_>"]],
"<plain_text>": [["<entity-1>"]],
"<entity>": [["<char>"], ["<ampersand>"]],
"<char>": [["7"], ["*"], [":"], ["]"], ["n"], ["m"], ["N"], ["/"], ["."], ["K"], ["T"], ["I"], ["f"], ["o"], [","], ["l"], ["W"], ["-"], ["?"], ["\\"], ["%"], ["1"], ["c"], ["H"], ["!"], ["A"], ["$"], ["9"], ["q"], ["["], [")"], [" "], [";"], ["b"], ["i"], ["L"], ["'"], ["Y"], ["\t"], ["3"], ["g"], ["F"], ["E"], ["D"], ["C"], ["@"], ["t"], ["R"], ["\""], ["2"], ["}"], ["~"], ["5"], ["4"], ["z"], ["X"], ["S"], ["O"], ["v"], ["J"], ["`"], ["B"], ["\n"], ["y"], ["p"], ["6"], ["0"], ["k"], ["w"], ["\r"], ["V"], ["_"], ["s"], ["x"], ["{"], ["d"], ["a"], ["#"], ["Q"], ["<"], ["u"], ["r"], ["U"], ["h"], [">"], ["("], ["P"], ["G"], ["\f"], ["Z"], ["j"], ["|"], ["e"], ["^"], ["="], ["8"], ["+"], ["M"]],
"<ampersand>": [[" "]],
"<physical_style>": [["<b_tag>"], ["<bdo_tag>"], ["<big_tag>"], ["<blink_tag>"], ["<font_tag>"], ["<i_tag>"], ["<s_tag>"], ["<small_tag>"], ["<span_tag>"], ["<strike_tag>"], ["<sub_tag>"], ["<sup_tag>"], ["<tt_tag>"], ["<u_tag>"]],
"<pre_content>": [["<_l_>", "br", "<_r_>"], ["<_l_>", "hr", "<_r_>"], ["<a_tag>"], ["<style_text>"]],
"<pre_tag>": [["<_l_>", "pre", "<_r_>", "<pre_content-1>", "<_cl_>", "pre", "<_r_>"]],
"<q_tag>": [["<_l_>", "q", "<_r_>", "<text>", "<_cl_>", "q", "<_r_>"]],
"<s_tag>": [["<_l_>", "s", "<_r_>", "<text>", "<_cl_>", "s", "<_r_>"]],
"<script_tag>": [["<_l_>", "script", "<d>", "<_r_>", "<plain_text>", "<_cl_>", "script", "<_r_>"]],
"<select_content>": [["<optgroup_tag>"], ["<option_tag>"]],
"<select_tag>": [["<_l_>", "select", "<d>", "<_r_>", "<select_content-1>", "<_cl_>", "select", "<_r_>"]],
"<small_tag>": [["<_l_>", "small", "<d>", "<_r_>", "<text>", "<_cl_>", "small", "<_r_>"]],
"<span_tag>": [["<_l_>", "span", "<d>", "<_r_>", "<text>", "<_cl_>", "span", "<_r_>"]],
"<strike_tag>": [["<_l_>", "strike", "<d>", "<_r_>", "<text>", "<_cl_>", "strike", "<_r_>"]],
"<strong_tag>": [["<_l_>", "strong", "<d>", "<_r_>", "<text>", "<_cl_>", "strong", "<_r_>"]],
"<style_tag>": [["<_l_>", "style", "<d>", "<_r_>", "<plain_text>", "<_cl_>", "style", "<_r_>"]],
"<style_text>": [["<plain_text>"]],
"<sub_tag>": [["<_l_>", "sub", "<d>", "<_r_>", "<text>", "<_cl_>", "sub", "<_r_>"]],
"<sup_tag>": [["<_l_>", "sup", "<d>", "<_r_>", "<text>", "<_cl_>", "sup", "<_r_>"]],
"<table_cell>": [["<td_tag>"], ["<th_tag>"]],
"<table_content>": [["<_l_>", "tbody", "<d>", "<_r_>"], ["<_l_>", "tfoot", "<d>", "<_r_>"], ["<_l_>", "thead", "<d>", "<_r_>"], ["<tr_tag>"]],
"<table_tag>": [["<_l_>", "table", "<d>", "<_r_>", "<caption_tag-1>", "<colgroup_tag-1>", "<table_content-1>", "<_cl_>", "table", "<_r_>"]],
"<td_tag>": [["<_l_>", "td", "<d>", "<_r_>", "<body_content>", "<_cl_>", "td", "<_r_>"]],
"<text>": [["<text_content-1>"]],
"<text_content>": [["<_l_>", "br", "<d>", "<_r_>"], ["<_l_>", "embed", "<d>", "<_r_>"], ["<_l_>", "iframe", "<d>", "<_r_>"], ["<_l_>", "img", "<d>", "<_r_>"], ["<_l_>", "spacer", "<d>", "<_r_>"], ["<_l_>", "wbr", "<d>", "<_r_>"], ["<a_tag>"], ["<applet_tag>"], ["<content_style>"], ["<ilayer_tag>"], ["<noembed_tag>"], ["<noscript_tag>"], ["<object_tag>"], ["<plain_text>"], ["<physical_style>"]],
"<textarea_tag>": [["<_l_>", "textarea", "<d>", "<_r_>", "<plain_text>", "<_cl_>", "textarea", "<_r_>"]],
"<th_tag>": [["<_l_>", "th", "<d>", "<_r_>", "<body_content>", "<_cl_>", "th", "<_r_>"]],
"<title_tag>": [["<_l_>", "title", "<d>", "<_r_>", "<plain_text>", "<_cl_>", "title", "<_r_>"]],
"<tr_tag>": [["<_l_>", "tr", "<d>", "<_r_>", "<table_cell-1>", "<_cl_>", "tr", "<_r_>"]],
"<tt_tag>": [["<_l_>", "tt", "<d>", "<_r_>", "<text>", "<_cl_>", "tt", "<_r_>"]],
"<u_tag>": [["<_l_>", "u", "<d>", "<_r_>", "<text>", "<_cl_>", "u", "<_r_>"]],
"<ul_tag>": [["<_l_>", "ul", "<d>", "<_r_>", "<li_tag-4>", "<_cl_>", "ul", "<_r_>"]],
"<var_tag>": [["<_l_>", "var", "<d>", "<_r_>", "<text>", "<_cl_>", "var", "<_r_>"]],
"<xmp_tag>": [["<_l_>", "xmp", "<d>", "<_r_>", "<literal_text>", "<_cl_>", "xmp", "<_r_>"]],
"<d>": [["<space-1>", "<attributes-1>", "<space-2>"], []],
"<attribute>": [["<key>"], ["<key>", "=\"", "<value>", "\""], ["<key>", "='", "<value>", "'"], ["<key>", "=", "<uqvalue>"]],
"<key>": [["<allchars>"]],
"<allchars>": [["7"], ["*"], [":"], ["&"], ["]"], ["n"], ["m"], ["N"], ["."], ["K"], ["T"], ["I"], ["f"], ["o"], [","], ["l"], ["W"], ["-"], ["?"], ["\\"], ["%"], ["1"], ["c"], ["H"], ["!"], ["A"], ["$"], ["9"], ["q"], ["["], [")"], [";"], ["b"], ["i"], ["L"], ["Y"], ["3"], ["g"], ["F"], ["E"], ["D"], ["C"], ["@"], ["t"], ["R"], ["2"], ["}"], ["~"], ["5"], ["4"], ["z"], ["X"], ["S"], ["O"], ["v"], ["J"], ["`"], ["B"], ["y"], ["p"], ["6"], ["0"], ["k"], ["w"], ["\r"], ["V"], ["_"], ["s"], ["x"], ["{"], ["d"], ["a"], ["#"], ["Q"], ["u"], ["r"], ["U"], ["h"], ["("], ["P"], ["G"], ["\f"], ["Z"], ["j"], ["|"], ["e"], ["^"], ["8"], ["+"], ["M"]],
"<value>": [["<anychars>"]],
"<anychar>": [["0"], ["1"], ["2"], ["3"], ["4"], ["5"], ["6"], ["7"], ["8"], ["9"], ["a"], ["b"], ["c"], ["d"], ["e"], ["f"], ["g"], ["h"], ["i"], ["j"], ["k"], ["l"], ["m"], ["n"], ["o"], ["p"], ["q"], ["r"], ["s"], ["t"], ["u"], ["v"], ["w"], ["x"], ["y"], ["z"], ["A"], ["B"], ["C"], ["D"], ["E"], ["F"], ["G"], ["H"], ["I"], ["J"], ["K"], ["L"], ["M"], ["N"], ["O"], ["P"], ["Q"], ["R"], ["S"], ["T"], ["U"], ["V"], ["W"], ["X"], ["Y"], ["Z"], ["!"], ["\""], ["#"], ["$"], ["%"], ["&"], ["'"], ["("], [")"], ["*"], ["+"], [","], ["-"], ["."], ["/"], [":"], [";"], ["<"], ["="], [">"], ["?"], ["@"], ["["], ["\\"], ["]"], ["^"], ["_"], ["`"], ["{"], ["|"], ["}"], ["~"], [" "], ["\t"], ["\n"], ["\r"], ["\u000b"], ["\f"]],
"<anychars>": [["<anychar-1>"]],
"<uqvalue>": [["<uqchars>"]],
"<uqchar>": [["7"], ["*"], [":"], ["&"], ["]"], ["n"], ["m"], ["N"], ["."], ["K"], ["T"], ["I"], ["f"], ["o"], [","], ["l"], ["W"], ["-"], ["?"], ["\\"], ["%"], ["1"], ["c"], ["H"], ["!"], ["A"], ["$"], ["9"], ["q"], ["["], [")"], [";"], ["b"], ["i"], ["L"], ["Y"], ["3"], ["g"], ["F"], ["E"], ["D"], ["C"], ["@"], ["t"], ["R"], ["2"], ["}"], ["~"], ["5"], ["4"], ["z"], ["X"], ["S"], ["O"], ["v"], ["J"], ["B"], ["y"], ["p"], ["6"], ["0"], ["k"], ["w"], ["\r"], ["V"], ["_"], ["s"], ["x"], ["{"], ["d"], ["a"], ["#"], ["Q"], ["u"], ["r"], ["U"], ["h"], ["("], ["P"], ["G"], ["\f"], ["Z"], ["j"], ["|"], ["e"], ["^"], ["8"], ["+"], ["M"]],
"<uqchars>": [["<uqchar-1>"]],
"<attributes>": [["<attribute>"], ["<attribute>", "<space-3>", "<attributes>"]],
"<space>": [[" "], ["\t"], ["\n"]],
"<a_content-1>": [[], ["<a_content>", "<a_content-1>"]],
"<address_content-1>": [[], ["<address_content>", "<address_content-1>"]],
"<param-1>": [[], ["<param>", "<param-1>"]],
"<block_content-1>": [[], ["<block_content>", "<block_content-1>"]],
"<body_content-1>": [[], ["<body_content>", "<body_content-1>"]],
"<body_content-2>": [[], ["<body_content>", "<body_content-2>"]],
"<body_content-3>": [[], ["<body_content>", "<body_content-3>"]],
"<_r_-1>": [[], ["<_r_>", "<_r_-1>"]],
"<li_tag-1>": [["<li_tag>"], ["<li_tag>", "<li_tag-1>"]],
"<dl_content-1>": [["<dl_content>"], ["<dl_content>", "<dl_content-1>"]],
"<legend_tag-1>": [[], ["<legend_tag>", "<legend_tag-1>"]],
"<form_content-1>": [[], ["<form_content>", "<form_content-1>"]],
"<flow_content-1>": [[], ["<flow_content>", "<flow_content-1>"]],
"<form_content-2>": [[], ["<form_content>", "<form_content-2>"]],
"<frameset_content-1>": [[], ["<frameset_content>", "<frameset_content-1>"]],
"<head_content-1>": [[], ["<head_content>", "<head_content-1>"]],
"<label_content-1>": [[], ["<label_content>", "<label_content-1>"]],
"<area-1>": [[], ["<area>", "<area-1>"]],
"<li_tag-2>": [[], ["<li_tag>", "<li_tag-2>"]],
"<body_content-4>": [[], ["<body_content>", "<body_content-4>"]],
"<param-2>": [[], ["<param>", "<param-2>"]],
"<li_tag-3>": [["<li_tag>"], ["<li_tag>", "<li_tag-3>"]],
"<option_tag-1>": [[], ["<option_tag>", "<option_tag-1>"]],
"<plain_text-1>": [["<plain_text>"], ["<plain_text>", "<plain_text-1>"]],
"<entity-1>": [[], ["<entity>", "<entity-1>"]],
"<pre_content-1>": [[], ["<pre_content>", "<pre_content-1>"]],
"<select_content-1>": [[], ["<select_content>", "<select_content-1>"]],
"<caption_tag-1>": [[], ["<caption_tag>", "<caption_tag-1>"]],
"<colgroup_tag-1>": [[], ["<colgroup_tag>", "<colgroup_tag-1>"]],
"<table_content-1>": [[], ["<table_content>", "<table_content-1>"]],
"<text_content-1>": [[], ["<text_content>", "<text_content-1>"]],
"<table_cell-1>": [[], ["<table_cell>", "<table_cell-1>"]],
"<li_tag-4>": [[], ["<li_tag>", "<li_tag-4>"]],
"<space-1>": [["<space>"], ["<space>", "<space-1>"]],
"<attributes-1>": [[], ["<attributes>", "<attributes-1>"]],
"<space-2>": [[], ["<space>", "<space-2>"]],
"<anychar-1>": [[], ["<anychar>", "<anychar-1>"]],
"<uqchar-1>": [["<uqchar>"], ["<uqchar>", "<uqchar-1>"]],
"<space-3>": [["<space>"], ["<space>", "<space-3>"]]
}
================================================
FILE: json.json
================================================
{
"<start>": [["<json>"]],
"<json>": [["<element>"]],
"<element>": [["<ws>", "<value>", "<ws>"]],
"<value>": [["<object>"], ["<array>"], ["<string>"], ["<number>"],
["true"], ["false"],
["null"]],
"<object>": [["{", "<ws>", "}"], ["{", "<members>", "}"]],
"<members>": [["<member>", "<symbol-2>"]],
"<member>": [["<ws>", "<string>", "<ws>", ":", "<element>"]],
"<array>": [["[", "<ws>", "]"], ["[", "<elements>", "]"]],
"<elements>": [["<element>", "<symbol-1-1>"]],
"<string>": [["\"", "<characters>", "\""]],
"<characters>": [["<character-1>"]],
"<character>": [["0"], ["1"], ["2"], ["3"], ["4"], ["5"], ["6"], ["7"],
["8"], ["9"], ["a"], ["b"], ["c"], ["d"], ["e"], ["f"],
["g"], ["h"], ["i"], ["j"], ["k"], ["l"], ["m"], ["n"],
["o"], ["p"], ["q"], ["r"], ["s"], ["t"], ["u"], ["v"],
["w"], ["x"], ["y"], ["z"], ["A"], ["B"], ["C"], ["D"],
["E"], ["F"], ["G"], ["H"], ["I"], ["J"], ["K"], ["L"],
["M"], ["N"], ["O"], ["P"], ["Q"], ["R"], ["S"], ["T"],
["U"], ["V"], ["W"], ["X"], ["Y"], ["Z"], ["!"], ["#"],
["$"], ["%"], ["&"], ["\""], ["("], [")"], ["*"], ["+"],
[","], ["-"], ["."], ["/"], [":"], [";"], ["<"], ["="],
[">"], ["?"], ["@"], ["["], ["]"], ["^"], ["_"], ["`"],
["{"], ["|"], ["}"], ["~"], [" "], ["<esc>"]],
"<esc>": [["\\","<escc>"]],
"<escc>": [["\\"],["b"],["f"], ["n"], ["r"],["t"],["\""]],
"<number>": [["<int>", "<frac>", "<exp>"]],
"<int>": [["<digit>"], ["<onenine>", "<digits>"], ["-", "<digits>"],
["-", "<onenine>", "<digits>"]],
"<digits>": [["<digit-1>"]],
"<digit>": [["0"], ["<onenine>"]],
"<onenine>": [["1"], ["2"], ["3"], ["4"], ["5"], ["6"], ["7"], ["8"],
["9"]],
"<frac>": [[], [".", "<digits>"]],
"<exp>": [[], ["E", "<sign>", "<digits>"], ["e", "<sign>", "<digits>"]],
"<sign>": [[], ["+"], ["-"]],
"<ws>": [["<sp1>", "<ws>"], []],
"<sp1>": [[" "],["\n"],["\t"],["\r"]],
"<symbol>": [[",", "<members>"]],
"<symbol-1>": [[",", "<elements>"]],
"<symbol-2>": [[], ["<symbol>", "<symbol-2>"]],
"<symbol-1-1>": [[], ["<symbol-1>", "<symbol-1-1>"]],
"<character-1>": [[], ["<character>", "<character-1>"]],
"<digit-1>": [["<digit>"], ["<digit>", "<digit-1>"]]
}
================================================
FILE: src/main.rs
================================================
use std::collections::{BTreeMap, BTreeSet};
use std::path::Path;
use std::process::Command;
use serde::{Deserialize, Serialize};
/// If this is `true` then the output file we generate will not emit any
/// unsafe code. I'm not aware of any bugs with the unsafe code that I use and
/// thus this is by default set to `false`. Feel free to set it to `true` if
/// you are concerned.
const SAFE_ONLY: bool = false;
/// Representation of a grammar file in a Rust structure. This allows us to
/// use Serde to serialize and deserialize the json grammar files
#[derive(Serialize, Deserialize, Default, Debug)]
struct Grammar(BTreeMap<String, Vec<Vec<String>>>);
/// A strongly typed wrapper around a `usize` which selects different fragment
/// identifiers
#[derive(Clone, Copy, Debug)]
struct FragmentId(usize);
/// A fragment which is specified by the grammar file
#[derive(Clone, Debug)]
enum Fragment {
/// A non-terminal fragment which refers to a list of `FragmentId`s to
/// randomly select from for expansion
NonTerminal(Vec<FragmentId>),
/// A list of `FragmentId`s that should be expanded in order
Expression(Vec<FragmentId>),
/// A terminal fragment which simply should expand directly to the
/// contained vector of bytes
Terminal(Vec<u8>),
/// A fragment which does nothing. This is used during optimization passes
/// to remove fragments with no effect.
Nop,
}
/// A grammar representation in Rust that is designed to be easy to work with
/// in-memory and optimized for code generation.
#[derive(Debug, Default)]
struct GrammarRust {
/// All fragments in the grammar, indexed by `FragmentId`
fragments: Vec<Fragment>,
/// Cached fragment identifier for the start node
start: Option<FragmentId>,
/// Mapping of non-terminal names to fragment identifiers
name_to_fragment: BTreeMap<String, FragmentId>,
}
impl GrammarRust {
/// Create a new Rust version of a `Grammar` which was loaded via a
/// grammar json specification.
fn new(grammar: &Grammar) -> Self {
// Create a new grammar structure
let mut ret = GrammarRust::default();
// Parse the input grammar to resolve all fragment names
for (non_term, _) in grammar.0.iter() {
// Make sure that there aren't duplicates of fragment names
assert!(!ret.name_to_fragment.contains_key(non_term),
"Duplicate non-terminal definition, fail");
// Create a new, empty fragment
let fragment_id = ret.allocate_fragment(
Fragment::NonTerminal(Vec::new()));
// Add the name resolution for the fragment
ret.name_to_fragment.insert(non_term.clone(), fragment_id);
}
// Parse the input grammar
for (non_term, fragments) in grammar.0.iter() {
// Get the non-terminal fragment identifier
let fragment_id = ret.name_to_fragment[non_term];
// Create a vector to hold all of the variants possible under this
// non-terminal fragment
let mut variants = Vec::new();
// Go through all sub-fragments
for js_sub_fragment in fragments {
// Different options for this sub-fragment
let mut options = Vec::new();
// Go through each option in the sub-fragment
for option in js_sub_fragment {
let fragment_id = if let Some(&non_terminal) =
ret.name_to_fragment.get(option) {
// If we can resolve the name of this fragment, it is a
// non-terminal fragment and should be allocated as
// such
ret.allocate_fragment(
Fragment::NonTerminal(vec![non_terminal]))
} else {
// Convert the terminal bytes into a vector and
// create a new fragment containing it
ret.allocate_fragment(Fragment::Terminal(
option.as_bytes().to_vec()))
};
// Push this fragment as an option
options.push(fragment_id);
}
// Create a new fragment of all the options
variants.push(
ret.allocate_fragment(Fragment::Expression(options)));
}
// Get access to the fragment we want to update based on the
// possible variants
let fragment = &mut ret.fragments[fragment_id.0];
// Overwrite the placeholder non-terminal definition
*fragment = Fragment::NonTerminal(variants);
}
// Resolve the start node
ret.start = Some(ret.name_to_fragment["<start>"]);
ret
}
/// Allocate a new fragment identifier and add it to the fragment list
pub fn allocate_fragment(&mut self, fragment: Fragment) -> FragmentId {
// Get a unique fragment identifier
let fragment_id = FragmentId(self.fragments.len());
// Store the fragment
self.fragments.push(fragment);
fragment_id
}
/// Optimize to remove fragments with non-random effects
pub fn optimize(&mut self) {
// Keeps track of fragment identifiers which resolve to nops
let mut nop_fragments = BTreeSet::new();
// Track if an optimization had an effect
let mut changed = true;
while changed {
// Start off assuming no effect from optimization
changed = false;
// Go through each fragment, looking for potential optimizations
for idx in 0..self.fragments.len() {
// Clone the fragment such that we can inspect it, but we also
// can mutate it in place.
match self.fragments[idx].clone() {
Fragment::NonTerminal(options) => {
// If this non-terminal only has one option, replace
// itself with the only option it resolves to
if options.len() == 1 {
self.fragments[idx] =
self.fragments[options[0].0].clone();
changed = true;
}
}
Fragment::Expression(expr) => {
// If this expression doesn't have anything to do at
// all, then simply replace it with a `Nop`
if expr.len() == 0 {
self.fragments[idx] = Fragment::Nop;
changed = true;
// Track that this fragment identifier now resolves
// to a nop
nop_fragments.insert(idx);
}
// If this expression only does one thing, then replace
// the expression with the thing that it does.
if expr.len() == 1 {
self.fragments[idx] =
self.fragments[expr[0].0].clone();
changed = true;
}
// Remove all `Nop`s from this expression, as they
// wouldn't result in anything occurring.
if let Fragment::Expression(exprs) =
&mut self.fragments[idx] {
// Only retain fragments which are not nops
exprs.retain(|x| {
if nop_fragments.contains(&x.0) {
// Fragment was a nop, remove it
changed = true;
false
} else {
// Fragment was fine, keep it
true
}
});
}
}
Fragment::Terminal(_) | Fragment::Nop => {
// Already maximally optimized
}
}
}
}
}
/// Generate a new Rust program that can be built and will generate random
/// inputs and benchmark them
pub fn program<P: AsRef<Path>>(&self, path: P, max_depth: usize) {
let mut program = String::new();
// Construct the base of the application. This is a profiling loop that
// is used for testing.
program += &format!(r#"
#![allow(unused)]
use std::cell::Cell;
use std::time::Instant;
fn main() {{
let mut fuzzer = Fuzzer {{
seed: Cell::new(0x34cc028e11b4f89c),
buf: Vec::new(),
}};
let mut generated = 0usize;
let it = Instant::now();
for iters in 1u64.. {{
fuzzer.buf.clear();
fuzzer.fragment_{}(0);
generated += fuzzer.buf.len();
// Filter to reduce the amount of times printing occurs
if (iters & 0xfffff) == 0 {{
let elapsed = (Instant::now() - it).as_secs_f64();
let bytes_per_sec = generated as f64 / elapsed;
print!("MiB/sec: {{:12.4}}\n", bytes_per_sec / 1024. / 1024.);
}}
}}
}}
struct Fuzzer {{
seed: Cell<usize>,
buf: Vec<u8>,
}}
impl Fuzzer {{
fn rand(&self) -> usize {{
let mut seed = self.seed.get();
seed ^= seed << 13;
seed ^= seed >> 17;
seed ^= seed << 43;
self.seed.set(seed);
seed
}}
"#, self.start.unwrap().0);
// Go through each fragment in the list of fragments
for (id, fragment) in self.fragments.iter().enumerate() {
// Create a new function for this fragment
program += &format!(" fn fragment_{}(&mut self, depth: usize) {{\n", id);
// Add depth checking to terminate on depth exhaustion
program += &format!(" if depth >= {} {{ return; }}\n",
max_depth);
match fragment {
Fragment::NonTerminal(options) => {
// For non-terminal cases pick a random variant to select
// and invoke that fragment's routine
program += &format!(" match self.rand() % {} {{\n", options.len());
for (option_id, option) in options.iter().enumerate() {
program += &format!(" {} => self.fragment_{}(depth + 1),\n", option_id, option.0);
}
program += &format!(" _ => unreachable!(),\n");
program += &format!(" }}\n");
}
Fragment::Expression(expr) => {
// Invoke all of the expression's routines in order
for &exp in expr.iter() {
program += &format!(" self.fragment_{}(depth + 1);\n", exp.0);
}
}
Fragment::Terminal(value) => {
// Append the terminal value to the output buffer
if SAFE_ONLY {
program += &format!(" self.buf.extend_from_slice(&{:?});\n",
value);
} else {
// Empirically this is faster than `extend_from_slice` even
// though it performs the same copy. A 4-5x speedup was
// observed in some scenarios.
program += &format!(r#"
unsafe {{
let old_size = self.buf.len();
let new_size = old_size + {};
if new_size > self.buf.capacity() {{
self.buf.reserve(new_size - old_size);
}}
std::ptr::copy_nonoverlapping({:?}.as_ptr(), self.buf.as_mut_ptr().add(old_size), {});
self.buf.set_len(new_size);
}}
"#, value.len(), value, value.len());
}
}
Fragment::Nop => {}
}
program += " }\n";
}
program += "}\n";
// Write out the test application
std::fs::write(path, program)
.expect("Failed to create output Rust application");
}
}
fn main() -> std::io::Result<()> {
// Get access to the command line arguments
let args: Vec<String> = std::env::args().collect();
if args.len() != 5 {
print!("usage: fzero <grammar json> <output Rust file> <output binary name> <max depth>\n");
return Ok(());
}
// Load up a grammar file
let grammar: Grammar = serde_json::from_slice(
&std::fs::read(&args[1])?)?;
print!("Loaded grammar json\n");
// Convert the grammar file to the Rust structures
let mut gram = GrammarRust::new(&grammar);
print!("Converted grammar to binary format\n");
// Optimize the grammar
gram.optimize();
print!("Optimized grammar\n");
// Generate a Rust application
gram.program(&args[2],
args[4].parse().expect("Invalid digit in max depth"));
print!("Generated Rust source file\n");
// Compile the application
// Equivalent to: rustc -O -g <source> -C target-cpu=native -o <binary>
let status = Command::new("rustc")
.arg("-O") // Optimize the binary
.arg("-g") // Generate debug information
.arg(&args[2]) // Name of the input Rust file
.arg("-C") // Optimize for the current microarchitecture
.arg("target-cpu=native")
.arg("-o") // Output filename
.arg(&args[3]).status()?;
assert!(status.success(), "Failed to compile Rust binary");
print!("Created Rust binary!\n");
Ok(())
}
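
To make the shape of the emitted code concrete, here is a minimal hand-written sketch of what `program` would produce for a tiny grammar. Everything specific in it is an assumption for illustration: the grammar (`<start>` expands to either `"a" <start>` or `"b"`), the fragment numbering, the `MAX_DEPTH` value, and the `generate` helper are hypothetical, and the real generator's numbering after `optimize` may differ. The sketch uses `u64` for the seed rather than the `usize` the generator emits, and it uses the `extend_from_slice` form that the `SAFE_ONLY` branch emits.

```rust
use std::cell::Cell;

/// Depth limit baked into every fragment (hypothetical value).
const MAX_DEPTH: usize = 8;

/// Hand-written stand-in for the generated `Fuzzer` struct.
struct Fuzzer {
    seed: Cell<u64>, // u64 is an assumption; the generator emits `usize`
    buf: Vec<u8>,
}

impl Fuzzer {
    /// Xorshift step using the same shift amounts as the generated `rand`.
    fn rand(&self) -> u64 {
        let mut seed = self.seed.get();
        seed ^= seed << 13;
        seed ^= seed >> 17;
        seed ^= seed << 43;
        self.seed.set(seed);
        seed
    }

    /// Non-terminal `<start>`: pick one of two options at random.
    fn fragment_0(&mut self, depth: usize) {
        if depth >= MAX_DEPTH { return; }
        match self.rand() % 2 {
            0 => self.fragment_1(depth + 1),
            1 => self.fragment_2(depth + 1),
            _ => unreachable!(),
        }
    }

    /// Expression: emit "a", then recurse into `<start>`.
    fn fragment_1(&mut self, depth: usize) {
        if depth >= MAX_DEPTH { return; }
        self.fragment_3(depth + 1);
        self.fragment_0(depth + 1);
    }

    /// Terminal "b".
    fn fragment_2(&mut self, depth: usize) {
        if depth >= MAX_DEPTH { return; }
        self.buf.extend_from_slice(b"b");
    }

    /// Terminal "a".
    fn fragment_3(&mut self, depth: usize) {
        if depth >= MAX_DEPTH { return; }
        self.buf.extend_from_slice(b"a");
    }
}

/// Produce `count` inputs from a fixed seed (hypothetical helper).
fn generate(seed: u64, count: usize) -> Vec<Vec<u8>> {
    let mut fuzzer = Fuzzer { seed: Cell::new(seed), buf: Vec::new() };
    let mut outputs = Vec::with_capacity(count);
    for _ in 0..count {
        fuzzer.buf.clear();
        fuzzer.fragment_0(0);
        outputs.push(fuzzer.buf.clone());
    }
    outputs
}

fn main() {
    for input in generate(0x34cc028e11b4f89c, 5) {
        println!("{}", String::from_utf8_lossy(&input));
    }
}
```

Because every fragment returns without emitting anything once the depth limit is reached, an input can be cut short (for example a run of `a` with no closing `b`) rather than being a complete sentence of the grammar. Every fragment call, including terminals and expressions, consumes one level of depth, so the longest output is shorter than `MAX_DEPTH` bytes in this sketch.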