Full Code of gamozolabs/fzero_fuzzer for AI

Repository: gamozolabs/fzero_fuzzer
Branch: master
Commit: 712a566d7acf
Files: 6
Total size: 37.6 KB

Directory structure:
gitextract_mzee96e8/

├── .gitignore
├── Cargo.toml
├── README.md
├── html.json
├── json.json
└── src/
    └── main.rs

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
/*
!/*.json
!/src
!/Cargo.toml
!/.gitignore
!/README.md



================================================
FILE: Cargo.toml
================================================
[package]
name = "fzero"
version = "0.1.0"
authors = ["Brandon Falk <bfalk@gamozolabs.com>"]
edition = "2018"
license = "MIT"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"



================================================
FILE: README.md
================================================
# Intro

`fzero` is a grammar-based fuzzer that generates a Rust application. It is
inspired by the paper "Building Fast Fuzzers" by Rahul Gopinath and Andreas
Zeller.
https://arxiv.org/pdf/1911.07707.pdf

You can find the F1 fuzzer here:

https://github.com/vrthra/F1

# Usage

Currently this only generates an application that does benchmarking, but with
some quick hacks you could easily get the input out and feed it to an
application.

## Example usage

```
D:\dev\fzero_fuzz>cargo run --release html.json test.rs test.exe 8
    Finished release [optimized] target(s) in 0.02s
     Running `target\release\fzero.exe html.json test.rs test.exe 8`
Loaded grammar json
Converted grammar to binary format
Optimized grammar
Generated Rust source file
Created Rust binary!

D:\dev\fzero_fuzz>test.exe
MiB/sec:    1773.3719
MiB/sec:    1763.8357
MiB/sec:    1756.8917
MiB/sec:    1757.1934
MiB/sec:    1758.9417
MiB/sec:    1758.9122
MiB/sec:    1758.7352
```
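
The `MiB/sec` figures above are throughput numbers: bytes generated divided by
elapsed time. A minimal sketch of such a measurement follows. The names
`mib_per_sec` and `measure` and the `generate` closure are hypothetical and are
not taken from the generated program.

```rust
use std::time::{Duration, Instant};

/// Convert a byte count and an elapsed time into MiB/second.
fn mib_per_sec(bytes: u64, elapsed: Duration) -> f64 {
    bytes as f64 / (1024.0 * 1024.0) / elapsed.as_secs_f64()
}

/// Call `generate` `iters` times, reusing one buffer, and report throughput.
/// `generate` is a hypothetical stand-in for the fuzzer's input generator.
fn measure<F: FnMut(&mut Vec<u8>)>(mut generate: F, iters: usize) -> f64 {
    let mut buf = Vec::new();
    let mut total = 0u64;
    let start = Instant::now();
    for _ in 0..iters {
        // Reuse the allocation between inputs
        buf.clear();
        generate(&mut buf);
        total += buf.len() as u64;
    }
    mib_per_sec(total, start.elapsed())
}
```

Calling `measure(|b| b.extend_from_slice(b"<html>"), 1_000_000)` would time a
trivial generator. A real measurement would pass the grammar expansion instead.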

# Concept

This program takes an input grammar specified by a JSON file. The JSON grammar
is converted to a binary-style in-memory representation intended for
interpretation and optimization. A Rust source file is then generated from the
shape of the input grammar and compiled with `rustc` into an application for
the local machine.
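
To illustrate what a "binary-style" grammar can look like, the sketch below
replaces rule names with numeric indices in two passes. It is a simplified
illustration, not the structure used in `src/main.rs`; the `Token` type and
`to_indexed` function are hypothetical, and the grammar is built in code rather
than parsed from JSON so the example needs only the standard library.

```rust
use std::collections::BTreeMap;

/// One element of a rule: a reference to another rule (by index) or literal
/// bytes to emit.
#[derive(Debug, PartialEq)]
enum Token {
    Rule(usize),
    Bytes(Vec<u8>),
}

/// Replace rule names with numeric indices so later passes never compare
/// strings. Any string that is not a rule name is treated as a terminal.
fn to_indexed(grammar: &BTreeMap<&str, Vec<Vec<&str>>>) -> Vec<Vec<Vec<Token>>> {
    // First pass: give each rule name an index (BTreeMap iterates in key order)
    let index: BTreeMap<&str, usize> =
        grammar.keys().enumerate().map(|(i, name)| (*name, i)).collect();

    // Second pass: rewrite every alternative using those indices
    grammar.values().map(|alternatives| {
        alternatives.iter().map(|alt| {
            alt.iter().map(|tok| match index.get(tok) {
                Some(&i) => Token::Rule(i),
                None => Token::Bytes(tok.as_bytes().to_vec()),
            }).collect()
        }).collect()
    }).collect()
}
```

Resolving all names before rewriting any rule means a rule may refer to another
rule that appears later in the file.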

This doesn't have any constraints on the random number generation as it uses an
infinite supply of random numbers. There is no limitation on the output size
and the buffer will dynamically grow as the input is created.
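
As a sketch of both properties, the example below pairs a pseudo-random
generator that never runs out of numbers with a `Vec<u8>` that reallocates as
it fills. The `Fuzzer` struct, its fields, and the choice of a plain xorshift64
generator are assumptions made for illustration.

```rust
/// Illustrative generator state: a PRNG seed plus the output buffer.
struct Fuzzer {
    seed: u64,
    buf: Vec<u8>,
}

impl Fuzzer {
    /// One xorshift64 step. The sequence never terminates and never reaches
    /// zero as long as the initial seed is non-zero.
    fn rand(&mut self) -> u64 {
        let mut x = self.seed;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.seed = x;
        x
    }

    /// Append `count` random ASCII digits. The `Vec` grows on demand, so
    /// there is no fixed upper bound on the output size.
    fn digits(&mut self, count: usize) {
        for _ in 0..count {
            let d = (self.rand() % 10) as u8;
            self.buf.push(b'0' + d);
        }
    }
}
```

Because the state lives in a struct, a caller can create it with any non-zero
seed, for example `Fuzzer { seed: 0x1234_5678_9abc_def0, buf: Vec::new() }`.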

# Benchmarks

All tests were run on a single core of an `Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz` with a turbo clock rate of 4.3 GHz

All numbers in `MiB/second`.

| Benchmark          | fzero fuzzer | F1 fuzzer | Speedup |
|--------------------|--------------|-----------|---------|
| html.json depth=4  |         5330 |      1295 |   4.11x |
| html.json depth=8  |         1760 |       348 |   5.05x |
| html.json depth=16 |          338 |       195 |   1.73x |
| html.json depth=32 |          218 |       175 |   1.25x |
| html.json depth=64 |          201 |       175 |   1.14x |
| json.json depth=4  |           97 |        97 |   1.00x |
| json.json depth=8  |           79 |        93 |   0.84x |
| json.json depth=16 |           83 |        89 |   0.93x |
| json.json depth=32 |           85 |        88 |   0.97x |
| json.json depth=64 |           85 |        90 |   0.94x |

# Unsafe code

This project uses a small amount of `unsafe` code to provide the same semantics
as `extend_from_slice` but in a much faster way (over 4x faster). Not quite
sure why it's much faster, but if you are uncomfortable with `unsafe` code,
feel free to set `SAFE_ONLY` to `true` at the top of `src/main.rs`. This will
restrict this fuzzer to only generate safe code. I don't think this is
necessary but who knows :)
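
One way to get `extend_from_slice` semantics with a raw pointer copy is
sketched below. It assumes the generated code does something broadly similar;
the function names are hypothetical, and no performance claim is made for this
exact form.

```rust
/// Safe baseline: append `bytes` to `buf`.
fn append_safe(buf: &mut Vec<u8>, bytes: &[u8]) {
    buf.extend_from_slice(bytes);
}

/// Unsafe variant with the same observable result.
fn append_unsafe(buf: &mut Vec<u8>, bytes: &[u8]) {
    let old_len = buf.len();

    // Guarantee capacity for the new bytes before writing past `len`
    buf.reserve(bytes.len());

    unsafe {
        // SAFETY: `reserve` ensured room for `old_len + bytes.len()` bytes,
        // and `bytes` cannot alias `buf` because `buf` is mutably borrowed.
        std::ptr::copy_nonoverlapping(
            bytes.as_ptr(),
            buf.as_mut_ptr().add(old_len),
            bytes.len(),
        );
        // The copied region is now initialized, so the length can cover it
        buf.set_len(old_len + bytes.len());
    }
}
```

The pointer is taken after `reserve` on purpose: a reallocation would
invalidate any pointer obtained earlier.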

# Performance

The performance of this tool is separated into multiple categories. One is the
code generation side, how long it takes for the JSON to be compiled into a Rust
application. The other is the code execution speeds, which is how fast the
produced application can generate inputs.

## Code Generation

Code generation vastly outperforms the "Building Fast Fuzzers" paper. For
example when generating the code based on the `html.json` grammar, the F1
fuzzer took over 25 minutes to produce the code. This fuzzer is capable of
producing a Rust application in under 10 seconds.

## Code execution

On some performance metrics this project is about 20-30% slower than the F1
fuzzer, but these scenarios are rare. In most situations we've been able to
out-perform F1 by about 30-50%, and in extreme cases (html.json depth=8) we've
observed over a 4x speedup.

# Differences from the F1 fuzzer

The F1 fuzzer mentions a technique that will resolve to the nearest terminal
tokens when stack depth is exceeded. We haven't implemented this technique but
I don't think it's a huge impact on the generated inputs. This is something I
will look into in the future.
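
For comparison, the simplest way to enforce a depth limit is to stop expanding
once it is reached. The sketch below assumes that behaviour for a toy grammar,
`<list> ::= "[" <list> "]" | "x"`; the `Gen` struct and its methods are
hypothetical and written by hand rather than produced by this tool.

```rust
/// Hypothetical generator for the toy grammar
///   <list> ::= "[" <list> "]" | "x"
struct Gen {
    seed: u64,
    out: Vec<u8>,
    max_depth: usize,
}

impl Gen {
    /// One xorshift64 step (the seed must be non-zero).
    fn rand(&mut self) -> u64 {
        let mut x = self.seed;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.seed = x;
        x
    }

    /// Expand `<list>`, emitting nothing once the depth budget is spent.
    fn list(&mut self, depth: usize) {
        if depth >= self.max_depth {
            return;
        }
        if self.rand() % 2 == 0 {
            // Recursive alternative
            self.out.push(b'[');
            self.list(depth + 1);
            self.out.push(b']');
        } else {
            // Terminal alternative
            self.out.push(b'x');
        }
    }
}
```

A cut-off like this can emit strings the grammar itself would never produce
(here an empty `[]`). Resolving to the nearest terminal instead, as the F1
paper describes, avoids that.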

Due to not using globals this can easily be scaled out to multiple threads as
all random state and input generation are done in a structure.
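
A minimal sketch of that scaling, assuming a per-thread state struct: each
thread owns its seed and buffer, so no locking is needed. `Worker`,
`run_parallel`, and the seed constant are illustrative choices, and `generate`
is a stand-in for real grammar expansion.

```rust
use std::thread;

/// Per-thread state: nothing here is global or shared.
struct Worker {
    seed: u64,
    buf: Vec<u8>,
}

impl Worker {
    /// One xorshift64 step (the seed must be non-zero).
    fn rand(&mut self) -> u64 {
        let mut x = self.seed;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.seed = x;
        x
    }

    /// Stand-in for grammar expansion: `len` random lowercase letters.
    fn generate(&mut self, len: usize) {
        self.buf.clear();
        for _ in 0..len {
            let c = (self.rand() % 26) as u8;
            self.buf.push(b'a' + c);
        }
    }
}

/// Spawn `threads` workers and return the total number of bytes produced.
fn run_parallel(threads: u64, iters: usize, len: usize) -> usize {
    let handles: Vec<_> = (0..threads).map(|id| {
        thread::spawn(move || {
            // Distinct seed per thread (illustrative choice of constant)
            let mut w = Worker {
                seed: 0x9e37_79b9_7f4a_7c15 ^ (id + 1),
                buf: Vec::new(),
            };
            let mut total = 0;
            for _ in 0..iters {
                w.generate(len);
                total += w.buf.len();
            }
            total
        })
    }).collect();

    // Wait for every worker and add up their byte counts
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```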

There is no use of assembly in this project, and thus it can produce
highly-performant fuzzers for any architecture or environment that Rust can
compile against (pretty much identical to LLVM's target list).



================================================
FILE: html.json
================================================
{
    "<start>": [["<_l_>", "!DOCTYPE html", "<_r_>", "<html_document>"]],
    "<_l_>": [["<"]],
    "<_r_>": [[">"]],
    "<_cl_>": [["</"]],
    "<a_tag>": [["<_l_>", "a", "<d>", "<_r_>", "<a_content-1>", "<_cl_>", "a", "<_r_>"]],
    "<a_content>": [["<heading>"], ["<text>"]],
    "<abbr_tag>": [["<_l_>", "abbr", "<d>", "<_r_>", "<text>", "<_cl_>", "abbr", "<_r_>"]],
    "<acronym_tag>": [["<_l_>", "acronym", "<d>", "<_r_>", "<text>", "<_cl_>", "acronym", "<_r_>"]],
    "<address_tag>": [["<_l_>", "address", "<d>", "<_r_>", "<address_content-1>", "<_cl_>", "address", "<_r_>"]],
    "<address_content>": [["<p_tag>"], ["<text>"]],
    "<applet_content>": [["<param-1>", "<body_content>"]],
    "<area>": [["<_l_>", "area", "<d>", "<_r_>"]],
    "<applet_tag>": [["<_l_>", "applet", "<d>", "<_r_>", "<applet_content>", "<_cl_>", "applet", "<_r_>"]],
    "<b_tag>": [["<_l_>", "b", "<d>", "<_r_>", "<text>", "<_cl_>", "b", "<_r_>"]],
    "<basefont_tag>": [["<_l_>", "basefront", "<d>", "<_r_>", "<body_content>", "<_cl_>", "basefront", "<_r_>"]],
    "<bdo_tag>": [["<_l_>", "bdo", "<d>", "<_r_>", "<text>", "<_cl_>", "bdo", "<_r_>"]],
    "<big_tag>": [["<_l_>", "big", "<d>", "<_r_>", "<text>", "<_cl_>", "big", "<_r_>"]],
    "<blink_tag>": [["<_l_>", "blink", "<d>", "<_r_>", "<text>", "<_cl_>", "blink", "<_r_>"]],
    "<block>": [["<block_content-1>"]],
    "<block_content>": [["<basefont_tag>"], ["<blockquote_tag>"], ["<center_tag>"], ["<dir_tag>"], ["<div_tag>"], ["<dl_tag>"], ["<form_tag>"], ["<listing_tag>"], ["<menu_tag>"], ["<multicol_tag>"], ["<nobr_tag>"], ["<ol_tag>"], ["<p_tag>"], ["<pre_tag>"], ["<table_tag>"], ["<ul_tag>"], ["<xmp_tag>"]],
    "<blockquote_tag>": [["<_l_>", "blockquote", "<d>", "<_r_>", "<body_content>", "<_cl_>", "blockquote", "<_r_>"]],
    "<body_content>": [["<_l_>", "bgsound", "<d>", "<_r_>"], ["<_l_>", "hr", "<_r_>"], ["<address_tag>"], ["<block>"], ["<del_tag>"], ["<heading>"], ["<ins_tag>"], ["<layer_tag>"], ["<map_tag>"], ["<marquee_tag>"], ["<text>"]],
    "<body_tag>": [["<_l_>", "body", "<d>", "<_r_>", "<body_content-1>", "<_cl_>", "body", "<_r_>"]],
    "<caption_tag>": [["<_l_>", "caption", "<d>", "<_r_>", "<body_content-2>", "<_cl_>", "caption", "<_r_>"]],
    "<center_tag>": [["<_l_>", "center", "<d>", "<_r_>", "<body_content-3>", "<_cl_>", "center", "<_r_>"]],
    "<cite_tag>": [["<_l_>", "cite", "<d>", "<_r_>", "<text>", "<_cl_>", "cite", "<_r_>"]],
    "<code_tag>": [["<_l_>", "code", "<d>", "<_r_>", "<text>", "<_cl_>", "code", "<_r_>"]],
    "<colgroup_content>": [["<_l_>", "col", "<d>", "<_r_-1>"]],
    "<colgroup_tag>": [["<_l_>", "colgroup", "<d>", "<_r_>", "<colgroup_content>"]],
    "<content_style>": [["<abbr_tag>"], ["<acronym_tag>"], ["<cite_tag>"], ["<code_tag>"], ["<dfn_tag>"], ["<em_tag>"], ["<kbd_tag>"], ["<q_tag>"], ["<strong_tag>"], ["<var_tag>"]],
    "<dd_tag>": [["<_l_>", "dd", "<d>", "<_r_>", "<flow>", "<_cl_>", "dd", "<_r_>"]],
    "<del_tag>": [["<_l_>", "del", "<d>", "<_r_>", "<flow>", "<_cl_>", "del", "<_r_>"]],
    "<dfn_tag>": [["<_l_>", "dfn", "<d>", "<_r_>", "<text>", "<_cl_>", "dfn", "<_r_>"]],
    "<dir_tag>": [["<_l_>", "dir", "<d>", "<_r_>", "<li_tag-1>", "<_cl_>", "dir", "<_r_>"]],
    "<div_tag>": [["<_l_>", "div", "<d>", "<_r_>", "<body_content>", "<_cl_>", "div", "<_r_>"]],
    "<dl_content>": [["<dt_tag>", "<dd_tag>"]],
    "<dl_tag>": [["<_l_>", "dl", "<d>", "<_r_>", "<dl_content-1>", "<_cl_>", "dl", "<_r_>"]],
    "<dt_tag>": [["<_l_>", "dt", "<d>", "<_r_>", "<text>", "<_cl_>", "dt", "<_r_>"]],
    "<em_tag>": [["<_l_>", "em", "<d>", "<_r_>", "<text>", "<_cl_>", "em", "<_r_>"]],
    "<fieldset_tag>": [["<_l_>", "fieldset", "<d>", "<_r_>", "<legend_tag-1>", "<form_content-1>", "<_cl_>", "fieldset", "<_r_>"]],
    "<flow>": [["<flow_content-1>"]],
    "<flow_content>": [["<block>"], ["<text>"]],
    "<font_tag>": [["<_l_>", "font", "<d>", "<_r_>", "<style_text>", "<_cl_>", "font", "<_r_>"]],
    "<form_content>": [["<_l_>", "input", "<d>", "<_r_>"], ["<_l_>", "keygen", "<d>", "<_r_>"], ["<body_content>"], ["<fieldset_tag>"], ["<label_tag>"], ["<select_tag>"], ["<textarea_tag>"]],
    "<form_tag>": [["<_l_>", "form", "<d>", "<_r_>", "<form_content-2>", "<_cl_>", "form", "<_r_>"]],
    "<frameset_content>": [["<_l_>", "frame", "<d>", "<_r_>"], ["<noframes_tag>"]],
    "<frameset_tag>": [["<_l_>", "frameset", "<d>", "<_r_>", "<frameset_content-1>", "<_cl_>", "frameset", "<_r_>"]],
    "<h1_tag>": [["<_l_>", "h1", "<d>", "<_r_>", "<text>", "<_cl_>", "h1", "<_r_>"]],
    "<h2_tag>": [["<_l_>", "h2", "<d>", "<_r_>", "<text>", "<_cl_>", "h2", "<_r_>"]],
    "<h3_tag>": [["<_l_>", "h3", "<d>", "<_r_>", "<text>", "<_cl_>", "h3", "<_r_>"]],
    "<h4_tag>": [["<_l_>", "h4", "<d>", "<_r_>", "<text>", "<_cl_>", "h4", "<_r_>"]],
    "<h5_tag>": [["<_l_>", "h5", "<d>", "<_r_>", "<text>", "<_cl_>", "h5", "<_r_>"]],
    "<h6_tag>": [["<_l_>", "h6", "<d>", "<_r_>", "<text>", "<_cl_>", "h6", "<_r_>"]],
    "<head_content>": [["<_l_>", "base", "<d>", "<_r_>"], ["<_l_>", "link", "<d>", "<_r_>"], ["<_l_>", "meta", "<d>", "<_r_>"], ["<style_tag>"], ["<title_tag>"], ["<script_tag>"]],
    "<head_tag>": [["<_l_>", "head", "<d>", "<_r_>", "<head_content-1>", "<_cl_>", "head", "<_r_>"]],
    "<heading>": [["<h1_tag>"], ["<h2_tag>"], ["<h3_tag>"], ["<h4_tag>"], ["<h5_tag>"], ["<h6_tag>"]],
    "<html_content>": [["<head_tag>", "<body_tag>"], ["<head_tag>", "<frameset_tag>"]],
    "<html_document>": [["<html_tag>"]],
    "<html_tag>": [["<_l_>", "html", "<_r_>", "<html_content>", "<_cl_>", "html", "<_r_>"]],
    "<i_tag>": [["<_l_>", "i", "<d>", "<_r_>", "<text>", "<_cl_>", "i", "<_r_>"]],
    "<ilayer_tag>": [["<_l_>", "ilayer", "<d>", "<_r_>", "<body_content>", "<_cl_>", "ilayer", "<_r_>"]],
    "<ins_tag>": [["<_l_>", "ins", "<d>", "<_r_>", "<flow>", "<_cl_>", "ins", "<_r_>"]],
    "<kbd_tag>": [["<_l_>", "kbd", "<d>", "<_r_>", "<text>", "<_cl_>", "kbd", "<_r_>"]],
    "<label_content>": [["<_l_>", "input", "<d>", "<_r_>"], ["<body_content>"], ["<select_tag>"], ["<textarea_tag>"]],
    "<label_tag>": [["<_l_>", "label", "<d>", "<_r_>", "<label_content-1>", "<_cl_>", "label", "<_r_>"]],
    "<layer_tag>": [["<_l_>", "layer", "<d>", "<_r_>", "<body_content>", "<_cl_>", "layer", "<_r_>"]],
    "<legend_tag>": [["<_l_>", "legend", "<d>", "<_r_>", "<text>", "<_cl_>", "legend", "<_r_>"]],
    "<li_tag>": [["<_l_>", "li", "<d>", "<_r_>", "<flow>", "<_cl_>", "li", "<_r_>"]],
    "<literal_text>": [["<plain_text>"]],
    "<listing_tag>": [["<_l_>", "listing", "<d>", "<_r_>", "<literal_text>", "<_cl_>", "listing", "<_r_>"]],
    "<map_content>": [["<area-1>"]],
    "<map_tag>": [["<_l_>", "map", "<d>", "<_r_>", "<map_content>", "<_cl_>", "map", "<_r_>"]],
    "<marquee_tag>": [["<_l_>", "marquee", "<d>", "<_r_>", "<style_text>", "<_cl_>", "marquee", "<_r_>"]],
    "<menu_tag>": [["<_l_>", "menu", "<d>", "<_r_>", "<li_tag-2>", "<_cl_>", "menu", "<_r_>"]],
    "<multicol_tag>": [["<_l_>", "multicol", "<d>", "<_r_>", "<body_content>", "<_cl_>", "multicol", "<_r_>"]],
    "<nobr_tag>": [["<_l_>", "nobr", "<d>", "<_r_>", "<text>", "<_cl_>", "nobr", "<_r_>"]],
    "<noembed_tag>": [["<_l_>", "noembed", "<d>", "<_r_>", "<text>", "<_cl_>", "noembed", "<_r_>"]],
    "<noframes_tag>": [["<_l_>", "noframes", "<d>", "<_r_>", "<body_content-4>", "<_cl_>", "noframes", "<_r_>"]],
    "<noscript_tag>": [["<_l_>", "noscript", "<d>", "<_r_>", "<text>", "<_cl_>", "noscript", "<_r_>"]],
    "<object_content>": [["<param-2>", "<body_content>"]],
    "<object_tag>": [["<_l_>", "object", "<d>", "<_r_>", "<object_content>", "<_cl_>", "object", "<_r_>"]],
    "<ol_tag>": [["<_l_>", "ol", "<d>", "<_r_>", "<li_tag-3>", "<_cl_>", "ol", "<_r_>"]],
    "<optgroup_tag>": [["<_l_>", "optgroup", "<d>", "<_r_>", "<option_tag-1>", "<_cl_>", "optgroup", "<_r_>"]],
    "<option_tag>": [["<_l_>", "option", "<d>", "<_r_>", "<plain_text-1>", "<_cl_>", "option", "<_r_>"]],
    "<p_tag>": [["<_l_>", "p", "<_r_>", "<text>", "<_cl_>", "p", "<_r_>"]],
    "<param>": [["<_l_>", "param", "<_r_>"]],
    "<plain_text>": [["<entity-1>"]],
    "<entity>": [["<char>"], ["<ampersand>"]],
    "<char>": [["7"], ["*"], [":"], ["]"], ["n"], ["m"], ["N"], ["/"], ["."], ["K"], ["T"], ["I"], ["f"], ["o"], [","], ["l"], ["W"], ["-"], ["?"], ["\\"], ["%"], ["1"], ["c"], ["H"], ["!"], ["A"], ["$"], ["9"], ["q"], ["["], [")"], [" "], [";"], ["b"], ["i"], ["L"], ["'"], ["Y"], ["\t"], ["3"], ["g"], ["F"], ["E"], ["D"], ["C"], ["@"], ["t"], ["R"], ["\""], ["2"], ["}"], ["~"], ["5"], ["4"], ["z"], ["X"], ["S"], ["O"], ["v"], ["J"], ["`"], ["B"], ["\n"], ["y"], ["p"], ["6"], ["0"], ["k"], ["w"], ["\r"], ["V"], ["_"], ["s"], ["x"], ["{"], ["d"], ["a"], ["#"], ["Q"], ["<"], ["u"], ["r"], ["U"], ["h"], [">"], ["("], ["P"], ["G"], ["\f"], ["Z"], ["j"], ["|"], ["e"], ["^"], ["="], ["8"], ["+"], ["M"]],
    "<ampersand>": [["&nbsp;"]],
    "<physical_style>": [["<b_tag>"], ["<bdo_tag>"], ["<big_tag>"], ["<blink_tag>"], ["<font_tag>"], ["<i_tag>"], ["<s_tag>"], ["<small_tag>"], ["<span_tag>"], ["<strike_tag>"], ["<sub_tag>"], ["<sup_tag>"], ["<tt_tag>"], ["<u_tag>"]],
    "<pre_content>": [["<_l_>", "br", "<_r_>"], ["<_l_>", "hr", "<_r_>"], ["<a_tag>"], ["<style_text>"]],
    "<pre_tag>": [["<_l_>", "pre", "<_r_>", "<pre_content-1>", "<_cl_>", "pre", "<_r_>"]],
    "<q_tag>": [["<_l_>", "q", "<_r_>", "<text>", "<_cl_>", "q", "<_r_>"]],
    "<s_tag>": [["<_l_>", "s", "<_r_>", "<text>", "<_cl_>", "s", "<_r_>"]],
    "<script_tag>": [["<_l_>", "script", "<d>", "<_r_>", "<plain_text>", "<_cl_>", "script", "<_r_>"]],
    "<select_content>": [["<optgroup_tag>"], ["<option_tag>"]],
    "<select_tag>": [["<_l_>", "select", "<d>", "<_r_>", "<select_content-1>", "<_cl_>", "select", "<_r_>"]],
    "<small_tag>": [["<_l_>", "small", "<d>", "<_r_>", "<text>", "<_cl_>", "small", "<_r_>"]],
    "<span_tag>": [["<_l_>", "span", "<d>", "<_r_>", "<text>", "<_cl_>", "span", "<_r_>"]],
    "<strike_tag>": [["<_l_>", "strike", "<d>", "<_r_>", "<text>", "<_cl_>", "strike", "<_r_>"]],
    "<strong_tag>": [["<_l_>", "strong", "<d>", "<_r_>", "<text>", "<_cl_>", "strong", "<_r_>"]],
    "<style_tag>": [["<_l_>", "style", "<d>", "<_r_>", "<plain_text>", "<_cl_>", "style", "<_r_>"]],
    "<style_text>": [["<plain_text>"]],
    "<sub_tag>": [["<_l_>", "sub", "<d>", "<_r_>", "<text>", "<_cl_>", "sub", "<_r_>"]],
    "<sup_tag>": [["<_l_>", "sup", "<d>", "<_r_>", "<text>", "<_cl_>", "sup", "<_r_>"]],
    "<table_cell>": [["<td_tag>"], ["<th_tag>"]],
    "<table_content>": [["<_l_>", "tbody", "<d>", "<_r_>"], ["<_l_>", "tfoot", "<d>", "<_r_>"], ["<_l_>", "thead", "<d>", "<_r_>"], ["<tr_tag>"]],
    "<table_tag>": [["<_l_>", "table", "<d>", "<_r_>", "<caption_tag-1>", "<colgroup_tag-1>", "<table_content-1>", "<_cl_>", "table", "<_r_>"]],
    "<td_tag>": [["<_l_>", "td", "<d>", "<_r_>", "<body_content>", "<_cl_>", "td", "<_r_>"]],
    "<text>": [["<text_content-1>"]],
    "<text_content>": [["<_l_>", "br", "<d>", "<_r_>"], ["<_l_>", "embed", "<d>", "<_r_>"], ["<_l_>", "iframe", "<d>", "<_r_>"], ["<_l_>", "img", "<d>", "<_r_>"], ["<_l_>", "spacer", "<d>", "<_r_>"], ["<_l_>", "wbr", "<d>", "<_r_>"], ["<a_tag>"], ["<applet_tag>"], ["<content_style>"], ["<ilayer_tag>"], ["<noembed_tag>"], ["<noscript_tag>"], ["<object_tag>"], ["<plain_text>"], ["<physical_style>"]],
    "<textarea_tag>": [["<_l_>", "textarea", "<d>", "<_r_>", "<plain_text>", "<_cl_>", "textarea", "<_r_>"]],
    "<th_tag>": [["<_l_>", "th", "<d>", "<_r_>", "<body_content>", "<_cl_>", "th", "<_r_>"]],
    "<title_tag>": [["<_l_>", "title", "<d>", "<_r_>", "<plain_text>", "<_cl_>", "title", "<_r_>"]],
    "<tr_tag>": [["<_l_>", "tr", "<d>", "<_r_>", "<table_cell-1>", "<_cl_>", "tr", "<_r_>"]],
    "<tt_tag>": [["<_l_>", "tt", "<d>", "<_r_>", "<text>", "<_cl_>", "tt", "<_r_>"]],
    "<u_tag>": [["<_l_>", "u", "<d>", "<_r_>", "<text>", "<_cl_>", "u", "<_r_>"]],
    "<ul_tag>": [["<_l_>", "ul", "<d>", "<_r_>", "<li_tag-4>", "<_cl_>", "ul", "<_r_>"]],
    "<var_tag>": [["<_l_>", "var", "<d>", "<_r_>", "<text>", "<_cl_>", "var", "<_r_>"]],
    "<xmp_tag>": [["<_l_>", "xmp", "<d>", "<_r_>", "<literal_text>", "<_cl_>", "xmp", "<_r_>"]],
    "<d>": [["<space-1>", "<attributes-1>", "<space-2>"], []],
    "<attribute>": [["<key>"], ["<key>", "=\"", "<value>", "\""], ["<key>", "='", "<value>", "'"], ["<key>", "=", "<uqvalue>"]],
    "<key>": [["<allchars>"]],
    "<allchars>": [["7"], ["*"], [":"], ["&"], ["]"], ["n"], ["m"], ["N"], ["."], ["K"], ["T"], ["I"], ["f"], ["o"], [","], ["l"], ["W"], ["-"], ["?"], ["\\"], ["%"], ["1"], ["c"], ["H"], ["!"], ["A"], ["$"], ["9"], ["q"], ["["], [")"], [";"], ["b"], ["i"], ["L"], ["Y"], ["3"], ["g"], ["F"], ["E"], ["D"], ["C"], ["@"], ["t"], ["R"], ["2"], ["}"], ["~"], ["5"], ["4"], ["z"], ["X"], ["S"], ["O"], ["v"], ["J"], ["`"], ["B"], ["y"], ["p"], ["6"], ["0"], ["k"], ["w"], ["\r"], ["V"], ["_"], ["s"], ["x"], ["{"], ["d"], ["a"], ["#"], ["Q"], ["u"], ["r"], ["U"], ["h"], ["("], ["P"], ["G"], ["\f"], ["Z"], ["j"], ["|"], ["e"], ["^"], ["8"], ["+"], ["M"]],
    "<value>": [["<anychars>"]],
    "<anychar>": [["0"], ["1"], ["2"], ["3"], ["4"], ["5"], ["6"], ["7"], ["8"], ["9"], ["a"], ["b"], ["c"], ["d"], ["e"], ["f"], ["g"], ["h"], ["i"], ["j"], ["k"], ["l"], ["m"], ["n"], ["o"], ["p"], ["q"], ["r"], ["s"], ["t"], ["u"], ["v"], ["w"], ["x"], ["y"], ["z"], ["A"], ["B"], ["C"], ["D"], ["E"], ["F"], ["G"], ["H"], ["I"], ["J"], ["K"], ["L"], ["M"], ["N"], ["O"], ["P"], ["Q"], ["R"], ["S"], ["T"], ["U"], ["V"], ["W"], ["X"], ["Y"], ["Z"], ["!"], ["\""], ["#"], ["$"], ["%"], ["&"], ["'"], ["("], [")"], ["*"], ["+"], [","], ["-"], ["."], ["/"], [":"], [";"], ["<"], ["="], [">"], ["?"], ["@"], ["["], ["\\"], ["]"], ["^"], ["_"], ["`"], ["{"], ["|"], ["}"], ["~"], [" "], ["\t"], ["\n"], ["\r"], ["\u000b"], ["\f"]],
    "<anychars>": [["<anychar-1>"]],
    "<uqvalue>": [["<uqchars>"]],
    "<uqchar>": [["7"], ["*"], [":"], ["&"], ["]"], ["n"], ["m"], ["N"], ["."], ["K"], ["T"], ["I"], ["f"], ["o"], [","], ["l"], ["W"], ["-"], ["?"], ["\\"], ["%"], ["1"], ["c"], ["H"], ["!"], ["A"], ["$"], ["9"], ["q"], ["["], [")"], [";"], ["b"], ["i"], ["L"], ["Y"], ["3"], ["g"], ["F"], ["E"], ["D"], ["C"], ["@"], ["t"], ["R"], ["2"], ["}"], ["~"], ["5"], ["4"], ["z"], ["X"], ["S"], ["O"], ["v"], ["J"], ["B"], ["y"], ["p"], ["6"], ["0"], ["k"], ["w"], ["\r"], ["V"], ["_"], ["s"], ["x"], ["{"], ["d"], ["a"], ["#"], ["Q"], ["u"], ["r"], ["U"], ["h"], ["("], ["P"], ["G"], ["\f"], ["Z"], ["j"], ["|"], ["e"], ["^"], ["8"], ["+"], ["M"]],
    "<uqchars>": [["<uqchar-1>"]],
    "<attributes>": [["<attribute>"], ["<attribute>", "<space-3>", "<attributes>"]],
    "<space>": [[" "], ["\t"], ["\n"]],
    "<a_content-1>": [[], ["<a_content>", "<a_content-1>"]],
    "<address_content-1>": [[], ["<address_content>", "<address_content-1>"]],
    "<param-1>": [[], ["<param>", "<param-1>"]],
    "<block_content-1>": [[], ["<block_content>", "<block_content-1>"]],
    "<body_content-1>": [[], ["<body_content>", "<body_content-1>"]],
    "<body_content-2>": [[], ["<body_content>", "<body_content-2>"]],
    "<body_content-3>": [[], ["<body_content>", "<body_content-3>"]],
    "<_r_-1>": [[], ["<_r_>", "<_r_-1>"]],
    "<li_tag-1>": [["<li_tag>"], ["<li_tag>", "<li_tag-1>"]],
    "<dl_content-1>": [["<dl_content>"], ["<dl_content>", "<dl_content-1>"]],
    "<legend_tag-1>": [[], ["<legend_tag>", "<legend_tag-1>"]],
    "<form_content-1>": [[], ["<form_content>", "<form_content-1>"]],
    "<flow_content-1>": [[], ["<flow_content>", "<flow_content-1>"]],
    "<form_content-2>": [[], ["<form_content>", "<form_content-2>"]],
    "<frameset_content-1>": [[], ["<frameset_content>", "<frameset_content-1>"]],
    "<head_content-1>": [[], ["<head_content>", "<head_content-1>"]],
    "<label_content-1>": [[], ["<label_content>", "<label_content-1>"]],
    "<area-1>": [[], ["<area>", "<area-1>"]],
    "<li_tag-2>": [[], ["<li_tag>", "<li_tag-2>"]],
    "<body_content-4>": [[], ["<body_content>", "<body_content-4>"]],
    "<param-2>": [[], ["<param>", "<param-2>"]],
    "<li_tag-3>": [["<li_tag>"], ["<li_tag>", "<li_tag-3>"]],
    "<option_tag-1>": [[], ["<option_tag>", "<option_tag-1>"]],
    "<plain_text-1>": [["<plain_text>"], ["<plain_text>", "<plain_text-1>"]],
    "<entity-1>": [[], ["<entity>", "<entity-1>"]],
    "<pre_content-1>": [[], ["<pre_content>", "<pre_content-1>"]],
    "<select_content-1>": [[], ["<select_content>", "<select_content-1>"]],
    "<caption_tag-1>": [[], ["<caption_tag>", "<caption_tag-1>"]],
    "<colgroup_tag-1>": [[], ["<colgroup_tag>", "<colgroup_tag-1>"]],
    "<table_content-1>": [[], ["<table_content>", "<table_content-1>"]],
    "<text_content-1>": [[], ["<text_content>", "<text_content-1>"]],
    "<table_cell-1>": [[], ["<table_cell>", "<table_cell-1>"]],
    "<li_tag-4>": [[], ["<li_tag>", "<li_tag-4>"]],
    "<space-1>": [["<space>"], ["<space>", "<space-1>"]],
    "<attributes-1>": [[], ["<attributes>", "<attributes-1>"]],
    "<space-2>": [[], ["<space>", "<space-2>"]],
    "<anychar-1>": [[], ["<anychar>", "<anychar-1>"]],
    "<uqchar-1>": [["<uqchar>"], ["<uqchar>", "<uqchar-1>"]],
    "<space-3>": [["<space>"], ["<space>", "<space-3>"]]
}


================================================
FILE: json.json
================================================
{
    "<start>": [["<json>"]],
    "<json>": [["<element>"]],
    "<element>": [["<ws>", "<value>", "<ws>"]],
    "<value>": [["<object>"], ["<array>"], ["<string>"], ["<number>"],
                ["true"], ["false"],
                ["null"]],
    "<object>": [["{", "<ws>", "}"], ["{", "<members>", "}"]],
    "<members>": [["<member>", "<symbol-2>"]],
    "<member>": [["<ws>", "<string>", "<ws>", ":", "<element>"]],
    "<array>": [["[", "<ws>", "]"], ["[", "<elements>", "]"]],
    "<elements>": [["<element>", "<symbol-1-1>"]],
    "<string>": [["\"", "<characters>", "\""]],
    "<characters>": [["<character-1>"]],
    "<character>": [["0"], ["1"], ["2"], ["3"], ["4"], ["5"], ["6"], ["7"],
                    ["8"], ["9"], ["a"], ["b"], ["c"], ["d"], ["e"], ["f"],
                    ["g"], ["h"], ["i"], ["j"], ["k"], ["l"], ["m"], ["n"],
                    ["o"], ["p"], ["q"], ["r"], ["s"], ["t"], ["u"], ["v"],
                    ["w"], ["x"], ["y"], ["z"], ["A"], ["B"], ["C"], ["D"],
                    ["E"], ["F"], ["G"], ["H"], ["I"], ["J"], ["K"], ["L"],
                    ["M"], ["N"], ["O"], ["P"], ["Q"], ["R"], ["S"], ["T"],
                    ["U"], ["V"], ["W"], ["X"], ["Y"], ["Z"], ["!"], ["#"],
                    ["$"], ["%"], ["&"], ["\""], ["("], [")"], ["*"], ["+"],
                    [","], ["-"], ["."], ["/"], [":"], [";"], ["<"], ["="],
                    [">"], ["?"], ["@"], ["["], ["]"], ["^"], ["_"], ["`"],
                    ["{"], ["|"], ["}"], ["~"], [" "], ["<esc>"]],
    "<esc>": [["\\","<escc>"]],
    "<escc>": [["\\"],["b"],["f"], ["n"], ["r"],["t"],["\""]],
    "<number>": [["<int>", "<frac>", "<exp>"]],
    "<int>": [["<digit>"], ["<onenine>", "<digits>"], ["-", "<digits>"],
              ["-", "<onenine>", "<digits>"]],
    "<digits>": [["<digit-1>"]],
    "<digit>": [["0"], ["<onenine>"]],
    "<onenine>": [["1"], ["2"], ["3"], ["4"], ["5"], ["6"], ["7"], ["8"],
                  ["9"]],
    "<frac>": [[], [".", "<digits>"]],
    "<exp>": [[], ["E", "<sign>", "<digits>"], ["e", "<sign>", "<digits>"]],
    "<sign>": [[], ["+"], ["-"]],
    "<ws>": [["<sp1>", "<ws>"], []],
    "<sp1>": [[" "],["\n"],["\t"],["\r"]],
    "<symbol>": [[",", "<members>"]],
    "<symbol-1>": [[",", "<elements>"]],
    "<symbol-2>": [[], ["<symbol>", "<symbol-2>"]],
    "<symbol-1-1>": [[], ["<symbol-1>", "<symbol-1-1>"]],
    "<character-1>": [[], ["<character>", "<character-1>"]],
    "<digit-1>": [["<digit>"], ["<digit>", "<digit-1>"]]
}


================================================
FILE: src/main.rs
================================================
use std::collections::{BTreeMap, BTreeSet};
use std::path::Path;
use std::process::Command;
use serde::{Deserialize, Serialize};

/// If this is `true` then the output file we generate will not emit any
/// unsafe code. I'm not aware of any bugs with the unsafe code that I use and
/// thus this is by default set to `false`. Feel free to set it to `true` if
/// you are concerned.
const SAFE_ONLY: bool = false;

/// Representation of a grammar file in a Rust structure. This allows us to
/// use Serde to serialize and deserialize the json grammar files
#[derive(Serialize, Deserialize, Default, Debug)]
struct Grammar(BTreeMap<String, Vec<Vec<String>>>);

/// A strongly typed wrapper around a `usize` which selects different fragment
/// identifiers
#[derive(Clone, Copy, Debug)]
struct FragmentId(usize);

/// A fragment which is specified by the grammar file
#[derive(Clone, Debug)]
enum Fragment {
    /// A non-terminal fragment which refers to a list of `FragmentId`s to
    /// randomly select from for expansion
    NonTerminal(Vec<FragmentId>),

    /// A list of `FragmentId`s that should be expanded in order
    Expression(Vec<FragmentId>),

    /// A terminal fragment which simply should expand directly to the
    /// contained vector of bytes
    Terminal(Vec<u8>),

    /// A fragment which does nothing. This is used during optimization passes
    /// to remove fragments with no effect.
    Nop,
}

/// A grammar representation in Rust that is designed to be easy to work with
/// in-memory and optimized for code generation.
#[derive(Debug, Default)]
struct GrammarRust {
    /// All fragments in the grammar, indexed by `FragmentId`
    fragments: Vec<Fragment>,

    /// Cached fragment identifier for the start node
    start: Option<FragmentId>,

    /// Mapping of non-terminal names to fragment identifiers
    name_to_fragment: BTreeMap<String, FragmentId>,
}

impl GrammarRust {
    /// Create a new Rust version of a `Grammar` which was loaded via a
    /// grammar json specification.
    fn new(grammar: &Grammar) -> Self {
        // Create a new grammar structure
        let mut ret = GrammarRust::default();

        // Parse the input grammar to resolve all fragment names
        for (non_term, _) in grammar.0.iter() {
            // Make sure that there aren't duplicates of fragment names
            assert!(!ret.name_to_fragment.contains_key(non_term),
                "Duplicate non-terminal definition, fail");

            // Create a new, empty fragment
            let fragment_id = ret.allocate_fragment(
                Fragment::NonTerminal(Vec::new()));

            // Add the name resolution for the fragment
            ret.name_to_fragment.insert(non_term.clone(), fragment_id);
        }

        // Parse the input grammar
        for (non_term, fragments) in grammar.0.iter() {
            // Get the non-terminal fragment identifier
            let fragment_id = ret.name_to_fragment[non_term];

            // Create a vector to hold all of the variants possible under this
            // non-terminal fragment
            let mut variants = Vec::new();

            // Go through all sub-fragments
            for js_sub_fragment in fragments {
                // Different options for this sub-fragment
                let mut options = Vec::new();

                // Go through each option in the sub-fragment
                for option in js_sub_fragment {
                    let fragment_id = if let Some(&non_terminal) =
                            ret.name_to_fragment.get(option) {
                        // If we can resolve the name of this fragment, it is a
                        // non-terminal fragment and should be allocated as
                        // such
                        ret.allocate_fragment(
                            Fragment::NonTerminal(vec![non_terminal]))
                    } else {
                        // Convert the terminal bytes into a vector and
                        // create a new fragment containing it
                        ret.allocate_fragment(Fragment::Terminal(
                            option.as_bytes().to_vec()))
                    };

                    // Push this fragment as an option
                    options.push(fragment_id);
                }

                // Create a new fragment of all the options
                variants.push(
                    ret.allocate_fragment(Fragment::Expression(options)));
            }

            // Get access to the fragment we want to update based on the
            // possible variants
            let fragment = &mut ret.fragments[fragment_id.0];

            // Overwrite the placeholder non-terminal definition
            *fragment = Fragment::NonTerminal(variants);
        }

        // Resolve the start node
        ret.start = Some(ret.name_to_fragment["<start>"]);

        ret
    }

    /// Allocate a new fragment identifier and add it to the fragment list
    pub fn allocate_fragment(&mut self, fragment: Fragment) -> FragmentId {
        // Get a unique fragment identifier
        let fragment_id = FragmentId(self.fragments.len());

        // Store the fragment
        self.fragments.push(fragment);

        fragment_id
    }

    /// Optimize to remove fragments with non-random effects
    pub fn optimize(&mut self) {
        // Keeps track of fragment identifiers which resolve to nops
        let mut nop_fragments = BTreeSet::new();

        // Track if an optimization had an effect
        let mut changed = true;
        while changed {
            // Start off assuming no effect from optimization
            changed = false;

            // Go through each fragment, looking for potential optimizations
            for idx in 0..self.fragments.len() {
                // Clone the fragment such that we can inspect it, but we also
                // can mutate it in place.
                match self.fragments[idx].clone() {
                    Fragment::NonTerminal(options) => {
                        // If this non-terminal only has one option, replace
                        // itself with the only option it resolves to
                        if options.len() == 1 {
                            self.fragments[idx] =
                                self.fragments[options[0].0].clone();
                            changed = true;
                        }
                    }
                    Fragment::Expression(expr) => {
                        // If this expression has nothing to do at all, simply
                        // replace it with a `Nop`
                        if expr.is_empty() {
                            self.fragments[idx] = Fragment::Nop;
                            changed = true;

                            // Track that this fragment identifier now resolves
                            // to a nop
                            nop_fragments.insert(idx);
                        }

                        // If this expression only does one thing, then replace
                        // the expression with the thing that it does.
                        if expr.len() == 1 {
                            self.fragments[idx] =
                                self.fragments[expr[0].0].clone();
                            changed = true;
                        }

                        // Remove all `Nop`s from this expression, as they
                        // wouldn't result in anything occurring.
                        if let Fragment::Expression(exprs) =
                                &mut self.fragments[idx] {
                            // Only retain fragments which are not nops
                            exprs.retain(|x| {
                                if nop_fragments.contains(&x.0) {
                                    // Fragment was a nop, remove it
                                    changed = true;
                                    false
                                } else {
                                    // Fragment was fine, keep it
                                    true
                                }
                            });
                        }
                    }
                    Fragment::Terminal(_) | Fragment::Nop => {
                        // Already maximally optimized
                    }
                }
            }
        }
    }

    /// Generate a new Rust program that can be built and will generate random
    /// inputs and benchmark them
    pub fn program<P: AsRef<Path>>(&self, path: P, max_depth: usize) {
        let mut program = String::new();

        // Construct the base of the application. This is a profiling loop that
        // is used for testing.
        program += &format!(r#"
#![allow(unused)]
use std::cell::Cell;
use std::time::Instant;

fn main() {{
    let mut fuzzer = Fuzzer {{
        seed:  Cell::new(0x34cc028e11b4f89c),
        buf:   Vec::new(),
    }};
    
    let mut generated = 0usize;
    let it = Instant::now();

    for iters in 1u64.. {{
        fuzzer.buf.clear();
        fuzzer.fragment_{}(0);
        generated += fuzzer.buf.len();

        // Filter to reduce how often printing occurs
        if (iters & 0xfffff) == 0 {{
            let elapsed = (Instant::now() - it).as_secs_f64();
            let bytes_per_sec = generated as f64 / elapsed;
            print!("MiB/sec: {{:12.4}}\n", bytes_per_sec / 1024. / 1024.);
        }}
    }}
}}

struct Fuzzer {{
    seed:  Cell<usize>,
    buf:   Vec<u8>,
}}

impl Fuzzer {{
    fn rand(&self) -> usize {{
        let mut seed = self.seed.get();
        seed ^= seed << 13;
        seed ^= seed >> 17;
        seed ^= seed << 43;
        self.seed.set(seed);
        seed
    }}
"#, self.start.unwrap().0);

        // Go through each fragment in the list of fragments
        for (id, fragment) in self.fragments.iter().enumerate() {
            // Create a new function for this fragment
            program += &format!("    fn fragment_{}(&mut self, depth: usize) {{\n", id);

            // Add depth checking to terminate on depth exhaustion
            program += &format!("        if depth >= {} {{ return; }}\n",
                max_depth);

            match fragment {
                Fragment::NonTerminal(options) => {
                    // For non-terminal cases pick a random variant to select
                    // and invoke that fragment's routine
                    program += &format!("        match self.rand() % {} {{\n", options.len());

                    for (option_id, option) in options.iter().enumerate() {
                        program += &format!("            {} => self.fragment_{}(depth + 1),\n", option_id, option.0);
                    }
                    program += "            _ => unreachable!(),\n";

                    program += "        }\n";
                }
                Fragment::Expression(expr) => {
                    // Invoke all of the expression's routines in order
                    for &exp in expr.iter() {
                        program += &format!("        self.fragment_{}(depth + 1);\n", exp.0);
                    }
                }
                Fragment::Terminal(value) => {
                    // Append the terminal value to the output buffer
                    if SAFE_ONLY {
                        program += &format!("        self.buf.extend_from_slice(&{:?});\n",
                            value);
                    } else {
                        // For some reason this is faster than
                        // `extend_from_slice` even though it does the exact
                        // same thing. This was observed to be over a 4-5x
                        // speedup in some scenarios.
                        program += &format!(r#"
            unsafe {{
                let old_size = self.buf.len();
                let new_size = old_size + {};

                if new_size > self.buf.capacity() {{
                    self.buf.reserve(new_size - old_size);
                }}

                std::ptr::copy_nonoverlapping({:?}.as_ptr(), self.buf.as_mut_ptr().add(old_size), {});
                self.buf.set_len(new_size);
            }}
    "#, value.len(), value, value.len());
                    }
                }
                Fragment::Nop => {}
            }

            program += "    }\n";
        }
        program += "}\n";

        // Write out the test application
        std::fs::write(path, program)
            .expect("Failed to create output Rust application");
    }
}

fn main() -> std::io::Result<()> {
    // Get access to the command line arguments
    let args: Vec<String> = std::env::args().collect();
    if args.len() != 5 {
        print!("usage: fzero <grammar json> <output Rust file> <output binary name> <max depth>\n");
        return Ok(());
    }

    // Load up a grammar file
    let grammar: Grammar = serde_json::from_slice(
        &std::fs::read(&args[1])?)?;
    print!("Loaded grammar json\n");

    // Convert the grammar file to the Rust structures
    let mut gram = GrammarRust::new(&grammar);
    print!("Converted grammar to binary format\n");

    // Optimize the grammar
    gram.optimize();
    print!("Optimized grammar\n");

    // Generate a Rust application
    gram.program(&args[2],
        args[4].parse().expect("Invalid digit in max depth"));
    print!("Generated Rust source file\n");

    // Compile the application
    // rustc -O -g test.rs -C target-cpu=native
    let status = Command::new("rustc")
        .arg("-O")                // Optimize the binary
        .arg("-g")                // Generate debug information
        .arg(&args[2])            // Name of the input Rust file
        .arg("-C")                // Optimize for the current microarchitecture
        .arg("target-cpu=native")
        .arg("-o")                // Output filename
        .arg(&args[3]).spawn()?.wait()?;
    assert!(status.success(), "Failed to compile Rust binary");
    print!("Created Rust binary!\n");

    Ok(())
}
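
The `program` method above emits one `fragment_N` function per fragment, so the generated fuzzer is easiest to understand by looking at a concrete instance. The following is a minimal hand-written sketch of roughly what that output would look like for a tiny hypothetical grammar (`<start>` expands to either `"a" <start>` or `"b"`). The grammar, the fragment numbering, the `MAX_DEPTH` constant, and the `generate` helper are illustrative assumptions and not output captured from the tool. It uses the safe `extend_from_slice` path (as with `SAFE_ONLY = true`) and stores the seed as `u64` rather than `usize`, an assumed change so that the shift by 43 is also valid on 32-bit targets.

```rust
use std::cell::Cell;

// Hypothetical depth limit; the real tool takes this from the command line.
const MAX_DEPTH: usize = 8;

struct Fuzzer {
    seed: Cell<u64>,
    buf: Vec<u8>,
}

impl Fuzzer {
    // Same xorshift steps as the generated code, on a `u64` seed.
    // The seed must be non-zero, otherwise the stream stays at zero.
    fn rand(&self) -> u64 {
        let mut seed = self.seed.get();
        seed ^= seed << 13;
        seed ^= seed >> 17;
        seed ^= seed << 43;
        self.seed.set(seed);
        seed
    }

    // NonTerminal `<start>`: pick one of two variants at random.
    fn fragment_0(&mut self, depth: usize) {
        if depth >= MAX_DEPTH { return; }
        match self.rand() % 2 {
            0 => self.fragment_1(depth + 1),
            1 => self.fragment_3(depth + 1),
            _ => unreachable!(),
        }
    }

    // Expression `"a" <start>`: invoke each part in order.
    fn fragment_1(&mut self, depth: usize) {
        if depth >= MAX_DEPTH { return; }
        self.fragment_2(depth + 1);
        self.fragment_0(depth + 1);
    }

    // Terminal "a": append the bytes to the output buffer.
    fn fragment_2(&mut self, depth: usize) {
        if depth >= MAX_DEPTH { return; }
        self.buf.extend_from_slice(b"a");
    }

    // Terminal "b".
    fn fragment_3(&mut self, depth: usize) {
        if depth >= MAX_DEPTH { return; }
        self.buf.extend_from_slice(b"b");
    }
}

// Illustrative helper: generate a single input from a given non-zero seed.
fn generate(seed: u64) -> Vec<u8> {
    let mut fuzzer = Fuzzer { seed: Cell::new(seed), buf: Vec::new() };
    fuzzer.fragment_0(0);
    fuzzer.buf
}

fn main() {
    let mut fuzzer = Fuzzer {
        seed: Cell::new(0x34cc028e11b4f89c),
        buf: Vec::new(),
    };

    // Print a handful of generated inputs instead of benchmarking.
    for _ in 0..4 {
        fuzzer.buf.clear();
        fuzzer.fragment_0(0);
        println!("{}", String::from_utf8_lossy(&fuzzer.buf));
    }
}
```

Because every call, including terminals, consumes one level of depth, the depth check bounds recursion and may cut an expansion short, so outputs are not guaranteed to be complete sentences of the grammar. In this sketch no output is longer than four bytes.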
