Repository: kpcyrd/i-probably-didnt-backdoor-this Branch: main Commit: 2a9ebd529199 Files: 16 Total size: 50.0 KB Directory structure: gitextract_6sjkbplh/ ├── .gitignore ├── Cargo.toml ├── Dockerfile ├── LICENSE-APACHE ├── LICENSE-MIT ├── Makefile ├── PKGBUILD ├── README.md ├── src/ │ └── main.rs └── writeups/ ├── archlinux.md ├── cargo-lock.md ├── cargo-toml.md ├── dockerfile.md ├── main-rs.md ├── makefile.md └── pkgbuild.md ================================================ FILE CONTENTS ================================================ ================================================ FILE: .gitignore ================================================ /target /src.tgz ================================================ FILE: Cargo.toml ================================================ [package] name = "asdf" version = "0.1.1" edition = "2018" # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html [dependencies] ================================================ FILE: Dockerfile ================================================ FROM docker.io/rust@sha256:8463cc29a3187a10fc8bf5200619aadf78091b997b0c3941345332a931c40a64 WORKDIR /app COPY . . RUN cargo build --release --locked --target=x86_64-unknown-linux-musl FROM docker.io/alpine@sha256:eb3e4e175ba6d212ba1d6e04fc0782916c08e1c9d7b45892e9796141b1d379ae COPY --from=0 /app/target/x86_64-unknown-linux-musl/release/asdf /asdf ENTRYPOINT ["/asdf"] ================================================ FILE: LICENSE-APACHE ================================================ Apache License Version 2.0, January 2004 https://www.apache.org/licenses/LICENSE-2.0 TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. 
"Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. 
"Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. 
Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the 
Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. 
Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. 
To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. Copyright [yyyy] [name of copyright owner] Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at https://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. ================================================ FILE: LICENSE-MIT ================================================ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ================================================ FILE: Makefile ================================================ build: docker run --rm -v "$(PWD):/app" -w /app -u "$(shell id -u):$(shell id -g)" \ rust@sha256:8463cc29a3187a10fc8bf5200619aadf78091b997b0c3941345332a931c40a64 \ cargo build --release --locked --target=x86_64-unknown-linux-musl docker: sudo buildah bud --timestamp 0 --tag asdf src.tgz: git archive -o src.tgz HEAD .PHONY: build docker src.tgz ================================================ FILE: PKGBUILD ================================================ pkgname=i-probably-didnt-backdoor-this pkgver=0.1.1 pkgrel=1 arch=('x86_64') makedepends=('cargo') source=(src.tgz) sha256sums=(SKIP) build() { cargo build --release --locked } package() { install -Dm 755 target/release/asdf -t "${pkgdir}/usr/bin/" } ================================================ FILE: README.md ================================================ # I probably didn't backdoor this This is a practical attempt at shipping a program and having reasonably solid evidence there's probably no backdoor. All source code is annotated and there are instructions explaining how to use reproducible builds to rebuild the artifacts distributed in this repository from source. The idea is shifting the burden of proof from "you need to prove there's a backdoor" to "we need to prove there's probably no backdoor". This repository is less about code (we're going to try to keep code at a minimum actually) and instead contains technical writing that explains why these controls are effective and how to verify them. You are very welcome to adopt the techniques used here in your projects. 
The author should be assumed to be your average software developer, who might be suspiciously good with computer security, but doesn't have nation-state capabilities. ## Contents - [Preparing retroactive reviews](#preparing-retroactive-reviews) - [Pinned external resources](#pinned-external-resources) - [Reading the source code](#reading-the-source-code) - [Reproducing the ELF binary](#reproducing-the-elf-binary) - [Reproducing the Docker image](#reproducing-the-docker-image) - [Reproducing the Arch Linux package](#reproducing-the-arch-linux-package) - [Notes on security patches](#notes-on-security-patches) - [How is this related to Reproducible Builds](#how-is-this-related-to-reproducible-builds) - [Similar work](#similar-work) ### Preparing retroactive reviews Since "reading the source code" requires advanced domain knowledge, this section describes a pen-and-paper approach that can be used to cryptographically ensure you can retroactively review what you executed, even if you didn't review it before you executed it. Pen-and-paper should be taken literally here to ensure this can't be modified by software. If done correctly, you don't need to read the other sections immediately; instead you're creating an immutable paper trail that can later be used by a subject matter expert. Note that the review needs to happen on a different computer than the one that executed the code, for safety reasons. Because it's in the author's interest to prove there are no backdoors, all external resources that are not contained within this repository need to be referred to in a way that addresses their content (more on this in the next section). We're starting with the main repository by cloning it and showing the commit hash we're about to work with: ```sh $ git clone https://github.com/kpcyrd/i-probably-didnt-backdoor-this $ cd i-probably-didnt-backdoor-this/ $ git rev-parse HEAD aabbccddeeff00112233445566778899aabbccdd ``` The hash in the last line is going to be different for you.
This 40 character id is what you need for your paper trail: write it down (preferably along with the current date) and keep it in a safe location. It needs to be protected from undetected tampering but isn't secret, so you may create copies or even post it publicly. This id uniquely identifies all files in this repository with their content. If a file is modified/removed/added/renamed in this repository, this hash changes too. If you want to read more about the cryptographic properties behind this, look into [Merkle trees](https://en.wikipedia.org/wiki/Merkle_tree). ### Pinned external resources In the previous section we've described how git is automatically tracking the content of all files in this repository with a single hash. Software projects often rely on external resources downloaded from the internet, like libraries. Downloading resources from the internet doesn't weaken what we've established in the previous section, as long as: 1. The content of the resource is pinned with a cryptographic hash and the hash is recorded in the git repository. 2. We can be reasonably sure the resource is not going to disappear. If it disappears you could attempt to use backup copies, as long as they match the cryptographic hash in the repository. If either of those two conditions doesn't hold, we have "broken the chain of custody". We don't have to implement this ourselves; `cargo` and `docker` implement this internally. ### Reading the source code The repository contains 6 source code files, and there's a writeup for each of them. Files ending with `.md` are documentation.
- [`Cargo.toml`](writeups/cargo-toml.md) - Contains metadata about the project and a list of dependencies (if any) - [`Cargo.lock`](writeups/cargo-lock.md) - Automatically generated, records sha256 checksums for all dependencies - [`src/main.rs`](writeups/main-rs.md) - The actual source code of our program - [`Makefile`](writeups/makefile.md) - A wrapper script with build instructions - [`Dockerfile`](writeups/dockerfile.md) - Contains build instructions for a container image - [`PKGBUILD`](writeups/pkgbuild.md) - Contains build instructions for an Arch Linux package ### Reproducing the ELF binary The binary is built in a docker container, the exact command can be found in the [`Makefile`](writeups/makefile.md). Running make executes the build in a specific Docker image (the official rust 1.54.0 alpine 3.14 docker image). Because the build environment is pinned and there's nothing introducing non-determinism to the build (like recording the build time), running the build on different computers (or even operating systems) should always result in the same binary. Start the build with this command: ```sh $ make ``` This command should finish quite quickly and produces a binary that matches this checksum: ```sh $ b2sum target/x86_64-unknown-linux-musl/release/asdf cd112870cdf12052e5604e7559e45f95cac4e52a45e91c9d9285a22a82c6392e95fbf0dc5f784837e7769a3ce14c898c866a85e4d60b051d3416875e301e28aa target/x86_64-unknown-linux-musl/release/asdf ``` Downloading and hashing the pre-compiled binary from the [releases page](https://github.com/kpcyrd/i-probably-didnt-backdoor-this/releases) should give you an identical hash: ```sh $ curl -LsS 'https://github.com/kpcyrd/i-probably-didnt-backdoor-this/releases/download/v0.1.1/asdf' | b2sum - cd112870cdf12052e5604e7559e45f95cac4e52a45e91c9d9285a22a82c6392e95fbf0dc5f784837e7769a3ce14c898c866a85e4d60b051d3416875e301e28aa - ``` If you get the same checksum you've successfully reproduced the binary. 
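
The comparison above can also be scripted. What follows is only a minimal sketch under stated assumptions, not something shipped in this repository: `compare_artifacts` is a hypothetical helper name, and it assumes `b2sum` from coreutils is installed and that you have already saved the release binary to a local file.

```shell
#!/bin/sh
# compare_artifacts LOCAL DOWNLOADED
# Hypothetical helper: succeeds only if both files have the same BLAKE2b checksum.
compare_artifacts() {
    # Refuse to continue if either file is missing or unreadable.
    if [ ! -r "$1" ] || [ ! -r "$2" ]; then
        echo "cannot read '$1' or '$2'" >&2
        return 2
    fi

    # b2sum prints "<hash>  -" when reading stdin; keep only the hash.
    local_sum=$(b2sum < "$1" | cut -d' ' -f1)
    remote_sum=$(b2sum < "$2" | cut -d' ' -f1)

    if [ "$local_sum" = "$remote_sum" ]; then
        echo "reproduced: $local_sum"
    else
        echo "MISMATCH: $local_sum != $remote_sum" >&2
        return 1
    fi
}
```

It could be called as `compare_artifacts target/x86_64-unknown-linux-musl/release/asdf ./asdf-downloaded`, where `./asdf-downloaded` is an assumed name for the file fetched from the releases page. Comparing two files directly avoids copying a 128 character hash by eye.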
If there's no difference between the pre-compiled binary and the one you built yourself, this means the pre-compiled binary is just as trustworthy as the one you built yourself. ### Reproducing the Docker image There's a Dockerfile in the repository that always produces the same bit-for-bit identical image. It's a multi-stage build, so it builds the binary in one temporary image and then creates the real image with just `FROM`, `COPY` and `ENTRYPOINT`. The build environment is virtually identical to what we're using in the previous section; the resulting binary is then copied into an Alpine image that's pinned by its sha256 hash. ```sh $ make docker sudo buildah bud --timestamp 0 --tag asdf [1/2] STEP 1/4: FROM docker.io/rust@sha256:8463cc29a3187a10fc8bf5200619aadf78091b997b0c3941345332a931c40a64 [1/2] STEP 2/4: WORKDIR /app [1/2] STEP 3/4: COPY . . [1/2] STEP 4/4: RUN cargo build --release --locked --target=x86_64-unknown-linux-musl Finished release [optimized] target(s) in 0.02s [2/2] STEP 1/3: FROM docker.io/alpine@sha256:eb3e4e175ba6d212ba1d6e04fc0782916c08e1c9d7b45892e9796141b1d379ae [2/2] STEP 2/3: COPY --from=0 /app/target/x86_64-unknown-linux-musl/release/asdf /asdf [2/2] STEP 3/3: ENTRYPOINT ["/asdf"] [2/2] COMMIT asdf Getting image source signatures Copying blob bc276c40b172 skipped: already exists Copying blob 7d377d49a080 done Copying config f0b71b1591 done Writing manifest to image destination Storing signatures --> f0b71b1591c Successfully tagged localhost/asdf:latest f0b71b1591cf50cf3609494187083741c1021fd99f6168ab8283c4390954cef1 ``` The last line is the hash of the image we just built. We're using buildah to build the image because there's no way to set the layer timestamp with docker (causing the hash to vary). Unfortunately buildah records its version in the image; this image has been built with `1.22.3`.
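
Since the buildah version ends up in the image, it can help to check the local version before comparing image ids. The following is a rough sketch and not part of this repository: `check_buildah_version` is a hypothetical helper, and it assumes the line printed by `buildah --version` has the shape `buildah version X.Y.Z (...)`, which you should confirm on your system.

```shell
#!/bin/sh
# Version the published image was built with, taken from the text above.
EXPECTED_BUILDAH_VERSION="1.22.3"

# check_buildah_version LINE
# Hypothetical helper. LINE is ASSUMED to look like
# "buildah version 1.22.3 (...)", so the version is the third field.
check_buildah_version() {
    found=$(printf '%s\n' "$1" | awk '{ print $3 }')
    if [ "$found" = "$EXPECTED_BUILDAH_VERSION" ]; then
        echo "buildah $found matches"
    else
        echo "buildah '$found' differs from $EXPECTED_BUILDAH_VERSION, the image id may differ" >&2
        return 1
    fi
}
```

A possible invocation is `check_buildah_version "$(buildah --version)"`. Passing the version line in as an argument keeps the parsing assumption visible and easy to adjust.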
The pre-compiled images can be found on the [container registry](https://github.com/kpcyrd/i-probably-didnt-backdoor-this/pkgs/container/i-probably-didnt-backdoor-this) (also linked in the side-bar on the right). Pull the image with this command: ```sh $ docker pull ghcr.io/kpcyrd/i-probably-didnt-backdoor-this:latest latest: Pulling from kpcyrd/i-probably-didnt-backdoor-this 50341f5fa632: Already exists 163594b80890: Pull complete Digest: sha256:11cc7ec2b907a325fa3565039d990a466a7d83a06aa7dffdebba38d495d1571d Status: Downloaded newer image for ghcr.io/kpcyrd/i-probably-didnt-backdoor-this:latest ghcr.io/kpcyrd/i-probably-didnt-backdoor-this:latest ``` You'll notice the hash doesn't seem to match at first, but if everything worked the image id is indeed the same: ```sh $ docker images --no-trunc ghcr.io/kpcyrd/i-probably-didnt-backdoor-this REPOSITORY TAG IMAGE ID CREATED SIZE ghcr.io/kpcyrd/i-probably-didnt-backdoor-this latest sha256:f0b71b1591cf50cf3609494187083741c1021fd99f6168ab8283c4390954cef1 51 years ago 9.38MB ``` ### Reproducing the Arch Linux package There's a custom Arch Linux repository that's distributing a pre-built package: ``` [i-probably-didnt-backdoor-this] Server = https://pkgbuild.com/~kpcyrd/$repo/os/$arch/ ``` This package can be reproduced from source; the full writeup can be found in [this document](writeups/archlinux.md). ### Notes on security patches We've pinned very specific versions in multiple places (including the compiler). This is often considered bad style since we're now in charge of keeping all of this updated. If you're adopting this in your own project you should periodically release new versions, even if you aren't making any changes to the code anymore. This also applies to many modern programming ecosystems these days due to lock files. The following places need to be updated occasionally, causing the artifact hashes to change.
- Dependencies in Cargo.toml/Cargo.lock (if any, cargo update) - `FROM` lines in Dockerfile (docker pull rust:alpine, docker pull alpine:latest) - The build image in the Makefile (docker pull rust:alpine) ### How is this related to Reproducible Builds There's quite a bit of overlap with the [reproducible builds](https://reproducible-builds.org) project. The techniques used to rebuild the binary artifacts are only possible because the builds for this project are [reproducible](https://reproducible-builds.org/docs/definition/). This project also attempts to exclusively use binaries distributed by high-profile targets like Alpine Linux and the Rust project. This is commonly accepted as "reasonable" in the wider tech industry, but makes their build servers and signing keys extremely valuable. The reproducible builds effort attempts to reduce this risk by allowing independent parties to "reproduce" their packages with "confirmation rebuilds", just like you did when following the instructions here! ### Similar work - [Verifying a Tails image for reproducibility](https://tails.boum.org/contribute/build/reproducible/) - [Reproducing Monero Binaries](https://github.com/monero-project/monero/blob/master/contrib/gitian/README.md) - [i-probably-didnt-backdoor-this](https://github.com/bureado/i-probably-didnt-backdoor-this) fork by @bureado using unprivileged build containers with podman - [A similar guide using NixOS and Bazel](https://github.com/mlieberman85/reproducible-examples) by @mlieberman85 ## Acknowledgments This project was funded by Google, The Linux Foundation, and people like you and me through [GitHub sponsors](https://github.com/sponsors/kpcyrd). ♥️♥️♥️ ## License Licensed under either of Apache License, Version 2.0 or MIT license at your option. 
================================================ FILE: src/main.rs ================================================ fn main() { println!("Hello, world!"); } ================================================ FILE: writeups/archlinux.md ================================================ ### Reproducing the Arch Linux package There's a custom Arch Linux repository that's distributing this package. [i-probably-didnt-backdoor-this] Server = https://pkgbuild.com/~kpcyrd/$repo/os/$arch/ This repository contains these 6 files: - [i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst](https://pkgbuild.com/~kpcyrd/i-probably-didnt-backdoor-this/os/x86_64/i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst) This is the pre-built package that's going to be installed on the system. This is the file we want to reproduce. - [i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst.sig](https://pkgbuild.com/~kpcyrd/i-probably-didnt-backdoor-this/os/x86_64/i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst.sig) This is a signature that we can use to verify the previous file was signed by somebody with control over a specific private key. This signature is also included in the .db file, so this file might not get downloaded. - [i-probably-didnt-backdoor-this.db](https://pkgbuild.com/~kpcyrd/i-probably-didnt-backdoor-this/os/x86_64/i-probably-didnt-backdoor-this.db) This contains an index of all packages in the repository. pacman downloads this to learn about the packages in this repository. - [i-probably-didnt-backdoor-this.db.tar.gz](https://pkgbuild.com/~kpcyrd/i-probably-didnt-backdoor-this/os/x86_64/i-probably-didnt-backdoor-this.db.tar.gz) identical with the .db file for compatibility reasons. - [i-probably-didnt-backdoor-this.files](https://pkgbuild.com/~kpcyrd/i-probably-didnt-backdoor-this/os/x86_64/i-probably-didnt-backdoor-this.files) This contains an index of all the files in each package, this is only used by `pacman -F` operations. 
- [i-probably-didnt-backdoor-this.files.tar.gz](https://pkgbuild.com/~kpcyrd/i-probably-didnt-backdoor-this/os/x86_64/i-probably-didnt-backdoor-this.files.tar.gz) identical with the .files file for compatibility reasons. This repository has been set up from this git repository using the following commands. You don't need to run them, they're only included for documentation purposes. - `extra-x86_64-build` to build the package from the [`PKGBUILD`](pkgbuild.md) instructions - `gpg --detach-sign --no-armor -u kpcyrd@archlinux.org i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst` signs the package with my key - `repo-add i-probably-didnt-backdoor-this.db.tar.gz i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst` this creates a package database with our package `extra-x86_64-build` uses a clean chroot and records the environment in a file called `.BUILDINFO` that's embedded in the package. We download the package and list what's inside of it: $ wget https://pkgbuild.com/~kpcyrd/i-probably-didnt-backdoor-this/os/x86_64/i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst [...] $ b2sum i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst 84f398fae04a0d73647a7074dd9669e7899d8b702c30c928606ec2b0fde739b8131f817e77631ed0ffef4696380fa5478a0ebe0aa4e0b3d33ea7e3ee2910678c i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst $ tar tf i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst .BUILDINFO .MTREE .PKGINFO usr/ usr/bin/ usr/bin/asdf There's the `.BUILDINFO` file we already mentioned before, 2 files that are used internally by pacman, and the binary that is distributed in this package.
We only really care about the `.BUILDINFO` file, let's look into it: $ tar xfO i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst .BUILDINFO format = 2 pkgname = i-probably-didnt-backdoor-this pkgbase = i-probably-didnt-backdoor-this pkgver = 0.1.1-1 pkgarch = x86_64 pkgbuild_sha256sum = b6e4c48bf0c3ee73ef80e6f29e646f191b204695b725684664a64c12cb7ee150 packager = kpcyrd builddate = 1630668222 builddir = /build startdir = /startdir buildtool = makepkg buildtoolver = 6.0.0 buildenv = !distcc buildenv = color buildenv = !ccache buildenv = check buildenv = !sign options = strip options = docs options = !libtool options = !staticlibs options = emptydirs options = zipman options = purge options = !debug installed = acl-2.3.1-1-x86_64 installed = archlinux-keyring-20210820-1-any installed = attr-2.5.1-1-x86_64 installed = audit-3.0.4-1-x86_64 installed = autoconf-2.71-1-any installed = automake-1.16.4-1-any installed = bash-5.1.008-1-x86_64 installed = binutils-2.36.1-3-x86_64 installed = bison-3.7.6-1-x86_64 installed = bzip2-1.0.8-4-x86_64 installed = ca-certificates-20210603-1-any installed = ca-certificates-mozilla-3.69.1-1-x86_64 installed = ca-certificates-utils-20210603-1-any installed = coreutils-8.32-1-x86_64 installed = curl-7.78.0-1-x86_64 installed = db-5.3.28-5-x86_64 installed = diffutils-3.8-1-x86_64 installed = e2fsprogs-1.46.4-1-x86_64 installed = elfutils-0.185-1-x86_64 installed = expat-2.4.1-1-x86_64 installed = fakeroot-1.25.3-2-x86_64 installed = file-5.40-5-x86_64 installed = filesystem-2021.05.31-1-x86_64 installed = findutils-4.8.0-1-x86_64 installed = flex-2.6.4-3-x86_64 installed = gawk-5.1.0-1-x86_64 installed = gc-8.0.4-4-x86_64 installed = gcc-11.1.0-1-x86_64 installed = gcc-libs-11.1.0-1-x86_64 installed = gdbm-1.20-1-x86_64 installed = gettext-0.21-1-x86_64 installed = glib2-2.68.4-1-x86_64 installed = glibc-2.33-5-x86_64 installed = gmp-6.2.1-1-x86_64 installed = gnupg-2.2.29-1-x86_64 installed = gnutls-3.7.2-2-x86_64 installed = 
gpgme-1.16.0-1-x86_64 installed = grep-3.6-1-x86_64 installed = groff-1.22.4-6-x86_64 installed = guile-2.2.7-1-x86_64 installed = gzip-1.10-3-x86_64 installed = iana-etc-20210728-1-any installed = icu-69.1-1-x86_64 installed = keyutils-1.6.3-1-x86_64 installed = krb5-1.19.1-1-x86_64 installed = less-1:590-1-x86_64 installed = libarchive-3.5.2-1-x86_64 installed = libassuan-2.5.5-1-x86_64 installed = libcap-2.53-1-x86_64 installed = libcap-ng-0.8.2-3-x86_64 installed = libcroco-0.6.13-2-x86_64 installed = libedit-20210522_3.1-1-x86_64 installed = libelf-0.185-1-x86_64 installed = libffi-3.3-4-x86_64 installed = libgcrypt-1.9.4-1-x86_64 installed = libgpg-error-1.42-1-x86_64 installed = libidn2-2.3.2-1-x86_64 installed = libksba-1.6.0-1-x86_64 installed = libldap-2.4.59-2-x86_64 installed = libmpc-1.2.1-1-x86_64 installed = libnghttp2-1.44.0-1-x86_64 installed = libp11-kit-0.24.0-1-x86_64 installed = libpsl-0.21.1-1-x86_64 installed = libsasl-2.1.27-3-x86_64 installed = libseccomp-2.5.1-2-x86_64 installed = libsecret-0.20.4-1-x86_64 installed = libssh2-1.9.0-3-x86_64 installed = libtasn1-4.17.0-1-x86_64 installed = libtirpc-1.3.2-1-x86_64 installed = libtool-2.4.6+42+gb88cebd5-16-x86_64 installed = libunistring-0.9.10-3-x86_64 installed = libxcrypt-4.4.25-1-x86_64 installed = libxml2-2.9.10-9-x86_64 installed = linux-api-headers-5.12.3-1-any installed = llvm-libs-12.0.1-3-x86_64 installed = lz4-1:1.9.3-2-x86_64 installed = m4-1.4.19-1-x86_64 installed = make-4.3-3-x86_64 installed = mpfr-4.1.0.p13-1-x86_64 installed = ncurses-6.2-2-x86_64 installed = nettle-3.7.3-1-x86_64 installed = npth-1.6-3-x86_64 installed = openssl-1.1.1.l-1-x86_64 installed = p11-kit-0.24.0-1-x86_64 installed = pacman-6.0.0-5-x86_64 installed = pacman-mirrorlist-20210822-1-any installed = pam-1.5.1-1-x86_64 installed = pambase-20210605-2-any installed = patch-2.7.6-8-x86_64 installed = pcre-8.45-1-x86_64 installed = pcre2-10.37-1-x86_64 installed = perl-5.34.0-2-x86_64 installed = 
pinentry-1.1.1-1-x86_64 installed = pkgconf-1.7.3-1-x86_64 installed = readline-8.1.001-1-x86_64 installed = rust-1:1.54.0-1-x86_64 installed = sed-4.8-1-x86_64 installed = shadow-4.8.1-4-x86_64 installed = sqlite-3.36.0-1-x86_64 installed = sudo-1.9.7.p2-1-x86_64 installed = systemd-libs-249.4-1-x86_64 installed = tar-1.34-1-x86_64 installed = texinfo-6.8-2-x86_64 installed = tzdata-2021a-2-x86_64 installed = util-linux-2.37.2-1-x86_64 installed = util-linux-libs-2.37.2-1-x86_64 installed = which-2.21-5-x86_64 installed = xz-5.2.5-1-x86_64 installed = zlib-1:1.2.11-4-x86_64 installed = zstd-1.5.0-1-x86_64 This seems like a lot of control from a file we just downloaded from the internet, but note that most of these fields are only informational and the tooling we're going to use only allows packages to be installed that have been officially published in Arch Linux. The packages listed here are basically a base-devel install plus `makedepends=`. For official Arch Linux packages the field `pkgbase` is used with [`asp`](https://man.archlinux.org/man/extra/asp/asp.1.en) to fetch the build instructions and `pkgbuild_sha256sum` is used to identify the right commit. For technical reasons the commit id is not yet available when the Arch Linux package is built. In our case everything is contained within this repository and we already know the right commit, so we can skip this step. The Arch Linux tooling expects a tarball, so we're generating one from this repository. The commands that are used for this can be found in the [`Makefile`](makefile.md). $ make src.tgz Using the Arch Linux reproducible builds tooling we're taking the build environment from the package, the [`PKGBUILD`](pkgbuild.md) build instructions from this repository, and the `src.tgz` tarball we've just generated, and attempting to create an identical package. `makerepropkg` can be installed with `pacman -S devtools`.
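As a side note before starting the rebuild: the `.BUILDINFO` shown earlier is a plain `key = value` text file, so the pinned build environment can be inspected with ordinary text tools. The following is only a minimal sketch, assuming GNU `grep` and `sed`. It works on a made-up five-line excerpt written to a temporary file instead of the real package, and the variable names are hypothetical.

```sh
# Hypothetical excerpt of a .BUILDINFO file. The entries are copied from the
# listing shown earlier, the temporary file only exists for this sketch.
buildinfo="$(mktemp)"
cat > "$buildinfo" <<'EOF'
pkgname = i-probably-didnt-backdoor-this
pkgver = 0.1.1-1
installed = gcc-11.1.0-1-x86_64
installed = glibc-2.33-5-x86_64
installed = rust-1:1.54.0-1-x86_64
EOF

# Keep only the "installed = ..." lines and strip the key.
installed="$(grep '^installed = ' "$buildinfo" | sed 's/^installed = //')"
printf '%s\n' "$installed"

# Count the pinned packages in the excerpt.
count="$(grep -c '^installed = ' "$buildinfo")"
echo "pinned packages: $count"
```

For the real package the same filter could be fed from the `tar xfO ... .BUILDINFO` command used earlier. The rebuild itself is started with the `makerepropkg` invocation that follows.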
$ makerepropkg i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst Create subvolume '/var/lib/archbuild/reproducible/root' ==> Creating install root at /var/lib/archbuild/reproducible/root ==> Installing packages to /var/lib/archbuild/reproducible/root warning: database file for 'core' does not exist (use '-Sy' to download) warning: database file for 'extra' does not exist (use '-Sy' to download) warning: database file for 'community' does not exist (use '-Sy' to download) loading packages... resolving dependencies... looking for conflicting packages... Packages (110) acl-2.3.1-1 archlinux-keyring-20210820-1 attr-2.5.1-1 audit-3.0.4-1 autoconf-2.71-1 automake-1.16.4-1 bash-5.1.008-1 binutils-2.36.1-3 bison-3.7.6-1 bzip2-1.0.8-4 ca-certificates-20210603-1 ca-certificates-mozilla-3.69.1-1 ca-certificates-utils-20210603-1 coreutils-8.32-1 curl-7.78.0-1 db-5.3.28-5 diffutils-3.8-1 e2fsprogs-1.46.4-1 elfutils-0.185-1 expat-2.4.1-1 fakeroot-1.25.3-2 file-5.40-5 filesystem-2021.05.31-1 findutils-4.8.0-1 flex-2.6.4-3 gawk-5.1.0-1 gc-8.0.4-4 gcc-11.1.0-1 gcc-libs-11.1.0-1 gdbm-1.20-1 gettext-0.21-1 glib2-2.68.4-1 glibc-2.33-5 gmp-6.2.1-1 gnupg-2.2.29-1 gnutls-3.7.2-2 gpgme-1.16.0-1 grep-3.6-1 groff-1.22.4-6 guile-2.2.7-1 gzip-1.10-3 iana-etc-20210728-1 icu-69.1-1 keyutils-1.6.3-1 krb5-1.19.1-1 less-1:590-1 libarchive-3.5.2-1 libassuan-2.5.5-1 libcap-2.53-1 libcap-ng-0.8.2-3 libcroco-0.6.13-2 libedit-20210522_3.1-1 libelf-0.185-1 libffi-3.3-4 libgcrypt-1.9.4-1 libgpg-error-1.42-1 libidn2-2.3.2-1 libksba-1.6.0-1 libldap-2.4.59-2 libmpc-1.2.1-1 libnghttp2-1.44.0-1 libp11-kit-0.24.0-1 libpsl-0.21.1-1 libsasl-2.1.27-3 libseccomp-2.5.1-2 libsecret-0.20.4-1 libssh2-1.9.0-3 libtasn1-4.17.0-1 libtirpc-1.3.2-1 libtool-2.4.6+42+gb88cebd5-16 libunistring-0.9.10-3 libxcrypt-4.4.25-1 libxml2-2.9.10-9 linux-api-headers-5.12.3-1 llvm-libs-12.0.1-3 lz4-1:1.9.3-2 m4-1.4.19-1 make-4.3-3 mpfr-4.1.0.p13-1 ncurses-6.2-2 nettle-3.7.3-1 npth-1.6-3 openssl-1.1.1.l-1 p11-kit-0.24.0-1 
pacman-6.0.0-5 pacman-mirrorlist-20210822-1 pam-1.5.1-1 pambase-20210605-2 patch-2.7.6-8 pcre-8.45-1 pcre2-10.37-1 perl-5.34.0-2 pinentry-1.1.1-1 pkgconf-1.7.3-1 readline-8.1.001-1 rust-1:1.54.0-1 sed-4.8-1 shadow-4.8.1-4 sqlite-3.36.0-1 sudo-1.9.7.p2-1 systemd-libs-249.4-1 tar-1.34-1 texinfo-6.8-2 tzdata-2021a-2 util-linux-2.37.2-1 util-linux-libs-2.37.2-1 which-2.21-5 xz-5.2.5-1 zlib-1:1.2.11-4 zstd-1.5.0-1 Total Installed Size: 1330.71 MiB :: Proceed with installation? [Y/n] (110/110) checking keys in keyring [#######################################] 100% (110/110) checking package integrity [#######################################] 100% (110/110) loading package files [#######################################] 100% (110/110) checking for file conflicts [#######################################] 100% (110/110) checking available disk space [#######################################] 100% :: Processing package changes... ( 1/110) installing linux-api-headers [#######################################] 100% ( 2/110) installing tzdata [#######################################] 100% ( 3/110) installing iana-etc [#######################################] 100% ( 4/110) installing filesystem [#######################################] 100% This takes some time but should eventually print the following text at the end: ==> Extracting sources... -> Extracting src.tgz with bsdtar ==> Starting build()... Compiling asdf v0.1.1 (/build/i-probably-didnt-backdoor-this/src) Finished release [optimized] target(s) in 1.63s ==> Entering fakeroot environment... ==> Starting package()... ==> Tidying install... -> Removing libtool files... -> Purging unwanted files... -> Removing static library files... -> Stripping unneeded symbols from binaries and libraries... -> Compressing man and info pages... ==> Checking for packaging issues... ==> Creating package "i-probably-didnt-backdoor-this"... -> Generating .PKGINFO file... -> Generating .BUILDINFO file... 
warning: database file for 'core' does not exist (use '-Sy' to download) warning: database file for 'extra' does not exist (use '-Sy' to download) warning: database file for 'community' does not exist (use '-Sy' to download) -> Generating .MTREE file... -> Compressing package... ==> Leaving fakeroot environment. ==> Finished making: i-probably-didnt-backdoor-this 0.1.1-1 (Fri 03 Sep 2021 01:26:15 PM CEST) -> built succeeded! built packages can be found in /var/lib/archbuild/reproducible/testenv/pkgdest ==> comparing artifacts... -> Package 'i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst' successfully reproduced! This means we've successfully built a package from source that is bit-for-bit identical, including every file inside of it. The first is the package we downloaded, the second is the package we built from source. ```sh $ b2sum i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst /var/lib/archbuild/reproducible/testenv/pkgdest/i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst 84f398fae04a0d73647a7074dd9669e7899d8b702c30c928606ec2b0fde739b8131f817e77631ed0ffef4696380fa5478a0ebe0aa4e0b3d33ea7e3ee2910678c i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst 84f398fae04a0d73647a7074dd9669e7899d8b702c30c928606ec2b0fde739b8131f817e77631ed0ffef4696380fa5478a0ebe0aa4e0b3d33ea7e3ee2910678c /var/lib/archbuild/reproducible/testenv/pkgdest/i-probably-didnt-backdoor-this-0.1.1-1-x86_64.pkg.tar.zst ``` ================================================ FILE: writeups/cargo-lock.md ================================================ # `Cargo.lock` This file is automatically generated, it usually contains checksums of the dependencies of the project, but since our project doesn't have any it only shows our name and version that we've seen in [`Cargo.toml`](cargo-toml.md). ```toml # This file is automatically @generated by Cargo. # It is not intended for manual editing. 
version = 3 [[package]] name = "asdf" version = "0.1.1" ``` ================================================ FILE: writeups/cargo-toml.md ================================================ # `Cargo.toml` This is mostly defaults from running `cargo new --bin asdf`: ```toml [package] name = "asdf" version = "0.1.1" edition = "2018" # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html [dependencies] ``` - `name = "asdf"` is the name of our project. This isn't really used for anything but the binary name. - `version = "0.1.1"` is the version of our project. - `edition = "2018"` means we're opting into new features of the Rust compiler. If unspecified, rustc uses the 2015 edition; the 2018 edition is the default for new projects. The `[dependencies]` section can reference other code that isn't part of this repository, but there are no dependencies in this case because there are no lines after this one. ================================================ FILE: writeups/dockerfile.md ================================================ # `Dockerfile` ```dockerfile FROM docker.io/rust@sha256:8463cc29a3187a10fc8bf5200619aadf78091b997b0c3941345332a931c40a64 WORKDIR /app COPY . . RUN cargo build --release --locked --target=x86_64-unknown-linux-musl FROM docker.io/alpine@sha256:eb3e4e175ba6d212ba1d6e04fc0782916c08e1c9d7b45892e9796141b1d379ae COPY --from=0 /app/target/x86_64-unknown-linux-musl/release/asdf /asdf ENTRYPOINT ["/asdf"] ``` The Dockerfile has two stages, the first one compiles the source code: - `FROM docker.io/rust@sha256:8463cc29a3187a10fc8bf5200619aadf78091b997b0c3941345332a931c40a64` This describes the image we use to build the binary for our Docker image. This is the same image we also referenced in the Makefile. - `WORKDIR /app` By default the working directory is `/`; this line creates a folder that we're going to compile in and changes the working directory to this folder. - `COPY .
.` This copies the content of the current directory of the build system into the container. The current directory on the build system is expected to be the folder that this repository was cloned to, so the files that are copied are only those from the git repository. - `RUN cargo build --release --locked --target=x86_64-unknown-linux-musl` This compiles the Rust source code into a binary. The command is explained in detail in the [`Makefile`](makefile.md) writeup. This command is likely making our temporary The temporary build image is done at this point; the second `FROM` starts a new image: - `FROM docker.io/alpine@sha256:eb3e4e175ba6d212ba1d6e04fc0782916c08e1c9d7b45892e9796141b1d379ae` This is the base image for our final image that we're going to upload. The Rust compiler is not required anymore, so we're using one of the official [alpine](https://hub.docker.com/_/alpine) images here. - `COPY --from=0 /app/target/x86_64-unknown-linux-musl/release/asdf /asdf` This copies the build artifact from the previous image into the final Docker image. If this binary is reproducible, the final Docker image is going to be reproducible too. - `ENTRYPOINT ["/asdf"]` This sets the `/asdf` binary we just copied into the container as the entrypoint, meaning if the container is executed, this binary runs, nothing else. ================================================ FILE: writeups/main-rs.md ================================================ # `src/main.rs` This is our program. We're creating a function called `main` that contains a call to the `println!` macro to print the string `Hello, world!` to stdout. ```rust fn main() { println!("Hello, world!"); } ``` ================================================ FILE: writeups/makefile.md ================================================ # `Makefile` This is likely the most complicated file in this repository, so we're going to explain it in depth.
```make build: docker run --rm -v "$(PWD):/app" -w /app -u "$(shell id -u):$(shell id -g)" \ rust@sha256:8463cc29a3187a10fc8bf5200619aadf78091b997b0c3941345332a931c40a64 \ cargo build --release --locked --target=x86_64-unknown-linux-musl docker: sudo buildah bud --timestamp 0 --tag asdf src.tgz: git archive -o src.tgz HEAD .PHONY: build docker src.tgz ``` ## `build:` This builds the ELF binary. - `docker run` This means we're going to run a [docker container](https://www.docker.com/resources/what-container). - `--rm` This means the container should be temporary and is going to be deleted after our command completes. - `-v "$(PWD):/app"` This means we're going to make the current directory available in the container at `/app`. - `-w /app` This means we want to run our command inside of `/app`. - `-u "$(shell id -u):$(shell id -g)" ` This means the user id and group id of the process in the container should be equal to the user id and group id of the current user on the host system. This is important on Linux because of the mount; with Docker for macOS this setting is actually optional. - `rust@sha256:8463cc29a3187a10fc8bf5200619aadf78091b997b0c3941345332a931c40a64` This specifies one specific Rust image by its checksum. This is important to document which compiler was used to build the binary. If you use the same compiler you're also going to get a 100% identical binary (note this might not be the case with more complex builds). - `cargo build --release --locked` This is the command we're running inside the container. This is compiling the binary with optimizations from the release profile. - `--locked` This option explicitly says the dependencies in the Cargo.lock file must be used (none, in our case). - `--target=x86_64-unknown-linux-musl` This specifies the target system we want to support, in our case that's the CPU architecture and Linux with musl libc. This currently implies statically linked builds in the Rust world. ## `docker:` This builds the Docker image.
- `sudo buildah` This means we're going to run [`buildah`](https://buildah.io/) as root. `buildah` is a tool to build container images; it's less well known than Docker but has a unique feature that we're going to use here. The final image can be used with Docker too. - `bud` This is short for build-using-dockerfile, it means we're going to execute the instructions in our [`Dockerfile`](dockerfile.md). - `--timestamp 0` This causes buildah to hardcode the build time to `1970-01-01 00:00:00` UTC. Usually the current time is used instead, which would make the image non-deterministic. - `--tag asdf` This means our newly built image should be tagged with the name `asdf`. ## `src.tgz:` This snapshots all the source code from this git commit into an archive. This is only needed for the Arch Linux package. - `git archive` This subcommand creates an archive that contains all the code from a given commit. - `-o src.tgz` This specifies the file the archive should be written to. - `HEAD` Refers to the commit we've currently checked out. ## `.PHONY: build docker` This means `build` and `docker` are target names, not file names. The commands should execute even if a file with that name already exists. ================================================ FILE: writeups/pkgbuild.md ================================================ # `PKGBUILD` This file is only used if you're following the "Reproducing the Arch Linux package" instructions. ```sh pkgname=i-probably-didnt-backdoor-this pkgver=0.1.1 pkgrel=1 arch=('x86_64') makedepends=('cargo') source=(src.tgz) sha256sums=(SKIP) build() { cargo build --release --locked } package() { install -Dm 755 target/release/asdf -t "${pkgdir}/usr/bin/" } ``` This is the bare minimum to create a working package, even though it doesn't meet the Arch Linux packaging standards (namcap is going to print some warnings about this).
```sh pkgname=i-probably-didnt-backdoor-this pkgver=0.1.1 pkgrel=1 arch=('x86_64') ``` These fields are required in a PKGBUILD; they contain the package name and the package version. `pkgrel=` is a "revision" in case we need to release a new package for the same upstream version. `arch=` is a list of supported architectures; for simplicity we set this to just `x86_64`. ```sh makedepends=('cargo') ``` In addition to base-devel, which we can assume is always installed in the build container, we also need the Rust build system. Note that we don't specify which version we want. For the initial build this is automatically going to resolve to "the latest in Arch Linux"; for our rebuild we're going to pick the one that's specified in the `.BUILDINFO` file of the package we want to reproduce. There's an in-depth explanation in the [Arch Linux package writeup](archlinux.md). ```sh source=(src.tgz) sha256sums=(SKIP) ``` These are the source inputs: `src.tgz` is a tarball of this repository that we're going to generate with `git archive` as described in the [`Makefile`](makefile.md). `sha256sums=(SKIP)` disables the checksum verification for this file. ```sh build() { cargo build --release --locked } ``` This function implements the actual build. `--release` specifies the binary should be built with the [standard release profile](https://doc.rust-lang.org/cargo/reference/profiles.html#release). `--locked` means the build MUST use the dependencies defined in [`Cargo.lock`](cargo-lock.md). ```sh package() { install -Dm 755 target/release/asdf -t "${pkgdir}/usr/bin/" } ``` After the build this function implements how the package should be created. - `install -Dm 755` means we want to copy a file, all directories in the destination path should be created as needed, and `755` are the permissions the file should have. This translates to "read/write/execute for the owner (root), read/execute for everybody else". - `target/release/asdf` this is the file that was compiled in `build()` that we want to ship in our package.
- `-t "${pkgdir}/usr/bin/"` this means the file should be copied into the `/usr/bin/` folder of our package. `install` is going to keep the filename.
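
The `install` invocation from `package()` can also be tried outside of makepkg. This is a minimal sketch, assuming GNU coreutils. The scratch directory, the `workdir` and `mode` variables and the dummy `asdf` file are stand-ins for illustration, not real build output; `pkgdir` is normally set by makepkg.

```sh
# Scratch directory standing in for the build tree. ${pkgdir} is normally
# set by makepkg, here we point it at a throwaway folder ourselves.
workdir="$(mktemp -d)"
pkgdir="${workdir}/pkg"

# Dummy file in place of the compiled binary (hypothetical content).
mkdir -p "${workdir}/target/release"
printf '#!/bin/sh\necho "Hello, world!"\n' > "${workdir}/target/release/asdf"

# -D creates the missing usr/bin/ directories, -m 755 sets the mode,
# -t names the directory the file is copied into.
install -Dm 755 "${workdir}/target/release/asdf" -t "${pkgdir}/usr/bin/"

# Print the mode and path of the installed file.
mode="$(stat -c '%a' "${pkgdir}/usr/bin/asdf")"
echo "${mode} ${pkgdir}/usr/bin/asdf"
```

The file keeps its name `asdf` and ends up below `usr/bin/` inside the scratch `pkgdir`, which mirrors where it is placed in the final package.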