Repository: rust-secure-code/cargo-supply-chain
Branch: master
Commit: c89655c002c9
Files: 22
Total size: 74.2 KB

Directory structure:
cargo-supply-chain/

├── .github/
│   └── workflows/
│       └── rust.yml
├── .gitignore
├── CHANGELOG.md
├── Cargo.toml
├── LICENSE-APACHE
├── LICENSE-MIT
├── LICENSE-ZLIB
├── README.md
├── fixtures/
│   └── optional_non_dev_dep/
│       ├── Cargo.toml
│       └── src/
│           └── lib.rs
└── src/
    ├── api_client.rs
    ├── cli.rs
    ├── common.rs
    ├── crates_cache.rs
    ├── main.rs
    ├── publishers.rs
    └── subcommands/
        ├── crates.rs
        ├── json.rs
        ├── json_schema.rs
        ├── mod.rs
        ├── publishers.rs
        └── update.rs

================================================
FILE CONTENTS
================================================

================================================
FILE: .github/workflows/rust.yml
================================================
name: Rust CI
on:
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        rust: [stable]
    steps:
    - uses: actions/checkout@v2
    - run: rustup default ${{ matrix.rust }}
    - name: build
      run: >
        cargo build --verbose
    - name: test
      run: >
        cargo test --tests
  rustfmt:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2
    - uses: actions-rs/toolchain@v1
      with:
        toolchain: stable
        override: true
        components: rustfmt
    - name: Run rustfmt check
      uses: actions-rs/cargo@v1
      with:
        command: fmt
        args: -- --check
  doc:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        rust: [stable]
    steps:
    - uses: actions/checkout@v2
    - run: rustup default ${{ matrix.rust }}
    - name: doc 
      run: >
        cargo doc --no-deps --document-private-items --all-features


================================================
FILE: .gitignore
================================================
target


================================================
FILE: CHANGELOG.md
================================================
## v0.3.7 (2026-02-05)

 - Updated the caching code to handle the recent changes to crates.io dump format

## v0.3.6 (2026-01-22)

 - Fixed the tool reporting transitive optional dependencies that are disabled by features as part of supply chain surface
 - Removed test JSON data from the git tree, matching the crates.io package to the git state again
 - Upgraded to cargo-metadata v0.23

## v0.3.5 (2025-09-18)

 - Fixed support for Windows by switching from `xdg` crate to `dirs` crate for discovering the cache directory

## v0.3.4 (2025-06-04)

 - Improved the message displayed when the latest data dump is considered outdated (contribution by @smoelius)
 - Bumped dependencies in Cargo.lock by running `cargo update`
 - Resolved some Clippy lints

## v0.3.3 (2023-05-08)

 - Add `--no-dev` flag to omit dev dependencies (contribution by @smoelius)

## v0.3.2 (2022-11-04)

 - Upgrade to `bpaf` 0.7

## v0.3.1 (2021-03-18)

 - Fix `--features` flag not being honored if `--target` is also passed

## v0.3.0 (2021-03-18)

 - Renamed `--cache_max_age` to `--cache-max-age` for consistency with Cargo flags
 - Accept flags such as `--target` directly, without relying on the escape hatch of passing cargo metadata arguments after `--`
 - No longer default to `--all-features`, handle features via the same flags as Cargo itself
 - The json schema is now printed separately, use `cargo supply-chain json --print-schema` to get it
 - Dropped the `help` subcommand. Use `--help` instead, e.g. `cargo supply-chain crates --help`

Internal improvements:

 - Migrate to bpaf CLI parser, chosen for its balance of expressiveness vs complexity and supply chain sprawl
 - Add tests for the CLI interface
 - Do not regenerate the JSON schema on every build; saves a bit of build time and a bit of dependencies in production builds

## v0.2.0 (2021-05-21)

- Added `json` subcommand providing structured output and more details
- Added `-d`, `--diffable` flag for diff-friendly output mode to all subcommands
- Reduced the required download size for `update` subcommand from ~350 MB to ~60 MB
- Added a detailed progress bar to all subcommands using `indicatif`
- Fixed interrupted `update` subcommand considering its cache to be fresh.
  Other subcommands were not affected and would simply fetch live data.
- If a call to `cargo metadata` fails, show an error instead of panicking
- The list of crates in the output of `publishers` subcommand is now sorted

## v0.1.2 (2021-02-24)

- Fix help text sometimes being misaligned
- Change download progress messages to start counting from 1 rather than from 0
- Only print warnings about crates.io that are immediately relevant to listing
  dependencies and publishers

## v0.1.1 (2021-02-18)

- Drop extraneous files from the tarball uploaded to crates.io

## v0.1.0 (2021-02-18)

- Drop `authors` subcommand
- Add `help` subcommand providing detailed help for each subcommand
- Bring help text more in line with Cargo help text
- Warn about a large amount of data to be downloaded in `update` subcommand
- Buffer reads and writes to cache files for a 6x speedup when using cache

## v0.0.4 (2021-01-01)

- Report failure instead of panicking on network failure in `update` subcommand
- Correctly handle errors returned by the remote server

## v0.0.3 (2020-12-28)

- In case of network failure, retry with exponential backoff up to 3 times
- Use local certificate store instead of bundling the trusted CA certificates
- Refactor argument parsing to use `pico-args` instead of hand-rolled parser

## v0.0.2 (2020-10-14)

- `crates` - Shows the people or groups with publisher rights for each crate.
- `publishers` - Is the reverse of `crates`, grouping by publisher instead.
- `update` - Caches the data dumps from `crates.io` to avoid crawling the web
  service when looking up publisher and author information.

## v0.0.1 (2020-10-02)

Initial release, supports one command:
- `authors` - Crawl through Cargo.toml of all crates and list their authors.
  Authors might be listed multiple times. For each author, differentiate
  whether they are known from a crate in the local workspace or not.
  Support for crawling `crates.io` sourced packages is planned.
- `publishers` - Doesn't do anything right now.


================================================
FILE: Cargo.toml
================================================
[package]
name = "cargo-supply-chain"
version = "0.3.7"
description = "Gather author, contributor, publisher data on crates in your dependency graph"
repository = "https://github.com/rust-secure-code/cargo-supply-chain"
authors = ["Andreas Molzer <andreas.molzer@gmx.de>", "Sergey \"Shnatsel\" Davidoff <shnatsel@gmail.com>"]
edition = "2018"
license = "Apache-2.0 OR MIT OR Zlib"
categories = ["development-tools::cargo-plugins", "command-line-utilities"]

[dependencies]
cargo_metadata = "0.23.0"
csv = "1.1"
flate2 = "1"
humantime = "2"
humantime-serde = "1"
ureq = { version = "2.0.1", default-features=false, features = ["tls", "native-certs", "json"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
tar = "0.4.30"
indicatif = "0.17.0"
bpaf = { version = "0.9.1", features = ["derive", "dull-color"] }
anyhow = "1.0.28"
dirs = "6.0.0"

[dev-dependencies]
schemars = "0.8.3"


================================================
FILE: LICENSE-APACHE
================================================
                              Apache License
                        Version 2.0, January 2004
                     http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

   "License" shall mean the terms and conditions for use, reproduction,
   and distribution as defined by Sections 1 through 9 of this document.

   "Licensor" shall mean the copyright owner or entity authorized by
   the copyright owner that is granting the License.

   "Legal Entity" shall mean the union of the acting entity and all
   other entities that control, are controlled by, or are under common
   control with that entity. For the purposes of this definition,
   "control" means (i) the power, direct or indirect, to cause the
   direction or management of such entity, whether by contract or
   otherwise, or (ii) ownership of fifty percent (50%) or more of the
   outstanding shares, or (iii) beneficial ownership of such entity.

   "You" (or "Your") shall mean an individual or Legal Entity
   exercising permissions granted by this License.

   "Source" form shall mean the preferred form for making modifications,
   including but not limited to software source code, documentation
   source, and configuration files.

   "Object" form shall mean any form resulting from mechanical
   transformation or translation of a Source form, including but
   not limited to compiled object code, generated documentation,
   and conversions to other media types.

   "Work" shall mean the work of authorship, whether in Source or
   Object form, made available under the License, as indicated by a
   copyright notice that is included in or attached to the work
   (an example is provided in the Appendix below).

   "Derivative Works" shall mean any work, whether in Source or Object
   form, that is based on (or derived from) the Work and for which the
   editorial revisions, annotations, elaborations, or other modifications
   represent, as a whole, an original work of authorship. For the purposes
   of this License, Derivative Works shall not include works that remain
   separable from, or merely link (or bind by name) to the interfaces of,
   the Work and Derivative Works thereof.

   "Contribution" shall mean any work of authorship, including
   the original version of the Work and any modifications or additions
   to that Work or Derivative Works thereof, that is intentionally
   submitted to Licensor for inclusion in the Work by the copyright owner
   or by an individual or Legal Entity authorized to submit on behalf of
   the copyright owner. For the purposes of this definition, "submitted"
   means any form of electronic, verbal, or written communication sent
   to the Licensor or its representatives, including but not limited to
   communication on electronic mailing lists, source code control systems,
   and issue tracking systems that are managed by, or on behalf of, the
   Licensor for the purpose of discussing and improving the Work, but
   excluding communication that is conspicuously marked or otherwise
   designated in writing by the copyright owner as "Not a Contribution."

   "Contributor" shall mean Licensor and any individual or Legal Entity
   on behalf of whom a Contribution has been received by Licensor and
   subsequently incorporated within the Work.

2. Grant of Copyright License. Subject to the terms and conditions of
   this License, each Contributor hereby grants to You a perpetual,
   worldwide, non-exclusive, no-charge, royalty-free, irrevocable
   copyright license to reproduce, prepare Derivative Works of,
   publicly display, publicly perform, sublicense, and distribute the
   Work and such Derivative Works in Source or Object form.

3. Grant of Patent License. Subject to the terms and conditions of
   this License, each Contributor hereby grants to You a perpetual,
   worldwide, non-exclusive, no-charge, royalty-free, irrevocable
   (except as stated in this section) patent license to make, have made,
   use, offer to sell, sell, import, and otherwise transfer the Work,
   where such license applies only to those patent claims licensable
   by such Contributor that are necessarily infringed by their
   Contribution(s) alone or by combination of their Contribution(s)
   with the Work to which such Contribution(s) was submitted. If You
   institute patent litigation against any entity (including a
   cross-claim or counterclaim in a lawsuit) alleging that the Work
   or a Contribution incorporated within the Work constitutes direct
   or contributory patent infringement, then any patent licenses
   granted to You under this License for that Work shall terminate
   as of the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the
   Work or Derivative Works thereof in any medium, with or without
   modifications, and in Source or Object form, provided that You
   meet the following conditions:

   (a) You must give any other recipients of the Work or
       Derivative Works a copy of this License; and

   (b) You must cause any modified files to carry prominent notices
       stating that You changed the files; and

   (c) You must retain, in the Source form of any Derivative Works
       that You distribute, all copyright, patent, trademark, and
       attribution notices from the Source form of the Work,
       excluding those notices that do not pertain to any part of
       the Derivative Works; and

   (d) If the Work includes a "NOTICE" text file as part of its
       distribution, then any Derivative Works that You distribute must
       include a readable copy of the attribution notices contained
       within such NOTICE file, excluding those notices that do not
       pertain to any part of the Derivative Works, in at least one
       of the following places: within a NOTICE text file distributed
       as part of the Derivative Works; within the Source form or
       documentation, if provided along with the Derivative Works; or,
       within a display generated by the Derivative Works, if and
       wherever such third-party notices normally appear. The contents
       of the NOTICE file are for informational purposes only and
       do not modify the License. You may add Your own attribution
       notices within Derivative Works that You distribute, alongside
       or as an addendum to the NOTICE text from the Work, provided
       that such additional attribution notices cannot be construed
       as modifying the License.

   You may add Your own copyright statement to Your modifications and
   may provide additional or different license terms and conditions
   for use, reproduction, or distribution of Your modifications, or
   for any such Derivative Works as a whole, provided Your use,
   reproduction, and distribution of the Work otherwise complies with
   the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise,
   any Contribution intentionally submitted for inclusion in the Work
   by You to the Licensor shall be under the terms and conditions of
   this License, without any additional terms or conditions.
   Notwithstanding the above, nothing herein shall supersede or modify
   the terms of any separate license agreement you may have executed
   with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade
   names, trademarks, service marks, or product names of the Licensor,
   except as required for reasonable and customary use in describing the
   origin of the Work and reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or
   agreed to in writing, Licensor provides the Work (and each
   Contributor provides its Contributions) on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
   implied, including, without limitation, any warranties or conditions
   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
   PARTICULAR PURPOSE. You are solely responsible for determining the
   appropriateness of using or redistributing the Work and assume any
   risks associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory,
   whether in tort (including negligence), contract, or otherwise,
   unless required by applicable law (such as deliberate and grossly
   negligent acts) or agreed to in writing, shall any Contributor be
   liable to You for damages, including any direct, indirect, special,
   incidental, or consequential damages of any character arising as a
   result of this License or out of the use or inability to use the
   Work (including but not limited to damages for loss of goodwill,
   work stoppage, computer failure or malfunction, or any and all
   other commercial damages or losses), even if such Contributor
   has been advised of the possibility of such damages.

9. Accepting Warranty or Additional Liability. While redistributing
   the Work or Derivative Works thereof, You may choose to offer,
   and charge a fee for, acceptance of support, warranty, indemnity,
   or other liability obligations and/or rights consistent with this
   License. However, in accepting such obligations, You may act only
   on Your own behalf and on Your sole responsibility, not on behalf
   of any other Contributor, and only if You agree to indemnify,
   defend, and hold each Contributor harmless for any liability
   incurred by, or claims asserted against, such Contributor by reason
   of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work.

   To apply the Apache License to your work, attach the following
   boilerplate notice, with the fields enclosed by brackets "[]"
   replaced with your own identifying information. (Don't include
   the brackets!)  The text should be enclosed in the appropriate
   comment syntax for the file format. We also recommend that a
   file or class name and description of purpose be included on the
   same "printed page" as the copyright notice for easier
   identification within third-party archives.

Copyright [yyyy] [name of copyright owner]

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


================================================
FILE: LICENSE-MIT
================================================
MIT License

Copyright (c) 2020 Andreas Molzer aka. HeroicKatora

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: LICENSE-ZLIB
================================================
Copyright (c) 2020 Andreas Molzer aka. HeroicKatora

This software is provided 'as-is', without any express or implied warranty. In
no event will the authors be held liable for any damages arising from the use
of this software.

Permission is granted to anyone to use this software for any purpose, including
commercial applications, and to alter it and redistribute it freely, subject to
the following restrictions:

1. The origin of this software must not be misrepresented; you must not claim
   that you wrote the original software. If you use this software in a product, an
   acknowledgment in the product documentation would be appreciated but is not
   required.

2. Altered source versions must be plainly marked as such, and must not be
   misrepresented as being the original software.

3. This notice may not be removed or altered from any source distribution.


================================================
FILE: README.md
================================================
# cargo-supply-chain

Gather author, contributor and publisher data on crates in your dependency graph.

Use cases include:

- Find people and groups worth supporting.
- Identify risks in your dependency graph.
- Analyze all the contributors you implicitly trust by building their software. This might have both a sobering and a humbling effect.

Sample output when run on itself: [`publishers`](https://gist.github.com/Shnatsel/3b7f7d331d944bb75b2f363d4b5fb43d), [`crates`](https://gist.github.com/Shnatsel/dc0ec81f6ad392b8967e8d3f2b1f5f80), [`json`](https://gist.github.com/Shnatsel/511ad1f87528c450157ef9ad09984745).

## Usage

To install this tool, please run the following command:

```shell
cargo install cargo-supply-chain
```

Then run it with:

```shell
cargo supply-chain publishers
```

By default the supply chain is listed for **all targets** and **default features only**.

You can alter this behavior by passing `--target=…` to list dependencies for a specific target.
You can use `--all-features`, `--no-default-features`, and `--features=…` to control feature selection.
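
These flags combine the same way they do with Cargo itself. For example (the target triple and feature name below are illustrative, not taken from this project):

```shell
# List publishers for dependencies of a specific target, with all features enabled
cargo supply-chain publishers --target=x86_64-unknown-linux-gnu --all-features

# Inspect only the crates pulled in by an explicit feature selection
cargo supply-chain crates --no-default-features --features=serde
```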

Here's a list of subcommands:

```none
Gather author, contributor and publisher data on crates in your dependency graph

Usage: COMMAND [ARG]…

Available options:
    -h, --help      Prints help information
    -v, --version   Prints version information

Available commands:
    publishers  List all crates.io publishers in the dependency graph
    crates      List all crates in dependency graph and crates.io publishers for each
    json        Like 'crates', but in JSON and with more fields for each publisher
    update      Download the latest daily dump from crates.io to speed up other commands

Most commands also accept flags controlling the features, targets, etc.
See 'cargo supply-chain <command> --help' for more information on a specific command.
```
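
A typical workflow is to refresh the local crates.io data dump once, then run queries against it. A sketch (the output filename is illustrative):

```shell
# Download the daily crates.io data dump so later queries avoid the live API
cargo supply-chain update

# Emit machine-readable output, reusing a cache up to a week old
cargo supply-chain json --cache-max-age=7d > supply-chain.json

# Print the JSON schema describing that output
cargo supply-chain json --print-schema
```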

## License

Triple licensed under any of Apache-2.0, MIT, or zlib terms.


================================================
FILE: fixtures/optional_non_dev_dep/Cargo.toml
================================================
[package]
name = "optional_non_dev_dep"
version = "0.1.0"
edition = "2024"
publish = false

[dependencies]
libz-rs-sys = { version = "=0.5.5", optional = true }

[dev-dependencies]
libz-rs-sys = "=0.5.5"

[workspace]


================================================
FILE: fixtures/optional_non_dev_dep/src/lib.rs
================================================
pub fn add(left: u64, right: u64) -> u64 {
    left + right
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn it_works() {
        let result = add(2, 2);
        assert_eq!(result, 4);
    }
}


================================================
FILE: src/api_client.rs
================================================
use std::time::{Duration, Instant};

pub struct RateLimitedClient {
    last_request_time: Option<Instant>,
    agent: ureq::Agent,
}

impl Default for RateLimitedClient {
    fn default() -> Self {
        RateLimitedClient {
            last_request_time: None,
            agent: ureq::agent(),
        }
    }
}

impl RateLimitedClient {
    pub fn new() -> Self {
        RateLimitedClient::default()
    }

    pub fn get(&mut self, url: &str) -> ureq::Request {
        self.wait_to_honor_rate_limit();
        self.agent.get(url).set(
            "User-Agent",
            "cargo supply-chain (https://github.com/rust-secure-code/cargo-supply-chain)",
        )
    }

    /// Waits until at least 1 second has elapsed since last request,
    /// as per <https://crates.io/data-access>
    fn wait_to_honor_rate_limit(&mut self) {
        if let Some(prev_req_time) = self.last_request_time {
            let next_req_time = prev_req_time + Duration::from_secs(1);
            if let Some(time_to_wait) = next_req_time.checked_duration_since(Instant::now()) {
                std::thread::sleep(time_to_wait);
            }
        }
        self.last_request_time = Some(Instant::now());
    }
}


================================================
FILE: src/cli.rs
================================================
use bpaf::*;
use std::{path::PathBuf, time::Duration};

/// Arguments to be passed to `cargo metadata`
#[derive(Clone, Debug, Bpaf)]
#[bpaf(generate(meta_args))]
pub struct MetadataArgs {
    // `all_features` and `no_default_features` are not mutually exclusive in `cargo metadata`,
    // in the sense that it will not error out when encountering them; it just follows `all_features`
    /// Activate all available features
    pub all_features: bool,

    /// Do not activate the `default` feature
    pub no_default_features: bool,

    /// Ignore dev-only dependencies
    pub no_dev: bool,

    // This is a `String` because we don't parse the value, just pass it on to `cargo metadata` blindly
    /// Space or comma separated list of features to activate
    #[bpaf(argument("FEATURES"))]
    pub features: Option<String>,

    /// Only include dependencies matching the given target-triple
    #[bpaf(argument("TRIPLE"))]
    pub target: Option<String>,

    /// Path to Cargo.toml
    #[bpaf(argument("PATH"))]
    pub manifest_path: Option<PathBuf>,
}

/// Arguments for typical querying commands - crates, publishers, json
#[derive(Clone, Debug, Bpaf)]
#[bpaf(generate(args))]
pub(crate) struct QueryCommandArgs {
    #[bpaf(external)]
    pub cache_max_age: Duration,

    /// Make output more friendly towards tools such as `diff`
    #[bpaf(short, long)]
    pub diffable: bool,
}

#[derive(Clone, Debug, Bpaf)]
pub(crate) enum PrintJson {
    /// Print JSON schema and exit
    #[bpaf(long("print-schema"))]
    Schema,

    Info {
        #[bpaf(external)]
        args: QueryCommandArgs,
        #[bpaf(external)]
        meta_args: MetadataArgs,
    },
}

/// Gather author, contributor and publisher data on crates in your dependency graph
///
///
/// Most commands also accept flags controlling the features, targets, etc.
///  See 'cargo supply-chain <command> --help' for more information on a specific command.
#[derive(Clone, Debug, Bpaf)]
#[bpaf(options("supply-chain"), generate(args_parser), version)]
pub(crate) enum CliArgs {
    /// Lists all crates.io publishers in the dependency graph and owned crates for each
    ///
    ///
    /// If a local cache created by 'update' subcommand is present and up to date,
    /// it will be used. Otherwise live data will be fetched from the crates.io API.
    #[bpaf(command)]
    Publishers {
        #[bpaf(external)]
        args: QueryCommandArgs,
        #[bpaf(external)]
        meta_args: MetadataArgs,
    },

    /// List all crates in dependency graph and crates.io publishers for each
    ///
    ///
    /// If a local cache created by 'update' subcommand is present and up to date,
    /// it will be used. Otherwise live data will be fetched from the crates.io API.
    #[bpaf(command)]
    Crates {
        #[bpaf(external)]
        args: QueryCommandArgs,
        #[bpaf(external)]
        meta_args: MetadataArgs,
    },

    /// Detailed info on publishers of all crates in the dependency graph, in JSON
    ///
    /// The JSON schema is also available, use --print-schema to get it.
    ///
    /// If a local cache created by 'update' subcommand is present and up to date,
    /// it will be used. Otherwise live data will be fetched from the crates.io API.
    #[bpaf(command)]
    Json(#[bpaf(external(print_json))] PrintJson),

    /// Download the latest daily dump from crates.io to speed up other commands
    ///
    ///
    /// If the local cache is already younger than specified in '--cache-max-age' option,
    /// a newer version will not be downloaded.
    ///
    /// Note that this downloads the entire crates.io database, which is hundreds of Mb of data!
    /// If you are on a metered connection, you should not be running the 'update' subcommand.
    /// Instead, rely on requests to the live API - they are slower, but use much less data.
    #[bpaf(command)]
    Update {
        #[bpaf(external)]
        cache_max_age: Duration,
    },
}

fn cache_max_age() -> impl Parser<Duration> {
    long("cache-max-age")
        .help(
            "\
The cache will be considered valid while younger than specified.
The format is a human readable duration such as `1w` or `1d 6h`.
If not specified, the cache is considered valid for 48 hours.",
        )
        .argument::<String>("AGE")
        .parse(|text| humantime::parse_duration(&text))
        .fallback(Duration::from_secs(48 * 3600))
}

#[cfg(test)]
mod tests {
    use super::*;

    fn parse_args(args: &[&str]) -> Result<CliArgs, ParseFailure> {
        args_parser().run_inner(Args::from(args))
    }

    #[test]
    fn test_cache_max_age_parser() {
        let _ = parse_args(&["crates", "--cache-max-age", "7d"]).unwrap();
        let _ = parse_args(&["crates", "--cache-max-age=7d"]).unwrap();
        let _ = parse_args(&["crates", "--cache-max-age=1w"]).unwrap();
        let _ = parse_args(&["crates", "--cache-max-age=1m"]).unwrap();
        let _ = parse_args(&["crates", "--cache-max-age=1s"]).unwrap();
        // erroneous invocations that must be rejected
        assert!(parse_args(&["crates", "--cache-max-age"]).is_err());
        assert!(parse_args(&["crates", "--cache-max-age=5"]).is_err());
    }

    #[test]
    fn test_accepted_query_options() {
        for command in ["crates", "publishers", "json"] {
            let _ = args_parser().run_inner(&[command][..]).unwrap();
            let _ = args_parser().run_inner(&[command, "-d"][..]).unwrap();
            let _ = args_parser()
                .run_inner(&[command, "--diffable"][..])
                .unwrap();
            let _ = args_parser()
                .run_inner(&[command, "--cache-max-age=7d"][..])
                .unwrap();
            let _ = args_parser()
                .run_inner(&[command, "-d", "--cache-max-age=7d"][..])
                .unwrap();
            let _ = args_parser()
                .run_inner(&[command, "--diffable", "--cache-max-age=7d"][..])
                .unwrap();
        }
    }

    #[test]
    fn test_accepted_update_options() {
        let _ = args_parser().run_inner(Args::from(&["update"])).unwrap();
        let _ = parse_args(&["update", "--cache-max-age=7d"]).unwrap();
        // erroneous invocations that must be rejected
        assert!(parse_args(&["update", "-d"]).is_err());
        assert!(parse_args(&["update", "--diffable"]).is_err());
        assert!(parse_args(&["update", "-d", "--cache-max-age=7d"]).is_err());
        assert!(parse_args(&["update", "--diffable", "--cache-max-age=7d"]).is_err());
    }

    #[test]
    fn test_json_schema_option() {
        let _ = parse_args(&["json", "--print-schema"]).unwrap();
        // erroneous invocations that must be rejected
        assert!(parse_args(&["json", "--print-schema", "-d"]).is_err());
        assert!(parse_args(&["json", "--print-schema", "--diffable"]).is_err());
        assert!(parse_args(&["json", "--print-schema", "--cache-max-age=7d"]).is_err());
        assert!(
            parse_args(&["json", "--print-schema", "--diffable", "--cache-max-age=7d"]).is_err()
        );
    }

    #[test]
    fn test_invocation_through_cargo() {
        let _ = parse_args(&["supply-chain", "update"]).unwrap();
        let _ = parse_args(&["supply-chain", "publishers", "-d"]).unwrap();
        let _ = parse_args(&["supply-chain", "crates", "-d", "--cache-max-age=5h"]).unwrap();
        let _ = parse_args(&["supply-chain", "json", "--diffable"]).unwrap();
        let _ = parse_args(&["supply-chain", "json", "--print-schema"]).unwrap();
        // erroneous invocations to be rejected
        assert!(parse_args(&["supply-chain", "supply-chain", "json", "--print-schema"]).is_err());
        assert!(parse_args(&["supply-chain", "supply-chain", "crates", "-d"]).is_err());
    }
}


================================================
FILE: src/common.rs
================================================
use anyhow::bail;
use cargo_metadata::{
    CargoOpt::AllFeatures, CargoOpt::NoDefaultFeatures, DependencyKind, Metadata, MetadataCommand,
    NodeDep, Package, PackageId,
};
use std::collections::{HashMap, HashSet};

pub use crate::cli::MetadataArgs;

#[derive(Debug, Copy, Clone, Eq, PartialEq, Hash)]
#[cfg_attr(test, derive(serde::Deserialize, serde::Serialize))]
pub enum PkgSource {
    Local,
    CratesIo,
    Foreign,
}

#[derive(Debug, Clone)]
#[cfg_attr(test, derive(Eq, PartialEq, serde::Deserialize, serde::Serialize))]
pub struct SourcedPackage {
    pub source: PkgSource,
    pub package: Package,
}

fn metadata_command(args: MetadataArgs) -> MetadataCommand {
    let mut command = MetadataCommand::new();
    if args.all_features {
        command.features(AllFeatures);
    }
    if args.no_default_features {
        command.features(NoDefaultFeatures);
    }
    if let Some(path) = args.manifest_path {
        command.manifest_path(path);
    }
    let mut other_options = Vec::new();
    if let Some(target) = args.target {
        other_options.push(format!("--filter-platform={}", target));
    }
    // `cargo-metadata` crate assumes we have a Vec of features,
    // but we really didn't want to parse it ourselves, so we pass the argument directly
    if let Some(features) = args.features {
        other_options.push(format!("--features={}", features));
    }
    command.other_options(other_options);
    command
}

pub fn sourced_dependencies(
    metadata_args: MetadataArgs,
) -> Result<Vec<SourcedPackage>, anyhow::Error> {
    let no_dev = metadata_args.no_dev;
    let command = metadata_command(metadata_args);
    let meta = match command.exec() {
        Ok(v) => v,
        Err(cargo_metadata::Error::CargoMetadata { stderr: e }) => bail!(e),
        Err(err) => bail!("Failed to fetch crate metadata!\n  {}", err),
    };

    sourced_dependencies_from_metadata(meta, no_dev)
}

fn sourced_dependencies_from_metadata(
    meta: Metadata,
    no_dev: bool,
) -> Result<Vec<SourcedPackage>, anyhow::Error> {
    let mut how: HashMap<PackageId, PkgSource> = HashMap::new();
    let mut what: HashMap<PackageId, Package> = meta
        .packages
        .iter()
        .map(|package| (package.id.clone(), package.clone()))
        .collect();

    for pkg in &meta.packages {
        // Suppose every package is foreign, until proven otherwise.
        how.insert(pkg.id.clone(), PkgSource::Foreign);
    }

    // Find the crates.io dependencies.
    for pkg in &meta.packages {
        if let Some(source) = pkg.source.as_ref() {
            if source.is_crates_io() {
                how.insert(pkg.id.clone(), PkgSource::CratesIo);
            }
        }
    }

    for pkg in &meta.workspace_members {
        *how.get_mut(pkg).unwrap() = PkgSource::Local;
    }

    if no_dev {
        (how, what) = extract_non_dev_dependencies(&meta, &mut how, &mut what);
    }

    let dependencies: Vec<_> = how
        .iter()
        .map(|(id, kind)| {
            let dep = what.get(id).cloned().unwrap();
            SourcedPackage {
                source: *kind,
                package: dep,
            }
        })
        .collect();

    Ok(dependencies)
}

/// Start with the `PkgSource::Local` packages, then iteratively add their
/// non-dev-dependencies until no more packages can be added, and return the results.
///
/// This function uses the resolved dependency graph from `cargo metadata` to
/// determine which dependencies are actually used. It does _not_ use the declared
/// dependencies, which may include optional dependencies that aren't actually activated.
fn extract_non_dev_dependencies(
    meta: &Metadata,
    how: &mut HashMap<PackageId, PkgSource>,
    what: &mut HashMap<PackageId, Package>,
) -> (HashMap<PackageId, PkgSource>, HashMap<PackageId, Package>) {
    let mut how_new = HashMap::new();
    let mut what_new = HashMap::new();

    let Some(resolve) = &meta.resolve else {
        return (HashMap::new(), HashMap::new());
    };

    let node_deps: HashMap<&PackageId, &[NodeDep]> = resolve
        .nodes
        .iter()
        .map(|node| (&node.id, node.deps.as_slice()))
        .collect();

    let mut ids = how
        .iter()
        .filter_map(|(id, source)| {
            if matches!(source, PkgSource::Local) {
                Some(id.clone())
            } else {
                None
            }
        })
        .collect::<Vec<_>>();

    while !ids.is_empty() {
        let mut deps = HashSet::new();

        for id in ids.drain(..) {
            if let Some(node_deps) = node_deps.get(&id) {
                for dep in *node_deps {
                    if dep
                        .dep_kinds
                        .iter()
                        .any(|info| info.kind != DependencyKind::Development)
                    {
                        deps.insert(&dep.pkg);
                    }
                }
            }

            how_new.insert(id.clone(), how.remove(&id).unwrap());
            what_new.insert(id.clone(), what.remove(&id).unwrap());
        }

        for pkg_id in what.keys() {
            if deps.contains(pkg_id) {
                ids.push(pkg_id.clone());
            }
        }
    }

    (how_new, what_new)
}

pub fn crate_names_from_source(crates: &[SourcedPackage], source: PkgSource) -> Vec<String> {
    let mut filtered_crate_names: Vec<String> = crates
        .iter()
        .filter(|p| p.source == source)
        .map(|p| p.package.name.to_string())
        .collect();
    // Collecting into a HashSet is less user-friendly because order varies between runs
    filtered_crate_names.sort_unstable();
    filtered_crate_names.dedup();
    filtered_crate_names
}

pub fn complain_about_non_crates_io_crates(dependencies: &[SourcedPackage]) {
    {
        // scoped block, to avoid accidentally referencing local crates when working with foreign ones
        let local_crate_names = crate_names_from_source(dependencies, PkgSource::Local);
        if !local_crate_names.is_empty() {
            eprintln!(
                "\nThe following crates will be ignored because they come from a local directory:"
            );
            for crate_name in &local_crate_names {
                eprintln!(" - {}", crate_name);
            }
        }
    }

    {
        let foreign_crate_names = crate_names_from_source(dependencies, PkgSource::Foreign);
        if !foreign_crate_names.is_empty() {
            eprintln!("\nCannot audit the following crates because they are not from crates.io:");
            for crate_name in &foreign_crate_names {
                eprintln!(" - {}", crate_name);
            }
        }
    }
}

pub fn comma_separated_list(list: &[String]) -> String {
    list.join(", ")
}

#[cfg(test)]
mod tests {
    use super::sourced_dependencies_from_metadata;
    use cargo_metadata::MetadataCommand;
    #[test]
    fn optional_dependency_excluded_when_not_activated() {
        let metadata = MetadataCommand::new()
            .current_dir("fixtures/optional_non_dev_dep")
            .exec()
            .unwrap();

        let deps = sourced_dependencies_from_metadata(metadata.clone(), false).unwrap();
        assert!(deps.iter().any(|dep| dep.package.name == "libz-rs-sys"));

        let deps_no_dev = sourced_dependencies_from_metadata(metadata, true).unwrap();
        assert!(!deps_no_dev
            .iter()
            .any(|dep| dep.package.name == "libz-rs-sys"));
    }
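
    // A hedged example added for illustration, not part of the original file:
    // a small unit test pinning down the output format of `comma_separated_list`.
    #[test]
    fn comma_separated_list_formats_entries() {
        use super::comma_separated_list;
        assert_eq!(comma_separated_list(&[]), "");
        assert_eq!(comma_separated_list(&["serde".to_string()]), "serde");
        assert_eq!(
            comma_separated_list(&["serde".to_string(), "anyhow".to_string()]),
            "serde, anyhow"
        );
    }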
}


================================================
FILE: src/crates_cache.rs
================================================
use crate::api_client::RateLimitedClient;
use crate::publishers::{PublisherData, PublisherKind};
use dirs;
use flate2::read::GzDecoder;
use serde::{Deserialize, Serialize};
use std::{
    collections::{BTreeSet, HashMap},
    fs,
    io::{self, ErrorKind},
    path::PathBuf,
    time::Duration,
    time::SystemTimeError,
};

pub struct CratesCache {
    cache_dir: Option<CacheDir>,
    metadata: Option<MetadataStored>,
    crates: Option<HashMap<String, Crate>>,
    crate_owners: Option<HashMap<u64, Vec<CrateOwner>>>,
    users: Option<HashMap<u64, User>>,
    teams: Option<HashMap<u64, Team>>,
    versions: Option<HashMap<(u64, String), Publisher>>,
}

pub enum CacheState {
    Fresh,
    Expired,
    Unknown,
}

pub enum DownloadState {
    /// The tag still matched and the resource was not stale.
    Fresh,
    /// There was a newer resource.
    Expired,
    /// We forced the download of an update.
    Stale,
}

struct CacheDir(PathBuf);

#[derive(Clone, Deserialize, Serialize)]
struct Metadata {
    #[serde(with = "humantime_serde")]
    timestamp: std::time::SystemTime,
}

#[derive(Clone, Deserialize, Serialize)]
struct MetadataStored {
    #[serde(with = "humantime_serde")]
    timestamp: std::time::SystemTime,
    #[serde(default)]
    etag: Option<String>,
}

#[derive(Clone, Deserialize, Serialize)]
struct Crate {
    name: String,
    id: u64,
    repository: Option<String>,
}

#[derive(Clone, Deserialize, Serialize)]
struct CrateOwner {
    crate_id: u64,
    owner_id: u64,
    owner_kind: i32,
}

#[derive(Clone, Deserialize, Serialize)]
struct Publisher {
    crate_id: u64,
    published_by: u64,
}

#[derive(Clone, Deserialize, Serialize)]
struct Team {
    id: u64,
    avatar: Option<String>,
    login: String,
    name: Option<String>,
}

#[derive(Clone, Deserialize, Serialize)]
struct User {
    id: u64,
    gh_avatar: Option<String>,
    gh_id: Option<String>,
    gh_login: String,
    name: Option<String>,
}

impl CratesCache {
    const METADATA_FS: &'static str = "metadata.json";
    const CRATES_FS: &'static str = "crates.json";
    const CRATE_OWNERS_FS: &'static str = "crate_owners.json";
    const USERS_FS: &'static str = "users.json";
    const TEAMS_FS: &'static str = "teams.json";
    const VERSIONS_FS: &'static str = "versions.json";

    const DUMP_URL: &'static str = "https://static.crates.io/db-dump.tar.gz";

    /// Open a crates cache.
    pub fn new() -> Self {
        CratesCache {
            cache_dir: Self::cache_dir().map(CacheDir),
            metadata: None,
            crates: None,
            crate_owners: None,
            users: None,
            teams: None,
            versions: None,
        }
    }

    fn cache_dir() -> Option<PathBuf> {
        dirs::cache_dir()
    }

    /// Re-download the list from the data dumps.
    pub fn download(
        &mut self,
        client: &mut RateLimitedClient,
        max_age: Duration,
    ) -> Result<DownloadState, io::Error> {
        let bar = indicatif::ProgressBar::new(!0)
            .with_prefix("Downloading")
            .with_style(
                indicatif::ProgressStyle::default_spinner()
                    .template("{prefix:>12.bright.cyan} {spinner} {msg:.cyan}")
                    .unwrap(),
            )
            .with_message("preparing");

        let remembered_etag;
        let response = {
            let mut request = client.get(Self::DUMP_URL);
            if let Some(meta) = self.load_metadata() {
                remembered_etag = meta.etag.clone();
                // See if we can consider the resource not-yet-stale.
                if meta.validate(max_age) == Some(true) {
                    if let Some(etag) = meta.etag.as_ref() {
                        request = request.set("if-none-match", etag);
                    }
                }
            } else {
                remembered_etag = None;
            }
            request.call()
        }
        .map_err(io::Error::other)?;

        // Not modified.
        if response.status() == 304 {
            bar.finish_and_clear();
            return Ok(DownloadState::Fresh);
        }

        if let Some(length) = response
            .header("content-length")
            .and_then(|l| l.parse().ok())
        {
            bar.set_style(
                indicatif::ProgressStyle::default_bar()
                    .template("{prefix:>12.bright.cyan} [{bar:27}] {bytes:>9}/{total_bytes:9}  {bytes_per_sec}  ETA {eta:4} - {msg:.cyan}").unwrap()
                    .progress_chars("=> "));
            bar.set_length(length);
        } else {
            bar.println("Length unspecified, expect at least 250MiB");
            bar.set_style(indicatif::ProgressStyle::default_spinner().template(
                "{prefix:>12.bright.cyan} {spinner} {bytes:>9} {bytes_per_sec} - {msg:.cyan}",
            ).unwrap());
        }

        let etag = response.header("etag").map(String::from);
        let reader = bar.wrap_read(response.into_reader());
        let ungzip = GzDecoder::new(reader);
        let mut archive = tar::Archive::new(ungzip);

        let cache_dir = CratesCache::cache_dir().ok_or(ErrorKind::NotFound)?;
        let mut cache_updater = CacheUpdater::new(cache_dir)?;
        let required_files = [
            Self::CRATE_OWNERS_FS,
            Self::CRATES_FS,
            Self::USERS_FS,
            Self::TEAMS_FS,
            Self::METADATA_FS,
        ]
        .iter()
        .map(ToString::to_string)
        .collect::<BTreeSet<_>>();

        for entry in (archive.entries()?).flatten() {
            if let Ok(path) = entry.path() {
                if let Some(name) = path.file_name().and_then(std::ffi::OsStr::to_str) {
                    bar.set_message(name.to_string());
                }
            }
            if entry_path_ends_with(&entry, "crate_owners.csv") {
                let owners: Vec<CrateOwner> = read_csv_data(entry)?;
                cache_updater.store_multi_map(
                    &mut self.crate_owners,
                    Self::CRATE_OWNERS_FS,
                    owners.as_slice(),
                    &|owner| owner.crate_id,
                )?;
            } else if entry_path_ends_with(&entry, "crates.csv") {
                let crates: Vec<Crate> = read_csv_data(entry)?;
                cache_updater.store_map(
                    &mut self.crates,
                    Self::CRATES_FS,
                    crates.as_slice(),
                    &|crate_| crate_.name.clone(),
                )?;
            } else if entry_path_ends_with(&entry, "users.csv") {
                let users: Vec<User> = read_csv_data(entry)?;
                cache_updater.store_map(
                    &mut self.users,
                    Self::USERS_FS,
                    users.as_slice(),
                    &|user| user.id,
                )?;
            } else if entry_path_ends_with(&entry, "teams.csv") {
                let teams: Vec<Team> = read_csv_data(entry)?;
                cache_updater.store_map(
                    &mut self.teams,
                    Self::TEAMS_FS,
                    teams.as_slice(),
                    &|team| team.id,
                )?;
            } else if entry_path_ends_with(&entry, "metadata.json") {
                let meta: Metadata = serde_json::from_reader(entry)?;
                cache_updater.store(
                    &mut self.metadata,
                    Self::METADATA_FS,
                    MetadataStored {
                        timestamp: meta.timestamp,
                        etag: etag.clone(),
                    },
                )?;
            } else {
                // This was not a file with a filename we actually use.
                // Check if we've obtained all the files we need.
                // If yes, we can end the download early.
                // This saves hundreds of megabytes of traffic.
                if required_files.is_subset(&cache_updater.staged_files) {
                    break;
                }
            }
        }
        // Now that we've successfully downloaded and stored everything,
        // replace the old cache contents with the new one.
        cache_updater.commit()?;

        // If we get here, we either had no etag, the etag mismatched, or we forced
        // a download of stale data. Detect the last case: a matching etag means the
        // crates.io daily dump has not been updated since our cached copy.
        if remembered_etag == etag {
            Ok(DownloadState::Stale)
        } else {
            Ok(DownloadState::Expired)
        }
    }

    pub fn expire(&mut self, max_age: Duration) -> CacheState {
        match self.validate(max_age) {
            // Still fresh.
            Some(true) => CacheState::Fresh,
            // There was no valid meta data. Consider expired for safety.
            None => {
                self.cache_dir = None;
                CacheState::Unknown
            }
            Some(false) => {
                self.cache_dir = None;
                CacheState::Expired
            }
        }
    }

    pub fn age(&mut self) -> Option<Duration> {
        match self.load_metadata() {
            Some(meta) => meta.age().ok(),
            None => None,
        }
    }

    pub fn publisher_users(&mut self, crate_name: &str) -> Option<Vec<PublisherData>> {
        let id = self.load_crates()?.get(crate_name)?.id;
        let owners = self.load_crate_owners()?.get(&id)?.clone();
        let users = self.load_users()?;
        let publisher = owners
            .into_iter()
            .filter(|owner| owner.owner_kind == 0)
            .filter_map(|owner: CrateOwner| {
                let user = users.get(&owner.owner_id)?;
                Some(PublisherData {
                    id: user.id,
                    avatar: user.gh_avatar.clone(),
                    login: user.gh_login.clone(),
                    name: user.name.clone(),
                    kind: PublisherKind::user,
                })
            })
            .collect();
        Some(publisher)
    }

    pub fn publisher_teams(&mut self, crate_name: &str) -> Option<Vec<PublisherData>> {
        let id = self.load_crates()?.get(crate_name)?.id;
        let owners = self.load_crate_owners()?.get(&id)?.clone();
        let teams = self.load_teams()?;
        let publisher = owners
            .into_iter()
            .filter(|owner| owner.owner_kind == 1)
            .filter_map(|owner: CrateOwner| {
                let team = teams.get(&owner.owner_id)?;
                Some(PublisherData {
                    id: team.id,
                    avatar: team.avatar.clone(),
                    login: team.login.clone(),
                    name: team.name.clone(),
                    kind: PublisherKind::team,
                })
            })
            .collect();
        Some(publisher)
    }

    fn validate(&mut self, max_age: Duration) -> Option<bool> {
        let meta = self.load_metadata()?;
        meta.validate(max_age)
    }

    fn load_metadata(&mut self) -> Option<&MetadataStored> {
        self.cache_dir
            .as_ref()?
            .load_cached(&mut self.metadata, Self::METADATA_FS)
            .ok()
    }

    fn load_crates(&mut self) -> Option<&HashMap<String, Crate>> {
        self.cache_dir
            .as_ref()?
            .load_cached(&mut self.crates, Self::CRATES_FS)
            .ok()
    }

    fn load_crate_owners(&mut self) -> Option<&HashMap<u64, Vec<CrateOwner>>> {
        self.cache_dir
            .as_ref()?
            .load_cached(&mut self.crate_owners, Self::CRATE_OWNERS_FS)
            .ok()
    }

    fn load_users(&mut self) -> Option<&HashMap<u64, User>> {
        self.cache_dir
            .as_ref()?
            .load_cached(&mut self.users, Self::USERS_FS)
            .ok()
    }

    fn load_teams(&mut self) -> Option<&HashMap<u64, Team>> {
        self.cache_dir
            .as_ref()?
            .load_cached(&mut self.teams, Self::TEAMS_FS)
            .ok()
    }

    fn load_versions(&mut self) -> Option<&HashMap<(u64, String), Publisher>> {
        self.cache_dir
            .as_ref()?
            .load_cached(&mut self.versions, Self::VERSIONS_FS)
            .ok()
    }
}

fn entry_path_ends_with<R: io::Read>(entry: &tar::Entry<R>, needle: &str) -> bool {
    let Ok(path) = entry.path() else {
        return false;
    };
    let Some(file_name) = path.file_name() else {
        return false;
    };
    file_name == needle
}

fn read_csv_data<T: serde::de::DeserializeOwned>(
    from: impl io::Read,
) -> Result<Vec<T>, csv::Error> {
    let mut reader = csv::ReaderBuilder::new()
        .delimiter(b',')
        .double_quote(true)
        .quoting(true)
        .from_reader(from);
    reader.deserialize().collect()
}

impl MetadataStored {
    fn validate(&self, max_age: Duration) -> Option<bool> {
        match self.age() {
            Ok(duration) => Some(duration < max_age),
            Err(_) => None,
        }
    }

    pub fn age(&self) -> Result<Duration, SystemTimeError> {
        self.timestamp.elapsed()
    }
}

impl CacheDir {
    fn load_cached<'cache, T>(
        &self,
        cache: &'cache mut Option<T>,
        file: &str,
    ) -> Result<&'cache T, io::Error>
    where
        T: serde::de::DeserializeOwned,
    {
        match cache {
            Some(datum) => Ok(datum),
            None => {
                let file = fs::File::open(self.0.join(file))?;
                let reader = io::BufReader::new(file);
                // A malformed cache file is reported as an error rather than panicking
                let crates: T = serde_json::from_reader(reader)?;
                Ok(cache.get_or_insert(crates))
            }
        }
    }
}

/// Implements a two-phase transactional update mechanism:
/// you can store data, but it will not overwrite previous data until you call `commit()`
struct CacheUpdater {
    dir: PathBuf,
    staged_files: BTreeSet<String>,
}

impl CacheUpdater {
    /// Creates the cache directory if it doesn't exist.
    /// Returns an error if the path exists but is not a directory, or if creation fails.
    fn new(dir: PathBuf) -> Result<Self, io::Error> {
        if !dir.exists() {
            fs::create_dir_all(&dir)?;
        }

        if !dir.is_dir() {
            // Well. We certainly don't want to delete anything.
            return Err(io::ErrorKind::AlreadyExists.into());
        }

        Ok(Self {
            dir,
            staged_files: BTreeSet::new(),
        })
    }

    /// Commits to disk any changes that you have staged via the `store()` function.
    fn commit(&mut self) -> io::Result<()> {
        let mut uncommitted_files = std::mem::take(&mut self.staged_files);
        let metadata_file = uncommitted_files.take(CratesCache::METADATA_FS);
        for file in uncommitted_files {
            let source = self.dir.join(&file).with_extension("part");
            let destination = self.dir.join(&file);
            fs::rename(source, destination)?;
        }
        // metadata_file is special since it contains the timestamp for the cache.
        // We will only commit it and update the timestamp if updating everything else succeeds.
        // Otherwise it would be possible to create a partially updated cache that's considered fresh.
        if let Some(file) = metadata_file {
            let source = self.dir.join(&file).with_extension("part");
            let destination = self.dir.join(&file);
            fs::rename(source, destination)?;
        }
        Ok(())
    }

    /// Does not overwrite existing data until `commit()` is called.
    /// If you do not call `commit()` after this, the on-disk cache will not be actually updated!
    fn store<T>(&mut self, cache: &mut Option<T>, file: &str, value: T) -> Result<(), io::Error>
    where
        T: Serialize,
    {
        *cache = None;
        let value = cache.get_or_insert(value);

        self.staged_files.insert(file.to_owned());
        let out_path = self.dir.join(file).with_extension("part");
        let out_file = fs::File::create(out_path)?;
        let out = io::BufWriter::new(out_file);
        serde_json::to_writer(out, value)?;
        Ok(())
    }

    fn store_map<T, K>(
        &mut self,
        cache: &mut Option<HashMap<K, T>>,
        file: &str,
        entries: &[T],
        key_fn: &dyn Fn(&T) -> K,
    ) -> Result<(), io::Error>
    where
        T: Serialize + Clone,
        K: Serialize + Eq + std::hash::Hash,
    {
        let hashed: HashMap<K, _> = entries
            .iter()
            .map(|entry| (key_fn(entry), entry.clone()))
            .collect();
        self.store(cache, file, hashed)
    }

    fn store_multi_map<T, K>(
        &mut self,
        cache: &mut Option<HashMap<K, Vec<T>>>,
        file: &str,
        entries: &[T],
        key_fn: &dyn Fn(&T) -> K,
    ) -> Result<(), io::Error>
    where
        T: Serialize + Clone,
        K: Serialize + Eq + std::hash::Hash,
    {
        let mut hashed: HashMap<K, _> = HashMap::new();
        for entry in entries.iter() {
            let key = key_fn(entry);
            hashed
                .entry(key)
                .or_insert_with(Vec::new)
                .push(entry.clone());
        }
        self.store(cache, file, hashed)
    }
}
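
// A hedged example, not part of the original file: a unit test sketching how
// `MetadataStored::validate` judges cache freshness relative to `max_age`.
#[cfg(test)]
mod tests {
    use super::MetadataStored;
    use std::time::{Duration, SystemTime};

    #[test]
    fn metadata_freshness_is_relative_to_max_age() {
        let meta = MetadataStored {
            timestamp: SystemTime::now() - Duration::from_secs(60),
            etag: None,
        };
        // One minute old: fresh within a one-hour budget...
        assert_eq!(meta.validate(Duration::from_secs(3600)), Some(true));
        // ...but expired when only one second is allowed.
        assert_eq!(meta.validate(Duration::from_secs(1)), Some(false));
    }
}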


================================================
FILE: src/main.rs
================================================
//! Gather author, contributor, and publisher data on crates in your dependency graph.
//!
//! Some use cases:
//!
//! * Find people and groups worth supporting.
//! * Analyze all the contributors you implicitly trust by building their
//!   software. This might have both a sobering and a humbling effect.
//! * Identify risks in your dependency graph.

#![forbid(unsafe_code)]

mod api_client;
mod cli;
mod common;
mod crates_cache;
mod publishers;
mod subcommands;

use cli::CliArgs;
use common::MetadataArgs;

fn main() -> Result<(), anyhow::Error> {
    let args = cli::args_parser().fallback_to_usage().run();
    dispatch_command(args)
}

fn dispatch_command(args: CliArgs) -> Result<(), anyhow::Error> {
    match args {
        CliArgs::Publishers { args, meta_args } => {
            subcommands::publishers(meta_args, args.diffable, args.cache_max_age)?;
        }
        CliArgs::Crates { args, meta_args } => {
            subcommands::crates(meta_args, args.diffable, args.cache_max_age)?;
        }
        CliArgs::Update { cache_max_age } => subcommands::update(cache_max_age)?,
        CliArgs::Json(json) => match json {
            cli::PrintJson::Schema => subcommands::print_schema()?,
            cli::PrintJson::Info { args, meta_args } => {
                subcommands::json(meta_args, args.diffable, args.cache_max_age)?;
            }
        },
    }

    Ok(())
}


================================================
FILE: src/publishers.rs
================================================
use crate::api_client::RateLimitedClient;
use crate::crates_cache::{CacheState, CratesCache};
use serde::{Deserialize, Serialize};
use std::{
    collections::BTreeMap,
    io::{self},
    time::Duration,
};

#[cfg(test)]
use schemars::JsonSchema;

use crate::common::{crate_names_from_source, PkgSource, SourcedPackage};

#[derive(Deserialize)]
struct UsersResponse {
    users: Vec<PublisherData>,
}

#[derive(Deserialize)]
struct TeamsResponse {
    teams: Vec<PublisherData>,
}

/// Data about a single publisher received from a crates.io API endpoint
#[cfg_attr(test, derive(JsonSchema))]
#[derive(Serialize, Deserialize, Debug, Clone)]
pub struct PublisherData {
    pub id: u64,
    pub login: String,
    pub kind: PublisherKind,
    // URL is disabled because it's present in API responses but not in DB dumps,
    // so the output would be inconsistent depending on the data source
    //pub url: Option<String>,
    /// Display name. It is NOT guaranteed to be unique!
    pub name: Option<String>,
    /// Avatar image URL
    pub avatar: Option<String>,
}

impl PartialEq for PublisherData {
    fn eq(&self, other: &Self) -> bool {
        self.id == other.id
    }
}

impl Eq for PublisherData {
    // Eq holds for PublisherData because equality is based solely on the u64 `id`,
    // and Eq holds for u64
    fn assert_receiver_is_total_eq(&self) {}
}

impl PartialOrd for PublisherData {
    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
        Some(self.id.cmp(&other.id))
    }
}

impl Ord for PublisherData {
    fn cmp(&self, other: &Self) -> std::cmp::Ordering {
        self.id.cmp(&other.id)
    }
}

#[cfg_attr(test, derive(JsonSchema))]
#[derive(Serialize, Deserialize, Debug, Copy, Clone, Ord, PartialOrd, Eq, PartialEq)]
#[allow(non_camel_case_types)]
pub enum PublisherKind {
    team,
    user,
}

pub fn publisher_users(
    client: &mut RateLimitedClient,
    crate_name: &str,
) -> Result<Vec<PublisherData>, io::Error> {
    let url = format!("https://crates.io/api/v1/crates/{}/owner_user", crate_name);
    let resp = get_with_retry(&url, client, 3)?;
    let data: UsersResponse = resp.into_json()?;
    Ok(data.users)
}

pub fn publisher_teams(
    client: &mut RateLimitedClient,
    crate_name: &str,
) -> Result<Vec<PublisherData>, io::Error> {
    let url = format!("https://crates.io/api/v1/crates/{}/owner_team", crate_name);
    let resp = get_with_retry(&url, client, 3)?;
    let data: TeamsResponse = resp.into_json()?;
    Ok(data.teams)
}

fn get_with_retry(
    url: &str,
    client: &mut RateLimitedClient,
    attempts: u8,
) -> Result<ureq::Response, io::Error> {
    let mut resp = client.get(url).call().map_err(io::Error::other)?;

    let mut count = 1;
    let mut wait = 5;
    while resp.status() != 200 && count <= attempts {
        eprintln!(
            "Failed retrieving {:?}, trying again in {} seconds, attempt {}/{}",
            url, wait, count, attempts
        );
        std::thread::sleep(std::time::Duration::from_secs(wait));

        resp = client.get(url).call().map_err(io::Error::other)?;

        count += 1;
        wait *= 3;
    }

    Ok(resp)
}

pub fn fetch_owners_of_crates(
    dependencies: &[SourcedPackage],
    max_age: Duration,
) -> Result<
    (
        BTreeMap<String, Vec<PublisherData>>,
        BTreeMap<String, Vec<PublisherData>>,
    ),
    io::Error,
> {
    let crates_io_names = crate_names_from_source(dependencies, PkgSource::CratesIo);
    let mut client = RateLimitedClient::new();
    let mut cached = CratesCache::new();
    let using_cache = match cached.expire(max_age) {
        CacheState::Fresh => true,
        CacheState::Expired => {
            eprintln!(
                "\nIgnoring expired cache, older than {}.",
                // we use humantime rather than indicatif because we take humantime input
                // and here we simply repeat it back to the user
                humantime::format_duration(max_age)
            );
            eprintln!("  Run `cargo supply-chain update` to update it.");
            false
        }
        CacheState::Unknown => {
            eprintln!("\nThe `crates.io` cache was not found or it is invalid.");
            eprintln!("  Run `cargo supply-chain update` to generate it.");
            false
        }
    };
    let mut users: BTreeMap<String, Vec<PublisherData>> = BTreeMap::new();
    let mut teams: BTreeMap<String, Vec<PublisherData>> = BTreeMap::new();

    if using_cache {
        let age = cached.age().unwrap();
        eprintln!(
            "\nUsing cached data. Cache age: {}",
            indicatif::HumanDuration(age)
        );
    } else {
        eprintln!("\nFetching publisher info from crates.io");
        eprintln!("This will take roughly 2 seconds per crate due to API rate limits");
    }

    let bar = indicatif::ProgressBar::new(crates_io_names.len() as u64)
        .with_prefix("Preparing")
        .with_style(
            indicatif::ProgressStyle::default_bar()
                .template("{prefix:>12.bright.cyan} [{bar:27}] {pos:>4}/{len:4} ETA {eta:3} - {msg:.cyan}")
                .unwrap()
                .progress_chars("=> "),
        );

    for (i, crate_name) in crates_io_names.iter().enumerate() {
        bar.set_message(crate_name.clone());
        bar.set_position((i + 1) as u64);
        let cached_users = cached.publisher_users(crate_name);
        let cached_teams = cached.publisher_teams(crate_name);
        if let (Some(pub_users), Some(pub_teams)) = (cached_users, cached_teams) {
            bar.set_prefix("Loading cache");
            users.insert(crate_name.clone(), pub_users);
            teams.insert(crate_name.clone(), pub_teams);
        } else {
            // Handle crates not found in the cache by fetching live data for them
            bar.set_prefix("Downloading");
            let pusers = publisher_users(&mut client, crate_name)?;
            users.insert(crate_name.clone(), pusers);
            let pteams = publisher_teams(&mut client, crate_name)?;
            teams.insert(crate_name.clone(), pteams);
        }
    }
    Ok((users, teams))
}
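The retry loop in `get_with_retry` above starts with a 5-second wait and triples it after each failed attempt. As a minimal sketch (the real function also performs the HTTP call; `backoff_schedule` is a hypothetical helper introduced here purely for illustration), the wait times it would sleep through are:

```rust
// Illustration of the backoff schedule in `get_with_retry`:
// the first retry waits 5 seconds, and each subsequent wait triples.
fn backoff_schedule(attempts: u8) -> Vec<u64> {
    let mut wait = 5u64;
    let mut schedule = Vec::new();
    for _ in 0..attempts {
        schedule.push(wait);
        wait *= 3; // matches `wait *= 3` in the loop above
    }
    schedule
}
```

So with the default of three attempts, a persistently failing request sleeps 5, 15, then 45 seconds before giving up.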


================================================
FILE: src/subcommands/crates.rs
================================================
use crate::publishers::{fetch_owners_of_crates, PublisherKind};
use crate::{
    common::{comma_separated_list, complain_about_non_crates_io_crates, sourced_dependencies},
    MetadataArgs,
};

pub fn crates(
    metadata_args: MetadataArgs,
    diffable: bool,
    max_age: std::time::Duration,
) -> Result<(), anyhow::Error> {
    let dependencies = sourced_dependencies(metadata_args)?;
    complain_about_non_crates_io_crates(&dependencies);
    let (mut owners, publisher_teams) = fetch_owners_of_crates(&dependencies, max_age)?;

    for (crate_name, publishers) in publisher_teams {
        owners.entry(crate_name).or_default().extend(publishers);
    }

    let mut ordered_owners: Vec<_> = owners.into_iter().collect();
    if diffable {
        // Sort alphabetically by crate name
        ordered_owners.sort_unstable_by_key(|(name, _)| name.clone());
    } else {
        // Order by the number of owners, but put crates owned by teams first
        ordered_owners.sort_unstable_by_key(|(name, publishers)| {
            (
                !publishers.iter().any(|p| p.kind == PublisherKind::team), // contains at least one team
                usize::MAX - publishers.len(),
                name.clone(),
            )
        });
    }
    for (_, publishers) in &mut ordered_owners {
        // For each crate put teams first
        publishers.sort_unstable_by_key(|p| (p.kind, p.login.clone()));
    }

    if !diffable {
        println!(
            "\nDependency crates with the people and teams that can publish them to crates.io:\n"
        );
    }
    for (i, (crate_name, publishers)) in ordered_owners.iter().enumerate() {
        let pretty_publishers: Vec<String> = publishers
            .iter()
            .map(|p| match p.kind {
                PublisherKind::team => format!("team \"{}\"", p.login),
                PublisherKind::user => p.login.to_string(),
            })
            .collect();
        let publishers_list = comma_separated_list(&pretty_publishers);
        if diffable {
            println!("{}: {}", crate_name, publishers_list);
        } else {
            println!("{}. {}: {}", i + 1, crate_name, publishers_list);
        }
    }

    if !ordered_owners.is_empty() {
        eprintln!("\nNote: there may be outstanding publisher invitations. crates.io provides no way to list them.");
        eprintln!("See https://github.com/rust-lang/crates.io/issues/2868 for more info.");
    }
    Ok(())
}
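The sort key in `crates` uses `usize::MAX - publishers.len()` to get a descending order by count out of an ascending sort, with the name as a tie-breaker. A standalone sketch of that trick (with `order_by_count_desc` as a hypothetical name, and `(name, count)` pairs standing in for the real `(name, publishers)` tuples):

```rust
// Sort descending by count using an ascending sort key;
// ties fall back to ascending name order.
fn order_by_count_desc(mut items: Vec<(String, usize)>) -> Vec<(String, usize)> {
    items.sort_unstable_by_key(|(name, count)| (usize::MAX - count, name.clone()));
    items
}
```

Subtracting from `usize::MAX` avoids both a custom comparator and signed arithmetic: a larger count maps to a smaller key, so it sorts first.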


================================================
FILE: src/subcommands/json.rs
================================================
//! `json` subcommand is equivalent to `crates`,
//! but provides structured output and more info about each publisher.
use crate::publishers::{fetch_owners_of_crates, PublisherData};
use crate::{
    common::{crate_names_from_source, sourced_dependencies, PkgSource},
    MetadataArgs,
};
use serde::Serialize;
use std::collections::BTreeMap;

#[cfg(test)]
use schemars::JsonSchema;

#[cfg_attr(test, derive(JsonSchema))]
#[derive(Debug, Serialize, Default, Clone)]
pub struct StructuredOutput {
    not_audited: NotAudited,
    /// Maps crate names to info about the publishers of each crate
    crates_io_crates: BTreeMap<String, Vec<PublisherData>>,
}

#[cfg_attr(test, derive(JsonSchema))]
#[derive(Debug, Serialize, Default, Clone)]
pub struct NotAudited {
    /// Names of crates that are imported from a location in the local filesystem, not from a registry
    local_crates: Vec<String>,
    /// Names of crates that are neither from crates.io nor from a local filesystem
    foreign_crates: Vec<String>,
}

pub fn json(
    args: MetadataArgs,
    diffable: bool,
    max_age: std::time::Duration,
) -> Result<(), anyhow::Error> {
    let mut output = StructuredOutput::default();
    let dependencies = sourced_dependencies(args)?;
    // Report non-crates.io dependencies
    output.not_audited.local_crates = crate_names_from_source(&dependencies, PkgSource::Local);
    output.not_audited.foreign_crates = crate_names_from_source(&dependencies, PkgSource::Foreign);
    output.not_audited.local_crates.sort_unstable();
    output.not_audited.foreign_crates.sort_unstable();
    // Fetch list of owners and publishers
    let (mut owners, publisher_teams) = fetch_owners_of_crates(&dependencies, max_age)?;
    // Merge the two maps we received into one
    for (crate_name, publishers) in publisher_teams {
        owners.entry(crate_name).or_default().extend(publishers);
    }
    // Sort the vectors of publisher data. This helps when diffing the output,
    // but we do it unconditionally because it's cheap and helps users pull less hair when debugging.
    for list in owners.values_mut() {
        list.sort_unstable_by_key(|x| x.id);
    }
    output.crates_io_crates = owners;
    // Print the result to stdout
    let stdout = std::io::stdout();
    let handle = stdout.lock();
    if diffable {
        serde_json::to_writer_pretty(handle, &output)?;
    } else {
        serde_json::to_writer(handle, &output)?;
    }
    Ok(())
}
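The merge step in `json` folds the team map into the user map with `entry(..).or_default().extend(..)`. A minimal sketch of that pattern, using plain `String`s in place of `PublisherData` so it runs without the crate's types (`merge_maps` is a hypothetical name):

```rust
use std::collections::BTreeMap;

// Append every entry of `teams` to the matching `owners` entry,
// creating the entry if it does not exist yet.
fn merge_maps(
    mut owners: BTreeMap<String, Vec<String>>,
    teams: BTreeMap<String, Vec<String>>,
) -> BTreeMap<String, Vec<String>> {
    for (crate_name, publishers) in teams {
        owners.entry(crate_name).or_default().extend(publishers);
    }
    owners
}
```

`or_default()` inserts an empty `Vec` for crates that only have team publishers, so no key is lost in the merge.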


================================================
FILE: src/subcommands/json_schema.rs
================================================
//! The schema for the JSON subcommand output

use std::io::{Result, Write};

pub fn print_schema() -> Result<()> {
    writeln!(std::io::stdout(), "{}", JSON_SCHEMA)?;
    Ok(())
}

const JSON_SCHEMA: &str = r##"{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "StructuredOutput",
  "type": "object",
  "required": [
    "crates_io_crates",
    "not_audited"
  ],
  "properties": {
    "crates_io_crates": {
      "description": "Maps crate names to info about the publishers of each crate",
      "type": "object",
      "additionalProperties": {
        "type": "array",
        "items": {
          "$ref": "#/definitions/PublisherData"
        }
      }
    },
    "not_audited": {
      "$ref": "#/definitions/NotAudited"
    }
  },
  "definitions": {
    "NotAudited": {
      "type": "object",
      "required": [
        "foreign_crates",
        "local_crates"
      ],
      "properties": {
        "foreign_crates": {
          "description": "Names of crates that are neither from crates.io nor from a local filesystem",
          "type": "array",
          "items": {
            "type": "string"
          }
        },
        "local_crates": {
          "description": "Names of crates that are imported from a location in the local filesystem, not from a registry",
          "type": "array",
          "items": {
            "type": "string"
          }
        }
      }
    },
    "PublisherData": {
      "description": "Data about a single publisher received from a crates.io API endpoint",
      "type": "object",
      "required": [
        "id",
        "kind",
        "login"
      ],
      "properties": {
        "avatar": {
          "description": "Avatar image URL",
          "type": [
            "string",
            "null"
          ]
        },
        "id": {
          "type": "integer",
          "format": "uint64",
          "minimum": 0.0
        },
        "kind": {
          "$ref": "#/definitions/PublisherKind"
        },
        "login": {
          "type": "string"
        },
        "name": {
          "description": "Display name. It is NOT guaranteed to be unique!",
          "type": [
            "string",
            "null"
          ]
        }
      }
    },
    "PublisherKind": {
      "type": "string",
      "enum": [
        "team",
        "user"
      ]
    }
  }
}"##;

#[cfg(test)]
mod tests {
    use super::*;
    use crate::subcommands::json::StructuredOutput;
    use schemars::schema_for;

    #[test]
    fn test_json_schema() {
        let schema = schema_for!(StructuredOutput);
        let schema = serde_json::to_string_pretty(&schema).unwrap();
        assert_eq!(schema, JSON_SCHEMA);
    }
}


================================================
FILE: src/subcommands/mod.rs
================================================
pub mod crates;
pub mod json;
pub mod json_schema;
pub mod publishers;
pub mod update;

pub use crates::crates;
pub use json::json;
pub use json_schema::print_schema;
pub use publishers::publishers;
pub use update::update;


================================================
FILE: src/subcommands/publishers.rs
================================================
use std::collections::BTreeMap;

use crate::publishers::fetch_owners_of_crates;
use crate::MetadataArgs;
use crate::{
    common::{comma_separated_list, complain_about_non_crates_io_crates, sourced_dependencies},
    publishers::PublisherData,
};

pub fn publishers(
    metadata_args: MetadataArgs,
    diffable: bool,
    max_age: std::time::Duration,
) -> Result<(), anyhow::Error> {
    let dependencies = sourced_dependencies(metadata_args)?;
    complain_about_non_crates_io_crates(&dependencies);
    let (publisher_users, publisher_teams) = fetch_owners_of_crates(&dependencies, max_age)?;

    // Group data by user rather than by crate
    let mut user_to_crate_map = transpose_publishers_map(&publisher_users);
    let mut team_to_crate_map = transpose_publishers_map(&publisher_teams);

    // Sort crate names alphabetically
    user_to_crate_map.values_mut().for_each(|c| c.sort());
    team_to_crate_map.values_mut().for_each(|c| c.sort());

    if diffable {
        // empty map just means 0 loop iterations here
        let sorted_map = sort_transposed_map_for_diffing(user_to_crate_map);
        for (user, crates) in &sorted_map {
            let crate_list = comma_separated_list(crates);
            println!("user \"{}\": {}", &user.login, crate_list);
        }
    } else if !publisher_users.is_empty() {
        println!("\nThe following individuals can publish updates for your dependencies:\n");
        let map_for_display = sort_transposed_map_for_display(user_to_crate_map);
        for (i, (user, crates)) in map_for_display.iter().enumerate() {
            // We print logins rather than display names, since display names
            // can embed terminal control sequences and erase themselves from
            // the output that way.
            let crate_list = comma_separated_list(crates);
            println!(" {}. {} via crates: {}", i + 1, &user.login, crate_list);
        }
        eprintln!("\nNote: there may be outstanding publisher invitations. crates.io provides no way to list them.");
        eprintln!("See https://github.com/rust-lang/crates.io/issues/2868 for more info.");
    }

    if diffable {
        let sorted_map = sort_transposed_map_for_diffing(team_to_crate_map);
        for (team, crates) in &sorted_map {
            let crate_list = comma_separated_list(crates);
            println!("team \"{}\": {}", &team.login, crate_list);
        }
    } else if !publisher_teams.is_empty() {
        println!(
            "\nAll members of the following teams can publish updates for your dependencies:\n"
        );
        let map_for_display = sort_transposed_map_for_display(team_to_crate_map);
        for (i, (team, crates)) in map_for_display.iter().enumerate() {
            let crate_list = comma_separated_list(crates);
            if let (true, Some(org)) = (
                team.login.starts_with("github:"),
                team.login.split(':').nth(1),
            ) {
                println!(
                    " {}. \"{}\" (https://github.com/{}) via crates: {}",
                    i + 1,
                    &team.login,
                    org,
                    crate_list
                );
            } else {
                println!(" {}. \"{}\" via crates: {}", i + 1, &team.login, crate_list);
            }
        }
        eprintln!("\nGitHub teams are black boxes. It's impossible to get the member list without explicit permission.");
    }
    Ok(())
}

/// Turns a crate-to-publishers mapping into publisher-to-crates mapping.
/// [`BTreeMap`] is used because [`PublisherData`] doesn't implement Hash.
fn transpose_publishers_map(
    input: &BTreeMap<String, Vec<PublisherData>>,
) -> BTreeMap<PublisherData, Vec<String>> {
    let mut result: BTreeMap<PublisherData, Vec<String>> = BTreeMap::new();
    for (crate_name, publishers) in input.iter() {
        for publisher in publishers {
            result
                .entry(publisher.clone())
                .or_default()
                .push(crate_name.clone());
        }
    }
    result
}

/// Returns a Vec sorted so that publishers are sorted by the number of crates they control.
/// If that number is the same, sort by login.
fn sort_transposed_map_for_display(
    input: BTreeMap<PublisherData, Vec<String>>,
) -> Vec<(PublisherData, Vec<String>)> {
    let mut result: Vec<_> = input.into_iter().collect();
    result.sort_unstable_by_key(|(publisher, crates)| {
        (usize::MAX - crates.len(), publisher.login.clone())
    });
    result
}

fn sort_transposed_map_for_diffing(
    input: BTreeMap<PublisherData, Vec<String>>,
) -> Vec<(PublisherData, Vec<String>)> {
    let mut result: Vec<_> = input.into_iter().collect();
    result.sort_unstable_by_key(|(publisher, _crates)| publisher.login.clone());
    result
}
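`transpose_publishers_map` above inverts a crate-to-publishers map into a publisher-to-crates map. The same inversion can be sketched with `String` keys in place of `PublisherData`, so it runs standalone (`transpose` is a hypothetical name for this illustration):

```rust
use std::collections::BTreeMap;

// Invert a crate -> publishers map into a publisher -> crates map.
fn transpose(input: &BTreeMap<String, Vec<String>>) -> BTreeMap<String, Vec<String>> {
    let mut result: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for (crate_name, publishers) in input {
        for publisher in publishers {
            result
                .entry(publisher.clone())
                .or_default()
                .push(crate_name.clone());
        }
    }
    result
}
```

Because the input is a `BTreeMap`, crates are visited in alphabetical order, so each publisher's crate list comes out already sorted; the real code still sorts afterwards, which keeps that property explicit rather than implicit.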


================================================
FILE: src/subcommands/update.rs
================================================
use crate::api_client::RateLimitedClient;
use crate::crates_cache::{CratesCache, DownloadState};
use anyhow::bail;

pub fn update(max_age: std::time::Duration) -> Result<(), anyhow::Error> {
    let mut cache = CratesCache::new();
    let mut client = RateLimitedClient::new();

    match cache.download(&mut client, max_age) {
        Ok(state) => match state {
            DownloadState::Fresh => eprintln!("No updates found"),
            DownloadState::Expired => {
                eprintln!("Successfully updated to the newest daily data dump.");
            }
            DownloadState::Stale => bail!("Latest daily data dump matches the previous version, which was considered outdated."),
        },
        Err(error) => bail!("Could not update to the latest daily data dump!\n{}", error)
    }
    Ok(())
}
SYMBOL INDEX (97 symbols across 12 files)

FILE: fixtures/optional_non_dev_dep/src/lib.rs
  function add (line 1) | pub fn add(left: u64, right: u64) -> u64 {
  function it_works (line 10) | fn it_works() {

FILE: src/api_client.rs
  type RateLimitedClient (line 3) | pub struct RateLimitedClient {
    method new (line 18) | pub fn new() -> Self {
    method get (line 22) | pub fn get(&mut self, url: &str) -> ureq::Request {
    method wait_to_honor_rate_limit (line 32) | fn wait_to_honor_rate_limit(&mut self) {
  method default (line 9) | fn default() -> Self {

FILE: src/cli.rs
  type MetadataArgs (line 7) | pub struct MetadataArgs {
  type QueryCommandArgs (line 36) | pub(crate) struct QueryCommandArgs {
  type PrintJson (line 46) | pub(crate) enum PrintJson {
  type CliArgs (line 66) | pub(crate) enum CliArgs {
  function cache_max_age (line 118) | fn cache_max_age() -> impl Parser<Duration> {
  function parse_args (line 135) | fn parse_args(args: &[&str]) -> Result<CliArgs, ParseFailure> {
  function test_cache_max_age_parser (line 140) | fn test_cache_max_age_parser() {
  function test_accepted_query_options (line 152) | fn test_accepted_query_options() {
  function test_accepted_update_options (line 172) | fn test_accepted_update_options() {
  function test_json_schema_option (line 183) | fn test_json_schema_option() {
  function test_invocation_through_cargo (line 195) | fn test_invocation_through_cargo() {

FILE: src/common.rs
  type PkgSource (line 12) | pub enum PkgSource {
  type SourcedPackage (line 20) | pub struct SourcedPackage {
  function metadata_command (line 25) | fn metadata_command(args: MetadataArgs) -> MetadataCommand {
  function sourced_dependencies (line 49) | pub fn sourced_dependencies(
  function sourced_dependencies_from_metadata (line 63) | fn sourced_dependencies_from_metadata(
  function extract_non_dev_dependencies (line 116) | fn extract_non_dev_dependencies(
  function crate_names_from_source (line 175) | pub fn crate_names_from_source(crates: &[SourcedPackage], source: PkgSou...
  function complain_about_non_crates_io_crates (line 187) | pub fn complain_about_non_crates_io_crates(dependencies: &[SourcedPackag...
  function comma_separated_list (line 212) | pub fn comma_separated_list(list: &[String]) -> String {
  function optional_dependency_excluded_when_not_activated (line 230) | fn optional_dependency_excluded_when_not_activated() {

FILE: src/crates_cache.rs
  type CratesCache (line 15) | pub struct CratesCache {
    constant METADATA_FS (line 94) | const METADATA_FS: &'static str = "metadata.json";
    constant CRATES_FS (line 95) | const CRATES_FS: &'static str = "crates.json";
    constant CRATE_OWNERS_FS (line 96) | const CRATE_OWNERS_FS: &'static str = "crate_owners.json";
    constant USERS_FS (line 97) | const USERS_FS: &'static str = "users.json";
    constant TEAMS_FS (line 98) | const TEAMS_FS: &'static str = "teams.json";
    constant VERSIONS_FS (line 99) | const VERSIONS_FS: &'static str = "versions.json";
    constant DUMP_URL (line 101) | const DUMP_URL: &'static str = "https://static.crates.io/db-dump.tar.gz";
    method new (line 104) | pub fn new() -> Self {
    method cache_dir (line 116) | fn cache_dir() -> Option<PathBuf> {
    method download (line 121) | pub fn download(
    method expire (line 264) | pub fn expire(&mut self, max_age: Duration) -> CacheState {
    method age (line 280) | pub fn age(&mut self) -> Option<Duration> {
    method publisher_users (line 287) | pub fn publisher_users(&mut self, crate_name: &str) -> Option<Vec<Publ...
    method publisher_teams (line 308) | pub fn publisher_teams(&mut self, crate_name: &str) -> Option<Vec<Publ...
    method validate (line 329) | fn validate(&mut self, max_age: Duration) -> Option<bool> {
    method load_metadata (line 334) | fn load_metadata(&mut self) -> Option<&MetadataStored> {
    method load_crates (line 341) | fn load_crates(&mut self) -> Option<&HashMap<String, Crate>> {
    method load_crate_owners (line 348) | fn load_crate_owners(&mut self) -> Option<&HashMap<u64, Vec<CrateOwner...
    method load_users (line 355) | fn load_users(&mut self) -> Option<&HashMap<u64, User>> {
    method load_teams (line 362) | fn load_teams(&mut self) -> Option<&HashMap<u64, Team>> {
    method load_versions (line 369) | fn load_versions(&mut self) -> Option<&HashMap<(u64, String), Publishe...
  type CacheState (line 25) | pub enum CacheState {
  type DownloadState (line 31) | pub enum DownloadState {
  type CacheDir (line 40) | struct CacheDir(PathBuf);
    method load_cached (line 412) | fn load_cached<'cache, T>(
  type Metadata (line 43) | struct Metadata {
  type MetadataStored (line 49) | struct MetadataStored {
    method validate (line 399) | fn validate(&self, max_age: Duration) -> Option<bool> {
    method age (line 406) | pub fn age(&self) -> Result<Duration, SystemTimeError> {
  type Crate (line 57) | struct Crate {
  type CrateOwner (line 64) | struct CrateOwner {
  type Publisher (line 71) | struct Publisher {
  type Team (line 77) | struct Team {
  type User (line 85) | struct User {
  function entry_path_ends_with (line 377) | fn entry_path_ends_with<R: io::Read>(entry: &tar::Entry<R>, needle: &str...
  function read_csv_data (line 387) | fn read_csv_data<T: serde::de::DeserializeOwned>(
  type CacheUpdater (line 434) | struct CacheUpdater {
    method new (line 442) | fn new(dir: PathBuf) -> Result<Self, io::Error> {
    method commit (line 459) | fn commit(&mut self) -> io::Result<()> {
    method store (line 480) | fn store<T>(&mut self, cache: &mut Option<T>, file: &str, value: T) ->...
    method store_map (line 495) | fn store_map<T, K>(
    method store_multi_map (line 513) | fn store_multi_map<T, K>(

FILE: src/main.rs
  function main (line 22) | fn main() -> Result<(), anyhow::Error> {
  function dispatch_command (line 27) | fn dispatch_command(args: CliArgs) -> Result<(), anyhow::Error> {

FILE: src/publishers.rs
  type UsersResponse (line 16) | struct UsersResponse {
  type TeamsResponse (line 21) | struct TeamsResponse {
  type PublisherData (line 28) | pub struct PublisherData {
  method eq (line 42) | fn eq(&self, other: &Self) -> bool {
  method assert_receiver_is_total_eq (line 49) | fn assert_receiver_is_total_eq(&self) {}
  method partial_cmp (line 53) | fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
  method cmp (line 59) | fn cmp(&self, other: &Self) -> std::cmp::Ordering {
  type PublisherKind (line 67) | pub enum PublisherKind {
  function publisher_users (line 72) | pub fn publisher_users(
  function publisher_teams (line 82) | pub fn publisher_teams(
  function get_with_retry (line 92) | fn get_with_retry(
  function fetch_owners_of_crates (line 117) | pub fn fetch_owners_of_crates(

FILE: src/subcommands/crates.rs
  function crates (line 7) | pub fn crates(

FILE: src/subcommands/json.rs
  type StructuredOutput (line 16) | pub struct StructuredOutput {
  type NotAudited (line 24) | pub struct NotAudited {
  function json (line 31) | pub fn json(

FILE: src/subcommands/json_schema.rs
  function print_schema (line 5) | pub fn print_schema() -> Result<()> {
  constant JSON_SCHEMA (line 10) | const JSON_SCHEMA: &str = r##"{
  function test_json_schema (line 110) | fn test_json_schema() {

FILE: src/subcommands/publishers.rs
  function publishers (line 10) | pub fn publishers(
  function transpose_publishers_map (line 82) | fn transpose_publishers_map(
  function sort_transposed_map_for_display (line 99) | fn sort_transposed_map_for_display(
  function sort_transposed_map_for_diffing (line 109) | fn sort_transposed_map_for_diffing(

FILE: src/subcommands/update.rs
  function update (line 5) | pub fn update(max_age: std::time::Duration) -> Result<(), anyhow::Error> {