[
  {
    "path": ".cargo/config.toml",
    "content": "[target.x86_64-apple-darwin]\nlinker = \"x86_64-apple-darwin14-clang\"\nar = \"x86_64-apple-darwin14-ar\""
  },
  {
    "path": ".github/ISSUE_TEMPLATE/bug_report.md",
    "content": "---\nname: Bug report\nabout: Create a report to help us improve\ntitle: ''\nlabels: ''\nassignees: ''\n\n---\n\n**Describe the bug**\nA clear and concise description of what the bug is.\n\n**To Reproduce**\nSteps to reproduce the behavior:\n1. This md file '...'\n2. This command '....'\n3. See error\n\n**Expected behavior**\nA clear and concise description of what you expected to happen.\n\n**Desktop (please complete the following information):**\n - OS: [e.g. iOS]\n - Browser [e.g. chrome, safari]\n - Version [e.g. 22]\n\n**Additional context**\nAdd any other context about the problem here.\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/custom.md",
    "content": "---\nname: Custom issue template\nabout: Describe this issue template's purpose here.\ntitle: ''\nlabels: ''\nassignees: ''\n\n---\n\n\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/feature_request.md",
    "content": "---\nname: Feature request\nabout: Suggest an idea for this project\ntitle: ''\nlabels: ''\nassignees: ''\n\n---\n\n**Is your feature request related to a problem? Please describe.**\nA clear and concise description of what the problem is. Ex. I'm always frustrated when [...]\n\n**Describe the solution you'd like**\nA clear and concise description of what you want to happen.\n\n**Describe alternatives you've considered**\nA clear and concise description of any alternative solutions or features you've considered.\n\n**Additional context**\nAdd any other context or screenshots about the feature request here.\n"
  },
  {
    "path": ".github/dependabot.yml",
    "content": "version: 2\nupdates:\n  - package-ecosystem: \"cargo\"\n    directory: \"/\"\n    schedule:\n      interval: \"weekly\"\n    open-pull-requests-limit: 5\n    labels:\n      - \"dependencies\"\n\n  - package-ecosystem: \"github-actions\"\n    directory: \"/\"\n    schedule:\n      interval: \"weekly\"\n    open-pull-requests-limit: 5\n    labels:\n      - \"dependencies\"\n"
  },
  {
    "path": ".github/instructions/rust.instructions.md",
    "content": "---\ndescription: 'Rust programming language coding conventions and best practices'\napplyTo: '**/*.rs'\n---\n\n# Rust Coding Conventions and Best Practices\n\nFollow idiomatic Rust practices and community standards when writing Rust code. \n\nThese instructions are based on [The Rust Book](https://doc.rust-lang.org/book/), [Rust API Guidelines](https://rust-lang.github.io/api-guidelines/), [RFC 430 naming conventions](https://github.com/rust-lang/rfcs/blob/master/text/0430-finalizing-naming-conventions.md), and the broader Rust community at [users.rust-lang.org](https://users.rust-lang.org).\n\n## General Instructions\n\n- Always prioritize readability, safety, and maintainability.\n- Use strong typing and leverage Rust's ownership system for memory safety.\n- Break down complex functions into smaller, more manageable functions.\n- For algorithm-related code, include explanations of the approach used.\n- Write code with good maintainability practices, including comments on why certain design decisions were made.\n- Handle errors gracefully using `Result<T, E>` and provide meaningful error messages.\n- For external dependencies, mention their usage and purpose in documentation.\n- Use consistent naming conventions following [RFC 430](https://github.com/rust-lang/rfcs/blob/master/text/0430-finalizing-naming-conventions.md).\n- Write idiomatic, safe, and efficient Rust code that follows the borrow checker's rules.\n- Ensure code compiles without warnings.\n\n## Patterns to Follow\n\n- Use modules (`mod`) and public interfaces (`pub`) to encapsulate logic.\n- Handle errors properly using `?`, `match`, or `if let`.\n- Use `serde` for serialization and `thiserror` or `anyhow` for custom errors.\n- Implement traits to abstract services or external dependencies.\n- Structure async code using `async/await` and `tokio` or `async-std`.\n- Prefer enums over flags and states for type safety.\n- Use builders for complex object creation.\n- Split binary and 
library code (`main.rs` vs `lib.rs`) for testability and reuse.\n- Use `rayon` for data parallelism and CPU-bound tasks.\n- Use iterators instead of index-based loops as they're often faster and safer.\n- Use `&str` instead of `String` for function parameters when you don't need ownership.\n- Prefer borrowing and zero-copy operations to avoid unnecessary allocations.\n\n### Ownership, Borrowing, and Lifetimes\n\n- Prefer borrowing (`&T`) over cloning unless ownership transfer is necessary.\n- Use `&mut T` when you need to modify borrowed data.\n- Explicitly annotate lifetimes when the compiler cannot infer them.\n- Use `Rc<T>` for single-threaded reference counting and `Arc<T>` for thread-safe reference counting.\n- Use `RefCell<T>` for interior mutability in single-threaded contexts and `Mutex<T>` or `RwLock<T>` for multi-threaded contexts.\n\n## Patterns to Avoid\n\n- Don't use `unwrap()` or `expect()` unless absolutely necessary—prefer proper error handling.\n- Avoid panics in library code—return `Result` instead.\n- Don't rely on global mutable state—use dependency injection or thread-safe containers.\n- Avoid deeply nested logic—refactor with functions or combinators.\n- Don't ignore warnings—treat them as errors during CI.\n- Avoid `unsafe` unless required and fully documented.\n- Don't overuse `clone()`, use borrowing instead of cloning unless ownership transfer is needed.\n- Avoid premature `collect()`, keep iterators lazy until you actually need the collection.\n- Avoid unnecessary allocations—prefer borrowing and zero-copy operations.\n\n## Code Style and Formatting\n\n- Follow the Rust Style Guide and use `rustfmt` for automatic formatting.\n- Keep lines under 100 characters when possible.\n- Place function and struct documentation immediately before the item using `///`.\n- Use `cargo clippy` to catch common mistakes and enforce best practices.\n\n## Error Handling\n\n- Use `Result<T, E>` for recoverable errors and `panic!` only for unrecoverable 
errors.\n- Prefer `?` operator over `unwrap()` or `expect()` for error propagation.\n- Create custom error types using `thiserror` or implement `std::error::Error`.\n- Use `Option<T>` for values that may or may not exist.\n- Provide meaningful error messages and context.\n- Error types should be meaningful and well-behaved (implement standard traits).\n- Validate function arguments and return appropriate errors for invalid input.\n\n## API Design Guidelines\n\n### Common Traits Implementation\nEagerly implement common traits where appropriate:\n- `Copy`, `Clone`, `Eq`, `PartialEq`, `Ord`, `PartialOrd`, `Hash`, `Debug`, `Display`, `Default`\n- Use standard conversion traits: `From`, `AsRef`, `AsMut`\n- Collections should implement `FromIterator` and `Extend`\n- Note: `Send` and `Sync` are auto-implemented by the compiler when safe; avoid manual implementation unless using `unsafe` code\n\n### Type Safety and Predictability\n- Use newtypes to provide static distinctions\n- Arguments should convey meaning through types; prefer specific types over generic `bool` parameters\n- Use `Option<T>` appropriately for truly optional values\n- Functions with a clear receiver should be methods\n- Only smart pointers should implement `Deref` and `DerefMut`\n\n### Future Proofing\n- Use sealed traits to protect against downstream implementations\n- Structs should have private fields\n- Functions should validate their arguments\n- All public types must implement `Debug`\n\n## Testing and Documentation\n\n- Write comprehensive unit tests using `#[cfg(test)]` modules and `#[test]` annotations.\n- Use test modules alongside the code they test (`mod tests { ... 
}`).\n- Write integration tests in `tests/` directory with descriptive filenames.\n- Write clear and concise comments for each function, struct, enum, and complex logic.\n- Ensure functions have descriptive names and include comprehensive documentation.\n- Document all public APIs with rustdoc (`///` comments) following the [API Guidelines](https://rust-lang.github.io/api-guidelines/).\n- Use `#[doc(hidden)]` to hide implementation details from public documentation.\n- Document error conditions, panic scenarios, and safety considerations.\n- Examples should use `?` operator, not `unwrap()` or deprecated `try!` macro.\n\n## Project Organization\n\n- Use semantic versioning in `Cargo.toml`.\n- Include comprehensive metadata: `description`, `license`, `repository`, `keywords`, `categories`.\n- Use feature flags for optional functionality.\n- Organize code into modules using `mod.rs` or named files.\n- Keep `main.rs` or `lib.rs` minimal - move logic to modules.\n\n## Pre-Pull Request Requirements\n\n**IMPORTANT**: Before opening or updating a pull request, you MUST run the following commands and fix any issues:\n\n1. **Format code with rustfmt**:\n   ```bash\n   cargo fmt\n   ```\n\n2. **Apply clippy fixes**:\n   ```bash\n   cargo clippy --fix --all-targets --all-features --allow-dirty --allow-staged\n   ```\n\n3. 
**Verify no warnings remain**:\n   ```bash\n   cargo clippy --all-targets --all-features -- -D warnings\n   ```\n\nThese steps ensure code quality and consistency before the PR is opened for review.\n\n## Quality Checklist\n\nBefore publishing or reviewing Rust code, ensure:\n\n### Core Requirements\n- [ ] **Naming**: Follows RFC 430 naming conventions\n- [ ] **Traits**: Implements `Debug`, `Clone`, `PartialEq` where appropriate\n- [ ] **Error Handling**: Uses `Result<T, E>` and provides meaningful error types\n- [ ] **Documentation**: All public items have rustdoc comments with examples\n- [ ] **Testing**: Comprehensive test coverage including edge cases\n\n### Safety and Quality\n- [ ] **Safety**: No unnecessary `unsafe` code, proper error handling\n- [ ] **Performance**: Efficient use of iterators, minimal allocations\n- [ ] **API Design**: Functions are predictable, flexible, and type-safe\n- [ ] **Future Proofing**: Private fields in structs, sealed traits where appropriate\n- [ ] **Tooling**: Code passes `cargo fmt`, `cargo clippy`, and `cargo test`\n"
  },
  {
    "path": ".github/workflows/ci.yml",
    "content": "name: Continuous Integration\n\non:\n  push:\n    branches: [\"master\"]\n    tags:\n      - \"v*\"\n  pull_request:\n    branches: [\"master\"]\n  workflow_dispatch:\n\nenv:\n  CARGO_TERM_COLOR: always\n  BINARY_NAME: mlc\n  RUSTFLAGS: \"-Dwarnings\"\n\njobs:\n  test_own_readme:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v6\n      - name: Cache\n        uses: actions/cache@v5\n        with:\n          path: |\n            ~/.cargo/registry\n            ~/.cargo/git\n            target\n          key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}\n      - name: Run\n        run: cargo run -- ./README.md -d\n\n  formatting:\n    runs-on: ubuntu-latest\n    permissions:\n      contents: write\n    steps:\n      - uses: actions/checkout@v6\n        with:\n          token: ${{ secrets.GITHUB_TOKEN }}\n          ref: ${{ github.head_ref || github.ref }}\n      - uses: actions-rust-lang/setup-rust-toolchain@v1\n        with:\n          components: rustfmt, clippy\n      - name: Run rustfmt\n        run: cargo fmt\n      - name: Run clippy with auto-fix\n        run: cargo clippy --fix --all-targets --all-features --allow-dirty --allow-staged\n      - name: Check for formatting changes\n        id: check_changes\n        run: |\n          if [[ -n \"$(git status --porcelain)\" ]]; then\n            echo \"changes=true\" >> $GITHUB_OUTPUT\n          else\n            echo \"changes=false\" >> $GITHUB_OUTPUT\n          fi\n      - name: Commit and push formatting and clippy changes\n        if: steps.check_changes.outputs.changes == 'true'\n        run: |\n          git config --local user.email \"github-actions[bot]@users.noreply.github.com\"\n          git config --local user.name \"github-actions[bot]\"\n          git add .\n          git commit -m \"Auto-format code with rustfmt and clippy\"\n          git push origin ${{ github.head_ref || github.ref_name }}\n\n  test:\n    runs-on: ubuntu-latest\n    steps:\n     
 - uses: actions/checkout@v6\n      - uses: actions-rust-lang/setup-rust-toolchain@v1\n      - run: cargo test --verbose\n\n  build_linux:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v6\n      - uses: awalsh128/cache-apt-pkgs-action@v1\n        with:\n          packages: musl-tools # provides musl-gcc\n          version: 1.0\n      - name: \"Get the Rust toolchain\"\n        uses: dtolnay/rust-toolchain@stable\n        with:\n          targets: x86_64-unknown-linux-musl\n          components: rustfmt, clippy\n      - name: Cache\n        uses: actions/cache@v5\n        with:\n          path: |\n            ~/.cargo/registry\n            ~/.cargo/git\n            target\n          key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}\n      - name: Build\n        run: cargo build --release --verbose --target=x86_64-unknown-linux-musl\n      - uses: actions/upload-artifact@v7\n        with:\n          name: linux\n          path: ./target/x86_64-unknown-linux-musl/release/${{ env.BINARY_NAME }}\n\n  build_linux_arm64:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v6\n      - name: \"Get the Rust toolchain\"\n        uses: dtolnay/rust-toolchain@stable\n        with:\n          targets: aarch64-unknown-linux-musl\n      - name: Install cross\n        run: cargo install cross --git https://github.com/cross-rs/cross\n      - name: Cache\n        uses: actions/cache@v5\n        with:\n          path: |\n            ~/.cargo/registry\n            ~/.cargo/git\n            target\n          key: ${{ runner.os }}-cargo-aarch64-${{ hashFiles('**/Cargo.lock') }}\n      - name: Build\n        run: cross build --release --verbose --target=aarch64-unknown-linux-musl\n      - uses: actions/upload-artifact@v7\n        with:\n          name: linux-arm64\n          path: ./target/aarch64-unknown-linux-musl/release/${{ env.BINARY_NAME }}\n\n  build_windows:\n    runs-on: windows-latest\n    steps:\n      - uses: 
actions/checkout@v6\n      - uses: actions-rust-lang/setup-rust-toolchain@v1\n      - name: Build\n        run: cargo build --verbose --release\n      - uses: actions/upload-artifact@v7\n        with:\n          name: windows\n          path: ./target/release/${{ env.BINARY_NAME }}.exe\n\n  build_osx:\n    runs-on: macos-latest\n    steps:\n      - uses: actions/checkout@v6\n      - uses: actions-rust-lang/setup-rust-toolchain@v1\n        with:\n          target: aarch64-apple-darwin\n      - name: Build\n        run: |\n          cargo build --verbose --release --target aarch64-apple-darwin\n          ls ./target\n      - uses: actions/upload-artifact@v7\n        with:\n          name: apple-darwin-arm64\n          path: target/aarch64-apple-darwin/release/${{ env.BINARY_NAME }}\n\n  release_docker:\n    runs-on: ubuntu-latest\n    needs: [build_osx, build_windows, build_linux, build_linux_arm64, test]\n    if: startsWith(github.ref, 'refs/tags/')\n    steps:\n      - uses: actions/checkout@v6\n      - name: Download artifact\n        uses: actions/download-artifact@v8\n        with:\n          name: linux\n          path: ./target/release\n      - name: Set up QEMU\n        uses: docker/setup-qemu-action@v4\n      - name: Set up Docker Buildx\n        uses: docker/setup-buildx-action@v4\n      - name: Set env\n        run: |\n          version=${GITHUB_REF#refs/*/}\n          version=${version:1}\n          echo \"RELEASE_VERSION=$version\" >> $GITHUB_ENV\n      - run: echo Push docker image $RELEASE_VERSION\n      - name: Login to Docker Hub\n        uses: docker/login-action@v4\n        with:\n          username: ${{ secrets.DOCKERHUB_USERNAME }}\n          password: ${{ secrets.DOCKERHUB_PW }}\n      - name: Build and push\n        uses: docker/build-push-action@v7\n        with:\n          context: .\n          push: true\n          tags: becheran/mlc:latest,becheran/mlc:${{ env.RELEASE_VERSION }}\n\n  release:\n    runs-on: ubuntu-latest\n    needs: 
[release_docker]\n    if: startsWith(github.ref, 'refs/tags/')\n    steps:\n      - uses: actions/download-artifact@v8\n        with:\n          name: linux\n          path: mlc-x86_64-linux\n      - uses: actions/download-artifact@v8\n        with:\n          name: linux-arm64\n          path: mlc-aarch64-linux\n      - uses: actions/download-artifact@v8\n        with:\n          name: windows\n          path: mlc-x86_64-windows\n      - uses: actions/download-artifact@v8\n        with:\n          name: apple-darwin-arm64\n          path: mlc-aarch64-apple-darwin\n      - name: Rename files\n        run: |\n          ls\n          ls mlc-x86_64-linux\n          ls mlc-aarch64-linux\n          ls mlc-aarch64-apple-darwin\n          ls mlc-x86_64-windows\n          mv ./mlc-x86_64-linux/mlc mlc\n          rm -rd ./mlc-x86_64-linux\n          mv ./mlc mlc-x86_64-linux\n          mv ./mlc-aarch64-linux/mlc mlc\n          rm -rd ./mlc-aarch64-linux\n          mv ./mlc mlc-aarch64-linux\n          mv ./mlc-aarch64-apple-darwin/mlc mlc\n          rm -rd ./mlc-aarch64-apple-darwin\n          mv ./mlc mlc-aarch64-apple-darwin\n          mv ./mlc-x86_64-windows/mlc.exe mlc-x86_64-windows.exe\n          rm -rd ./mlc-x86_64-windows\n          ls\n      - name: GitHub Release\n        uses: softprops/action-gh-release@v3\n        with:\n          generate_release_notes: true\n          files: |\n            mlc-x86_64-linux\n            mlc-aarch64-linux\n            mlc-aarch64-apple-darwin\n            mlc-x86_64-windows.exe\n"
  },
  {
    "path": ".github/workflows/major-release-tag.yml",
    "content": "# Copyright (c) 2021 Vincent A. Cicirello\n# MIT License\n\nname: Update Major Release Tag\n\non:\n  release:\n    types: [published]\n\njobs:\n  movetag:\n    runs-on: ubuntu-latest\n\n    steps:\n      - uses: actions/checkout@v6\n\n      - name: Get major version num and update tag\n        run: |\n          VERSION=${GITHUB_REF#refs/tags/}\n          MAJOR=${VERSION%%.*}\n          git config --global user.name 'YOUR NAME HERE'\n          git config --global user.email 'USERNAME@users.noreply.github.com'\n          git tag -fa \"${MAJOR}\" -m 'Update major version tag'\n          git push origin \"${MAJOR}\" --force\n"
  },
  {
    "path": ".gitignore",
    "content": "*~\n.#*\n.DS_Store\n.cproject\n.hg/\n.hgignore\n.idea\n*.iml\n__pycache__/\n*.py[cod]\n*$py.class\n.project\n.settings/\n.valgrindrc\n.vscode/\n.favorites.json\n/*-*-*-*/\n/*-*-*/\n/Makefile\n/build\n/config.toml\n/dist/\n/dl/\n/doc\n/inst/\n/llvm/\n/mingw-build/\n/nd/\n/obj/\n/rt/\n/rustllvm/\n/src/libcore/unicode/DerivedCoreProperties.txt\n/src/libcore/unicode/DerivedNormalizationProps.txt\n/src/libcore/unicode/PropList.txt\n/src/libcore/unicode/ReadMe.txt\n/src/libcore/unicode/Scripts.txt\n/src/libcore/unicode/SpecialCasing.txt\n/src/libcore/unicode/UnicodeData.txt\n/stage[0-9]+/\n/target\ntarget/\n/test/\n/tmp/\ntags\ntags.*\nTAGS\nTAGS.*\n\\#*\n\\#*\\#\nconfig.mk\nconfig.stamp\nkeywords.md\nlexer.ml\nmir_dump\nSession.vim\nsrc/etc/dl\ntmp.*.rs\nversion.md\nversion.ml\nversion.texi\n#.cargo\n!src/vendor/**\n/src/target/\n\nno_llvm_build\n\n.mlc.toml"
  },
  {
    "path": "CHANGELOG.md",
    "content": "<!-- The changelog shall follow the recommendations described here: https://keepachangelog.com/en/1.0.0/ \nTypes for Changes:\n- Added\n- Changed\n- Deprecated\n- Removed\n- Fixed\n- Security\n-->\n\n# Changelog\n\n<!-- next-header -->\n\n## [Unreleased] - ReleaseDate\n\n### Changed\n\n- Gitignore files in sub dirs are now also checked\n\n## [1.2.0] - 2025-12-13\n\n### Added\n\n- Custom HTTP headers support for HTTP checks (`--http-headers` / `-H`) [#119](https://github.com/becheran/mlc/pull/119)\n\n### Changed\n\n- GitHub Actions usage example now documents how to pass custom HTTP headers\n- Developer instructions now include pre-PR `cargo fmt` / `cargo clippy` steps [#120](https://github.com/becheran/mlc/pull/120)\n\n### Fixed\n\n- GitHub-flavored markdown task list checkboxes are no longer detected as links [#121](https://github.com/becheran/mlc/pull/121)\n- Root directory configuration is validated to avoid crashes when `root-dir` does not exist\n\n## [1.1.0] - 2025-12-11\n\n### Added\n\n- Support for ignore/disable comments to skip specific links or blocks in markup files [#114](https://github.com/becheran/mlc/pull/114)\n- `--files` option to specify individual files to check [#115](https://github.com/becheran/mlc/pull/115)\n- Severity column to CSV reports to distinguish errors from warnings [#109](https://github.com/becheran/mlc/pull/109)\n- ARM64 binary support for Linux\n\n### Changed\n\n- Replace external URL dependencies in E2E tests with local mock servers [#118](https://github.com/becheran/mlc/pull/118)\n- CI workflow now auto-fixes and pushes formatting/clippy changes instead of failing [#116](https://github.com/becheran/mlc/pull/116)\n- Optimize URL comparison to avoid unnecessary cloning\n\n### Fixed\n\n- False redirect warnings for URLs with fragments\n- Linux ARM64 build by using cross for proper musl cross-compilation\n- CSV file race condition by using unique file names for each test\n- Build status badge link in README\n\n## 
[1.0.0] - 2025-07-07\n\n## [0.22.0] - 2025-05-29\n\n- Add csv file output [#40](https://github.com/becheran/mlc/issues/40)\n\n## [0.21.0] - 2025-02-08\n\n- Fix do not log warnings #100\n\n## [0.20.0] - 2025-02-08\n\n- Fix remove trailing slashes from OK messages\n- Feat add realistic browser accept headers\n- Feat warn if reference in markdown document is broken\n\n## [0.19.2] - 2025-02-04\n\n## [0.19.1] - 2025-02-04\n\n## [0.19.0] - 2024-11-30\n\n## [0.18.0] - 2024-06-30\n\n- Add `--gitignore` option #94\n\n## [0.17.2] - 2024-06-23\n\n- Do not panic if ignore paths are not found #92\n\n## [0.17.1] - 2024-06-05\n\n- Fix config ignore path from config toml interpreted correctly #78\n- Changed make ignore directory much faster when traversing\n\n## [0.17.0] - 2024-05-19\n\n- Changed enhanced logging and do not crash if path can not be canonicalized\n- Added option to hide redirects #84\n- Changed use ARM64 Mac OS\n- Fixed upgrade dependencies and added security fixes\n\n## [0.16.3] - 2023-11-20\n\n- Fixes issue with throttle parameter\n\n## [0.16.2] - 2023-06-15\n\n## [0.16.1] - 2022-12-19\n\n- Fixed Installation via `cargo install` failed #67\n\n## [0.16.0] - 2022-12-06\n\n- Added config file\n- Added workflow command output for github actions #63\n- Added format links in vs code console so that ctrl + left click opens the file at right location #60\n- Changed report redirects as warnings, unless their destination errors #55\n- Fixed wrong first line separator on windows #61\n- Fixed set accept encoding headers #52\n\n## [0.15.4] - 2022-08-23\n\n- Fix #54 column line index for files with CR + LF endings\n- Update external dependencies\n\n## [0.15.3] - 2022-08-18\n\n- Fix #53 broken docker container\n\n## [0.15.2] - 2022-07-11\n\n## [0.15.1] - 2022-07-07\n\n## [0.15.0] - 2022-07-07\n\n- Changed markdown parser to be CommonMark compatible\n- Changed column of detected link to start of tag instead of actual link\n- Fixed issue #35 detect link in headlines\n\n## 
[0.14.3] - 2021-05-15\n\n- Changed throttle for increased performance\n- Security upgraded external dependencies\n- Fixed #33 link not found near code block\n\n## [0.14.2] - 2021-03-13\n\n- Fixed broken path check if ../ were included on windows file systems\n\n## [0.14.1] - 2021-03-07\n\n## [0.14.0] - 2020-11-11\n\n- Fallback to GET requests if HEAD fails. See <https://github.com/becheran/mlc/issues/28>\n\n## [0.13.12] - 2020-10-26\n\n- Added GitHub action to README.md\n\n## [0.13.11] - 2020-10-26\n\n## [0.13.9] - 2020-10-25\n\n- Fix wrong count output for skipped links\n\n## [0.13.8] - 2020-10-25\n\n- Fix ignore-links\n- Add -i short for ignore-links argument\n- #23 - Add github action\n- Upgrade external dependencies\n\n## [0.13.7] - 2020-09-16\n\n- Fixed #24 - thanks to Alex Melville (Melvillian) for fixing the issue\n- Fixed #26 - add user-agent to requests\n\n## [0.13.6] - 2020-08-31\n\n- OSX builds\n\n## [0.13.5] - 2020-08-30\n\n- Fixed http requests to crates.io. Added header fields (#20)\n\n## [0.13.4] - 2020-08-04\n\n## [0.13.3] - 2020-08-04\n\n- Fixed https requests in docker container (#17)\n\n## [0.13.2] - 2020-07-21\n\n## [0.13.1] - 2020-07-21\n\n## [0.13.0] - 2020-07-17\n\n- Added `--throttle` command\n\n## [0.12.0] - 2020-07-15\n\n- Added `--match-file-extension` switch\n- Changed check links only once for speed improvement  \n\n## [0.11.2] - 2020-07-13\n\n### Changed\n\n- Improve fs checkup speed\n\n## [0.11.1] - 2020-07-10\n\n### Fixed\n\n- Ignore path parameter\n\n## [0.11.0] - 2020-07-08\n\n### Added\n\n- Ignore files and directories\n\n## [0.10.5] - 2020-07-07\n\n### Fixed\n\n- Allow email address with special chars\n- Add html comment support to markdown files\n\n## [0.10.4] - 2020-07-03\n\n### Changed\n\n- No error for unknown URL schemes\n\n### Fixed\n\n- Ref links with hashtag are not classified as error\n- Case insensitive mail addresses\n\n## [0.10.3] - 2020-07-02\n\n### Fixed\n\n- Link refs only allowed at beginning of line\n- Path 
separator for os\n\n### Changed\n\n- Allow mails without mailto schema\n\n## [0.10.2] - 2020-07-02\n\n## [0.10.1] - 2020-07-02\n\n## [0.10.0] - 2020-07-01\n\n### Added\n\n- Virtual root dir for easier local testing\n\n## [0.9.3] - 2020-06-24\n\n## [0.9.2] - 2020-05-24\n\n## [0.9.1] - 2020-01-29\n\n### Fixed\n\n- Mailto URI path accepted without double slashes\n\n## [0.9.0] - 2020-01-20\n\n### Changed\n\n- Faster execution with async tasks\n\n### Fixed\n\n- Wildcard parser for excluded links\n\n## [0.8.0] - 2020-01-11\n\n### Added\n\n- HTML support\n\n### Fixed\n\n- No panic for not UTF-8 encoded files\n\n## [0.7.0] - 2020-01-02\n\n### Added\n\n- Reference readme file\n- Ignore links option\n- No web link option for faster checks without following weblinks\n\n## [0.6.4] - 2019-12-30\n\n### Changed\n\n- Retry with Get for status code 405 Method Not Allowed instead of error\n- Column number now points to the link directly instead of the markdown link beginning\n\n### Fixed\n\n- Nested link support (Issue #1)\n\n## [0.6.3] - 2019-12-29\n\n### Changed\n\n- Release binaries on GitHub releases instead of GitLab\n\n## [0.6.2] - 2019-12-28\n\n### Removed\n\n- Remove pipeline badge from crates io\n\n## [0.6.1] - 2019-12-28\n\n### Changed\n\n- Speedup for http links. 
Do create client only once\n- Move from GitLab to GitHub\n\n## [0.6.0] - 2019-12-26\n\n### Added\n\n- Mail check support\n\n## [0.5.1] - 2019-12-25\n\n### Fixed\n\n- Inline html link at start of line\n\n## [0.5.0] - 2019-12-25\n\n### Added\n\n- Markup reference link support\n\n## [0.4.2] - 2019-12-24\n\n### Changed\n\n- Description in readme\n\n### Fixed\n\n- Typo\n\n## [0.4.1] - 2019-12-23\n\n### Changed\n\n- Result output formatting\n\n## [0.4.0] - 2019-12-23\n\n### Added\n\n- Change Log\n- Code block support in markdown files\n- More file markdown endings support (markdown, mkdn,...)\n\n### Fixed\n\n- File extension separator (previously \"somefilemd\" was also taken as markdown file)\n\n## [0.3.1] - 2019-12-21\n\n### Fixed\n\n- Code cleanup\n- Readme update\n\n## [0.3.0] - 2019-12-19\n\n### Added\n\n- First version of markup link checker (previously mlc was another rust lib project)\n"
  },
  {
    "path": "CODE_OF_CONDUCT.md",
    "content": "# Contributor Covenant Code of Conduct\n\n## Our Pledge\n\nIn the interest of fostering an open and welcoming environment, we as\ncontributors and maintainers pledge to making participation in our project and\nour community a harassment-free experience for everyone, regardless of age, body\nsize, disability, ethnicity, sex characteristics, gender identity and expression,\nlevel of experience, education, socio-economic status, nationality, personal\nappearance, race, religion, or sexual identity and orientation.\n\n## Our Standards\n\nExamples of behavior that contributes to creating a positive environment\ninclude:\n\n* Using welcoming and inclusive language\n* Being respectful of differing viewpoints and experiences\n* Gracefully accepting constructive criticism\n* Focusing on what is best for the community\n* Showing empathy towards other community members\n\nExamples of unacceptable behavior by participants include:\n\n* The use of sexualized language or imagery and unwelcome sexual attention or\n advances\n* Trolling, insulting/derogatory comments, and personal or political attacks\n* Public or private harassment\n* Publishing others' private information, such as a physical or electronic\n address, without explicit permission\n* Other conduct which could reasonably be considered inappropriate in a\n professional setting\n\n## Our Responsibilities\n\nProject maintainers are responsible for clarifying the standards of acceptable\nbehavior and are expected to take appropriate and fair corrective action in\nresponse to any instances of unacceptable behavior.\n\nProject maintainers have the right and responsibility to remove, edit, or\nreject comments, commits, code, wiki edits, issues, and other contributions\nthat are not aligned to this Code of Conduct, or to ban temporarily or\npermanently any contributor for other behaviors that they deem inappropriate,\nthreatening, offensive, or harmful.\n\n## Scope\n\nThis Code of Conduct applies both within 
project spaces and in public spaces\nwhen an individual is representing the project or its community. Examples of\nrepresenting a project or community include using an official project e-mail\naddress, posting via an official social media account, or acting as an appointed\nrepresentative at an online or offline event. Representation of a project may be\nfurther defined and clarified by project maintainers.\n\n## Enforcement\n\nInstances of abusive, harassing, or otherwise unacceptable behavior may be\nreported by contacting the project team at becherarmin@gmail.com. All\ncomplaints will be reviewed and investigated and will result in a response that\nis deemed necessary and appropriate to the circumstances. The project team is\nobligated to maintain confidentiality with regard to the reporter of an incident.\nFurther details of specific enforcement policies may be posted separately.\n\nProject maintainers who do not follow or enforce the Code of Conduct in good\nfaith may face temporary or permanent repercussions as determined by other\nmembers of the project's leadership.\n\n## Attribution\n\nThis Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,\navailable at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html\n\n[homepage]: https://www.contributor-covenant.org\n\nFor answers to common questions about this code of conduct, see\nhttps://www.contributor-covenant.org/faq\n"
  },
  {
    "path": "CONTRIBUTING.md",
    "content": "# Contribution\n\nAll contributions and comments are welcome! Open an issue or create a pull request whenever you find a bug or have an idea to improve this crate.\n"
  },
  {
    "path": "Cargo.toml",
    "content": "[package]\nname = \"mlc\"\nversion = \"1.2.0\"\nauthors = [\"Armin Becher <becherarmin@gmail.com>\"]\nedition = \"2018\"\ndescription = \"The markup link checker (mlc) checks for broken links in markup files.\"\nkeywords = [ \"link-checker\", \"broken\", \"markup\", \"html\", \"markdown\"]\nreadme = \"README.md\"\nlicense = \"MIT\"\nrepository = \"https://github.com/becheran/mlc\"\n\n[badges]\nmaintenance = { status = \"actively-developed\" }\nis-it-maintained-open-issues = { repository = \"becheran/mlc\" }\nis-it-maintained-issue-resolution = { repository = \"becheran/mlc\" }\n\n[dependencies]\nclap = { version = \"4.6.0\", features = [\"cargo\"] }\nlog = \"0.4.29\"\nfern = \"0.7.1\"\nwalkdir = \"2.5.0\"\nregex = \"1.12.3\"\nlazy_static = \"1.5.0\"\nurl = \"2.5.4\"\ncolored = \"3.1.1\"\nasync-std = \"1.13.2\"\nreqwest = {version=\"0.13.2\", features = [\"native-tls-vendored\", \"brotli\", \"gzip\", \"deflate\"] }\ntokio = {version=\"1.51.1\", features = [\"rt-multi-thread\", \"macros\", \"time\"] }\nfutures = \"0.3.32\"\nwildmatch = \"2.6.1\"\npulldown-cmark = \"0.13.3\"\ntoml = \"1.1.2\"\nserde = { version = \"1.0.219\", features = [\"derive\"] }\nurl-escape = \"0.1.1\"\n\n[dev-dependencies]\nntest = \"0.9.5\"\ncriterion = \"0.8.2\"\nmockito = \"1.7.2\"\n\n[[bench]]\nname = \"benchmarks\"\nharness = false\n"
  },
  {
    "path": "Dockerfile",
    "content": "FROM ubuntu:24.04\n\nRUN apt-get update; apt-get install -y ca-certificates; update-ca-certificates\nRUN apt-get install git -y\nADD ./target/release/mlc /bin/mlc\nRUN chmod +x /bin/mlc\nRUN PATH=$PATH:/bin/mlc\n"
  },
  {
    "path": "GithubAction-Dockerfile",
    "content": "FROM becheran/mlc:1.2.0\n\nLABEL repository=\"https://github.com/becheran/mlc\"\n\nCOPY entrypoint.sh /entrypoint.sh\nRUN chmod +x /entrypoint.sh\n\nCOPY LICENSE README.md /\n\nENTRYPOINT [\"/entrypoint.sh\"]"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2020 Armin Becher\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "# Markup Link Checker\n\n[![crates.io](https://img.shields.io/crates/v/mlc.svg?color=orange)](https://crates.io/crates/mlc)\n[![downloads](https://badgen.net/crates/d/mlc?color=blue)](https://crates.io/crates/mlc)\n[![build status](https://github.com/becheran/mlc/actions/workflows/ci.yml/badge.svg)](https://github.com/becheran/mlc/actions/workflows/ci.yml)\n[![license](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/license/mit)\n[![PRs welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](https://github.com/becheran/mlc/blob/master/CONTRIBUTING.md)\n\n![image](./docs/mlc.gif)\n\nCheck for broken links in markup files. Currently `html` and `markdown` files are supported. The Markup Link Checker can easily be integrated in your CI pipeline to prevent broken links in your markup docs.\n\n## Features\n\n* Find and check links in `markdown` and `html` files\n* Validated absolute and relative file paths and URLs\n* Support for ignore/disable comments to skip specific links or blocks\n* User friendly command line interface\n* Easy [CI pipeline integration](#ci-pipeline)\n* Very fast execution using [async rust](https://rust-lang.github.io/async-book/)\n* Efficient link resolving strategy which tries with minimized network load\n* Throttle option to prevent *429 Too Many Requests* errors\n* Report broken links via GitHub workflow commands\n\n## Install Locally\n\nThere are different ways to install and use *mlc*.\n\n### Cargo\n\nUse rust's package manager [cargo](https://doc.rust-lang.org/cargo/) to install *mlc* from [crates.io](https://crates.io/crates/mlc):\n\n``` bash\ncargo install mlc\n```\n\n### Download Binaries\n\nTo download a compiled binary version of *mlc* go to [github releases](https://github.com/becheran/mlc/releases) and download the binaries compiled for:\n\n* **Linux**: x86_64 and aarch64 (arm64)\n* **macOS**: aarch64 (Apple Silicon)\n* **Windows**: x86_64\n\n### Arch Linux\n\nYou can install 
*mlc* from the [official repositories](https://archlinux.org/packages/extra/x86_64/markuplinkchecker/) using [pacman](https://wiki.archlinux.org/title/Pacman):\n\n```bash\npacman -S markuplinkchecker\n```\n\n## CI Pipeline\n\n### GitHub Actions\n\nUse *mlc* on GitHub via the *GitHub Action* from the [Marketplace](https://github.com/marketplace/actions/markup-link-checker-mlc).\n\n``` yaml\n- name: Markup Link Checker (mlc)\n  uses: becheran/mlc@v1\n```\n\nPass *mlc* command line arguments via the `with` argument:\n\n``` yaml\n- name: Markup Link Checker (mlc)\n  uses: becheran/mlc@v1\n  with:\n    args: >-\n      ./README.md\n      -H \"User-Agent: Mozilla/5.0\"\n      -H \"Authorization: Bearer ${{ secrets.MY_TOKEN }}\"\n```\n\nThe action uses [GitHub workflow commands](https://docs.github.com/en/actions/reference/workflows-and-actions/workflow-commands) to highlight broken links:\n\n![annotation](./docs/FailingAnnotation.PNG)\n\n### Binary\n\nTo integrate *mlc* in your CI pipeline running in a *linux x86_64 environment*, you can add the following commands to download and execute it:\n\n``` bash\ncurl -L https://github.com/becheran/mlc/releases/download/v1.2.0/mlc-x86_64-linux -o mlc\nchmod +x mlc\n./mlc\n```\n\nFor **linux aarch64/arm64** environments, use:\n\n``` bash\ncurl -L https://github.com/becheran/mlc/releases/download/v1.2.0/mlc-aarch64-linux -o mlc\nchmod +x mlc\n./mlc\n```\n\nFor an example, take a look at the [ntest repo](https://github.com/becheran/ntest/blob/master/.github/workflows/ci.yml), which uses *mlc* in the CI pipeline.\n\n## Docker\n\nUse the *mlc* Docker image from [Docker Hub](https://hub.docker.com/r/becheran/mlc):\n\n``` sh\ndocker run becheran/mlc mlc\n```\n\n## Usage\n\nOnce you have *mlc* installed, it can be called from the command line. 
The following call will check all links in markup files found in the current folder and all subdirectories:\n\n``` bash\nmlc\n```\n\nAnother example is to call *mlc* on a certain directory or file:\n\n``` bash\nmlc ./docs\n```\n\nTo check only specific files, for example all `README.md` files in a monorepo:\n\n```bash\nmlc --files \"./README.md,./project1/README.md,./project2/README.md\"\n```\n\nAlternatively, you may want to ignore all files currently ignored by `git` (requires the `git` binary to be found on $PATH) and set a root-dir for relative links:\n\n```bash\nmlc --gitignore --root-dir .\n```\n\nCall *mlc* with the `--help` flag to display all available CLI arguments:\n\n``` bash\nmlc -h\n```\n\nThe following arguments are available:\n\n| Argument         | Short | Description |\n|------------------|-------|-------------|\n| `<directory>`    |       | Only positional argument. Path to the directory which shall be checked, including all sub-dirs. Can also be a specific filename which shall be checked. |\n| `--help`         | `-h`  | Print help |\n| `--debug`        | `-d`  | Show verbose debug information |\n| `--do-not-warn-for-redirect-to` | | Do not warn for links which redirect to the given URL. Allows the same link format as `--ignore-links`. For example, `--do-not-warn-for-redirect-to \"http*://crates.io*\"` will not warn for links which redirect to the `crates.io` website. |\n| `--offline`      | `-o`  | Do not check any web links. Renamed from `--no-web-links`, which is still an alias for backwards compatibility |\n| `--match-file-extension` | `-e`  | Set the flag if the file extension shall be checked as well. For example, the markup link `[link](dir/file)` matches if a file called `file.md` exists in `dir`, but fails when the `--match-file-extension` flag is set. |\n| `--version`      | `-V` | Print current version of mlc |\n| `--ignore-path`  | `-p` | Comma separated list of directories or files which shall be ignored. 
|\n| `--gitignore`    | `-g` | Ignore all files currently ignored by git (requires `git` binary to be available on $PATH). |\n| `--gituntracked` | `-u` | Ignore all files currently untracked by git (requires `git` binary to be available on $PATH). |\n| `--ignore-links` | `-i` | Comma separated list of links which shall be ignored. Use simple `?` and `*` wildcards. For example `--ignore-links \"http*://crates.io*\"` will skip all links to the crates.io website. See the [wildmatch library](https://github.com/becheran/wildmatch) for more information. |\n| `--markup-types` | `-t` | Comma separated list of markup types which shall be checked. Possible values: `md`, `html` |\n| `--root-dir`     | `-r` | All links to the file system starting with a slash on Linux or backslash on Windows will use another virtual root dir. For example the link in a file `[link](/dir/other/file.md)` checked with the cli arg `--root-dir /env/another/dir` will let *mlc* check the existence of `/env/another/dir/dir/other/file.md`. |\n| `--throttle`     | `-T` | Number of milliseconds to wait in between web requests to the same host. Default is zero which means no throttling. Set this if you need to slow down the web request frequency to avoid `429 - Too Many Requests` responses. For example, with `--throttle 15`, *mlc* waits 15 ms between consecutive HTTP checks to the same host. Note that this setting can slow down the link checker. |\n| `--csv`          |      | Path to csv file which contains all failed requests and warnings in the format `source,line,column,target,severity`. The severity column contains `ERR` for errors and `WARN` for warnings. |\n| `--files`        | `-f` | Comma separated list of files which shall be checked. For example `--files \"./README.md,./docs/README.md\"` will check only the specified files. This is useful for checking specific files in a monorepo without having to exclude many directories. 
|\n| `--http-headers` | `-H` | Comma separated list of custom HTTP headers in the format `'Name: Value'`. This is useful for setting custom user agents or other headers required by specific websites. For example `--http-headers \"User-Agent: Mozilla/5.0,X-Custom-Header: value\"` will set both a custom user agent and an additional header. |\n\n## Ignore Comments\n\nYou can use HTML comments to disable link checking for specific lines or blocks in both markdown and HTML files:\n\n### Disable for Current Line\n\n```markdown\n<!-- mlc-disable-line --> [This link](http://broken-link.invalid) will be ignored\n```\n\n### Disable for Next Line\n\n```markdown\n<!-- mlc-disable-next-line -->\n[This link](http://broken-link.invalid) will be ignored\n```\n\n### Disable/Enable Blocks\n\n```markdown\n[This link](http://example.com) will be checked\n\n<!-- mlc-disable -->\n[This link](http://broken-link.invalid) will be ignored\n[This link](http://also-broken.invalid) will also be ignored\n<!-- mlc-enable -->\n\n[This link](http://example.org) will be checked again\n```\n\nIf you use `<!-- mlc-disable -->` without a corresponding `<!-- mlc-enable -->`, all links from that point until the end of the file will be ignored.\n\nThese comments work in both markdown and HTML files.\n\nAll optional arguments which can be passed via the command line can also be configured via the `.mlc.toml` config file in the working directory:\n\n``` toml\n# Print debug information to console\ndebug = true\n# Do not warn for links which redirect to the given URL\ndo-not-warn-for-redirect-to=[\"http*://crates.io*\"]\n# Do not check web links\noffline = true\n# Check the exact file extension when searching for a file\nmatch-file-extension= true\n# List of files and directories which will be ignored\nignore-path=[\"./ignore-me\",\"./src\"]\n# Ignore all files ignored by git\ngitignore = true\n# List of links which will be ignored\nignore-links=[\"http://ignore-me.de/*\",\"http://*.ignoresub-domain/*\"]\n# 
List of markup types which shall be checked\nmarkup-types=[\"Markdown\",\"Html\"]\n# Wait time in milliseconds between HTTP requests to the same host\nthrottle= 100\n# Path to the root folder used to resolve all relative paths\nroot-dir=\"./\"\n# Path to csv file which contains all failed requests and warnings\ncsv=\"output.csv\"\n# List of specific files to check\nfiles=[\"./README.md\",\"./docs/README.md\"]\n# Custom HTTP headers to send with web requests\nhttp-headers=[\"User-Agent: Mozilla/5.0\",\"X-Custom-Header: value\"]\n```\n\n## Changelog\n\nCheck out the [changelog file](https://github.com/becheran/mlc/blob/master/CHANGELOG.md) to see the changes between different versions.\n\n## License\n\nThis project is licensed under the *MIT License* - see the [LICENSE file](https://github.com/becheran/mlc/blob/master/LICENSE) for more details.\n"
  },
  {
    "path": "action.yml",
    "content": "name: 'Markup Link Checker (mlc)'\ndescription: 'Check links in markup files'\ninputs:\n  args:\n    description: 'arguments'\n    default: './'\nruns:\n  using: 'docker'\n  image: 'GithubAction-Dockerfile'\nbranding:\n  icon: 'link'  \n  color: 'green'"
  },
  {
    "path": "benches/benchmark/html/many_links.html",
    "content": "<html>\n<h1>Hello, world!</h1>\n<p>bla bla <a hreflang=\"en\" href=\"https://www.w3schools.com\">Visit W3Schools.com!</a> bla bla </p>\n<p>bla bla <a href  = \"https://www.w3schools.com\">Visit W3Schools.com!</a> bla bla </p>\n<p>multiline\n    <a \n    href=  \n    \"https://www.w3schools.com\"   >Visit W3Schools.com!\n    </> bla bla\n</p>\nsdjklf slfkj <!--\n<p>commented </p>\n-->\n</html>"
  },
  {
    "path": "benches/benchmark/html/no_links.html",
    "content": ""
  },
  {
    "path": "benches/benchmark/html/xhtml.xhtml",
    "content": "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\"\n\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n\n<html xmlns=\"http://www.w3.org/1999/xhtml\">\n\n<head>\n  <title>Title of document</title>\n  <a href  = \"https://www.w3schools.com\">Visit W3Schools.com!</a>\n</head>\n\n<body>\n  some content\n</body>\n\n</html>"
  },
  {
    "path": "benches/benchmark/markdown/HashLinks.md",
    "content": "# Chapter 1\n\n[go to chapter 2](#chapter-2)\n\n[go to chapter 2-2](#####chapter-21)\n\n[go to chapter 2-2](#####chapter-22)\n\n[go to Other Page](./ref_links.md#ref-link-chapters)\n\n[go to Other Page](./ref_links.md###ref-link-chapters)\n\n# Chapter 2\n\n## Chapter 21\n\n## Chapter 22\n\n[go to chapter 1](#####chapter-1)\n\n# Chapter 3"
  },
  {
    "path": "benches/benchmark/markdown/ansi_encoded.md",
    "content": "# File containing some ansi char\n\n�"
  },
  {
    "path": "benches/benchmark/markdown/broken-local-link.md",
    "content": "# All broken links\n\n[broken](./doc/broken-local-link.doc)\n[ok](./binary_file.md)\n"
  },
  {
    "path": "benches/benchmark/markdown/deep/deeper/go_up.md",
    "content": "[furtherup](../../HashLinks.md)"
  },
  {
    "path": "benches/benchmark/markdown/ignore_me.md",
    "content": "[Broken](broken_Link)\n[Broken](broken_Link)\n[Broken](broken_Link)\n"
  },
  {
    "path": "benches/benchmark/markdown/ignore_me_dir/ignore_me copy.md",
    "content": "[Broken](broken_Link)\n[Broken](broken_Link)\n[Broken](broken_Link)\n"
  },
  {
    "path": "benches/benchmark/markdown/ignore_me_dir/ignore_me.md",
    "content": "[Broken](broken_Link)\n[Broken](broken_Link)\n[Broken](broken_Link)\n"
  },
  {
    "path": "benches/benchmark/markdown/link_ignore_file_extension.md",
    "content": "# Chapter 1\n\n[ref](./ref_links)\n[ref](./no_links/no_links)\n"
  },
  {
    "path": "benches/benchmark/markdown/many_links/many_links (another copy).md",
    "content": "# Torrentur suum abstrahor quique Iuppiter rerum mediocris\n\n[many_links](./many_links.md)\n\n[many_links](./many_links)\n\n## Siquid suis Anguem sola\n\nLorem markdownum illum Hymenaee crudelius, sub magni, sic missa, sui quas dixit\nadde Othrys successor conspecta. Acuto fuit, hinc nata caedit dolentes.\n\nMea nec, honores egreditur fugae, suffuderat nudaque nomen redeuntia tamen\ncommenta. Eris dum; caelumque felix poscebatur **diro et** virgine totumque\nfactis: satiata Ophias agnovit parvos gratulor.\n\nSpatio et hasta. Somni hic Pergama saeviat vincta. Quod fessa aethere stratoque,\nabluit hoc *erat*, vera sua protinus rati? Indignatur ferenda arma *moverat* ubi\nspina his, conpagibus saepe altera. Esto sortem vota: esse, molle armo auras\n*et* quod mortale cum isdem *nigra surgit*?\n\n[many_links](./many_links.md)\n\n## Figuram ait credita auctus Stygiisque ventrem redimicula\n\nEt quemquam nec nostri, nil atque stabat unus. Defensamus numina! Aevi non\nmutantur dedecus minus rediit, carent contermina Thybrin adorant volucrisque\nvita quassaque adparet residunt Proximus.\n\n[many_links](./many_links.md)\n\n    characterRwData(user_scrolling_wiki, -5, 76);\n    if (wi_platform_cpl) {\n        token_encoding_android.progressiveLeft = kbps_honeypot + motherboard;\n    }\n    if (uploadBus.iphone(ppl + 95, jquery - 436282)) {\n        webmail.plug -= -3;\n        log_expansion += mampProperty;\n    } else {\n        installer_error = koffice_wiki_upnp * analystMultitasking;\n        whois_card_correction = rippingMemorySsid;\n    }\n    bar_pim += esports(33, hardSli * verticalBarcraft *\n            systemAutoresponderFlowchart);\n    var laser_vdsl = displayCaptcha + standaloneBrowser + character + 17;\n\n[many_links](./many_links.md)\nDomus nobis mihi, iam nec temperat: opes liceat volucri, tamen pars cruor\nnymphae feroxque. Et hiems audierat atque ora avia huic Sidone; ut non est nubes\nepota. 
Erat satos nec suo ut inhaesi ignes, est **fer praesens cum** genus.\nMurmure ad et possit mensum, et speque, diversa et adnuit, singula clamavit\nfacitote.\n[many_links](./many_links.md)\n\n## Deos Atlas\n[many_links](./many_links.md)\n\nHausit telluris et tandem inscius. Negaretur manu, scopulis fuit vulgique,\ninvenit putes locuta.\n\n    ppp *= mini(intranetMmsLte, system_repository_ldap) + gis + 17 / san + 1;\n    vdu = dataIpad;\n    pci(2);\n    offline_hardening_cycle += telnet_flash_spyware + cybercrimeFormulaRate(37);\n\n[many_links](./many_links.md)\n\nFecit interea **sub Melaneus**, veniente loco matre. Solacia Titani digitis\ninterrita communemque venit grator oraque supplex frigora, tibi. Si digitorum se\nhumum variasque **viscera** Lyciaeque a poscitis incurvae erat nullo quod\nrelictus.\n\n[many_links](./many_links.md)\n[many_links](./many_links.md)\n[many_links](./many_links.md)\n"
  },
  {
    "path": "benches/benchmark/markdown/many_links/many_links (copy).md",
    "content": "# Torrentur suum abstrahor quique Iuppiter rerum mediocris\n\n[many_links](./many_links.md)\n\n## Siquid suis Anguem sola\n\nLorem markdownum illum Hymenaee crudelius, sub magni, sic missa, sui quas dixit\nadde Othrys successor conspecta. Acuto fuit, hinc nata caedit dolentes.\n\nMea nec, honores egreditur fugae, suffuderat nudaque nomen redeuntia tamen\ncommenta. Eris dum; caelumque felix poscebatur **diro et** virgine totumque\nfactis: satiata Ophias agnovit parvos gratulor.\n\nSpatio et hasta. Somni hic Pergama saeviat vincta. Quod fessa aethere stratoque,\nabluit hoc *erat*, vera sua protinus rati? Indignatur ferenda arma *moverat* ubi\nspina his, conpagibus saepe altera. Esto sortem vota: esse, molle armo auras\n*et* quod mortale cum isdem *nigra surgit*?\n\n[many_links](./many_links.md)\n\n## Figuram ait credita auctus Stygiisque ventrem redimicula\n\nEt quemquam nec nostri, nil atque stabat unus. Defensamus numina! Aevi non\nmutantur dedecus minus rediit, carent contermina Thybrin adorant volucrisque\nvita quassaque adparet residunt Proximus.\n\n[many_links](./many_links.md)\n\n    characterRwData(user_scrolling_wiki, -5, 76);\n    if (wi_platform_cpl) {\n        token_encoding_android.progressiveLeft = kbps_honeypot + motherboard;\n    }\n    if (uploadBus.iphone(ppl + 95, jquery - 436282)) {\n        webmail.plug -= -3;\n        log_expansion += mampProperty;\n    } else {\n        installer_error = koffice_wiki_upnp * analystMultitasking;\n        whois_card_correction = rippingMemorySsid;\n    }\n    bar_pim += esports(33, hardSli * verticalBarcraft *\n            systemAutoresponderFlowchart);\n    var laser_vdsl = displayCaptcha + standaloneBrowser + character + 17;\n\n[many_links](./many_links.md)\nDomus nobis mihi, iam nec temperat: opes liceat volucri, tamen pars cruor\nnymphae feroxque. Et hiems audierat atque ora avia huic Sidone; ut non est nubes\nepota. 
Erat satos nec suo ut inhaesi ignes, est **fer praesens cum** genus.\nMurmure ad et possit mensum, et speque, diversa et adnuit, singula clamavit\nfacitote.\n[many_links](./many_links.md)\n\n## Deos Atlas\n[many_links](./many_links.md)\n\nHausit telluris et tandem inscius. Negaretur manu, scopulis fuit vulgique,\ninvenit putes locuta.\n\n    ppp *= mini(intranetMmsLte, system_repository_ldap) + gis + 17 / san + 1;\n    vdu = dataIpad;\n    pci(2);\n    offline_hardening_cycle += telnet_flash_spyware + cybercrimeFormulaRate(37);\n\n[many_links](./many_links.md)\n\nFecit interea **sub Melaneus**, veniente loco matre. Solacia Titani digitis\ninterrita communemque venit grator oraque supplex frigora, tibi. Si digitorum se\nhumum variasque **viscera** Lyciaeque a poscitis incurvae erat nullo quod\nrelictus.\n\n[many_links](./many_links.md)\n[many_links](./many_links.md)\n[many_links](./many_links.md)\n"
  },
  {
    "path": "benches/benchmark/markdown/many_links/many_links.md",
    "content": "# Torrentur suum abstrahor quique Iuppiter rerum mediocris\n\n[many_links](./many_links.md)\n[a Stack data structure](https://en.wikipedia.org/wiki/Stack_(abstract_data_type))\n## Siquid suis Anguem sola\n\nLorem markdownum illum Hymenaee crudelius, sub magni, sic missa, sui quas dixit\nadde Othrys successor conspecta. Acuto fuit, hinc nata caedit dolentes.\n\nMea nec, honores egreditur fugae, suffuderat nudaque nomen redeuntia tamen\ncommenta. Eris dum; caelumque felix poscebatur **diro et** virgine totumque\nfactis: satiata Ophias agnovit parvos gratulor.\n\nSpatio et hasta. Somni hic Pergama saeviat vincta. Quod fessa aethere stratoque,\nabluit hoc *erat*, vera sua protinus rati? Indignatur ferenda arma *moverat* ubi\nspina his, conpagibus saepe altera. Esto sortem vota: esse, molle armo auras\n*et* quod mortale cum isdem *nigra surgit*?\n\n[many_links](./many_links.md)\n\n## Figuram ait credita auctus Stygiisque ventrem redimicula\n\nEt quemquam nec nostri, nil atque stabat unus. Defensamus numina! Aevi non\nmutantur dedecus minus rediit, carent contermina Thybrin adorant volucrisque\nvita quassaque adparet residunt Proximus.\n\n[many_links](./many_links.md)\n\n    characterRwData(user_scrolling_wiki, -5, 76);\n    if (wi_platform_cpl) {\n        token_encoding_android.progressiveLeft = kbps_honeypot + motherboard;\n    }\n    if (uploadBus.iphone(ppl + 95, jquery - 436282)) {\n        webmail.plug -= -3;\n        log_expansion += mampProperty;\n    } else {\n        installer_error = koffice_wiki_upnp * analystMultitasking;\n        whois_card_correction = rippingMemorySsid;\n    }\n    bar_pim += esports(33, hardSli * verticalBarcraft *\n            systemAutoresponderFlowchart);\n    var laser_vdsl = displayCaptcha + standaloneBrowser + character + 17;\n\n[many_links](./many_links.md)\nDomus nobis mihi, iam nec temperat: opes liceat volucri, tamen pars cruor\nnymphae feroxque. 
Et hiems audierat atque ora avia huic Sidone; ut non est nubes\nepota. Erat satos nec suo ut inhaesi ignes, est **fer praesens cum** genus.\nMurmure ad et possit mensum, et speque, diversa et adnuit, singula clamavit\nfacitote.\n[many_links](./many_links.md)\n\n## Deos Atlas\n[many_links](./many_links.md)\n\nHausit telluris et tandem inscius. Negaretur manu, scopulis fuit vulgique,\ninvenit putes locuta.\n\n    ppp *= mini(intranetMmsLte, system_repository_ldap) + gis + 17 / san + 1;\n    vdu = dataIpad;\n    pci(2);\n    offline_hardening_cycle += telnet_flash_spyware + cybercrimeFormulaRate(37);\n\n[many_links](./many_links.md)\n\nFecit interea **sub Melaneus**, veniente loco matre. Solacia Titani digitis\ninterrita communemque venit grator oraque supplex frigora, tibi. Si digitorum se\nhumum variasque **viscera** Lyciaeque a poscitis incurvae erat nullo quod\nrelictus.\n\n[many_links](./many_links.md)\n[many_links](./many_links.md)\n[many_links](./many_links.md)\n\n* [option one] foo\n* [option two]: bar\n"
  },
  {
    "path": "benches/benchmark/markdown/many_links.md",
    "content": "# Many Links\n\n[local_file](many_links.md)\n[folder](./deep)\n[https_link](https://www.google.de/)\n[https_link2](https://www.google.de/?hl=de)\n\n[mail](mailto://test.mail@tester.com)\n\n[unkown_url](another://foobar)\n"
  },
  {
    "path": "benches/benchmark/markdown/md_file_endings/F3_with_umlaut.md",
    "content": "# Torrentur suum abstrahor quique Iuppiter rerum mediocris\n\n## Siquid suis Anguem sola\n\nLorem markdownum illum Hymenaee crudelius, sub magni, sic missa, sui quas dixit\nadde Othrys successor conspecta. Acuto fuit, hinc nata caedit dolentes.\n\nMea nec, honores egreditur fugae, suffuderat nudaque nomen redeuntia tamen\ncommenta. Eris dum; caelumque felix poscebatur **diro et** virgine totumque\nfactis: satiata Ophias agnovit parvos gratulor.\n\nSpatio et hasta. Somni hic Pergama saeviat vincta. Quod fessa aethere stratoque,\nabluit hoc *erat*, vera sua protinus rati? Indignatur ferenda arma *moverat* ubi\nspina his, conpagibus saepe altera. Esto sortem vota: esse, molle armo auras\n*et* quod mortale cum isdem *nigra surgit*?\n\n## Figuram ait credita auctus Stygiisque ventrem redimicula\n\nEt quemquam nec nostri, nil atque stabat unus. Defensamus numina! Aevi non\nmutantur dedecus minus rediit, carent contermina Thybrin adorant volucrisque\nvita quassaque adparet residunt Proximus.\n\n    characterRwData(user_scrolling_wiki, -5, 76);\n    if (wi_platform_cpl) {\n        token_encoding_android.progressiveLeft = kbps_honeypot + motherboard;\n    }\n    if (uploadBus.iphone(ppl + 95, jquery - 436282)) {\n        webmail.plug -= -3;\n        log_expansion += mampProperty;\n    } else {\n        installer_error = koffice_wiki_upnp * analystMultitasking;\n        whois_card_correction = rippingMemorySsid;\n    }\n    bar_pim += esports(33, hardSli * verticalBarcraft *\n            systemAutoresponderFlowchart);\n    var laser_vdsl = displayCaptcha + standaloneBrowser + character + 17;\n\nDomus nobis mihi, iam nec temperat: opes liceat volucri, tamen pars cruor\nnymphae feroxque. Et hiems audierat atque ora avia huic Sidone; ut non est nubes\nepota. 
Erat satos nec suo ut inhaesi ignes, est **fer praesens cum** genus.\nMurmure ad et possit mensum, et speque, diversa et adnuit, singula clamavit\nfacitote.\n\n## Deos Atlas\n\nHausit telluris et tandem inscius. Negaretur manu, scopulis fuit vulgique,\ninvenit putes locuta.\n\n    ppp *= mini(intranetMmsLte, system_repository_ldap) + gis + 17 / san + 1;\n    vdu = dataIpad;\n    pci(2);\n    offline_hardening_cycle += telnet_flash_spyware + cybercrimeFormulaRate(37);\n\nFecit interea **sub Melaneus**, veniente [si\n\nipsum](NoLInk) loco matre. Solacia Titani digitis\ninterrita communemque venit grator oraque supplex frigora, tibi. Si digitorum se\nhumum variasque **viscera** Lyciaeque a poscitis incurvae erat nullo quod\nrelictus.\n"
  },
  {
    "path": "benches/benchmark/markdown/md_file_endings/NotMardown.nm",
    "content": ""
  },
  {
    "path": "benches/benchmark/markdown/md_file_endings/f1.md",
    "content": "# Torrentur suum abstrahor quique Iuppiter rerum mediocris\n\n## Siquid suis Anguem sola\n\nLorem markdownum illum Hymenaee crudelius, sub magni, sic missa, sui quas dixit\nadde Othrys successor conspecta. Acuto fuit, hinc nata caedit dolentes.\n\nMea nec, honores egreditur fugae, suffuderat nudaque nomen redeuntia tamen\ncommenta. Eris dum; caelumque felix poscebatur **diro et** virgine totumque\nfactis: satiata Ophias agnovit parvos gratulor.\n\nSpatio et hasta. Somni hic Pergama saeviat vincta. Quod fessa aethere stratoque,\nabluit hoc *erat*, vera sua protinus rati? Indignatur ferenda arma *moverat* ubi\nspina his, conpagibus saepe altera. Esto sortem vota: esse, molle armo auras\n*et* quod mortale cum isdem *nigra surgit*?\n\n## Figuram ait credita auctus Stygiisque ventrem redimicula\n\nEt quemquam nec nostri, nil atque stabat unus. Defensamus numina! Aevi non\nmutantur dedecus minus rediit, carent contermina Thybrin adorant volucrisque\nvita quassaque adparet residunt Proximus.\n\n    characterRwData(user_scrolling_wiki, -5, 76);\n    if (wi_platform_cpl) {\n        token_encoding_android.progressiveLeft = kbps_honeypot + motherboard;\n    }\n    if (uploadBus.iphone(ppl + 95, jquery - 436282)) {\n        webmail.plug -= -3;\n        log_expansion += mampProperty;\n    } else {\n        installer_error = koffice_wiki_upnp * analystMultitasking;\n        whois_card_correction = rippingMemorySsid;\n    }\n    bar_pim += esports(33, hardSli * verticalBarcraft *\n            systemAutoresponderFlowchart);\n    var laser_vdsl = displayCaptcha + standaloneBrowser + character + 17;\n\nDomus nobis mihi, iam nec temperat: opes liceat volucri, tamen pars cruor\nnymphae feroxque. Et hiems audierat atque ora avia huic Sidone; ut non est nubes\nepota. 
Erat satos nec suo ut inhaesi ignes, est **fer praesens cum** genus.\nMurmure ad et possit mensum, et speque, diversa et adnuit, singula clamavit\nfacitote.\n\n## Deos Atlas\n\nHausit telluris et tandem inscius. Negaretur manu, scopulis fuit vulgique,\ninvenit putes locuta.\n\n    ppp *= mini(intranetMmsLte, system_repository_ldap) + gis + 17 / san + 1;\n    vdu = dataIpad;\n    pci(2);\n    offline_hardening_cycle += telnet_flash_spyware + cybercrimeFormulaRate(37);\n\nFecit interea **sub Melaneus**, veniente [si\n\nipsum](NoLInk) loco matre. Solacia Titani digitis\ninterrita communemque venit grator oraque supplex frigora, tibi. Si digitorum se\nhumum variasque **viscera** Lyciaeque a poscitis incurvae erat nullo quod\nrelictus.\n"
  },
  {
    "path": "benches/benchmark/markdown/md_file_endings/f10.text",
    "content": ""
  },
  {
    "path": "benches/benchmark/markdown/md_file_endings/f11.Rmd",
    "content": ""
  },
  {
    "path": "benches/benchmark/markdown/md_file_endings/f12.mkd",
    "content": ""
  },
  {
    "path": "benches/benchmark/markdown/md_file_endings/f2.MD",
    "content": "# Torrentur suum abstrahor quique Iuppiter rerum mediocris\n\n## Siquid suis Anguem sola\n\nLorem markdownum illum Hymenaee crudelius, sub magni, sic missa, sui quas dixit\nadde Othrys successor conspecta. Acuto fuit, hinc nata caedit dolentes.\n\nMea nec, honores egreditur fugae, suffuderat nudaque nomen redeuntia tamen\ncommenta. Eris dum; caelumque felix poscebatur **diro et** virgine totumque\nfactis: satiata Ophias agnovit parvos gratulor.\n\nSpatio et hasta. Somni hic Pergama saeviat vincta. Quod fessa aethere stratoque,\nabluit hoc *erat*, vera sua protinus rati? Indignatur ferenda arma *moverat* ubi\nspina his, conpagibus saepe altera. Esto sortem vota: esse, molle armo auras\n*et* quod mortale cum isdem *nigra surgit*?\n\n## Figuram ait credita auctus Stygiisque ventrem redimicula\n\nEt quemquam nec nostri, nil atque stabat unus. Defensamus numina! Aevi non\nmutantur dedecus minus rediit, carent contermina Thybrin adorant volucrisque\nvita quassaque adparet residunt Proximus.\n\n    characterRwData(user_scrolling_wiki, -5, 76);\n    if (wi_platform_cpl) {\n        token_encoding_android.progressiveLeft = kbps_honeypot + motherboard;\n    }\n    if (uploadBus.iphone(ppl + 95, jquery - 436282)) {\n        webmail.plug -= -3;\n        log_expansion += mampProperty;\n    } else {\n        installer_error = koffice_wiki_upnp * analystMultitasking;\n        whois_card_correction = rippingMemorySsid;\n    }\n    bar_pim += esports(33, hardSli * verticalBarcraft *\n            systemAutoresponderFlowchart);\n    var laser_vdsl = displayCaptcha + standaloneBrowser + character + 17;\n\nDomus nobis mihi, iam nec temperat: opes liceat volucri, tamen pars cruor\nnymphae feroxque. Et hiems audierat atque ora avia huic Sidone; ut non est nubes\nepota. 
Erat satos nec suo ut inhaesi ignes, est **fer praesens cum** genus.\nMurmure ad et possit mensum, et speque, diversa et adnuit, singula clamavit\nfacitote.\n\n## Deos Atlas\n\nHausit telluris et tandem inscius. Negaretur manu, scopulis fuit vulgique,\ninvenit putes locuta.\n\n    ppp *= mini(intranetMmsLte, system_repository_ldap) + gis + 17 / san + 1;\n    vdu = dataIpad;\n    pci(2);\n    offline_hardening_cycle += telnet_flash_spyware + cybercrimeFormulaRate(37);\n\nFecit interea **sub Melaneus**, veniente [si\n\nipsum](NoLInk) loco matre. Solacia Titani digitis\ninterrita communemque venit grator oraque supplex frigora, tibi. Si digitorum se\nhumum variasque **viscera** Lyciaeque a poscitis incurvae erat nullo quod\nrelictus.\n"
  },
  {
    "path": "benches/benchmark/markdown/md_file_endings/f4.markdown",
    "content": ""
  },
  {
    "path": "benches/benchmark/markdown/md_file_endings/f5.mkdown",
    "content": ""
  },
  {
    "path": "benches/benchmark/markdown/md_file_endings/f6.mkdn",
    "content": ""
  },
  {
    "path": "benches/benchmark/markdown/md_file_endings/f7.mdwn",
    "content": ""
  },
  {
    "path": "benches/benchmark/markdown/md_file_endings/f8.mdtxt",
    "content": "\n"
  },
  {
    "path": "benches/benchmark/markdown/md_file_endings/f9.mdtext",
    "content": ""
  },
  {
    "path": "benches/benchmark/markdown/md_file_endings/notmd",
    "content": "Domus nobis mihi, iam nec temperat: opes liceat volucri, tamen pars cruor\nnymphae feroxque. Et hiems audierat atque ora avia huic Sidone; ut non est nubes\nepota. Erat satos nec suo ut inhaesi ignes, est **fer praesens cum** genus.\nMurmure ad et possit mensum, et speque, diversa et adnuit, singula clamavit\nfacitote."
  },
  {
    "path": "benches/benchmark/markdown/no_links/no_links (3rd copy).md",
    "content": "# Torrentur suum abstrahor quique Iuppiter rerum mediocris\n\n## Siquid suis Anguem sola\n\nLorem markdownum illum Hymenaee crudelius, sub magni, sic missa, sui quas dixit\nadde Othrys successor conspecta. Acuto fuit, hinc nata caedit dolentes.\n\nMea nec, honores egreditur fugae, suffuderat nudaque nomen redeuntia tamen\ncommenta. Eris dum; caelumque felix poscebatur **diro et** virgine totumque\nfactis: satiata Ophias agnovit parvos gratulor.\n\nSpatio et hasta. Somni hic Pergama saeviat vincta. Quod fessa aethere stratoque,\nabluit hoc *erat*, vera sua protinus rati? Indignatur ferenda arma *moverat* ubi\nspina his, conpagibus saepe altera. Esto sortem vota: esse, molle armo auras\n*et* quod mortale cum isdem *nigra surgit*?\n\n## Figuram ait credita auctus Stygiisque ventrem redimicula\n\nEt quemquam nec nostri, nil atque stabat unus. Defensamus numina! Aevi non\nmutantur dedecus minus rediit, carent contermina Thybrin adorant volucrisque\nvita quassaque adparet residunt Proximus.\n\n    characterRwData(user_scrolling_wiki, -5, 76);\n    if (wi_platform_cpl) {\n        token_encoding_android.progressiveLeft = kbps_honeypot + motherboard;\n    }\n    if (uploadBus.iphone(ppl + 95, jquery - 436282)) {\n        webmail.plug -= -3;\n        log_expansion += mampProperty;\n    } else {\n        installer_error = koffice_wiki_upnp * analystMultitasking;\n        whois_card_correction = rippingMemorySsid;\n    }\n    bar_pim += esports(33, hardSli * verticalBarcraft *\n            systemAutoresponderFlowchart);\n    var laser_vdsl = displayCaptcha + standaloneBrowser + character + 17;\n\nDomus nobis mihi, iam nec temperat: opes liceat volucri, tamen pars cruor\nnymphae feroxque. Et hiems audierat atque ora avia huic Sidone; ut non est nubes\nepota. 
Erat satos nec suo ut inhaesi ignes, est **fer praesens cum** genus.\nMurmure ad et possit mensum, et speque, diversa et adnuit, singula clamavit\nfacitote.\n\n## Deos Atlas\n\nHausit telluris et tandem inscius. Negaretur manu, scopulis fuit vulgique,\ninvenit putes locuta.\n\n    ppp *= mini(intranetMmsLte, system_repository_ldap) + gis + 17 / san + 1;\n    vdu = dataIpad;\n    pci(2);\n    offline_hardening_cycle += telnet_flash_spyware + cybercrimeFormulaRate(37);\n\nFecit interea **sub Melaneus**, veniente  loco matre. Solacia Titani digitis\ninterrita communemque venit grator oraque supplex frigora, tibi. Si digitorum se\nhumum variasque **viscera** Lyciaeque a poscitis incurvae erat nullo quod\nrelictus.\n"
  },
  {
    "path": "benches/benchmark/markdown/no_links/no_links (4th copy).md",
    "content": "# Torrentur suum abstrahor quique Iuppiter rerum mediocris\n\n## Siquid suis Anguem sola\n\nLorem markdownum illum Hymenaee crudelius, sub magni, sic missa, sui quas dixit\nadde Othrys successor conspecta. Acuto fuit, hinc nata caedit dolentes.\n\nMea nec, honores egreditur fugae, suffuderat nudaque nomen redeuntia tamen\ncommenta. Eris dum; caelumque felix poscebatur **diro et** virgine totumque\nfactis: satiata Ophias agnovit parvos gratulor.\n\nSpatio et hasta. Somni hic Pergama saeviat vincta. Quod fessa aethere stratoque,\nabluit hoc *erat*, vera sua protinus rati? Indignatur ferenda arma *moverat* ubi\nspina his, conpagibus saepe altera. Esto sortem vota: esse, molle armo auras\n*et* quod mortale cum isdem *nigra surgit*?\n\n## Figuram ait credita auctus Stygiisque ventrem redimicula\n\nEt quemquam nec nostri, nil atque stabat unus. Defensamus numina! Aevi non\nmutantur dedecus minus rediit, carent contermina Thybrin adorant volucrisque\nvita quassaque adparet residunt Proximus.\n\n    characterRwData(user_scrolling_wiki, -5, 76);\n    if (wi_platform_cpl) {\n        token_encoding_android.progressiveLeft = kbps_honeypot + motherboard;\n    }\n    if (uploadBus.iphone(ppl + 95, jquery - 436282)) {\n        webmail.plug -= -3;\n        log_expansion += mampProperty;\n    } else {\n        installer_error = koffice_wiki_upnp * analystMultitasking;\n        whois_card_correction = rippingMemorySsid;\n    }\n    bar_pim += esports(33, hardSli * verticalBarcraft *\n            systemAutoresponderFlowchart);\n    var laser_vdsl = displayCaptcha + standaloneBrowser + character + 17;\n\nDomus nobis mihi, iam nec temperat: opes liceat volucri, tamen pars cruor\nnymphae feroxque. Et hiems audierat atque ora avia huic Sidone; ut non est nubes\nepota. 
Erat satos nec suo ut inhaesi ignes, est **fer praesens cum** genus.\nMurmure ad et possit mensum, et speque, diversa et adnuit, singula clamavit\nfacitote.\n\n## Deos Atlas\n\nHausit telluris et tandem inscius. Negaretur manu, scopulis fuit vulgique,\ninvenit putes locuta.\n\n    ppp *= mini(intranetMmsLte, system_repository_ldap) + gis + 17 / san + 1;\n    vdu = dataIpad;\n    pci(2);\n    offline_hardening_cycle += telnet_flash_spyware + cybercrimeFormulaRate(37);\n\nFecit interea **sub Melaneus**, veniente  loco matre. Solacia Titani digitis\ninterrita communemque venit grator oraque supplex frigora, tibi. Si digitorum se\nhumum variasque **viscera** Lyciaeque a poscitis incurvae erat nullo quod\nrelictus.\n"
  },
  {
    "path": "benches/benchmark/markdown/no_links/no_links (5th copy).md",
    "content": "# Torrentur suum abstrahor quique Iuppiter rerum mediocris\n\n## Siquid suis Anguem sola\n\nLorem markdownum illum Hymenaee crudelius, sub magni, sic missa, sui quas dixit\nadde Othrys successor conspecta. Acuto fuit, hinc nata caedit dolentes.\n\nMea nec, honores egreditur fugae, suffuderat nudaque nomen redeuntia tamen\ncommenta. Eris dum; caelumque felix poscebatur **diro et** virgine totumque\nfactis: satiata Ophias agnovit parvos gratulor.\n\nSpatio et hasta. Somni hic Pergama saeviat vincta. Quod fessa aethere stratoque,\nabluit hoc *erat*, vera sua protinus rati? Indignatur ferenda arma *moverat* ubi\nspina his, conpagibus saepe altera. Esto sortem vota: esse, molle armo auras\n*et* quod mortale cum isdem *nigra surgit*?\n\n## Figuram ait credita auctus Stygiisque ventrem redimicula\n\nEt quemquam nec nostri, nil atque stabat unus. Defensamus numina! Aevi non\nmutantur dedecus minus rediit, carent contermina Thybrin adorant volucrisque\nvita quassaque adparet residunt Proximus.\n\n    characterRwData(user_scrolling_wiki, -5, 76);\n    if (wi_platform_cpl) {\n        token_encoding_android.progressiveLeft = kbps_honeypot + motherboard;\n    }\n    if (uploadBus.iphone(ppl + 95, jquery - 436282)) {\n        webmail.plug -= -3;\n        log_expansion += mampProperty;\n    } else {\n        installer_error = koffice_wiki_upnp * analystMultitasking;\n        whois_card_correction = rippingMemorySsid;\n    }\n    bar_pim += esports(33, hardSli * verticalBarcraft *\n            systemAutoresponderFlowchart);\n    var laser_vdsl = displayCaptcha + standaloneBrowser + character + 17;\n\nDomus nobis mihi, iam nec temperat: opes liceat volucri, tamen pars cruor\nnymphae feroxque. Et hiems audierat atque ora avia huic Sidone; ut non est nubes\nepota. 
Erat satos nec suo ut inhaesi ignes, est **fer praesens cum** genus.\nMurmure ad et possit mensum, et speque, diversa et adnuit, singula clamavit\nfacitote.\n\n## Deos Atlas\n\nHausit telluris et tandem inscius. Negaretur manu, scopulis fuit vulgique,\ninvenit putes locuta.\n\n    ppp *= mini(intranetMmsLte, system_repository_ldap) + gis + 17 / san + 1;\n    vdu = dataIpad;\n    pci(2);\n    offline_hardening_cycle += telnet_flash_spyware + cybercrimeFormulaRate(37);\n\nFecit interea **sub Melaneus**, veniente  loco matre. Solacia Titani digitis\ninterrita communemque venit grator oraque supplex frigora, tibi. Si digitorum se\nhumum variasque **viscera** Lyciaeque a poscitis incurvae erat nullo quod\nrelictus.\n"
  },
  {
    "path": "benches/benchmark/markdown/no_links/no_links (6th copy).md",
    "content": "# Torrentur suum abstrahor quique Iuppiter rerum mediocris\n\n## Siquid suis Anguem sola\n\nLorem markdownum illum Hymenaee crudelius, sub magni, sic missa, sui quas dixit\nadde Othrys successor conspecta. Acuto fuit, hinc nata caedit dolentes.\n\nMea nec, honores egreditur fugae, suffuderat nudaque nomen redeuntia tamen\ncommenta. Eris dum; caelumque felix poscebatur **diro et** virgine totumque\nfactis: satiata Ophias agnovit parvos gratulor.\n\nSpatio et hasta. Somni hic Pergama saeviat vincta. Quod fessa aethere stratoque,\nabluit hoc *erat*, vera sua protinus rati? Indignatur ferenda arma *moverat* ubi\nspina his, conpagibus saepe altera. Esto sortem vota: esse, molle armo auras\n*et* quod mortale cum isdem *nigra surgit*?\n\n## Figuram ait credita auctus Stygiisque ventrem redimicula\n\nEt quemquam nec nostri, nil atque stabat unus. Defensamus numina! Aevi non\nmutantur dedecus minus rediit, carent contermina Thybrin adorant volucrisque\nvita quassaque adparet residunt Proximus.\n\n    characterRwData(user_scrolling_wiki, -5, 76);\n    if (wi_platform_cpl) {\n        token_encoding_android.progressiveLeft = kbps_honeypot + motherboard;\n    }\n    if (uploadBus.iphone(ppl + 95, jquery - 436282)) {\n        webmail.plug -= -3;\n        log_expansion += mampProperty;\n    } else {\n        installer_error = koffice_wiki_upnp * analystMultitasking;\n        whois_card_correction = rippingMemorySsid;\n    }\n    bar_pim += esports(33, hardSli * verticalBarcraft *\n            systemAutoresponderFlowchart);\n    var laser_vdsl = displayCaptcha + standaloneBrowser + character + 17;\n\nDomus nobis mihi, iam nec temperat: opes liceat volucri, tamen pars cruor\nnymphae feroxque. Et hiems audierat atque ora avia huic Sidone; ut non est nubes\nepota. 
Erat satos nec suo ut inhaesi ignes, est **fer praesens cum** genus.\nMurmure ad et possit mensum, et speque, diversa et adnuit, singula clamavit\nfacitote.\n\n## Deos Atlas\n\nHausit telluris et tandem inscius. Negaretur manu, scopulis fuit vulgique,\ninvenit putes locuta.\n\n    ppp *= mini(intranetMmsLte, system_repository_ldap) + gis + 17 / san + 1;\n    vdu = dataIpad;\n    pci(2);\n    offline_hardening_cycle += telnet_flash_spyware + cybercrimeFormulaRate(37);\n\nFecit interea **sub Melaneus**, veniente  loco matre. Solacia Titani digitis\ninterrita communemque venit grator oraque supplex frigora, tibi. Si digitorum se\nhumum variasque **viscera** Lyciaeque a poscitis incurvae erat nullo quod\nrelictus.\n"
  },
  {
    "path": "benches/benchmark/markdown/no_links/no_links (7th copy).md",
    "content": "# Torrentur suum abstrahor quique Iuppiter rerum mediocris\n\n## Siquid suis Anguem sola\n\nLorem markdownum illum Hymenaee crudelius, sub magni, sic missa, sui quas dixit\nadde Othrys successor conspecta. Acuto fuit, hinc nata caedit dolentes.\n\nMea nec, honores egreditur fugae, suffuderat nudaque nomen redeuntia tamen\ncommenta. Eris dum; caelumque felix poscebatur **diro et** virgine totumque\nfactis: satiata Ophias agnovit parvos gratulor.\n\nSpatio et hasta. Somni hic Pergama saeviat vincta. Quod fessa aethere stratoque,\nabluit hoc *erat*, vera sua protinus rati? Indignatur ferenda arma *moverat* ubi\nspina his, conpagibus saepe altera. Esto sortem vota: esse, molle armo auras\n*et* quod mortale cum isdem *nigra surgit*?\n\n## Figuram ait credita auctus Stygiisque ventrem redimicula\n\nEt quemquam nec nostri, nil atque stabat unus. Defensamus numina! Aevi non\nmutantur dedecus minus rediit, carent contermina Thybrin adorant volucrisque\nvita quassaque adparet residunt Proximus.\n\n    characterRwData(user_scrolling_wiki, -5, 76);\n    if (wi_platform_cpl) {\n        token_encoding_android.progressiveLeft = kbps_honeypot + motherboard;\n    }\n    if (uploadBus.iphone(ppl + 95, jquery - 436282)) {\n        webmail.plug -= -3;\n        log_expansion += mampProperty;\n    } else {\n        installer_error = koffice_wiki_upnp * analystMultitasking;\n        whois_card_correction = rippingMemorySsid;\n    }\n    bar_pim += esports(33, hardSli * verticalBarcraft *\n            systemAutoresponderFlowchart);\n    var laser_vdsl = displayCaptcha + standaloneBrowser + character + 17;\n\nDomus nobis mihi, iam nec temperat: opes liceat volucri, tamen pars cruor\nnymphae feroxque. Et hiems audierat atque ora avia huic Sidone; ut non est nubes\nepota. 
Erat satos nec suo ut inhaesi ignes, est **fer praesens cum** genus.\nMurmure ad et possit mensum, et speque, diversa et adnuit, singula clamavit\nfacitote.\n\n## Deos Atlas\n\nHausit telluris et tandem inscius. Negaretur manu, scopulis fuit vulgique,\ninvenit putes locuta.\n\n    ppp *= mini(intranetMmsLte, system_repository_ldap) + gis + 17 / san + 1;\n    vdu = dataIpad;\n    pci(2);\n    offline_hardening_cycle += telnet_flash_spyware + cybercrimeFormulaRate(37);\n\nFecit interea **sub Melaneus**, veniente  loco matre. Solacia Titani digitis\ninterrita communemque venit grator oraque supplex frigora, tibi. Si digitorum se\nhumum variasque **viscera** Lyciaeque a poscitis incurvae erat nullo quod\nrelictus.\n"
  },
  {
    "path": "benches/benchmark/markdown/no_links/no_links (another copy).md",
    "content": "# Torrentur suum abstrahor quique Iuppiter rerum mediocris\n\n## Siquid suis Anguem sola\n\nLorem markdownum illum Hymenaee crudelius, sub magni, sic missa, sui quas dixit\nadde Othrys successor conspecta. Acuto fuit, hinc nata caedit dolentes.\n\nMea nec, honores egreditur fugae, suffuderat nudaque nomen redeuntia tamen\ncommenta. Eris dum; caelumque felix poscebatur **diro et** virgine totumque\nfactis: satiata Ophias agnovit parvos gratulor.\n\nSpatio et hasta. Somni hic Pergama saeviat vincta. Quod fessa aethere stratoque,\nabluit hoc *erat*, vera sua protinus rati? Indignatur ferenda arma *moverat* ubi\nspina his, conpagibus saepe altera. Esto sortem vota: esse, molle armo auras\n*et* quod mortale cum isdem *nigra surgit*?\n\n## Figuram ait credita auctus Stygiisque ventrem redimicula\n\nEt quemquam nec nostri, nil atque stabat unus. Defensamus numina! Aevi non\nmutantur dedecus minus rediit, carent contermina Thybrin adorant volucrisque\nvita quassaque adparet residunt Proximus.\n\n    characterRwData(user_scrolling_wiki, -5, 76);\n    if (wi_platform_cpl) {\n        token_encoding_android.progressiveLeft = kbps_honeypot + motherboard;\n    }\n    if (uploadBus.iphone(ppl + 95, jquery - 436282)) {\n        webmail.plug -= -3;\n        log_expansion += mampProperty;\n    } else {\n        installer_error = koffice_wiki_upnp * analystMultitasking;\n        whois_card_correction = rippingMemorySsid;\n    }\n    bar_pim += esports(33, hardSli * verticalBarcraft *\n            systemAutoresponderFlowchart);\n    var laser_vdsl = displayCaptcha + standaloneBrowser + character + 17;\n\nDomus nobis mihi, iam nec temperat: opes liceat volucri, tamen pars cruor\nnymphae feroxque. Et hiems audierat atque ora avia huic Sidone; ut non est nubes\nepota. 
Erat satos nec suo ut inhaesi ignes, est **fer praesens cum** genus.\nMurmure ad et possit mensum, et speque, diversa et adnuit, singula clamavit\nfacitote.\n\n## Deos Atlas\n\nHausit telluris et tandem inscius. Negaretur manu, scopulis fuit vulgique,\ninvenit putes locuta.\n\n    ppp *= mini(intranetMmsLte, system_repository_ldap) + gis + 17 / san + 1;\n    vdu = dataIpad;\n    pci(2);\n    offline_hardening_cycle += telnet_flash_spyware + cybercrimeFormulaRate(37);\n\nFecit interea **sub Melaneus**, veniente  loco matre. Solacia Titani digitis\ninterrita communemque venit grator oraque supplex frigora, tibi. Si digitorum se\nhumum variasque **viscera** Lyciaeque a poscitis incurvae erat nullo quod\nrelictus.\n"
  },
  {
    "path": "benches/benchmark/markdown/no_links/no_links (copy).md",
    "content": "# Torrentur suum abstrahor quique Iuppiter rerum mediocris\n\n## Siquid suis Anguem sola\n\nLorem markdownum illum Hymenaee crudelius, sub magni, sic missa, sui quas dixit\nadde Othrys successor conspecta. Acuto fuit, hinc nata caedit dolentes.\n\nMea nec, honores egreditur fugae, suffuderat nudaque nomen redeuntia tamen\ncommenta. Eris dum; caelumque felix poscebatur **diro et** virgine totumque\nfactis: satiata Ophias agnovit parvos gratulor.\n\nSpatio et hasta. Somni hic Pergama saeviat vincta. Quod fessa aethere stratoque,\nabluit hoc *erat*, vera sua protinus rati? Indignatur ferenda arma *moverat* ubi\nspina his, conpagibus saepe altera. Esto sortem vota: esse, molle armo auras\n*et* quod mortale cum isdem *nigra surgit*?\n\n## Figuram ait credita auctus Stygiisque ventrem redimicula\n\nEt quemquam nec nostri, nil atque stabat unus. Defensamus numina! Aevi non\nmutantur dedecus minus rediit, carent contermina Thybrin adorant volucrisque\nvita quassaque adparet residunt Proximus.\n\n    characterRwData(user_scrolling_wiki, -5, 76);\n    if (wi_platform_cpl) {\n        token_encoding_android.progressiveLeft = kbps_honeypot + motherboard;\n    }\n    if (uploadBus.iphone(ppl + 95, jquery - 436282)) {\n        webmail.plug -= -3;\n        log_expansion += mampProperty;\n    } else {\n        installer_error = koffice_wiki_upnp * analystMultitasking;\n        whois_card_correction = rippingMemorySsid;\n    }\n    bar_pim += esports(33, hardSli * verticalBarcraft *\n            systemAutoresponderFlowchart);\n    var laser_vdsl = displayCaptcha + standaloneBrowser + character + 17;\n\nDomus nobis mihi, iam nec temperat: opes liceat volucri, tamen pars cruor\nnymphae feroxque. Et hiems audierat atque ora avia huic Sidone; ut non est nubes\nepota. 
Erat satos nec suo ut inhaesi ignes, est **fer praesens cum** genus.\nMurmure ad et possit mensum, et speque, diversa et adnuit, singula clamavit\nfacitote.\n\n## Deos Atlas\n\nHausit telluris et tandem inscius. Negaretur manu, scopulis fuit vulgique,\ninvenit putes locuta.\n\n    ppp *= mini(intranetMmsLte, system_repository_ldap) + gis + 17 / san + 1;\n    vdu = dataIpad;\n    pci(2);\n    offline_hardening_cycle += telnet_flash_spyware + cybercrimeFormulaRate(37);\n\nFecit interea **sub Melaneus**, veniente  loco matre. Solacia Titani digitis\ninterrita communemque venit grator oraque supplex frigora, tibi. Si digitorum se\nhumum variasque **viscera** Lyciaeque a poscitis incurvae erat nullo quod\nrelictus.\n"
  },
  {
    "path": "benches/benchmark/markdown/no_links/no_links.md",
    "content": "# Torrentur suum abstrahor quique Iuppiter rerum mediocris\n\n## Siquid suis Anguem sola\n\nLorem markdownum illum Hymenaee crudelius, sub magni, sic missa, sui quas dixit\nadde Othrys successor conspecta. Acuto fuit, hinc nata caedit dolentes.\n\nMea nec, honores egreditur fugae, suffuderat nudaque nomen redeuntia tamen\ncommenta. Eris dum; caelumque felix poscebatur **diro et** virgine totumque\nfactis: satiata Ophias agnovit parvos gratulor.\n\nSpatio et hasta. Somni hic Pergama saeviat vincta. Quod fessa aethere stratoque,\nabluit hoc *erat*, vera sua protinus rati? Indignatur ferenda arma *moverat* ubi\nspina his, conpagibus saepe altera. Esto sortem vota: esse, molle armo auras\n*et* quod mortale cum isdem *nigra surgit*?\n\n## Figuram ait credita auctus Stygiisque ventrem redimicula\n\nEt quemquam nec nostri, nil atque stabat unus. Defensamus numina! Aevi non\nmutantur dedecus minus rediit, carent contermina Thybrin adorant volucrisque\nvita quassaque adparet residunt Proximus.\n\n    characterRwData(user_scrolling_wiki, -5, 76);\n    if (wi_platform_cpl) {\n        token_encoding_android.progressiveLeft = kbps_honeypot + motherboard;\n    }\n    if (uploadBus.iphone(ppl + 95, jquery - 436282)) {\n        webmail.plug -= -3;\n        log_expansion += mampProperty;\n    } else {\n        installer_error = koffice_wiki_upnp * analystMultitasking;\n        whois_card_correction = rippingMemorySsid;\n    }\n    bar_pim += esports(33, hardSli * verticalBarcraft *\n            systemAutoresponderFlowchart);\n    var laser_vdsl = displayCaptcha + standaloneBrowser + character + 17;\n\nDomus nobis mihi, iam nec temperat: opes liceat volucri, tamen pars cruor\nnymphae feroxque. Et hiems audierat atque ora avia huic Sidone; ut non est nubes\nepota. 
Erat satos nec suo ut inhaesi ignes, est **fer praesens cum** genus.\nMurmure ad et possit mensum, et speque, diversa et adnuit, singula clamavit\nfacitote.\n\n## Deos Atlas\n\nHausit telluris et tandem inscius. Negaretur manu, scopulis fuit vulgique,\ninvenit putes locuta.\n\n    ppp *= mini(intranetMmsLte, system_repository_ldap) + gis + 17 / san + 1;\n    vdu = dataIpad;\n    pci(2);\n    offline_hardening_cycle += telnet_flash_spyware + cybercrimeFormulaRate(37);\n\nFecit interea **sub Melaneus**, veniente  loco matre. Solacia Titani digitis\ninterrita communemque venit grator oraque supplex frigora, tibi. Si digitorum se\nhumum variasque **viscera** Lyciaeque a poscitis incurvae erat nullo quod\nrelictus.\n"
  },
  {
    "path": "benches/benchmark/markdown/ref_links.md",
    "content": "[link1][1] another Link: [link2][foo] \n\nUse link again [link1 agin][1]\n\nLKJDF\n[1]: ./ref_links.md\n        [foo]: ./ref_links.md\n     This aint no link   [boo]: ./not_existent.md\n\n# Ref Link Chapter"
  },
  {
    "path": "benches/benchmark/markdown/reference_link.md",
    "content": "# Contain reference style markdown links\n\n[I'm a reference-style link][Arbitrary case-insensitive reference text]\n\n[I'm a relative reference to a repository file](./many_links.md)\n\n[You can use numbers for reference-style link definitions][1]\n\nOr leave it empty and use the [link text itself].\n\n[This is not a valid reference link][2]\n\nURLs and URLs in angle brackets will automatically get turned into links.\n<http://www.example.com> or <http://www.example.com> and sometimes\nexample.com (but not on Github, for example).\n\nSome text to show that the reference links can follow later.\n\n[arbitrary case-insensitive reference text]: https://www.mozilla.org\n[1]: http://slashdot.org\n[link text itself]: https://www.google.com\n"
  },
  {
    "path": "benches/benchmark/markdown/repeate_same_link.md",
    "content": "# Chapter 1\n\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de
)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)\n[Google](https://google.de)"
  },
  {
    "path": "benches/benchmark/markdown/script_and_comments.md",
    "content": "[this is a link](./script_and_comments.md)\n\n``` js\n[this is not a link](./nowhere)\n```\n\n[this is a link](./script_and_comments.md)\n\nalso not a link `[this is not a link](./nowhere)`\n\n```\n[this is not a link](./nowhere)\n```\n\n[this is a link](./script_and_comments.md)\n\n<script type=\"text/javascript\">\n[this is not a link](./nowhere)\n</script>\n\n[this is a link](./script_and_comments.md)\n\n<!-- commented\n[this is not a link](./nowhere)\n-->\n\n<!-- commented [this is not a link](./nowhere)-->\n\n[this is a link](./script_and_comments.md)\n"
  },
  {
    "path": "benches/benchmark/markdown/withUmlaut_ö/LinksWithUmläuts.md",
    "content": "# Torrentur suum abstrahor quique Iuppiter rerum mediocris\n\n[many_links](./LinksWithUmläuts.md)\n\n"
  },
  {
    "path": "benches/benchmarks.rs",
    "content": "#[cfg(test)]\n#[macro_use]\nextern crate criterion;\n\nuse criterion::Criterion;\nuse mlc::markup::MarkupType;\nuse mlc::{Config, OptionalConfig};\nuse std::fs;\n\nasync fn end_to_end_benchmark() {\n    let config = Config {\n        directory: fs::canonicalize(\"./benches/benchmark/markdown/ignore_me_dir\").unwrap(),\n        optional: OptionalConfig {\n            markup_types: Some(vec![MarkupType::Markdown]),\n            ..Default::default()\n        },\n    };\n    mlc::run(&config).await.unwrap();\n}\n\nfn criterion_benchmark(c: &mut Criterion) {\n    c.bench_function(\"End to end benchmark\", |b| b.iter(end_to_end_benchmark));\n}\n\ncriterion_group! {\n    name = benches;\n    config = Criterion::default().sample_size(10);\n    targets = criterion_benchmark\n}\ncriterion_main!(benches);\n"
  },
  {
    "path": "benches/different_root/one/two.md",
    "content": "[one](\\one.md)\n[two](/two.md)"
  },
  {
    "path": "benches/different_root/one.md",
    "content": "[one](/one.md)\n[two](/two.md)\n[two](/one/two.md)"
  },
  {
    "path": "benches/different_root/two.md",
    "content": "[one](/one.md)\n[two](\\two.md)\n[two](/one/two.md)"
  },
  {
    "path": "benches/throttle/different_host.md",
    "content": "# Chapter 1\n\n[fooRandomNotValidURLBla0](https://fooRandomNotValidURLBla0.de/f0)\n[fooRandomNotValidURLBla1](https://fooRandomNotValidURLBla1.de/f1)\n[fooRandomNotValidURLBla2](https://fooRandomNotValidURLBla2.de/f2)\n[fooRandomNotValidURLBla3](https://fooRandomNotValidURLBla3.de/f3)\n[fooRandomNotValidURLBla4](https://fooRandomNotValidURLBla4.de/f4)\n[fooRandomNotValidURLBla5](https://fooRandomNotValidURLBla5.de/f5)\n[fooRandomNotValidURLBla6](https://fooRandomNotValidURLBla6.de/f6)\n[fooRandomNotValidURLBla7](https://fooRandomNotValidURLBla7.de/f7)\n[fooRandomNotValidURLBla8](https://fooRandomNotValidURLBla8.de/f8)\n[fooRandomNotValidURLBla9](https://fooRandomNotValidURLBla9.de/f9)\n"
  },
  {
    "path": "benches/throttle/same_host.md",
    "content": "# Chapter 1\n\n[foo](https://fooRandomNotValidURLBla0.de/f0)\n[foo](https://fooRandomNotValidURLBla0.de/f1)\n[foo](https://fooRandomNotValidURLBla0.de/f2)\n[foo](https://fooRandomNotValidURLBla0.de/f3)\n[foo](https://fooRandomNotValidURLBla0.de/f4)\n[foo](https://fooRandomNotValidURLBla0.de/f5)\n[foo](https://fooRandomNotValidURLBla0.de/f6)\n[foo](https://fooRandomNotValidURLBla0.de/f7)\n[foo](https://fooRandomNotValidURLBla0.de/f8)\n[foo](https://fooRandomNotValidURLBla0.de/f9)"
  },
  {
    "path": "benches/throttle/same_ip.md",
    "content": "# Chapter 1\n\n[foo](https://127.0.0.1/f0)\n[foo](https://127.0.0.1/f1)\n[foo](https://127.0.0.1/f2)\n[foo](https://127.0.0.1/f3)\n[foo](https://127.0.0.1/f4)\n[foo](https://127.0.0.1/f5)\n[foo](https://127.0.0.1/f6)\n[foo](https://127.0.0.1/f7)\n[foo](https://127.0.0.1/f8)\n[foo](https://127.0.0.1/f9)\n"
  },
  {
    "path": "entrypoint.sh",
    "content": "#!/bin/bash\n\nmlc $*"
  },
  {
    "path": "release.toml",
    "content": "pre-release-replacements = [\n  {file=\"README.md\", search=\"releases/download/v[0-9\\\\.-]+\", replace=\"releases/download/v{{version}}\"},\n  {file=\"CHANGELOG.md\", search=\"Unreleased\", replace=\"{{version}}\"},\n  {file=\"CHANGELOG.md\", search=\"ReleaseDate\", replace=\"{{date}}\"},\n  {file=\"CHANGELOG.md\", search=\"<!-- next-header -->\", replace=\"<!-- next-header -->\\n\\n## [Unreleased] - ReleaseDate\"},\n  {file=\"GithubAction-Dockerfile\", search=\"FROM becheran/mlc:[0-9\\\\.-]+\", replace=\"FROM becheran/mlc:{{version}}\"},\n]"
  },
  {
    "path": "src/cli.rs",
    "content": "use crate::markup::MarkupType;\nuse crate::Config;\nuse crate::OptionalConfig;\nuse clap::Arg;\nuse clap::ArgAction;\nuse std::fs;\nuse std::path::Path;\nuse std::path::MAIN_SEPARATOR;\n\nconst CONFIG_FILE_PATH: &str = \"./.mlc.toml\";\n\nfn normalize_path_separators(path: &str) -> String {\n    path.replace(['/', '\\\\'], std::path::MAIN_SEPARATOR_STR)\n}\n\n#[must_use]\npub fn parse_args() -> Config {\n    let mut opt: OptionalConfig = match fs::read_to_string(CONFIG_FILE_PATH) {\n        Ok(content) => match toml::from_str(&content) {\n            Ok(o) => o,\n            Err(err) => panic!(\"Invalid TOML file {:?}\", err),\n        },\n        Err(_) => OptionalConfig::default(),\n    };\n\n    if let Some(root_dir) = &opt.root_dir {\n        if !root_dir.is_dir() {\n            eprintln!(\"Root path {root_dir:?} must be an existing directory (from .mlc.toml).\");\n            std::process::exit(1);\n        }\n    }\n\n    let matches = command!()\n        .arg(\n            Arg::new(\"directory\")\n                .help(\"Check all links in given directory and subdirectory\")\n                .required(false)\n                .index(1),\n        )\n        .arg(arg!(-d --debug \"Print debug information to console\").required(false))\n        .arg(\n            arg!(-o --offline \"Do not check web links\")\n                .alias(\"no-web-links\")\n                .required(false),\n        )\n        .arg(\n            Arg::new(\"do-not-warn-for-redirect-to\")\n                .long(\"do-not-warn-for-redirect-to\")\n                .value_name(\"LINKS\")\n                .value_delimiter(',')\n                .action(ArgAction::Append)\n                .help(\"Comma separated list of links which will be ignored\")\n                .required(false),\n        )\n        .arg(\n            Arg::new(\"match-file-extension\")\n                .long(\"match-file-extension\")\n                .short('e')\n                .action(ArgAction::SetTrue)\n 
               .help(\"Check the exact file extension when searching for a file\")\n                .required(false),\n        )\n        .arg(\n            Arg::new(\"ignore-path\")\n                .long(\"ignore-path\")\n                .short('p')\n                .help(\"Comma separated list of files and directories which will be ignored\")\n                .value_name(\"PATHS\")\n                .value_delimiter(',')\n                .action(ArgAction::Append)\n                .required(false),\n        )\n        .arg(\n            Arg::new(\"ignore-links\")\n                .long(\"ignore-links\")\n                .short('i')\n                .value_name(\"LINKS\")\n                .value_delimiter(',')\n                .action(ArgAction::Append)\n                .help(\"Comma separated list of links which will be ignored\")\n                .required(false),\n        )\n        .arg(\n            Arg::new(\"markup-types\")\n                .long(\"markup-types\")\n                .short('t')\n                .value_name(\"TYPES\")\n                .help(\"Comma separated list of markup types which shall be checked\")\n                .action(ArgAction::Append)\n                .value_delimiter(',')\n                .required(false),\n        )\n        .arg(\n            Arg::new(\"throttle\")\n                .long(\"throttle\")\n                .short('T')\n                .value_name(\"DELAY-MS\")\n                .help(\"Wait time in milliseconds between HTTP requests to the same host\")\n                .action(ArgAction::Append)\n                .required(false),\n        )\n        .arg(\n            Arg::new(\"root-dir\")\n                .long(\"root-dir\")\n                .short('r')\n                .value_name(\"DIR\")\n                .help(\"Path to the root folder used to resolve all relative paths\")\n                .required(false),\n        )\n        .arg(\n            Arg::new(\"gitignore\")\n                .long(\"gitignore\")\n      
          .short('g')\n                .value_name(\"GIT\")\n                .help(\"Ignore all files ignored by git\")\n                .action(ArgAction::SetTrue)\n                .required(false),\n        )\n        .arg(\n            Arg::new(\"csv\")\n                .long(\"csv\")\n                .value_name(\"CSV_FILE\")\n                .help(\"set the output file for the CSV report\")\n                .required(false),\n        )\n        .arg(\n            Arg::new(\"gituntracked\")\n                .long(\"gituntracked\")\n                .short('u')\n                .value_name(\"GITUNTRACKED\")\n                .help(\"Ignore all files untracked by git\")\n                .action(ArgAction::SetTrue)\n                .required(false),\n        )\n        .arg(\n            Arg::new(\"files\")\n                .long(\"files\")\n                .short('f')\n                .help(\"Comma separated list of files which shall be checked\")\n                .value_name(\"FILES\")\n                .value_delimiter(',')\n                .action(ArgAction::Append)\n                .required(false),\n        )\n        .arg(\n            Arg::new(\"http-headers\")\n                .long(\"http-headers\")\n                .short('H')\n                .help(\"Comma separated list of custom HTTP headers in the format 'Name: Value'. 
For example 'User-Agent: Mozilla/5.0'\")\n                .value_name(\"HEADERS\")\n                .value_delimiter(',')\n                .action(ArgAction::Append)\n                .required(false),\n        )\n        .get_matches();\n\n    let default_dir = format!(\".{}\", &MAIN_SEPARATOR);\n    let dir_string = matches\n        .get_one::<String>(\"directory\")\n        .unwrap_or(&default_dir);\n    let directory = normalize_path_separators(dir_string)\n        .parse()\n        .expect(\"failed to parse path\");\n\n    if matches.get_flag(\"debug\") {\n        opt.debug = Some(true);\n    }\n\n    if let Some(do_not_warn_for_redirect_to) =\n        matches.get_many::<String>(\"do-not-warn-for-redirect-to\")\n    {\n        opt.do_not_warn_for_redirect_to =\n            Some(do_not_warn_for_redirect_to.map(|x| x.to_string()).collect());\n    }\n\n    if let Some(throttle_str) = matches.get_one::<String>(\"throttle\") {\n        let throttle = throttle_str.parse::<u32>().unwrap();\n        opt.throttle = Some(throttle);\n    }\n\n    if let Some(f) = matches.get_one::<String>(\"csv\") {\n        opt.csv_file = Some(Path::new(&normalize_path_separators(f)).to_path_buf());\n    }\n\n    if let Some(markup_types) = matches.get_many::<String>(\"markup-types\") {\n        opt.markup_types = Some(\n            markup_types\n                .map(|v| v.as_str().parse().expect(\"invalid markup type\"))\n                .collect(),\n        );\n    }\n    if opt.markup_types.is_none() {\n        opt.markup_types = Some(vec![MarkupType::Markdown, MarkupType::Html]);\n    }\n\n    if matches.get_flag(\"offline\") {\n        opt.offline = Some(true);\n    }\n\n    if matches.get_flag(\"match-file-extension\") {\n        opt.match_file_extension = Some(true)\n    }\n\n    if let Some(ignore_links) = matches.get_many::<String>(\"ignore-links\") {\n        opt.ignore_links = Some(ignore_links.map(|x| x.to_string()).collect());\n    }\n\n    if let Some(ignore_path) = 
matches.get_many::<String>(\"ignore-path\") {\n        let mut paths: Vec<_> = ignore_path.map(|x| Path::new(x).to_path_buf()).collect();\n        for p in paths.iter_mut() {\n            match fs::canonicalize(&p) {\n                Ok(canonical_path) => {\n                    *p = canonical_path;\n                }\n                Err(e) => {\n                    println!(\"⚠ Warn: Ignore path {p:?} not found. {e:?}.\");\n                }\n            };\n        }\n        opt.ignore_path = Some(paths);\n    }\n\n    if matches.get_flag(\"gitignore\") {\n        opt.gitignore = Some(true);\n    }\n\n    if matches.get_flag(\"gituntracked\") {\n        opt.gituntracked = Some(true);\n    }\n\n    if let Some(files) = matches.get_many::<String>(\"files\") {\n        let mut file_paths: Vec<_> = files\n            .map(|x| Path::new(&normalize_path_separators(x)).to_path_buf())\n            .collect();\n        for p in file_paths.iter_mut() {\n            match fs::canonicalize(&p) {\n                Ok(canonical_path) => {\n                    *p = canonical_path;\n                }\n                Err(e) => {\n                    println!(\"⚠ Warn: File path {p:?} not found. {e:?}.\");\n                }\n            };\n        }\n        opt.files = Some(file_paths);\n    }\n\n    if let Some(http_headers) = matches.get_many::<String>(\"http-headers\") {\n        opt.http_headers = Some(http_headers.map(|x| x.to_string()).collect());\n    }\n\n    if let Some(root_dir) = matches.get_one::<String>(\"root-dir\") {\n        let root_path = Path::new(&normalize_path_separators(root_dir)).to_path_buf();\n        if !root_path.is_dir() {\n            eprintln!(\"Root path {root_path:?} must be a directory!\");\n            std::process::exit(1);\n        }\n        opt.root_dir = Some(root_path)\n    }\n\n    Config {\n        directory,\n        optional: opt,\n    }\n}\n"
  },
  {
    "path": "src/file_traversal.rs",
    "content": "extern crate walkdir;\n\nuse crate::markup::{MarkupFile, MarkupType};\nuse crate::Config;\nuse std::collections::HashSet;\nuse std::fs;\nuse std::path::PathBuf;\nuse walkdir::WalkDir;\n\n/// Checks if a file path has already been seen and adds it to the set if not.\n/// Returns true if the file should be skipped (already seen), false otherwise.\nfn should_skip_file(seen_paths: &mut HashSet<PathBuf>, abs_path: PathBuf, f_name: &str) -> bool {\n    if seen_paths.contains(&abs_path) {\n        debug!(\n            \"Skip file {f_name}, already checked via canonical path: {:?}\",\n            abs_path\n        );\n        true\n    } else {\n        seen_paths.insert(abs_path);\n        false\n    }\n}\n\npub fn find(config: &Config, result: &mut Vec<MarkupFile>) {\n    let mut seen_paths: HashSet<PathBuf> = HashSet::new();\n    let markup_types = match &config.optional.markup_types {\n        Some(t) => t,\n        None => panic!(\"Bug! markup_types must be set\"),\n    };\n\n    // If specific files are provided, process only those files\n    if let Some(files) = &config.optional.files {\n        info!(\"Checking specific files: {files:?}\");\n\n        for file_path in files {\n            if !file_path.exists() {\n                warn!(\"File path '{file_path:?}' does not exist.\");\n                continue;\n            }\n\n            if !file_path.is_file() {\n                warn!(\"Path '{file_path:?}' is not a file.\");\n                continue;\n            }\n\n            let f_name = file_path\n                .file_name()\n                .map(|n| n.to_string_lossy().to_string())\n                .unwrap_or_default();\n\n            debug!(\"Check file: '{f_name}'\");\n\n            if let Some(markup_type) = markup_type(&f_name, markup_types) {\n                let abs_path = match fs::canonicalize(file_path) {\n                    Ok(abs_path) => abs_path,\n                    Err(e) => {\n                        warn!(\"Path 
'{file_path:?}' not able to canonicalize path. '{e}'\");\n                        continue;\n                    }\n                };\n\n                let ignore = match &config.optional.ignore_path {\n                    Some(p) => p.iter().any(|ignore_path| ignore_path == &abs_path),\n                    None => false,\n                };\n\n                if ignore {\n                    debug!(\"Ignore file {f_name}, because it is in the ignore path list.\");\n                } else if !should_skip_file(&mut seen_paths, abs_path, &f_name) {\n                    let file = MarkupFile {\n                        markup_type,\n                        path: file_path.to_string_lossy().to_string(),\n                    };\n                    debug!(\"Found file: {file:?}.\");\n                    result.push(file);\n                }\n            } else {\n                warn!(\"File '{f_name}' does not match any supported markup type.\");\n            }\n        }\n        return;\n    }\n\n    // Otherwise, use directory traversal\n    let root = &config.directory;\n    info!(\"Search for files of markup types '{markup_types:?}' in directory '{root:?}'\");\n\n    for entry in WalkDir::new(root)\n        .follow_links(false)\n        .into_iter()\n        .filter_entry(|e| {\n            !(e.file_type().is_dir()\n                && config.optional.ignore_path.as_ref().is_some_and(|x| {\n                    x.iter().any(|f| {\n                        let ignore = f.is_dir()\n                            && e.path()\n                                .canonicalize()\n                                .unwrap_or_default()\n                                .starts_with(fs::canonicalize(f).unwrap_or_default());\n                        if ignore {\n                            info!(\"Ignore directory: '{f:?}'\");\n                        }\n                        ignore\n                    })\n                }))\n        })\n        .filter_map(Result::ok)\n        
.filter(|e| !e.file_type().is_dir())\n    {\n        let f_name = entry.file_name().to_string_lossy();\n        debug!(\"Check file: '{f_name}'\");\n\n        if let Some(markup_type) = markup_type(&f_name, markup_types) {\n            let path = entry.path();\n\n            let abs_path = match fs::canonicalize(path) {\n                Ok(abs_path) => abs_path,\n                Err(e) => {\n                    warn!(\"Path '{path:?}' not able to canonicalize path. '{e}'\");\n                    continue;\n                }\n            };\n\n            let ignore = match &config.optional.ignore_path {\n                Some(p) => p.iter().any(|ignore_path| ignore_path == &abs_path),\n                None => false,\n            };\n            if ignore {\n                debug!(\"Ignore file {f_name}, because it is in the ignore path list.\");\n            } else if !should_skip_file(&mut seen_paths, abs_path, &f_name) {\n                let file = MarkupFile {\n                    markup_type,\n                    path: path.to_string_lossy().to_string(),\n                };\n                debug!(\"Found file: {file:?}.\");\n                result.push(file);\n            }\n        }\n    }\n}\n\nfn markup_type(file: &str, markup_types: &[MarkupType]) -> Option<MarkupType> {\n    let file_low = file.to_lowercase();\n    for markup_type in markup_types {\n        let extensions = markup_type.file_extensions();\n        for ext in extensions {\n            let mut ext_low = String::from(\".\");\n            ext_low.push_str(&ext.to_lowercase());\n            if file_low.ends_with(&ext_low) {\n                return Some(*markup_type);\n            }\n        }\n    }\n\n    None\n}\n"
  },
  {
    "path": "src/lib.rs",
    "content": "#[macro_use]\nextern crate log;\n#[macro_use]\nextern crate clap;\n#[macro_use]\nextern crate lazy_static;\n\nuse crate::link_extractors::link_extractor::MarkupLink;\nuse crate::link_validator::link_type::get_link_type;\nuse crate::link_validator::link_type::LinkType;\nuse crate::link_validator::resolve_target_link;\nuse crate::markup::MarkupFile;\nuse link_extractors::link_extractor::BrokenExtractedLink;\nuse serde::Deserialize;\nuse std::collections::HashMap;\nuse std::env;\nuse std::fmt;\nuse std::fs;\nuse std::io::Write;\nuse std::path::Path;\nuse std::path::PathBuf;\nuse std::process::Command;\nuse std::sync::Arc;\nuse std::vec;\nuse tokio::sync::Mutex;\nuse tokio::time::{sleep_until, Duration, Instant};\npub mod cli;\npub mod file_traversal;\npub mod link_extractors;\npub mod link_validator;\npub mod logger;\npub mod markup;\npub use colored::*;\npub use wildmatch::WildMatch;\n\nuse futures::{stream, StreamExt};\nuse link_validator::LinkCheckResult;\nuse url::Url;\n\nconst PARALLEL_REQUESTS: usize = 20;\n\n#[derive(Default, Debug, Deserialize)]\npub struct OptionalConfig {\n    pub debug: Option<bool>,\n    #[serde(rename(deserialize = \"do-not-warn-for-redirect-to\"))]\n    pub do_not_warn_for_redirect_to: Option<Vec<String>>,\n    #[serde(rename(deserialize = \"markup-types\"))]\n    pub markup_types: Option<Vec<markup::MarkupType>>,\n    pub offline: Option<bool>,\n    #[serde(rename(deserialize = \"match-file-extension\"))]\n    pub match_file_extension: Option<bool>,\n    #[serde(rename(deserialize = \"ignore-links\"))]\n    pub ignore_links: Option<Vec<String>>,\n    #[serde(rename(deserialize = \"ignore-path\"))]\n    pub ignore_path: Option<Vec<PathBuf>>,\n    #[serde(rename(deserialize = \"root-dir\"))]\n    pub root_dir: Option<PathBuf>,\n    #[serde(rename(deserialize = \"csv\"))]\n    pub csv_file: Option<PathBuf>,\n    #[serde(rename(deserialize = \"gitignore\"))]\n    pub gitignore: Option<bool>,\n    #[serde(rename(deserialize = 
\"gituntracked\"))]\n    pub gituntracked: Option<bool>,\n    pub throttle: Option<u32>,\n    #[serde(rename(deserialize = \"files\"))]\n    pub files: Option<Vec<PathBuf>>,\n    #[serde(rename(deserialize = \"http-headers\"))]\n    pub http_headers: Option<Vec<String>>,\n}\n\n#[derive(Default, Debug, Deserialize)]\npub struct Config {\n    pub directory: PathBuf,\n    pub optional: OptionalConfig,\n}\n\nimpl fmt::Display for Config {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        let ignore_str: Vec<String> = match &self.optional.ignore_links {\n            Some(s) => s.iter().map(|m| m.to_string()).collect(),\n            None => vec![],\n        };\n        let root_dir_str = match &self.optional.root_dir {\n            Some(p) => p.to_str().unwrap_or(\"\"),\n            None => \"\",\n        };\n        let ignore_path_str: Vec<String> = match &self.optional.ignore_path {\n            Some(p) => p.iter().map(|m| m.to_str().unwrap().to_string()).collect(),\n            None => vec![],\n        };\n        let csv_file_str: Vec<String> = match &self.optional.csv_file {\n            Some(p) => p.iter().map(|m| m.to_str().unwrap().to_string()).collect(),\n            None => vec![],\n        };\n        let markup_types_str: Vec<String> = match &self.optional.markup_types {\n            Some(p) => p.iter().map(|m| format![\"{m:?}\"]).collect(),\n            None => vec![],\n        };\n        let files_str: Vec<String> = match &self.optional.files {\n            Some(p) => p.iter().map(|m| m.to_str().unwrap().to_string()).collect(),\n            None => vec![],\n        };\n        let http_headers_str: Vec<String> = match &self.optional.http_headers {\n            Some(h) => h.clone(),\n            None => vec![],\n        };\n        write!(\n            f,\n            \"\nDebug: {:?}\nDir: {}\nDoNotWarnForRedirectTo: {:?}\nTypes: {:?}\nOffline: {}\nMatchExt: {}\nRootDir: {}\nGitignore: {}\nGituntracked: {}\nIgnoreLinks: {}\nIgnorePath: 
{:?}\nThrottle: {} ms\nCSVFile: {:?}\nFiles: {:?}\nHttpHeaders: {:?}\",\n            self.optional.debug.unwrap_or(false),\n            self.directory.to_str().unwrap_or_default(),\n            self.optional.do_not_warn_for_redirect_to,\n            markup_types_str,\n            self.optional.offline.unwrap_or_default(),\n            self.optional.match_file_extension.unwrap_or_default(),\n            root_dir_str,\n            self.optional.gitignore.unwrap_or_default(),\n            self.optional.gituntracked.unwrap_or_default(),\n            ignore_str.join(\",\"),\n            ignore_path_str,\n            self.optional.throttle.unwrap_or(0),\n            csv_file_str,\n            files_str,\n            http_headers_str\n        )\n    }\n}\n\n#[derive(Debug, Clone)]\nstruct FinalResult {\n    target: Target,\n    result_code: LinkCheckResult,\n}\n\n#[derive(Hash, PartialEq, Eq, Clone, Debug)]\nstruct Target {\n    target: String,\n    link_type: LinkType,\n}\n\nfn find_all_links(config: &Config) -> Vec<Result<MarkupLink, BrokenExtractedLink>> {\n    let mut files: Vec<MarkupFile> = Vec::new();\n    file_traversal::find(config, &mut files);\n    let mut links = vec![];\n    for file in files {\n        links.append(&mut link_extractors::link_extractor::find_links(&file));\n    }\n    links\n}\n\nfn git_repo_root(scan_root: &Path) -> Option<PathBuf> {\n    let output = Command::new(\"git\")\n        .arg(\"-C\")\n        .arg(scan_root)\n        .args([\"rev-parse\", \"--show-toplevel\"])\n        .output()\n        .ok()?;\n\n    if !output.status.success() {\n        debug!(\n            \"git rev-parse failed: {}\",\n            String::from_utf8_lossy(&output.stderr)\n        );\n        return None;\n    }\n\n    let root = String::from_utf8(output.stdout).ok()?;\n    let root = root.trim();\n    if root.is_empty() {\n        None\n    } else {\n        Some(PathBuf::from(root))\n    }\n}\n\nfn scan_root_dir(config: &Config) -> &Path {\n    let p = 
config.directory.as_path();\n    if p.is_file() {\n        p.parent().unwrap_or_else(|| Path::new(\".\"))\n    } else {\n        p\n    }\n}\n\nfn find_git_ignored_files(config: &Config) -> Option<Vec<PathBuf>> {\n    let scan_root = scan_root_dir(config);\n    let repo_root = git_repo_root(scan_root)?;\n\n    // Limit ls-files to the scanned subtree so nested .gitignore files are respected\n    // and we don't accidentally base results on the caller's current working directory.\n    let output = Command::new(\"git\")\n        .arg(\"-C\")\n        .arg(&repo_root)\n        .args([\n            \"ls-files\",\n            \"--ignored\",\n            \"--others\",\n            \"--exclude-standard\",\n            \"--\",\n            \".\",\n        ])\n        .current_dir(scan_root)\n        .output()\n        .ok()?;\n\n    if output.status.success() {\n        let ignored_files = String::from_utf8(output.stdout)\n            .ok()?\n            .lines()\n            .filter(|line| line.ends_with(\".md\") || line.ends_with(\".html\"))\n            .filter_map(|line| {\n                let rel = line.trim();\n                let full = repo_root.join(rel);\n                fs::canonicalize(full).ok()\n            })\n            .collect::<Vec<_>>();\n        Some(ignored_files)\n    } else {\n        eprintln!(\n            \"git ls-files command failed: {}\",\n            String::from_utf8_lossy(&output.stderr)\n        );\n        None\n    }\n}\n\nfn find_git_untracked_files(config: &Config) -> Option<Vec<PathBuf>> {\n    let scan_root = scan_root_dir(config);\n    let repo_root = git_repo_root(scan_root)?;\n\n    let output = Command::new(\"git\")\n        .arg(\"-C\")\n        .arg(&repo_root)\n        .args([\"ls-files\", \"--others\", \"--exclude-standard\", \"--\", \".\"])\n        .current_dir(scan_root)\n        .output()\n        .ok()?;\n\n    if output.status.success() {\n        let untracked_files = String::from_utf8(output.stdout)\n            
.ok()?\n            .lines()\n            .filter(|line| line.ends_with(\".md\") || line.ends_with(\".html\"))\n            .filter_map(|line| {\n                let rel = line.trim();\n                let full = repo_root.join(rel);\n                fs::canonicalize(full).ok()\n            })\n            .collect::<Vec<_>>();\n        Some(untracked_files)\n    } else {\n        eprintln!(\n            \"git ls-files command failed: {}\",\n            String::from_utf8_lossy(&output.stderr)\n        );\n        None\n    }\n}\n\nfn print_helper(\n    link: &MarkupLink,\n    status_code: &colored::ColoredString,\n    msg: &str,\n    error_channel: bool,\n) {\n    let mut link_str = format!(\"[{:^4}] {}\", status_code, link.source_str());\n    if !msg.is_empty() {\n        link_str += &format!(\" - {msg}\");\n    }\n    if error_channel {\n        eprintln!(\"{link_str}\");\n    } else {\n        println!(\"{link_str}\");\n    }\n}\n\nfn print_result(result: &FinalResult, map: &HashMap<Target, Vec<MarkupLink>>) {\n    for link in &map[&result.target] {\n        match &result.result_code {\n            LinkCheckResult::Ok => {\n                print_helper(link, &\"OK\".green(), \"\", false);\n            }\n            LinkCheckResult::NotImplemented(msg) | LinkCheckResult::Warning(msg) => {\n                print_helper(link, &\"Warn\".yellow(), msg, false);\n            }\n            LinkCheckResult::Ignored(msg) => {\n                print_helper(link, &\"Skip\".green(), msg, false);\n            }\n            LinkCheckResult::Failed(msg) => {\n                print_helper(link, &\"Err\".red(), msg, true);\n            }\n        }\n    }\n}\n\npub async fn run(config: &Config) -> Result<(), ()> {\n    let links = find_all_links(config);\n    let mut link_target_groups: HashMap<Target, Vec<MarkupLink>> = HashMap::new();\n\n    let mut skipped = 0;\n\n    let ignore_links: Vec<WildMatch> = match &config.optional.ignore_links {\n        Some(s) => 
s.iter().map(|m| WildMatch::new(m)).collect(),\n        None => vec![],\n    };\n\n    let gitignored_files: Option<Vec<PathBuf>> = if config.optional.gitignore.is_some() {\n        let files = find_git_ignored_files(config);\n        debug!(\"Found gitignored files: {files:?}\");\n        files\n    } else {\n        None\n    };\n\n    let is_gitignore_enabled = gitignored_files.is_some();\n\n    let gituntracked_files: Option<Vec<PathBuf>> = if config.optional.gituntracked.is_some() {\n        let files = find_git_untracked_files(config);\n        debug!(\"Found gituntracked files: {files:?}\");\n        files\n    } else {\n        None\n    };\n\n    let is_gituntracked_enabled = gituntracked_files.is_some();\n\n    let mut broken_references: Vec<BrokenExtractedLink> = vec![];\n    for link in &links {\n        match link {\n            Ok(link) => {\n                let canonical_link_source = match fs::canonicalize(&link.source) {\n                    Ok(path) => path,\n                    Err(e) => {\n                        warn!(\n                            \"Failed to canonicalize link source: {}. 
Error: {:?}\",\n                            link.source, e\n                        );\n                        continue;\n                    }\n                };\n\n                if is_gitignore_enabled {\n                    if let Some(ref gif) = gitignored_files {\n                        if gif.iter().any(|path| path == &canonical_link_source) {\n                            print_helper(\n                                link,\n                                &\"Skip\".green(),\n                                \"Ignore link because it is ignored by git.\",\n                                false,\n                            );\n                            skipped += 1;\n                            continue;\n                        }\n                    }\n                }\n\n                if is_gituntracked_enabled {\n                    if let Some(ref gif) = gituntracked_files {\n                        if gif.iter().any(|path| path == &canonical_link_source) {\n                            print_helper(\n                                link,\n                                &\"Skip\".green(),\n                                \"Ignore link because it is untracked by git.\",\n                                false,\n                            );\n                            skipped += 1;\n                            continue;\n                        }\n                    }\n                }\n\n                if ignore_links.iter().any(|m| m.matches(&link.target)) {\n                    print_helper(\n                        link,\n                        &\"Skip\".green(),\n                        \"Ignore link because of ignore-links option.\",\n                        false,\n                    );\n                    skipped += 1;\n                    continue;\n                }\n\n                let link_type = get_link_type(&link.target);\n                let target = resolve_target_link(link, &link_type, config).await;\n                let 
t = Target { target, link_type };\n                match link_target_groups.get_mut(&t) {\n                    Some(v) => v.push(link.clone()),\n                    None => {\n                        link_target_groups.insert(t, vec![link.clone()]);\n                    }\n                }\n            }\n            Err(broken_reference) => {\n                broken_references.push(broken_reference.clone());\n            }\n        }\n    }\n\n    let do_not_warn_for_redirect_to: Arc<Vec<WildMatch>> =\n        Arc::new(match &config.optional.do_not_warn_for_redirect_to {\n            Some(s) => s.iter().map(|m| WildMatch::new(m)).collect(),\n            None => vec![],\n        });\n\n    // Parse HTTP headers from config\n    let http_headers: Arc<Vec<(String, String)>> = Arc::new(match &config.optional.http_headers {\n        Some(headers) => headers\n            .iter()\n            .filter_map(|h| {\n                let parts: Vec<&str> = h.splitn(2, ':').collect();\n                if parts.len() == 2 {\n                    Some((parts[0].trim().to_string(), parts[1].trim().to_string()))\n                } else {\n                    warn!(\"Invalid HTTP header format (expected 'Name: Value'): {}\", h);\n                    None\n                }\n            })\n            .collect(),\n        None => vec![],\n    });\n    info!(\"Custom HTTP headers: {:?}\", http_headers);\n\n    let throttle = config.optional.throttle.unwrap_or_default() > 0;\n    info!(\"Throttle HTTP requests to same host: {throttle:?}\");\n    let waits = Arc::new(Mutex::new(HashMap::new()));\n    // See also http://patshaughnessy.net/2020/1/20/downloading-100000-files-using-async-rust\n    let mut buffered_stream = stream::iter(link_target_groups.keys())\n        .map(|target| {\n            let waits = waits.clone();\n            let do_not_warn_for_redirect_to = Arc::clone(&do_not_warn_for_redirect_to);\n            let http_headers = Arc::clone(&http_headers);\n            async 
move {\n                if throttle && target.link_type == LinkType::Http {\n                    let parsed = match Url::parse(&target.target) {\n                        Ok(parsed) => parsed,\n                        Err(error) => {\n                            return FinalResult {\n                                target: target.clone(),\n                                result_code: LinkCheckResult::Failed(format!(\n                                    \"Could not parse URL type. Err: {error:?}\"\n                                )),\n                            }\n                        }\n                    };\n                    let host = match parsed.host_str() {\n                        Some(host) => host.to_string(),\n                        None => {\n                            return FinalResult {\n                                target: target.clone(),\n                                result_code: LinkCheckResult::Failed(\n                                    \"Failed to determine host\".to_string(),\n                                ),\n                            }\n                        }\n                    };\n                    let mut waits = waits.lock().await;\n\n                    let mut wait_until: Option<Instant> = None;\n                    let next_wait = match waits.get(&host) {\n                        Some(old) => {\n                            wait_until = Some(*old);\n                            *old + Duration::from_millis(\n                                config.optional.throttle.unwrap_or_default().into(),\n                            )\n                        }\n                        None => {\n                            Instant::now()\n                                + Duration::from_millis(\n                                    config.optional.throttle.unwrap_or_default().into(),\n                                )\n                        }\n                    };\n                    waits.insert(host, next_wait);\n      
              drop(waits);\n\n                    if let Some(deadline) = wait_until {\n                        sleep_until(deadline).await;\n                    }\n                }\n\n                let result_code = link_validator::check(\n                    &target.target,\n                    &target.link_type,\n                    config,\n                    &do_not_warn_for_redirect_to,\n                    &http_headers,\n                )\n                .await;\n\n                FinalResult {\n                    target: target.clone(),\n                    result_code,\n                }\n            }\n        })\n        .buffer_unordered(PARALLEL_REQUESTS);\n\n    let mut oks = 0;\n    let mut warnings = 0;\n    let mut errors = vec![];\n    let mut warning_results = vec![];\n\n    let is_github_runner_env = env::var(\"GITHUB_ENV\").is_ok();\n    if is_github_runner_env {\n        info!(\"Running in github environment. Print errors and warnings as workflow commands\");\n    }\n\n    let mut process_result = |result: FinalResult| match &result.result_code {\n        LinkCheckResult::Ok => {\n            oks += link_target_groups[&result.target].len();\n        }\n        LinkCheckResult::NotImplemented(msg) | LinkCheckResult::Warning(msg) => {\n            warnings += link_target_groups[&result.target].len();\n            warning_results.push(result.clone());\n            if is_github_runner_env {\n                for link in &link_target_groups[&result.target] {\n                    println!(\n                        \"::warning file={},line={},col={},title=link checker warning::{}. 
{}\",\n                        link.source, link.line, link.column, result.target.target, msg\n                    );\n                }\n            }\n        }\n        LinkCheckResult::Ignored(_) => {\n            skipped += link_target_groups[&result.target].len();\n        }\n        LinkCheckResult::Failed(msg) => {\n            errors.push(result.clone());\n            if is_github_runner_env {\n                for link in &link_target_groups[&result.target] {\n                    println!(\n                        \"::error file={},line={},col={},title=broken link::{}. {}\",\n                        link.source, link.line, link.column, result.target.target, msg\n                    );\n                }\n            }\n        }\n    };\n\n    while let Some(result) = buffered_stream.next().await {\n        print_result(&result, &link_target_groups);\n        process_result(result);\n    }\n    for broken_ref in &broken_references {\n        warnings += 1;\n        println!(\n            \"[{:^4}] {}:{}:{} => {} - {}\",\n            &\"Warn\".yellow(),\n            broken_ref.source,\n            broken_ref.line,\n            broken_ref.column,\n            broken_ref.reference,\n            broken_ref.error\n        );\n    }\n\n    println!();\n    let error_sum: usize = errors\n        .iter()\n        .map(|e| link_target_groups[&e.target].len())\n        .sum();\n    let sum = skipped + error_sum + warnings + oks;\n    println!(\"Result ({sum} links):\");\n    println!();\n    println!(\"OK       {oks}\");\n    println!(\"Skipped  {skipped}\");\n    println!(\"Warnings {warnings}\");\n    println!(\"Errors   {error_sum}\");\n    println!();\n\n    // Prepare CSV file if needed\n    let mut csv_file = if let Some(csv_path) = &config.optional.csv_file {\n        info!(\"Write CSV file: {}\", csv_path.display());\n        let mut file = fs::File::create(csv_path).unwrap();\n        writeln!(file, \"source,line,column,target,severity\").unwrap();\n        
Some(file)\n    } else {\n        None\n    };\n\n    // Helper function to write warnings to CSV\n    let write_warnings_to_csv = |csv_file: &mut Option<fs::File>| {\n        if let Some(ref mut file) = csv_file {\n            // Write link-based warnings\n            for res in &warning_results {\n                for link in &link_target_groups[&res.target] {\n                    writeln!(\n                        file,\n                        \"{},{},{},{},WARN\",\n                        link.source, link.line, link.column, link.target\n                    )\n                    .unwrap();\n                }\n            }\n            // Write broken reference warnings\n            for broken_ref in &broken_references {\n                writeln!(\n                    file,\n                    \"{},{},{},{},WARN\",\n                    broken_ref.source, broken_ref.line, broken_ref.column, broken_ref.reference\n                )\n                .unwrap();\n            }\n        }\n    };\n\n    if errors.is_empty() {\n        write_warnings_to_csv(&mut csv_file);\n        Ok(())\n    } else {\n        println!();\n        println!(\"The following links could not be resolved:\");\n        println!();\n        for res in &errors {\n            for link in &link_target_groups[&res.target] {\n                println!(\"{}\", link.source_str());\n\n                if let Some(ref mut file) = csv_file {\n                    writeln!(\n                        file,\n                        \"{},{},{},{},ERR\",\n                        link.source, link.line, link.column, link.target\n                    )\n                    .unwrap();\n                }\n            }\n        }\n\n        write_warnings_to_csv(&mut csv_file);\n        Err(())\n    }\n}\n"
  },
  {
    "path": "src/link_extractors/html_link_extractor.rs",
    "content": "use crate::link_extractors::link_extractor::LinkExtractor;\nuse crate::link_extractors::link_extractor::MarkupLink;\nuse crate::link_validator::link_type::get_link_type;\nuse crate::link_validator::link_type::LinkType;\n\nuse super::ignore_comments::IgnoreRegions;\nuse super::link_extractor::BrokenExtractedLink;\npub struct HtmlLinkExtractor();\n\n#[derive(Clone, Copy, Debug)]\nenum ParserState {\n    Text,\n    Comment,\n    Anchor,\n    EqualSign,\n    Link,\n}\n\nimpl LinkExtractor for HtmlLinkExtractor {\n    fn find_links(&self, text: &str) -> Vec<Result<MarkupLink, BrokenExtractedLink>> {\n        let mut result: Vec<Result<MarkupLink, BrokenExtractedLink>> = Vec::new();\n        let mut state: ParserState = ParserState::Text;\n        let mut link_column = 0;\n        let mut link_line = 0;\n        let ignore_regions = IgnoreRegions::from_text(text);\n\n        for (line, line_str) in text.lines().enumerate() {\n            let line_chars: Vec<char> = line_str.chars().collect();\n            let mut column: usize = 0;\n            while line_chars.get(column).is_some() {\n                match state {\n                    ParserState::Comment => {\n                        if line_chars.get(column) == Some(&'-')\n                            && line_chars.get(column + 1) == Some(&'-')\n                            && line_chars.get(column + 2) == Some(&'>')\n                        {\n                            column += 2;\n                            state = ParserState::Text;\n                        }\n                    }\n                    ParserState::Text => {\n                        link_column = column;\n                        link_line = line;\n                        if line_chars.get(column) == Some(&'<')\n                            && line_chars.get(column + 1) == Some(&'!')\n                            && line_chars.get(column + 2) == Some(&'-')\n                            && line_chars.get(column + 3) == Some(&'-')\n     
                   {\n                            column += 3;\n                            state = ParserState::Comment;\n                        } else if line_chars.get(column) == Some(&'<')\n                            && line_chars.get(column + 1) == Some(&'a')\n                        {\n                            column += 1;\n                            state = ParserState::Anchor;\n                        }\n                    }\n                    ParserState::Anchor => {\n                        if line_chars.get(column) == Some(&'h')\n                            && line_chars.get(column + 1) == Some(&'r')\n                            && line_chars.get(column + 2) == Some(&'e')\n                            && line_chars.get(column + 3) == Some(&'f')\n                        {\n                            column += 3;\n                            state = ParserState::EqualSign;\n                        }\n                    }\n                    ParserState::EqualSign => {\n                        match line_chars.get(column) {\n                            Some(x) if x.is_whitespace() => {}\n                            Some(x) if x == &'=' => state = ParserState::Link,\n                            Some(_) => state = ParserState::Anchor,\n                            None => {}\n                        };\n                    }\n                    ParserState::Link => {\n                        match line_chars.get(column) {\n                            Some(x) if !x.is_whitespace() && x != &'\"' => {\n                                let start_col = column;\n                                while line_chars.get(column).is_some()\n                                    && !line_chars[column].is_whitespace()\n                                    && line_chars[column] != '\"'\n                                {\n                                    column += 1;\n                                }\n                                while let Some(c) = 
line_chars.get(column) {\n                                    if c == &'\"' {\n                                        break;\n                                    }\n                                    column += 1;\n                                }\n                                let mut link =\n                                    (line_chars[start_col..column]).iter().collect::<String>();\n                                if get_link_type(&link) == LinkType::FileSystem {\n                                    link = url_escape::decode(link.as_str()).to_string();\n                                };\n\n                                // Check if this line should be ignored\n                                let line_num = link_line + 1; // Convert to 1-indexed\n                                if !ignore_regions.is_line_ignored(line_num) {\n                                    result.push(Ok(MarkupLink {\n                                        column: link_column + 1,\n                                        line: line_num,\n                                        target: link.to_string(),\n                                        source: \"\".to_string(),\n                                    }));\n                                }\n                                state = ParserState::Text;\n                            }\n                            Some(_) | None => {}\n                        };\n                    }\n                }\n                column += 1;\n            }\n        }\n        result\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n    use ntest::test_case;\n\n    #[test]\n    fn no_link() {\n        let le = HtmlLinkExtractor();\n        let input = \"]This is not a <has> no link <h1>Bla</h1> attribute.\";\n        let result = le.find_links(input);\n        assert!(result.is_empty());\n    }\n\n    #[test]\n    fn commented() {\n        let le = HtmlLinkExtractor();\n        let input = \"df <!-- <a 
href=\\\"http://wiki.selfhtml.org\\\"> haha</a> -->\";\n        let result = le.find_links(input);\n        assert!(result.is_empty());\n    }\n\n    #[test]\n    fn space() {\n        let le = HtmlLinkExtractor();\n        let result = le.find_links(\"blah <a href=\\\"some file.html\\\">foo</a>.\");\n        let expected = Ok(MarkupLink {\n            target: \"some file.html\".to_string(),\n            line: 1,\n            column: 6,\n            source: \"\".to_string(),\n        });\n        assert_eq!(vec![expected], result);\n    }\n\n    #[test]\n    fn url_encoded_path() {\n        let le = HtmlLinkExtractor();\n        let result = le.find_links(\"blah <a href=\\\"some%20file.html\\\">foo</a>.\");\n        let expected = Ok(MarkupLink {\n            target: \"some file.html\".to_string(),\n            line: 1,\n            column: 6,\n            source: \"\".to_string(),\n        });\n        assert_eq!(vec![expected], result);\n    }\n\n    #[test_case(\"<a href=\\\"https://www.w3schools.com\\\">Visit W3Schools.com!</a>\", 1, 1)]\n    #[test_case(\n        \"<a\\nhref\\n=\\n  \\\"https://www.w3schools.com\\\">\\nVisit W3Schools.com!\\n</a>\",\n        1,\n        1\n    )]\n    #[test_case(\n        \"<a hreflang=\\\"en\\\" href=\\\"https://www.w3schools.com\\\">Visit W3Schools.com!</a>\",\n        1,\n        1\n    )]\n    #[test_case(\n        \"<!--comment--><a href=\\\"https://www.w3schools.com\\\">Visit W3Schools.com!</a>\",\n        1,\n        15\n    )]\n    fn links(input: &str, line: usize, column: usize) {\n        let le = HtmlLinkExtractor();\n        let result = le.find_links(input);\n        let expected = Ok(MarkupLink {\n            target: \"https://www.w3schools.com\".to_string(),\n            line,\n            column,\n            source: \"\".to_string(),\n        });\n        assert_eq!(vec![expected], result);\n    }\n\n    #[test]\n    fn ignore_disable_line() {\n        let le = HtmlLinkExtractor();\n        let input = 
\"<!-- mlc-disable-line --> <a href=\\\"http://example.net/\\\">link</a>\";\n        let result = le.find_links(input);\n        assert!(result.is_empty());\n    }\n\n    #[test]\n    fn ignore_disable_next_line() {\n        let le = HtmlLinkExtractor();\n        let input = \"<!-- mlc-disable-next-line -->\\n<a href=\\\"http://example.net/\\\">link</a>\";\n        let result = le.find_links(input);\n        assert!(result.is_empty());\n    }\n\n    #[test]\n    fn ignore_disable_block() {\n        let le = HtmlLinkExtractor();\n        let input = \"<!-- mlc-disable -->\\n<a href=\\\"http://example.net/\\\">link1</a>\\n<!-- mlc-enable -->\\n<a href=\\\"http://example.com/\\\">link2</a>\";\n        let result = le.find_links(input);\n        assert_eq!(1, result.len());\n        assert_eq!(result[0].as_ref().unwrap().target, \"http://example.com/\");\n        assert_eq!(result[0].as_ref().unwrap().line, 4);\n    }\n\n    #[test]\n    fn ignore_multiple_blocks() {\n        let le = HtmlLinkExtractor();\n        let input = \"<a href=\\\"http://a.com/\\\">1</a>\\n<!-- mlc-disable -->\\n<a href=\\\"http://b.com/\\\">2</a>\\n<!-- mlc-enable -->\\n<a href=\\\"http://c.com/\\\">3</a>\";\n        let result = le.find_links(input);\n        assert_eq!(2, result.len());\n        assert_eq!(result[0].as_ref().unwrap().target, \"http://a.com/\");\n        assert_eq!(result[1].as_ref().unwrap().target, \"http://c.com/\");\n    }\n}\n"
  },
  {
    "path": "src/link_extractors/ignore_comments.rs",
    "content": "/// Module for parsing ignore/disable comments in markup files\n/// Supports comments like:\n/// - `<!-- mlc-disable -->` / `<!-- mlc-enable -->`\n/// - `<!-- mlc-disable-next-line -->`\n/// - `<!-- mlc-disable-line -->`\nuse std::collections::HashSet;\n\n#[derive(Debug, Clone, Copy, PartialEq)]\nenum IgnoreState {\n    Enabled,\n    Disabled,\n}\n\n#[derive(Debug, Clone)]\npub struct IgnoreRegions {\n    /// Lines that should be ignored (1-indexed)\n    ignored_lines: HashSet<usize>,\n    /// Ranges of lines that should be ignored (1-indexed, inclusive)\n    ignored_ranges: Vec<(usize, usize)>,\n}\n\nimpl IgnoreRegions {\n    /// Create a new IgnoreRegions from text content\n    pub fn from_text(text: &str) -> Self {\n        let mut ignored_lines = HashSet::new();\n        let mut ignored_ranges = Vec::new();\n        let mut state = IgnoreState::Enabled;\n        let mut disable_start_line = 0;\n\n        for (line_idx, line) in text.lines().enumerate() {\n            let line_num = line_idx + 1; // 1-indexed\n\n            // Check for disable/enable blocks\n            if line.contains(\"<!-- mlc-disable -->\") {\n                if state == IgnoreState::Enabled {\n                    state = IgnoreState::Disabled;\n                    disable_start_line = line_num;\n                }\n            } else if line.contains(\"<!-- mlc-enable -->\") && state == IgnoreState::Disabled {\n                // Add the range from disable to enable (inclusive)\n                ignored_ranges.push((disable_start_line, line_num));\n                state = IgnoreState::Enabled;\n            }\n\n            // Check for single-line ignores\n            if line.contains(\"<!-- mlc-disable-line -->\") {\n                ignored_lines.insert(line_num);\n            }\n\n            // Check for next-line ignore\n            if line.contains(\"<!-- mlc-disable-next-line -->\") {\n                ignored_lines.insert(line_num + 1);\n            }\n        }\n\n    
    // If we ended in disabled state, ignore from disable_start_line to end\n        if state == IgnoreState::Disabled {\n            let total_lines = text.lines().count();\n            if total_lines > 0 {\n                ignored_ranges.push((disable_start_line, total_lines));\n            }\n        }\n\n        Self {\n            ignored_lines,\n            ignored_ranges,\n        }\n    }\n\n    /// Check if a given line number (1-indexed) should be ignored\n    pub fn is_line_ignored(&self, line: usize) -> bool {\n        // Check if line is in ignored_lines\n        if self.ignored_lines.contains(&line) {\n            return true;\n        }\n\n        // Check if line is in any ignored range\n        for (start, end) in &self.ignored_ranges {\n            if line >= *start && line <= *end {\n                return true;\n            }\n        }\n\n        false\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn no_ignore_comments() {\n        let text = \"This is a normal line\\nAnother line\";\n        let regions = IgnoreRegions::from_text(text);\n        assert!(!regions.is_line_ignored(1));\n        assert!(!regions.is_line_ignored(2));\n    }\n\n    #[test]\n    fn disable_line_comment() {\n        let text = \"Line 1\\n<!-- mlc-disable-line --> Line 2\\nLine 3\";\n        let regions = IgnoreRegions::from_text(text);\n        assert!(!regions.is_line_ignored(1));\n        assert!(regions.is_line_ignored(2));\n        assert!(!regions.is_line_ignored(3));\n    }\n\n    #[test]\n    fn disable_next_line_comment() {\n        let text = \"Line 1\\n<!-- mlc-disable-next-line -->\\nLine 3\\nLine 4\";\n        let regions = IgnoreRegions::from_text(text);\n        assert!(!regions.is_line_ignored(1));\n        assert!(!regions.is_line_ignored(2));\n        assert!(regions.is_line_ignored(3));\n        assert!(!regions.is_line_ignored(4));\n    }\n\n    #[test]\n    fn disable_enable_block() {\n        let text = \"Line 
1\\n<!-- mlc-disable -->\\nLine 3\\nLine 4\\n<!-- mlc-enable -->\\nLine 6\";\n        let regions = IgnoreRegions::from_text(text);\n        assert!(!regions.is_line_ignored(1));\n        assert!(regions.is_line_ignored(2));\n        assert!(regions.is_line_ignored(3));\n        assert!(regions.is_line_ignored(4));\n        assert!(regions.is_line_ignored(5));\n        assert!(!regions.is_line_ignored(6));\n    }\n\n    #[test]\n    fn disable_without_enable() {\n        let text = \"Line 1\\nLine 2\\n<!-- mlc-disable -->\\nLine 4\\nLine 5\";\n        let regions = IgnoreRegions::from_text(text);\n        assert!(!regions.is_line_ignored(1));\n        assert!(!regions.is_line_ignored(2));\n        assert!(regions.is_line_ignored(3));\n        assert!(regions.is_line_ignored(4));\n        assert!(regions.is_line_ignored(5));\n    }\n\n    #[test]\n    fn multiple_disable_blocks() {\n        let text = \"Line 1\\n<!-- mlc-disable -->\\nLine 3\\n<!-- mlc-enable -->\\nLine 5\\n<!-- mlc-disable -->\\nLine 7\\n<!-- mlc-enable -->\\nLine 9\";\n        let regions = IgnoreRegions::from_text(text);\n        assert!(!regions.is_line_ignored(1));\n        assert!(regions.is_line_ignored(2));\n        assert!(regions.is_line_ignored(3));\n        assert!(regions.is_line_ignored(4));\n        assert!(!regions.is_line_ignored(5));\n        assert!(regions.is_line_ignored(6));\n        assert!(regions.is_line_ignored(7));\n        assert!(regions.is_line_ignored(8));\n        assert!(!regions.is_line_ignored(9));\n    }\n\n    #[test]\n    fn mixed_ignore_types() {\n        let text = \"Line 1\\n<!-- mlc-disable-line --> Line 2\\n<!-- mlc-disable-next-line -->\\nLine 4\\n<!-- mlc-disable -->\\nLine 6\\n<!-- mlc-enable -->\\nLine 8\";\n        let regions = IgnoreRegions::from_text(text);\n        assert!(!regions.is_line_ignored(1));\n        assert!(regions.is_line_ignored(2)); // disable-line\n        assert!(!regions.is_line_ignored(3));\n        
assert!(regions.is_line_ignored(4)); // disable-next-line\n        assert!(regions.is_line_ignored(5)); // disable block start\n        assert!(regions.is_line_ignored(6)); // disable block\n        assert!(regions.is_line_ignored(7)); // disable block end (enable)\n        assert!(!regions.is_line_ignored(8));\n    }\n}\n"
  },
  {
    "path": "src/link_extractors/link_extractor.rs",
    "content": "use super::html_link_extractor::HtmlLinkExtractor;\nuse super::markdown_link_extractor::MarkdownLinkExtractor;\nuse crate::markup::{MarkupFile, MarkupType};\nuse std::env;\nuse std::fmt;\nuse std::fs;\n\n/// Link found in markup files\n#[derive(Eq, PartialEq, Clone)]\npub struct MarkupLink {\n    /// The source file of the link\n    pub source: String,\n    /// The target the link points to\n    pub target: String,\n    /// The line number where the link was found\n    pub line: usize,\n    /// The column number where the link was found\n    pub column: usize,\n}\n/// Broken link found in document\n#[derive(Eq, PartialEq, Clone, Debug)]\npub struct BrokenExtractedLink {\n    /// The error message\n    pub error: String,\n    /// The source\n    pub source: String,\n    /// The target\n    pub reference: String,\n    /// The line number where the link was found\n    pub line: usize,\n    /// The column number where the link was found\n    pub column: usize,\n}\n\nimpl fmt::Debug for MarkupLink {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        write!(\n            f,\n            \"{} => {} (line {}, column {})\",\n            self.source, self.target, self.line, self.column\n        )\n    }\n}\n\nimpl MarkupLink {\n    pub fn source_str(&self) -> String {\n        lazy_static! {\n            static ref IS_VS_CODE_TERMINAL: bool =\n                env::var(\"TERM_PROGRAM\") == Ok(\"vscode\".to_string());\n        }\n        if *IS_VS_CODE_TERMINAL {\n            format! {\"{}:{}:{} => {}\", self.source, self.line, self.column, self.target}\n        } else {\n            format! 
{\"{} ({}, {}) => {}\", self.source, self.line, self.column, self.target}\n        }\n    }\n}\n\n#[must_use]\npub fn find_links(file: &MarkupFile) -> Vec<Result<MarkupLink, BrokenExtractedLink>> {\n    let path = &file.path;\n    let link_extractor = link_extractor_factory(file.markup_type);\n\n    info!(\"Scan file at path '{path}' for links.\");\n    match fs::read_to_string(path) {\n        Ok(text) => {\n            let mut links = link_extractor.find_links(&text);\n            for l in &mut links {\n                match l {\n                    Ok(link) => {\n                        link.source = path.to_string();\n                    }\n                    Err(broken_link) => {\n                        broken_link.source = path.to_string();\n                    }\n                }\n            }\n            links\n        }\n        Err(e) => {\n            warn!(\"File '{path}'. IO Error: \\\"{e}\\\". Check your file encoding.\");\n            vec![]\n        }\n    }\n}\n\nfn link_extractor_factory(markup_type: MarkupType) -> Box<dyn LinkExtractor> {\n    match markup_type {\n        MarkupType::Markdown => Box::new(MarkdownLinkExtractor()),\n        MarkupType::Html => Box::new(HtmlLinkExtractor()),\n    }\n}\n\npub trait LinkExtractor {\n    fn find_links(&self, text: &str) -> Vec<Result<MarkupLink, BrokenExtractedLink>>;\n}\n"
  },
  {
    "path": "src/link_extractors/markdown_link_extractor.rs",
    "content": "use super::html_link_extractor::HtmlLinkExtractor;\nuse super::ignore_comments::IgnoreRegions;\nuse super::link_extractor::BrokenExtractedLink;\nuse crate::link_extractors::link_extractor::LinkExtractor;\nuse crate::link_extractors::link_extractor::MarkupLink;\nuse pulldown_cmark::{BrokenLink, Event, Options, Parser, Tag};\n\npub struct MarkdownLinkExtractor();\n\nimpl LinkExtractor for MarkdownLinkExtractor {\n    fn find_links(&self, text: &str) -> Vec<Result<MarkupLink, BrokenExtractedLink>> {\n        use std::cell::RefCell;\n        let result: RefCell<Vec<Result<MarkupLink, BrokenExtractedLink>>> =\n            RefCell::new(Vec::new());\n\n        let html_extractor = HtmlLinkExtractor();\n        let converter = LineColumnConverter::new(text);\n        let ignore_regions = IgnoreRegions::from_text(text);\n\n        let callback = &mut |broken_link: BrokenLink| {\n            let line_col = converter.line_column_from_idx(broken_link.span.start);\n\n            // Skip if line is ignored\n            if ignore_regions.is_line_ignored(line_col.0) {\n                return None;\n            }\n\n            info!(\n                \"Broken link in md file: {} (line {}, column {})\",\n                broken_link.reference, line_col.0, line_col.1\n            );\n            result.borrow_mut().push(Err(BrokenExtractedLink {\n                source: String::new(),\n                line: line_col.0,\n                column: line_col.1,\n                reference: broken_link.reference.to_string(),\n                error: \"Markdown reference not found\".to_string(),\n            }));\n            None\n        };\n\n        let parser =\n            Parser::new_with_broken_link_callback(text, Options::ENABLE_TASKLISTS, Some(callback));\n\n        for (evt, range) in parser.into_offset_iter() {\n            match evt {\n                Event::Start(Tag::Link { dest_url, .. } | Tag::Image { dest_url, .. 
}) => {\n                    let line_col = converter.line_column_from_idx(range.start);\n\n                    // Skip if line is ignored\n                    if ignore_regions.is_line_ignored(line_col.0) {\n                        continue;\n                    }\n\n                    result.borrow_mut().push(Ok(MarkupLink {\n                        line: line_col.0,\n                        column: line_col.1,\n                        source: String::new(),\n                        target: dest_url.to_string(),\n                    }));\n                }\n                Event::Html(html) | Event::InlineHtml(html) => {\n                    let line_col = converter.line_column_from_idx(range.start);\n                    let html_result = html_extractor.find_links(html.as_ref());\n                    let mut parsed_html = html_result\n                        .iter()\n                        .filter_map(|res| res.as_ref().ok())\n                        .map(|md_link| {\n                            let line = line_col.0 + md_link.line - 1;\n                            let column = if md_link.line > 1 {\n                                md_link.column\n                            } else {\n                                line_col.1 + md_link.column - 1\n                            };\n                            Ok(MarkupLink {\n                                column,\n                                line,\n                                source: md_link.source.clone(),\n                                target: md_link.target.clone(),\n                            })\n                        })\n                        .filter(|link| {\n                            // Skip if line is ignored\n                            if let Ok(ml) = link {\n                                !ignore_regions.is_line_ignored(ml.line)\n                            } else {\n                                true\n                            }\n                        })\n                     
   .collect();\n                    result.borrow_mut().append(&mut parsed_html);\n                }\n                _ => (),\n            };\n        }\n        result.into_inner()\n    }\n}\n\nstruct LineColumnConverter {\n    line_lengths: Vec<usize>,\n}\n\nimpl LineColumnConverter {\n    fn new(text: &str) -> Self {\n        let mut line_lengths: Vec<usize> = Vec::new();\n        let mut current_line_len = 0;\n        for c in text.chars() {\n            current_line_len += c.len_utf8();\n            if c == '\\n' {\n                line_lengths.push(current_line_len);\n                current_line_len = 0;\n            }\n        }\n        Self { line_lengths }\n    }\n\n    fn line_column_from_idx(&self, idx: usize) -> (usize, usize) {\n        let mut line = 1;\n        let mut column = idx + 1;\n        for line_length in &self.line_lengths {\n            if *line_length >= column {\n                return (line, column);\n            }\n            column -= line_length;\n            line += 1;\n        }\n        (line, column)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n    use ntest::test_case;\n\n    #[test]\n    fn inline_no_link() {\n        let le = MarkdownLinkExtractor();\n        let input = \"]This is not a () link](! 
has no title attribute.\";\n        let result = le.find_links(input);\n        assert!(result.is_empty());\n    }\n\n    #[test]\n    fn commented_link() {\n        let le = MarkdownLinkExtractor();\n        let input = \"]This is not a () <!--[link](link)-->.\";\n        let result = le.find_links(input);\n        assert!(result.is_empty());\n    }\n\n    #[test]\n    fn nested_links() {\n        let le = MarkdownLinkExtractor();\n        let input =\n            \"\\n\\r\\t\\n[![](http://meritbadge.herokuapp.com/mlc)](https://crates.io/crates/mlc)\";\n        let result = le.find_links(input);\n        let img = Ok(MarkupLink {\n            target: \"http://meritbadge.herokuapp.com/mlc\".to_string(),\n            line: 3,\n            column: 2,\n            source: \"\".to_string(),\n        });\n        let link = Ok(MarkupLink {\n            target: \"https://crates.io/crates/mlc\".to_string(),\n            line: 3,\n            column: 1,\n            source: \"\".to_string(),\n        });\n        assert_eq!(vec![link, img], result);\n    }\n\n    #[test]\n    fn link_escaped() {\n        let le = MarkdownLinkExtractor();\n        let input = \"This is not a \\\\[link\\\\](random_link).\";\n        let result = le.find_links(input);\n        assert!(result.is_empty());\n    }\n\n    #[test]\n    fn link_in_headline() {\n        let le = MarkdownLinkExtractor();\n        let input = \"  # This is a [link](http://example.net/).\";\n        let result = le.find_links(input);\n        assert_eq!(result[0].as_ref().unwrap().column, 15);\n    }\n\n    #[test]\n    fn no_link_colon() {\n        let le = MarkdownLinkExtractor();\n        let input = \"This is not a [link:bla.\";\n        let result = le.find_links(input);\n        assert!(result.is_empty());\n    }\n\n    #[test]\n    fn broken_reference_link() {\n        let le = MarkdownLinkExtractor();\n        let input = \"This is not a [link]:bla.\";\n        let result = le.find_links(input);\n\n        let 
expected = Err(BrokenExtractedLink {\n            source: \"\".to_string(),\n            reference: \"link\".to_string(),\n            line: 1,\n            column: 15,\n            error: \"Markdown reference not found\".to_string(),\n        });\n        assert_eq!(vec![expected], result);\n    }\n\n    #[test]\n    fn inline_code() {\n        let le = MarkdownLinkExtractor();\n        let input = \" `[code](http://example.net/)`, no link!.\";\n        let result = le.find_links(input);\n        assert!(result.is_empty());\n    }\n\n    #[test]\n    fn link_near_inline_code() {\n        let le = MarkdownLinkExtractor();\n        let input = \" `bug` [code](http://example.net/), link!.\";\n        let result = le.find_links(input);\n        let expected = Ok(MarkupLink {\n            target: \"http://example.net/\".to_string(),\n            line: 1,\n            column: 8,\n            source: \"\".to_string(),\n        });\n        assert_eq!(vec![expected], result);\n    }\n\n    #[test]\n    fn link_very_near_inline_code() {\n        let le = MarkdownLinkExtractor();\n        let input = \"`bug`[code](http://example.net/)\";\n        let result = le.find_links(input);\n        let expected = Ok(MarkupLink {\n            target: \"http://example.net/\".to_string(),\n            line: 1,\n            column: 6,\n            source: \"\".to_string(),\n        });\n        assert_eq!(vec![expected], result);\n    }\n\n    #[test]\n    fn code_block() {\n        let le = MarkdownLinkExtractor();\n        let input = \" ``` js\\n[code](http://example.net/)```, no link!.\";\n        let result = le.find_links(input);\n        assert!(result.is_empty());\n    }\n\n    #[test]\n    fn html_code_block() {\n        let le = MarkdownLinkExtractor();\n        let input = \"<script>\\n[code](http://example.net/)</script>, no link!.\";\n        let result = le.find_links(input);\n        assert!(result.is_empty());\n    }\n\n    #[test]\n    fn escaped_code_block() {\n        
let le = MarkdownLinkExtractor();\n        let input = \"   klsdjf \\\\`[escape](http://example.net/)\\\\`, no link!.\";\n        let result = le.find_links(input);\n        let expected = Ok(MarkupLink {\n            target: \"http://example.net/\".to_string(),\n            line: 1,\n            column: 13,\n            source: \"\".to_string(),\n        });\n        assert_eq!(vec![expected], result);\n    }\n\n    #[test]\n    fn link_in_code_block() {\n        let le = MarkdownLinkExtractor();\n        let input = \"```\\n[only code](http://example.net/)\\n```.\";\n        let result = le.find_links(input);\n        assert!(result.is_empty());\n    }\n\n    #[test]\n    fn image_reference() {\n        let le = MarkdownLinkExtractor();\n        let link_str = \"http://example.net/\";\n        let input = format!(\"\\n\\nBla ![This is an image link]({link_str})\");\n        let result = le.find_links(&input);\n        let expected = Ok(MarkupLink {\n            target: link_str.to_string(),\n            line: 3,\n            column: 5,\n            source: \"\".to_string(),\n        });\n        assert_eq!(vec![expected], result);\n    }\n\n    #[test]\n    fn link_no_title() {\n        let le = MarkdownLinkExtractor();\n        let link_str = \"http://example.net/\";\n        let input = format!(\"[This link]({link_str}) has no title attribute.\");\n        let result = le.find_links(&input);\n        let expected = Ok(MarkupLink {\n            target: link_str.to_string(),\n            line: 1,\n            column: 1,\n            source: \"\".to_string(),\n        });\n        assert_eq!(vec![expected], result);\n    }\n\n    #[test]\n    fn link_with_title() {\n        let le = MarkdownLinkExtractor();\n        let link_str = \"http://example.net/\";\n        let input = format!(\"\\n123[This is a link]({link_str} \\\"with title\\\") oh yea.\");\n        let result = le.find_links(&input);\n        let expected = Ok(MarkupLink {\n            target: 
link_str.to_string(),\n            line: 2,\n            column: 4,\n            source: \"\".to_string(),\n        });\n        assert_eq!(vec![expected], result);\n    }\n\n    #[test_case(\"<http://example.net/>\", 1)]\n    // TODO GitHub Link style support\n    //#[test_case(\"This is a short link http://example.net/\", 22)]\n    //#[test_case(\"http://example.net/\", 1)]\n    #[test_case(\"This is a short link <http://example.net/>\", 22)]\n    fn inline_link(input: &str, column: usize) {\n        let le = MarkdownLinkExtractor();\n        let result = le.find_links(input);\n        let expected = Ok(MarkupLink {\n            target: \"http://example.net/\".to_string(),\n            line: 1,\n            column,\n            source: \"\".to_string(),\n        });\n        assert_eq!(vec![expected], result);\n    }\n\n    #[test_case(\n        \"<a href=\\\"http://example.net/\\\"> target=\\\"_blank\\\">Visit W3Schools!</a>\",\n        test_name = \"html_link_with_target\"\n    )]\n    #[test_case(\n        \"<a href=\\\"http://example.net/\\\"> link text</a>\",\n        test_name = \"html_link_no_target\"\n    )]\n    fn html_link(input: &str) {\n        let le = MarkdownLinkExtractor();\n        let result = le.find_links(input);\n        let expected = Ok(MarkupLink {\n            target: \"http://example.net/\".to_string(),\n            line: 1,\n            column: 1,\n            source: \"\".to_string(),\n        });\n        assert_eq!(vec![expected], result);\n    }\n\n    #[test]\n    fn html_link_ident() {\n        let le = MarkdownLinkExtractor();\n        let result = le.find_links(\"123<a href=\\\"http://example.net/\\\"> link text</a>\");\n        let expected = Ok(MarkupLink {\n            target: \"http://example.net/\".to_string(),\n            line: 1,\n            column: 4,\n            source: \"\".to_string(),\n        });\n        assert_eq!(vec![expected], result);\n    }\n\n    #[test]\n    fn html_link_new_line() {\n        let le = 
MarkdownLinkExtractor();\n        let result = le.find_links(\"\\n123<a href=\\\"http://example.net/\\\"> link text</a>\");\n        let expected = Ok(MarkupLink {\n            target: \"http://example.net/\".to_string(),\n            line: 2,\n            column: 4,\n            source: \"\".to_string(),\n        });\n        assert_eq!(vec![expected], result);\n    }\n\n    #[test]\n    fn raw_html_issue_31() {\n        let le = MarkdownLinkExtractor();\n        let result = le.find_links(\"Some text <a href=\\\"some_url\\\">link text</a> more text.\");\n        let expected = Ok(MarkupLink {\n            target: \"some_url\".to_string(),\n            line: 1,\n            column: 11,\n            source: \"\".to_string(),\n        });\n        assert_eq!(vec![expected], result);\n    }\n\n    #[test]\n    fn referenced_link() {\n        let le = MarkdownLinkExtractor();\n        let link_str = \"http://example.net/\";\n        let input = format!(\n            \"This is [an example][arbitrary case-insensitive reference text] reference-style link.\\n\\n[Arbitrary CASE-insensitive reference text]: {link_str}\"\n        );\n        let result = le.find_links(&input);\n        let expected = Ok(MarkupLink {\n            target: link_str.to_string(),\n            line: 1,\n            column: 9,\n            source: \"\".to_string(),\n        });\n        assert_eq!(vec![expected], result);\n    }\n\n    #[test]\n    fn referenced_link_tag_only() {\n        let le = MarkdownLinkExtractor();\n        let link_str = \"http://example.net/\";\n        let input = format!(\"Foo Bar\\n\\n[Arbitrary CASE-insensitive reference text]: {link_str}\");\n        let result = le.find_links(&input);\n        assert_eq!(0, result.len());\n    }\n\n    #[test]\n    fn referenced_link_no_tag_only() {\n        let le = MarkdownLinkExtractor();\n        let input = \"[link][reference]\";\n        let result = le.find_links(input);\n        assert_eq!(1, result.len());\n    }\n\n    
#[test]\n    fn ignore_disable_line() {\n        let le = MarkdownLinkExtractor();\n        let input = \"<!-- mlc-disable-line --> [link](http://example.net/)\";\n        let result = le.find_links(input);\n        assert!(result.is_empty());\n    }\n\n    #[test]\n    fn ignore_disable_next_line() {\n        let le = MarkdownLinkExtractor();\n        let input = \"<!-- mlc-disable-next-line -->\\n[link](http://example.net/)\";\n        let result = le.find_links(input);\n        assert!(result.is_empty());\n    }\n\n    #[test]\n    fn ignore_disable_block() {\n        let le = MarkdownLinkExtractor();\n        let input = \"<!-- mlc-disable -->\\n[link1](http://example.net/)\\n<!-- mlc-enable -->\\n[link2](http://example.com/)\";\n        let result = le.find_links(input);\n        assert_eq!(1, result.len());\n        assert_eq!(result[0].as_ref().unwrap().target, \"http://example.com/\");\n        assert_eq!(result[0].as_ref().unwrap().line, 4);\n    }\n\n    #[test]\n    fn ignore_multiple_blocks() {\n        let le = MarkdownLinkExtractor();\n        let input = \"[link1](http://a.com/)\\n<!-- mlc-disable -->\\n[link2](http://b.com/)\\n<!-- mlc-enable -->\\n[link3](http://c.com/)\\n<!-- mlc-disable -->\\n[link4](http://d.com/)\\n<!-- mlc-enable -->\\n[link5](http://e.com/)\";\n        let result = le.find_links(input);\n        assert_eq!(3, result.len());\n        assert_eq!(result[0].as_ref().unwrap().target, \"http://a.com/\");\n        assert_eq!(result[1].as_ref().unwrap().target, \"http://c.com/\");\n        assert_eq!(result[2].as_ref().unwrap().target, \"http://e.com/\");\n    }\n\n    #[test]\n    fn ignore_html_link_in_markdown() {\n        let le = MarkdownLinkExtractor();\n        let input = \"<!-- mlc-disable-next-line -->\\n<a href=\\\"http://example.net/\\\">link</a>\";\n        let result = le.find_links(input);\n        assert!(result.is_empty());\n    }\n\n    #[test]\n    fn ignore_mixed_types() {\n        let le = 
MarkdownLinkExtractor();\n        let input = \"[link1](http://a.com/)\\n<!-- mlc-disable-line --> [link2](http://b.com/)\\n[link3](http://c.com/)\";\n        let result = le.find_links(input);\n        assert_eq!(2, result.len());\n        assert_eq!(result[0].as_ref().unwrap().target, \"http://a.com/\");\n        assert_eq!(result[1].as_ref().unwrap().target, \"http://c.com/\");\n    }\n\n    #[test]\n    fn gfm_checkbox_not_link() {\n        let le = MarkdownLinkExtractor();\n        let input = \"- [x] checked task\\n- [ ] unchecked task\";\n        let result = le.find_links(input);\n        // GitHub-flavored markdown task list checkboxes should NOT be treated as links\n        assert!(\n            result.is_empty(),\n            \"Task list checkboxes should not be detected as links: {:?}\",\n            result\n        );\n    }\n\n    #[test]\n    fn gfm_checkbox_with_link() {\n        let le = MarkdownLinkExtractor();\n        let input = \"- [x] [actual link](http://example.com/)\\n- [ ] unchecked task\";\n        let result = le.find_links(input);\n        // Only the actual link should be detected, not the checkboxes\n        assert_eq!(1, result.len());\n        assert_eq!(result[0].as_ref().unwrap().target, \"http://example.com/\");\n    }\n}\n"
  },
  {
    "path": "src/link_extractors/mod.rs",
    "content": "mod html_link_extractor;\nmod ignore_comments;\npub mod link_extractor;\nmod markdown_link_extractor;\n"
  },
  {
    "path": "src/link_validator/file_system.rs",
    "content": "use crate::link_validator::LinkCheckResult;\nuse crate::Config;\nuse async_std::fs::canonicalize;\nuse async_std::path::Path;\nuse async_std::path::PathBuf;\nuse std::path::MAIN_SEPARATOR;\nuse walkdir::WalkDir;\n\npub async fn check_filesystem(target: &str, config: &Config) -> LinkCheckResult {\n    let target = Path::new(target);\n    debug!(\"Absolute target path {target:?}\");\n    if target.exists().await {\n        LinkCheckResult::Ok\n    } else if !config.optional.match_file_extension.unwrap_or_default()\n        && target.extension().is_none()\n    {\n        // Check if file exists ignoring the file extension\n        let target_file_name = match target.file_name() {\n            Some(s) => s,\n            None => return LinkCheckResult::Failed(\"Target path not found.\".to_string()),\n        };\n        let target_parent = match target.parent() {\n            Some(s) => s,\n            None => return LinkCheckResult::Failed(\"Target parent not found.\".to_string()),\n        };\n        debug!(\"Check if file ignoring the extension exists.\");\n        if target_parent.exists().await {\n            debug!(\"Parent {target_parent:?} exists. 
Search dir for file ignoring the extension.\");\n            for entry in WalkDir::new(target_parent)\n                .follow_links(false)\n                .max_depth(1)\n                .into_iter()\n                .filter_map(Result::ok)\n                .filter(|e| !e.file_type().is_dir())\n            {\n                let mut file_on_system = entry.into_path();\n                file_on_system.set_extension(\"\");\n                match file_on_system.file_name() {\n                    Some(file_name) => {\n                        if target_file_name == file_name {\n                            info!(\"Found file {file_on_system:?}\");\n                            return LinkCheckResult::Ok;\n                        }\n                    }\n                    None => {\n                        return LinkCheckResult::Failed(\"Target filename not found.\".to_string())\n                    }\n                }\n            }\n            LinkCheckResult::Failed(\"Target not found.\".to_string())\n        } else {\n            LinkCheckResult::Failed(\"Target not found.\".to_string())\n        }\n    } else {\n        LinkCheckResult::Failed(\"Target filename not found.\".to_string())\n    }\n}\n\npub async fn resolve_target_link(source: &str, target: &str, config: &Config) -> String {\n    let mut normalized_link = target.replace(['/', '\\\\'], std::path::MAIN_SEPARATOR_STR);\n    if let Some(idx) = normalized_link.find('#') {\n        info!(\n            \"Strip everything after #. 
The chapter part '{}' is not checked.\",\n            &normalized_link[idx..]\n        );\n        normalized_link = normalized_link[..idx].to_string();\n    }\n    let mut fs_link_target = Path::new(&normalized_link).to_path_buf();\n    if normalized_link.starts_with(MAIN_SEPARATOR) && config.optional.root_dir.is_some() {\n        match canonicalize(&config.optional.root_dir.as_ref().unwrap()).await {\n            Ok(new_root) => fs_link_target = new_root.join(Path::new(&normalized_link[1..])),\n            Err(e) => panic!(\n                \"Root path could not be converted to an absolute path. Does the directory exist? {}\",\n                e\n            ),\n        }\n    }\n\n    debug!(\"Check file system link target {target:?}\");\n    let abs_path = absolute_target_path(source, &fs_link_target)\n        .await\n        .to_str()\n        .expect(\"Could not resolve target path\")\n        .to_string();\n    // Remove verbatim path identifier which causes trouble on Windows when using ../../ in paths\n    abs_path\n        .strip_prefix(\"\\\\\\\\?\\\\\")\n        .unwrap_or(&abs_path)\n        .to_string()\n}\n\nasync fn absolute_target_path(source: &str, target: &PathBuf) -> PathBuf {\n    let abs_source = canonicalize(source).await.expect(\"Expected path to exist.\");\n    if target.is_relative() {\n        let root = format!(\"{MAIN_SEPARATOR}\");\n        let parent = abs_source.parent().unwrap_or_else(|| Path::new(&root));\n        let new_target = match target.strip_prefix(format!(\".{MAIN_SEPARATOR}\")) {\n            Ok(t) => t,\n            Err(_) => target,\n        };\n        parent.join(new_target)\n    } else {\n        target.clone()\n    }\n}\n\n#[cfg(test)]\nmod test {\n    use super::*;\n\n    #[tokio::test]\n    async fn remove_dot() {\n        let source = Path::new(file!())\n            .parent()\n            .unwrap()\n            .parent()\n            .unwrap()\n            .parent()\n            .unwrap()\n            
.join(\"benches\")\n            .join(\"benchmark\");\n        let target = Path::new(\"./script_and_comments.md\").to_path_buf();\n\n        let path = absolute_target_path(source.to_str().unwrap(), &target).await;\n\n        let path_str = path.to_str().unwrap().to_string();\n        println!(\"{path_str:?}\");\n        assert_eq!(path_str.matches('.').count(), 1);\n    }\n}\n"
  },
  {
    "path": "src/link_validator/http.rs",
    "content": "use crate::link_validator::LinkCheckResult;\n\nuse reqwest::header::ACCEPT;\nuse reqwest::header::USER_AGENT;\nuse reqwest::Client;\nuse reqwest::Method;\nuse reqwest::Request;\nuse reqwest::StatusCode;\nuse wildmatch::WildMatch;\n\nconst BROWSER_ACCEPT_HEADER: &str =\n    \"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\";\n\npub async fn check_http(\n    target: &str,\n    do_not_warn_for_redirect_to: &[WildMatch],\n    http_headers: &[(String, String)],\n) -> LinkCheckResult {\n    debug!(\"Check http link target {target:?}\");\n    let url = reqwest::Url::parse(target).expect(\"URL of unknown type\");\n\n    match http_request(&url, do_not_warn_for_redirect_to, http_headers).await {\n        Ok(response) => response,\n        Err(error_msg) => LinkCheckResult::Failed(format!(\"Http(s) request failed. {error_msg}\")),\n    }\n}\n\nfn new_request(method: Method, url: &reqwest::Url, http_headers: &[(String, String)]) -> Request {\n    let mut req = Request::new(method, url.clone());\n    let headers = req.headers_mut();\n    headers.insert(ACCEPT, BROWSER_ACCEPT_HEADER.parse().unwrap());\n\n    // Set default user agent if no custom User-Agent is provided\n    let has_custom_user_agent = http_headers\n        .iter()\n        .any(|(k, _)| k.to_lowercase() == \"user-agent\");\n    if !has_custom_user_agent {\n        headers.insert(USER_AGENT, \"mlc (github.com/becheran/mlc)\".parse().unwrap());\n    }\n\n    // Apply custom headers\n    for (key, value) in http_headers {\n        if let (Ok(header_name), Ok(header_value)) = (\n            reqwest::header::HeaderName::from_bytes(key.as_bytes()),\n            reqwest::header::HeaderValue::from_str(value),\n        ) {\n            headers.insert(header_name, header_value);\n        } else {\n            warn!(\"Invalid HTTP header: {}: {}\", key, value);\n        }\n    }\n\n    req\n}\n\nasync fn http_request(\n    url: &reqwest::Url,\n    do_not_warn_for_redirect_to: 
&[WildMatch],\n    http_headers: &[(String, String)],\n) -> reqwest::Result<LinkCheckResult> {\n    lazy_static! {\n        static ref CLIENT: Client = reqwest::Client::builder()\n            .brotli(true)\n            .gzip(true)\n            .deflate(true)\n            .build()\n            .expect(\"Bug! failed to build client\");\n    }\n\n    fn status_to_string(status: StatusCode) -> String {\n        format!(\n            \"{} - {}\",\n            status.as_str(),\n            status.canonical_reason().unwrap_or(\"Unknown reason\")\n        )\n    }\n\n    let response = CLIENT\n        .execute(new_request(Method::HEAD, url, http_headers))\n        .await?;\n    let check_redirect = |response_url: &reqwest::Url| -> reqwest::Result<LinkCheckResult> {\n        // Compare URLs ignoring fragments since fragments are not sent to the server\n        // and the response URL will never have them\n        let urls_match = url.scheme() == response_url.scheme()\n            && url.host() == response_url.host()\n            && url.port() == response_url.port()\n            && url.path() == response_url.path()\n            && url.query() == response_url.query();\n\n        if urls_match\n            || do_not_warn_for_redirect_to\n                .iter()\n                .any(|x| x.matches(response_url.as_ref()))\n        {\n            Ok(LinkCheckResult::Ok)\n        } else {\n            Ok(LinkCheckResult::Warning(\n                \"Request was redirected to \".to_string() + response_url.as_ref(),\n            ))\n        }\n    };\n\n    let status = response.status();\n    if status.is_success() || status.is_redirection() {\n        check_redirect(response.url())\n    } else {\n        debug!(\"Got the status code {status:?}. 
Retry with get-request.\");\n        let get_request = new_request(Method::GET, url, http_headers);\n\n        let response = CLIENT.execute(get_request).await?;\n        let status = response.status();\n        if status.is_success() || status.is_redirection() {\n            check_redirect(response.url())\n        } else {\n            Ok(LinkCheckResult::Failed(status_to_string(status)))\n        }\n    }\n}\n\n#[cfg(test)]\nmod test {\n    use super::*;\n\n    #[tokio::test]\n    async fn check_http_is_available() {\n        let mut server = mockito::Server::new_async().await;\n        server\n            .mock(\"GET\", \"/\")\n            .with_status(200)\n            .create_async()\n            .await;\n\n        let result = check_http(&server.url(), &[], &[]).await;\n        assert_eq!(result, LinkCheckResult::Ok);\n    }\n\n    #[tokio::test]\n    async fn check_http_fail() {\n        let mut server = mockito::Server::new_async().await;\n        server\n            .mock(\"GET\", \"/\")\n            .with_status(500)\n            .create_async()\n            .await;\n\n        let result = check_http(&server.url(), &[], &[]).await;\n        assert_eq!(\n            result,\n            LinkCheckResult::Failed(\"500 - Internal Server Error\".to_string())\n        );\n    }\n\n    #[tokio::test]\n    async fn check_http_is_redirection() {\n        let mut redirect_server = mockito::Server::new_async().await;\n        redirect_server\n            .mock(\"GET\", \"/\")\n            .with_status(200)\n            .create_async()\n            .await;\n\n        let mut server = mockito::Server::new_async().await;\n        server\n            .mock(\"GET\", \"/\")\n            .with_status(301)\n            .with_header(\"Location\", &redirect_server.url())\n            .create_async()\n            .await;\n\n        let result = check_http(&server.url(), &[], &[]).await;\n        assert_eq!(\n            result,\n            LinkCheckResult::Warning(format!(\n  
              \"Request was redirected to {}/\",\n                &redirect_server.url()\n            ))\n        );\n    }\n\n    #[tokio::test]\n    async fn check_http_redirection_do_not_warn_if_ignored() {\n        let mut redirect_server = mockito::Server::new_async().await;\n        redirect_server\n            .mock(\"GET\", \"/\")\n            .with_status(200)\n            .create_async()\n            .await;\n        let mut server = mockito::Server::new_async().await;\n        server\n            .mock(\"GET\", \"/\")\n            .with_status(301)\n            .with_header(\"Location\", &redirect_server.url())\n            .create_async()\n            .await;\n\n        let result = check_http(\n            &server.url(),\n            &[WildMatch::new(&format!(\"{}*\", &redirect_server.url()))],\n            &[],\n        )\n        .await;\n\n        assert_eq!(result, LinkCheckResult::Ok);\n    }\n\n    #[tokio::test]\n    async fn check_http_redirection_do_not_warn_if_ignored_star_pattern() {\n        let mut redirect_server = mockito::Server::new_async().await;\n        redirect_server\n            .mock(\"GET\", \"/\")\n            .with_status(200)\n            .create_async()\n            .await;\n        let mut server = mockito::Server::new_async().await;\n        server\n            .mock(\"GET\", \"/\")\n            .with_status(301)\n            .with_header(\"Location\", &redirect_server.url())\n            .create_async()\n            .await;\n\n        let result = check_http(&server.url(), &[WildMatch::new(\"*\")], &[]).await;\n\n        assert_eq!(result, LinkCheckResult::Ok);\n    }\n\n    #[tokio::test]\n    async fn check_http_redirection_do_warn_if_ignored_mismatch() {\n        let mut redirect_server = mockito::Server::new_async().await;\n        redirect_server\n            .mock(\"GET\", \"/\")\n            .with_status(200)\n            .create_async()\n            .await;\n        let mut server = 
mockito::Server::new_async().await;\n        server\n            .mock(\"GET\", \"/\")\n            .with_status(301)\n            .with_header(\"Location\", &redirect_server.url())\n            .create_async()\n            .await;\n\n        let result = check_http(\n            &server.url(),\n            &[WildMatch::new(\"http://is-mismatched.com/*\")],\n            &[],\n        )\n        .await;\n\n        assert_eq!(\n            result,\n            LinkCheckResult::Warning(format!(\n                \"Request was redirected to {}/\",\n                &redirect_server.url()\n            ))\n        );\n    }\n\n    #[tokio::test]\n    async fn check_http_is_redirection_failure() {\n        let mut redirect_server = mockito::Server::new_async().await;\n        redirect_server\n            .mock(\"GET\", \"/\")\n            .with_status(403)\n            .create_async()\n            .await;\n        let mut server = mockito::Server::new_async().await;\n        server\n            .mock(\"GET\", \"/\")\n            .with_status(301)\n            .with_header(\"Location\", &redirect_server.url())\n            .create_async()\n            .await;\n\n        let result = check_http(&server.url(), &[], &[]).await;\n\n        assert_eq!(\n            result,\n            LinkCheckResult::Failed(\"403 - Forbidden\".to_string())\n        );\n    }\n\n    #[tokio::test]\n    async fn check_http_with_fragment_no_warning() {\n        let mut server = mockito::Server::new_async().await;\n        server\n            .mock(\"GET\", \"/page\")\n            .with_status(200)\n            .create_async()\n            .await;\n\n        // The URL with a fragment should not produce a redirect warning\n        // because the fragment is not sent to the server\n        let url_with_fragment = format!(\"{}/page#anchor\", server.url());\n        let result = check_http(&url_with_fragment, &[], &[]).await;\n        assert_eq!(result, LinkCheckResult::Ok);\n    }\n\n    
#[tokio::test]\n    async fn check_http_with_fragment_real_redirect_warns() {\n        let mut redirect_server = mockito::Server::new_async().await;\n        redirect_server\n            .mock(\"GET\", \"/other-page\")\n            .with_status(200)\n            .create_async()\n            .await;\n\n        let mut server = mockito::Server::new_async().await;\n        server\n            .mock(\"GET\", \"/page\")\n            .with_status(301)\n            .with_header(\"Location\", &format!(\"{}/other-page\", redirect_server.url()))\n            .create_async()\n            .await;\n\n        // A real redirect to a different page should still produce a warning\n        // even if the original URL had a fragment\n        let url_with_fragment = format!(\"{}/page#anchor\", server.url());\n        let result = check_http(&url_with_fragment, &[], &[]).await;\n        assert_eq!(\n            result,\n            LinkCheckResult::Warning(format!(\n                \"Request was redirected to {}/other-page\",\n                &redirect_server.url()\n            ))\n        );\n    }\n\n    #[tokio::test]\n    async fn check_http_with_custom_headers() {\n        let mut server = mockito::Server::new_async().await;\n        server\n            .mock(\"GET\", \"/\")\n            .match_header(\"user-agent\", \"CustomAgent/1.0\")\n            .match_header(\"x-custom-header\", \"test-value\")\n            .with_status(200)\n            .create_async()\n            .await;\n\n        let custom_headers = vec![\n            (\"User-Agent\".to_string(), \"CustomAgent/1.0\".to_string()),\n            (\"X-Custom-Header\".to_string(), \"test-value\".to_string()),\n        ];\n        let result = check_http(&server.url(), &[], &custom_headers).await;\n        assert_eq!(result, LinkCheckResult::Ok);\n    }\n}\n"
  },
  {
    "path": "src/link_validator/link_type.rs",
    "content": "extern crate url;\n\nuse self::url::Url;\nuse regex::Regex;\n\n#[derive(Debug, PartialEq, Eq, Hash, Clone, Copy)]\npub enum LinkType {\n    Http,\n    Ftp,\n    Mail,\n    FileSystem,\n    UnknownUrlSchema,\n    Unknown,\n}\n\n#[must_use]\npub fn get_link_type(link: &str) -> LinkType {\n    lazy_static! {\n        static ref FILE_SYSTEM_REGEX: Regex =\n            Regex::new(r\"^(([[:alpha:]]:(\\\\|/))|(..?(\\\\|/))|((\\\\\\\\?|//?))).*\").unwrap();\n    }\n\n    if FILE_SYSTEM_REGEX.is_match(link) || !link.contains(':') {\n        return if link.contains('@') {\n            LinkType::Mail\n        } else {\n            LinkType::FileSystem\n        };\n    }\n\n    if let Ok(url) = Url::parse(link) {\n        let scheme = url.scheme();\n        debug!(\"Link {link} is a URL type with scheme {scheme}\");\n        return match scheme {\n            \"http\" | \"https\" => LinkType::Http,\n            \"ftp\" | \"ftps\" => LinkType::Ftp,\n            \"mailto\" => LinkType::Mail,\n            \"file\" => LinkType::FileSystem,\n            _ => LinkType::UnknownUrlSchema,\n        };\n    }\n    LinkType::UnknownUrlSchema\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n    use ntest::test_case;\n\n    fn test_link(link: &str, expected_type: &LinkType) {\n        let link_type = get_link_type(link);\n        assert_eq!(link_type, *expected_type);\n    }\n\n    #[test_case(\"https://doc.rust-lang.org.html\")]\n    #[test_case(\"http://www.website.php\")]\n    fn http_link_types(link: &str) {\n        test_link(link, &LinkType::Http);\n    }\n\n    #[test_case(\"ftp://mueller:12345@ftp.downloading.ch\")]\n    fn ftp_link_types(ftp: &str) {\n        test_link(ftp, &LinkType::Ftp);\n    }\n\n    #[test_case(\"F:/fake/windows/paths\")]\n    #[test_case(\"\\\\\\\\smb}\\\\paths\")]\n    #[test_case(\"C:\\\\traditional\\\\paths\")]\n    #[test_case(\"\\\\file.ext\")]\n    #[test_case(\"file:///some/path/\")]\n    #[test_case(\"path\")]\n    
#[test_case(\"./file.ext\")]\n    #[test_case(\".\\\\file.md\")]\n    #[test_case(\"../upper_dir.md\")]\n    #[test_case(\"..\\\\upper_dir.mdc\")]\n    #[test_case(\"D:\\\\Program Files(x86)\\\\file.log\")]\n    #[test_case(\"D:\\\\Program Files(x86)\\\\folder\\\\file.log\")]\n    fn test_file_system_link_types(link: &str) {\n        test_link(link, &LinkType::FileSystem);\n    }\n}\n"
  },
  {
    "path": "src/link_validator/mail.rs",
    "content": "use crate::link_validator::LinkCheckResult;\nuse regex::Regex;\n\npub fn check_mail(target: &str) -> LinkCheckResult {\n    debug!(\"Check mail target {target:?}\");\n    let mut mail = target;\n    if let Some(stripped) = target.strip_prefix(\"mailto://\") {\n        mail = stripped;\n    } else if let Some(stripped) = target.strip_prefix(\"mailto:\") {\n        mail = stripped;\n    }\n    lazy_static! {\n        static ref EMAIL_REGEX: Regex = Regex::new(\n            r\"^((?i)[a-z0-9_!#$%&'*+-/=?^`{|}~+]([a-z0-9_!#$%&'*+-/=?^`{|}~+.]*[a-z0-9_!#$%&'*+-/=?^_{|}~+])?)@([a-z0-9]+([\\-\\.]{1}[a-z0-9]+)*\\.[a-z]{2,6})\"\n        )\n        .unwrap();\n    }\n    if EMAIL_REGEX.is_match(mail) {\n        LinkCheckResult::Ok\n    } else {\n        LinkCheckResult::Failed(\"Not a valid mail address.\".to_string())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n    use ntest::test_case;\n\n    #[test_case(\"mailto://+bar@bar.com\")]\n    #[test_case(\"mailto://foo+@bar.com\")]\n    #[test_case(\"mailto://foo.lastname@bar.com\")]\n    #[test_case(\"mailto://tst@xyz.us\")]\n    #[test_case(\"mailto:bla.bla@web.de\")]\n    #[test_case(\"mailto:bla.bla.ext@web.de\")]\n    #[test_case(\"mailto:BlA.bLa.ext@web.de\")]\n    #[test_case(\"mailto:foo-bar@foobar.com\")]\n    #[test_case(\"mailto:!#$%&'*+-/=?^_`{|}~-foo@foobar.com\")]\n    #[test_case(\"mailto:some@hostnumbers123.com\")]\n    #[test_case(\"mailto:some@host-name.com\")]\n    #[test_case(\"bla.bla@web.de\")]\n    fn mail_links(link: &str) {\n        let result = check_mail(link);\n        assert_eq!(result, LinkCheckResult::Ok);\n    }\n\n    #[test_case(\"mailto://@bar@bar\")]\n    #[test_case(\"mailto://foobar.com\")]\n    #[test_case(\"mailto://foo.lastname.com\")]\n    #[test_case(\"mailto:foo.do@l$astname.cOM\")]\n    #[test_case(\"mailto:foo@l_astname.cOM\")]\n    fn invalid_mail_links(link: &str) {\n        let result = check_mail(link);\n        assert!(result != 
LinkCheckResult::Ok);\n    }\n}\n"
  },
  {
    "path": "src/link_validator/mod.rs",
    "content": "mod file_system;\nmod http;\nmod mail;\n\npub mod link_type;\n\nuse crate::link_extractors::link_extractor::MarkupLink;\nuse crate::link_validator::file_system::check_filesystem;\nuse crate::link_validator::http::check_http;\nuse crate::Config;\nuse mail::check_mail;\n\npub use link_type::get_link_type;\npub use link_type::LinkType;\nuse wildmatch::WildMatch;\n\n#[derive(Debug, Eq, PartialEq, Clone)]\npub enum LinkCheckResult {\n    Ok,\n    Failed(String),\n    Warning(String),\n    Ignored(String),\n    NotImplemented(String),\n}\n\npub async fn resolve_target_link(\n    link: &MarkupLink,\n    link_type: &LinkType,\n    config: &Config,\n) -> String {\n    if link_type == &LinkType::FileSystem {\n        file_system::resolve_target_link(&link.source, &link.target, config).await\n    } else {\n        link.target.to_string()\n    }\n}\n\npub async fn check(\n    link_target: &str,\n    link_type: &LinkType,\n    config: &Config,\n    do_not_warn_for_redirect_to: &[WildMatch],\n    http_headers: &[(String, String)],\n) -> LinkCheckResult {\n    info!(\"Check link {}.\", &link_target);\n    match link_type {\n        LinkType::Ftp => LinkCheckResult::NotImplemented(format!(\n            \"Link type '{:?}' is not supported yet...\",\n            &link_target\n        )),\n        LinkType::UnknownUrlSchema | LinkType::Unknown => LinkCheckResult::NotImplemented(\n            \"Link type is not implemented yet and cannot be checked.\".to_string(),\n        ),\n        LinkType::Mail => check_mail(link_target),\n        LinkType::Http => {\n            if config.optional.offline.unwrap_or_default() {\n                LinkCheckResult::Ignored(\"Ignore web link because of the offline flag.\".to_string())\n            } else {\n                check_http(link_target, do_not_warn_for_redirect_to, http_headers).await\n            }\n        }\n        LinkType::FileSystem => check_filesystem(link_target, config).await,\n    }\n}\n"
  },
  {
    "path": "src/logger.rs",
    "content": "use std::time::SystemTime;\n\npub fn init(log_level: log::LevelFilter) -> Result<(), fern::InitError> {\n    fern::Dispatch::new()\n        .format(|out, message, record| {\n            out.finish(format_args!(\n                \"\\x1B[{}m[{} {} {}] {}\\x1B[0m\",\n                match record.level() {\n                    log::Level::Error => \"31\", // Red\n                    log::Level::Warn => \"33\",  // Yellow\n                    log::Level::Info => \"32\",  // Green\n                    log::Level::Debug => \"34\", // Blue\n                    log::Level::Trace => \"37\", // White\n                },\n                SystemTime::now()\n                    .duration_since(SystemTime::UNIX_EPOCH)\n                    .unwrap()\n                    .as_secs(),\n                record.level(),\n                record.target(),\n                message\n            ))\n        })\n        .level(log_level)\n        .chain(std::io::stdout())\n        .apply()?;\n    debug!(\"Initialized logging\");\n    Ok(())\n}\n"
  },
  {
    "path": "src/main.rs",
    "content": "#[macro_use]\nextern crate log;\n\nuse mlc::cli;\nuse mlc::logger;\nuse std::process;\n\n#[macro_use]\nextern crate clap;\n\nfn print_header() {\n    let width = 60;\n    let header = format!(\"markup link checker - mlc v{:}\", crate_version!());\n    println!();\n    println!(\"{:+<1$}\", \"\", width);\n    print!(\"+\");\n    print!(\"{: <1$}\", \"\", width - 2);\n    println!(\"+\");\n    print!(\"+\");\n    print!(\"{: ^1$}\", header, width - 2);\n    println!(\"+\");\n    print!(\"+\");\n    print!(\"{: <1$}\", \"\", width - 2);\n    println!(\"+\");\n    println!(\"{:+<1$}\", \"\", width);\n    println!();\n}\n\n#[tokio::main]\nasync fn main() -> Result<(), Box<dyn std::error::Error>> {\n    print_header();\n    let config = cli::parse_args();\n    let log_level = match config.optional.debug {\n        Some(true) => log::LevelFilter::Debug,\n        _ => log::LevelFilter::Error,\n    };\n    logger::init(log_level)?;\n    info!(\"Config: {}\", &config);\n    if mlc::run(&config).await.is_err() {\n        process::exit(1);\n    } else {\n        process::exit(0);\n    }\n}\n"
  },
  {
    "path": "src/markup.rs",
    "content": "use serde::Deserialize;\nuse std::str::FromStr;\n\n#[derive(Debug)]\npub struct MarkupFile {\n    pub markup_type: MarkupType,\n    pub path: String,\n}\n\n#[derive(Debug, Clone, Copy, Deserialize)]\npub enum MarkupType {\n    Markdown,\n    Html,\n}\n\nimpl FromStr for MarkupType {\n    type Err = ();\n\n    fn from_str(s: &str) -> Result<MarkupType, ()> {\n        match s {\n            \"md\" => Ok(MarkupType::Markdown),\n            \"html\" => Ok(MarkupType::Html),\n            _ => Err(()),\n        }\n    }\n}\n\nimpl MarkupType {\n    #[must_use]\n    pub fn file_extensions(&self) -> Vec<String> {\n        match self {\n            MarkupType::Markdown => vec![\n                \"md\".to_string(),\n                \"markdown\".to_string(),\n                \"mkdown\".to_string(),\n                \"mkdn\".to_string(),\n                \"mkd\".to_string(),\n                \"mdwn\".to_string(),\n                \"mdtxt\".to_string(),\n                \"mdtext\".to_string(),\n                \"text\".to_string(),\n                \"rmd\".to_string(),\n            ],\n            MarkupType::Html => vec![\"htm\".to_string(), \"html\".to_string(), \"xhtml\".to_string()],\n        }\n    }\n}\n"
  },
  {
    "path": "tests/end_to_end.rs",
    "content": "#[cfg(test)]\nmod helper;\n\nuse helper::benches_dir;\nuse mlc::markup::MarkupType;\nuse mlc::Config;\nuse mlc::OptionalConfig;\nuse std::fs;\nuse std::path::MAIN_SEPARATOR;\n\n#[tokio::test]\nasync fn end_to_end() {\n    let config = Config {\n        directory: benches_dir().join(\"benchmark\"),\n        optional: OptionalConfig {\n            debug: None,\n            do_not_warn_for_redirect_to: None,\n            markup_types: Some(vec![MarkupType::Markdown]),\n            offline: Some(true), // Use offline mode to avoid checking external URLs\n            match_file_extension: None,\n            throttle: None,\n            ignore_links: Some(vec![\"./doc/broken-local-link.doc\".to_string()]),\n            ignore_path: Some(vec![\n                fs::canonicalize(\"benches/benchmark/markdown/ignore_me.md\").unwrap(),\n                fs::canonicalize(\"./benches/benchmark/markdown/ignore_me_dir\").unwrap(),\n            ]),\n            root_dir: None,\n            gitignore: None,\n            gituntracked: None,\n            csv_file: None,\n            files: None,\n            http_headers: None,\n        },\n    };\n    if let Err(e) = mlc::run(&config).await {\n        panic!(\"Test failed. 
{:?}\", e);\n    }\n}\n\n#[tokio::test]\nasync fn end_to_end_different_root() {\n    let test_files = benches_dir().join(\"different_root\");\n    let csv_output = std::env::temp_dir().join(\"mlc_test_different_root.csv\");\n    let config = Config {\n        directory: test_files.clone(),\n        optional: OptionalConfig {\n            debug: Some(true),\n            do_not_warn_for_redirect_to: None,\n            markup_types: Some(vec![MarkupType::Markdown]),\n            offline: None,\n            match_file_extension: None,\n            ignore_links: None,\n            ignore_path: None,\n            throttle: None,\n            root_dir: Some(test_files),\n            gitignore: None,\n            gituntracked: None,\n            csv_file: Some(csv_output.clone()),\n            files: None,\n            http_headers: None,\n        },\n    };\n    if let Err(e) = mlc::run(&config).await {\n        panic!(\"Test with custom root failed. {:?}\", e);\n    } else {\n        // Check if the CSV file was created, but is empty except for the header\n        let content = fs::read_to_string(csv_output).unwrap();\n        let lines: Vec<&str> = content.lines().collect();\n        assert_eq!(lines.len(), 1);\n        assert_eq!(lines[0], \"source,line,column,target,severity\");\n    }\n}\n\n#[tokio::test]\nasync fn end_to_end_write_csv_file() {\n    let csv_output = std::env::temp_dir().join(\"mlc_test_write_csv.csv\");\n    let config = Config {\n        directory: benches_dir().join(\"benchmark/markdown/ignore_me.md\"),\n        optional: OptionalConfig {\n            debug: None,\n            do_not_warn_for_redirect_to: None,\n            markup_types: Some(vec![MarkupType::Markdown]),\n            offline: None,\n            match_file_extension: None,\n            throttle: None,\n            ignore_links: None,\n            ignore_path: None,\n            root_dir: None,\n            gitignore: None,\n            gituntracked: None,\n            csv_file: 
Some(csv_output.clone()),\n            files: None,\n            http_headers: None,\n        },\n    };\n    if (mlc::run(&config).await).is_err() {\n        let content = fs::read_to_string(csv_output).unwrap();\n        let lines: Vec<&str> = content.lines().collect();\n        assert_eq!(lines.len(), 4);\n        assert_eq!(lines[0], \"source,line,column,target,severity\");\n        for (i, line) in lines.iter().enumerate().skip(1) {\n            assert_eq!(\n                line,\n                &format!(\n                    \"benches{MAIN_SEPARATOR}benchmark/markdown/ignore_me.md,{i},1,broken_Link,ERR\",\n                )\n            );\n        }\n    } else {\n        panic!(\"Should have detected errors\");\n    }\n}\n\n#[tokio::test]\nasync fn end_to_end_csv_include_warnings() {\n    let csv_output = std::env::temp_dir().join(\"mlc_test_csv_warnings.csv\");\n    let config = Config {\n        directory: benches_dir().join(\"benchmark/markdown/ref_links.md\"),\n        optional: OptionalConfig {\n            debug: None,\n            do_not_warn_for_redirect_to: None,\n            markup_types: Some(vec![MarkupType::Markdown]),\n            offline: Some(true), // Use offline mode to avoid actual HTTP calls\n            match_file_extension: None,\n            throttle: None,\n            ignore_links: None,\n            ignore_path: None,\n            root_dir: None,\n            gitignore: None,\n            gituntracked: None,\n            csv_file: Some(csv_output.clone()),\n            files: None,\n            http_headers: None,\n        },\n    };\n    // Run the check - should succeed because we're offline\n    let result = mlc::run(&config).await;\n\n    // Check that CSV was created\n    assert!(csv_output.exists(), \"CSV file should exist\");\n\n    let content = fs::read_to_string(&csv_output).unwrap();\n    let lines: Vec<&str> = content.lines().collect();\n\n    // Should have header and warning entries\n    assert!(\n        lines.len() 
> 1,\n        \"CSV should have header and warning entries\"\n    );\n    assert_eq!(lines[0], \"source,line,column,target,severity\");\n\n    // Verify that warning entries are present - the ref_links.md file has several broken markdown references\n    // Check that all lines after header have the expected CSV format with severity column\n    for line in lines.iter().skip(1) {\n        let parts: Vec<&str> = line.split(',').collect();\n        assert_eq!(\n            parts.len(),\n            5,\n            \"Each CSV line should have 5 columns including severity\"\n        );\n        assert!(\n            parts[0].contains(\"ref_links.md\"),\n            \"Source should be ref_links.md\"\n        );\n        assert_eq!(parts[4], \"WARN\", \"Severity should be WARN for warnings\");\n    }\n\n    // Verify specific warnings are captured (broken markdown references)\n    assert!(\n        content.contains(\",WARN\"),\n        \"CSV should contain WARN severity\"\n    );\n\n    // Clean up\n    let _ = fs::remove_file(csv_output);\n\n    // Also verify the test would pass\n    assert!(result.is_ok(), \"Should succeed with warnings only\");\n}\n"
  },
  {
    "path": "tests/end_to_end_mock.rs",
    "content": "#[cfg(test)]\nmod helper;\n\nuse helper::benches_dir;\nuse mlc::markup::MarkupType;\nuse mlc::Config;\nuse mlc::OptionalConfig;\nuse mockito::ServerGuard;\nuse std::fs;\nuse std::path::PathBuf;\n\nasync fn setup_mock_servers() -> Vec<ServerGuard> {\n    let mut servers = Vec::new();\n\n    // Create multiple mock servers\n    for _ in 0..8 {\n        let mut server = mockito::Server::new_async().await;\n        server\n            .mock(\"HEAD\", \"/\")\n            .with_status(200)\n            .create_async()\n            .await;\n        server\n            .mock(\"GET\", \"/\")\n            .with_status(200)\n            .create_async()\n            .await;\n        servers.push(server);\n    }\n\n    servers\n}\n\nfn replace_mock_urls(content: &str, servers: &[ServerGuard]) -> String {\n    let mut result = content.to_string();\n    for (i, server) in servers.iter().enumerate() {\n        let placeholder = format!(\"MOCK_SERVER_URL_{}\", i + 1);\n        result = result.replace(&placeholder, &server.url());\n    }\n    result\n}\n\nfn test_files_dir() -> PathBuf {\n    benches_dir()\n        .parent()\n        .unwrap()\n        .join(\"tests\")\n        .join(\"test_files\")\n}\n\n#[tokio::test]\nasync fn end_to_end_with_mock_servers() {\n    // Set up mock servers\n    let servers = setup_mock_servers().await;\n\n    // Create temporary directory for test files with replaced URLs\n    let temp_dir = std::env::temp_dir().join(\"mlc_test_mock_servers\");\n    if temp_dir.exists() {\n        fs::remove_dir_all(&temp_dir).unwrap();\n    }\n    fs::create_dir_all(&temp_dir).unwrap();\n    fs::create_dir_all(temp_dir.join(\"deep\")).unwrap();\n\n    // Copy and replace URLs in test files\n    let test_files = test_files_dir();\n\n    // Reference links file\n    let content = fs::read_to_string(test_files.join(\"reference_links.md\")).unwrap();\n    let updated_content = replace_mock_urls(&content, &servers);\n    
fs::write(temp_dir.join(\"reference_links.md\"), updated_content).unwrap();\n\n    // Many links file\n    let content = fs::read_to_string(test_files.join(\"many_links.md\")).unwrap();\n    let updated_content = replace_mock_urls(&content, &servers);\n    fs::write(temp_dir.join(\"many_links.md\"), updated_content).unwrap();\n\n    // Repeat links file\n    let content = fs::read_to_string(test_files.join(\"repeat_links.md\")).unwrap();\n    let updated_content = replace_mock_urls(&content, &servers);\n    fs::write(temp_dir.join(\"repeat_links.md\"), updated_content).unwrap();\n\n    // Deep directory file\n    let content = fs::read_to_string(test_files.join(\"deep/index.md\")).unwrap();\n    fs::write(temp_dir.join(\"deep/index.md\"), content).unwrap();\n\n    // Run mlc with the temporary directory\n    let config = Config {\n        directory: temp_dir.clone(),\n        optional: OptionalConfig {\n            debug: Some(true),\n            do_not_warn_for_redirect_to: None,\n            markup_types: Some(vec![MarkupType::Markdown]),\n            offline: None,\n            match_file_extension: None,\n            throttle: None,\n            ignore_links: Some(vec![\n                // Only ignore non-http links that are expected to be unsupported\n                \"mailto://*\".to_string(),\n                \"another://*\".to_string(),\n            ]),\n            ignore_path: None,\n            root_dir: None,\n            gitignore: None,\n            gituntracked: None,\n            csv_file: None,\n            files: None,\n            http_headers: None,\n        },\n    };\n\n    // Run the link checker - should succeed because all mock servers return 200\n    if let Err(e) = mlc::run(&config).await {\n        panic!(\"Test failed with mock servers. 
{:?}\", e);\n    }\n\n    // Clean up\n    fs::remove_dir_all(&temp_dir).unwrap();\n}\n\n#[tokio::test]\nasync fn end_to_end_with_mock_server_failure() {\n    // Set up a mock server that returns 404\n    let mut server = mockito::Server::new_async().await;\n    server\n        .mock(\"HEAD\", \"/\")\n        .with_status(404)\n        .create_async()\n        .await;\n    server\n        .mock(\"GET\", \"/\")\n        .with_status(404)\n        .create_async()\n        .await;\n\n    // Create temporary directory for test files\n    let temp_dir = std::env::temp_dir().join(\"mlc_test_mock_failure\");\n    if temp_dir.exists() {\n        fs::remove_dir_all(&temp_dir).unwrap();\n    }\n    fs::create_dir_all(&temp_dir).unwrap();\n\n    // Create a simple test file with a broken link\n    let content = format!(\"[Broken Link]({})\", server.url());\n    fs::write(temp_dir.join(\"broken.md\"), content).unwrap();\n\n    let config = Config {\n        directory: temp_dir.clone(),\n        optional: OptionalConfig {\n            debug: Some(true),\n            do_not_warn_for_redirect_to: None,\n            markup_types: Some(vec![MarkupType::Markdown]),\n            offline: None,\n            match_file_extension: None,\n            throttle: None,\n            ignore_links: None,\n            ignore_path: None,\n            root_dir: None,\n            gitignore: None,\n            gituntracked: None,\n            csv_file: None,\n            files: None,\n            http_headers: None,\n        },\n    };\n\n    // Run the link checker - should fail because server returns 404\n    if mlc::run(&config).await.is_ok() {\n        panic!(\"Test should have failed due to 404 response from mock server\");\n    }\n\n    // Clean up\n    fs::remove_dir_all(&temp_dir).unwrap();\n}\n\n#[tokio::test]\nasync fn end_to_end_with_mock_server_redirect() {\n    // Set up redirect and target mock servers\n    let mut target_server = mockito::Server::new_async().await;\n    
target_server\n        .mock(\"HEAD\", \"/\")\n        .with_status(200)\n        .create_async()\n        .await;\n    target_server\n        .mock(\"GET\", \"/\")\n        .with_status(200)\n        .create_async()\n        .await;\n\n    let mut redirect_server = mockito::Server::new_async().await;\n    redirect_server\n        .mock(\"HEAD\", \"/\")\n        .with_status(301)\n        .with_header(\"Location\", &target_server.url())\n        .create_async()\n        .await;\n    redirect_server\n        .mock(\"GET\", \"/\")\n        .with_status(301)\n        .with_header(\"Location\", &target_server.url())\n        .create_async()\n        .await;\n\n    // Create temporary directory for test files\n    let temp_dir = std::env::temp_dir().join(\"mlc_test_mock_redirect\");\n    if temp_dir.exists() {\n        fs::remove_dir_all(&temp_dir).unwrap();\n    }\n    fs::create_dir_all(&temp_dir).unwrap();\n\n    // Create a test file with a redirect\n    let content = format!(\"[Redirect Link]({})\", redirect_server.url());\n    fs::write(temp_dir.join(\"redirect.md\"), content).unwrap();\n\n    let config = Config {\n        directory: temp_dir.clone(),\n        optional: OptionalConfig {\n            debug: Some(true),\n            do_not_warn_for_redirect_to: None,\n            markup_types: Some(vec![MarkupType::Markdown]),\n            offline: None,\n            match_file_extension: None,\n            throttle: None,\n            ignore_links: None,\n            ignore_path: None,\n            root_dir: None,\n            gitignore: None,\n            gituntracked: None,\n            csv_file: None,\n            files: None,\n            http_headers: None,\n        },\n    };\n\n    // Run the link checker - should succeed but with warnings\n    // The run should succeed even with redirect warnings\n    let result = mlc::run(&config).await;\n    assert!(\n        result.is_ok(),\n        \"Test should succeed even with redirect warnings\"\n    );\n\n    // 
Clean up\n    fs::remove_dir_all(&temp_dir).unwrap();\n}\n"
  },
  {
    "path": "tests/file_traversal.rs",
    "content": "#[cfg(test)]\nuse mlc::file_traversal;\nuse mlc::markup::{MarkupFile, MarkupType};\nuse mlc::Config;\nuse mlc::OptionalConfig;\nuse std::path::Path;\n\n#[test]\nfn find_markdown_files() {\n    let path = Path::new(\"./benches/benchmark/markdown/md_file_endings\").to_path_buf();\n    let config: Config = Config {\n        directory: path,\n        optional: OptionalConfig {\n            markup_types: Some(vec![MarkupType::Markdown]),\n            ..Default::default()\n        },\n    };\n    let mut result: Vec<MarkupFile> = Vec::new();\n\n    file_traversal::find(&config, &mut result);\n    assert_eq!(result.len(), 12);\n}\n\n#[test]\nfn empty_folder() {\n    let path = Path::new(\"./benches/benchmark/markdown/empty\").to_path_buf();\n    let config: Config = Config {\n        directory: path,\n        optional: OptionalConfig {\n            markup_types: Some(vec![MarkupType::Markdown]),\n            ..Default::default()\n        },\n    };\n    let mut result: Vec<MarkupFile> = Vec::new();\n\n    file_traversal::find(&config, &mut result);\n    assert!(result.is_empty());\n}\n"
  },
  {
    "path": "tests/files_option.rs",
    "content": "use mlc::file_traversal;\nuse mlc::markup::{MarkupFile, MarkupType};\nuse mlc::Config;\nuse mlc::OptionalConfig;\nuse std::path::{Path, PathBuf};\n\n#[test]\nfn find_specific_files() {\n    let file1 = Path::new(\"./README.md\").to_path_buf();\n    let file2 = Path::new(\"./CHANGELOG.md\").to_path_buf();\n\n    let config: Config = Config {\n        directory: PathBuf::from(\".\"),\n        optional: OptionalConfig {\n            markup_types: Some(vec![MarkupType::Markdown]),\n            files: Some(vec![file1, file2]),\n            ..Default::default()\n        },\n    };\n\n    let mut result: Vec<MarkupFile> = Vec::new();\n    file_traversal::find(&config, &mut result);\n\n    assert_eq!(result.len(), 2);\n    assert!(result.iter().any(|f| f.path.contains(\"README.md\")));\n    assert!(result.iter().any(|f| f.path.contains(\"CHANGELOG.md\")));\n}\n\n#[test]\nfn find_single_file() {\n    let file1 = Path::new(\"./README.md\").to_path_buf();\n\n    let config: Config = Config {\n        directory: PathBuf::from(\".\"),\n        optional: OptionalConfig {\n            markup_types: Some(vec![MarkupType::Markdown]),\n            files: Some(vec![file1]),\n            ..Default::default()\n        },\n    };\n\n    let mut result: Vec<MarkupFile> = Vec::new();\n    file_traversal::find(&config, &mut result);\n\n    assert_eq!(result.len(), 1);\n    assert!(result[0].path.contains(\"README.md\"));\n}\n\n#[test]\nfn find_files_ignores_non_matching_types() {\n    // Test with a markdown file but only HTML markup type configured\n    let file1 = Path::new(\"./README.md\").to_path_buf();\n\n    let config: Config = Config {\n        directory: PathBuf::from(\".\"),\n        optional: OptionalConfig {\n            markup_types: Some(vec![MarkupType::Html]),\n            files: Some(vec![file1]),\n            ..Default::default()\n        },\n    };\n\n    let mut result: Vec<MarkupFile> = Vec::new();\n    file_traversal::find(&config, &mut result);\n\n    
// Should not find any files since README.md is markdown, not HTML\n    assert_eq!(result.len(), 0);\n}\n\n#[test]\nfn find_files_with_ignore_path() {\n    let file1 = Path::new(\"./README.md\").to_path_buf();\n    let ignore_file = std::fs::canonicalize(Path::new(\"./README.md\")).unwrap();\n\n    let config: Config = Config {\n        directory: PathBuf::from(\".\"),\n        optional: OptionalConfig {\n            markup_types: Some(vec![MarkupType::Markdown]),\n            files: Some(vec![file1]),\n            ignore_path: Some(vec![ignore_file]),\n            ..Default::default()\n        },\n    };\n\n    let mut result: Vec<MarkupFile> = Vec::new();\n    file_traversal::find(&config, &mut result);\n\n    // Should be empty because the file is in ignore_path\n    assert_eq!(result.len(), 0);\n}\n"
  },
  {
    "path": "tests/gitignore_recursive.rs",
    "content": "use mlc::markup::MarkupType;\nuse mlc::Config;\nuse mlc::OptionalConfig;\nuse std::fs;\nuse std::path::{Path, PathBuf};\nuse std::process::Command;\n\nstruct TempDir {\n    path: PathBuf,\n}\n\nimpl TempDir {\n    fn new(name: &str) -> Self {\n        let mut path = std::env::temp_dir();\n        let unique = format!(\n            \"mlc_test_{name}_{}_{}\",\n            std::process::id(),\n            std::time::SystemTime::now()\n                .duration_since(std::time::UNIX_EPOCH)\n                .unwrap_or_default()\n                .as_nanos()\n        );\n        path.push(unique);\n        fs::create_dir_all(&path).expect(\"failed to create temp dir\");\n        Self { path }\n    }\n}\n\nimpl Drop for TempDir {\n    fn drop(&mut self) {\n        let _ = fs::remove_dir_all(&self.path);\n    }\n}\n\nfn git_available() -> bool {\n    Command::new(\"git\")\n        .arg(\"--version\")\n        .output()\n        .map(|o| o.status.success())\n        .unwrap_or(false)\n}\n\nfn run_git(repo: &Path, args: &[&str]) {\n    let status = Command::new(\"git\")\n        .current_dir(repo)\n        .args(args)\n        .status()\n        .expect(\"failed to run git\");\n    assert!(status.success(), \"git command failed: git {:?}\", args);\n}\n\n#[tokio::test]\nasync fn gitignore_is_recursive_nested_gitignore_is_respected() {\n    if !git_available() {\n        panic!(\"Failing test: git executable must be available\");\n    }\n\n    let repo = TempDir::new(\"gitignore_recursive\");\n\n    // Create nested structure\n    let docs_dir = repo.path.join(\"docs\");\n    fs::create_dir_all(&docs_dir).expect(\"failed to create docs dir\");\n\n    // Nested .gitignore ignores only ignored.md (not configured in root .gitignore)\n    fs::write(docs_dir.join(\".gitignore\"), \"ignored.md\\n\")\n        .expect(\"failed to write nested .gitignore\");\n\n    // Tracked files (should be checked)\n    fs::write(docs_dir.join(\"ok_target.md\"), \"# 
ok\\n\").expect(\"failed to write ok_target.md\");\n    fs::write(docs_dir.join(\"checked.md\"), \"[ok](./ok_target.md)\\n\")\n        .expect(\"failed to write checked.md\");\n\n    // Ignored file contains a broken link; if this file is (incorrectly) checked, mlc should fail.\n    fs::write(docs_dir.join(\"ignored.md\"), \"[broken](./missing.md)\\n\")\n        .expect(\"failed to write ignored.md\");\n\n    // Initialize git repo and commit tracked files.\n    run_git(&repo.path, &[\"init\"]);\n    run_git(&repo.path, &[\"config\", \"user.email\", \"test@example.com\"]);\n    run_git(&repo.path, &[\"config\", \"user.name\", \"mlc test\"]);\n    run_git(\n        &repo.path,\n        &[\n            \"add\",\n            \"docs/.gitignore\",\n            \"docs/ok_target.md\",\n            \"docs/checked.md\",\n        ],\n    );\n    run_git(&repo.path, &[\"commit\", \"-m\", \"test fixtures\"]);\n\n    let config = Config {\n        directory: repo.path.clone(),\n        optional: OptionalConfig {\n            debug: None,\n            do_not_warn_for_redirect_to: None,\n            markup_types: Some(vec![MarkupType::Markdown]),\n            offline: Some(true),\n            match_file_extension: None,\n            ignore_links: None,\n            ignore_path: None,\n            root_dir: None,\n            gitignore: Some(true),\n            gituntracked: None,\n            csv_file: None,\n            throttle: None,\n            files: None,\n            http_headers: None,\n        },\n    };\n\n    let result = mlc::run(&config).await;\n\n    assert!(\n        result.is_ok(),\n        \"Expected ok because ignored.md should be ignored by nested .gitignore\"\n    );\n}\n"
  },
  {
    "path": "tests/helper/mod.rs",
    "content": "#[cfg(test)]\nuse std::path::{Path, PathBuf};\n\npub fn benches_dir() -> PathBuf {\n    Path::new(file!())\n        .parent()\n        .unwrap()\n        .parent()\n        .unwrap()\n        .parent()\n        .unwrap()\n        .join(\"benches\")\n}\n"
  },
  {
    "path": "tests/markdown_files.rs",
    "content": "#[cfg(test)]\nuse mlc::link_extractors::link_extractor::find_links;\nuse mlc::markup::{MarkupFile, MarkupType};\n\n#[test]\nfn no_links() {\n    let path = \"./benches/benchmark/markdown/no_links/no_links.md\".to_string();\n    let file = MarkupFile {\n        path,\n        markup_type: MarkupType::Markdown,\n    };\n    let result = find_links(&file);\n    assert!(result.is_empty());\n}\n\n#[test]\nfn some_links() {\n    let path = \"./benches/benchmark/markdown/many_links/many_links.md\".to_string();\n    let file = MarkupFile {\n        path,\n        markup_type: MarkupType::Markdown,\n    };\n    let result = find_links(&file);\n    assert_eq!(result.len(), 12);\n}\n"
  },
  {
    "path": "tests/symlink_test.rs",
    "content": "#[cfg(test)]\nuse mlc::file_traversal;\nuse mlc::markup::{MarkupFile, MarkupType};\nuse mlc::Config;\nuse mlc::OptionalConfig;\nuse std::path::Path;\n\n#[test]\nfn test_symlink_dedupe() {\n    let path = Path::new(\"./tests/test_files/symlink_test\").to_path_buf();\n    let config: Config = Config {\n        directory: path,\n        optional: OptionalConfig {\n            markup_types: Some(vec![MarkupType::Markdown]),\n            ..Default::default()\n        },\n    };\n    let mut result: Vec<MarkupFile> = Vec::new();\n\n    file_traversal::find(&config, &mut result);\n\n    // Should find only 1 file (not 2) since symlink.md points to original.md\n    assert_eq!(\n        result.len(),\n        1,\n        \"Expected to find only 1 file, but found {}: {:?}\",\n        result.len(),\n        result\n    );\n}\n"
  },
  {
    "path": "tests/test_files/deep/index.md",
    "content": "# Deep file\n\nSome content here.\n"
  },
  {
    "path": "tests/test_files/many_links.md",
    "content": "# Many Links\n\n[local_file](many_links.md)\n[folder](./deep)\n[https_link](MOCK_SERVER_URL_6)\n[https_link2](MOCK_SERVER_URL_7)\n\n[mail](mailto://test.mail@tester.com)\n\n[unknown_url](another://foobar)\n"
  },
  {
    "path": "tests/test_files/reference_links.md",
    "content": "# Contain reference style markdown links\n\n[I'm a reference-style link][Arbitrary case-insensitive reference text]\n\n[I'm a relative reference to a repository file](./many_links.md)\n\n[You can use numbers for reference-style link definitions][1]\n\nOr leave it empty and use the [link text itself].\n\n[This is not a valid reference link][2]\n\nURLs and URLs in angle brackets will automatically get turned into links.\n<MOCK_SERVER_URL_1> or <MOCK_SERVER_URL_2> and sometimes\nexample.com (but not on Github, for example).\n\nSome text to show that the reference links can follow later.\n\n[arbitrary case-insensitive reference text]: MOCK_SERVER_URL_3\n[1]: MOCK_SERVER_URL_4\n[link text itself]: MOCK_SERVER_URL_5\n"
  },
  {
    "path": "tests/test_files/repeat_links.md",
    "content": "# Chapter 1\n\n[Mock1](MOCK_SERVER_URL_8)\n[Mock2](MOCK_SERVER_URL_8)\n[Mock3](MOCK_SERVER_URL_8)\n"
  },
  {
    "path": "tests/test_files/symlink_test/original.md",
    "content": "# Test File\n\nThis is a test markdown file.\n\n[Valid Link](https://www.rust-lang.org/)\n"
  },
  {
    "path": "tests/throttle.rs",
    "content": "#[cfg(test)]\nmod helper;\n\nuse helper::benches_dir;\nuse mlc::{markup::MarkupType, Config, OptionalConfig};\nuse std::time::{Duration, Instant};\n\nconst TEST_THROTTLE_MS: u32 = 100;\nconst TEST_URLS: u32 = 10;\nconst THROTTLED_TIME_MS: u64 = (TEST_THROTTLE_MS as u64) * ((TEST_URLS as u64) - 1);\n\n#[tokio::test]\nasync fn throttle_different_hosts() {\n    let config = Config {\n        directory: benches_dir().join(\"throttle\").join(\"different_host.md\"),\n        optional: OptionalConfig {\n            throttle: Some(TEST_THROTTLE_MS),\n            markup_types: Some(vec![MarkupType::Markdown]),\n            ..Default::default()\n        },\n    };\n    let start = Instant::now();\n    mlc::run(&config).await.unwrap_or(());\n    let duration = start.elapsed();\n    assert!(duration < Duration::from_millis(THROTTLED_TIME_MS))\n}\n\n#[tokio::test]\nasync fn throttle_same_hosts() {\n    let config = Config {\n        directory: benches_dir().join(\"throttle\").join(\"same_host.md\"),\n        optional: OptionalConfig {\n            throttle: Some(TEST_THROTTLE_MS),\n            markup_types: Some(vec![MarkupType::Markdown]),\n            ..Default::default()\n        },\n    };\n\n    let start = Instant::now();\n    mlc::run(&config).await.unwrap_or(());\n    let duration = start.elapsed();\n    assert!(duration > Duration::from_millis(THROTTLED_TIME_MS))\n}\n\n#[tokio::test]\nasync fn throttle_same_ip() {\n    let config = Config {\n        directory: benches_dir().join(\"throttle\").join(\"same_ip.md\"),\n        optional: OptionalConfig {\n            throttle: Some(TEST_THROTTLE_MS),\n            markup_types: Some(vec![MarkupType::Markdown]),\n            ..Default::default()\n        },\n    };\n\n    let start = Instant::now();\n    mlc::run(&config).await.unwrap_or(());\n    let duration = start.elapsed();\n    assert!(duration > Duration::from_millis(THROTTLED_TIME_MS))\n}\n"
  }
]