[
  {
    "path": ".github/FUNDING.yml",
    "content": "# These are supported funding model platforms\n\ngithub: sammcj\n"
  },
  {
    "path": ".github/copilot-instructions.md",
    "content": "# GitHub Copilot Instructions\n\n## Contribution Guidelines\n\n### Before Committing\n\n1. **Run linters:** `make lint` (must pass without warnings or errors)\n2. **Run tests:** `make test` (must pass all tests)\n3. **Build successfully:** `make build` (must compile without warnings or errors)\n\n### Code Standards\n\n- Follow Go best practices and idiomatic patterns\n- Use Australian English spelling throughout code (unless it's a function or parameter to an upstream library) and documentation\n- No marketing terms like \"comprehensive\" or \"production-grade\"\n- Focus on clear, concise, actionable technical guidance\n- Keep responses token-efficient (avoid returning unnecessary data)\nverbosity\n\n## Code Quality Checks\n\n### General Code Quality\n- Verify proper module imports and dependencies\n- Check for hardcoded credentials or sensitive data\n- Ensure proper resource cleanup (defer statements)\n- Validate input parameters thoroughly\n- Use appropriate data types and structures\n- Follow consistent error message formatting\n\n## Configuration & Environment\n- Environment variables should have sensible defaults\n- Configuration should be documented in README\n- Support both development and production modes\n- Handle missing optional dependencies gracefully\n\n## General Guidelines\n\n- Do not use marketing terms such as 'comprehensive' or 'production-grade' in documentation or code comments.\n- Focus on clear, concise actionable technical guidance.\n\n## Review Checklist for Every PR\n\nBefore approving any pull request, verify:\n\n- [ ] Code follows the latest Golang best practices\n- [ ] No security issues or vulnerabilities introduced\n- [ ] All linting and tests pass successfully\n- [ ] Documentation updated if required\n- [ ] Australian English spelling used throughout, No American English spelling used (unless it's a function or parameter to an upstream library)\n- [ ] Context cancellation handled properly if applicable\n- [ ] Resource 
cleanup with defer statements if applicable\n\nIf you are re-reviewing a PR you've reviewed in the past and your previous comments / suggestions have been addressed or are no longer valid please resolve those previous review comments to keep the review history clean and easy to follow.\n"
  },
  {
    "path": ".github/workflows/build-release-publish.yml",
    "content": "name: Build, Release, and Publish\n\non:\n  push:\n    branches: [ main ]\n    paths-ignore:\n      - 'README.md'\n  pull_request:\n    branches: [ main ]\n    paths-ignore:\n      - 'README.md'\n\nenv:\n  GO_VERSION: '1.25.4' # Updated to match toolchain\n  BINARY_NAME: 'ingest'\n\npermissions:\n  contents: write\n  packages: write\n\njobs:\n  build:\n    if: ${{ ! contains(github.event.head_commit.message, '[skip ci]') && ! contains(github.event.pull_request.title, '[skip ci]')}}\n    name: Build\n    strategy:\n      matrix:\n        target:\n          - os: darwin\n            arch: arm64\n            runner: macos-14\n            c_compiler_package: \"\"\n          - os: linux\n            arch: amd64\n            runner: ubuntu-latest\n            c_compiler_package: \"build-essential\"\n          # - os: linux\n          #   arch: arm64\n          #   runner: ubuntu-latest-arm64 # Use native ARM64 runner\n          #   c_compiler_package: \"build-essential\" # Native compiler\n    runs-on: ${{ matrix.target.runner }}\n\n    outputs:\n      version: ${{ steps.set_version.outputs.new_tag }}\n      changelog: ${{ steps.set_version.outputs.changelog }}\n\n    steps:\n      - name: Checkout code\n        uses: actions/checkout@v6\n        with:\n          fetch-depth: 0\n\n      - name: Set up Go and cache dependencies\n        uses: actions/setup-go@v6\n        with:\n          go-version-file: \"go.mod\"\n\n      - name: Get version\n        id: set_version\n        uses: mathieudutour/github-tag-action@a22cf08638b34d5badda920f9daf6e72c477b07b # v6.2\n        with:\n          github_token: ${{ secrets.GITHUB_TOKEN }}\n          dry_run: true\n\n      - name: Get dependencies\n        run: go mod download\n\n      - name: Set up C compiler\n        if: startsWith(matrix.target.runner, 'ubuntu') && matrix.target.c_compiler_package != ''\n        run: |\n          sudo apt-get update\n          sudo apt-get install -y ${{ 
matrix.target.c_compiler_package }}\n\n      - name: golangci-lint\n        uses: golangci/golangci-lint-action@v9\n        with:\n          version: v2.6\n\n      - name: Run tests\n        run: make test\n\n      - name: Build\n        env:\n          CGO_ENABLED: \"1\" # Explicitly enable CGo\n          GOOS: ${{ matrix.target.os }}\n          GOARCH: ${{ matrix.target.arch }}\n          VERSION: ${{ steps.set_version.outputs.new_tag }}\n\n        run: |\n          go build -v -ldflags \"-w -s -X main.Version=$VERSION\" -o build/${{ env.BINARY_NAME }}-${{ matrix.target.os }}-${{ matrix.target.arch }} .\n          ls -ltarh build/\n\n      - name: Upload artifact\n        uses: actions/upload-artifact@b4b15b8c7c6ac21ea08fcf65892d2ee8f75cf882 # v4\n        if: github.event_name == 'push' && github.ref == 'refs/heads/main'\n        with:\n          name: ${{ env.BINARY_NAME }}-${{ matrix.target.os }}-${{ matrix.target.arch }}\n          path: build/${{ env.BINARY_NAME }}-${{ matrix.target.os }}-${{ matrix.target.arch }}\n          retention-days: 90\n\n  release:\n    name: Release\n    needs: build\n    if: ${{github.event_name == 'push' && github.ref == 'refs/heads/main' && !contains(github.event.head_commit.message, '[skip ci]') && ! contains(github.event.pull_request.title, '[skip ci]')}}\n    runs-on: ubuntu-latest\n    steps:\n      - name: Checkout code\n        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4\n        with:\n          fetch-depth: 0\n\n      - name: Download artifact\n        uses: actions/download-artifact@fa0a91b85d4f404e444e00e005971372dc801d16 # v4\n        with:\n          path: build/\n\n      - name: Create a GitHub release\n        uses: ncipollo/release-action@2c591bcc8ecdcd2db72b97d6147f871fcd833ba5 # v1\n        if: ${{ startsWith(github.ref, 'refs/heads/main') && !contains(github.event.head_commit.message, '[skip ci]') && ! 
contains(github.event.pull_request.title, '[skip ci]') }}\n        with:\n          tag: ${{ needs.build.outputs.version }}\n          name: ${{ needs.build.outputs.version }}\n          body: ${{ needs.build.outputs.changelog }}\n          skipIfReleaseExists: true\n          generateReleaseNotes: true\n          allowUpdates: true\n          makeLatest: ${{ startsWith(github.ref, 'refs/heads/main') && !contains(github.event.head_commit.message, '[skip ci]') && ! contains(github.event.pull_request.title, '[skip ci]') }}\n          prerelease: ${{ !startsWith(github.ref, 'refs/heads/main') }}\n          artifactErrorsFailBuild: true\n          artifacts: |\n            build/${{ env.BINARY_NAME }}*/${{ env.BINARY_NAME }}*\n        env:\n          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}\n"
  },
  {
    "path": ".gitignore",
    "content": "# If you prefer the allow list template instead of the deny list, see community template:\n# https://github.com/github/gitignore/blob/main/community/Golang/Go.AllowList.gitignore\n#\n# Binaries for programs and plugins\n*.exe\n*.exe~\n*.dll\n*.so\n*.dylib\n\n# Test binary, built with `go test -c`\n*.test\n\n# Output of the go coverage tool, specifically when used with LiteIDE\n*.out\n\n# Dependency directories (remove the comment below to include it)\n# vendor/\n\n# Go workspace file\ngo.work\ngo.work.sum\n\n# build files\ningest\nbuild/\n\n**/.vscode\n**/.idea\n**/*.tmp\n**/*.swp\n**/*.log\n**/ingest.out.md\n\nvendor/\n"
  },
  {
    "path": ".golangci.yml",
    "content": "version: \"2\"\nlinters:\n  enable:\n    - unparam\n  settings:\n    unparam:\n      check-exported: false\n  exclusions:\n    generated: lax\n    presets:\n      - comments\n      - common-false-positives\n      - legacy\n      - std-error-handling\n    paths:\n      - third_party$\n      - builtin$\n      - examples$\nformatters:\n  exclusions:\n    generated: lax\n    paths:\n      - third_party$\n      - builtin$\n      - examples$\n      - screenshots$\n      - .github$\n      - .claude$\n      - bin$\n\n"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2024 Sam McLeod\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "Makefile",
    "content": "# Makefile for ingest project\n\n# Go parameters\nGOCMD=go\nGOBUILD=$(GOCMD) build\nGOCLEAN=$(GOCMD) clean\nGOTEST=$(GOCMD) test\nGOGET=$(GOCMD) get\n\n# Binary name\nBINARY_NAME=ingest\n\n# Version information\nVERSION := $(shell git describe --tags --always)\nBUILD_TIME := $(shell date -u '+%Y-%m-%d_%I:%M:%S%p')\nLDFLAGS := -ldflags \"-w -s -X main.Version=$(VERSION) -X main.BuildTime=$(BUILD_TIME)\"\n\n# Main package path\nMAIN_PACKAGE=.\n\n.PHONY: all build clean test deps\n\nall: clean build\n\nbuild:\n\t$(GOBUILD) $(LDFLAGS) -o $(BINARY_NAME) $(MAIN_PACKAGE)\n\nclean:\n\t$(GOCLEAN)\n\trm -f $(BINARY_NAME)\n\nlint:\n\tgofmt -w -s .\n\tgolangci-lint run\n\tgo run golang.org/x/tools/gopls/internal/analysis/modernize/cmd/modernize@latest -fix -test ./...\n\ntest:\n\t$(GOTEST) -v ./...\n\ndeps:\n\t$(GOGET) ./...\n\n# Run the application\nrun: build\n\t./$(BINARY_NAME)\n\n# Build for multiple platforms\nbuild-all:\n\tGOOS=linux GOARCH=amd64 $(GOBUILD) $(LDFLAGS) -o $(BINARY_NAME)-linux-amd64 $(MAIN_PACKAGE)\n\tGOOS=darwin GOARCH=amd64 $(GOBUILD) $(LDFLAGS) -o $(BINARY_NAME)-darwin-amd64 $(MAIN_PACKAGE)\n\tGOOS=windows GOARCH=amd64 $(GOBUILD) $(LDFLAGS) -o $(BINARY_NAME)-windows-amd64.exe $(MAIN_PACKAGE)\n\n# Install the binary\ninstall: build\n\tmv $(BINARY_NAME) $(GOPATH)/bin/$(BINARY_NAME)\n\n# Uninstall the binary\nuninstall:\n\trm -f $(GOPATH)/bin/$(BINARY_NAME)\n\n# output the version information\nversion:\n\t@echo $(VERSION)\n"
  },
  {
    "path": "README.md",
    "content": "# Ingest\n\n![](./ingest-logo-400.png)\n\nIngest parses directories of plain text files, such as source code, into a single markdown file suitable for ingestion by AI/LLMs.\n\n---\n\n![ingest](screenshot2.png)\n\nIngest can also pass the prompt directly to an LLM such as Ollama for processing.\n\n![ingest with --llm](screenshot.png)\n\nAnd ingest web URLs.\n\n![ingest with --web](screenshot3.png)\n\n## Features\n\n- Traverse directory structures and generate a tree view\n- Include/exclude files based on glob patterns\n- Compress code using Tree-sitter to extract key structural information while omitting implementation details\n- Estimate vRAM requirements and check model compatibility using another package I've created called [quantest](https://github.com/sammcj/quantest)\n- Parse output directly to LLMs such as Ollama or any OpenAI compatible API for processing\n- Generate and include git diffs and logs\n- Count tokens using offline tokeniser (default) or optionally use Anthropic API (API key required, but no charge for counting)\n- Customisable output templates\n- Copy output to clipboard (when available)\n- Export to file or print to console\n- Optional JSON output\n- Optionally save output to a file in ~/ingest\n- Shell completions for Bash, Zsh, and Fish\n- Web crawling to ingest web pages as Markdown\n- PDF to markdown conversion and ingestion\n\nIngest Intro (\"Podcast\" Episode):\n\n<audio src=\"https://github.com/sammcj/smcleod_files/raw/refs/heads/master/audio/podcast-ep-sw/Podcast%20Episode%20-%20Ingest.mp3\" controls preload></audio>\n\n## Installation\n\n### go install (recommended)\n\nMake sure you have Go installed on your system, then run:\n\n```shell\ngo install github.com/sammcj/ingest@HEAD\n```\n\n### curl\n\nI don't recommend this method as it's not as easy to update, but you can use the following command:\n\n```shell\ncurl -sL https://raw.githubusercontent.com/sammcj/ingest/refs/heads/main/scripts/install.sh | bash\n```\n\n### 
Manual install\n\n1. Download the latest release from the [releases page](https://github.com/sammcj/ingest/releases)\n2. Move the binary to a directory in your PATH, e.g. `mv ingest* /usr/local/bin/ingest`\n\n## Usage\n\nBasic usage:\n\n```shell\ningest [flags] <paths>\n```\n\ningest will default to the current working directory if no path is provided, e.g.:\n\n```shell\n$ ingest\n\n⠋ Traversing directory and building tree...  [0s]\n[ℹ️] Tokens (Approximate): 15,945\n[✅] Copied to clipboard successfully.\n```\n\nThe first time ingest runs, it will download a small [tokeniser](https://github.com/pkoukk/tiktoken-go-loader/blob/main/assets/cl100k_base.tiktoken) called 'cl100k_base.tiktoken', which is used for tokenisation.\n\nGenerate a prompt from a directory, including only Python files:\n\n```shell\ningest -i \"**/*.py\" /path/to/project\n```\n\nGenerate a prompt with git diff and copy to clipboard:\n\n```shell\ningest -d /path/to/project\n```\n\nGenerate a prompt for multiple files/directories:\n\n```shell\ningest /path/to/project /path/to/other/project\n```\n\nGenerate a prompt and save to a file:\n\n```shell\ningest -o output.md /path/to/project\n```\n\nYou can also provide individual files or multiple paths:\n\n```shell\ningest /path/to/file /path/to/directory\n```\n\nSave output to ~/ingest/<directory_name>.md:\n\n```shell\ningest --save /path/to/project\n```\n\n### VRAM Estimation and Model Compatibility\n\nIngest includes a feature to estimate VRAM requirements and check model compatibility using [Gollama](https://github.com/sammcj/gollama)'s vramestimator package. 
This helps you determine if your generated content will fit within the specified model, VRAM, and quantisation constraints.\n\nTo use this feature, add the following flags to your ingest command:\n\n```shell\ningest --vram --model <model_id> [--memory <memory_in_gb>] [--quant <quantisation>] [--context <context_length>] [--kvcache <kv_cache_quant>] [--quanttype <quant_type>] [other flags] <paths>\n```\n\nExamples:\n\nEstimate VRAM usage for a specific context:\n\n```shell\ningest --vram --model NousResearch/Hermes-2-Theta-Llama-3-8B --quant q4_k_m --context 2048 --kvcache q4_0 .\n# Estimated VRAM usage: 5.35 GB\n```\n\nCalculate maximum context for a given memory constraint:\n\n```shell\ningest --vram --model NousResearch/Hermes-2-Theta-Llama-3-8B --quant q4_k_m --memory 6 --kvcache q8_0 .\n# Maximum context for 6.00 GB of memory: 5069\n```\n\nFind the best BPW (Bits Per Weight):\n\n```shell\ningest --vram --model NousResearch/Hermes-2-Theta-Llama-3-8B --memory 6 --quanttype gguf .\n# Best BPW for 6.00 GB of memory: IQ3_S\n```\n\nThe tool also works for exl2 (ExllamaV2) models:\n\n```shell\ningest --vram --model NousResearch/Hermes-2-Theta-Llama-3-8B --quant 5.0 --context 2048 --kvcache q4_0 . # For exl2 models\ningest --vram --model NousResearch/Hermes-2-Theta-Llama-3-8B --quant 5.0 --memory 6 --kvcache q8_0 . # For exl2 models\n```\n\nWhen using the VRAM estimation feature along with content generation, ingest will provide information about the generated content's compatibility with the specified constraints:\n\n```shell\ningest --vram --model NousResearch/Hermes-2-Theta-Llama-3-8B --memory 8 --quant q4_0 .\n⠋ Traversing directory and building tree... [0s]\n[ℹ️] 14,702 Tokens (Approximate)\n[ℹ️] Maximum context for 8.00 GB of memory: 10240\n[✅] Generated content (14,702 tokens) fits within maximum context.\nTop 15 largest files (by estimated token count):\n1. /Users/samm/git/sammcj/ingest/main.go (4,682 tokens)\n2. 
/Users/samm/git/sammcj/ingest/filesystem/filesystem.go (2,694 tokens)\n3. /Users/samm/git/sammcj/ingest/README.md (1,895 tokens)\n4. /Users/samm/git/sammcj/ingest/utils/utils.go (948 tokens)\n5. /Users/samm/git/sammcj/ingest/config/config.go (884 tokens)\n[✅] Copied to clipboard successfully.\n```\n\nAvailable flags for VRAM estimation:\n\n- `--vram`: Enable VRAM estimation and model compatibility check\n- `--model`: Specify the model ID to check against (required for estimation)\n- `--memory`: Specify the available memory in GB for context calculation (optional)\n- `--quant`: Specify the quantisation type (e.g., q4_k_m) or bits per weight (e.g., 5.0)\n- `--context`: Specify the context length for VRAM estimation (optional)\n- `--kvcache`: Specify the KV cache quantisation (fp16, q8_0, or q4_0)\n- `--quanttype`: Specify the quantisation type (gguf or exl2)\n\nIngest will provide appropriate output based on the combination of flags used, such as estimating VRAM usage, calculating maximum context, or finding the best BPW. If the generated content fits within the specified constraints, you'll see a success message. Otherwise, you'll receive a warning that the content may not fit.\n\n## LLM Integration\n\nIngest can pass the generated prompt to LLMs that have an OpenAI compatible API such as [Ollama](https://ollama.com) for processing.\n\n```shell\ningest --llm /path/to/project\n```\n\nBy default this will use any prompt suffix from your configuration file:\n\n```shell\n./ingest utils.go --llm\n⠋ Traversing directory and building tree...  [0s]\nThis is Go code for a file named `utils.go`. 
It contains various utility functions for\nhandling terminal output, clipboard operations, and configuration directories.\n...\n```\n\nYou can provide a prompt suffix to append to the generated prompt:\n\n```shell\ningest --llm -p \"explain this code\" /path/to/project\n```\n\n## Token Counting\n\nIngest provides token counting using either an offline tokeniser (default) or the Anthropic API for more accurate counts.\n\n### Offline Token Counting (Default)\n\nBy default, ingest uses an offline tokeniser with a correction factor for improved accuracy:\n\n```shell\ningest /path/to/project\n# [ℹ️] Tokens (Approximate): 15,945\n```\n\nThe offline tokeniser applies a 1.18x multiplier based on empirical analysis comparing it with Anthropic's API. This correction reduces average estimation error from ~17% to ~2%, providing substantially more accurate token counts without requiring an API key.\n\nTo disable the correction factor and use raw token counts, use the `--no-correction` flag:\n\n```shell\ningest --no-correction /path/to/project\n# Uses raw offline tokeniser without correction multiplier\n```\n\nThe first time ingest runs, it downloads a small tokeniser file for offline use.\n\n### Anthropic API Token Counting\n\nFor accurate token counts using Anthropic's counting API, use the `-a` or `--anthropic` flag:\n\n```shell\nexport ANTHROPIC_API_KEY=\"your-api-key\"\ningest -a /path/to/project\n# ✓ Using Anthropic API (claude-sonnet-4-5) for token counting\n# [ℹ️] Tokens (Approximate): 15,942\n```\n\nThe API accepts keys from these environment variables (checked in order):\n- `ANTHROPIC_API_KEY`\n- `ANTHROPIC_TOKEN`\n- `ANTHROPIC_TOKEN_COUNT_KEY`\n\n**Performance optimisation**: When counting tokens for multiple files (e.g. 
in the \"Top 15 largest files\" report), ingest processes API requests in parallel batches of 4, significantly reducing the time needed for token counting.\n\nIf the API call fails, ingest automatically falls back to the offline tokeniser.\n\n## Code Compression with Tree-sitter\n\n**Experimental**\n\nIngest can compress source code files by extracting key structural information while omitting implementation details. This is useful for reducing token usage while preserving the important parts of the code structure.\n\n```shell\ningest --compress /path/to/project\n```\n\nThe compression extracts:\n- Package/module declarations\n- Import statements\n- Function/method signatures (without bodies)\n- Class definitions (without method bodies)\n- Type definitions\n- Comments\n\nCurrently supported languages:\n- Go\n- Python\n- JavaScript (including arrow functions and ES6 module syntax)\n- Bash\n- C\n- CSS\n\nExample of compressed JavaScript:\n\n```\n// This is a JavaScript comment\nimport { something } from 'module';\nexport class MyJSClass { ... } // Body removed\nconstructor(name) { ... } // Body removed\ngreet(message) { ... } // Body removed\nexport function myJSFunction(x, y) { ... } // Body removed\nconst myArrowFunc = (a, b) => { ... 
} // Body removed\n```\n\n## Web Crawling & Ingestion\n\nCrawl with explicit web mode:\n\n```shell\ningest --web https://example.com\n```\n\nAuto-detect URL and crawl:\n\n```shell\ningest https://example.com\n```\n\nCrawl with domain restriction:\n\n```shell\ningest --web --web-domains example.com https://example.com\n```\n\nCrawl deeper with more concurrency:\n\n```shell\ningest --web --web-depth 3 --web-concurrent 10 https://example.com\n```\n\nExclude a path from the crawl:\n\n```shell\ningest --web https://example.com -e '/posts/**'\n```\n\n## Shell Completions\n\nIngest includes shell completions for Bash, Zsh, and Fish.\n\nTo load completions for the current session:\n\n**Bash:**\n```shell\nsource <(ingest completion bash)\n```\n\n**Zsh:**\n```shell\nsource <(ingest completion zsh)\n```\n\n**Fish:**\n```shell\ningest completion fish | source\n```\n\nFor persistent completions (loaded automatically in each new shell session), see `ingest completion --help` for installation instructions specific to your system.\n\n## Configuration\n\nIngest uses a configuration file located at `~/.config/ingest/ingest.json`.\n\nYou can make Ollama processing run without prompting by setting `\"llm_auto_run\": true` in the config file.\n\nThe config file also contains:\n\n- `llm_model`: The model to use for processing the prompt, e.g. \"llama3.1:8b-q5_k_m\".\n- `llm_prompt_prefix`: An optional prefix to prepend to the prompt, e.g. \"This is my application.\"\n- `llm_prompt_suffix`: An optional suffix to append to the prompt, e.g. 
\"explain this code\"\n\nIngest uses the following directories for user-specific configuration:\n\n- `~/.config/ingest/patterns/exclude`: Add .glob files here to exclude additional patterns.\n- `~/.config/ingest/patterns/templates`: Add custom .tmpl files here for different output formats.\n\nThese directories will be created automatically on first run, along with README files explaining their purpose.\n\n### Flags\n\n- `-a, --anthropic`: Use Anthropic API for token counting (requires API key in environment)\n- `--compress`: Enable code compression using Tree-sitter to extract key structural information while omitting implementation details\n- `--config`: Opens the config file in the default editor\n- `--no-correction`: Disable offline tokeniser correction factor (use raw token count)\n- `--context`: Specify the context length for VRAM estimation\n- `--exclude-from-tree`: Exclude files/folders from the source tree based on exclude patterns\n- `--git-diff-branch`: Generate git diff between two branches\n- `--git-log-branch`: Retrieve git log between two branches\n- `--include-priority`: Include files in case of conflict between include and exclude patterns\n- `--json`: Print output as JSON\n- `--kvcache`: Specify the KV cache quantisation\n- `--llm`: Send the generated prompt to an OpenAI compatible LLM server (such as Ollama) for processing\n- `--memory`: Specify the available memory in GB for context calculation\n- `--model`: Specify the model ID for VRAM estimation\n- `--no-codeblock`: Disable wrapping code inside markdown code blocks\n- `--no-default-excludes`: Disable default exclude patterns\n- `--pattern-exclude`: Path to a specific .glob file for exclude patterns\n- `--print-default-excludes`: Print the default exclude patterns\n- `--print-default-template`: Print the default template\n- `--quant`: Specify the quantisation type or bits per weight\n- `--quanttype`: Specify the quantisation type (gguf or exl2)\n- `--relative-paths`: Use relative paths instead 
of absolute paths\n- `--report`: Print the largest parsed files\n- `--save`: Save output to ~/ingest/<directory_name>.md\n- `--tokens`: Display the token count of the generated prompt\n- `--verbose`: Print verbose output\n- `--vram`: Estimate VRAM usage and check model compatibility\n- `--web-concurrent`: Maximum concurrent requests for web crawling\n- `--web-depth`: Maximum depth for web crawling\n- `--web-domains`: Comma-separated list of domains to restrict web crawling\n- `--web`: Crawl a web page\n- `-c, --encoding`: Optional tokeniser to use for token count\n- `-d, --diff`: Include git diff\n- `-e, --exclude`: Patterns to exclude (can be used multiple times)\n- `-i, --include`: Patterns to include (can be used multiple times)\n- `-l, --line-number`: Add line numbers to the source code\n- `-n, --no-clipboard`: Disable copying to clipboard\n- `-o, --output`: Optional output file path\n- `-p, --prompt`: Optional prompt suffix to append to the generated prompt\n- `-t, --template`: Path to a custom Handlebars template\n- `-V, --version`: Print the version number (WIP - still trying to get this to work nicely)\n\n### Excludes\n\nYou can get a list of the default excludes by passing `--print-default-excludes` to ingest.\nThese are defined in [defaultExcludes.go](https://github.com/sammcj/ingest/blob/main/filesystem/defaultExcludes.go).\n\nTo override the default excludes, create a `default.glob` file in `~/.config/ingest/patterns/exclude` with the patterns you want to exclude.\n\n### Templates\n\nTemplates are written in standard [go templating syntax](https://pkg.go.dev/text/template).\n\nYou can get a list of the default templates by passing `--print-default-template` to ingest.\nThese are defined in [template.go](https://github.com/sammcj/ingest/blob/main/template/template.go).\n\nTo override the default templates, create a `default.tmpl` file in `~/.config/ingest/patterns/templates` with the template you want to use by default.\n\n## 
Contributing\n\nContributions are welcome; please feel free to submit a Pull Request.\n\nYou can help sponsor the project by trading the $INGEST SOL Token: https://bags.fm/Dm98Qa1Xw2n35bq73R2t1bFgXPApUKu2YwzU8TjWBAGS\n\n## License\n\n- Copyright 2024 Sam McLeod\n- This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n<script src=\"http://api.html5media.info/1.1.8/html5media.min.js\"></script>\n"
  },
  {
    "path": "config/config.go",
    "content": "package config\n\nimport (\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"os\"\n\t\"os/exec\"\n\t\"path/filepath\"\n\n\t\"github.com/mitchellh/go-homedir\"\n)\n\ntype OllamaConfig struct {\n\tModel        string `json:\"llm_model\"`\n\tPromptPrefix string `json:\"llm_prompt_prefix\"`\n\tPromptSuffix string `json:\"llm_prompt_suffix\"`\n\tAutoRun      bool   `json:\"llm_auto_run\"`\n}\n\ntype Config struct {\n\tOllama   []OllamaConfig `json:\"ollama\"`\n\tLLM      LLMConfig      `json:\"llm\"`\n\tAutoSave bool           `json:\"auto_save\"`\n}\n\ntype LLMConfig struct {\n\tAuthToken        string   `json:\"llm_auth_token\"`\n\tBaseURL          string   `json:\"llm_base_url\"`\n\tModel            string   `json:\"llm_model\"`\n\tMaxTokens        int      `json:\"llm_max_tokens\"`\n\tTemperature      *float32 `json:\"llm_temperature,omitempty\"`\n\tTopP             *float32 `json:\"llm_top_p,omitempty\"`\n\tPresencePenalty  *float32 `json:\"llm_presence_penalty,omitempty\"`\n\tFrequencyPenalty *float32 `json:\"llm_frequency_penalty,omitempty\"`\n\tAPIType          string   `json:\"llm_api_type\"`\n}\n\n// loads the config file\nfunc LoadConfig() (*Config, error) {\n\thome, err := homedir.Dir()\n\tif err != nil {\n\t\treturn nil, fmt.Errorf(\"failed to get home directory: %w\", err)\n\t}\n\n\tconfigPath := filepath.Join(home, \".config\", \"ingest\", \"ingest.json\")\n\tif _, err := os.Stat(configPath); os.IsNotExist(err) {\n\t\treturn createDefaultConfig(configPath)\n\t}\n\n\tfile, err := os.ReadFile(configPath)\n\tif err != nil {\n\t\treturn nil, fmt.Errorf(\"failed to read config file: %w\", err)\n\t}\n\n\tvar config Config\n\tif err := json.Unmarshal(file, &config); err != nil {\n\t\treturn nil, fmt.Errorf(\"failed to parse config file: %w\", err)\n\t}\n\n\t// Set default values for LLM config\n\tif config.LLM.AuthToken == \"\" {\n\t\tconfig.LLM.AuthToken = os.Getenv(\"OPENAI_API_KEY\")\n\t}\n\tif config.LLM.BaseURL == \"\" {\n\t\tconfig.LLM.BaseURL = 
getDefaultBaseURL()\n\t}\n\tif config.LLM.Model == \"\" {\n\t\tconfig.LLM.Model = \"llama3.1:8b-instruct-q6_K\"\n\t}\n\tif config.LLM.MaxTokens == 0 {\n\t\tconfig.LLM.MaxTokens = 2048\n\t}\n\tif config.LLM.APIType == \"\" {\n\t\tconfig.LLM.APIType = \"OPEN_AI\"\n\t}\n\n\treturn &config, nil\n}\n\nfunc createDefaultConfig(configPath string) (*Config, error) {\n\tdefaultConfig := Config{\n\t\tOllama: []OllamaConfig{\n\t\t\t{\n\t\t\t\tModel:        \"llama3.1:8b-instruct-q6_K\",\n\t\t\t\tPromptPrefix: \"Code: \",\n\t\t\t\tPromptSuffix: \"\",\n\t\t\t\tAutoRun:      false,\n\t\t\t},\n\t\t},\n\t\tLLM: LLMConfig{\n\t\t\tBaseURL:   getDefaultBaseURL(),\n\t\t\tModel:     \"llama3.1:8b-instruct-q6_K\",\n\t\t\tMaxTokens: 2048,\n\t\t},\n\t\tAutoSave: false,\n\t}\n\n\terr := os.MkdirAll(filepath.Dir(configPath), 0750)\n\tif err != nil {\n\t\treturn nil, fmt.Errorf(\"failed to create config directory: %w\", err)\n\t}\n\n\tfile, err := json.MarshalIndent(defaultConfig, \"\", \"  \")\n\tif err != nil {\n\t\treturn nil, fmt.Errorf(\"failed to marshal default config: %w\", err)\n\t}\n\n\tif err := os.WriteFile(configPath, file, 0644); err != nil {\n\t\treturn nil, fmt.Errorf(\"failed to write default config file: %w\", err)\n\t}\n\n\treturn &defaultConfig, nil\n}\n\n// returns the default base URL for the LLM API\nfunc getDefaultBaseURL() string {\n\tif url := os.Getenv(\"OPENAI_API_BASE\"); url != \"\" {\n\t\treturn url\n\t}\n\tif url := os.Getenv(\"llm_HOST\"); url != \"\" {\n\t\treturn url + \"/v1\"\n\t}\n\treturn \"http://localhost:11434/v1\"\n}\n\n// opens the config file in the default editor\nfunc OpenConfig() error {\n\thome, err := homedir.Dir()\n\tif err != nil {\n\t\treturn fmt.Errorf(\"failed to get home directory: %w\", err)\n\t}\n\n\tconfigPath := filepath.Join(home, \".config\", \"ingest\", \"ingest.json\")\n\tif _, err := os.Stat(configPath); os.IsNotExist(err) {\n\t\treturn fmt.Errorf(\"config file does not exist\")\n\t}\n\n\teditor := os.Getenv(\"EDITOR\")\n\tif 
editor == \"\" {\n\t\teditor = \"vim\"\n\t}\n\n\treturn runCommand(editor, configPath)\n}\n\n// runs a command in the shell\nfunc runCommand(command string, args ...string) error {\n\tcmd := exec.Command(command, args...)\n\tcmd.Stdin = os.Stdin\n\tcmd.Stdout = os.Stdout\n\tcmd.Stderr = os.Stderr\n\treturn cmd.Run()\n}\n"
  },
  {
    "path": "filesystem/defaultExcludes.go",
    "content": "package filesystem\n\nimport (\n\t\"bufio\"\n\t\"fmt\"\n\t\"strings\"\n)\n\n// defaultGlobContent contains the content of default.glob\nconst defaultGlobContent = `\n# Directories\n**/.cargo/**\n**/.devcontainer/**\n**/.git/**\n**/.github/**\n**/.next/**\n**/.venv/**\n**/.vim/**\n**/.vscode-insiders/**\n**/.vscode-oss/**\n**/.vscode-print-resource-cache/**\n**/.vscode-react-native/**\n**/.vscode/**\n**/.wasmedge/**\n**/.widsurf/**\n**/.widsurf/**\n**/.wine/**\n**/.wine32/**\n**/.yarn/**\n**/.zcompcache/**\n**/.zfunc/**\n**/.zgen/**\n**/.zsh_sessions/**\n**/.zsh.d/**\n**/backups/**\n**/build/**\n**/conda/**\n**/coverage/**\n**/dist/**\n**/mamba/**\n**/node_modules/**\n**/out/**\n**/pyenv/**\n**/__pycache__/**\n**/screenshots/**\n**/target/**\n**/temp/**\n**/tmp/**\n**/venv/**\n**/virtualenv/**\n**/wineprefix/**\n\n# File patterns\n**/__tests__/*\n**/_data/*\n**/.aider*\n**/.aider/*\n**/.bash_history\n**/.boto\n**/.claude.json\n**/.cline/*\n**/.condarc\n**/.cursor/*\n**/.dream_history\n**/.fzf.bash\n**/.fzf.zsh\n**/.git-credentials\n**/.llamafile_history\n**/.lscolors\n**/.netrc\n**/.psql_history\n**/.python_history\n**/.terraform/*\n**/.webpack/*\n**/.Xauthority\n**/*.7z\n**/*.apk\n**/*.app\n**/*.avi\n**/*.bak\n**/*.baseline\n**/*.bin\n**/*.blend\n**/*.bmp\n**/*.bz2\n**/*.cert\n**/*.crt\n**/*.csv\n**/*.dat\n**/*.deb\n**/*.db\n**/*.diff\n**/*.dll\n**/*.dmg\n**/*.doc\n**/*.docx\n**/*.DS_Store\n**/*.eot\n**/*.excalidrawlib\n**/*.exe\n**/*.fbx\n**/*.fig\n**/*.flac\n**/*.gif\n**/*.gguf\n**/*.ggml\n**/*.exl2\n**/*.exl3\n**/*.gz\n**/*.heic\n**/*.hiec\n**/*.icns\n**/*.ico\n**/*.iso\n**/*.jar\n**/*.jpeg\n**/*.jpg\n**/*.key\n**/*.lock\n**/*.log*\n**/*.mp3\n**/*.mp4\n**/*.msi\n**/*.mvnw*\n**/*.obj\n**/*.odf\n**/*.otf\n**/*.pdf\n**/*.partial\n**/*.pem\n**/*.png\n**/*.ppt\n**/*.pptx\n**/*.ps1\n**/*.pub\n**/*.pyc\n**/*.pyo\n**/*.pysave\n**/*.rpm\n**/*.sqlite\n**/*.sqlite3\n**/*.svg\n**/*.swp*\n**/*.tar*\n**/*.terraform.tfstate.lock.info\n**/*.tfgraph\n**/*.tmp\n**/
*.ttf*\n**/*.war\n**/*.wav\n**/*.webm\n**/*.webp\n**/*.woff\n**/*.woff2\n**/*.xd\n**/*.xls\n**/*.xlsx\n**/*.zip\n**/terraform.tfstate.*\n**/test/*\n**/tests/*\n**/vendor/*\n\n\n# Specific files\n**/.aiderrules\n**/.aider.*\n**/.clinerules\n**/.cursorrules\n**/.DS_Store\n**/.editorconfig\n**/.env*\n**/.eslintignore\n**/.eslintrc*\n**/.gitattributes\n**/.gitconfig\n**/.gitconfig-no_push\n**/.gitignore\n**/.gitignoreglobal\n**/.gitlab-ci.yml\n**/.gitmodules\n**/.gitpod.yml\n**/.npmrc\n**/.nvmrc\n**/.pre-commit-config.yaml\n**/.pre-commit-config.yml\n**/.prettierignore\n**/.prettierrc*\n**/.saml2aws\n**/.saml2aws-auto.yml\n**/.stylelintrc*\n**/.terraform.lock.hcl\n**/.terraform.lock.hcl.lock\n**/.vimrc\n**/.whitesource\n**/.zcompdump*\n**/.claude/*.json\n**/.mcp.json\n**/bat-config\n**/changelog.md\n**/CHANGELOG*\n**/CLA.md\n**/CODE_OF_CONDUCT.md\n**/CODEOWNERS\n**/CONTRIBUTORS.md\n**/commitlint.config.js\n**/contributing.md\n**/CONTRIBUTING*\n**/dircolors\n**/esbuild.config.mjs\n**/go.mod\n**/go.sum\n**/LICENSE*\n**/LICENCE*\n**/manifest.json\n**/package-lock.json\n**/plan\n**/plan.out\n**/pnpm-lock.yaml\n**/poetry.lock\n**/pre-commit-config.yaml\n**/renovate.json\n**/SECURITY*\n**/SUPPORT.md\n**/terraform.rc\n**/terraform.tfplan\n**/terraform.tfplan.json\n**/terraform.tfstate\n**/terraform.tfstate.backup\n**/TODO.md\n**/TROUBLESHOOTING.md\n**/tsconfig.json\n**/version-bump.mjs\n**/versions.json\n**/yarn.lock\n`\n\n// GetDefaultExcludes returns a list of default exclude patterns\nfunc GetDefaultExcludes() ([]string, error) {\n\tvar defaultExcludes []string\n\tscanner := bufio.NewScanner(strings.NewReader(defaultGlobContent))\n\tfor scanner.Scan() {\n\t\tline := strings.TrimSpace(scanner.Text())\n\t\tif line != \"\" && !strings.HasPrefix(line, \"#\") {\n\t\t\tdefaultExcludes = append(defaultExcludes, line)\n\t\t}\n\t}\n\n\tif err := scanner.Err(); err != nil {\n\t\treturn nil, fmt.Errorf(\"error scanning default glob content: %w\", err)\n\t}\n\n\treturn 
defaultExcludes, nil\n}\n"
  },
  {
    "path": "filesystem/filesystem.go",
    "content": "package filesystem\n\nimport (\n\t\"bufio\"\n\t\"fmt\"\n\t\"io\"\n\t\"io/fs\"\n\t\"net/http\"\n\t\"os\"\n\t\"path/filepath\"\n\t\"sort\"\n\t\"strings\"\n\t\"sync\"\n\n\t\"github.com/bmatcuk/doublestar/v4\"\n\t\"github.com/fatih/color\"\n\t\"github.com/mitchellh/go-homedir\"\n\tignore \"github.com/sabhiram/go-gitignore\"\n\t\"github.com/sammcj/ingest/internal/compressor\"\n\t\"github.com/sammcj/ingest/pdf\"\n\t\"github.com/sammcj/ingest/utils\"\n)\n\n// FileInfo holds a processed file's path, extension, and content.\ntype FileInfo struct {\n\tPath      string `json:\"path\"`\n\tExtension string `json:\"extension\"`\n\tCode      string `json:\"code\"`\n}\n\n// ExcludedInfo tracks the files and directories excluded during a walk.\ntype ExcludedInfo struct {\n\tDirectories map[string]int // Directory path -> count of excluded files\n\tExtensions  map[string]int // File extension -> count of excluded files\n\tTotalFiles  int            // Total number of excluded files\n\tFiles       []string       // List of excluded files (if total ≤ 20)\n}\n\ntype treeNode struct {\n\tname     string\n\tchildren []*treeNode\n\tisDir    bool\n\texcluded bool\n}\n\n// ReadExcludePatterns returns the exclude patterns to apply: an explicit glob file if given, otherwise the built-in defaults plus any user-defined patterns.\nfunc ReadExcludePatterns(patternExclude string, noDefaultExcludes bool) ([]string, error) {\n\tvar patterns []string\n\n\t// If a specific pattern exclude file is provided, use it\n\tif patternExclude != \"\" {\n\t\treturn readGlobFile(patternExclude)\n\t}\n\n\tif !noDefaultExcludes {\n\t\t// Get the default excludes\n\t\tdefaultPatterns, err := GetDefaultExcludes()\n\t\tif err != nil {\n\t\t\treturn nil, fmt.Errorf(\"failed to read default exclude patterns: %w\", err)\n\t\t}\n\t\tpatterns = defaultPatterns\n\t}\n\n\t// Check for user-specific patterns\n\thome, err := homedir.Dir()\n\tif err == nil {\n\t\tuserPatternsDir := filepath.Join(home, \".config\", \"ingest\", \"patterns\", \"exclude\")\n\t\tuserDefaultGlob := filepath.Join(userPatternsDir, \"default.glob\")\n\n\t\t// If user has a default.glob, it overrides the default patterns\n\t\tif _, err := os.Stat(userDefaultGlob); err == nil 
{\n\t\t\treturn readGlobFile(userDefaultGlob)\n\t\t}\n\n\t\t// Read other user-defined patterns\n\t\tuserPatterns, _ := readGlobFilesFromDir(userPatternsDir)\n\n\t\t// Combine user patterns with default patterns (if not disabled)\n\t\tpatterns = append(patterns, userPatterns...)\n\t}\n\n\treturn patterns, nil\n}\n\n// Helper functions to track exclusions\nfunc trackExcludedFile(excluded *ExcludedInfo, path string, mu *sync.Mutex) {\n\tmu.Lock()\n\tdefer mu.Unlock()\n\n\texcluded.TotalFiles++\n\n\t// Track the directory\n\tdir := filepath.Dir(path)\n\texcluded.Directories[dir]++\n\n\t// Track the extension\n\text := filepath.Ext(path)\n\tif ext != \"\" {\n\t\texcluded.Extensions[ext]++\n\t}\n\n\t// Only store individual files if we haven't exceeded 20\n\tif excluded.TotalFiles <= 20 {\n\t\texcluded.Files = append(excluded.Files, path)\n\t}\n}\n\nfunc readGlobFile(filename string) ([]string, error) {\n\tfile, err := os.Open(filename)\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\tdefer file.Close()\n\n\tvar patterns []string\n\tscanner := bufio.NewScanner(file)\n\tfor scanner.Scan() {\n\t\tline := strings.TrimSpace(scanner.Text())\n\t\tif line != \"\" && !strings.HasPrefix(line, \"#\") {\n\t\t\tpatterns = append(patterns, line)\n\t\t}\n\t}\n\n\tif err := scanner.Err(); err != nil {\n\t\treturn nil, err\n\t}\n\n\treturn patterns, nil\n}\n\nfunc readGlobFilesFromDir(dir string) ([]string, error) {\n\tvar patterns []string\n\terr := filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {\n\t\tif err != nil {\n\t\t\treturn err\n\t\t}\n\t\tif !info.IsDir() && strings.HasSuffix(info.Name(), \".glob\") {\n\t\t\tfilePatterns, err := readGlobFile(path)\n\t\t\tif err != nil {\n\t\t\t\treturn err\n\t\t\t}\n\t\t\tpatterns = append(patterns, filePatterns...)\n\t\t}\n\t\treturn nil\n\t})\n\treturn patterns, err\n}\n\nfunc trackExcludedDirectory(excluded *ExcludedInfo, path string, mu *sync.Mutex) {\n\tmu.Lock()\n\tdefer mu.Unlock()\n\texcluded.Directories[path] 
= 0 // Initialize directory count\n}\n\nfunc WalkDirectory(rootPath string, includePatterns, excludePatterns []string, patternExclude string, includePriority, lineNumber, relativePaths, excludeFromTree, noCodeblock, noDefaultExcludes, followSymlinks bool, comp *compressor.GenericCompressor) (string, []FileInfo, *ExcludedInfo, error) {\n\tvar files []FileInfo\n\tvar mu sync.Mutex\n\tvar wg sync.WaitGroup\n\n\texcluded := &ExcludedInfo{\n\t\tDirectories: make(map[string]int),\n\t\tExtensions:  make(map[string]int),\n\t\tFiles:       make([]string, 0),\n\t}\n\n\t// Read exclude patterns\n\tdefaultExcludes, err := ReadExcludePatterns(patternExclude, noDefaultExcludes)\n\tif err != nil {\n\t\treturn \"\", nil, nil, fmt.Errorf(\"failed to read exclude patterns: %w\", err)\n\t}\n\n\t// Combine user-provided exclude patterns with default excludes (if not disabled)\n\tallExcludePatterns := append(excludePatterns, defaultExcludes...)\n\n\t// Always exclude .git directories\n\tallExcludePatterns = append(allExcludePatterns, \"**/.git/**\")\n\n\t// Read .gitignore if it exists\n\tgitignore, err := readGitignore(rootPath)\n\tif err != nil {\n\t\treturn \"\", nil, nil, fmt.Errorf(\"failed to read .gitignore: %w\", err)\n\t}\n\n\t// Check if rootPath is a file or directory\n\tfileInfo, err := os.Stat(rootPath)\n\tif err != nil {\n\t\treturn \"\", nil, nil, fmt.Errorf(\"failed to get file info: %w\", err)\n\t}\n\n\t// Check if rootPath is a single PDF file\n\tif !fileInfo.IsDir() {\n\t\tisPDF, err := pdf.IsPDF(rootPath)\n\t\tif err != nil {\n\t\t\treturn \"\", nil, nil, fmt.Errorf(\"failed to check if file is PDF: %w\", err)\n\t\t}\n\n\t\tif isPDF {\n\t\t\t// Process single PDF file directly\n\t\t\tcontent, err := pdf.ConvertPDFToMarkdown(rootPath, false)\n\t\t\tif err != nil {\n\t\t\t\treturn \"\", nil, nil, fmt.Errorf(\"failed to convert PDF: %w\", err)\n\t\t\t}\n\n\t\t\treturn fmt.Sprintf(\"File: %s\", rootPath), []FileInfo{{\n\t\t\t\tPath:      rootPath,\n\t\t\t\tExtension: 
\".md\",\n\t\t\t\tCode:      content,\n\t\t\t}}, excluded, nil\n\t\t}\n\t}\n\n\tvar treeString string\n\n\tif !fileInfo.IsDir() {\n\t\t// Check if the single file is a symlink\n\t\tif !followSymlinks {\n\t\t\tlinkInfo, err := os.Lstat(rootPath)\n\t\t\tif err != nil {\n\t\t\t\treturn \"\", nil, nil, fmt.Errorf(\"failed to get symlink info: %w\", err)\n\t\t\t}\n\t\t\tif linkInfo.Mode()&os.ModeSymlink != 0 {\n\t\t\t\tutils.PrintColouredMessage(\"ℹ️\", fmt.Sprintf(\"Skipping symlinked file: %s\", rootPath), color.FgCyan)\n\t\t\t\treturn fmt.Sprintf(\"File: %s (symlink, skipped)\", rootPath), []FileInfo{}, excluded, nil\n\t\t\t}\n\t\t}\n\n\t\t// Handle single file\n\t\trelPath := filepath.Base(rootPath)\n\t\tif shouldIncludeFile(relPath, includePatterns, allExcludePatterns, gitignore, includePriority) {\n\t\t\twg.Go(func() {\n\t\t\t\tprocessFile(rootPath, relPath, filepath.Dir(rootPath), lineNumber, relativePaths, noCodeblock, &mu, &files, comp)\n\t\t\t})\n\t\t} else {\n\t\t\ttrackExcludedFile(excluded, rootPath, &mu)\n\t\t}\n\t\ttreeString = fmt.Sprintf(\"File: %s\", rootPath)\n\t} else {\n\t\t// Generate the tree representation for directory\n\t\ttreeString, err = generateTreeString(rootPath, allExcludePatterns)\n\t\tif err != nil {\n\t\t\treturn \"\", nil, nil, fmt.Errorf(\"failed to generate directory tree: %w\", err)\n\t\t}\n\n\t\t// Process files in directory\n\t\terr = filepath.Walk(rootPath, func(path string, info os.FileInfo, err error) error {\n\t\t\tif err != nil {\n\t\t\t\treturn err\n\t\t\t}\n\n\t\t\trelPath, err := filepath.Rel(rootPath, path)\n\t\t\tif err != nil {\n\t\t\t\treturn err\n\t\t\t}\n\n\t\t\t// Check if the path is a symlink\n\t\t\tif !followSymlinks {\n\t\t\t\tlinkInfo, err := os.Lstat(path)\n\t\t\t\tif err != nil {\n\t\t\t\t\treturn err\n\t\t\t\t}\n\t\t\t\tif linkInfo.Mode()&os.ModeSymlink != 0 {\n\t\t\t\t\tif linkInfo.IsDir() || (info != nil && info.IsDir()) {\n\t\t\t\t\t\tutils.PrintColouredMessage(\"ℹ️\", fmt.Sprintf(\"Skipping symlinked 
directory: %s\", path), color.FgCyan)\n\t\t\t\t\t\treturn filepath.SkipDir\n\t\t\t\t\t}\n\t\t\t\t\tutils.PrintColouredMessage(\"ℹ️\", fmt.Sprintf(\"Skipping symlinked file: %s\", path), color.FgCyan)\n\t\t\t\t\treturn nil\n\t\t\t\t}\n\t\t\t}\n\n\t\t\t// Check if the current path (file or directory) should be excluded\n\t\t\tif shouldExcludePath(relPath, allExcludePatterns, gitignore) {\n\t\t\t\tif info.IsDir() {\n\t\t\t\t\ttrackExcludedDirectory(excluded, path, &mu)\n\t\t\t\t\treturn filepath.SkipDir\n\t\t\t\t}\n\t\t\t\ttrackExcludedFile(excluded, path, &mu)\n\t\t\t\treturn nil\n\t\t\t}\n\n\t\t\tif !info.IsDir() {\n\t\t\t\tif !shouldIncludeFile(relPath, includePatterns, allExcludePatterns, gitignore, includePriority) {\n\t\t\t\t\ttrackExcludedFile(excluded, path, &mu)\n\t\t\t\t\treturn nil\n\t\t\t\t}\n\n\t\t\t\twg.Add(1)\n\t\t\t\tgo func(path, relPath string) {\n\t\t\t\t\tdefer wg.Done()\n\t\t\t\t\tprocessFile(path, relPath, rootPath, lineNumber, relativePaths, noCodeblock, &mu, &files, comp)\n\t\t\t\t}(path, relPath)\n\t\t\t}\n\n\t\t\treturn nil\n\t\t})\n\t}\n\n\twg.Wait()\n\n\tif err != nil {\n\t\treturn \"\", nil, excluded, err\n\t}\n\n\treturn treeString, files, excluded, nil\n}\n\n// shouldExcludePath reports whether a path matches an exclude pattern or the .gitignore\nfunc shouldExcludePath(path string, excludePatterns []string, gitignore *ignore.GitIgnore) bool {\n\tfor _, pattern := range excludePatterns {\n\t\tif match, _ := doublestar.Match(pattern, path); match {\n\t\t\treturn true\n\t\t}\n\t}\n\treturn gitignore != nil && gitignore.MatchesPath(path)\n}\n\nfunc shouldIncludeFile(path string, includePatterns, excludePatterns []string, gitignore *ignore.GitIgnore, includePriority bool) bool {\n\t// Check if the file is explicitly included\n\tincluded := len(includePatterns) == 0 || matchesAny(path, includePatterns)\n\n\t// Check if the file is explicitly excluded\n\texcluded := 
isExcluded(path, excludePatterns) || (gitignore != nil && gitignore.MatchesPath(path))\n\n\tif included && excluded {\n\t\treturn includePriority\n\t}\n\treturn included && !excluded\n}\n\nfunc matchesAny(path string, patterns []string) bool {\n\tfor _, pattern := range patterns {\n\t\tif match, _ := doublestar.Match(pattern, path); match {\n\t\t\treturn true\n\t\t}\n\t}\n\treturn false\n}\n\nfunc readGitignore(rootPath string) (*ignore.GitIgnore, error) {\n\tgitignorePath := filepath.Join(rootPath, \".gitignore\")\n\tif _, err := os.Stat(gitignorePath); os.IsNotExist(err) {\n\t\treturn nil, nil\n\t}\n\n\treturn ignore.CompileIgnoreFile(gitignorePath)\n}\n\nfunc addLineNumbers(code string) string {\n\tlines := strings.Split(code, \"\\n\")\n\tfor i := range lines {\n\t\tlines[i] = fmt.Sprintf(\"%4d | %s\", i+1, lines[i])\n\t}\n\treturn strings.Join(lines, \"\\n\")\n}\n\nfunc wrapCodeBlock(code, extension string) string {\n\tif extension == \"\" {\n\t\treturn fmt.Sprintf(\"```\\n%s\\n```\", code)\n\t}\n\treturn fmt.Sprintf(\"```%s\\n%s\\n```\", extension[1:], code)\n}\n\nfunc isBinaryFile(filePath string) (bool, error) {\n\t// First check if it's a PDF\n\tisPDF, err := pdf.IsPDF(filePath)\n\tif err != nil {\n\t\treturn false, err\n\t}\n\tif isPDF {\n\t\treturn false, nil // Don't treat PDFs as binary files\n\t}\n\n\tfile, err := os.Open(filePath)\n\tif err != nil {\n\t\treturn false, err\n\t}\n\tdefer file.Close()\n\n\t// Read the first 512 bytes of the file\n\tbuffer := make([]byte, 512)\n\tn, err := file.Read(buffer)\n\tif err != nil && err != io.EOF {\n\t\treturn false, err\n\t}\n\n\t// Use http.DetectContentType to determine the content type\n\tcontentType := http.DetectContentType(buffer[:n])\n\n\t// Allow PDFs and text files\n\treturn !strings.HasPrefix(contentType, \"text/\") && contentType != \"application/pdf\", nil\n}\n\nfunc PrintDefaultExcludes() {\n\texcludes, err := GetDefaultExcludes()\n\tif err != nil {\n\t\tutils.PrintColouredMessage(\"!\", 
fmt.Sprintf(\"Failed to get default excludes: %v\", err), color.FgRed)\n\t\tos.Exit(1)\n\t}\n\tfmt.Println(strings.Join(excludes, \"\\n\"))\n}\n\nfunc processFile(path, relPath string, rootPath string, lineNumber, relativePaths, noCodeblock bool, mu *sync.Mutex, files *[]FileInfo, comp *compressor.GenericCompressor) {\n\t// Check if it's the root path being processed (explicitly provided file)\n\tisExplicitFile := path == rootPath\n\n\t// Check if file is a PDF\n\tisPDF, err := pdf.IsPDF(path)\n\tif err != nil {\n\t\tutils.PrintColouredMessage(\"!\", fmt.Sprintf(\"Failed to check if file is PDF %s: %v\", path, err), color.FgRed)\n\t\treturn\n\t}\n\n\tif isPDF {\n\t\tif !isExplicitFile {\n\t\t\t// Skip PDFs during directory traversal\n\t\t\treturn\n\t\t}\n\n\t\tutils.PrintColouredMessage(\"ℹ️\", fmt.Sprintf(\"Converting PDF to markdown: %s\", path), color.FgBlue)\n\t\tcontent, err := pdf.ConvertPDFToMarkdown(path, false)\n\t\tif err != nil {\n\t\t\tutils.PrintColouredMessage(\"!\", fmt.Sprintf(\"Failed to convert PDF %s: %v\", path, err), color.FgRed)\n\t\t\treturn\n\t\t}\n\n\t\tfilePath := path\n\t\tif relativePaths {\n\t\t\tfilePath = filepath.Join(filepath.Base(rootPath), relPath)\n\t\t}\n\n\t\tmu.Lock()\n\t\t*files = append(*files, FileInfo{\n\t\t\tPath:      filePath,\n\t\t\tExtension: \".md\",\n\t\t\tCode:      content,\n\t\t})\n\t\tmu.Unlock()\n\t\treturn\n\t}\n\n\t// Check if the file is binary\n\tisBinary, err := isBinaryFile(path)\n\tif err != nil {\n\t\tutils.PrintColouredMessage(\"!\", fmt.Sprintf(\"Failed to check if file is binary %s: %v\", path, err), color.FgRed)\n\t\treturn\n\t}\n\n\tif isBinary {\n\t\treturn // Skip binary files\n\t}\n\n\tcontent, err := os.ReadFile(path)\n\tif err != nil {\n\t\tutils.PrintColouredMessage(\"!\", fmt.Sprintf(\"Failed to read file %s: %v\", path, err), color.FgRed)\n\t\treturn\n\t}\n\n\tcode := string(content)\n\n\t// Attempt compression if compressor is provided and it's not a PDF\n\tif comp != nil && !isPDF 
{\n\t\tlangID, err := compressor.IdentifyLanguage(path)\n\t\tif err == nil { // Language identified\n\t\t\tcompressedCode, err := comp.Compress(content, langID)\n\t\t\tif err == nil {\n\t\t\t\tcode = compressedCode\n\t\t\t\t// If compressed, we might not want to add line numbers or wrap in a generic code block\n\t\t\t\t// as the compressor might handle formatting. For now, let's assume compressed output\n\t\t\t\t// is final for this file's content.\n\t\t\t\t// We'll skip line numbering and code block wrapping for compressed content.\n\t\t\t\tgoto skipFormatting\n\t\t\t} else {\n\t\t\t\tutils.PrintColouredMessage(\"⚠️\", fmt.Sprintf(\"Compression failed for %s: %v. Using original content.\", path, err), color.FgYellow)\n\t\t\t}\n\t\t} else {\n\t\t\t// Language not identified for compression, use original content\n\t\t\tutils.PrintColouredMessage(\"ℹ️\", fmt.Sprintf(\"Language not identified for compression for %s. Using original content.\", path), color.FgBlue)\n\t\t}\n\t}\n\n\tif lineNumber {\n\t\tcode = addLineNumbers(code)\n\t}\n\tif !noCodeblock {\n\t\tcode = wrapCodeBlock(code, filepath.Ext(path))\n\t}\n\nskipFormatting:\n\tfilePath := path\n\tif relativePaths {\n\t\tfilePath = filepath.Join(filepath.Base(rootPath), relPath)\n\t}\n\n\tmu.Lock()\n\t*files = append(*files, FileInfo{\n\t\tPath:      filePath,\n\t\tExtension: filepath.Ext(path),\n\t\tCode:      code,\n\t})\n\tmu.Unlock()\n}\n\nfunc generateTreeString(rootPath string, excludePatterns []string) (string, error) {\n\troot := &treeNode{name: filepath.Base(rootPath), isDir: true}\n\thasExclusions := false\n\n\terr := filepath.Walk(rootPath, func(path string, info fs.FileInfo, err error) error {\n\t\tif err != nil {\n\t\t\treturn err\n\t\t}\n\n\t\trelPath, err := filepath.Rel(rootPath, path)\n\t\tif err != nil {\n\t\t\treturn err\n\t\t}\n\n\t\t// Skip the root directory\n\t\tif relPath == \".\" {\n\t\t\treturn nil\n\t\t}\n\n\t\t// Check if the path should be excluded\n\t\texcluded := isExcluded(relPath, 
excludePatterns)\n\t\tif excluded {\n\t\t\thasExclusions = true\n\t\t\tif info.IsDir() {\n\t\t\t\t// Add the excluded directory to the tree with an X marker\n\t\t\t\tparts := strings.Split(relPath, string(os.PathSeparator))\n\t\t\t\tcurrent := root\n\t\t\t\tfor i, part := range parts {\n\t\t\t\t\tfound := false\n\t\t\t\t\tfor _, child := range current.children {\n\t\t\t\t\t\tif child.name == part {\n\t\t\t\t\t\t\tcurrent = child\n\t\t\t\t\t\t\tfound = true\n\t\t\t\t\t\t\tbreak\n\t\t\t\t\t\t}\n\t\t\t\t\t}\n\t\t\t\t\tif !found {\n\t\t\t\t\t\tnewNode := &treeNode{\n\t\t\t\t\t\t\tname:     part,\n\t\t\t\t\t\t\tisDir:    true,\n\t\t\t\t\t\t\texcluded: true,\n\t\t\t\t\t\t}\n\t\t\t\t\t\tcurrent.children = append(current.children, newNode)\n\t\t\t\t\t\tcurrent = newNode\n\t\t\t\t\t}\n\t\t\t\t\tif i == len(parts)-1 {\n\t\t\t\t\t\tcurrent.isDir = true\n\t\t\t\t\t\tcurrent.excluded = true\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t\treturn filepath.SkipDir\n\t\t\t}\n\t\t\t// Add excluded files to the tree with an X marker\n\t\t\tparts := strings.Split(relPath, string(os.PathSeparator))\n\t\t\tcurrent := root\n\t\t\tfor i, part := range parts {\n\t\t\t\tfound := false\n\t\t\t\tfor _, child := range current.children {\n\t\t\t\t\tif child.name == part {\n\t\t\t\t\t\tcurrent = child\n\t\t\t\t\t\tfound = true\n\t\t\t\t\t\tbreak\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t\tif !found {\n\t\t\t\t\tnewNode := &treeNode{\n\t\t\t\t\t\tname:     part,\n\t\t\t\t\t\tisDir:    i < len(parts)-1,\n\t\t\t\t\t\texcluded: true,\n\t\t\t\t\t}\n\t\t\t\t\tcurrent.children = append(current.children, newNode)\n\t\t\t\t\tcurrent = newNode\n\t\t\t\t}\n\t\t\t}\n\t\t\treturn nil\n\t\t}\n\n\t\tparts := strings.Split(relPath, string(os.PathSeparator))\n\t\tcurrent := root\n\t\tfor i, part := range parts {\n\t\t\tfound := false\n\t\t\tfor _, child := range current.children {\n\t\t\t\tif child.name == part {\n\t\t\t\t\tcurrent = child\n\t\t\t\t\tfound = true\n\t\t\t\t\tbreak\n\t\t\t\t}\n\t\t\t}\n\t\t\tif !found {\n\t\t\t\tnewNode := 
&treeNode{name: part, isDir: info.IsDir()}\n\t\t\t\tcurrent.children = append(current.children, newNode)\n\t\t\t\tcurrent = newNode\n\t\t\t}\n\t\t\tif i == len(parts)-1 && !info.IsDir() {\n\t\t\t\tcurrent.isDir = false\n\t\t\t}\n\t\t}\n\n\t\treturn nil\n\t})\n\n\tif err != nil {\n\t\treturn \"\", err\n\t}\n\n\tvar output strings.Builder\n\tif hasExclusions {\n\t\toutput.WriteString(\"(Files/directories marked with ❌ are excluded or not included here)\\n\\n\")\n\t}\n\toutput.WriteString(root.name + \"/\\n\")\n\tfor i, child := range root.children {\n\t\tprintTree(child, \"\", i == len(root.children)-1, &output)\n\t}\n\n\treturn strings.TrimSuffix(output.String(), \"\\n\"), nil\n}\n\nfunc printTree(node *treeNode, prefix string, isLast bool, output *strings.Builder) {\n\toutput.WriteString(prefix)\n\tif isLast {\n\t\toutput.WriteString(\"└── \")\n\t\tprefix += \"    \"\n\t} else {\n\t\toutput.WriteString(\"├── \")\n\t\tprefix += \"│   \"\n\t}\n\toutput.WriteString(node.name)\n\tif node.isDir {\n\t\toutput.WriteString(\"/\")\n\t}\n\tif node.excluded {\n\t\toutput.WriteString(\" ❌\")\n\t}\n\toutput.WriteString(\"\\n\")\n\n\tsort.Slice(node.children, func(i, j int) bool {\n\t\tif node.children[i].isDir != node.children[j].isDir {\n\t\t\treturn node.children[i].isDir\n\t\t}\n\t\treturn node.children[i].name < node.children[j].name\n\t})\n\n\tfor i, child := range node.children {\n\t\tprintTree(child, prefix, i == len(node.children)-1, output)\n\t}\n}\n\nfunc isExcluded(path string, patterns []string) bool {\n\tfor _, pattern := range patterns {\n\t\tif match, _ := doublestar.Match(pattern, path); match {\n\t\t\tif strings.HasSuffix(path, \".pdf\") {\n\t\t\t\tutils.PrintColouredMessage(\"ℹ️\", fmt.Sprintf(\"PDF file detected: %s. 
PDF to markdown conversion is supported, but the file was excluded\", path), color.FgYellow)\n\t\t\t}\n\t\t\treturn true\n\t\t}\n\t}\n\treturn false\n}\n\nfunc ProcessSingleFile(path string, lineNumber, relativePaths, noCodeblock, followSymlinks bool, comp *compressor.GenericCompressor) (FileInfo, error) {\n\t// Check if the file is a symlink\n\tif !followSymlinks {\n\t\tlinkInfo, err := os.Lstat(path)\n\t\tif err != nil {\n\t\t\treturn FileInfo{}, fmt.Errorf(\"failed to get symlink info: %w\", err)\n\t\t}\n\t\tif linkInfo.Mode()&os.ModeSymlink != 0 {\n\t\t\tutils.PrintColouredMessage(\"ℹ️\", fmt.Sprintf(\"Skipping symlinked file: %s\", path), color.FgCyan)\n\t\t\treturn FileInfo{}, fmt.Errorf(\"file is a symlink and --follow-symlinks is not set\")\n\t\t}\n\t}\n\n\t// Check if it's a PDF first\n\tisPDF, err := pdf.IsPDF(path)\n\tif err != nil {\n\t\treturn FileInfo{}, fmt.Errorf(\"failed to check if file is PDF: %w\", err)\n\t}\n\n\tif isPDF {\n\t\tcontent, err := pdf.ConvertPDFToMarkdown(path, false)\n\t\tif err != nil {\n\t\t\treturn FileInfo{}, fmt.Errorf(\"failed to convert PDF: %w\", err)\n\t\t}\n\n\t\treturn FileInfo{\n\t\t\tPath:      path,\n\t\t\tExtension: \".md\",\n\t\t\tCode:      content,\n\t\t}, nil\n\t}\n\n\t// Handle non-PDF files\n\tcontent, err := os.ReadFile(path)\n\tif err != nil {\n\t\treturn FileInfo{}, fmt.Errorf(\"failed to read file: %w\", err)\n\t}\n\n\tcode := string(content)\n\n\t// Attempt compression if compressor is provided and it's not a PDF\n\tif comp != nil && !isPDF {\n\t\tlangID, err := compressor.IdentifyLanguage(path)\n\t\tif err == nil { // Language identified\n\t\t\tcompressedCode, err := comp.Compress(content, langID)\n\t\t\tif err == nil {\n\t\t\t\tcode = compressedCode\n\t\t\t\t// Skip standard formatting for compressed content\n\t\t\t\tgoto skipSingleFileFormatting\n\t\t\t} else {\n\t\t\t\tutils.PrintColouredMessage(\"⚠️\", fmt.Sprintf(\"Compression failed for %s: %v. 
Using original content.\", path, err), color.FgYellow)\n\t\t\t}\n\t\t} else {\n\t\t\tutils.PrintColouredMessage(\"ℹ️\", fmt.Sprintf(\"Language not identified for compression for %s. Using original content.\", path), color.FgBlue)\n\t\t}\n\t}\n\n\tif lineNumber {\n\t\tcode = addLineNumbers(code)\n\t}\n\tif !noCodeblock {\n\t\tcode = wrapCodeBlock(code, filepath.Ext(path))\n\t}\n\nskipSingleFileFormatting:\n\tfilePath := path\n\tif relativePaths {\n\t\tfilePath = filepath.Base(path)\n\t}\n\n\treturn FileInfo{\n\t\tPath:      filePath,\n\t\tExtension: filepath.Ext(path),\n\t\tCode:      code,\n\t}, nil\n}\n"
  },
  {
    "path": "git/git.go",
    "content": "package git\n\nimport (\n\t\"fmt\"\n\t\"os/exec\"\n)\n\n// GetGitDiff returns the uncommitted diff for the repository at repoPath.\nfunc GetGitDiff(repoPath string) (string, error) {\n\tcmd := exec.Command(\"git\", \"-C\", repoPath, \"diff\")\n\toutput, err := cmd.Output()\n\tif err != nil {\n\t\treturn \"\", fmt.Errorf(\"failed to get git diff: %w\", err)\n\t}\n\treturn string(output), nil\n}\n\n// GetGitDiffBetweenBranches returns the diff between two branches.\nfunc GetGitDiffBetweenBranches(repoPath, branch1, branch2 string) (string, error) {\n\tcmd := exec.Command(\"git\", \"-C\", repoPath, \"diff\", branch1+\"..\"+branch2)\n\toutput, err := cmd.Output()\n\tif err != nil {\n\t\treturn \"\", fmt.Errorf(\"failed to get git diff between branches: %w\", err)\n\t}\n\treturn string(output), nil\n}\n\n// GetGitLog returns the one-line commit log between two branches.\nfunc GetGitLog(repoPath, branch1, branch2 string) (string, error) {\n\tcmd := exec.Command(\"git\", \"-C\", repoPath, \"log\", \"--oneline\", branch1+\"..\"+branch2)\n\toutput, err := cmd.Output()\n\tif err != nil {\n\t\treturn \"\", fmt.Errorf(\"failed to get git log: %w\", err)\n\t}\n\treturn string(output), nil\n}\n\n// BranchExists reports whether branchName resolves to a valid ref in the repository.\nfunc BranchExists(repoPath, branchName string) (bool, error) {\n\tcmd := exec.Command(\"git\", \"-C\", repoPath, \"rev-parse\", \"--verify\", branchName)\n\terr := cmd.Run()\n\tif err != nil {\n\t\tif exitError, ok := err.(*exec.ExitError); ok {\n\t\t\t// git rev-parse --verify exits non-zero (typically 128) when the ref doesn't exist\n\t\t\tif exitError.ExitCode() == 1 || exitError.ExitCode() == 128 {\n\t\t\t\treturn false, nil\n\t\t\t}\n\t\t}\n\t\treturn false, fmt.Errorf(\"failed to check if branch exists: %w\", err)\n\t}\n\treturn true, nil\n}\n"
  },
  {
    "path": "go.mod",
    "content": "module github.com/sammcj/ingest\n\ngo 1.25.6\n\nrequire (\n\tgithub.com/JohannesKaufmann/html-to-markdown v1.6.0\n\tgithub.com/PuerkitoBio/goquery v1.11.0\n\tgithub.com/atotto/clipboard v0.1.4\n\tgithub.com/bmatcuk/doublestar/v4 v4.9.2\n\tgithub.com/charmbracelet/glamour v0.10.0\n\tgithub.com/fatih/color v1.18.0\n\tgithub.com/ledongthuc/pdf v0.0.0-20250511090121-5959a4027728\n\tgithub.com/mitchellh/go-homedir v1.1.0\n\tgithub.com/pkoukk/tiktoken-go v0.1.8\n\tgithub.com/sabhiram/go-gitignore v0.0.0-20210923224102-525f6e181f06\n\tgithub.com/sashabaranov/go-openai v1.41.2\n\tgithub.com/schollz/progressbar/v3 v3.19.0\n\tgithub.com/smacker/go-tree-sitter v0.0.0-20240827094217-dd81d9e9be82\n\tgithub.com/spf13/cobra v1.10.2\n)\n\nrequire (\n\tgithub.com/andybalholm/cascadia v1.3.3 // indirect\n\tgithub.com/charmbracelet/colorprofile v0.4.1 // indirect\n\tgithub.com/charmbracelet/x/cellbuf v0.0.14 // indirect\n\tgithub.com/charmbracelet/x/exp/slice v0.0.0-20260119114936-fd556377ea59 // indirect\n\tgithub.com/charmbracelet/x/term v0.2.2 // indirect\n\tgithub.com/clipperhouse/displaywidth v0.7.0 // indirect\n\tgithub.com/clipperhouse/stringish v0.1.1 // indirect\n\tgithub.com/clipperhouse/uax29/v2 v2.3.1 // indirect\n\tgithub.com/olekukonko/cat v0.0.0-20250911104152-50322a0618f6 // indirect\n\tgithub.com/olekukonko/errors v1.2.0 // indirect\n\tgithub.com/olekukonko/ll v0.1.4-0.20260115111900-9e59c2286df0 // indirect\n\tgithub.com/sammcj/gollama v1.37.5 // indirect\n\tgithub.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e // indirect\n\tgolang.org/x/text v0.33.0 // indirect\n\tgopkg.in/yaml.v2 v2.4.0 // indirect\n)\n\nrequire (\n\tgithub.com/alecthomas/chroma/v2 v2.23.0 // indirect\n\tgithub.com/aymanbagabas/go-osc52/v2 v2.0.1 // indirect\n\tgithub.com/aymerick/douceur v0.2.0 // indirect\n\tgithub.com/charmbracelet/lipgloss v1.1.1-0.20250404203927-76690c660834 // indirect\n\tgithub.com/charmbracelet/x/ansi v0.11.4 // indirect\n\tgithub.com/dlclark/regexp2 
v1.11.5 // indirect\n\tgithub.com/go-ole/go-ole v1.3.0 // indirect\n\tgithub.com/google/uuid v1.6.0 // indirect\n\tgithub.com/gorilla/css v1.0.1 // indirect\n\tgithub.com/inconshreveable/mousetrap v1.1.0 // indirect\n\tgithub.com/lucasb-eyer/go-colorful v1.3.0 // indirect\n\tgithub.com/mattn/go-colorable v0.1.14 // indirect\n\tgithub.com/mattn/go-isatty v0.0.20 // indirect\n\tgithub.com/mattn/go-runewidth v0.0.19 // indirect\n\tgithub.com/microcosm-cc/bluemonday v1.0.27 // indirect\n\tgithub.com/mitchellh/colorstring v0.0.0-20190213212951-d06e56a500db // indirect\n\tgithub.com/muesli/reflow v0.3.0 // indirect\n\tgithub.com/muesli/termenv v0.16.0 // indirect\n\tgithub.com/natefinch/lumberjack v2.0.0+incompatible // indirect\n\tgithub.com/olekukonko/tablewriter v1.1.3 // indirect\n\tgithub.com/rivo/uniseg v0.4.7 // indirect\n\tgithub.com/rs/zerolog v1.34.0 // indirect\n\tgithub.com/sammcj/quantest v0.0.13\n\tgithub.com/shirou/gopsutil v3.21.11+incompatible // indirect\n\tgithub.com/spf13/pflag v1.0.10 // indirect\n\tgithub.com/yuin/goldmark v1.7.16 // indirect\n\tgithub.com/yuin/goldmark-emoji v1.0.6 // indirect\n\tgithub.com/yusufpapurcu/wmi v1.2.4 // indirect\n\tgolang.org/x/net v0.49.0 // indirect\n\tgolang.org/x/sys v0.40.0 // indirect\n\tgolang.org/x/term v0.39.0 // indirect\n)\n"
  },
  {
    "path": "go.sum",
    "content": "github.com/BurntSushi/toml v1.4.0 h1:kuoIxZQy2WRRk1pttg9asf+WVv6tWQuBNVmK8+nqPr0=\ngithub.com/BurntSushi/toml v1.4.0/go.mod h1:ukJfTF/6rtPPRCnwkur4qwRxa8vTRFBF0uk2lLoLwho=\ngithub.com/JohannesKaufmann/html-to-markdown v1.6.0 h1:04VXMiE50YYfCfLboJCLcgqF5x+rHJnb1ssNmqpLH/k=\ngithub.com/JohannesKaufmann/html-to-markdown v1.6.0/go.mod h1:NUI78lGg/a7vpEJTz/0uOcYMaibytE4BUOQS8k78yPQ=\ngithub.com/PuerkitoBio/goquery v1.9.2/go.mod h1:GHPCaP0ODyyxqcNoFGYlAprUFH81NuRPd0GX3Zu2Mvk=\ngithub.com/PuerkitoBio/goquery v1.11.0 h1:jZ7pwMQXIITcUXNH83LLk+txlaEy6NVOfTuP43xxfqw=\ngithub.com/PuerkitoBio/goquery v1.11.0/go.mod h1:wQHgxUOU3JGuj3oD/QFfxUdlzW6xPHfqyHre6VMY4DQ=\ngithub.com/alecthomas/assert/v2 v2.11.0 h1:2Q9r3ki8+JYXvGsDyBXwH3LcJ+WK5D0gc5E8vS6K3D0=\ngithub.com/alecthomas/assert/v2 v2.11.0/go.mod h1:Bze95FyfUr7x34QZrjL+XP+0qgp/zg8yS+TtBj1WA3k=\ngithub.com/alecthomas/chroma/v2 v2.20.0 h1:sfIHpxPyR07/Oylvmcai3X/exDlE8+FA820NTz+9sGw=\ngithub.com/alecthomas/chroma/v2 v2.20.0/go.mod h1:e7tViK0xh/Nf4BYHl00ycY6rV7b8iXBksI9E359yNmA=\ngithub.com/alecthomas/chroma/v2 v2.23.0 h1:u/Orux1J0eLuZDeQ44froV8smumheieI0EofhbyKhhk=\ngithub.com/alecthomas/chroma/v2 v2.23.0/go.mod h1:NqVhfBR0lte5Ouh3DcthuUCTUpDC9cxBOfyMbMQPs3o=\ngithub.com/alecthomas/repr v0.5.1 h1:E3G4t2QbHTSNpPKBgMTln5KLkZHLOcU7r37J4pXBuIg=\ngithub.com/alecthomas/repr v0.5.1/go.mod h1:Fr0507jx4eOXV7AlPV6AVZLYrLIuIeSOWtW57eE/O/4=\ngithub.com/andybalholm/cascadia v1.3.2/go.mod h1:7gtRlve5FxPPgIgX36uWBX58OdBsSS6lUvCFb+h7KvU=\ngithub.com/andybalholm/cascadia v1.3.3 h1:AG2YHrzJIm4BZ19iwJ/DAua6Btl3IwJX+VI4kktS1LM=\ngithub.com/andybalholm/cascadia v1.3.3/go.mod h1:xNd9bqTn98Ln4DwST8/nG+H0yuB8Hmgu1YHNnWw0GeA=\ngithub.com/atotto/clipboard v0.1.4 h1:EH0zSVneZPSuFR11BlR9YppQTVDbh5+16AmcJi4g1z4=\ngithub.com/atotto/clipboard v0.1.4/go.mod h1:ZY9tmq7sm5xIbd9bOK4onWV4S6X0u6GY7Vn0Yu86PYI=\ngithub.com/aymanbagabas/go-osc52/v2 v2.0.1 h1:HwpRHbFMcZLEVr42D4p7XBqjyuxQH5SMiErDT4WkJ2k=\ngithub.com/aymanbagabas/go-osc52/v2 v2.0.1/go.mod 
h1:uYgXzlJ7ZpABp8OJ+exZzJJhRNQ2ASbcXHWsFqH8hp8=\ngithub.com/aymanbagabas/go-udiff v0.2.0 h1:TK0fH4MteXUDspT88n8CKzvK0X9O2xu9yQjWpi6yML8=\ngithub.com/aymanbagabas/go-udiff v0.2.0/go.mod h1:RE4Ex0qsGkTAJoQdQQCA0uG+nAzJO/pI/QwceO5fgrA=\ngithub.com/aymerick/douceur v0.2.0 h1:Mv+mAeH1Q+n9Fr+oyamOlAkUNPWPlA8PPGR0QAaYuPk=\ngithub.com/aymerick/douceur v0.2.0/go.mod h1:wlT5vV2O3h55X9m7iVYN0TBM0NH/MmbLnd30/FjWUq4=\ngithub.com/bmatcuk/doublestar/v4 v4.9.1 h1:X8jg9rRZmJd4yRy7ZeNDRnM+T3ZfHv15JiBJ/avrEXE=\ngithub.com/bmatcuk/doublestar/v4 v4.9.1/go.mod h1:xBQ8jztBU6kakFMg+8WGxn0c6z1fTSPVIjEY1Wr7jzc=\ngithub.com/bmatcuk/doublestar/v4 v4.9.2 h1:b0mc6WyRSYLjzofB2v/0cuDUZ+MqoGyH3r0dVij35GI=\ngithub.com/bmatcuk/doublestar/v4 v4.9.2/go.mod h1:xBQ8jztBU6kakFMg+8WGxn0c6z1fTSPVIjEY1Wr7jzc=\ngithub.com/charmbracelet/colorprofile v0.3.3 h1:DjJzJtLP6/NZ8p7Cgjno0CKGr7wwRJGxWUwh2IyhfAI=\ngithub.com/charmbracelet/colorprofile v0.3.3/go.mod h1:nB1FugsAbzq284eJcjfah2nhdSLppN2NqvfotkfRYP4=\ngithub.com/charmbracelet/colorprofile v0.4.1 h1:a1lO03qTrSIRaK8c3JRxJDZOvhvIeSco3ej+ngLk1kk=\ngithub.com/charmbracelet/colorprofile v0.4.1/go.mod h1:U1d9Dljmdf9DLegaJ0nGZNJvoXAhayhmidOdcBwAvKk=\ngithub.com/charmbracelet/glamour v0.10.0 h1:MtZvfwsYCx8jEPFJm3rIBFIMZUfUJ765oX8V6kXldcY=\ngithub.com/charmbracelet/glamour v0.10.0/go.mod h1:f+uf+I/ChNmqo087elLnVdCiVgjSKWuXa/l6NU2ndYk=\ngithub.com/charmbracelet/lipgloss v1.1.1-0.20250404203927-76690c660834 h1:ZR7e0ro+SZZiIZD7msJyA+NjkCNNavuiPBLgerbOziE=\ngithub.com/charmbracelet/lipgloss v1.1.1-0.20250404203927-76690c660834/go.mod h1:aKC/t2arECF6rNOnaKaVU6y4t4ZeHQzqfxedE/VkVhA=\ngithub.com/charmbracelet/x/ansi v0.11.2 h1:XAG3FSjiVtFvgEgGrNBkCNNYrsucAt8c6bfxHyROLLs=\ngithub.com/charmbracelet/x/ansi v0.11.2/go.mod h1:9tY2bzX5SiJCU0iWyskjBeI2BRQfvPqI+J760Mjf+Rg=\ngithub.com/charmbracelet/x/ansi v0.11.4 h1:6G65PLu6HjmE858CnTUQY1LXT3ZUWwfvqEROLF8vqHI=\ngithub.com/charmbracelet/x/ansi v0.11.4/go.mod 
h1:/5AZ+UfWExW3int5H5ugnsG/PWjNcSQcwYsHBlPFQN4=\ngithub.com/charmbracelet/x/cellbuf v0.0.14 h1:iUEMryGyFTelKW3THW4+FfPgi4fkmKnnaLOXuc+/Kj4=\ngithub.com/charmbracelet/x/cellbuf v0.0.14/go.mod h1:P447lJl49ywBbil/KjCk2HexGh4tEY9LH0/1QrZZ9rA=\ngithub.com/charmbracelet/x/exp/golden v0.0.0-20240806155701-69247e0abc2a h1:G99klV19u0QnhiizODirwVksQB91TJKV/UaTnACcG30=\ngithub.com/charmbracelet/x/exp/golden v0.0.0-20240806155701-69247e0abc2a/go.mod h1:wDlXFlCrmJ8J+swcL/MnGUuYnqgQdW9rhSD61oNMb6U=\ngithub.com/charmbracelet/x/exp/slice v0.0.0-20251126160633-0b68cdcd21da h1:6HDQl5MSww6jOImEZ6qu4OPUOcOauUgexOrAOkWGFs8=\ngithub.com/charmbracelet/x/exp/slice v0.0.0-20251126160633-0b68cdcd21da/go.mod h1:vqEfX6xzqW1pKKZUUiFOKg0OQ7bCh54Q2vR/tserrRA=\ngithub.com/charmbracelet/x/exp/slice v0.0.0-20260119114936-fd556377ea59 h1:QtkqQl+yAR6RwQnNjdWHRX093ajX8FZ/WAz3Dvw+xWg=\ngithub.com/charmbracelet/x/exp/slice v0.0.0-20260119114936-fd556377ea59/go.mod h1:vqEfX6xzqW1pKKZUUiFOKg0OQ7bCh54Q2vR/tserrRA=\ngithub.com/charmbracelet/x/term v0.2.2 h1:xVRT/S2ZcKdhhOuSP4t5cLi5o+JxklsoEObBSgfgZRk=\ngithub.com/charmbracelet/x/term v0.2.2/go.mod h1:kF8CY5RddLWrsgVwpw4kAa6TESp6EB5y3uxGLeCqzAI=\ngithub.com/chengxilo/virtualterm v1.0.4 h1:Z6IpERbRVlfB8WkOmtbHiDbBANU7cimRIof7mk9/PwM=\ngithub.com/chengxilo/virtualterm v1.0.4/go.mod h1:DyxxBZz/x1iqJjFxTFcr6/x+jSpqN0iwWCOK1q10rlY=\ngithub.com/clipperhouse/displaywidth v0.6.0 h1:k32vueaksef9WIKCNcoqRNyKbyvkvkysNYnAWz2fN4s=\ngithub.com/clipperhouse/displaywidth v0.6.0/go.mod h1:R+kHuzaYWFkTm7xoMmK1lFydbci4X2CicfbGstSGg0o=\ngithub.com/clipperhouse/displaywidth v0.7.0 h1:QNv1GYsnLX9QBrcWUtMlogpTXuM5FVnBwKWp1O5NwmE=\ngithub.com/clipperhouse/displaywidth v0.7.0/go.mod h1:R+kHuzaYWFkTm7xoMmK1lFydbci4X2CicfbGstSGg0o=\ngithub.com/clipperhouse/stringish v0.1.1 h1:+NSqMOr3GR6k1FdRhhnXrLfztGzuG+VuFDfatpWHKCs=\ngithub.com/clipperhouse/stringish v0.1.1/go.mod h1:v/WhFtE1q0ovMta2+m+UbpZ+2/HEXNWYXQgCt4hdOzA=\ngithub.com/clipperhouse/uax29/v2 v2.3.0 
h1:SNdx9DVUqMoBuBoW3iLOj4FQv3dN5mDtuqwuhIGpJy4=\ngithub.com/clipperhouse/uax29/v2 v2.3.0/go.mod h1:Wn1g7MK6OoeDT0vL+Q0SQLDz/KpfsVRgg6W7ihQeh4g=\ngithub.com/clipperhouse/uax29/v2 v2.3.1 h1:RjM8gnVbFbgI67SBekIC7ihFpyXwRPYWXn9BZActHbw=\ngithub.com/clipperhouse/uax29/v2 v2.3.1/go.mod h1:Wn1g7MK6OoeDT0vL+Q0SQLDz/KpfsVRgg6W7ihQeh4g=\ngithub.com/coreos/go-systemd/v22 v22.5.0/go.mod h1:Y58oyj3AT4RCenI/lSvhwexgC+NSVTIJ3seZv2GcEnc=\ngithub.com/cpuguy83/go-md2man/v2 v2.0.6/go.mod h1:oOW0eioCTA6cOiMLiUPZOpcVxMig6NIQQ7OS05n1F4g=\ngithub.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=\ngithub.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=\ngithub.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=\ngithub.com/dlclark/regexp2 v1.11.5 h1:Q/sSnsKerHeCkc/jSTNq1oCm7KiVgUMZRDUoRu0JQZQ=\ngithub.com/dlclark/regexp2 v1.11.5/go.mod h1:DHkYz0B9wPfa6wondMfaivmHpzrQ3v9q8cnmRbL6yW8=\ngithub.com/fatih/color v1.18.0 h1:S8gINlzdQ840/4pfAwic/ZE0djQEH3wM94VfqLTZcOM=\ngithub.com/fatih/color v1.18.0/go.mod h1:4FelSpRwEGDpQ12mAdzqdOukCy4u8WUtOY6lkT/6HfU=\ngithub.com/go-ole/go-ole v1.2.6/go.mod h1:pprOEPIfldk/42T2oK7lQ4v4JSDwmV0As9GaiUsvbm0=\ngithub.com/go-ole/go-ole v1.3.0 h1:Dt6ye7+vXGIKZ7Xtk4s6/xVdGDQynvom7xCFEdWr6uE=\ngithub.com/go-ole/go-ole v1.3.0/go.mod h1:5LS6F96DhAwUc7C+1HLexzMXY1xGRSryjyPPKW6zv78=\ngithub.com/godbus/dbus/v5 v5.0.4/go.mod h1:xhWf0FNVPg57R7Z0UbKHbJfkEywrmjJnf7w5xrFpKfA=\ngithub.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=\ngithub.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=\ngithub.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=\ngithub.com/gorilla/css v1.0.1 h1:ntNaBIghp6JmvWnxbZKANoLyuXTPZ4cAMlo6RyhlbO8=\ngithub.com/gorilla/css v1.0.1/go.mod h1:BvnYkspnSzMmwRK+b8/xgNPLiIuNZr6vbZBTPQ2A3b0=\ngithub.com/hexops/gotextdiff v1.0.3 
h1:gitA9+qJrrTCsiCl7+kh75nPqQt1cx4ZkudSTLoUqJM=\ngithub.com/hexops/gotextdiff v1.0.3/go.mod h1:pSWU5MAI3yDq+fZBTazCSJysOMbxWL1BSow5/V2vxeg=\ngithub.com/inconshreveable/mousetrap v1.1.0 h1:wN+x4NVGpMsO7ErUn/mUI3vEoE6Jt13X2s0bqwp9tc8=\ngithub.com/inconshreveable/mousetrap v1.1.0/go.mod h1:vpF70FUmC8bwa3OWnCshd2FqLfsEA9PFc4w1p2J65bw=\ngithub.com/kr/pretty v0.1.0 h1:L/CwN0zerZDmRFUapSPitk6f+Q3+0za1rQkzVuMiMFI=\ngithub.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo=\ngithub.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ=\ngithub.com/kr/text v0.1.0 h1:45sCR5RtlFHMR4UwH9sdQ5TC8v0qDQCHnXt+kaKSTVE=\ngithub.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI=\ngithub.com/ledongthuc/pdf v0.0.0-20250511090121-5959a4027728 h1:QwWKgMY28TAXaDl+ExRDqGQltzXqN/xypdKP86niVn8=\ngithub.com/ledongthuc/pdf v0.0.0-20250511090121-5959a4027728/go.mod h1:1fEHWurg7pvf5SG6XNE5Q8UZmOwex51Mkx3SLhrW5B4=\ngithub.com/lucasb-eyer/go-colorful v1.3.0 h1:2/yBRLdWBZKrf7gB40FoiKfAWYQ0lqNcbuQwVHXptag=\ngithub.com/lucasb-eyer/go-colorful v1.3.0/go.mod h1:R4dSotOR9KMtayYi1e77YzuveK+i7ruzyGqttikkLy0=\ngithub.com/mattn/go-colorable v0.1.13/go.mod h1:7S9/ev0klgBDR4GtXTXX8a3vIGJpMovkB8vQcUbaXHg=\ngithub.com/mattn/go-colorable v0.1.14 h1:9A9LHSqF/7dyVVX6g0U9cwm9pG3kP9gSzcuIPHPsaIE=\ngithub.com/mattn/go-colorable v0.1.14/go.mod h1:6LmQG8QLFO4G5z1gPvYEzlUgJ2wF+stgPZH1UqBm1s8=\ngithub.com/mattn/go-isatty v0.0.16/go.mod h1:kYGgaQfpe5nmfYZH+SKPsOc2e4SrIfOl2e/yFXSvRLM=\ngithub.com/mattn/go-isatty v0.0.19/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=\ngithub.com/mattn/go-isatty v0.0.20 h1:xfD0iDuEKnDkl03q4limB+vH+GxLEtL/jb4xVJSWWEY=\ngithub.com/mattn/go-isatty v0.0.20/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=\ngithub.com/mattn/go-runewidth v0.0.12/go.mod h1:RAqKPSqVFrSLVXbA8x7dzmKdmGzieGRCM46jaSJTDAk=\ngithub.com/mattn/go-runewidth v0.0.19 h1:v++JhqYnZuu5jSKrk9RbgF5v4CGUjqRfBm05byFGLdw=\ngithub.com/mattn/go-runewidth 
v0.0.19/go.mod h1:XBkDxAl56ILZc9knddidhrOlY5R/pDhgLpndooCuJAs=\ngithub.com/microcosm-cc/bluemonday v1.0.27 h1:MpEUotklkwCSLeH+Qdx1VJgNqLlpY2KXwXFM08ygZfk=\ngithub.com/microcosm-cc/bluemonday v1.0.27/go.mod h1:jFi9vgW+H7c3V0lb6nR74Ib/DIB5OBs92Dimizgw2cA=\ngithub.com/mitchellh/colorstring v0.0.0-20190213212951-d06e56a500db h1:62I3jR2EmQ4l5rM/4FEfDWcRD+abF5XlKShorW5LRoQ=\ngithub.com/mitchellh/colorstring v0.0.0-20190213212951-d06e56a500db/go.mod h1:l0dey0ia/Uv7NcFFVbCLtqEBQbrT4OCwCSKTEv6enCw=\ngithub.com/mitchellh/go-homedir v1.1.0 h1:lukF9ziXFxDFPkA1vsr5zpc1XuPDn/wFntq5mG+4E0Y=\ngithub.com/mitchellh/go-homedir v1.1.0/go.mod h1:SfyaCUpYCn1Vlf4IUYiD9fPX4A5wJrkLzIz1N1q0pr0=\ngithub.com/muesli/reflow v0.3.0 h1:IFsN6K9NfGtjeggFP+68I4chLZV2yIKsXJFNZ+eWh6s=\ngithub.com/muesli/reflow v0.3.0/go.mod h1:pbwTDkVPibjO2kyvBQRBxTWEEGDGq0FlB1BIKtnHY/8=\ngithub.com/muesli/termenv v0.16.0 h1:S5AlUN9dENB57rsbnkPyfdGuWIlkmzJjbFf0Tf5FWUc=\ngithub.com/muesli/termenv v0.16.0/go.mod h1:ZRfOIKPFDYQoDFF4Olj7/QJbW60Ol/kL1pU3VfY/Cnk=\ngithub.com/natefinch/lumberjack v2.0.0+incompatible h1:4QJd3OLAMgj7ph+yZTuX13Ld4UpgHp07nNdFX7mqFfM=\ngithub.com/natefinch/lumberjack v2.0.0+incompatible/go.mod h1:Wi9p2TTF5DG5oU+6YfsmYQpsTIOm0B1VNzQg9Mw6nPk=\ngithub.com/olekukonko/cat v0.0.0-20250911104152-50322a0618f6 h1:zrbMGy9YXpIeTnGj4EljqMiZsIcE09mmF8XsD5AYOJc=\ngithub.com/olekukonko/cat v0.0.0-20250911104152-50322a0618f6/go.mod h1:rEKTHC9roVVicUIfZK7DYrdIoM0EOr8mK1Hj5s3JjH0=\ngithub.com/olekukonko/errors v1.1.0 h1:RNuGIh15QdDenh+hNvKrJkmxxjV4hcS50Db478Ou5sM=\ngithub.com/olekukonko/errors v1.1.0/go.mod h1:ppzxA5jBKcO1vIpCXQ9ZqgDh8iwODz6OXIGKU8r5m4Y=\ngithub.com/olekukonko/errors v1.2.0 h1:10Zcn4GeV59t/EGqJc8fUjtFT/FuUh5bTMzZ1XwmCRo=\ngithub.com/olekukonko/errors v1.2.0/go.mod h1:ppzxA5jBKcO1vIpCXQ9ZqgDh8iwODz6OXIGKU8r5m4Y=\ngithub.com/olekukonko/ll v0.1.2 h1:lkg/k/9mlsy0SxO5aC+WEpbdT5K83ddnNhAepz7TQc0=\ngithub.com/olekukonko/ll v0.1.2/go.mod 
h1:b52bVQRRPObe+yyBl0TxNfhesL0nedD4Cht0/zx55Ew=\ngithub.com/olekukonko/ll v0.1.4-0.20260115111900-9e59c2286df0 h1:jrYnow5+hy3WRDCBypUFvVKNSPPCdqgSXIE9eJDD8LM=\ngithub.com/olekukonko/ll v0.1.4-0.20260115111900-9e59c2286df0/go.mod h1:b52bVQRRPObe+yyBl0TxNfhesL0nedD4Cht0/zx55Ew=\ngithub.com/olekukonko/tablewriter v1.1.0 h1:N0LHrshF4T39KvI96fn6GT8HEjXRXYNDrDjKFDB7RIY=\ngithub.com/olekukonko/tablewriter v1.1.0/go.mod h1:5c+EBPeSqvXnLLgkm9isDdzR3wjfBkHR9Nhfp3NWrzo=\ngithub.com/olekukonko/tablewriter v1.1.3 h1:VSHhghXxrP0JHl+0NnKid7WoEmd9/urKRJLysb70nnA=\ngithub.com/olekukonko/tablewriter v1.1.3/go.mod h1:9VU0knjhmMkXjnMKrZ3+L2JhhtsQ/L38BbL3CRNE8tM=\ngithub.com/pkg/errors v0.8.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=\ngithub.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=\ngithub.com/pkoukk/tiktoken-go v0.1.8 h1:85ENo+3FpWgAACBaEUVp+lctuTcYUO7BtmfhlN/QTRo=\ngithub.com/pkoukk/tiktoken-go v0.1.8/go.mod h1:9NiV+i9mJKGj1rYOT+njbv+ZwA/zJxYdewGl6qVatpg=\ngithub.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=\ngithub.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=\ngithub.com/rivo/uniseg v0.1.0/go.mod h1:J6wj4VEh+S6ZtnVlnTBMWIodfgj8LQOQFoIToxlJtxc=\ngithub.com/rivo/uniseg v0.2.0/go.mod h1:J6wj4VEh+S6ZtnVlnTBMWIodfgj8LQOQFoIToxlJtxc=\ngithub.com/rivo/uniseg v0.4.7 h1:WUdvkW8uEhrYfLC4ZzdpI2ztxP1I582+49Oc5Mq64VQ=\ngithub.com/rivo/uniseg v0.4.7/go.mod h1:FN3SvrM+Zdj16jyLfmOkMNblXMcoc8DfTHruCPUcx88=\ngithub.com/rs/xid v1.6.0/go.mod h1:7XoLgs4eV+QndskICGsho+ADou8ySMSjJKDIan90Nz0=\ngithub.com/rs/zerolog v1.34.0 h1:k43nTLIwcTVQAncfCw4KZ2VY6ukYoZaBPNOE8txlOeY=\ngithub.com/rs/zerolog v1.34.0/go.mod h1:bJsvje4Z08ROH4Nhs5iH600c3IkWhwp44iRc54W6wYQ=\ngithub.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=\ngithub.com/sabhiram/go-gitignore v0.0.0-20210923224102-525f6e181f06 
h1:OkMGxebDjyw0ULyrTYWeN0UNCCkmCWfjPnIA2W6oviI=\ngithub.com/sabhiram/go-gitignore v0.0.0-20210923224102-525f6e181f06/go.mod h1:+ePHsJ1keEjQtpvf9HHw0f4ZeJ0TLRsxhunSI2hYJSs=\ngithub.com/sammcj/gollama v1.37.3 h1:aOkGsyCObF8Y2TQ5fzj2PVnvZMugexMAwux+WH+wjtA=\ngithub.com/sammcj/gollama v1.37.3/go.mod h1:NOkydPF8yjWOsSypVQkMJH1yphHq80L52Z0AMzfnPfY=\ngithub.com/sammcj/gollama v1.37.5 h1:SrKJjdwDtrTkiOoySrTLiF7INm4Ik1GcqVYnm4InaBU=\ngithub.com/sammcj/gollama v1.37.5/go.mod h1:0fICL7D5ZUsUIIXSTrSh5OC/BxmLQ521HgtBWTLY+A8=\ngithub.com/sammcj/quantest v0.0.13 h1:d/f+Pp1aXFL0P2DRyZNOlDf4AAlqhmB1HQM+egHzDyo=\ngithub.com/sammcj/quantest v0.0.13/go.mod h1:nfmfRnybimGRLXd/Yq3xwIyetGyJDzJPi0a3X5RKCWk=\ngithub.com/sashabaranov/go-openai v1.41.2 h1:vfPRBZNMpnqu8ELsclWcAvF19lDNgh1t6TVfFFOPiSM=\ngithub.com/sashabaranov/go-openai v1.41.2/go.mod h1:lj5b/K+zjTSFxVLijLSTDZuP7adOgerWeFyZLUhAKRg=\ngithub.com/schollz/progressbar/v3 v3.18.0 h1:uXdoHABRFmNIjUfte/Ex7WtuyVslrw2wVPQmCN62HpA=\ngithub.com/schollz/progressbar/v3 v3.18.0/go.mod h1:IsO3lpbaGuzh8zIMzgY3+J8l4C8GjO0Y9S69eFvNsec=\ngithub.com/schollz/progressbar/v3 v3.19.0 h1:Ea18xuIRQXLAUidVDox3AbwfUhD0/1IvohyTutOIFoc=\ngithub.com/schollz/progressbar/v3 v3.19.0/go.mod h1:IsO3lpbaGuzh8zIMzgY3+J8l4C8GjO0Y9S69eFvNsec=\ngithub.com/sebdah/goldie/v2 v2.5.3 h1:9ES/mNN+HNUbNWpVAlrzuZ7jE+Nrczbj8uFRjM7624Y=\ngithub.com/sebdah/goldie/v2 v2.5.3/go.mod h1:oZ9fp0+se1eapSRjfYbsV/0Hqhbuu3bJVvKI/NNtssI=\ngithub.com/sergi/go-diff v1.0.0/go.mod h1:0CfEIISq7TuYL3j771MWULgwwjU+GofnZX9QAmXWZgo=\ngithub.com/sergi/go-diff v1.3.1 h1:xkr+Oxo4BOQKmkn/B9eMK0g5Kg/983T9DqqPHwYqD+8=\ngithub.com/sergi/go-diff v1.3.1/go.mod h1:aMJSSKb2lpPvRNec0+w3fl7LP9IOFzdc9Pa4NFbPK1I=\ngithub.com/shirou/gopsutil v3.21.11+incompatible h1:+1+c1VGhc88SSonWP6foOcLhvnKlUeu/erjjvaPEYiI=\ngithub.com/shirou/gopsutil v3.21.11+incompatible/go.mod h1:5b4v6he4MtMOwMlS0TUMTu2PcXUg8+E1lC7eC3UO/RA=\ngithub.com/smacker/go-tree-sitter v0.0.0-20240827094217-dd81d9e9be82 
h1:6C8qej6f1bStuePVkLSFxoU22XBS165D3klxlzRg8F4=\ngithub.com/smacker/go-tree-sitter v0.0.0-20240827094217-dd81d9e9be82/go.mod h1:xe4pgH49k4SsmkQq5OT8abwhWmnzkhpgnXeekbx2efw=\ngithub.com/spf13/cobra v1.10.1 h1:lJeBwCfmrnXthfAupyUTzJ/J4Nc1RsHC/mSRU2dll/s=\ngithub.com/spf13/cobra v1.10.1/go.mod h1:7SmJGaTHFVBY0jW4NXGluQoLvhqFQM+6XSKD+P4XaB0=\ngithub.com/spf13/cobra v1.10.2 h1:DMTTonx5m65Ic0GOoRY2c16WCbHxOOw6xxezuLaBpcU=\ngithub.com/spf13/cobra v1.10.2/go.mod h1:7C1pvHqHw5A4vrJfjNwvOdzYu0Gml16OCs2GRiTUUS4=\ngithub.com/spf13/pflag v1.0.9/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg=\ngithub.com/spf13/pflag v1.0.10 h1:4EBh2KAYBwaONj6b2Ye1GiHfwjqyROoF4RwYO+vPwFk=\ngithub.com/spf13/pflag v1.0.10/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg=\ngithub.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=\ngithub.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=\ngithub.com/stretchr/testify v1.4.0/go.mod h1:j7eGeouHqKxXV5pUuKE4zz7dFj8WfuZ+81PSLYec5m4=\ngithub.com/stretchr/testify v1.6.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=\ngithub.com/stretchr/testify v1.9.0 h1:HtqpIVDClZ4nwg75+f6Lvsy/wHu+3BoSGCbBAcpTsTg=\ngithub.com/stretchr/testify v1.9.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=\ngithub.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e h1:JVG44RsyaB9T2KIHavMF/ppJZNG9ZpyihvCd0w101no=\ngithub.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e/go.mod h1:RbqR21r5mrJuqunuUZ/Dhy/avygyECGrLceyNeo4LiM=\ngithub.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY=\ngithub.com/yuin/goldmark v1.7.1/go.mod h1:uzxRWxtg69N339t3louHJ7+O03ezfj6PlliRlaOzY1E=\ngithub.com/yuin/goldmark v1.7.13 h1:GPddIs617DnBLFFVJFgpo1aBfe/4xcvMc3SB5t/D0pA=\ngithub.com/yuin/goldmark v1.7.13/go.mod h1:ip/1k0VRfGynBgxOz0yCqHrbZXhcjxyuS66Brc7iBKg=\ngithub.com/yuin/goldmark v1.7.16 h1:n+CJdUxaFMiDUNnWC3dMWCIQJSkxH4uz3ZwQBkAlVNE=\ngithub.com/yuin/goldmark v1.7.16/go.mod 
h1:ip/1k0VRfGynBgxOz0yCqHrbZXhcjxyuS66Brc7iBKg=\ngithub.com/yuin/goldmark-emoji v1.0.6 h1:QWfF2FYaXwL74tfGOW5izeiZepUDroDJfWubQI9HTHs=\ngithub.com/yuin/goldmark-emoji v1.0.6/go.mod h1:ukxJDKFpdFb5x0a5HqbdlcKtebh086iJpI31LTKmWuA=\ngithub.com/yusufpapurcu/wmi v1.2.4 h1:zFUKzehAFReQwLys1b/iSMl+JQGSCSjtVqQn9bBrPo0=\ngithub.com/yusufpapurcu/wmi v1.2.4/go.mod h1:SBZ9tNy3G9/m5Oi98Zks0QjeHVDvuK0qfxQmPyzfmi0=\ngo.yaml.in/yaml/v3 v3.0.4/go.mod h1:DhzuOOF2ATzADvBadXxruRBLzYTpT36CKvDb3+aBEFg=\ngolang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=\ngolang.org/x/crypto v0.0.0-20210921155107-089bfa567519/go.mod h1:GvvjBRRGRdwPK5ydBHafDWAxML/pGHZbMvKqRZ5+Abc=\ngolang.org/x/crypto v0.13.0/go.mod h1:y6Z2r+Rw4iayiXXAIxJIDAJ1zMW4yaTpebo8fPOliYc=\ngolang.org/x/crypto v0.19.0/go.mod h1:Iy9bg/ha4yyC70EfRS8jz+B6ybOBKMaSxLj6P6oBDfU=\ngolang.org/x/crypto v0.22.0/go.mod h1:vr6Su+7cTlO45qkww3VDJlzDn0ctJvRgYbC2NvXHt+M=\ngolang.org/x/crypto v0.23.0/go.mod h1:CKFgDieR+mRhux2Lsu27y0fO304Db0wZe70UKqHu0v8=\ngolang.org/x/crypto v0.31.0/go.mod h1:kDsLvtWBEx7MV9tJOj9bnXsPbxwJQ6csT/x4KIN4Ssk=\ngolang.org/x/exp v0.0.0-20250305212735-054e65f0b394 h1:nDVHiLt8aIbd/VzvPWN6kSOPE7+F/fNFDSXLVYkE/Iw=\ngolang.org/x/exp v0.0.0-20250305212735-054e65f0b394/go.mod h1:sIifuuw/Yco/y6yb6+bDNfyeQ/MdPUy/hKEMYQV17cM=\ngolang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91VN4djpZkiMVwK6gcyfeH4XE8wZrZaV4=\ngolang.org/x/mod v0.8.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=\ngolang.org/x/mod v0.12.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=\ngolang.org/x/mod v0.15.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=\ngolang.org/x/mod v0.17.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=\ngolang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=\ngolang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod 
h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg=\ngolang.org/x/net v0.0.0-20220722155237-a158d28d115b/go.mod h1:XRhObCWvk6IyKnWLug+ECip1KBveYUHfp+8e9klMJ9c=\ngolang.org/x/net v0.6.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=\ngolang.org/x/net v0.9.0/go.mod h1:d48xBJpPfHeWQsugry2m+kC02ZBRGRgulfHnEXEuWns=\ngolang.org/x/net v0.10.0/go.mod h1:0qNGK6F8kojg2nk9dLZ2mShWaEBan6FAoqfSigmmuDg=\ngolang.org/x/net v0.15.0/go.mod h1:idbUs1IY1+zTqbi8yxTbhexhEEk5ur9LInksu6HrEpk=\ngolang.org/x/net v0.21.0/go.mod h1:bIjVDfnllIU7BJ2DNgfnXvpSvtn8VRwhlsaeUTyUS44=\ngolang.org/x/net v0.24.0/go.mod h1:2Q7sJY5mzlzWjKtYUEXSlBWCdyaioyXzRB2RtU8KVE8=\ngolang.org/x/net v0.25.0/go.mod h1:JkAGAh7GEvH74S6FOH42FLoXpXbE/aqXSrIQjXgsiwM=\ngolang.org/x/net v0.33.0/go.mod h1:HXLR5J+9DxmrqMwG9qjGCxZ+zKXxBru04zlTvWlWuN4=\ngolang.org/x/net v0.47.0 h1:Mx+4dIFzqraBXUugkia1OOvlD6LemFo1ALMHjrXDOhY=\ngolang.org/x/net v0.47.0/go.mod h1:/jNxtkgq5yWUGYkaZGqo27cfGZ1c5Nen03aYrrKpVRU=\ngolang.org/x/net v0.49.0 h1:eeHFmOGUTtaaPSGNmjBKpbng9MulQsJURQUAfUwY++o=\ngolang.org/x/net v0.49.0/go.mod h1:/ysNB2EvaqvesRkuLAyjI1ycPZlQHM3q01F02UY/MV8=\ngolang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=\ngolang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=\ngolang.org/x/sync v0.1.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=\ngolang.org/x/sync v0.3.0/go.mod h1:FU7BRWz2tNW+3quACPkgCx/L+uEAv1htQ0V83Z9Rj+Y=\ngolang.org/x/sync v0.6.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=\ngolang.org/x/sync v0.7.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=\ngolang.org/x/sync v0.10.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=\ngolang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=\ngolang.org/x/sys v0.0.0-20190916202348-b4ddaad3f8a3/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=\ngolang.org/x/sys 
v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=\ngolang.org/x/sys v0.0.0-20210615035016-665e8c7367d1/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=\ngolang.org/x/sys v0.0.0-20220520151302-bc2c85ada10a/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=\ngolang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=\ngolang.org/x/sys v0.0.0-20220811171246-fbc7d0a398ab/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=\ngolang.org/x/sys v0.1.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=\ngolang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=\ngolang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=\ngolang.org/x/sys v0.7.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=\ngolang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=\ngolang.org/x/sys v0.12.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=\ngolang.org/x/sys v0.17.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=\ngolang.org/x/sys v0.19.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=\ngolang.org/x/sys v0.20.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=\ngolang.org/x/sys v0.28.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=\ngolang.org/x/sys v0.38.0 h1:3yZWxaJjBmCWXqhN1qh02AkOnCQ1poK6oF+a7xWL6Gc=\ngolang.org/x/sys v0.38.0/go.mod h1:OgkHotnGiDImocRcuBABYBEXf8A9a87e/uXjp9XT3ks=\ngolang.org/x/sys v0.40.0 h1:DBZZqJ2Rkml6QMQsZywtnjnnGvHza6BTfYFWY9kjEWQ=\ngolang.org/x/sys v0.40.0/go.mod h1:OgkHotnGiDImocRcuBABYBEXf8A9a87e/uXjp9XT3ks=\ngolang.org/x/telemetry v0.0.0-20240228155512-f48c80bd79b2/go.mod h1:TeRTkGYfJXctD9OcfyVLyj2J3IxLnKwHJR8f4D8a3YE=\ngolang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=\ngolang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8=\ngolang.org/x/term 
v0.5.0/go.mod h1:jMB1sMXY+tzblOD4FWmEbocvup2/aLOaQEp7JmGp78k=\ngolang.org/x/term v0.7.0/go.mod h1:P32HKFT3hSsZrRxla30E9HqToFYAQPCMs/zFMBUFqPY=\ngolang.org/x/term v0.8.0/go.mod h1:xPskH00ivmX89bAKVGSKKtLOWNx2+17Eiy94tnKShWo=\ngolang.org/x/term v0.12.0/go.mod h1:owVbMEjm3cBLCHdkQu9b1opXd4ETQWc3BhuQGKgXgvU=\ngolang.org/x/term v0.17.0/go.mod h1:lLRBjIVuehSbZlaOtGMbcMncT+aqLLLmKrsjNrUguwk=\ngolang.org/x/term v0.19.0/go.mod h1:2CuTdWZ7KHSQwUzKva0cbMg6q2DMI3Mmxp+gKJbskEk=\ngolang.org/x/term v0.20.0/go.mod h1:8UkIAJTvZgivsXaD6/pH6U9ecQzZ45awqEOzuCvwpFY=\ngolang.org/x/term v0.27.0/go.mod h1:iMsnZpn0cago0GOrHO2+Y7u7JPn5AylBrcoWkElMTSM=\ngolang.org/x/term v0.37.0 h1:8EGAD0qCmHYZg6J17DvsMy9/wJ7/D/4pV/wfnld5lTU=\ngolang.org/x/term v0.37.0/go.mod h1:5pB4lxRNYYVZuTLmy8oR2BH8dflOR+IbTYFD8fi3254=\ngolang.org/x/term v0.39.0 h1:RclSuaJf32jOqZz74CkPA9qFuVTX7vhLlpfj/IGWlqY=\ngolang.org/x/term v0.39.0/go.mod h1:yxzUCTP/U+FzoxfdKmLaA0RV1WgE0VY7hXBwKtY/4ww=\ngolang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=\ngolang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=\ngolang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ=\ngolang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=\ngolang.org/x/text v0.9.0/go.mod h1:e1OnstbJyHTd6l/uOt8jFFHp6TRDWZR/bV3emEE/zU8=\ngolang.org/x/text v0.13.0/go.mod h1:TvPlkZtksWOMsz7fbANvkp4WM8x/WCo/om8BMLbz+aE=\ngolang.org/x/text v0.14.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU=\ngolang.org/x/text v0.15.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU=\ngolang.org/x/text v0.21.0/go.mod h1:4IBbMaMmOPCJ8SecivzSH54+73PCFmPWxNTLm+vZkEQ=\ngolang.org/x/text v0.31.0 h1:aC8ghyu4JhP8VojJ2lEHBnochRno1sgL6nEi9WGFGMM=\ngolang.org/x/text v0.31.0/go.mod h1:tKRAlv61yKIjGGHX/4tP1LTbc13YSec1pxVEWXzfoeM=\ngolang.org/x/text v0.33.0 h1:B3njUFyqtHDUI5jMn1YIr5B0IE2U0qck04r6d4KPAxE=\ngolang.org/x/text v0.33.0/go.mod 
h1:LuMebE6+rBincTi9+xWTY8TztLzKHc/9C1uBCG27+q8=\ngolang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=\ngolang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=\ngolang.org/x/tools v0.1.12/go.mod h1:hNGJHUnrk76NpqgfD5Aqm5Crs+Hm0VOH/i9J2+nxYbc=\ngolang.org/x/tools v0.6.0/go.mod h1:Xwgl3UAJ/d3gWutnCtw505GrjyAbvKui8lOU390QaIU=\ngolang.org/x/tools v0.13.0/go.mod h1:HvlwmtVNQAhOuCjW7xxvovg8wbNq7LwfXh/k7wXUl58=\ngolang.org/x/tools v0.21.1-0.20240508182429-e35e4ccd0d2d/go.mod h1:aiJjzUbINMkxbQROHiO6hDPo2LHcIPhhQsa9DLh0yGk=\ngolang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=\ngopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=\ngopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15 h1:YR8cESwS4TdDjEe65xsg0ogRM/Nc3DYOhEAlW+xobZo=\ngopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=\ngopkg.in/natefinch/lumberjack.v2 v2.2.1 h1:bBRl1b0OH9s/DuPhuXpNl+VtCaJXFZ5/uEFST95x9zc=\ngopkg.in/natefinch/lumberjack.v2 v2.2.1/go.mod h1:YD8tP3GAjkrDg1eZH7EGmyESg/lsYskCTPBJVb9jqSc=\ngopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=\ngopkg.in/yaml.v2 v2.4.0 h1:D8xgwECY7CYvx+Y2n4sBz93Jn9JRvxdiyyo8CTfuKaY=\ngopkg.in/yaml.v2 v2.4.0/go.mod h1:RDklbk79AGWmwhnvt/jBztapEOGDOx6ZbXqjP6csGnQ=\ngopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=\ngopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=\ngopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=\n"
  },
  {
    "path": "internal/compressor/compressor.go",
    "content": "package compressor\n\nimport (\n\t\"context\"\n\t\"fmt\"\n\t\"path/filepath\"\n\t\"strings\"\n\n\tsitter \"github.com/smacker/go-tree-sitter\"\n\t\"github.com/smacker/go-tree-sitter/bash\"\n\t\"github.com/smacker/go-tree-sitter/c\"\n\t\"github.com/smacker/go-tree-sitter/css\"\n\t\"github.com/smacker/go-tree-sitter/golang\"\n\t\"github.com/smacker/go-tree-sitter/javascript\"\n\t\"github.com/smacker/go-tree-sitter/python\"\n)\n\n// LanguageMap maps language identifiers to their Tree-sitter Language.\nvar LanguageMap = map[string]*sitter.Language{\n\t\"go\":         golang.GetLanguage(),\n\t\"python\":     python.GetLanguage(),\n\t\"javascript\": javascript.GetLanguage(),\n\t\"bash\":       bash.GetLanguage(),\n\t\"c\":          c.GetLanguage(),\n\t\"css\":        css.GetLanguage(),\n\t\"html\":       javascript.GetLanguage(), // Use JavaScript parser for HTML\n\t\"rust\":       javascript.GetLanguage(), // Use JavaScript parser for Rust\n\t\"java\":       javascript.GetLanguage(), // Use JavaScript parser for Java\n\t\"swift\":      javascript.GetLanguage(), // Use JavaScript parser for Swift\n}\n\n// IdentifyLanguage identifies the programming language of a file based on its extension.\nfunc IdentifyLanguage(filePath string) (string, error) {\n\text := strings.ToLower(filepath.Ext(filePath))\n\tswitch ext {\n\tcase \".go\":\n\t\treturn \"go\", nil\n\tcase \".py\":\n\t\treturn \"python\", nil\n\tcase \".js\", \".jsx\", \".mjs\", \".cjs\", \".ts\", \".tsx\":\n\t\treturn \"javascript\", nil\n\tcase \".sh\", \".bash\":\n\t\treturn \"bash\", nil\n\tcase \".c\", \".h\":\n\t\treturn \"c\", nil\n\tcase \".css\":\n\t\treturn \"css\", nil\n\tcase \".html\", \".htm\":\n\t\treturn \"html\", nil\n\tcase \".rs\":\n\t\treturn \"rust\", nil\n\tcase \".java\":\n\t\treturn \"java\", nil\n\tcase \".swift\":\n\t\treturn \"swift\", nil\n\t// Add more extensions and languages here\n\tdefault:\n\t\treturn \"\", fmt.Errorf(\"unsupported file extension: %s\", 
ext)\n\t}\n}\n\n// GetLanguage returns the Tree-sitter language for a given language identifier.\nfunc GetLanguage(langIdentifier string) (*sitter.Language, error) {\n\tlang, ok := LanguageMap[strings.ToLower(langIdentifier)]\n\tif !ok {\n\t\treturn nil, fmt.Errorf(\"unsupported language: %s\", langIdentifier)\n\t}\n\treturn lang, nil\n}\n\n// ParseSource parses the source code content using the appropriate Tree-sitter language.\nfunc ParseSource(content []byte, lang *sitter.Language) (*sitter.Tree, error) {\n\tparser := sitter.NewParser()\n\tparser.SetLanguage(lang)\n\ttree, err := parser.ParseCtx(context.Background(), nil, content)\n\tif err != nil {\n\t\treturn nil, fmt.Errorf(\"failed to parse source: %w\", err)\n\t}\n\treturn tree, nil\n}\n\n// QueryMap holds Tree-sitter queries for different languages.\nvar QueryMap = map[string]string{\n\t\"go\": `\n        (package_clause) @package\n        (import_declaration) @import\n        (type_declaration) @definition.type\n        (function_declaration name: (identifier) @definition.function) @definition.function.full\n        (method_declaration name: (field_identifier) @definition.method) @definition.method.full\n        (comment) @comment\n        `,\n\t\"python\": `\n        (import_statement) @import\n        (import_from_statement) @import\n        (function_definition name: (identifier) @definition.function) @definition.function.full\n        (class_definition name: (identifier) @definition.class) @definition.class.full\n        (comment) @comment\n        `,\n\t\"bash\": `\n        (function_definition name: (word) @definition.function) @definition.function.full\n        (command name: (command_name) @command_name) @command\n        (comment) @comment\n        `,\n\t\"c\": `\n        (preproc_include) @import\n        (function_definition declarator: (function_declarator declarator: (identifier) @definition.function)) @definition.function.full\n        (struct_specifier name: (type_identifier) 
@definition.struct) @definition.struct.full\n        (enum_specifier name: (type_identifier) @definition.enum) @definition.enum.full\n        (union_specifier name: (type_identifier) @definition.union) @definition.union.full\n        (type_definition) @definition.typedef\n        (comment) @comment\n        `,\n\t\"css\": `\n        (import_statement) @import\n        (rule_set) @rule_set\n        (media_statement) @media\n        (keyframes_statement) @keyframes\n        (declaration) @declaration\n        (comment) @comment\n        `,\n\t\"html\": `\n        ; HTML elements in JavaScript parser\n        ((comment) @comment)\n        ((regex) @regex)\n        ((string) @string)\n        ((template_string) @template_string)\n        ((identifier) @identifier)\n        ((property_identifier) @property)\n        `,\n\t\"rust\": `\n        ; Rust elements using JavaScript parser\n        ((comment) @comment)\n        ((string) @string)\n        ((regex) @regex)\n        ((template_string) @template_string)\n        ((identifier) @identifier)\n        ((property_identifier) @property)\n        `,\n\t\"java\": `\n        ; Java elements using JavaScript parser\n        ((comment) @comment)\n        ((string) @string)\n        ((regex) @regex)\n        ((template_string) @template_string)\n        ((identifier) @identifier)\n        ((property_identifier) @property)\n        `,\n\t\"swift\": `\n        ; Swift elements using JavaScript parser\n        ((comment) @comment)\n        ((string) @string)\n        ((regex) @regex)\n        ((template_string) @template_string)\n        ((identifier) @identifier)\n        ((property_identifier) @property)\n        `,\n\t\"javascript\": `\n        (import_statement) @import\n        (comment) @comment\n        (method_definition name: (_) @_.name) @definition.method.full\n\n        ; === Definitions whose bodies should be stripped ===\n\n        ; Case 1a: Exported NAMED function/class/generator (e.g., export function foo() 
{})\n        ; Captures the export_statement node. 'declaration' is the field name for the exported item.\n        (export_statement\n          declaration: (function_declaration name: (identifier))\n        ) @definition.function.full\n        (export_statement\n          declaration: (generator_function_declaration name: (identifier))\n        ) @definition.function.full\n        (export_statement\n          declaration: (class_declaration name: (identifier))\n        ) @definition.class.full\n\n        ; Case 1b: Export DEFAULT NAMED function/class/generator (e.g., export default function foo() {})\n        ; Captures the export_statement node. The declaration is a direct child after the 'default' keyword.\n        (export_statement\n          \"default\" ; Matches the anonymous 'default' keyword child node by its string content\n          (function_declaration name: (identifier))\n        ) @definition.function.full\n        (export_statement\n          \"default\"\n          (generator_function_declaration name: (identifier))\n        ) @definition.function.full\n        (export_statement\n          \"default\"\n          (class_declaration name: (identifier))\n        ) @definition.class.full\n\n        ; Case 2: Standalone (non-exported by above rules) NAMED function/class/generator\n        ; Captures the function_declaration or class_declaration node itself.\n        (function_declaration name: (identifier)) @definition.function.full\n        (generator_function_declaration name: (identifier)) @definition.function.full\n        (class_declaration name: (identifier)) @definition.class.full\n\n        ; Case 3: Arrow functions assigned to variables (const/let/var)\n        ; The whole lexical_declaration or variable_declaration is captured if it contains an arrow function with a block body.\n        (lexical_declaration\n          (variable_declarator\n            value: (arrow_function body: (statement_block))\n          )\n        ) 
@definition.function.full\n        (variable_declaration\n          (variable_declarator\n            value: (arrow_function body: (statement_block))\n          )\n        ) @definition.function.full\n\n        ; Case 4: Arrow functions with expression bodies assigned to variables\n        ; These are arrow functions that don't have a statement_block body (e.g., const myArrow = () => expression)\n        (lexical_declaration\n          (variable_declarator\n            name: (identifier) @arrow.name\n            value: (arrow_function) @arrow.function\n          )\n        ) @definition.function.full\n        (variable_declaration\n          (variable_declarator\n            name: (identifier) @arrow.name\n            value: (arrow_function) @arrow.function\n          )\n        ) @definition.function.full\n\n        ; Case 5: Anonymous default exported functions and classes\n        ; These are functions and classes without names that are exported as default\n        (export_statement\n          \"default\"\n          (function_declaration) @anon.function\n        ) @definition.function.full\n        (export_statement\n          \"default\"\n          (class_declaration) @anon.class\n        ) @definition.class.full\n\n        ; === Other export forms to be kept whole ===\n        ; Capture name is @export.other\n\n        ; export const x = 1;, export let y = 2; (declaration is lexical_declaration)\n        (export_statement declaration: (lexical_declaration)) @export.other\n        ; export var z = 3; (declaration is variable_declaration)\n        (export_statement declaration: (variable_declaration)) @export.other\n\n        ; export { foo, bar };, export { foo as bar } from 'module';, export * from 'module';\n        (export_statement (export_clause)) @export.other\n\n        ; export default foo; (where foo is an identifier/expression - 'value' is the field name for these)\n        (export_statement value: (identifier)) @export.other\n        
(export_statement value: (string)) @export.other\n        (export_statement value: (number)) @export.other\n        (export_statement value: (object)) @export.other\n        (export_statement value: (array)) @export.other\n        (export_statement value: (arrow_function)) @export.other\n        `,\n}\n\n// GetQuery retrieves a Tree-sitter query for a given language identifier.\nfunc GetQuery(languageIdentifier string) (string, error) {\n\tquery, ok := QueryMap[strings.ToLower(languageIdentifier)]\n\tif !ok {\n\t\treturn \"\", fmt.Errorf(\"no query found for language: %s\", languageIdentifier)\n\t}\n\treturn query, nil\n}\n\n// CompileQuery compiles a query string for a given language.\nfunc CompileQuery(queryStr string, lang *sitter.Language) (*sitter.Query, error) {\n\tquery, err := sitter.NewQuery([]byte(queryStr), lang)\n\tif err != nil {\n\t\treturn nil, fmt.Errorf(\"failed to compile query: %w\", err)\n\t}\n\treturn query, nil\n}\n\n// ExecuteQuery executes a compiled query against a syntax tree.\nfunc ExecuteQuery(tree *sitter.Tree, query *sitter.Query, source []byte) ([]*sitter.QueryMatch, error) {\n\tqc := sitter.NewQueryCursor()\n\tqc.Exec(query, tree.RootNode())\n\tvar matches []*sitter.QueryMatch\n\tfor {\n\t\tmatch, ok := qc.NextMatch()\n\t\tif !ok {\n\t\t\tbreak\n\t\t}\n\t\tisDuplicate := false\n\t\t// Match IDs are unique per cursor run, so comparing IDs would never flag a\n\t\t// duplicate; compare the pattern index and the captured nodes instead.\n\t\tfor _, existingMatch := range matches {\n\t\t\tif existingMatch.PatternIndex == match.PatternIndex {\n\t\t\t\tif len(existingMatch.Captures) == len(match.Captures) {\n\t\t\t\t\tallCapturesSame := true\n\t\t\t\t\tfor i := range match.Captures {\n\t\t\t\t\t\tif existingMatch.Captures[i].Index != match.Captures[i].Index ||\n\t\t\t\t\t\t\texistingMatch.Captures[i].Node.StartByte() != match.Captures[i].Node.StartByte() ||\n\t\t\t\t\t\t\texistingMatch.Captures[i].Node.EndByte() != match.Captures[i].Node.EndByte() {\n\t\t\t\t\t\t\tallCapturesSame = false\n\t\t\t\t\t\t\tbreak\n\t\t\t\t\t\t}\n\t\t\t\t\t}\n\t\t\t\t\tif allCapturesSame 
{\n\t\t\t\t\t\tisDuplicate = true\n\t\t\t\t\t\tbreak\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\t\tif !isDuplicate {\n\t\t\tmatches = append(matches, match)\n\t\t}\n\t}\n\treturn matches, nil\n}\n\n// CodeChunk represents a captured piece of code.\ntype CodeChunk struct {\n\tContent      string\n\tStartByte    uint32\n\tEndByte      uint32\n\tOriginalLine int\n}\n\nfunc isNamedDeclarationType(n *sitter.Node) bool {\n\tif n == nil {\n\t\treturn false\n\t}\n\tswitch n.Type() {\n\tcase \"function_declaration\", \"generator_function_declaration\", \"class_declaration\":\n\t\tnameNode := n.ChildByFieldName(\"name\")\n\t\treturn nameNode != nil && nameNode.Type() == \"identifier\"\n\tdefault:\n\t\treturn false\n\t}\n}\n\nfunc LogCaptures(matches []*sitter.QueryMatch, query *sitter.Query, source []byte) {\n\tfor _, match := range matches {\n\t\tfor _, capture := range match.Captures {\n\t\t\tcaptureName := query.CaptureNameForId(capture.Index)\n\t\t\tnodeContent := capture.Node.Content(source)\n\t\t\tfmt.Printf(\"Capture: @%s, Node Type: %s, Content: %s\\n\",\n\t\t\t\tcaptureName, capture.Node.Type(), strings.TrimSpace(nodeContent))\n\t\t}\n\t}\n}\n"
  },
  {
    "path": "internal/compressor/compressor_test.go",
    "content": "package compressor\n\nimport (\n\t\"os\"\n\t\"strings\"\n\t\"testing\"\n)\n\nfunc TestGenericCompressor_Compress_Go(t *testing.T) {\n\tcompressor := NewGenericCompressor()\n\tgoCode := `\npackage main\n\nimport \"fmt\"\n\n// This is a comment\ntype MyStruct struct {\nFieldA int\nFieldB string\n}\n\nfunc (s *MyStruct) MyMethod(val int) string {\n// Method body\nif val > 0 {\nreturn fmt.Sprintf(\"Positive: %d\", val)\n}\nreturn \"Zero or Negative\"\n}\n\nfunc main() {\n// Main function body\ninstance := MyStruct{FieldA: 1, FieldB: \"test\"}\nfmt.Println(instance.MyMethod(5))\nfmt.Println(\"Hello, world!\")\n}\n`\n\texpectedCompressedParts := []string{\n\t\t\"package main\",\n\t\t\"import \\\"fmt\\\"\",\n\t\t\"// This is a comment\",\n\t\t\"type MyStruct struct\",\n\t\t\"func (s *MyStruct) MyMethod(val int) string { ... }\",\n\t\t\"func main() { ... }\",\n\t}\n\n\tcompressed, err := compressor.Compress([]byte(goCode), \"go\")\n\tif err != nil {\n\t\tt.Fatalf(\"Compress failed for Go: %v\", err)\n\t}\n\n\tt.Logf(\"Compressed Go code:\\n%s\", compressed)\n\n\tfor _, part := range expectedCompressedParts {\n\t\tif !strings.Contains(compressed, part) {\n\t\t\tt.Errorf(\"Compressed output does not contain expected part: %s\", part)\n\t\t}\n\t}\n\n\tif strings.Contains(compressed, `return fmt.Sprintf(\"Positive: %d\", val)`) {\n\t\tt.Errorf(\"Compressed output unexpectedly contains function body content\")\n\t}\n\tif strings.Contains(compressed, `fmt.Println(\"Hello, world!\")`) {\n\t\tt.Errorf(\"Compressed output unexpectedly contains main function body content\")\n\t}\n}\n\nfunc TestGenericCompressor_Compress_Python(t *testing.T) {\n\tcompressor := NewGenericCompressor()\n\tpythonCode := `\n# This is a Python comment\nimport os\nfrom sys import argv\n\nclass MyClass:\n    \"\"\"\n    A simple class\n    \"\"\"\n    def __init__(self, name):\n        self.name = name\n\n    def greet(self, message):\n        \"\"\"Greets the person.\"\"\"\n        # Method 
body\n        print(f\"{message}, {self.name}!\")\n        if len(self.name) > 3:\n            print(\"Long name\")\n        return True\n\ndef my_function(x, y):\n    # Function body\n    result = x + y\n    print(f\"Result is {result}\")\n    return result\n\nif __name__ == \"__main__\":\n    c = MyClass(\"Test\")\n    c.greet(\"Hello\")\n    my_function(1, 2)\n`\n\texpectedCompressedParts := []string{\n\t\t\"# This is a Python comment\",\n\t\t\"import os\",\n\t\t\"from sys import argv\",\n\t\t\"class MyClass: { ... } # Body removed\",\n\t\t\"def __init__(self, name): { ... } # Body removed\",\n\t\t\"def greet(self, message): { ... } # Body removed\",\n\t\t\"def my_function(x, y): { ... } # Body removed\",\n\t}\n\n\tcompressed, err := compressor.Compress([]byte(pythonCode), \"python\")\n\tif err != nil {\n\t\tt.Fatalf(\"Compress failed for Python: %v\", err)\n\t}\n\n\tt.Logf(\"Compressed Python code:\\n%s\", compressed)\n\n\tfor _, part := range expectedCompressedParts {\n\t\tif !strings.Contains(compressed, part) {\n\t\t\tt.Errorf(\"Compressed Python output does not contain expected part: '%s'\", part)\n\t\t}\n\t}\n\n\tif strings.Contains(compressed, `self.name = name`) {\n\t\tt.Errorf(\"Compressed Python output unexpectedly contains __init__ body content\")\n\t}\n\tif strings.Contains(compressed, `print(f\"{message}, {self.name}!\")`) {\n\t\tt.Errorf(\"Compressed Python output unexpectedly contains greet method body content\")\n\t}\n\tif strings.Contains(compressed, `result = x + y`) {\n\t\tt.Errorf(\"Compressed Python output unexpectedly contains my_function body content\")\n\t}\n\tif strings.Contains(compressed, `c = MyClass(\"Test\")`) {\n\t\tt.Errorf(\"Compressed Python output unexpectedly contains __main__ block content\")\n\t}\n}\n\nfunc TestGenericCompressor_Compress_JavaScript(t *testing.T) {\n\tcompressor := NewGenericCompressor()\n\n\t// Read test file content\n\tjsCode, err := os.ReadFile(\"testdata/example.js\")\n\tif err != nil 
{\n\t\tt.Fatalf(\"Failed to read test file: %v\", err)\n\t}\n\n\texpectedPlaceholder := \" { ... } // Body removed\" // Generic placeholder for JS\n\n\t// Update expected parts for JS with its specific placeholder\n\texpectedCompressedParts := []string{\n\t\t\"// This is a JavaScript comment\",\n\t\t\"import { something } from 'module';\",\n\t\t\"export class MyJSClass\" + expectedPlaceholder,\n\t\t// constructor is a method_definition, its signature includes 'constructor'\n\t\t\"constructor(name)\" + expectedPlaceholder,\n\t\t\"greet(message)\" + expectedPlaceholder,\n\t\t\"export function myJSFunction(x, y)\" + expectedPlaceholder,\n\t\t\"const myArrowFunc = (a, b) =>\" + expectedPlaceholder,\n\t\t// Arrow function with expression body should be compressed\n\t\t\"const myExpressionArrow = (x) =>\" + expectedPlaceholder,\n\t\t\"function* myGenerator()\" + expectedPlaceholder,\n\t\t\"export const myVar = 42;\",            // Should be kept by @export.other\n\t\t\"constructor()\" + expectedPlaceholder, // Constructor of anonymous default class\n\t}\n\n\tcompressed, err := compressor.Compress(jsCode, \"javascript\")\n\tif err != nil {\n\t\tt.Fatalf(\"Compress failed for JavaScript: %v\", err)\n\t}\n\n\tt.Logf(\"Compressed JavaScript code:\\n%s\", compressed)\n\n\tfor _, part := range expectedCompressedParts {\n\t\tif !strings.Contains(compressed, part) {\n\t\t\tt.Errorf(\"Compressed JS output does not contain expected part: '%s'\", part)\n\t\t}\n\t}\n\n\tif strings.Contains(compressed, `this.name = name;`) {\n\t\tt.Errorf(\"Compressed JS output unexpectedly contains constructor body content\")\n\t}\n\tif strings.Contains(compressed, `console.log(message + \", \" + this.name + \"!\");`) {\n\t\tt.Errorf(\"Compressed JS output unexpectedly contains greet method body content\")\n\t}\n\tif strings.Contains(compressed, `const result = x + y;`) {\n\t\tt.Errorf(\"Compressed JS output unexpectedly contains myJSFunction body content\")\n\t}\n\tif strings.Contains(compressed, 
`return a * b;`) {\n\t\tt.Errorf(\"Compressed JS output unexpectedly contains myArrowFunc body content\")\n\t}\n\tif strings.Contains(compressed, `yield 1;`) {\n\t\tt.Errorf(\"Compressed JS output unexpectedly contains generator function body content\")\n\t}\n\tif strings.Contains(compressed, `this.x = 1;`) {\n\t\tt.Errorf(\"Compressed JS output unexpectedly contains anonymous default class body content\")\n\t}\n\tif strings.Contains(compressed, `x * x`) {\n\t\tt.Errorf(\"Compressed JS output unexpectedly contains expression arrow function body content\")\n\t}\n}\n\nfunc TestGenericCompressor_Compress_JavaScript_AnonDefaultFunction(t *testing.T) {\n\tcompressor := NewGenericCompressor()\n\n\t// Read test file content\n\tjsCode, err := os.ReadFile(\"testdata/example_anon_func.js\")\n\tif err != nil {\n\t\tt.Fatalf(\"Failed to read test file: %v\", err)\n\t}\n\n\texpectedPlaceholder := \" { ... } // Body removed\" // Generic placeholder for JS\n\n\t// Update expected parts for JS with its specific placeholder\n\texpectedCompressedParts := []string{\n\t\t\"// This is a JavaScript comment\",\n\t\t\"import { something } from 'module';\",\n\t\t// Arrow function with expression body should be compressed\n\t\t\"const myExpressionArrow = (x) =>\" + expectedPlaceholder,\n\t}\n\n\tcompressed, err := compressor.Compress(jsCode, \"javascript\")\n\tif err != nil {\n\t\tt.Fatalf(\"Compress failed for JavaScript: %v\", err)\n\t}\n\n\tt.Logf(\"Compressed JavaScript code:\\n%s\", compressed)\n\n\tfor _, part := range expectedCompressedParts {\n\t\tif !strings.Contains(compressed, part) {\n\t\t\tt.Errorf(\"Compressed JS output does not contain expected part: '%s'\", part)\n\t\t}\n\t}\n\n\tif strings.Contains(compressed, `console.log(\"Anon default func body\");`) {\n\t\tt.Errorf(\"Compressed JS output unexpectedly contains anonymous default function body content\")\n\t}\n\tif strings.Contains(compressed, `x * x`) {\n\t\tt.Errorf(\"Compressed JS output unexpectedly contains expression 
arrow function body content\")\n\t}\n}\n\nfunc TestGenericCompressor_Compress_Bash(t *testing.T) {\n\tcompressor := NewGenericCompressor()\n\n\t// Read test file content\n\tbashCode, err := os.ReadFile(\"testdata/example.sh\")\n\tif err != nil {\n\t\tt.Fatalf(\"Failed to read test file: %v\", err)\n\t}\n\n\texpectedPlaceholder := \" { ... } # Body removed\" // Generic placeholder for Bash\n\n\t// Update expected parts for Bash\n\texpectedCompressedParts := []string{\n\t\t\"# This is a bash comment\",\n\t\t\"function greet()\" + expectedPlaceholder,\n\t\t\"process_file()\" + expectedPlaceholder,\n\t\t\"echo \\\"Starting script\\\"\",\n\t}\n\n\tcompressed, err := compressor.Compress(bashCode, \"bash\")\n\tif err != nil {\n\t\tt.Fatalf(\"Compress failed for Bash: %v\", err)\n\t}\n\n\tt.Logf(\"Compressed Bash code:\\n%s\", compressed)\n\n\tfor _, part := range expectedCompressedParts {\n\t\tif !strings.Contains(compressed, part) {\n\t\t\tt.Errorf(\"Compressed Bash output does not contain expected part: '%s'\", part)\n\t\t}\n\t}\n\n\tif strings.Contains(compressed, `local name=$1`) {\n\t\tt.Errorf(\"Compressed Bash output unexpectedly contains function body content\")\n\t}\n}\n\nfunc TestGenericCompressor_Compress_C(t *testing.T) {\n\tcompressor := NewGenericCompressor()\n\n\t// Read test file content\n\tcCode, err := os.ReadFile(\"testdata/example.c\")\n\tif err != nil {\n\t\tt.Fatalf(\"Failed to read test file: %v\", err)\n\t}\n\n\texpectedPlaceholder := \" { ... 
} // Body removed\" // Generic placeholder for C\n\n\t// Update expected parts for C\n\texpectedCompressedParts := []string{\n\t\t\"#include <stdio.h>\",\n\t\t\"#include <stdlib.h>\",\n\t\t\"#include <string.h>\",\n\t\t\"struct Person {\",\n\t\t\"enum Color {\",\n\t\t\"union Data {\",\n\t\t\"typedef unsigned long int UINT32;\",\n\t\t\"int main(int argc, char *argv[])\" + expectedPlaceholder,\n\t\t\"void greet(const char* name)\" + expectedPlaceholder,\n\t\t\"int calculate_sum(int a, int b)\" + expectedPlaceholder,\n\t}\n\n\tcompressed, err := compressor.Compress(cCode, \"c\")\n\tif err != nil {\n\t\tt.Fatalf(\"Compress failed for C: %v\", err)\n\t}\n\n\tt.Logf(\"Compressed C code:\\n%s\", compressed)\n\n\tfor _, part := range expectedCompressedParts {\n\t\tif !strings.Contains(compressed, part) {\n\t\t\tt.Errorf(\"Compressed C output does not contain expected part: '%s'\", part)\n\t\t}\n\t}\n\n\tif strings.Contains(compressed, `strcpy(person.name, \"John\");`) {\n\t\tt.Errorf(\"Compressed C output unexpectedly contains function body content\")\n\t}\n}\n\nfunc TestGenericCompressor_Compress_CSS(t *testing.T) {\n\tcompressor := NewGenericCompressor()\n\n\t// Read test file content\n\tcssCode, err := os.ReadFile(\"testdata/example.css\")\n\tif err != nil {\n\t\tt.Fatalf(\"Failed to read test file: %v\", err)\n\t}\n\n\t// Update expected parts for CSS\n\texpectedCompressedParts := []string{\n\t\t\"/* This is a CSS comment */\",\n\t\t\"@import url('https://fonts.googleapis.com/css2?family=Roboto:wght@400;700&display=swap')\",\n\t\t\":root {\",\n\t\t\"body {\",\n\t\t\"header {\",\n\t\t\"nav {\",\n\t\t\".card {\",\n\t\t\".button {\",\n\t\t\"@media (max-width: 768px) {\",\n\t\t\"@keyframes fadeIn {\",\n\t}\n\n\tcompressed, err := compressor.Compress(cssCode, \"css\")\n\tif err != nil {\n\t\tt.Fatalf(\"Compress failed for CSS: %v\", err)\n\t}\n\n\tt.Logf(\"Compressed CSS code:\\n%s\", compressed)\n\n\tfor _, part := range expectedCompressedParts {\n\t\tif 
!strings.Contains(compressed, part) {\n\t\t\tt.Errorf(\"Compressed CSS output does not contain expected part: '%s'\", part)\n\t\t}\n\t}\n}\n\nfunc TestGenericCompressor_Compress_HTML(t *testing.T) {\n\tcompressor := NewGenericCompressor()\n\n\t// Read test file content\n\thtmlCode, err := os.ReadFile(\"testdata/example.html\")\n\tif err != nil {\n\t\tt.Fatalf(\"Failed to read test file: %v\", err)\n\t}\n\n\t// For HTML, we're just testing that the compression doesn't fail\n\t// Since we're using JavaScript parser for HTML, the results may vary\n\tcompressed, err := compressor.Compress(htmlCode, \"html\")\n\tif err != nil {\n\t\tt.Fatalf(\"Compress failed for HTML: %v\", err)\n\t}\n\n\tt.Logf(\"Compressed HTML code:\\n%s\", compressed)\n\n\t// Check that script content is properly handled\n\tif strings.Contains(compressed, \"document.addEventListener('DOMContentLoaded', function()\") {\n\t\tt.Errorf(\"Compressed HTML output unexpectedly contains script content that should be compressed\")\n\t}\n\n\t// Test passes as long as compression doesn't fail\n\tt.Log(\"HTML compression completed without errors\")\n}\n\nfunc TestGenericCompressor_Compress_Rust(t *testing.T) {\n\tcompressor := NewGenericCompressor()\n\n\t// Read test file content\n\trustCode, err := os.ReadFile(\"testdata/example.rs\")\n\tif err != nil {\n\t\tt.Fatalf(\"Failed to read test file: %v\", err)\n\t}\n\n\t// For Rust, we're using JavaScript parser, so we're just testing that the compression doesn't fail\n\tcompressed, err := compressor.Compress(rustCode, \"rust\")\n\tif err != nil {\n\t\tt.Fatalf(\"Compress failed for Rust: %v\", err)\n\t}\n\n\tt.Logf(\"Compressed Rust code:\\n%s\", compressed)\n\n\t// Check for expected parts\n\texpectedParts := []string{\n\t\t\"// This is a Rust comment\",\n\t\t\"use std::collections::HashMap;\",\n\t\t\"use std::io::{self, Read, Write};\",\n\t\t\"pub struct Person {\",\n\t\t\"pub trait Printable {\",\n\t\t\"pub enum Status {\",\n\t}\n\n\tfor _, part := range 
expectedParts {\n\t\tif !strings.Contains(compressed, part) {\n\t\t\tt.Errorf(\"Compressed Rust output does not contain expected part: '%s'\", part)\n\t\t}\n\t}\n\n\t// Test passes as long as compression doesn't fail\n\tt.Log(\"Rust compression completed without errors\")\n}\n\nfunc TestGenericCompressor_Compress_Java(t *testing.T) {\n\tcompressor := NewGenericCompressor()\n\n\t// Read test file content\n\tjavaCode, err := os.ReadFile(\"testdata/example.java\")\n\tif err != nil {\n\t\tt.Fatalf(\"Failed to read test file: %v\", err)\n\t}\n\n\t// For Java, we're using JavaScript parser, so we're just testing that the compression doesn't fail\n\tcompressed, err := compressor.Compress(javaCode, \"java\")\n\tif err != nil {\n\t\tt.Fatalf(\"Compress failed for Java: %v\", err)\n\t}\n\n\tt.Logf(\"Compressed Java code:\\n%s\", compressed)\n\n\t// Check for expected parts\n\texpectedParts := []string{\n\t\t\"// This is a Java comment\",\n\t\t\"package com.example.demo;\",\n\t\t\"import java.util.ArrayList;\",\n\t\t\"public class Person {\",\n\t\t\"interface Printable {\",\n\t\t\"enum Status {\",\n\t}\n\n\tfor _, part := range expectedParts {\n\t\tif !strings.Contains(compressed, part) {\n\t\t\tt.Errorf(\"Compressed Java output does not contain expected part: '%s'\", part)\n\t\t}\n\t}\n\n\t// Test passes as long as compression doesn't fail\n\tt.Log(\"Java compression completed without errors\")\n}\n\nfunc TestGenericCompressor_Compress_Swift(t *testing.T) {\n\tcompressor := NewGenericCompressor()\n\n\t// Read test file content\n\tswiftCode, err := os.ReadFile(\"testdata/example.swift\")\n\tif err != nil {\n\t\tt.Fatalf(\"Failed to read test file: %v\", err)\n\t}\n\n\t// For Swift, we're using JavaScript parser, so we're just testing that the compression doesn't fail\n\tcompressed, err := compressor.Compress(swiftCode, \"swift\")\n\tif err != nil {\n\t\tt.Fatalf(\"Compress failed for Swift: %v\", err)\n\t}\n\n\tt.Logf(\"Compressed Swift code:\\n%s\", compressed)\n\n\t// Check 
for expected parts\n\texpectedParts := []string{\n\t\t\"// This is a Swift comment\",\n\t\t\"import Foundation\",\n\t\t\"struct Address {\",\n\t\t\"class Person {\",\n\t\t\"protocol Printable {\",\n\t\t\"enum Status {\",\n\t}\n\n\tfor _, part := range expectedParts {\n\t\tif !strings.Contains(compressed, part) {\n\t\t\tt.Errorf(\"Compressed Swift output does not contain expected part: '%s'\", part)\n\t\t}\n\t}\n\n\t// Test passes as long as compression doesn't fail\n\tt.Log(\"Swift compression completed without errors\")\n}\n\nfunc TestIdentifyLanguage(t *testing.T) {\n\ttests := []struct {\n\t\tfilePath      string\n\t\texpectedLang  string\n\t\texpectedError bool\n\t}{\n\t\t{\"main.go\", \"go\", false},\n\t\t{\"script.py\", \"python\", false},\n\t\t{\"app.js\", \"javascript\", false},\n\t\t{\"style.css\", \"css\", false},\n\t\t{\"script.sh\", \"bash\", false},\n\t\t{\"header.h\", \"c\", false},\n\t\t{\"program.c\", \"c\", false},\n\t\t{\"index.html\", \"html\", false},\n\t\t{\"page.htm\", \"html\", false},\n\t\t{\"main.rs\", \"rust\", false},\n\t\t{\"App.java\", \"java\", false},\n\t\t{\"Main.swift\", \"swift\", false},\n\t\t{\"README.md\", \"\", true},\n\t}\n\n\tfor _, tt := range tests {\n\t\tlang, err := IdentifyLanguage(tt.filePath)\n\t\tif tt.expectedError {\n\t\t\tif err == nil {\n\t\t\t\tt.Errorf(\"IdentifyLanguage(%s): expected error, got nil\", tt.filePath)\n\t\t\t}\n\t\t} else {\n\t\t\tif err != nil {\n\t\t\t\tt.Errorf(\"IdentifyLanguage(%s): unexpected error: %v\", tt.filePath, err)\n\t\t\t}\n\t\t\tif lang != tt.expectedLang {\n\t\t\t\tt.Errorf(\"IdentifyLanguage(%s): expected lang %s, got %s\", tt.filePath, tt.expectedLang, lang)\n\t\t\t}\n\t\t}\n\t}\n}\n\nfunc TestGetLanguage(t *testing.T) {\n\t_, err := GetLanguage(\"go\")\n\tif err != nil {\n\t\tt.Errorf(\"GetLanguage(go) failed: %v\", err)\n\t}\n\t_, err = GetLanguage(\"nonexistent\")\n\tif err == nil {\n\t\tt.Errorf(\"GetLanguage(nonexistent) expected error, got nil\")\n\t}\n}\n\nfunc 
TestParseSource_ValidGo(t *testing.T) {\n\tcontent := []byte(\"package main\\nfunc main() {}\")\n\tlang, _ := GetLanguage(\"go\")\n\ttree, err := ParseSource(content, lang)\n\tif err != nil {\n\t\tt.Fatalf(\"ParseSource failed for valid Go: %v\", err)\n\t}\n\tif tree == nil {\n\t\tt.Fatalf(\"ParseSource returned nil tree for valid Go\")\n\t}\n\tdefer tree.Close()\n\tif tree.RootNode() == nil {\n\t\tt.Errorf(\"ParseSource returned tree with nil root node\")\n\t}\n}\n\nfunc TestParseSource_InvalidGo(t *testing.T) {\n\tcontent := []byte(\"package main\\nfunc main() {\")\n\tlang, _ := GetLanguage(\"go\")\n\ttree, err := ParseSource(content, lang)\n\tif err != nil {\n\t\tt.Fatalf(\"ParseSource returned an unexpected error for malformed Go: %v\", err)\n\t}\n\tif tree == nil {\n\t\tt.Fatalf(\"ParseSource returned nil tree for malformed Go\")\n\t}\n\tdefer tree.Close()\n\tif tree.RootNode().HasError() {\n\t\tt.Logf(\"Malformed Go code parsed with errors, as expected.\")\n\t} else {\n\t\tt.Logf(\"Malformed Go code parsed without explicit error nodes (might be recovered by parser).\")\n\t}\n}\n\nfunc TestGetQuery_Go(t *testing.T) {\n\tqueryStr, err := GetQuery(\"go\")\n\tif err != nil {\n\t\tt.Fatalf(\"GetQuery(go) failed: %v\", err)\n\t}\n\tif queryStr == \"\" {\n\t\tt.Errorf(\"GetQuery(go) returned empty query string\")\n\t}\n}\n\nfunc TestCompileQuery_ValidGoQuery(t *testing.T) {\n\tlang, _ := GetLanguage(\"go\")\n\tqueryStr, _ := GetQuery(\"go\")\n\tquery, err := CompileQuery(queryStr, lang)\n\tif err != nil {\n\t\tt.Fatalf(\"CompileQuery failed for valid Go query: %v\", err)\n\t}\n\tif query == nil {\n\t\tt.Fatalf(\"CompileQuery returned nil query for valid Go query\")\n\t}\n\tdefer query.Close()\n}\n\nfunc TestExecuteQuery_Go(t *testing.T) {\n\tcontent := []byte(\"package main\\nimport \\\"fmt\\\"\\nfunc main() { fmt.Println(\\\"hi\\\") }\")\n\tlang, _ := GetLanguage(\"go\")\n\ttree, _ := ParseSource(content, lang)\n\tdefer tree.Close()\n\tqueryStr, _ := 
GetQuery(\"go\")\n\tquery, _ := CompileQuery(queryStr, lang)\n\tdefer query.Close()\n\n\tmatches, err := ExecuteQuery(tree, query, content)\n\tif err != nil {\n\t\tt.Fatalf(\"ExecuteQuery failed for Go: %v\", err)\n\t}\n\tif len(matches) == 0 {\n\t\tt.Errorf(\"ExecuteQuery returned no matches for basic Go code\")\n\t}\n\tfoundPackage := false\n\tfoundImport := false\n\tfoundFunc := false\n\tfor _, match := range matches {\n\t\tfor _, capture := range match.Captures {\n\t\t\tcaptureName := query.CaptureNameForId(capture.Index)\n\t\t\tif captureName == \"package\" {\n\t\t\t\tfoundPackage = true\n\t\t\t}\n\t\t\tif captureName == \"import\" {\n\t\t\t\tfoundImport = true\n\t\t\t}\n\t\t\tif captureName == \"definition.function\" {\n\t\t\t\tfoundFunc = true\n\t\t\t}\n\t\t}\n\t}\n\tif !foundPackage {\n\t\tt.Errorf(\"Expected @package capture, not found\")\n\t}\n\tif !foundImport {\n\t\tt.Errorf(\"Expected @import capture, not found\")\n\t}\n\tif !foundFunc {\n\t\tt.Errorf(\"Expected @definition.function capture, not found\")\n\t}\n}\n"
  },
  {
    "path": "internal/compressor/genericCompressor.go",
    "content": "package compressor\n\nimport (\n\t\"fmt\"\n\t\"regexp\"\n\t\"sort\"\n\t\"strings\"\n\n\tsitter \"github.com/smacker/go-tree-sitter\"\n)\n\n// GenericCompressor handles the compression of source code.\ntype GenericCompressor struct{}\n\n// NewGenericCompressor creates a new GenericCompressor.\nfunc NewGenericCompressor() *GenericCompressor {\n\treturn &GenericCompressor{}\n}\n\n// Compress takes source code content and a language identifier,\n// and returns the compressed code as a string.\nfunc (gc *GenericCompressor) Compress(content []byte, languageIdentifier string) (string, error) {\n\t// Special cases for languages that need custom handling\n\tswitch languageIdentifier {\n\tcase \"html\":\n\t\treturn gc.compressHTML(content), nil\n\tcase \"rust\":\n\t\treturn gc.compressRust(content), nil\n\tcase \"java\":\n\t\treturn gc.compressJava(content), nil\n\tcase \"swift\":\n\t\treturn gc.compressSwift(content), nil\n\t}\n\n\tlang, err := GetLanguage(languageIdentifier)\n\tif err != nil {\n\t\treturn \"\", fmt.Errorf(\"could not get language for '%s': %w\", languageIdentifier, err)\n\t}\n\n\ttree, err := ParseSource(content, lang)\n\tif err != nil {\n\t\treturn \"\", fmt.Errorf(\"could not parse source for '%s': %w\", languageIdentifier, err)\n\t}\n\tdefer tree.Close()\n\n\tqueryStr, err := GetQuery(languageIdentifier)\n\tif err != nil {\n\t\treturn \"\", fmt.Errorf(\"could not get query for '%s': %w\", languageIdentifier, err)\n\t}\n\n\tquery, err := CompileQuery(queryStr, lang)\n\tif err != nil {\n\t\treturn \"\", fmt.Errorf(\"could not compile query for '%s': %w\", languageIdentifier, err)\n\t}\n\tdefer query.Close()\n\n\tmatches, err := ExecuteQuery(tree, query, content)\n\tif err != nil {\n\t\treturn \"\", fmt.Errorf(\"could not execute query for '%s': %w\", languageIdentifier, err)\n\t}\n\n\tprocessedChunks := gc.processCaptures(matches, query, content, languageIdentifier)\n\n\tsort.SliceStable(processedChunks, func(i, j int) bool {\n\t\treturn 
processedChunks[i].StartByte < processedChunks[j].StartByte\n\t})\n\n\tvar result strings.Builder\n\tprocessedNodes := make(map[uint32]struct{})\n\n\tfor _, chunk := range processedChunks {\n\t\tif _, exists := processedNodes[chunk.StartByte]; !exists {\n\t\t\tresult.WriteString(chunk.Content)\n\t\t\tresult.WriteString(\"\\n// -----\\n\")\n\t\t\tprocessedNodes[chunk.StartByte] = struct{}{}\n\t\t}\n\t}\n\n\tif result.Len() == 0 {\n\t\treturn \"// No relevant code found after compression.\\n\", nil\n\t}\n\n\treturn result.String(), nil\n}\n\n// compressHTML is a specialized function to compress HTML content\n// since Tree-sitter doesn't handle HTML well with the JavaScript parser\nfunc (gc *GenericCompressor) compressHTML(content []byte) string {\n\t// Convert content to string for easier processing\n\thtmlStr := string(content)\n\n\t// Extract HTML comments\n\tcommentRegex := regexp.MustCompile(`<!--([\\s\\S]*?)-->`)\n\tcomments := commentRegex.FindAllString(htmlStr, -1)\n\n\t// Extract important tags (doctype, html, head, body, script, style)\n\tdoctypeRegex := regexp.MustCompile(`<!DOCTYPE[^>]*>`)\n\tdoctypes := doctypeRegex.FindAllString(htmlStr, -1)\n\n\thtmlTagRegex := regexp.MustCompile(`<html[^>]*>|</html>`)\n\thtmlTags := htmlTagRegex.FindAllString(htmlStr, -1)\n\n\theadTagRegex := regexp.MustCompile(`<head[^>]*>|</head>`)\n\theadTags := headTagRegex.FindAllString(htmlStr, -1)\n\n\tbodyTagRegex := regexp.MustCompile(`<body[^>]*>|</body>`)\n\tbodyTags := bodyTagRegex.FindAllString(htmlStr, -1)\n\n\t// Combine all extracted elements\n\tvar chunks []string\n\tchunks = append(chunks, doctypes...)\n\tchunks = append(chunks, htmlTags...)\n\tchunks = append(chunks, headTags...)\n\tchunks = append(chunks, comments...)\n\tchunks = append(chunks, bodyTags...)\n\n\tif len(chunks) == 0 {\n\t\treturn \"// No relevant HTML elements found after compression.\\n\"\n\t}\n\n\tvar result strings.Builder\n\tfor _, chunk := range chunks 
{\n\t\tresult.WriteString(strings.TrimSpace(chunk))\n\t\tresult.WriteString(\"\\n// -----\\n\")\n\t}\n\n\treturn result.String()\n}\n\nfunc (gc *GenericCompressor) processCaptures(matches []*sitter.QueryMatch, query *sitter.Query, source []byte, languageIdentifier string) []CodeChunk {\n\tvar chunks []CodeChunk\n\tseenNodes := make(map[uint32]struct{}) // Tracks nodes already processed to avoid duplicates\n\n\tfor _, match := range matches {\n\t\tfor _, capture := range match.Captures {\n\t\t\tnode := capture.Node\n\t\t\tif node == nil {\n\t\t\t\tcontinue\n\t\t\t}\n\n\t\t\tif _, ok := seenNodes[node.StartByte()]; ok {\n\t\t\t\tcontinue\n\t\t\t}\n\n\t\t\tcaptureName := query.CaptureNameForId(capture.Index)\n\t\t\tnodeContent := node.Content(source)\n\t\t\tchunkContent := \"\"\n\n\t\t\tswitch {\n\t\t\tcase strings.HasPrefix(captureName, \"definition.function\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"definition.method\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"definition.class\"):\n\n\t\t\t\tjsNode := node                  // The node captured by the query pattern.\n\t\t\t\tactualDeclarationNode := jsNode // Default for non-JS or simple cases. 
This is the node that has the .body child.\n\t\t\t\tvar bodyNode *sitter.Node\n\n\t\t\t\tif languageIdentifier == \"javascript\" || languageIdentifier == \"typescript\" {\n\t\t\t\t\t// Check for arrow function with expression body (not statement block)\n\t\t\t\t\tarrowName := \"\"\n\t\t\t\t\tfor _, c := range match.Captures {\n\t\t\t\t\t\tif query.CaptureNameForId(c.Index) == \"arrow.name\" {\n\t\t\t\t\t\t\tarrowName = c.Node.Content(source)\n\t\t\t\t\t\t\tbreak\n\t\t\t\t\t\t}\n\t\t\t\t\t}\n\n\t\t\t\t\t// Handle arrow functions with expression bodies\n\t\t\t\t\tif arrowName != \"\" {\n\t\t\t\t\t\tvar arrowFunc *sitter.Node\n\t\t\t\t\t\tfor _, c := range match.Captures {\n\t\t\t\t\t\t\tif query.CaptureNameForId(c.Index) == \"arrow.function\" {\n\t\t\t\t\t\t\t\tarrowFunc = c.Node\n\t\t\t\t\t\t\t\tbreak\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t}\n\n\t\t\t\t\t\tif arrowFunc != nil {\n\t\t\t\t\t\t\t// For arrow functions with expression bodies, we need to construct the signature differently\n\t\t\t\t\t\t\tparams := arrowFunc.ChildByFieldName(\"parameters\")\n\t\t\t\t\t\t\tbody := arrowFunc.ChildByFieldName(\"body\")\n\n\t\t\t\t\t\t\tif params != nil && body != nil && body.Type() != \"statement_block\" {\n\t\t\t\t\t\t\t\t// This is an arrow function with expression body\n\t\t\t\t\t\t\t\tsignature := fmt.Sprintf(\"const %s = %s =>\",\n\t\t\t\t\t\t\t\t\tstrings.TrimSpace(arrowName),\n\t\t\t\t\t\t\t\t\tstrings.TrimSpace(params.Content(source)))\n\t\t\t\t\t\t\t\tchunkContent = signature + \" { ... 
} // Body removed\"\n\n\t\t\t\t\t\t\t\t// Add to chunks and continue to next capture\n\t\t\t\t\t\t\t\tif chunkContent != \"\" {\n\t\t\t\t\t\t\t\t\tchunks = append(chunks, CodeChunk{\n\t\t\t\t\t\t\t\t\t\tContent:   chunkContent,\n\t\t\t\t\t\t\t\t\t\tStartByte: node.StartByte(),\n\t\t\t\t\t\t\t\t\t\tEndByte:   node.EndByte(),\n\t\t\t\t\t\t\t\t\t})\n\t\t\t\t\t\t\t\t\tseenNodes[node.StartByte()] = struct{}{}\n\t\t\t\t\t\t\t\t}\n\t\t\t\t\t\t\t\tcontinue\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t}\n\t\t\t\t\t}\n\n\t\t\t\t\t// Check for anonymous default exported function/class\n\t\t\t\t\tvar anonNode *sitter.Node\n\t\t\t\t\tfor _, c := range match.Captures {\n\t\t\t\t\t\tif query.CaptureNameForId(c.Index) == \"anon.function\" || query.CaptureNameForId(c.Index) == \"anon.class\" {\n\t\t\t\t\t\t\tanonNode = c.Node\n\t\t\t\t\t\t\tbreak\n\t\t\t\t\t\t}\n\t\t\t\t\t}\n\n\t\t\t\t\tif anonNode != nil {\n\t\t\t\t\t\tbodyNode = anonNode.ChildByFieldName(\"body\")\n\t\t\t\t\t\tif bodyNode != nil {\n\t\t\t\t\t\t\t// For anonymous default exports, we need to construct the signature differently\n\t\t\t\t\t\t\tsignature := \"\"\n\t\t\t\t\t\t\tif anonNode.Type() == \"function_declaration\" {\n\t\t\t\t\t\t\t\tsignature = \"export default function()\"\n\t\t\t\t\t\t\t} else if anonNode.Type() == \"class_declaration\" {\n\t\t\t\t\t\t\t\tsignature = \"export default class\"\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t\tchunkContent = signature + \" { ... 
} // Body removed\"\n\n\t\t\t\t\t\t\t// Add to chunks and continue to next capture\n\t\t\t\t\t\t\tif chunkContent != \"\" {\n\t\t\t\t\t\t\t\tchunks = append(chunks, CodeChunk{\n\t\t\t\t\t\t\t\t\tContent:   chunkContent,\n\t\t\t\t\t\t\t\t\tStartByte: node.StartByte(),\n\t\t\t\t\t\t\t\t\tEndByte:   node.EndByte(),\n\t\t\t\t\t\t\t\t})\n\t\t\t\t\t\t\t\tseenNodes[node.StartByte()] = struct{}{}\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t\tcontinue\n\t\t\t\t\t\t}\n\t\t\t\t\t}\n\n\t\t\t\t\t// Standard processing for other JavaScript constructs\n\t\t\t\t\tswitch jsNode.Type() {\n\t\t\t\t\tcase \"export_statement\":\n\t\t\t\t\t\tfoundDecl := jsNode.ChildByFieldName(\"declaration\")\n\t\t\t\t\t\tif foundDecl != nil && (isNamedDeclarationType(foundDecl) || foundDecl.Type() == \"arrow_function\" || foundDecl.Type() == \"function_expression\") {\n\t\t\t\t\t\t\tactualDeclarationNode = foundDecl\n\t\t\t\t\t\t} else {\n\t\t\t\t\t\t\t// Check for `export default function_declaration` etc. (direct named child)\n\t\t\t\t\t\t\tvar directChildDecl *sitter.Node\n\t\t\t\t\t\t\tfor i := 0; i < int(jsNode.NamedChildCount()); i++ {\n\t\t\t\t\t\t\t\tchild := jsNode.NamedChild(i)\n\t\t\t\t\t\t\t\t// Broaden condition to find anonymous function/class declarations as well\n\t\t\t\t\t\t\t\tswitch child.Type() {\n\t\t\t\t\t\t\t\tcase \"function_declaration\", \"class_declaration\", \"generator_function_declaration\", \"arrow_function\", \"function_expression\":\n\t\t\t\t\t\t\t\t\t// These are types that can be default exported and might have bodies to strip.\n\t\t\t\t\t\t\t\t\tdirectChildDecl = child\n\t\t\t\t\t\t\t\tdefault:\n\t\t\t\t\t\t\t\t\t// Not a type we are looking for as a direct declaration in `export default ...`\n\t\t\t\t\t\t\t\t}\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t\tif directChildDecl != nil {\n\t\t\t\t\t\t\t\tactualDeclarationNode = directChildDecl\n\t\t\t\t\t\t\t} else {\n\t\t\t\t\t\t\t\tactualDeclarationNode = nil // No suitable declaration found in export 
statement\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t}\n\t\t\t\t\tcase \"lexical_declaration\", \"variable_declaration\":\n\t\t\t\t\t\t// For `const foo = () => {}` or `var bar = function() {}`\n\t\t\t\t\t\t// jsNode is lexical_declaration/variable_declaration.\n\t\t\t\t\t\t// We need to find the specific variable_declarator with an arrow_function/function_expression.\n\t\t\t\t\t\tvar targetArrowFunc *sitter.Node\n\t\t\t\t\t\tfor i := 0; i < int(jsNode.NamedChildCount()); i++ {\n\t\t\t\t\t\t\tdeclarator := jsNode.NamedChild(i)\n\t\t\t\t\t\t\tif declarator != nil && declarator.Type() == \"variable_declarator\" {\n\t\t\t\t\t\t\t\tvalueNode := declarator.ChildByFieldName(\"value\")\n\t\t\t\t\t\t\t\tif valueNode != nil && (valueNode.Type() == \"arrow_function\" || valueNode.Type() == \"function_expression\") {\n\t\t\t\t\t\t\t\t\t// Check if this arrow_function/function_expression has a statement_block body,\n\t\t\t\t\t\t\t\t\t// as per the query (`body: (statement_block)`).\n\t\t\t\t\t\t\t\t\tbodyCheck := valueNode.ChildByFieldName(\"body\")\n\t\t\t\t\t\t\t\t\tif bodyCheck != nil {\n\t\t\t\t\t\t\t\t\t\ttargetArrowFunc = valueNode\n\t\t\t\t\t\t\t\t\t\tbreak\n\t\t\t\t\t\t\t\t\t}\n\t\t\t\t\t\t\t\t}\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t}\n\t\t\t\t\t\tif targetArrowFunc != nil {\n\t\t\t\t\t\t\tactualDeclarationNode = targetArrowFunc\n\t\t\t\t\t\t} else {\n\t\t\t\t\t\t\tactualDeclarationNode = nil // No suitable arrow function found\n\t\t\t\t\t\t}\n\t\t\t\t\tcase \"function_declaration\", \"generator_function_declaration\", \"class_declaration\", \"method_definition\":\n\t\t\t\t\t\t// jsNode is already the one with the body.\n\t\t\t\t\t\tactualDeclarationNode = jsNode\n\t\t\t\t\tdefault:\n\t\t\t\t\t\tactualDeclarationNode = nil // Other types not meant for body stripping by this rule.\n\t\t\t\t\t}\n\t\t\t\t} else if languageIdentifier == \"go\" || languageIdentifier == \"python\" || languageIdentifier == \"bash\" || languageIdentifier == \"c\" {\n\t\t\t\t\tactualDeclarationNode = jsNode // For 
Go/Python/Bash/C, the captured node is the declaration itself\n\t\t\t\t} else {\n\t\t\t\t\tactualDeclarationNode = nil // Should not happen if language is supported\n\t\t\t\t}\n\n\t\t\t\tif actualDeclarationNode != nil {\n\t\t\t\t\tbodyNode = actualDeclarationNode.ChildByFieldName(\"body\")\n\t\t\t\t}\n\n\t\t\t\tif bodyNode != nil {\n\t\t\t\t\tvar signatureEndPos uint32\n\t\t\t\t\tif languageIdentifier == \"python\" {\n\t\t\t\t\t\tvar colonNode *sitter.Node\n\t\t\t\t\t\tfor i := 0; i < int(actualDeclarationNode.ChildCount()); i++ {\n\t\t\t\t\t\t\tchildNode := actualDeclarationNode.Child(i)\n\t\t\t\t\t\t\tif childNode != nil && childNode.Type() == \":\" {\n\t\t\t\t\t\t\t\tcolonNode = childNode\n\t\t\t\t\t\t\t\tbreak\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t}\n\t\t\t\t\t\tif colonNode != nil {\n\t\t\t\t\t\t\tsignatureEndPos = colonNode.EndByte()\n\t\t\t\t\t\t} else {\n\t\t\t\t\t\t\tsignatureEndPos = bodyNode.StartByte() // Fallback\n\t\t\t\t\t\t}\n\t\t\t\t\t} else {\n\t\t\t\t\t\t// For Go and JS (functions, methods, classes, or arrow funcs with statement_block bodies)\n\t\t\t\t\t\t// Signature ends where the body node begins.\n\t\t\t\t\t\tsignatureEndPos = bodyNode.StartByte()\n\t\t\t\t\t}\n\n\t\t\t\t\t// Ensure signaturePortion starts from the beginning of the originally captured node (`jsNode`).\n\t\t\t\t\tif signatureEndPos > jsNode.StartByte() && signatureEndPos <= jsNode.EndByte() {\n\t\t\t\t\t\tsignaturePortion := string(source[jsNode.StartByte():signatureEndPos])\n\t\t\t\t\t\ttrimmedSignature := strings.TrimSpace(signaturePortion)\n\n\t\t\t\t\t\tplaceholder := \" { ... }\"\n\t\t\t\t\t\tswitch languageIdentifier {\n\t\t\t\t\t\tcase \"python\":\n\t\t\t\t\t\t\tplaceholder = \" { ... } # Body removed\"\n\t\t\t\t\t\tcase \"javascript\", \"typescript\", \"c\", \"html\":\n\t\t\t\t\t\t\tplaceholder = \" { ... } // Body removed\"\n\t\t\t\t\t\tcase \"bash\":\n\t\t\t\t\t\t\tplaceholder = \" { ... 
} # Body removed\"\n\t\t\t\t\t\t}\n\t\t\t\t\t\tchunkContent = trimmedSignature + placeholder\n\t\t\t\t\t} else {\n\t\t\t\t\t\t// Fallback: if positions are unusual, keep the original content of the captured node.\n\t\t\t\t\t\tchunkContent = strings.TrimSpace(jsNode.Content(source))\n\t\t\t\t\t}\n\t\t\t\t} else {\n\t\t\t\t\t// No bodyNode found, or actualDeclarationNode was nil (e.g. not a strippable type, or error in logic).\n\t\t\t\t\t// Keep the original content of the captured node.\n\t\t\t\t\tchunkContent = strings.TrimSpace(jsNode.Content(source))\n\t\t\t\t}\n\n\t\t\tcase strings.HasPrefix(captureName, \"import\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"package\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"definition.type\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"definition.struct\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"definition.enum\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"definition.union\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"definition.typedef\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"export.other\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"command\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"rule_set\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"media\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"keyframes\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"declaration\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"comment\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"doctype\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"tag\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"script\"),\n\t\t\t\tstrings.HasPrefix(captureName, \"style\"):\n\t\t\t\tchunkContent = strings.TrimSpace(nodeContent)\n\t\t\tdefault:\n\t\t\t\tcontinue\n\t\t\t}\n\n\t\t\tif chunkContent != \"\" {\n\t\t\t\tchunks = append(chunks, CodeChunk{\n\t\t\t\t\tContent:   chunkContent,\n\t\t\t\t\tStartByte: node.StartByte(),\n\t\t\t\t\tEndByte:   node.EndByte(),\n\t\t\t\t})\n\t\t\t\tseenNodes[node.StartByte()] = struct{}{}\n\t\t\t}\n\t\t}\n\t}\n\treturn chunks\n}\n\n// 
compressSwift is a specialised function to compress Swift content\nfunc (gc *GenericCompressor) compressSwift(content []byte) string {\n\t// Convert content to string for easier processing\n\tswiftStr := string(content)\n\n\t// Extract Swift comments\n\tsingleLineCommentRegex := regexp.MustCompile(`(?m)//.*$`)\n\tsingleLineComments := singleLineCommentRegex.FindAllString(swiftStr, -1)\n\n\tmultiLineCommentRegex := regexp.MustCompile(`/\*[\s\S]*?\*/`)\n\tmultiLineComments := multiLineCommentRegex.FindAllString(swiftStr, -1)\n\n\t// Extract imports\n\timportRegex := regexp.MustCompile(`import\s+\w+`)\n\timports := importRegex.FindAllString(swiftStr, -1)\n\n\t// Extract struct definitions\n\tstructRegex := regexp.MustCompile(`(?:public\s+|private\s+|fileprivate\s+|internal\s+)?struct\s+\w+(?::\s*[^{]+)?\s*\{`)\n\tstructs := structRegex.FindAllString(swiftStr, -1)\n\n\t// Extract class definitions\n\tclassRegex := regexp.MustCompile(`(?:public\s+|private\s+|fileprivate\s+|internal\s+)?(?:final\s+)?class\s+\w+(?::\s*[^{]+)?\s*\{`)\n\tclasses := classRegex.FindAllString(swiftStr, -1)\n\n\t// Extract protocol definitions\n\tprotocolRegex := regexp.MustCompile(`(?:public\s+|private\s+|fileprivate\s+|internal\s+)?protocol\s+\w+(?::\s*[^{]+)?\s*\{`)\n\tprotocols := protocolRegex.FindAllString(swiftStr, -1)\n\n\t// Extract enum definitions\n\tenumRegex := regexp.MustCompile(`(?:public\s+|private\s+|fileprivate\s+|internal\s+)?enum\s+\w+(?::\s*[^{]+)?\s*\{`)\n\tenums := enumRegex.FindAllString(swiftStr, -1)\n\n\t// Extract function signatures\n\tfuncRegex := regexp.MustCompile(`(?:public\s+|private\s+|fileprivate\s+|internal\s+)?(?:static\s+|class\s+)?func\s+\w+\s*\([^)]*\)(?:\s*->\s*[^{]+)?\s*\{`)\n\tfuncs := funcRegex.FindAllString(swiftStr, -1)\n\n\t// Extract extension blocks\n\textensionRegex := regexp.MustCompile(`extension\s+\w+(?::\s*[^{]+)?\s*\{`)\n\textensions := extensionRegex.FindAllString(swiftStr, 
-1)\n\n\t// Combine all extracted elements\n\tvar chunks []string\n\tchunks = append(chunks, singleLineComments...)\n\tchunks = append(chunks, multiLineComments...)\n\tchunks = append(chunks, imports...)\n\tchunks = append(chunks, structs...)\n\tchunks = append(chunks, classes...)\n\tchunks = append(chunks, protocols...)\n\tchunks = append(chunks, enums...)\n\tchunks = append(chunks, funcs...)\n\tchunks = append(chunks, extensions...)\n\n\tif len(chunks) == 0 {\n\t\treturn \"// No relevant Swift elements found after compression.\\n\"\n\t}\n\n\tvar result strings.Builder\n\tfor _, chunk := range chunks {\n\t\tresult.WriteString(strings.TrimSpace(chunk))\n\t\tresult.WriteString(\"\\n// -----\\n\")\n\t}\n\n\treturn result.String()\n}\n\n// compressJava is a specialised function to compress Java content\nfunc (gc *GenericCompressor) compressJava(content []byte) string {\n\t// Convert content to string for easier processing\n\tjavaStr := string(content)\n\n\t// Extract Java comments\n\tsingleLineCommentRegex := regexp.MustCompile(`(?m)//.*$`)\n\tsingleLineComments := singleLineCommentRegex.FindAllString(javaStr, -1)\n\n\tmultiLineCommentRegex := regexp.MustCompile(`/\*[\s\S]*?\*/`)\n\tmultiLineComments := multiLineCommentRegex.FindAllString(javaStr, -1)\n\n\t// Extract package declaration\n\tpackageRegex := regexp.MustCompile(`package\s+[^;]+;`)\n\tpackages := packageRegex.FindAllString(javaStr, -1)\n\n\t// Extract imports\n\timportRegex := regexp.MustCompile(`import\s+[^;]+;`)\n\timports := importRegex.FindAllString(javaStr, -1)\n\n\t// Extract class definitions\n\tclassRegex := regexp.MustCompile(`(?:public\s+|private\s+|protected\s+)?(?:abstract\s+|final\s+)?class\s+\w+(?:\s+extends\s+\w+)?(?:\s+implements\s+[^{]+)?\s*\{`)\n\tclasses := classRegex.FindAllString(javaStr, -1)\n\n\t// Extract interface definitions\n\tinterfaceRegex := 
regexp.MustCompile(`(?:public\s+|private\s+|protected\s+)?interface\s+\w+(?:\s+extends\s+[^{]+)?\s*\{`)\n\tinterfaces := interfaceRegex.FindAllString(javaStr, -1)\n\n\t// Extract enum definitions\n\tenumRegex := regexp.MustCompile(`(?:public\s+|private\s+|protected\s+)?enum\s+\w+\s*\{`)\n\tenums := enumRegex.FindAllString(javaStr, -1)\n\n\t// Extract method signatures\n\tmethodRegex := regexp.MustCompile(`(?:public\s+|private\s+|protected\s+)?(?:static\s+|final\s+|abstract\s+)*(?:<[^>]+>\s+)?(?:\w+(?:\[\])?\s+)?\w+\s*\([^)]*\)(?:\s+throws\s+[^{]+)?\s*\{`)\n\tmethods := methodRegex.FindAllString(javaStr, -1)\n\n\t// Combine all extracted elements\n\tvar chunks []string\n\tchunks = append(chunks, singleLineComments...)\n\tchunks = append(chunks, multiLineComments...)\n\tchunks = append(chunks, packages...)\n\tchunks = append(chunks, imports...)\n\tchunks = append(chunks, classes...)\n\tchunks = append(chunks, interfaces...)\n\tchunks = append(chunks, enums...)\n\tchunks = append(chunks, methods...)\n\n\tif len(chunks) == 0 {\n\t\treturn \"// No relevant Java elements found after compression.\\n\"\n\t}\n\n\tvar result strings.Builder\n\tfor _, chunk := range chunks {\n\t\tresult.WriteString(strings.TrimSpace(chunk))\n\t\tresult.WriteString(\"\\n// -----\\n\")\n\t}\n\n\treturn result.String()\n}\n\n// compressRust is a specialised function to compress Rust content\nfunc (gc *GenericCompressor) compressRust(content []byte) string {\n\t// Convert content to string for easier processing\n\trustStr := string(content)\n\n\t// Extract Rust comments\n\tsingleLineCommentRegex := regexp.MustCompile(`(?m)//.*$`)\n\tsingleLineComments := singleLineCommentRegex.FindAllString(rustStr, -1)\n\n\tmultiLineCommentRegex := regexp.MustCompile(`/\*[\s\S]*?\*/`)\n\tmultiLineComments := multiLineCommentRegex.FindAllString(rustStr, -1)\n\n\t// Extract imports\n\timportRegex := regexp.MustCompile(`use\s+[^;]+;`)\n\timports := 
importRegex.FindAllString(rustStr, -1)\n\n\t// Extract struct definitions (header only; matching through to a closing brace truncates on nested braces)\n\tstructRegex := regexp.MustCompile(`(?:pub\s+)?struct\s+\w+\s*\{`)\n\tstructs := structRegex.FindAllString(rustStr, -1)\n\n\t// Extract trait definitions (header only)\n\ttraitRegex := regexp.MustCompile(`(?:pub\s+)?trait\s+\w+\s*\{`)\n\ttraits := traitRegex.FindAllString(rustStr, -1)\n\n\t// Extract enum definitions (header only)\n\tenumRegex := regexp.MustCompile(`(?:pub\s+)?enum\s+\w+\s*\{`)\n\tenums := enumRegex.FindAllString(rustStr, -1)\n\n\t// Extract function signatures\n\tfuncRegex := regexp.MustCompile(`(?:pub\s+)?fn\s+\w+\s*\([^)]*\)(?:\s*->\s*[^{]+)?\s*\{`)\n\tfuncs := funcRegex.FindAllString(rustStr, -1)\n\n\t// Extract impl blocks (both inherent `impl Type` and trait `impl Trait for Type`)\n\timplRegex := regexp.MustCompile(`impl(?:\s+\w+\s+for)?\s+\w+\s*\{`)\n\timpls := implRegex.FindAllString(rustStr, -1)\n\n\t// Combine all extracted elements\n\tvar chunks []string\n\tchunks = append(chunks, singleLineComments...)\n\tchunks = append(chunks, multiLineComments...)\n\tchunks = append(chunks, imports...)\n\tchunks = append(chunks, structs...)\n\tchunks = append(chunks, traits...)\n\tchunks = append(chunks, enums...)\n\tchunks = append(chunks, funcs...)\n\tchunks = append(chunks, impls...)\n\n\tif len(chunks) == 0 {\n\t\treturn \"// No relevant Rust elements found after compression.\\n\"\n\t}\n\n\tvar result strings.Builder\n\tfor _, chunk := range chunks {\n\t\tresult.WriteString(strings.TrimSpace(chunk))\n\t\tresult.WriteString(\"\\n// -----\\n\")\n\t}\n\n\treturn result.String()\n}\n"
  },
  {
    "path": "internal/compressor/testdata/example.c",
    "content": "#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n// Define a struct\nstruct Person {\n    char name[50];\n    int age;\n    float height;\n};\n\n// Define an enum\nenum Color {\n    RED,\n    GREEN,\n    BLUE,\n    YELLOW\n};\n\n// Define a union\nunion Data {\n    int i;\n    float f;\n    char str[20];\n};\n\n// Function prototype\nvoid greet(const char* name);\nint calculate_sum(int a, int b);\n\n// Typedef example\ntypedef unsigned long int UINT32;\n\n/**\n * Main function\n */\nint main(int argc, char *argv[]) {\n    // Variable declarations\n    struct Person person;\n    enum Color favorite_color = BLUE;\n    union Data data;\n    UINT32 big_number = 123456789UL;\n\n    // Initialize struct\n    strcpy(person.name, \"John\");\n    person.age = 30;\n    person.height = 1.75;\n\n    // Print information\n    printf(\"Name: %s\\n\", person.name);\n    printf(\"Age: %d\\n\", person.age);\n    printf(\"Height: %.2f\\n\", person.height);\n\n    // Call functions\n    greet(person.name);\n    printf(\"Sum: %d\\n\", calculate_sum(5, 7));\n\n    // Use union\n    data.i = 10;\n    printf(\"data.i: %d\\n\", data.i);\n    data.f = 220.5;\n    printf(\"data.f: %.2f\\n\", data.f);\n    strcpy(data.str, \"C Programming\");\n    printf(\"data.str: %s\\n\", data.str);\n\n    return 0;\n}\n\n/**\n * Greet function implementation\n */\nvoid greet(const char* name) {\n    printf(\"Hello, %s!\\n\", name);\n\n    if (strlen(name) > 5) {\n        printf(\"You have a long name!\\n\");\n    } else {\n        printf(\"You have a short name!\\n\");\n    }\n}\n\n/**\n * Calculate sum function implementation\n */\nint calculate_sum(int a, int b) {\n    int result = a + b;\n    return result;\n}\n"
  },
  {
    "path": "internal/compressor/testdata/example.css",
    "content": "/* This is a CSS comment */\n\n/* Import statement */\n@import url('https://fonts.googleapis.com/css2?family=Roboto:wght@400;700&display=swap');\n\n/* Root variables */\n:root {\n  --primary-color: #3498db;\n  --secondary-color: #2ecc71;\n  --text-color: #333;\n  --background-color: #f9f9f9;\n  --spacing-unit: 8px;\n}\n\n/* Body styles */\nbody {\n  font-family: 'Roboto', sans-serif;\n  line-height: 1.6;\n  color: var(--text-color);\n  background-color: var(--background-color);\n  margin: 0;\n  padding: 0;\n}\n\n/* Header styles */\nheader {\n  background-color: var(--primary-color);\n  color: white;\n  padding: calc(var(--spacing-unit) * 2);\n  box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);\n}\n\n/* Navigation styles */\nnav {\n  display: flex;\n  justify-content: space-between;\n  align-items: center;\n}\n\nnav ul {\n  display: flex;\n  list-style: none;\n  margin: 0;\n  padding: 0;\n}\n\nnav li {\n  margin-left: var(--spacing-unit);\n}\n\nnav a {\n  color: white;\n  text-decoration: none;\n  padding: var(--spacing-unit);\n  border-radius: 4px;\n  transition: background-color 0.3s ease;\n}\n\nnav a:hover {\n  background-color: rgba(255, 255, 255, 0.2);\n}\n\n/* Main content */\nmain {\n  max-width: 1200px;\n  margin: 0 auto;\n  padding: calc(var(--spacing-unit) * 3);\n}\n\n/* Card component */\n.card {\n  background-color: white;\n  border-radius: 8px;\n  box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);\n  padding: calc(var(--spacing-unit) * 2);\n  margin-bottom: calc(var(--spacing-unit) * 2);\n}\n\n.card-title {\n  color: var(--primary-color);\n  margin-top: 0;\n}\n\n/* Button styles */\n.button {\n  display: inline-block;\n  background-color: var(--primary-color);\n  color: white;\n  padding: var(--spacing-unit) calc(var(--spacing-unit) * 2);\n  border: none;\n  border-radius: 4px;\n  cursor: pointer;\n  text-decoration: none;\n  transition: background-color 0.3s ease;\n}\n\n.button:hover {\n  background-color: #2980b9;\n}\n\n.button.secondary {\n  
background-color: var(--secondary-color);\n}\n\n.button.secondary:hover {\n  background-color: #27ae60;\n}\n\n/* Media queries */\n@media (max-width: 768px) {\n  nav {\n    flex-direction: column;\n  }\n\n  nav ul {\n    margin-top: var(--spacing-unit);\n  }\n\n  .card {\n    padding: var(--spacing-unit);\n  }\n}\n\n/* Animation */\n@keyframes fadeIn {\n  from {\n    opacity: 0;\n  }\n  to {\n    opacity: 1;\n  }\n}\n\n.fade-in {\n  animation: fadeIn 0.5s ease-in;\n}\n"
  },
  {
    "path": "internal/compressor/testdata/example.go",
    "content": "package main\n\nimport \"fmt\"\n\n// This is a comment\ntype MyStruct struct {\n\tFieldA int\n\tFieldB string\n}\n\nfunc (s *MyStruct) MyMethod(val int) string {\n\t// Method body\n\tif val > 0 {\n\t\treturn fmt.Sprintf(\"Positive: %d\", val)\n\t}\n\treturn \"Zero or Negative\"\n}\n\nfunc main() {\n\t// Main function body\n\tinstance := MyStruct{FieldA: 1, FieldB: \"test\"}\n\tfmt.Println(instance.MyMethod(5))\n\tfmt.Println(\"Hello, world!\")\n}\n"
  },
  {
    "path": "internal/compressor/testdata/example.html",
    "content": "<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n    <meta charset=\"UTF-8\">\n    <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\">\n    <title>Example HTML Document</title>\n    <link rel=\"stylesheet\" href=\"styles.css\">\n    <script src=\"script.js\"></script>\n    <style>\n        body {\n            font-family: Arial, sans-serif;\n            line-height: 1.6;\n            color: #333;\n            max-width: 800px;\n            margin: 0 auto;\n            padding: 20px;\n        }\n        header {\n            background-color: #f4f4f4;\n            padding: 1rem;\n            margin-bottom: 1rem;\n        }\n        nav ul {\n            display: flex;\n            list-style: none;\n            padding: 0;\n        }\n        nav li {\n            margin-right: 1rem;\n        }\n        .container {\n            border: 1px solid #ddd;\n            padding: 1rem;\n        }\n        .btn {\n            display: inline-block;\n            background: #333;\n            color: #fff;\n            padding: 0.5rem 1rem;\n            text-decoration: none;\n            border-radius: 3px;\n        }\n    </style>\n</head>\n<body>\n    <!-- Header section -->\n    <header>\n        <h1>Example HTML Document</h1>\n        <nav>\n            <ul>\n                <li><a href=\"#home\">Home</a></li>\n                <li><a href=\"#about\">About</a></li>\n                <li><a href=\"#services\">Services</a></li>\n                <li><a href=\"#contact\">Contact</a></li>\n            </ul>\n        </nav>\n    </header>\n\n    <!-- Main content -->\n    <main>\n        <section id=\"home\">\n            <h2>Welcome to our website</h2>\n            <p>This is an example HTML document that demonstrates various HTML elements and structure.</p>\n            <div class=\"container\">\n                <p>This is a container with some content.</p>\n                <a href=\"#\" class=\"btn\">Learn More</a>\n            </div>\n     
   </section>\n\n        <section id=\"about\">\n            <h2>About Us</h2>\n            <p>We are a company that specializes in creating example HTML documents.</p>\n            <ul>\n                <li>Founded in 2023</li>\n                <li>Based in Example City</li>\n                <li>Serving clients worldwide</li>\n            </ul>\n        </section>\n\n        <section id=\"services\">\n            <h2>Our Services</h2>\n            <div class=\"service\">\n                <h3>Web Design</h3>\n                <p>We create beautiful and functional websites.</p>\n            </div>\n            <div class=\"service\">\n                <h3>Web Development</h3>\n                <p>We build robust web applications.</p>\n            </div>\n            <div class=\"service\">\n                <h3>SEO Optimization</h3>\n                <p>We help your website rank higher in search results.</p>\n            </div>\n        </section>\n\n        <section id=\"contact\">\n            <h2>Contact Us</h2>\n            <form>\n                <div>\n                    <label for=\"name\">Name:</label>\n                    <input type=\"text\" id=\"name\" name=\"name\" required>\n                </div>\n                <div>\n                    <label for=\"email\">Email:</label>\n                    <input type=\"email\" id=\"email\" name=\"email\" required>\n                </div>\n                <div>\n                    <label for=\"message\">Message:</label>\n                    <textarea id=\"message\" name=\"message\" rows=\"4\" required></textarea>\n                </div>\n                <button type=\"submit\">Send Message</button>\n            </form>\n        </section>\n    </main>\n\n    <!-- Footer section -->\n    <footer>\n        <p>&copy; 2023 Example Company. 
All rights reserved.</p>\n    </footer>\n\n    <script>\n        // Example JavaScript\n        document.addEventListener('DOMContentLoaded', function() {\n            const form = document.querySelector('form');\n            form.addEventListener('submit', function(event) {\n                event.preventDefault();\n                alert('Form submitted!');\n            });\n\n            const buttons = document.querySelectorAll('.btn');\n            buttons.forEach(button => {\n                button.addEventListener('click', function() {\n                    alert('Button clicked!');\n                });\n            });\n        });\n    </script>\n</body>\n</html>\n"
  },
  {
    "path": "internal/compressor/testdata/example.java",
    "content": "// This is a Java comment\n/* This is a multi-line\n   Java comment */\n\n// Package declaration\npackage com.example.demo;\n\n// Import statements\nimport java.util.ArrayList;\nimport java.util.HashMap;\nimport java.util.List;\nimport java.util.Map;\nimport java.util.Optional;\nimport java.util.stream.Collectors;\n\n/**\n * This is a Javadoc comment for the Person class\n * @author Example Author\n */\npublic class Person {\n    // Instance variables\n    private String name;\n    private int age;\n    private String address;\n\n    // Static variable\n    private static int count = 0;\n\n    // Constants\n    public static final int MAX_AGE = 120;\n\n    // Constructor\n    public Person(String name, int age) {\n        this.name = name;\n        this.age = age;\n        this.address = null;\n        count++;\n    }\n\n    // Getter methods\n    public String getName() {\n        return name;\n    }\n\n    public int getAge() {\n        return age;\n    }\n\n    public String getAddress() {\n        return address;\n    }\n\n    // Setter methods\n    public void setName(String name) {\n        this.name = name;\n    }\n\n    public void setAge(int age) {\n        if (age > MAX_AGE) {\n            throw new IllegalArgumentException(\"Age cannot be greater than \" + MAX_AGE);\n        }\n        this.age = age;\n    }\n\n    public void setAddress(String address) {\n        this.address = address;\n    }\n\n    // Static method\n    public static int getCount() {\n        return count;\n    }\n\n    // Method with return value\n    public String getInfo() {\n        if (address != null) {\n            return name + \", age \" + age + \", lives at \" + address;\n        } else {\n            return name + \", age \" + age;\n        }\n    }\n\n    // Method with parameters\n    public boolean isOlderThan(Person other) {\n        return this.age > other.age;\n    }\n\n    // Override toString method\n    @Override\n    public String toString() {\n    
    return getInfo();\n    }\n\n    // Inner class\n    public class Address {\n        private String street;\n        private String city;\n        private String zipCode;\n\n        public Address(String street, String city, String zipCode) {\n            this.street = street;\n            this.city = city;\n            this.zipCode = zipCode;\n        }\n\n        public String getFullAddress() {\n            return street + \", \" + city + \" \" + zipCode;\n        }\n    }\n}\n\n// Interface definition\ninterface Printable {\n    void print();\n\n    // Default method\n    default void printWithPrefix(String prefix) {\n        System.out.println(prefix + \": \" + toString());\n    }\n}\n\n// Enum definition\nenum Status {\n    ACTIVE(\"Active\"),\n    INACTIVE(\"Inactive\"),\n    PENDING(\"Pending\");\n\n    private final String label;\n\n    Status(String label) {\n        this.label = label;\n    }\n\n    public String getLabel() {\n        return label;\n    }\n}\n\n// Main class\npublic class Main {\n    public static void main(String[] args) {\n        // Create objects\n        Person person1 = new Person(\"John Doe\", 30);\n        person1.setAddress(\"123 Main St\");\n\n        Person person2 = new Person(\"Jane Smith\", 25);\n        person2.setAddress(\"456 Oak Ave\");\n\n        // Use methods\n        System.out.println(person1.getInfo());\n        System.out.println(\"Total persons: \" + Person.getCount());\n\n        // Conditional statement\n        if (person1.isOlderThan(person2)) {\n            System.out.println(person1.getName() + \" is older than \" + person2.getName());\n        } else {\n            System.out.println(person2.getName() + \" is older than \" + person1.getName());\n        }\n\n        // Collections\n        List<Person> people = new ArrayList<>();\n        people.add(person1);\n        people.add(person2);\n\n        // Stream API\n        double averageAge = people.stream()\n                .mapToInt(Person::getAge)\n  
              .average()\n                .orElse(0);\n\n        System.out.println(\"Average age: \" + averageAge);\n\n        // Lambda expression\n        people.forEach(p -> System.out.println(p.getName()));\n\n        // Try-catch block\n        try {\n            Person invalidPerson = new Person(\"Invalid\", 150);\n        } catch (IllegalArgumentException e) {\n            System.out.println(\"Error: \" + e.getMessage());\n        } finally {\n            System.out.println(\"Finished processing\");\n        }\n    }\n}\n"
  },
  {
    "path": "internal/compressor/testdata/example.js",
    "content": "// This is a JavaScript comment\nimport { something } from 'module';\n\nexport class MyJSClass {\n    constructor(name) {\n        this.name = name;\n    }\n\n    greet(message) {\n        // Method body\n        console.log(message + \", \" + this.name + \"!\");\n        if (this.name.length > 3) {\n            console.log(\"Long name in JS\");\n        }\n        return true;\n    }\n}\n\nexport function myJSFunction(x, y) {\n    // Function body\n    const result = x + y;\n    console.log(\"JS Result is \" + result);\n    return result;\n}\n\n// Arrow function with block body\nconst myArrowFunc = (a, b) => {\n    // Arrow function body\n    return a * b;\n};\n\n// Arrow function with expression body\nconst myExpressionArrow = (x) => x * x;\n\nfunction* myGenerator() {\n    yield 1;\n    yield 2;\n}\n\n// Another export type\nexport const myVar = 42;\n\n// Anonymous default export function\n// Commented out for testing separately\n// export default function() {\n//     console.log(\"Anon default func body\");\n//     return \"done\";\n// }\n\n// Anonymous default export class\nexport default class {\n    constructor() {\n        this.x = 1;\n    }\n}\n"
  },
  {
    "path": "internal/compressor/testdata/example.py",
    "content": "# This is a Python comment\nimport os\nfrom sys import argv\n\nclass MyClass:\n    \"\"\"\n    A simple class\n    \"\"\"\n    def __init__(self, name):\n        self.name = name\n\n    def greet(self, message):\n        \"\"\"Greets the person.\"\"\"\n        # Method body\n        print(f\"{message}, {self.name}!\")\n        if len(self.name) > 3:\n            print(\"Long name\")\n        return True\n\ndef my_function(x, y):\n    # Function body\n    result = x + y\n    print(f\"Result is {result}\")\n    return result\n\nif __name__ == \"__main__\":\n    c = MyClass(\"Test\")\n    c.greet(\"Hello\")\n    my_function(1, 2)\n"
  },
  {
    "path": "internal/compressor/testdata/example.rs",
    "content": "// This is a Rust comment\n/* This is a multi-line\n   Rust comment */\n\n// Import statements\nuse std::collections::HashMap;\nuse std::io::{self, Read, Write};\nuse std::sync::{Arc, Mutex};\n\n// Constants and statics\nconst MAX_SIZE: usize = 100;\nstatic GLOBAL_COUNTER: std::sync::atomic::AtomicUsize = std::sync::atomic::AtomicUsize::new(0);\n\n// Struct definition\npub struct Person {\n    name: String,\n    age: u32,\n    address: Option<String>,\n}\n\n// Implementation block\nimpl Person {\n    // Constructor\n    pub fn new(name: &str, age: u32) -> Self {\n        Person {\n            name: name.to_string(),\n            age,\n            address: None,\n        }\n    }\n\n    // Method with parameters\n    pub fn set_address(&mut self, address: String) {\n        self.address = Some(address);\n    }\n\n    // Method with return value\n    pub fn get_info(&self) -> String {\n        match &self.address {\n            Some(addr) => format!(\"{}, age {}, lives at {}\", self.name, self.age, addr),\n            None => format!(\"{}, age {}\", self.name, self.age),\n        }\n    }\n}\n\n// Trait definition\npub trait Printable {\n    fn print(&self);\n\n    // Default implementation\n    fn print_debug(&self) {\n        println!(\"Debug print\");\n    }\n}\n\n// Trait implementation\nimpl Printable for Person {\n    fn print(&self) {\n        println!(\"{}\", self.get_info());\n    }\n}\n\n// Enum definition\npub enum Status {\n    Active,\n    Inactive,\n    Pending(String),\n    Error { code: u32, message: String },\n}\n\n// Function with generic type\npub fn process<T: Printable>(item: &T) {\n    item.print();\n}\n\n// Main function\nfn main() {\n    // Variable declaration\n    let mut person = Person::new(\"John Doe\", 30);\n    person.set_address(\"123 Main St\".to_string());\n\n    // Function call\n    process(&person);\n\n    // Pattern matching\n    let status = Status::Pending(\"Awaiting approval\".to_string());\n    match status 
{\n        Status::Active => println!(\"Active\"),\n        Status::Inactive => println!(\"Inactive\"),\n        Status::Pending(reason) => println!(\"Pending: {}\", reason),\n        Status::Error { code, message } => println!(\"Error {}: {}\", code, message),\n    }\n\n    // Closure\n    let add = |a: i32, b: i32| a + b;\n    println!(\"5 + 3 = {}\", add(5, 3));\n\n    // Error handling\n    let result = std::fs::read_to_string(\"nonexistent.txt\");\n    match result {\n        Ok(content) => println!(\"File content: {}\", content),\n        Err(error) => println!(\"Error reading file: {}\", error),\n    }\n}\n"
  },
  {
    "path": "internal/compressor/testdata/example.sh",
    "content": "#!/bin/bash\n\n# This is a bash comment\n\n# Define a function\nfunction greet() {\n    local name=$1\n    echo \"Hello, $name!\"\n    if [[ ${#name} -gt 3 ]]; then\n        echo \"That's a long name!\"\n    fi\n    return 0\n}\n\n# Define another function\nprocess_file() {\n    local file=$1\n    if [[ -f \"$file\" ]]; then\n        echo \"Processing $file...\"\n        cat \"$file\" | grep \"pattern\"\n    else\n        echo \"File not found: $file\"\n        return 1\n    fi\n}\n\n# Main script execution\necho \"Starting script\"\ngreet \"World\"\nprocess_file \"/tmp/example.txt\"\n\n# Conditional logic\nif [[ $? -eq 0 ]]; then\n    echo \"Success!\"\nelse\n    echo \"Failed!\"\nfi\n\n# Loop example\nfor i in {1..5}; do\n    echo \"Iteration $i\"\ndone\n\nexit 0\n"
  },
  {
    "path": "internal/compressor/testdata/example.swift",
    "content": "// This is a Swift comment\n/* This is a multi-line\n   Swift comment */\n\n// Import statements\nimport Foundation\n\n// Constants and variables\nlet maxAge = 120\nvar count = 0\n\n// Struct definition\nstruct Address {\n    let street: String\n    let city: String\n    let zipCode: String\n\n    // Computed property\n    var fullAddress: String {\n        return \"\\(street), \\(city) \\(zipCode)\"\n    }\n}\n\n// Class definition\nclass Person {\n    // Properties\n    var name: String\n    var age: Int\n    var address: Address?\n\n    // Static property\n    static var count = 0\n\n    // Initializer\n    init(name: String, age: Int) {\n        self.name = name\n        self.age = age\n        Person.count += 1\n    }\n\n    // Convenience initializer\n    convenience init(name: String) {\n        self.init(name: name, age: 0)\n    }\n\n    // Deinitializer\n    deinit {\n        Person.count -= 1\n    }\n\n    // Method with parameters\n    func setAddress(street: String, city: String, zipCode: String) {\n        self.address = Address(street: street, city: city, zipCode: zipCode)\n    }\n\n    // Method with return value\n    func getInfo() -> String {\n        if let address = address {\n            return \"\\(name), age \\(age), lives at \\(address.fullAddress)\"\n        } else {\n            return \"\\(name), age \\(age)\"\n        }\n    }\n\n    // Method with parameters and return value\n    func isOlderThan(_ other: Person) -> Bool {\n        return self.age > other.age\n    }\n\n    // Static method\n    static func getCount() -> Int {\n        return count\n    }\n}\n\n// Extension\nextension Person {\n    // Additional method in extension\n    func celebrateBirthday() {\n        age += 1\n        print(\"\\(name) is now \\(age) years old!\")\n    }\n}\n\n// Protocol definition\nprotocol Printable {\n    func print()\n}\n\n// Protocol extension\nextension Printable {\n    // Default implementation\n    func printWithPrefix(_ 
prefix: String) {\n        Swift.print(\"\\(prefix): \\(self)\")\n    }\n}\n\n// Protocol conformance\nextension Person: Printable {\n    func print() {\n        Swift.print(getInfo())\n    }\n}\n\n// Enum definition\nenum Status {\n    case active\n    case inactive\n    case pending(String)\n    case error(code: Int, message: String)\n\n    // Method in enum\n    func description() -> String {\n        switch self {\n        case .active:\n            return \"Active\"\n        case .inactive:\n            return \"Inactive\"\n        case .pending(let reason):\n            return \"Pending: \\(reason)\"\n        case .error(let code, let message):\n            return \"Error \\(code): \\(message)\"\n        }\n    }\n}\n\n// Generic function\nfunc process<T: Printable>(_ item: T) {\n    item.print()\n}\n\n// Closure\nlet greet = { (name: String) -> String in\n    return \"Hello, \\(name)!\"\n}\n\n// Main function equivalent\nfunc main() {\n    // Create objects\n    let person1 = Person(name: \"John Doe\", age: 30)\n    person1.setAddress(street: \"123 Main St\", city: \"Anytown\", zipCode: \"12345\")\n\n    let person2 = Person(name: \"Jane Smith\", age: 25)\n    person2.setAddress(street: \"456 Oak Ave\", city: \"Somewhere\", zipCode: \"67890\")\n\n    // Use methods\n    print(person1.getInfo())\n    print(\"Total persons: \\(Person.getCount())\")\n\n    // Conditional statement\n    if person1.isOlderThan(person2) {\n        print(\"\\(person1.name) is older than \\(person2.name)\")\n    } else {\n        print(\"\\(person2.name) is older than \\(person1.name)\")\n    }\n\n    // Collections\n    var people = [person1, person2]\n\n    // Higher-order functions\n    let averageAge = people.reduce(0) { $0 + $1.age } / people.count\n    print(\"Average age: \\(averageAge)\")\n\n    // Closure usage\n    people.forEach { person in\n        print(person.name)\n    }\n\n    // Error handling\n    do {\n        let data = try Data(contentsOf: URL(fileURLWithPath: 
\"/nonexistent.txt\"))\n        print(\"File size: \\(data.count)\")\n    } catch {\n        print(\"Error reading file: \\(error)\")\n    }\n\n    // Pattern matching\n    let status = Status.pending(\"Awaiting approval\")\n    switch status {\n    case .active:\n        print(\"Active\")\n    case .inactive:\n        print(\"Inactive\")\n    case .pending(let reason):\n        print(\"Pending: \\(reason)\")\n    case .error(let code, let message):\n        print(\"Error \\(code): \\(message)\")\n    }\n}\n\n// Call main function\nmain()\n"
  },
  {
    "path": "internal/compressor/testdata/example_anon_func.js",
    "content": "// This is a JavaScript comment\nimport { something } from 'module';\n\n// Arrow function with expression body\nconst myExpressionArrow = (x) => x * x;\n\n// Anonymous default export function\nexport default function() {\n    console.log(\"Anon default func body\");\n    return \"done\";\n}\n"
  },
  {
    "path": "main.go",
    "content": "package main\n\nimport (\n\t\"context\"\n\t\"encoding/json\"\n\t\"errors\"\n\t\"fmt\"\n\t\"io\"\n\t\"net/url\"\n\t\"os\"\n\t\"path/filepath\"\n\t\"regexp\"\n\t\"sort\"\n\t\"strings\"\n\n\t\"github.com/charmbracelet/glamour\"\n\t\"github.com/fatih/color\"\n\t\"github.com/sammcj/ingest/config\"\n\t\"github.com/sammcj/ingest/filesystem\"\n\t\"github.com/sammcj/ingest/git\"\n\t\"github.com/sammcj/ingest/internal/compressor\" // Added compressor import\n\t\"github.com/sammcj/ingest/template\"\n\t\"github.com/sammcj/ingest/token\"\n\t\"github.com/sammcj/ingest/utils\"\n\t\"github.com/sammcj/ingest/web\"\n\t\"github.com/sammcj/quantest\"\n\topenai \"github.com/sashabaranov/go-openai\"\n\t\"github.com/spf13/cobra\"\n)\n\nvar (\n\tincludePriority      bool\n\texcludeFromTree      bool\n\ttokens               bool\n\tencoding             string\n\toutput               string\n\tdiff                 bool\n\tgitDiffBranch        string\n\tgitLogBranch         string\n\tlineNumber           bool\n\tnoCodeblock          bool\n\trelativePaths        bool\n\tnoClipboard          bool\n\ttemplatePath         string\n\tjsonOutput           bool\n\tpatternExclude       string\n\tprintDefaultExcludes bool\n\tprintDefaultTemplate bool\n\tpromptPrefix         string\n\tpromptSuffix         string\n\treport               bool\n\tvramFlag             bool\n\tmodelIDFlag          string\n\tquantFlag            string\n\tcontextFlag          int\n\tkvCacheFlag          string\n\tmemoryFlag           float64\n\tfitsFlag             float64\n\tquantTypeFlag        string\n\tverbose              bool\n\tnoDefaultExcludes    bool\n\tfollowSymlinks       bool\n\tVersion              string // This will be set by the linker at build time\n\trootCmd              *cobra.Command\n\twebCrawl             bool\n\twebMaxDepth          int\n\twebAllowedDomains    []string\n\twebTimeout           int\n\twebConcurrentJobs    int\n\tcompressFlag         bool // Added compress 
flag\n\tanthropicFlag        bool\n\tnoCorrectionFlag     bool\n)\n\ntype GitData struct {\n\tPath          string\n\tGitDiff       string\n\tGitDiffBranch string\n\tGitLogBranch  string\n}\n\nfunc init() {\n\trootCmd = &cobra.Command{\n\t\tUse:   \"ingest [flags] [path ...]\",\n\t\tShort: \"Generate a markdown LLM prompt from files and directories\",\n\t\tLong:  `ingest is a command-line tool to generate an LLM prompt from files and directories.`,\n\t\tRunE:  run,\n\t\tArgs:  cobra.ArbitraryArgs,\n\t}\n\n\t// Define flags\n\trootCmd.Flags().Bool(\"llm\", false, \"Send output to any OpenAI compatible API for inference\")\n\trootCmd.Flags().BoolP(\"version\", \"V\", false, \"Print the version number\")\n\trootCmd.Flags().BoolVar(&excludeFromTree, \"exclude-from-tree\", false, \"Exclude files/folders from the source tree based on exclude patterns\")\n\trootCmd.Flags().BoolVar(&includePriority, \"include-priority\", false, \"Include files in case of conflict between include and exclude patterns\")\n\trootCmd.Flags().BoolVar(&jsonOutput, \"json\", false, \"Print output as JSON\")\n\trootCmd.Flags().BoolVar(&noCodeblock, \"no-codeblock\", false, \"Disable wrapping code inside markdown code blocks\")\n\trootCmd.Flags().BoolVar(&printDefaultExcludes, \"print-default-excludes\", false, \"Print the default exclude patterns\")\n\trootCmd.Flags().BoolVar(&printDefaultTemplate, \"print-default-template\", false, \"Print the default template\")\n\trootCmd.Flags().BoolVar(&relativePaths, \"relative-paths\", false, \"Use relative paths instead of absolute paths, including the parent directory\")\n\trootCmd.Flags().BoolVar(&report, \"report\", true, \"Report the top 15 largest files included in the output\")\n\trootCmd.Flags().BoolVar(&tokens, \"tokens\", true, \"Display the token count of the generated prompt\")\n\trootCmd.Flags().BoolVarP(&diff, \"diff\", \"d\", false, \"Include git diff\")\n\trootCmd.Flags().BoolVarP(&lineNumber, \"line-number\", \"l\", false, \"Add line 
numbers to the source code\")\n\trootCmd.Flags().BoolVarP(&noClipboard, \"no-clipboard\", \"n\", false, \"Disable copying to clipboard\")\n\trootCmd.Flags().BoolVarP(&verbose, \"verbose\", \"v\", false, \"Enable verbose output\")\n\trootCmd.Flags().StringSliceP(\"exclude\", \"e\", nil, \"Patterns to exclude\")\n\trootCmd.Flags().StringSliceP(\"include\", \"i\", nil, \"Patterns to include\")\n\trootCmd.Flags().StringVar(&gitDiffBranch, \"git-diff-branch\", \"\", \"Generate git diff between two branches\")\n\trootCmd.Flags().StringVar(&gitLogBranch, \"git-log-branch\", \"\", \"Retrieve git log between two branches\")\n\trootCmd.Flags().StringVar(&patternExclude, \"pattern-exclude\", \"\", \"Path to a specific .glob file for exclude patterns\")\n\trootCmd.Flags().StringVarP(&encoding, \"encoding\", \"c\", \"o200k\", \"Tokeniser to use for token count (o200k, cl100k, p50k, r50k)\")\n\trootCmd.Flags().StringVarP(&output, \"output\", \"o\", \"\", \"Optional output file path\")\n\trootCmd.Flags().StringArrayP(\"prompt\", \"p\", nil, \"Prompt suffix to append to the generated content\")\n\trootCmd.Flags().StringVarP(&templatePath, \"template\", \"t\", \"\", \"Optional Path to a custom Handlebars template\")\n\trootCmd.Flags().BoolP(\"save\", \"s\", false, \"Automatically save the generated markdown to ~/ingest/<dirname>.md\")\n\trootCmd.Flags().Bool(\"config\", false, \"Open the config file in the default editor\")\n\trootCmd.Flags().BoolVar(&noDefaultExcludes, \"no-default-excludes\", false, \"Disable default exclude patterns\")\n\trootCmd.Flags().BoolVar(&followSymlinks, \"follow-symlinks\", false, \"Follow symlinked files and directories\")\n\trootCmd.Flags().BoolVar(&compressFlag, \"compress\", false, \"Enable code compression using Tree-sitter\") // Added compress flag\n\trootCmd.Flags().BoolVarP(&anthropicFlag, \"anthropic\", \"a\", false, \"Use Anthropic API for token counting (requires ANTHROPIC_API_KEY, ANTHROPIC_TOKEN, or 
ANTHROPIC_TOKEN_COUNT_KEY)\")\n\trootCmd.Flags().BoolVar(&noCorrectionFlag, \"no-correction\", false, \"Disable offline tokeniser correction factor (use raw token count)\")\n\n\t// Web Crawler flags\n\trootCmd.Flags().BoolVar(&webCrawl, \"web\", false, \"Enable web crawling mode\")\n\trootCmd.Flags().IntVar(&webMaxDepth, \"web-depth\", 1, \"Maximum crawling depth for web pages\")\n\trootCmd.Flags().StringSliceVar(&webAllowedDomains, \"web-domains\", nil, \"Allowed domains for web crawling\")\n\trootCmd.Flags().IntVar(&webTimeout, \"web-timeout\", 120, \"Timeout in seconds for web requests\")\n\trootCmd.Flags().IntVar(&webConcurrentJobs, \"web-concurrent\", 6, \"Number of concurrent crawling jobs\")\n\n\t// VRAM estimation flags\n\trootCmd.Flags().BoolVar(&vramFlag, \"vram\", false, \"Estimate vRAM usage\")\n\trootCmd.Flags().StringVarP(&modelIDFlag, \"model\", \"m\", \"\", \"vRAM Estimation - Model ID\")\n\trootCmd.Flags().StringVarP(&quantFlag, \"quant\", \"q\", \"\", \"vRAM Estimation - Quantisation type (e.g., q4_k_m) or bits per weight (e.g., 5.0)\")\n\trootCmd.Flags().IntVar(&contextFlag, \"context\", 0, \"vRAM Estimation - Context length for vRAM estimation\")\n\trootCmd.Flags().StringVar(&kvCacheFlag, \"kvcache\", \"fp16\", \"vRAM Estimation - KV cache quantisation: fp16, q8_0, or q4_0\")\n\trootCmd.Flags().StringVar(&quantTypeFlag, \"quanttype\", \"gguf\", \"vRAM Estimation - Quantisation type: gguf or exl2\")\n\trootCmd.Flags().Float64Var(&memoryFlag, \"memory\", 0, \"vRAM Estimation - Available memory in GB for context calculation\")\n\t// if --fits is set, set memory to its value\n\trootCmd.Flags().Float64VarP(&fitsFlag, \"fits\", \"f\", 0, \"(alias for --memory)\")\n\tif err := rootCmd.Flags().SetAnnotation(\"fits\", cobra.BashCompOneRequiredFlag, []string{\"--memory\"}); err != nil {\n\t\tfmt.Printf(\"Error setting annotation for fits flag: %v\\n\", err)\n\t}\n\n\t// Add completion command\n\trootCmd.AddCommand(&cobra.Command{\n\t\tUse:   \"completion 
[bash|zsh|fish]\",\n\t\tShort: \"Generate completion script\",\n\t\tLong: `Generate shell completion script for the specified shell.\n\nTo load completions:\n\nBash:\n  $ source <(ingest completion bash)\n\n  To load completions for each session, execute once:\n  Linux:\n    $ ingest completion bash > /etc/bash_completion.d/ingest\n  macOS:\n    $ ingest completion bash > $(brew --prefix)/etc/bash_completion.d/ingest\n\nZsh:\n  $ source <(ingest completion zsh)\n\n  To load completions for each session, execute once:\n  $ ingest completion zsh > \"${fpath[1]}/_ingest\"\n\nFish:\n  $ ingest completion fish | source\n\n  To load completions for each session, execute once:\n  $ ingest completion fish > ~/.config/fish/completions/ingest.fish\n`,\n\t\tDisableFlagsInUseLine: true,\n\t\tValidArgs:             []string{\"bash\", \"zsh\", \"fish\"},\n\t\tArgs:                  cobra.MatchAll(cobra.ExactArgs(1), cobra.OnlyValidArgs),\n\t\tRun:                   runCompletion,\n\t})\n}\n\nfunc main() {\n\tif err := rootCmd.Execute(); err != nil {\n\t\tfmt.Println(err)\n\t\tos.Exit(1)\n\t}\n}\n\nfunc run(cmd *cobra.Command, args []string) error {\n\t// If no arguments are provided, use the current directory\n\tif len(args) == 0 {\n\t\tcurrentDir, err := os.Getwd()\n\t\tif err != nil {\n\t\t\treturn fmt.Errorf(\"failed to get current directory: %w\", err)\n\t\t}\n\t\targs = []string{currentDir}\n\t}\n\n\tif version, _ := cmd.Flags().GetBool(\"version\"); version {\n\t\tfmt.Printf(\"ingest version %s\\n\", Version)\n\t\treturn nil\n\t}\n\n\tif configFlag, _ := cmd.Flags().GetBool(\"config\"); configFlag {\n\t\tif err := config.OpenConfig(); err != nil {\n\t\t\treturn fmt.Errorf(\"failed to open config: %w\", err)\n\t\t}\n\t\tos.Exit(0)\n\t}\n\n\tcfg, err := config.LoadConfig()\n\tif err != nil {\n\t\treturn fmt.Errorf(\"failed to load config: %w\", err)\n\t}\n\n\tif err := utils.EnsureConfigDirectories(); err != nil {\n\t\treturn fmt.Errorf(\"failed to ensure config directories: 
%w\", err)\n\t}\n\t// If no arguments are provided, use the current directory\n\tif len(args) == 0 {\n\t\tcurrentDir, err := os.Getwd()\n\t\tif err != nil {\n\t\t\treturn fmt.Errorf(\"failed to get current directory: %w\", err)\n\t\t}\n\t\targs = []string{currentDir}\n\t}\n\n\t// Handle the prompt flag\n\tpromptArray, _ := cmd.Flags().GetStringArray(\"prompt\")\n\tpromptSuffix = strings.Join(promptArray, \" \")\n\n\tif printDefaultExcludes {\n\t\tfilesystem.PrintDefaultExcludes()\n\t\treturn nil\n\t}\n\n\tif printDefaultTemplate {\n\t\ttemplate.PrintDefaultTemplate()\n\t\treturn nil\n\t}\n\n\tincludePatterns, _ := cmd.Flags().GetStringSlice(\"include\")\n\texcludePatterns, _ := cmd.Flags().GetStringSlice(\"exclude\")\n\n\t// Setup template\n\ttmpl, err := template.SetupTemplate(templatePath)\n\tif err != nil {\n\t\treturn fmt.Errorf(\"failed to set up template: %w\", err)\n\t}\n\n\t// Setup progress spinner\n\tspinner := utils.SetupSpinner(\"Traversing directory and building tree..\")\n\tdefer func() {\n\t\tif err := spinner.Finish(); err != nil {\n\t\t\tfmt.Printf(\"Error finishing spinner: %v\\n\", err)\n\t\t}\n\t}()\n\n\t// If verbose, print active excludes\n\tif verbose {\n\t\tactiveExcludes, err := filesystem.ReadExcludePatterns(patternExclude, false)\n\t\tif err != nil {\n\t\t\treturn fmt.Errorf(\"failed to read exclude patterns: %w\", err)\n\t\t}\n\t\tprintExcludePatterns(activeExcludes)\n\t}\n\n\t// Process all provided paths\n\tvar allFiles []filesystem.FileInfo\n\tvar allTrees []string\n\tvar gitData []GitData\n\tvar allExcluded []*filesystem.ExcludedInfo\n\n\tremainingArgs := make([]string, len(args))\n\tcopy(remainingArgs, args)\n\n\tfor i := range remainingArgs {\n\t\targ := remainingArgs[i]\n\n\t\t// Check if this is a URL (either with --web flag or auto-detected)\n\t\tif webCrawl || isURL(arg) {\n\t\t\tif !isURL(arg) {\n\t\t\t\treturn fmt.Errorf(\"web crawling is enabled but the argument '%s' is not a URL\", 
arg)\n\t\t\t}\n\n\t\t\tutils.PrintColouredMessage(\"ℹ️\", fmt.Sprintf(\"Processing URL: %s\", arg), color.FgBlue)\n\n\t\t\t// Process as web URL - now passing excludePatterns\n\t\t\tresult, err := processWebInput(arg, excludePatterns)\n\t\t\tif err != nil {\n\t\t\t\treturn fmt.Errorf(\"failed to process web URL %s: %w\", arg, err)\n\t\t\t}\n\n\t\t\tallFiles = append(allFiles, result.Files...)\n\t\t\tallTrees = append(allTrees, result.TreeString)\n\t\t\tcontinue\n\t\t}\n\n\t\t// Process as local file/directory\n\t\tabsPath, err := filepath.Abs(arg)\n\t\tif err != nil {\n\t\t\treturn fmt.Errorf(\"failed to get absolute path for %s: %w\", arg, err)\n\t\t}\n\n\t\tfileInfo, err := os.Stat(absPath)\n\t\tif err != nil {\n\t\t\treturn fmt.Errorf(\"failed to get file info for %s: %w\", arg, err)\n\t\t}\n\n\t\tvar files []filesystem.FileInfo\n\t\tvar tree string\n\t\tvar excluded *filesystem.ExcludedInfo\n\n\t\t// Initialise the compressor if the flag is set\n\t\tvar comp *compressor.GenericCompressor\n\t\tif compressFlag {\n\t\t\tcomp = compressor.NewGenericCompressor()\n\t\t}\n\n\t\tif fileInfo.IsDir() {\n\t\t\t// Existing directory processing logic\n\t\t\ttree, files, excluded, err = filesystem.WalkDirectory(absPath, includePatterns, excludePatterns, patternExclude, includePriority, lineNumber, relativePaths, excludeFromTree, noCodeblock, noDefaultExcludes, followSymlinks, comp) // Pass compressor\n\t\t\tif err != nil {\n\t\t\t\treturn fmt.Errorf(\"failed to process directory %s: %w\", arg, err)\n\t\t\t}\n\t\t\t// Use relative path for tree header when relativePaths flag is set\n\t\t\ttreePath := absPath\n\t\t\tif relativePaths {\n\t\t\t\ttreePath = filepath.Base(absPath)\n\t\t\t}\n\t\t\ttree = fmt.Sprintf(\"%s:\\n%s\", treePath, tree)\n\t\t} else {\n\t\t\t// New file processing logic\n\t\t\tfile, err := filesystem.ProcessSingleFile(absPath, lineNumber, relativePaths, noCodeblock, followSymlinks, comp) // Pass compressor\n\t\t\tif err != nil {\n\t\t\t\treturn 
fmt.Errorf(\"failed to process file %s: %w\", arg, err)\n\t\t\t}\n\t\t\tfiles = []filesystem.FileInfo{file}\n\t\t\t// Use relative path for file header when relativePaths flag is set\n\t\t\tfilePath := absPath\n\t\t\tif relativePaths {\n\t\t\t\tfilePath = filepath.Base(absPath)\n\t\t\t}\n\t\t\ttree = fmt.Sprintf(\"File: %s\", filePath)\n\t\t}\n\n\t\tallFiles = append(allFiles, files...)\n\t\tallTrees = append(allTrees, tree)\n\t\tif excluded != nil {\n\t\t\tallExcluded = append(allExcluded, excluded)\n\t\t}\n\n\t\t// Handle git operations for each path\n\t\tgitDiffContent := \"\"\n\t\tgitDiffBranchContent := \"\"\n\t\tgitLogBranchContent := \"\"\n\n\t\tif diff {\n\t\t\tgitDiffContent, err = git.GetGitDiff(absPath)\n\t\t\tif err != nil {\n\t\t\t\t// Log the error but continue processing\n\t\t\t\tfmt.Printf(\"Warning: failed to get git diff for %s: %v\\n\", absPath, err)\n\t\t\t}\n\t\t}\n\n\t\tif gitDiffBranch != \"\" {\n\t\t\tbranches := strings.Split(gitDiffBranch, \",\")\n\t\t\tif len(branches) == 2 {\n\t\t\t\tgitDiffBranchContent, err = git.GetGitDiffBetweenBranches(absPath, branches[0], branches[1])\n\t\t\t\tif err != nil {\n\t\t\t\t\tfmt.Printf(\"Warning: failed to get git diff between branches for %s: %v\\n\", absPath, err)\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\n\t\tif gitLogBranch != \"\" {\n\t\t\tbranches := strings.Split(gitLogBranch, \",\")\n\t\t\tif len(branches) == 2 {\n\t\t\t\tgitLogBranchContent, err = git.GetGitLog(absPath, branches[0], branches[1])\n\t\t\t\tif err != nil {\n\t\t\t\t\tfmt.Printf(\"Warning: failed to get git log for %s: %v\\n\", absPath, err)\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\n\t\tgitData = append(gitData, GitData{\n\t\t\tPath:          absPath,\n\t\t\tGitDiff:       gitDiffContent,\n\t\t\tGitDiffBranch: gitDiffBranchContent,\n\t\t\tGitLogBranch:  gitLogBranchContent,\n\t\t})\n\t}\n\n\t// Prepare data for template\n\tvar excludedInfo any\n\tif len(allExcluded) > 0 {\n\t\texcludedInfo = allExcluded[0] // Use the first excluded info if 
available\n\t}\n\n\tdata := map[string]any{\n\t\t\"source_trees\": strings.Join(allTrees, \"\\n\\n\"),\n\t\t\"files\":        allFiles,\n\t\t\"git_data\":     gitData,\n\t\t\"excluded\":     excludedInfo,\n\t}\n\n\tif err := spinner.Finish(); err != nil {\n\t\treturn fmt.Errorf(\"failed to finish spinner: %w\", err)\n\t}\n\n\t// Render template\n\trendered, err := template.RenderTemplate(tmpl, data)\n\tif err != nil {\n\t\treturn fmt.Errorf(\"failed to render template: %w\", err)\n\t}\n\n\t// Check if save is set in config or flag\n\tautoSave, _ := cmd.Flags().GetBool(\"save\")\n\tif cfg.AutoSave || autoSave {\n\t\t// Pass the output flag value to autoSaveOutput\n\t\tif err := autoSaveOutput(rendered, output, args[0]); err != nil { // Assuming args[0] is a representative source path\n\t\t\tutils.PrintColouredMessage(\"❌\", fmt.Sprintf(\"Error auto-saving file: %v\", err), color.FgRed)\n\t\t}\n\t}\n\n\t// VRAM estimation\n\tif vramFlag {\n\t\tfmt.Println()\n\t\tif err := performVRAMEstimation(rendered); err != nil {\n\t\t\tutils.PrintColouredMessage(\"❌\", fmt.Sprintf(\"VRAM estimation error: %v\", err), color.FgRed)\n\t\t}\n\t}\n\tuseLLM, _ := cmd.Flags().GetBool(\"llm\")\n\n\t// Handle output\n\tif useLLM {\n\t\tif err := handleLLMOutput(rendered, cfg.LLM, tokens, encoding); err != nil {\n\t\t\tutils.PrintColouredMessage(\"❌\", fmt.Sprintf(\"LLM output error: %v\", err), color.FgRed)\n\t\t}\n\t} else {\n\t\t// If both --save and --output are used, we don't want handleOutput to write the file\n\t\t// as autoSaveOutput will handle it\n\t\toutputForHandleOutput := output\n\t\tif (cfg.AutoSave || autoSave) && output != \"\" {\n\t\t\toutputForHandleOutput = \"\"\n\t\t}\n\t\tif err := handleOutput(rendered, tokens, encoding, noClipboard, outputForHandleOutput, jsonOutput, report || verbose, allFiles); err != nil {\n\t\t\treturn fmt.Errorf(\"failed to handle output: %w\", err)\n\t\t}\n\t}\n\n\t// Print all collected messages at the 
end\n\tutils.PrintMessages()\n\n\treturn nil\n}\n\nfunc reportLargestFiles(files []filesystem.FileInfo) {\n\tsort.Slice(files, func(i, j int) bool {\n\t\treturn len(files[i].Code) > len(files[j].Code)\n\t})\n\n\tutils.PrintColouredMessage(\"ℹ️\", \"Top 15 largest files (by estimated token count):\", color.FgCyan)\n\tcolourRange := []*color.Color{\n\t\tcolor.New(color.FgRed),\n\t\tcolor.New(color.FgRed),\n\t\tcolor.New(color.FgRed),\n\t\tcolor.New(color.FgRed),\n\t\tcolor.New(color.FgRed),\n\t\tcolor.New(color.FgYellow),\n\t\tcolor.New(color.FgYellow),\n\t\tcolor.New(color.FgYellow),\n\t\tcolor.New(color.FgYellow),\n\t\tcolor.New(color.FgYellow),\n\t\tcolor.New(color.FgGreen),\n\t\tcolor.New(color.FgGreen),\n\t\tcolor.New(color.FgGreen),\n\t\tcolor.New(color.FgGreen),\n\t\tcolor.New(color.FgGreen),\n\t}\n\n\t// Limit to top 15 files\n\tdisplayCount := min(len(files), 15)\n\n\t// Collect file contents for batch processing\n\tfileContents := make([]string, displayCount)\n\tfor i := range displayCount {\n\t\tfileContents[i] = files[i].Code\n\t}\n\n\t// Count tokens in batch (uses parallel API calls if Anthropic API is enabled)\n\ttokenCounts := token.CountTokensBatch(fileContents, encoding, anthropicFlag, noCorrectionFlag)\n\n\t// Print the files with their token counts\n\tfor i := range displayCount {\n\t\tcolour := colourRange[i]\n\t\tfmt.Printf(\"- %d. 
%s (%s tokens)\\n\", i+1, files[i].Path, colour.Sprint(utils.FormatNumber(tokenCounts[i])))\n\t}\n\n\tfmt.Println()\n}\n\nfunc handleOutput(rendered string, countTokens bool, encoding string, noClipboard bool, output string, jsonOutput bool, report bool, files []filesystem.FileInfo) error {\n\tif countTokens {\n\t\ttokenCount := token.CountTokens(rendered, encoding, anthropicFlag, noCorrectionFlag)\n\t\tprintln()\n\t\tutils.AddMessage(\"ℹ️\", fmt.Sprintf(\"Tokens (Approximate): %v\", utils.FormatNumber(tokenCount)), color.FgYellow, 1)\n\t}\n\n\tif report {\n\t\treportLargestFiles(files)\n\t}\n\n\tif jsonOutput {\n\t\tjsonData := map[string]any{\n\t\t\t\"prompt\":      rendered,\n\t\t\t\"token_count\": token.CountTokens(rendered, encoding, anthropicFlag, noCorrectionFlag),\n\t\t\t\"model_info\":  token.GetModelInfo(encoding),\n\t\t}\n\t\tjsonBytes, err := json.MarshalIndent(jsonData, \"\", \"  \")\n\t\tif err != nil {\n\t\t\treturn fmt.Errorf(\"failed to marshal JSON: %w\", err)\n\t\t}\n\t\tfmt.Println(string(jsonBytes))\n\t} else {\n\t\toutputWritten := false\n\t\tif output != \"\" {\n\t\t\terr := utils.WriteToFile(output, rendered)\n\t\t\tif err != nil {\n\t\t\t\t// Report the error but continue to potentially copy to clipboard or print\n\t\t\t\tutils.PrintColouredMessage(\"❌\", fmt.Sprintf(\"Failed to write to file %s: %v\", output, err), color.FgRed)\n\t\t\t} else {\n\t\t\t\tutils.AddMessage(\"✅\", fmt.Sprintf(\"Written to file: %s\", output), color.FgGreen, 20)\n\t\t\t\toutputWritten = true\n\t\t\t}\n\t\t}\n\n\t\tclipboardCopied := false\n\t\tif !noClipboard {\n\t\t\terr := utils.CopyToClipboard(rendered)\n\t\t\tif err == nil {\n\t\t\t\tutils.AddMessage(\"✅\", \"Copied to clipboard successfully.\", color.FgGreen, 5)\n\t\t\t\tclipboardCopied = true\n\t\t\t} else {\n\t\t\t\t// Only show clipboard error if we didn't write to a file\n\t\t\t\tif !outputWritten {\n\t\t\t\t\tutils.PrintColouredMessage(\"⚠️\", fmt.Sprintf(\"Failed to copy to clipboard: %v.\", err), 
color.FgYellow)\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\n\t\t// If neither output file was written nor clipboard was copied, print to console\n\t\tif !outputWritten && !clipboardCopied {\n\t\t\tfmt.Print(rendered)\n\t\t}\n\t}\n\n\treturn nil\n}\n\nfunc printExcludePatterns(patterns []string) {\n\tutils.PrintColouredMessage(\"i\", \"Active exclude patterns:\", color.FgCyan)\n\n\t// Define colours for syntax highlighting\n\tstarColour := color.New(color.FgHiGreen).SprintFunc()\n\tslashColour := color.New(color.FgGreen).SprintFunc()\n\tdotColour := color.New(color.FgBlue).SprintFunc()\n\n\t// Calculate the maximum width of patterns for alignment\n\tmaxWidth := 0\n\tfor _, pattern := range patterns {\n\t\tif len(pattern) > maxWidth {\n\t\t\tmaxWidth = len(pattern)\n\t\t}\n\t}\n\n\t// Print patterns in a horizontal list\n\tlineWidth := 0\n\n\t// get the width of the terminal\n\tw := utils.GetTerminalWidth()\n\n\tfor i, pattern := range patterns {\n\t\thighlighted := pattern\n\t\thighlighted = strings.ReplaceAll(highlighted, \"*\", starColour(\"*\"))\n\t\thighlighted = strings.ReplaceAll(highlighted, \"/\", slashColour(\"/\"))\n\t\thighlighted = strings.ReplaceAll(highlighted, \".\", dotColour(\".\"))\n\n\t\t// Add padding to align patterns\n\t\tpadding := strings.Repeat(\" \", maxWidth-len(pattern)+2)\n\n\t\tif lineWidth+len(pattern)+2 > w && i > 0 {\n\t\t\tfmt.Println()\n\t\t\tlineWidth = 0\n\t\t}\n\n\t\tif lineWidth == 0 {\n\t\t\tfmt.Print(\"  \")\n\t\t}\n\n\t\tfmt.Print(highlighted + padding)\n\t\tlineWidth += len(pattern) + len(padding)\n\n\t\tif i < len(patterns)-1 {\n\t\t\tfmt.Print(\"| \")\n\t\t\tlineWidth += 2\n\t\t}\n\t}\n}\n\nfunc handleLLMOutput(rendered string, llmConfig config.LLMConfig, countTokens bool, encoding string) error {\n\tif countTokens {\n\t\ttokenCount := token.CountTokens(rendered, encoding, anthropicFlag, noCorrectionFlag)\n\t\tutils.AddMessage(\"ℹ️\", fmt.Sprintf(\"Tokens (Approximate): %v\", utils.FormatNumber(tokenCount)), color.FgYellow, 
40)\n\t}\n\n\tif promptPrefix != \"\" {\n\t\trendered = promptPrefix + \"\\n\" + rendered\n\t}\n\n\tif promptSuffix != \"\" {\n\t\trendered += \"\\n\" + promptSuffix\n\t}\n\n\tif llmConfig.AuthToken == \"\" {\n\t\treturn fmt.Errorf(\"LLM auth token is empty\")\n\t}\n\n\tclientConfig := openai.DefaultConfig(llmConfig.AuthToken)\n\tclientConfig.BaseURL = llmConfig.BaseURL\n\tclientConfig.APIType = openai.APIType(llmConfig.APIType)\n\n\tc := openai.NewClientWithConfig(clientConfig)\n\tctx := context.Background()\n\n\treq := openai.CompletionRequest{\n\t\tModel:     llmConfig.Model,\n\t\tMaxTokens: llmConfig.MaxTokens,\n\t\tPrompt:    rendered,\n\t\tStream:    true,\n\t}\n\n\tif llmConfig.Temperature != nil {\n\t\treq.Temperature = *llmConfig.Temperature\n\t}\n\tif llmConfig.TopP != nil {\n\t\treq.TopP = *llmConfig.TopP\n\t}\n\tif llmConfig.PresencePenalty != nil {\n\t\treq.PresencePenalty = *llmConfig.PresencePenalty\n\t}\n\tif llmConfig.FrequencyPenalty != nil {\n\t\treq.FrequencyPenalty = *llmConfig.FrequencyPenalty\n\t}\n\n\tstream, err := c.CreateCompletionStream(ctx, req)\n\tif err != nil {\n\t\treturn fmt.Errorf(\"LLM CompletionStream error: %w\", err)\n\t}\n\tdefer stream.Close()\n\n\ttermWidth := min(\n\t\t// if the term width is over 160, set it to 160\n\t\tutils.GetTerminalWidth(), 160)\n\n\tr, err := glamour.NewTermRenderer(\n\t\tglamour.WithStandardStyle(\"dracula\"),\n\t\tglamour.WithWordWrap(termWidth-10),\n\t)\n\tif err != nil {\n\t\treturn fmt.Errorf(\"failed to create renderer: %w\", err)\n\t}\n\n\tvar buffer strings.Builder\n\tvar output strings.Builder\n\tfor {\n\t\tresponse, err := stream.Recv()\n\t\tif errors.Is(err, io.EOF) {\n\t\t\tbreak\n\t\t}\n\t\tif err != nil {\n\t\t\treturn fmt.Errorf(\"stream error: %w\", err)\n\t\t}\n\n\t\tbuffer.WriteString(response.Choices[0].Text)\n\n\t\t// Process complete lines\n\t\tfor {\n\t\t\tline, rest, found := strings.Cut(buffer.String(), \"\\n\")\n\t\t\tif !found 
{\n\t\t\t\tbreak\n\t\t\t}\n\t\t\toutput.WriteString(line + \"\\n\")\n\t\t\tbuffer.Reset()\n\t\t\tbuffer.WriteString(rest)\n\n\t\t\t// Render and print if we have a complete code block or enough non-code content\n\t\t\t// if there are any headers, render the markdown chunk before each header\n\t\t\tisHeading := regexp.MustCompile(`^#+`).MatchString(output.String())\n\t\t\tif isHeading && output.Len() > 0 && !strings.Contains(output.String(), \"```\") &&\n\t\t\t\t(strings.HasSuffix(output.String(), \"```\\n\") || output.Len() > 200) {\n\t\t\t\tcontentToRender := strings.TrimSpace(output.String()) + \"\\n\"\n\t\t\t\trenderedContent, err := r.Render(contentToRender)\n\t\t\t\tif err != nil {\n\t\t\t\t\treturn fmt.Errorf(\"rendering error: %w\", err)\n\t\t\t\t}\n\t\t\t\tfmt.Print(renderedContent)\n\t\t\t\toutput.Reset()\n\t\t\t} else {\n\t\t\t\tif !strings.Contains(output.String(), \"```\") &&\n\t\t\t\t\t(strings.HasSuffix(output.String(), \"```\\n\") || output.Len() > 200) {\n\t\t\t\t\t// Trim excess newlines before rendering\n\t\t\t\t\tcontentToRender := strings.TrimSpace(output.String()) + \"\\n\"\n\t\t\t\t\trenderedContent, err := r.Render(contentToRender)\n\t\t\t\t\tif err != nil {\n\t\t\t\t\t\treturn fmt.Errorf(\"rendering error: %w\", err)\n\t\t\t\t\t}\n\t\t\t\t\tfmt.Print(renderedContent)\n\t\t\t\t\toutput.Reset()\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\t}\n\n\t// Render and print any remaining content\n\tif buffer.Len() > 0 {\n\t\toutput.WriteString(buffer.String())\n\t}\n\tif output.Len() > 0 {\n\t\trenderedContent, err := r.Render(output.String())\n\t\tif err != nil {\n\t\t\treturn fmt.Errorf(\"rendering error: %w\", err)\n\t\t}\n\t\tfmt.Print(renderedContent)\n\t}\n\n\treturn nil\n}\n\nfunc performVRAMEstimation(content string) error {\n\tif modelIDFlag == \"\" {\n\t\treturn fmt.Errorf(\"model ID is required for vRAM estimation\")\n\t}\n\n\ttokenCount := token.CountTokens(content, encoding, anthropicFlag, noCorrectionFlag)\n\n\t// TODO: fix this:\n\t// quant, err := 
quantest.GetOllamaQuantLevel(modelIDFlag)\n\t// if err != nil {\n\t// \treturn fmt.Errorf(\"error getting quantisation level: %w\", err)\n\t// }\n\n\tquant := \"q4_k_m\"\n\n\testimation, err := quantest.EstimateVRAMForModel(modelIDFlag, memoryFlag, tokenCount, quant, kvCacheFlag)\n\tif err != nil {\n\t\treturn fmt.Errorf(\"error estimating vRAM: %w\", err)\n\t}\n\n\tutils.AddMessage(\"ℹ️\", fmt.Sprintf(\"Model: %s\", estimation.ModelName), color.FgCyan, 10)\n\tutils.AddMessage(\"ℹ️\", fmt.Sprintf(\"Estimated vRAM Required: %.2f GB\", estimation.EstimatedVRAM), color.FgCyan, 3)\n\t// print the vram available\n\tutils.AddMessage(\"ℹ️\", fmt.Sprintf(\"Available vRAM: %.2f GB\", memoryFlag), color.FgCyan, 10)\n\tif estimation.FitsAvailable {\n\t\tutils.AddMessage(\"✅\", \"Fits Available vRAM\", color.FgGreen, 2)\n\t} else {\n\t\tutils.AddMessage(\"❌\", \"Does Not Fit Available vRAM\", color.FgYellow, 2)\n\t}\n\tutils.AddMessage(\"ℹ️\", fmt.Sprintf(\"Max Context Size: %d\", estimation.MaxContextSize), color.FgCyan, 8)\n\t// utils.AddMessage(\"ℹ️\", fmt.Sprintf(\"Maximum Quantisation: %s\", estimation.MaximumQuant), color.FgCyan, 10)\n\t// TODO: - this isn't that useful, come up with something smarter\n\n\t// Generate and print the quant table\n\ttable, err := quantest.GenerateQuantTable(estimation.ModelConfig, memoryFlag)\n\tif err != nil {\n\t\treturn fmt.Errorf(\"error generating quant table: %w\", err)\n\t}\n\tfmt.Println(quantest.PrintFormattedTable(table))\n\n\t// Check if the content fits within the specified constraints\n\tif memoryFlag > 0 {\n\t\tif tokenCount > estimation.MaxContextSize {\n\t\t\tutils.AddMessage(\"❗️\", fmt.Sprintf(\"Generated content (%d tokens) exceeds maximum context (%d tokens).\", tokenCount, estimation.MaxContextSize), color.FgYellow, 2)\n\t\t} else {\n\t\t\tutils.AddMessage(\"✅\", fmt.Sprintf(\"Generated content (%d tokens) fits within maximum context (%d tokens).\", tokenCount, estimation.MaxContextSize), color.FgGreen, 
2)\n\t\t}\n\t}\n\n\treturn nil\n}\n\n// autoSaveOutput saves the content based on the combination of --save and --output flags:\n// - If only --save is used, save to ~/ingest/<dirname>.md\n// - If --save and --output ./somefile.md, only save to ./somefile.md\n// - If --save and --output somefile.md, save to ~/ingest/somefile.md\nfunc autoSaveOutput(content string, outputPath string, sourcePath string) error {\n\tvar finalPath string\n\n\tif outputPath != \"\" {\n\t\tif strings.HasPrefix(outputPath, \"./\") || strings.HasPrefix(outputPath, \"../\") || filepath.IsAbs(outputPath) {\n\t\t\t// Case: --output starts with ./ or ../ or is absolute path\n\t\t\t// Save only to the specified output path\n\t\t\tabsOutputPath, err := filepath.Abs(outputPath)\n\t\t\tif err != nil {\n\t\t\t\treturn fmt.Errorf(\"failed to get absolute path for output file %s: %w\", outputPath, err)\n\t\t\t}\n\t\t\tfinalPath = absOutputPath\n\t\t} else {\n\t\t\t// Case: --output is just a filename\n\t\t\t// Save to ~/ingest/ with the specified filename\n\t\t\thomeDir, err := os.UserHomeDir()\n\t\t\tif err != nil {\n\t\t\t\treturn fmt.Errorf(\"failed to get user home directory: %w\", err)\n\t\t\t}\n\t\t\tingestDir := filepath.Join(homeDir, \"ingest\")\n\t\t\tfinalPath = filepath.Join(ingestDir, outputPath)\n\t\t}\n\t} else {\n\t\t// Default --save behaviour\n\t\thomeDir, err := os.UserHomeDir()\n\t\tif err != nil {\n\t\t\treturn fmt.Errorf(\"failed to get user home directory: %w\", err)\n\t\t}\n\t\tingestDir := filepath.Join(homeDir, \"ingest\")\n\t\tfileName := filepath.Base(sourcePath) + \".md\"\n\t\tfinalPath = filepath.Join(ingestDir, fileName)\n\t}\n\n\t// Ensure the directory for the final path exists\n\tfinalDir := filepath.Dir(finalPath)\n\tif err := os.MkdirAll(finalDir, 0700); err != nil {\n\t\treturn fmt.Errorf(\"failed to create directory %s for auto-save file: %w\", finalDir, err)\n\t}\n\n\t// Write the file using os.WriteFile\n\tif err := os.WriteFile(finalPath, []byte(content), 0600); 
err != nil {\n\t\treturn fmt.Errorf(\"failed to write auto-save file to %s: %w\", finalPath, err)\n\t}\n\n\tutils.AddMessage(\"💾\", fmt.Sprintf(\"Auto-saved to: %s\", finalPath), color.FgMagenta, 15) // Changed icon and message slightly\n\treturn nil\n}\n\nfunc runCompletion(cmd *cobra.Command, args []string) {\n\tswitch args[0] {\n\tcase \"bash\":\n\t\tif err := cmd.Root().GenBashCompletion(os.Stdout); err != nil {\n\t\t\tfmt.Printf(\"Error generating bash completion: %v\\n\", err)\n\t\t}\n\tcase \"zsh\":\n\t\tif err := cmd.Root().GenZshCompletion(os.Stdout); err != nil {\n\t\t\tfmt.Printf(\"Error generating zsh completion: %v\\n\", err)\n\t\t}\n\tcase \"fish\":\n\t\tif err := cmd.Root().GenFishCompletion(os.Stdout, true); err != nil {\n\t\t\tfmt.Printf(\"Error generating fish completion: %v\\n\", err)\n\t\t}\n\t}\n}\n\nfunc processWebInput(urlStr string, excludePatterns []string) (*web.CrawlResult, error) {\n\toptions := web.CrawlOptions{\n\t\tMaxDepth:       webMaxDepth,\n\t\tAllowedDomains: webAllowedDomains,\n\t\tTimeout:        webTimeout,\n\t\tConcurrentJobs: webConcurrentJobs,\n\t}\n\n\treturn web.ProcessWebURL(urlStr, options, excludePatterns)\n}\n\nfunc isURL(str string) bool {\n\tu, err := url.Parse(str)\n\treturn err == nil && u.Scheme != \"\" && u.Host != \"\"\n}\n"
  },
  {
    "path": "pdf/pdf.go",
    "content": "// pdf/pdf.go\n\npackage pdf\n\nimport (\n\t\"fmt\"\n\t\"io\"\n\t\"net/http\"\n\t\"os\"\n\t\"path/filepath\"\n\t\"strings\"\n\n\t\"github.com/ledongthuc/pdf\"\n)\n\n// ConvertPDFToMarkdown converts a PDF file to markdown format\nfunc ConvertPDFToMarkdown(path string, isURL bool) (string, error) {\n\tvar reader io.ReadCloser\n\tvar err error\n\n\tif isURL {\n\t\treader, err = downloadPDF(path)\n\t\tif err != nil {\n\t\t\treturn \"\", fmt.Errorf(\"failed to download PDF: %w\", err)\n\t\t}\n\t\tdefer reader.Close()\n\n\t\ttempFile, err := os.CreateTemp(\"\", \"ingest-*.pdf\")\n\t\tif err != nil {\n\t\t\treturn \"\", fmt.Errorf(\"failed to create temp file: %w\", err)\n\t\t}\n\t\tdefer os.Remove(tempFile.Name())\n\t\tdefer tempFile.Close()\n\n\t\tif _, err := io.Copy(tempFile, reader); err != nil {\n\t\t\treturn \"\", fmt.Errorf(\"failed to save PDF: %w\", err)\n\t\t}\n\n\t\tpath = tempFile.Name()\n\t}\n\n\t// Open and read the PDF\n\tf, r, err := pdf.Open(path)\n\tif err != nil {\n\t\treturn \"\", fmt.Errorf(\"failed to open PDF: %w\", err)\n\t}\n\tdefer f.Close()\n\n\tvar buf strings.Builder\n\tbuf.WriteString(fmt.Sprintf(\"# PDF Content: %s\\n\\n\", filepath.Base(path)))\n\n\t// Extract text from each page\n\ttotalPages := r.NumPage()\n\tfor pageNum := 1; pageNum <= totalPages; pageNum++ {\n\t\tpage := r.Page(pageNum)\n\t\tif page.V.IsNull() {\n\t\t\tcontinue\n\t\t}\n\n\t\ttext, err := page.GetPlainText(nil)\n\t\tif err != nil {\n\t\t\treturn \"\", fmt.Errorf(\"failed to extract text from page %d: %w\", pageNum, err)\n\t\t}\n\n\t\t// Clean and process the text\n\t\tcleanedText := cleanText(text)\n\t\tif cleanedText != \"\" {\n\t\t\tbuf.WriteString(fmt.Sprintf(\"## Page %d\\n\\n\", pageNum))\n\t\t\tbuf.WriteString(cleanedText)\n\t\t\tbuf.WriteString(\"\\n\\n\")\n\t\t}\n\t}\n\n\tresult := buf.String()\n\tif strings.TrimSpace(result) == strings.TrimSpace(fmt.Sprintf(\"# PDF Content: %s\\n\\n\", filepath.Base(path))) {\n\t\treturn \"\", fmt.Errorf(\"no 
text content could be extracted from PDF\")\n\t}\n\n\treturn result, nil\n}\n\n// IsPDF checks if a file is a PDF based on its content type or extension\nfunc IsPDF(path string) (bool, error) {\n\t// Check if it's a URL\n\tif strings.HasPrefix(path, \"http://\") || strings.HasPrefix(path, \"https://\") {\n\t\tresp, err := http.Head(path)\n\t\tif err != nil {\n\t\t\treturn false, fmt.Errorf(\"failed to check URL for PDF: %w\", err)\n\t\t}\n\t\tdefer resp.Body.Close()\n\t\treturn resp.Header.Get(\"Content-Type\") == \"application/pdf\", nil\n\t}\n\n\t// Check local file\n\tfile, err := os.Open(path)\n\tif err != nil {\n\t\treturn false, fmt.Errorf(\"failed to open file: %w\", err)\n\t}\n\tdefer file.Close()\n\n\t// Read first 512 bytes to determine file type\n\tbuffer := make([]byte, 512)\n\tn, err := file.Read(buffer)\n\tif err != nil && err != io.EOF {\n\t\treturn false, fmt.Errorf(\"failed to read file header: %w\", err)\n\t}\n\n\t// Check file signature\n\tcontentType := http.DetectContentType(buffer[:n])\n\tif contentType == \"application/pdf\" {\n\t\treturn true, nil\n\t}\n\n\t// Also check file extension\n\treturn strings.ToLower(filepath.Ext(path)) == \".pdf\", nil\n}\n\nfunc downloadPDF(url string) (io.ReadCloser, error) {\n\tresp, err := http.Get(url)\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\n\tif resp.StatusCode != http.StatusOK {\n\t\tresp.Body.Close()\n\t\treturn nil, fmt.Errorf(\"failed to download PDF: status code %d\", resp.StatusCode)\n\t}\n\n\tif resp.Header.Get(\"Content-Type\") != \"application/pdf\" {\n\t\tresp.Body.Close()\n\t\treturn nil, fmt.Errorf(\"URL does not point to a PDF file\")\n\t}\n\n\treturn resp.Body, nil\n}\n\nfunc cleanText(text string) string {\n\tif strings.Contains(text, \"%PDF-\") || strings.Contains(text, \"endobj\") {\n\t\t// This appears to be raw PDF data rather than extracted text\n\t\treturn \"\"\n\t}\n\n\t// Remove control characters except newlines and tabs\n\ttext = strings.Map(func(r rune) rune {\n\t\tif r < 
32 && r != '\\n' && r != '\\t' {\n\t\t\treturn -1\n\t\t}\n\t\treturn r\n\t}, text)\n\n\t// Split into lines and clean each line\n\tlines := strings.Split(text, \"\\n\")\n\tvar cleanLines []string\n\n\tfor _, line := range lines {\n\t\tline = strings.TrimSpace(line)\n\n\t\t// Skip empty lines and lines that look like PDF syntax\n\t\tif line == \"\" ||\n\t\t\tstrings.HasPrefix(line, \"%\") ||\n\t\t\tstrings.HasPrefix(line, \"/\") ||\n\t\t\tstrings.Contains(line, \"obj\") ||\n\t\t\tstrings.Contains(line, \"endobj\") ||\n\t\t\tstrings.Contains(line, \"stream\") {\n\t\t\tcontinue\n\t\t}\n\n\t\tcleanLines = append(cleanLines, line)\n\t}\n\n\treturn strings.Join(cleanLines, \"\\n\\n\")\n}\n"
  },
  {
    "path": "scripts/install.sh",
    "content": "#!/usr/bin/env bash\n\n# This is a simple installer that gets the latest version of ingest from GitHub and installs it to /usr/local/bin\n\nINSTALL_DIR=\"/usr/local/bin\"\nINSTALL_PATH=\"${INSTALL_PATH:-$INSTALL_DIR/ingest}\"\nARCH=$(uname -m | tr '[:upper:]' '[:lower:]')\nOS=$(uname -s | tr '[:upper:]' '[:lower:]')\n\n# Ensure the user is not root\nif [ \"$EUID\" -eq 0 ]; then\n  echo \"Please do not run as root\"\n  exit 1\nfi\n\n# Get the latest release from GitHub\nVER=$(curl --silent -qI https://github.com/sammcj/ingest/releases/latest | awk -F '/' '/^location/ {print substr($NF, 1, length($NF)-1)}')\n\necho \"Downloading ingest ${VER} for ${OS}-${ARCH}...\"\n\nwget -q --show-progress -O ingest \"https://github.com/sammcj/ingest/releases/download/$VER/ingest-${OS}-${ARCH}\"\n\n# Move the binary to the install directory\nmv ingest \"${INSTALL_PATH}\"\n\n# Make the binary executable\nchmod +x \"${INSTALL_PATH}\"\n\necho \"ingest has been installed to ${INSTALL_PATH}\"\n"
  },
  {
    "path": "template/template.go",
    "content": "package template\n\nimport (\n\t\"fmt\"\n\t\"os\"\n\t\"path/filepath\"\n\t\"strings\"\n\t\"text/template\"\n\n\t\"github.com/fatih/color\"\n\t\"github.com/mitchellh/go-homedir\"\n\t\"github.com/sammcj/ingest/utils\"\n)\n\nfunc SetupTemplate(templatePath string) (*template.Template, error) {\n\tvar templateContent string\n\tvar err error\n\n\tif templatePath != \"\" {\n\t\ttemplateContent, err = readTemplateFile(templatePath)\n\t} else {\n\t\ttemplateContent, err = getDefaultTemplate()\n\t}\n\n\tif err != nil {\n\t\treturn nil, fmt.Errorf(\"failed to read template: %w\", err)\n\t}\n\n\ttmpl, err := template.New(\"default\").Parse(templateContent)\n\tif err != nil {\n\t\treturn nil, fmt.Errorf(\"failed to parse template: %w\", err)\n\t}\n\n\treturn tmpl, nil\n}\n\nfunc readTemplateFile(path string) (string, error) {\n\tcontent, err := os.ReadFile(path)\n\tif err != nil {\n\t\treturn \"\", err\n\t}\n\treturn string(content), nil\n}\n\nfunc getDefaultTemplate() (string, error) {\n\t// Check for user-specific template\n\thome, err := homedir.Dir()\n\tif err == nil {\n\t\tuserTemplateDir := filepath.Join(home, \".config\", \"ingest\", \"patterns\", \"templates\")\n\t\tuserDefaultTemplate := filepath.Join(userTemplateDir, \"default.tmpl\")\n\t\tif _, err := os.Stat(userDefaultTemplate); err == nil {\n\t\t\treturn readTemplateFile(userDefaultTemplate)\n\t\t}\n\t}\n\n\t// If no user-specific template, use built-in template\n\treturn readEmbeddedTemplate()\n}\n\nfunc readEmbeddedTemplate() (string, error) {\n\treturn `\nSource Trees:\n\n{{.source_trees}}\n\n{{if .excluded}}\nExcluded Content:\n{{if le .excluded.TotalFiles 20}}\nFiles:\n{{range .excluded.Files}}\n- {{.}}\n{{end}}\n{{else}}\nDirectories with excluded files:\n{{range $dir, $count := .excluded.Directories}}\n{{if gt $count 0}}- {{$dir}}: {{$count}} files{{end}}\n{{end}}\n\nFile extensions excluded:\n{{range $ext, $count := .excluded.Extensions}}\n- {{$ext}}: {{$count}} 
files\n{{end}}\n{{end}}\n\n{{end}}\n\n{{range .files}}\n{{if .Code}}\n` + \"`{{.Path}}:`\" + `\n\n{{.Code}}\n\n{{end}}\n{{end}}\n\n{{range .git_data}}\n{{if or .GitDiff .GitDiffBranch .GitLogBranch}}\nGit Information for {{.Path}}:\n{{if .GitDiff}}\nGit Diff:\n{{.GitDiff}}\n{{end}}\n{{if .GitDiffBranch}}\nGit Diff Between Branches:\n{{.GitDiffBranch}}\n{{end}}\n{{if .GitLogBranch}}\nGit Log Between Branches:\n{{.GitLogBranch}}\n{{end}}\n{{end}}\n{{end}}\n`, nil\n}\n\nfunc RenderTemplate(tmpl *template.Template, data map[string]any) (string, error) {\n\tvar output strings.Builder\n\terr := tmpl.Execute(&output, data)\n\tif err != nil {\n\t\treturn \"\", fmt.Errorf(\"failed to render template: %w\", err)\n\t}\n\treturn output.String(), nil\n}\n\nfunc PrintDefaultTemplate() {\n\tdefaultTemplate, err := readEmbeddedTemplate()\n\tif err != nil {\n\t\tutils.PrintColouredMessage(\"!\", fmt.Sprintf(\"Failed to get default template: %v\", err), color.FgRed)\n\t\tos.Exit(1)\n\t}\n\tfmt.Println(defaultTemplate)\n}\n"
  },
  {
    "path": "token/anthropic.go",
    "content": "package token\n\nimport (\n\t\"bytes\"\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"io\"\n\t\"net/http\"\n\t\"os\"\n\t\"sync\"\n\t\"time\"\n)\n\ntype AnthropicTokenCountRequest struct {\n\tModel    string    `json:\"model\"`\n\tMessages []Message `json:\"messages\"`\n}\n\ntype Message struct {\n\tRole    string `json:\"role\"`\n\tContent string `json:\"content\"`\n}\n\ntype AnthropicTokenCountResponse struct {\n\tInputTokens int `json:\"input_tokens\"`\n}\n\n// AnthropicModel is the Claude model used for token counting\nconst AnthropicModel = \"claude-sonnet-4-5\"\n\n// httpClient is a reusable HTTP client with connection pooling and timeout\nvar httpClient = &http.Client{\n\tTimeout: 30 * time.Second,\n}\n\nfunc getAnthropicAPIKey() (string, error) {\n\t// Check environment variables in order of preference\n\tif key := os.Getenv(\"ANTHROPIC_API_KEY\"); key != \"\" {\n\t\treturn key, nil\n\t}\n\tif key := os.Getenv(\"ANTHROPIC_TOKEN\"); key != \"\" {\n\t\treturn key, nil\n\t}\n\tif key := os.Getenv(\"ANTHROPIC_TOKEN_COUNT_KEY\"); key != \"\" {\n\t\treturn key, nil\n\t}\n\treturn \"\", fmt.Errorf(\"no Anthropic API key found in environment variables (checked ANTHROPIC_API_KEY, ANTHROPIC_TOKEN, ANTHROPIC_TOKEN_COUNT_KEY)\")\n}\n\nfunc CountTokensAPI(content string) (int, error) {\n\tapiKey, err := getAnthropicAPIKey()\n\tif err != nil {\n\t\treturn 0, err\n\t}\n\n\trequestBody := AnthropicTokenCountRequest{\n\t\tModel: AnthropicModel,\n\t\tMessages: []Message{\n\t\t\t{\n\t\t\t\tRole:    \"user\",\n\t\t\t\tContent: content,\n\t\t\t},\n\t\t},\n\t}\n\n\tjsonData, err := json.Marshal(requestBody)\n\tif err != nil {\n\t\treturn 0, fmt.Errorf(\"failed to marshal request: %w\", err)\n\t}\n\n\treq, err := http.NewRequest(\"POST\", \"https://api.anthropic.com/v1/messages/count_tokens\", bytes.NewBuffer(jsonData))\n\tif err != nil {\n\t\treturn 0, fmt.Errorf(\"failed to create request: %w\", err)\n\t}\n\n\treq.Header.Set(\"Content-Type\", 
\"application/json\")\n\treq.Header.Set(\"x-api-key\", apiKey)\n\treq.Header.Set(\"anthropic-version\", \"2023-06-01\")\n\n\tresp, err := httpClient.Do(req)\n\tif err != nil {\n\t\treturn 0, fmt.Errorf(\"failed to make request: %w\", err)\n\t}\n\tdefer resp.Body.Close()\n\n\tif resp.StatusCode != http.StatusOK {\n\t\tbody, _ := io.ReadAll(resp.Body)\n\t\treturn 0, fmt.Errorf(\"API request failed with status %d: %s\", resp.StatusCode, string(body))\n\t}\n\n\tvar response AnthropicTokenCountResponse\n\tif err := json.NewDecoder(resp.Body).Decode(&response); err != nil {\n\t\treturn 0, fmt.Errorf(\"failed to decode response: %w\", err)\n\t}\n\n\treturn response.InputTokens, nil\n}\n\n// CountTokensBatchAPI counts tokens for multiple content strings in parallel batches.\n// Processes up to batchSize items concurrently to avoid overwhelming the API.\nfunc CountTokensBatchAPI(contents []string, batchSize int) ([]int, error) {\n\tif len(contents) == 0 {\n\t\treturn []int{}, nil\n\t}\n\n\tresults := make([]int, len(contents))\n\terrors := make([]error, len(contents))\n\tvar wg sync.WaitGroup\n\n\t// Process in batches\n\tfor i := 0; i < len(contents); i += batchSize {\n\t\tend := min(i+batchSize, len(contents))\n\n\t\t// Process this batch concurrently\n\t\tfor j := i; j < end; j++ {\n\t\t\twg.Add(1)\n\t\t\tgo func(index int, content string) {\n\t\t\t\tdefer wg.Done()\n\t\t\t\tcount, err := CountTokensAPI(content)\n\t\t\t\tif err != nil {\n\t\t\t\t\terrors[index] = err\n\t\t\t\t\tresults[index] = 0\n\t\t\t\t} else {\n\t\t\t\t\tresults[index] = count\n\t\t\t\t}\n\t\t\t}(j, contents[j])\n\t\t}\n\n\t\t// Wait for this batch to complete before starting the next\n\t\twg.Wait()\n\t}\n\n\t// Check if any errors occurred\n\tvar firstError error\n\tfor _, err := range errors {\n\t\tif err != nil {\n\t\t\tfirstError = err\n\t\t\tbreak\n\t\t}\n\t}\n\n\treturn results, firstError\n}\n"
  },
  {
    "path": "token/token.go",
    "content": "package token\n\nimport (\n\t\"fmt\"\n\t\"sync\"\n\n\t\"github.com/fatih/color\"\n\t\"github.com/pkoukk/tiktoken-go\"\n\t\"github.com/sammcj/ingest/utils\"\n)\n\nvar (\n\tapiUsedOnce    sync.Once\n\tapiWarningOnce sync.Once\n)\n\n// CorrectionMultiplier is applied to offline token counts for better accuracy.\n// Based on empirical analysis comparing offline tokeniser with Anthropic API,\n// the offline tokeniser underestimates by approximately 15.25%.\n// A 1.18x multiplier reduces average error from ~17% to ~2%.\nconst CorrectionMultiplier = 1.18\n\nfunc GetTokenizer(encoding string) *tiktoken.Tiktoken {\n\tvar err error\n\tvar tk *tiktoken.Tiktoken\n\n\tswitch encoding {\n\tcase \"o200k\", \"gpt-4o\", \"gpt-4.1\", \"gpt-4.5\":\n\t\ttk, err = tiktoken.GetEncoding(\"o200k_base\")\n\tcase \"cl100k\", \"llama3\", \"llama-3\", \"gpt-4\", \"gpt-3.5-turbo\", \"text-embedding-3-large\", \"text-embedding-3-small\", \"text-embedding-ada-002\", \"text-ada-002\":\n\t\ttk, err = tiktoken.GetEncoding(\"cl100k_base\")\n\tcase \"p50k\":\n\t\ttk, err = tiktoken.GetEncoding(\"p50k_base\")\n\tcase \"r50k\", \"gpt2\", \"text-ada-001\", \"text-curie-001\", \"text-babbage-001\":\n\t\ttk, err = tiktoken.GetEncoding(\"r50k_base\")\n\tdefault:\n\t\t// Default to o200k_base for modern Anthropic Claude and OpenAI models\n\t\ttk, err = tiktoken.GetEncoding(\"o200k_base\")\n\t}\n\n\tif err != nil {\n\t\tfmt.Printf(\"Failed to get tokenizer: %v\\n\", err)\n\t\treturn nil\n\t}\n\treturn tk\n}\n\nfunc GetModelInfo(encoding string) string {\n\tswitch encoding {\n\tcase \"o200k\", \"gpt-4o\", \"gpt-4.1\", \"gpt-4.5\":\n\t\treturn \"OpenAI gpt-4+, Anthropic Claude Haiku/Sonnet/Opus 3+ models\"\n\tcase \"cl100k\":\n\t\treturn \"Llama3, OpenAI <4o models, text-embedding-ada-002, gpt-4 etc...\"\n\tcase \"p50k\":\n\t\treturn \"OpenAI code models, text-davinci-002, text-davinci-003 etc...\"\n\tcase \"r50k\", \"gpt2\", \"llama2\", \"llama-2\":\n\t\treturn \"Legacy models like llama2, 
GPT-3, davinci etc...\"\n\tdefault:\n\t\treturn \"OpenAI gpt-4+, Anthropic Claude Haiku/Sonnet/Opus 3+ models\"\n\t}\n}\n\nfunc CountTokens(rendered string, encoding string, useAnthropicAPI bool, noCorrection bool) int {\n\tif useAnthropicAPI {\n\t\tcount, err := CountTokensAPI(rendered)\n\t\tif err != nil {\n\t\t\tapiWarningOnce.Do(func() {\n\t\t\t\tfmt.Printf(\"Warning: Failed to count tokens using Anthropic API: %v\\nFalling back to offline tokeniser\\n\", err)\n\t\t\t})\n\t\t\t// Fall back to offline tokenizer\n\t\t} else {\n\t\t\tapiUsedOnce.Do(func() {\n\t\t\t\tfmt.Println() // Add blank line for visibility\n\t\t\t\tutils.PrintColouredMessage(\"✓\", fmt.Sprintf(\"Using Anthropic API (%s) for token counting\", AnthropicModel), color.FgYellow)\n\t\t\t})\n\t\t\treturn count\n\t\t}\n\t}\n\n\ttk := GetTokenizer(encoding)\n\tif tk == nil {\n\t\treturn 0\n\t}\n\n\ttokens := tk.Encode(rendered, nil, nil)\n\trawCount := len(tokens)\n\n\t// Apply correction multiplier for better accuracy unless disabled\n\tif noCorrection {\n\t\treturn rawCount\n\t}\n\n\tcorrectedCount := float64(rawCount) * CorrectionMultiplier\n\treturn int(correctedCount)\n}\n\n// CountTokensBatch counts tokens for multiple strings, using parallel API calls if Anthropic API is enabled.\n// Processes up to 4 items concurrently when using the API.\nfunc CountTokensBatch(contents []string, encoding string, useAnthropicAPI bool, noCorrection bool) []int {\n\tif len(contents) == 0 {\n\t\treturn []int{}\n\t}\n\n\tif useAnthropicAPI {\n\t\tcounts, err := CountTokensBatchAPI(contents, 4)\n\t\tif err != nil {\n\t\t\tapiWarningOnce.Do(func() {\n\t\t\t\tfmt.Printf(\"Warning: Failed to count tokens using Anthropic API: %v\\nFalling back to offline tokeniser\\n\", err)\n\t\t\t})\n\t\t\t// Fall back to offline tokeniser for all items\n\t\t} else {\n\t\t\tapiUsedOnce.Do(func() {\n\t\t\t\tfmt.Println() // Add blank line for visibility\n\t\t\t\tutils.PrintColouredMessage(\"✓\", fmt.Sprintf(\"Using Anthropic API (%s) 
for token counting\", AnthropicModel), color.FgYellow)\n\t\t\t})\n\t\t\treturn counts\n\t\t}\n\t}\n\n\t// Use offline tokeniser for all items\n\tresults := make([]int, len(contents))\n\ttk := GetTokenizer(encoding)\n\tif tk == nil {\n\t\treturn results\n\t}\n\n\tfor i, content := range contents {\n\t\ttokens := tk.Encode(content, nil, nil)\n\t\trawCount := len(tokens)\n\n\t\tif noCorrection {\n\t\t\tresults[i] = rawCount\n\t\t} else {\n\t\t\tcorrectedCount := float64(rawCount) * CorrectionMultiplier\n\t\t\tresults[i] = int(correctedCount)\n\t\t}\n\t}\n\n\treturn results\n}\n"
  },
  {
    "path": "tree-sitter-dev-plan.md",
    "content": "# Tree-sitter Based Code Compressor - Development Plan\n\n## Overview\n\nThis document outlines the development plan for a Tree-sitter based code compressor. The goal is to extract key structural information from source code, such as imports, package definitions, function/method/class signatures, and comments, while omitting detailed implementation bodies.\n\n## Phases\n\n### Phase 1: Basic Language Identification & Parsing\n\n- [x] Identify language from file extension.\n- [x] Parse source code into an AST using Tree-sitter.\n- [x] Implement basic Tree-sitter query for Go (imports, functions, methods, types).\n\n**Status:** Completed.\n\n### Phase 2: Code Chunk Extraction & Compression Logic\n\n- [x] Define `CodeChunk` struct (Note: `OriginalLine` field is a placeholder, actual line mapping not yet implemented - see Phase 6).\n- [x] Implement `GenericCompressor` with `Compress` method.\n- [x] Execute queries and process captures from matches.\n- [x] Implement logic to strip function/method bodies and retain signatures.\n  - [x] Go\n  - [x] Python\n  - [x] JavaScript (for named functions, classes, methods, including exported and default exported named variants)\n- [x] Handle different types of captures (imports, packages, type definitions, comments).\n- [x] Sort and combine extracted code chunks.\n\n**Status:** Largely completed. Core compression logic is functional for Go, Python, and key JavaScript constructs. 
All `internal/compressor` tests are passing.\n\n### Phase 3: Language-Specific Queries & Refinements\n\n- **Go:**\n  - [x] Query for package, imports, type definitions, function declarations, method declarations, comments.\n- **Python:**\n  - [x] Query for imports, function definitions, class definitions, comments.\n- **JavaScript:**\n  - [x] Query for imports, comments, method definitions.\n  - [x] Query and processing logic for:\n    - [x] Exported named functions/classes/generators (e.g., `export function foo() {}`).\n    - [x] Default exported named functions/classes/generators (e.g., `export default function foo() {}`).\n    - [x] Standalone named functions/classes/generators.\n  - [ ] **Outstanding/Needs Refinement for JavaScript:**\n    - [ ] **Arrow Functions:** Implement body stripping for arrow functions assigned to variables (e.g., `const myArrow = () => { /* body */ };`). The test file `example.js` includes `const myArrowFunc = (a, b) => { /* Arrow function body */ ... }` which is currently not processed for body stripping.\n    - [ ] **Anonymous Default Exports:** Clarify and potentially implement body stripping for anonymous default exported functions and classes (e.g., `export default function() { /* body */ }`). Currently, these are captured by `@export.other` and kept whole. The test file `example.js` includes `export default function() { console.log(\"Anon default func\"); }` and `export default class { constructor() { this.x = 1;} }`, implying these might need stripping.\n    - [ ] Review other common JS constructs (e.g., object methods not part of class syntax, IIFEs) if they need specific handling.\n- **Other Languages:**\n  - [ ] Plan for and implement queries for other languages as needed (e.g., TypeScript, Java, C++).\n\n**Status:** Go and Python are well-covered by current tests. JavaScript has significantly improved and handles many common cases, with tests passing for implemented features. 
Specific outstanding items for JavaScript are noted above.\n\n### Phase 4: Testing & Edge Cases\n\n- [x] Write initial unit tests for Go, Python, JavaScript compressor logic. (Current tests for `internal/compressor` are passing).\n- [ ] Expand test coverage with more complex real-world examples and edge cases for all supported languages, particularly for JavaScript features listed as outstanding in Phase 3.\n- [ ] Test interactions between different JavaScript export/import syntaxes.\n\n**Status:** Basic unit tests are in place and passing. Further testing is required, especially for JavaScript refinements.\n\n### Phase 5: Integration & CLI\n\n- [ ] Integrate the `GenericCompressor` into `main.go`.\n- [ ] Add CLI flags for language selection, input file/directory, output options.\n- [ ] Handle file system traversal for multiple files.\n\n**Status:** Not started.\n\n### Phase 6: Advanced Features (Future Considerations)\n\n- [ ] **Line Number Mapping:** Fully implement `OriginalLine` in `CodeChunk` to map compressed chunks back to their original line numbers.\n- [ ] Contextual Compression: Explore options for keeping more context if needed (e.g., call sites of a function, specific variable assignments).\n- [ ] Configuration: Allow users to customise what to extract/omit via configuration files or advanced CLI options.\n- [ ] Performance Optimisation: Profile and optimise parsing and query execution for large codebases.\n\n**Status:** Not started.\n\nLanguages supported by `smacker/go-tree-sitter`:\n\n```\nbash\nc\ncpp\ncsharp\ncss\ncue\ndockerfile\nelixir\nelm\ngolang\ngroovy\nhcl\nhtml\njava\njavascript\nkotlin\nlua\nmarkdown\nocaml\nphp\nprotobuf\npython\nruby\nrust\nscala\nsql\nsvelte\nswift\ntoml\ntypescript\nyaml\n```\n\nThe most important to support initially (if we even have to do anything to enable them?) are:\n\n- Go\n- Python\n- JavaScript\n- TypeScript\n- bash\n- rust\n- swift\n- toml\n- yaml\n- css\n- c\n- html\n- sql\n"
  },
  {
    "path": "utils/output_manager.go",
    "content": "package utils\n\nimport (\n\t\"sort\"\n\t\"sync\"\n\n\t\"github.com/fatih/color\"\n)\n\ntype OutputMessage struct {\n\tSymbol   string\n\tMessage  string\n\tColor    color.Attribute\n\tPriority int\n}\n\nvar (\n\tmessages []OutputMessage\n\tmutex    sync.Mutex\n)\n\n// AddMessage adds a message to the output queue\nfunc AddMessage(symbol string, message string, messageColor color.Attribute, priority int) {\n\tmutex.Lock()\n\tdefer mutex.Unlock()\n\tmessages = append(messages, OutputMessage{\n\t\tSymbol:   symbol,\n\t\tMessage:  message,\n\t\tColor:    messageColor,\n\t\tPriority: priority,\n\t})\n}\n\n// PrintMessages prints all collected messages sorted by priority\nfunc PrintMessages() {\n\tmutex.Lock()\n\tdefer mutex.Unlock()\n\n\t// Sort messages by priority (lower priority prints later)\n\tsort.Slice(messages, func(i, j int) bool {\n\t\treturn messages[i].Priority > messages[j].Priority\n\t})\n\n\tfor _, msg := range messages {\n\t\tPrintColouredMessage(msg.Symbol, msg.Message, msg.Color)\n\t}\n\n\t// Clear messages after printing\n\tmessages = nil\n}\n"
  },
  {
    "path": "utils/utils.go",
    "content": "package utils\n\nimport (\n\t\"fmt\"\n\t\"os\"\n\t\"os/exec\"\n\t\"path/filepath\"\n\t\"strconv\"\n\t\"strings\"\n\t\"syscall\"\n\t\"unsafe\"\n\n\t\"github.com/atotto/clipboard\"\n\t\"github.com/fatih/color\"\n\t\"github.com/mitchellh/go-homedir\"\n\t\"github.com/schollz/progressbar/v3\"\n)\n\ntype termSize struct {\n\tRow    uint16\n\tCol    uint16\n\tXpixel uint16\n\tYpixel uint16\n}\n\nfunc CopyToClipboard(rendered string) error {\n\tif isWSL() {\n\t\tif err := writeToWindowsClipboard(rendered); err == nil {\n\t\t\treturn nil\n\t\t}\n\t}\n\n\t// fallback to default\n\treturn writeClipboard(rendered)\n}\n\nfunc writeToWindowsClipboard(text string) error {\n\t// Copy using PowerShell for UTF-8 encoding\n\tpsCmd := `[Console]::InputEncoding = [System.Text.Encoding]::UTF8; $s = [Console]::In.ReadToEnd(); Set-Clipboard -Value $s`\n\tcmd := exec.Command(\"powershell.exe\", \"-NoProfile\", \"-Command\", psCmd)\n\tstdin, err := cmd.StdinPipe()\n\tif err != nil {\n\t\treturn fmt.Errorf(\"failed to create stdin pipe for PowerShell: %w\", err)\n\t}\n\tdefer stdin.Close()\n\n\tif err := cmd.Start(); err != nil {\n\t\treturn fmt.Errorf(\"failed to start PowerShell: %w\", err)\n\t}\n\n\tif _, err := stdin.Write([]byte(text)); err != nil {\n\t\treturn fmt.Errorf(\"failed to write to PowerShell stdin: %w\", err)\n\t}\n\n\tif err := stdin.Close(); err != nil {\n\t\treturn fmt.Errorf(\"failed to close PowerShell stdin: %w\", err)\n\t}\n\n\tif err := cmd.Wait(); err != nil {\n\t\treturn fmt.Errorf(\"PowerShell failed to set clipboard: %w\", err)\n\t}\n\n\treturn nil\n}\n\nfunc writeClipboard(text string) error {\n\tif err := clipboard.WriteAll(text); err != nil {\n\t\treturn fmt.Errorf(\"failed to copy to clipboard: %v\", err)\n\t}\n\treturn nil\n}\n\nfunc WriteToFile(outputPath string, rendered string) error {\n\terr := os.WriteFile(outputPath, []byte(rendered), 0644)\n\tif err != nil {\n\t\treturn fmt.Errorf(\"failed to write to file: %v\", 
err)\n\t}\n\tfmt.Printf(\"%s Prompt written to file: %s\\n\", color.GreenString(\"✓\"), outputPath)\n\treturn nil\n}\n\nfunc SetupSpinner(message string) *progressbar.ProgressBar {\n\treturn progressbar.NewOptions(-1,\n\t\tprogressbar.OptionSetDescription(message),\n\t\tprogressbar.OptionSpinnerType(14),\n\t\tprogressbar.OptionSetTheme(progressbar.Theme{\n\t\t\tSaucer:        \"=\",\n\t\t\tSaucerHead:    \">\",\n\t\t\tSaucerPadding: \" \",\n\t\t\tBarStart:      \"[\",\n\t\t\tBarEnd:        \"]\",\n\t\t}),\n\t)\n}\n\nfunc Label(path string) string {\n\tif path == \"\" {\n\t\twd, err := os.Getwd()\n\t\tif err != nil {\n\t\t\treturn \".\"\n\t\t}\n\t\treturn wd\n\t}\n\treturn path\n}\n\nfunc PrintColouredMessage(symbol string, message string, messageColor color.Attribute) {\n\twhite := color.New(color.FgWhite, color.Bold).SprintFunc()\n\tcolouredMessage := color.New(messageColor).SprintFunc()\n\n\tfmt.Printf(\"%s%s%s %s\\n\", white(\"[\"), white(symbol), white(\"]\"), colouredMessage(message))\n}\n\nfunc EnsureConfigDirectories() error {\n\thome, err := homedir.Dir()\n\tif err != nil {\n\t\treturn fmt.Errorf(\"failed to get home directory: %w\", err)\n\t}\n\n\tconfigDirs := []struct {\n\t\tpath string\n\t\tdesc string\n\t}{\n\t\t{filepath.Join(home, \".config\", \"ingest\", \"patterns\", \"exclude\"), \"Add .glob files here containing glob matches to exclude additional patterns.\"},\n\t\t{filepath.Join(home, \".config\", \"ingest\", \"patterns\", \"templates\"), \"Add go templates with the extension .tmpl here for different output formats.\"},\n\t}\n\n\tfor _, dir := range configDirs {\n\t\tif err := os.MkdirAll(dir.path, 0755); err != nil {\n\t\t\treturn fmt.Errorf(\"failed to create directory %s: %w\", dir.path, err)\n\t\t}\n\n\t\treadmePath := filepath.Join(dir.path, \"README.md\")\n\t\tif _, err := os.Stat(readmePath); os.IsNotExist(err) {\n\t\t\tcontent := fmt.Sprintf(\"# %s\\n\\n%s\", filepath.Base(dir.path), dir.desc)\n\t\t\tif err := os.WriteFile(readmePath, 
[]byte(content), 0644); err != nil {\n\t\t\t\treturn fmt.Errorf(\"failed to create README.md in %s: %w\", dir.path, err)\n\t\t\t}\n\t\t}\n\t}\n\n\treturn nil\n}\n\nfunc FormatNumber(n int) string {\n\tin := strconv.Itoa(n)\n\tout := make([]byte, len(in)+(len(in)-2+int(in[0]/'0'))/3)\n\tif in[0] == '-' {\n\t\tin, out[0] = in[1:], '-'\n\t}\n\n\tfor i, j, k := len(in)-1, len(out)-1, 0; ; i, j = i-1, j-1 {\n\t\tout[j] = in[i]\n\t\tif i == 0 {\n\t\t\treturn string(out)\n\t\t}\n\t\tif k++; k == 3 {\n\t\t\tj, k = j-1, 0\n\t\t\tout[j] = ','\n\t\t}\n\t}\n}\n\nfunc GetTerminalWidth() int {\n\tws := &termSize{}\n\tretCode, _, _ := syscall.Syscall(syscall.SYS_IOCTL,\n\t\tuintptr(syscall.Stdin),\n\t\tuintptr(syscall.TIOCGWINSZ),\n\t\tuintptr(unsafe.Pointer(ws)))\n\n\tif int(retCode) == -1 {\n\t\treturn 100\n\t}\n\treturn int(ws.Col)\n}\n\nfunc isWSL() bool {\n\tif _, ok := os.LookupEnv(\"WSL_DISTRO_NAME\"); ok {\n\t\treturn true\n\t}\n\n\tif out, err := os.ReadFile(\"/proc/sys/kernel/osrelease\"); err == nil {\n\t\ts := strings.ToLower(string(out))\n\t\tif strings.Contains(s, \"microsoft\") || strings.Contains(s, \"wsl\") {\n\t\t\treturn true\n\t\t}\n\t}\n\treturn false\n}\n"
  },
  {
    "path": "web/crawler.go",
    "content": "// web/crawler.go\n\npackage web\n\nimport (\n\t\"io\"\n\t\"net/http\"\n\t\"net/url\"\n\t\"strings\"\n\t\"sync\"\n\t\"time\"\n\n\tmd \"github.com/JohannesKaufmann/html-to-markdown\"\n\t\"github.com/JohannesKaufmann/html-to-markdown/plugin\"\n\t\"github.com/PuerkitoBio/goquery\"\n\t\"github.com/bmatcuk/doublestar/v4\"\n)\n\ntype CrawlOptions struct {\n\tMaxDepth       int\n\tAllowedDomains []string\n\tTimeout        int\n\tConcurrentJobs int\n}\n\ntype WebPage struct {\n\tURL         string\n\tContent     string\n\tTitle       string\n\tLinks       []string\n\tDepth       int\n\tStatusCode  int\n\tContentType string\n}\n\ntype Crawler struct {\n\tvisited         map[string]bool\n\tvisitedLock     sync.Mutex\n\toptions         CrawlOptions\n\tconverter       *md.Converter\n\texcludePatterns []string\n\tinitialPath     string // Store the initial URL path\n\tsinglePageMode  bool   // True if crawling a specific page\n}\n\nfunc NewCrawler(options CrawlOptions, startURL string) *Crawler {\n\tparsedURL, err := url.Parse(startURL)\n\tinitialPath := \"/\"\n\tsinglePageMode := false\n\n\tif err == nil && parsedURL.Path != \"\" && parsedURL.Path != \"/\" {\n\t\tinitialPath = strings.TrimSuffix(parsedURL.Path, \"/\")\n\t\tsinglePageMode = true\n\t}\n\t// Create a new converter with GitHub Flavored Markdown support\n\tconverter := md.NewConverter(\"\", true, &md.Options{\n\t\t// Configure the converter to handle common edge cases\n\t\tStrongDelimiter:  \"**\",\n\t\tEmDelimiter:      \"*\",\n\t\tLinkStyle:        \"inlined\",\n\t\tHeadingStyle:     \"atx\",\n\t\tHorizontalRule:   \"---\",\n\t\tCodeBlockStyle:   \"fenced\",\n\t\tBulletListMarker: \"-\",\n\t})\n\n\t// Use GitHub Flavored Markdown plugins\n\tconverter.Use(plugin.GitHubFlavored())\n\n\t// Configure the converter to handle specific elements\n\tconverter.Keep(\"math\", \"script[type='math/tex']\")         // Keep math formulas\n\tconverter.Remove(\"script\", \"style\", \"iframe\", \"noscript\") // 
Remove unwanted elements\n\n\treturn &Crawler{\n\t\tvisited:        make(map[string]bool),\n\t\toptions:        options,\n\t\tconverter:      converter,\n\t\tinitialPath:    initialPath,\n\t\tsinglePageMode: singlePageMode,\n\t}\n}\n\nfunc (c *Crawler) SetExcludePatterns(patterns []string) {\n\tc.excludePatterns = patterns\n}\n\nfunc (c *Crawler) shouldExclude(urlStr string) bool {\n\tparsedURL, err := url.Parse(urlStr)\n\tif err != nil {\n\t\treturn false\n\t}\n\n\tpath := parsedURL.Path\n\tif path == \"\" {\n\t\tpath = \"/\"\n\t}\n\n\tfor _, pattern := range c.excludePatterns {\n\t\tcleanPath := strings.TrimPrefix(path, \"/\")\n\t\tif match, _ := doublestar.Match(pattern, cleanPath); match {\n\t\t\treturn true\n\t\t}\n\t\tif match, _ := doublestar.Match(pattern, path); match {\n\t\t\treturn true\n\t\t}\n\t}\n\n\treturn false\n}\n\nfunc (c *Crawler) hasVisited(urlStr string) bool {\n\tc.visitedLock.Lock()\n\tdefer c.visitedLock.Unlock()\n\treturn c.visited[urlStr]\n}\n\n// markVisited records urlStr as visited, returning false if it was already recorded.\n// Checking and marking under a single lock acquisition prevents two goroutines from\n// both passing a separate hasVisited check and fetching the same page twice.\nfunc (c *Crawler) markVisited(urlStr string) bool {\n\tc.visitedLock.Lock()\n\tdefer c.visitedLock.Unlock()\n\tif c.visited[urlStr] {\n\t\treturn false\n\t}\n\tc.visited[urlStr] = true\n\treturn true\n}\n\nfunc (c *Crawler) fetchPage(urlStr string, depth int) (*WebPage, error) {\n\tif depth > c.options.MaxDepth {\n\t\treturn nil, nil\n\t}\n\n\tif !c.isAllowed(urlStr) {\n\t\treturn nil, nil\n\t}\n\n\tif c.shouldExclude(urlStr) {\n\t\treturn nil, nil\n\t}\n\n\t// Atomically claim this URL before fetching so concurrent goroutines skip it\n\tif !c.markVisited(urlStr) {\n\t\treturn nil, nil\n\t}\n\n\tclient := &http.Client{\n\t\tTimeout: time.Duration(c.options.Timeout) * time.Second,\n\t}\n\n\tresp, err := client.Get(urlStr)\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\tdefer resp.Body.Close()\n\n\tbody, err := io.ReadAll(resp.Body)\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\n\t// Parse the HTML document\n\tdoc, err := goquery.NewDocumentFromReader(strings.NewReader(string(body)))\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\n\t// Extract title and links\n\ttitle := doc.Find(\"title\").Text()\n\tlinks := 
c.extractLinks(doc, urlStr)\n\n\t// Convert HTML to Markdown\n\tmarkdown, err := c.converter.ConvertString(string(body))\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\n\treturn &WebPage{\n\t\tURL:         urlStr,\n\t\tContent:     markdown,\n\t\tTitle:       title,\n\t\tLinks:       links,\n\t\tDepth:       depth,\n\t\tStatusCode:  resp.StatusCode,\n\t\tContentType: resp.Header.Get(\"Content-Type\"),\n\t}, nil\n}\n\nfunc (c *Crawler) extractLinks(doc *goquery.Document, baseURL string) []string {\n\tvar links []string\n\tbaseURLParsed, err := url.Parse(baseURL)\n\tif err != nil {\n\t\treturn links\n\t}\n\n\tdoc.Find(\"a[href]\").Each(func(_ int, s *goquery.Selection) {\n\t\tif href, exists := s.Attr(\"href\"); exists {\n\t\t\tif absURL := c.resolveURL(baseURLParsed, href); absURL != \"\" {\n\t\t\t\tlinks = append(links, absURL)\n\t\t\t}\n\t\t}\n\t})\n\n\treturn links\n}\n\nfunc (c *Crawler) resolveURL(base *url.URL, ref string) string {\n\trefURL, err := url.Parse(ref)\n\tif err != nil {\n\t\treturn \"\"\n\t}\n\n\tresolvedURL := base.ResolveReference(refURL)\n\tif !strings.HasPrefix(resolvedURL.Scheme, \"http\") {\n\t\treturn \"\"\n\t}\n\n\treturn resolvedURL.String()\n}\n\nfunc (c *Crawler) isAllowed(urlStr string) bool {\n\tparsedURL, err := url.Parse(urlStr)\n\tif err != nil {\n\t\treturn false\n\t}\n\n\t// Check domain restrictions if any\n\tif len(c.options.AllowedDomains) > 0 {\n\t\tdomainAllowed := false\n\t\tfor _, domain := range c.options.AllowedDomains {\n\t\t\tif strings.Contains(parsedURL.Host, domain) {\n\t\t\t\tdomainAllowed = true\n\t\t\t\tbreak\n\t\t\t}\n\t\t}\n\t\tif !domainAllowed {\n\t\t\treturn false\n\t\t}\n\t}\n\n\t// If we're in single page mode, only allow the exact same path\n\tif c.singlePageMode {\n\t\tcurrentPath := strings.TrimSuffix(parsedURL.Path, \"/\")\n\t\t// Only allow the exact same path or same path with a fragment\n\t\treturn currentPath == c.initialPath\n\t}\n\n\treturn true\n}\n\nfunc (c *Crawler) Crawl(startURL string) 
([]*WebPage, error) {\n\tvar pages []*WebPage\n\tvar pagesLock sync.Mutex\n\tvar wg sync.WaitGroup\n\tsemaphore := make(chan struct{}, c.options.ConcurrentJobs)\n\n\tvar crawlPage func(urlStr string, depth int)\n\tcrawlPage = func(urlStr string, depth int) {\n\t\tdefer wg.Done()\n\t\tsemaphore <- struct{}{}        // Acquire\n\t\tdefer func() { <-semaphore }() // Release\n\n\t\tpage, err := c.fetchPage(urlStr, depth)\n\t\tif err != nil || page == nil {\n\t\t\treturn\n\t\t}\n\n\t\tpagesLock.Lock()\n\t\tpages = append(pages, page)\n\t\tpagesLock.Unlock()\n\n\t\tif depth < c.options.MaxDepth {\n\t\t\tfor _, link := range page.Links {\n\t\t\t\tif !c.hasVisited(link) && c.isAllowed(link) {\n\t\t\t\t\twg.Add(1)\n\t\t\t\t\tgo crawlPage(link, depth+1)\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\t}\n\n\twg.Add(1)\n\tgo crawlPage(startURL, 0)\n\twg.Wait()\n\n\treturn pages, nil\n}\n"
  },
  {
    "path": "web/integration.go",
    "content": "// web/integration.go\n\npackage web\n\nimport (\n\t\"fmt\"\n\t\"net/url\"\n\t\"path/filepath\"\n\t\"strings\"\n\n\t\"github.com/sammcj/ingest/filesystem\"\n\t\"github.com/sammcj/ingest/pdf\"\n)\n\ntype CrawlResult struct {\n\tTreeString string\n\tFiles      []filesystem.FileInfo\n}\n\nfunc ProcessWebURL(urlStr string, options CrawlOptions, excludePatterns []string) (*CrawlResult, error) {\n\t// Check if URL points to a PDF\n\tisPDF, err := pdf.IsPDF(urlStr)\n\tif err != nil {\n\t\treturn nil, fmt.Errorf(\"error checking PDF: %w\", err)\n\t}\n\n\tif isPDF {\n\t\tcontent, err := pdf.ConvertPDFToMarkdown(urlStr, true)\n\t\tif err != nil {\n\t\t\treturn nil, fmt.Errorf(\"error converting PDF: %w\", err)\n\t\t}\n\n\t\treturn &CrawlResult{\n\t\t\tTreeString: fmt.Sprintf(\"PDF Document: %s\", urlStr),\n\t\t\tFiles: []filesystem.FileInfo{{\n\t\t\t\tPath:      urlStr,\n\t\t\t\tExtension: \".md\",\n\t\t\t\tCode:      content,\n\t\t\t}},\n\t\t}, nil\n\t}\n\n\t// Validate URL\n\tparsedURL, err := url.Parse(urlStr)\n\tif err != nil {\n\t\treturn nil, fmt.Errorf(\"invalid URL: %w\", err)\n\t}\n\n\tif !strings.HasPrefix(parsedURL.Scheme, \"http\") {\n\t\treturn nil, fmt.Errorf(\"URL must start with http:// or https://\")\n\t}\n\n\t// Initialize crawler with the start URL\n\tcrawler := NewCrawler(options, urlStr)\n\tcrawler.SetExcludePatterns(excludePatterns)\n\n\t// Perform crawl\n\tpages, err := crawler.Crawl(urlStr)\n\tif err != nil {\n\t\treturn nil, fmt.Errorf(\"crawl failed: %w\", err)\n\t}\n\n\t// Convert crawled pages to FileInfo format\n\tvar files []filesystem.FileInfo\n\tfor _, page := range pages {\n\t\t// Skip pages with no content or error status codes\n\t\tif page.StatusCode != 200 || page.Content == \"\" {\n\t\t\tcontinue\n\t\t}\n\n\t\tfiles = append(files, filesystem.FileInfo{\n\t\t\tPath:      page.URL,\n\t\t\tExtension: \".md\",\n\t\t\tCode:      page.Content,\n\t\t})\n\t}\n\n\t// Generate tree representation, but only if we have more than one 
page\n\tvar treeString string\n\tif len(files) > 1 {\n\t\ttreeString = generateWebTree(pages)\n\t} else if len(files) == 1 {\n\t\ttreeString = fmt.Sprintf(\"Web Page: %s\", files[0].Path)\n\t}\n\n\t// If we're crawling a specific page, only return that page's content\n\tif parsedURL.Path != \"/\" && parsedURL.Path != \"\" {\n\t\tfor _, file := range files {\n\t\t\tfileURL, err := url.Parse(file.Path)\n\t\t\tif err != nil {\n\t\t\t\tcontinue\n\t\t\t}\n\t\t\t// Find the exact matching path (ignoring trailing slashes)\n\t\t\tif strings.TrimSuffix(fileURL.Path, \"/\") == strings.TrimSuffix(parsedURL.Path, \"/\") {\n\t\t\t\treturn &CrawlResult{\n\t\t\t\t\tTreeString: fmt.Sprintf(\"Web Page: %s\", file.Path),\n\t\t\t\t\tFiles:      []filesystem.FileInfo{file},\n\t\t\t\t}, nil\n\t\t\t}\n\t\t}\n\t}\n\n\treturn &CrawlResult{\n\t\tTreeString: treeString,\n\t\tFiles:      files,\n\t}, nil\n}\n\nfunc generateWebTree(pages []*WebPage) string {\n\tvar builder strings.Builder\n\tbuilder.WriteString(\"Web Crawl Structure:\\n\")\n\n\t// Create a map of depth to pages\n\tdepthMap := make(map[int][]*WebPage)\n\tfor _, page := range pages {\n\t\tif page.StatusCode == 200 && page.Content != \"\" {\n\t\t\tdepthMap[page.Depth] = append(depthMap[page.Depth], page)\n\t\t}\n\t}\n\n\t// Depth levels may be sparse, so find the deepest level rather than using\n\t// len(depthMap), which undercounts when intermediate depths are missing\n\tmaxDepth := 0\n\tfor depth := range depthMap {\n\t\tif depth > maxDepth {\n\t\t\tmaxDepth = depth\n\t\t}\n\t}\n\n\t// Build the tree structure with indentation\n\tfor depth := 0; depth <= maxDepth; depth++ {\n\t\tif pages, ok := depthMap[depth]; ok {\n\t\t\tfor _, page := range pages {\n\t\t\t\tindent := strings.Repeat(\"  \", depth)\n\t\t\t\turlPath := getURLPath(page.URL)\n\t\t\t\tbuilder.WriteString(fmt.Sprintf(\"%s├── %s\\n\", indent, urlPath))\n\t\t\t}\n\t\t}\n\t}\n\n\treturn builder.String()\n}\n\nfunc getURLPath(urlStr string) string {\n\tparsedURL, err := url.Parse(urlStr)\n\tif err != nil {\n\t\treturn urlStr\n\t}\n\n\tpath := parsedURL.Path\n\tif path == \"\" || path == \"/\" {\n\t\treturn parsedURL.Host\n\t}\n\n\treturn filepath.Join(parsedURL.Host, path)\n}\n"
  }
]