[
  {
    "path": ".gitignore",
    "content": "# Compiled binary (rebuilt locally; release artifacts attached on GitHub)\n/jshunter\n/jshunter.exe\n/jshunter_*\n/dist/\n\n# Go build/test cache\n*.test\n*.out\n\n# Editor\n.vscode/\n.idea/\n*.swp\n.DS_Store\n\n# Operator state\n.jshunterignore\n*.sarif\n*.har\n"
  },
  {
    "path": ".jshunterignore.example",
    "content": "# JSHunter ignore file.\n# One entry per line. Blank lines and lines starting with `#` are skipped.\n#\n# Supported kinds:\n#   hash:<value_hash>           # suppress one specific finding (stable across runs)\n#   rule:<rule_id_or_glob>      # suppress an entire rule or family\n#   source:<source_glob>        # suppress all findings from a source matching glob\n#   rule_value:<rule>:<value_glob>\n#                               # suppress when rule matches AND value matches glob\n#\n# Globs use filepath.Match syntax (`*`, `?`, `[abc]`).\n\n# Example: suppress an analytics SDK that always carries a public-but-rotating key\nrule:slack.webhook\nsource:*/cdn.segment.com/*\n\n# Example: suppress one specific known FP by its sha256 prefix\nhash:a1b2c3d4e5f60718\n\n# Example: suppress only JWTs whose value matches a test-token glob\nrule_value:jwt.token:*test_*\n"
  },
  {
    "path": "CHANGELOG.md",
    "content": "# Changelog\n\nAll notable changes to JSHunter are tracked here. Dates are ISO-8601.\n\n## [v0.6 — page-aware crawling, sourcemaps, cache, concurrent verify] — 2026-05-08\n\nThe \"JS-aware crawler, not just a JS-file scanner\" iteration.\n\n### HTML page-awareness (`--inline-html`)\n\n`golang.org/x/net/html` tokenizer walks the response, extracts:\n\n- Every inline `<script>` body (scanned under `URL#inline[N]` source label)\n- Every external `<script src=…>` reference plus its `integrity=` SRI hash\n- Every `<meta http-equiv=\"Content-Security-Policy\" content=…>` directive\n- `<link rel=\"preload|modulepreload|prefetch\" as=\"script\">` referenced JS\n\nHomepage HTML is the most common place JS-only crawlers miss secrets —\nthe `window.__INITIAL_STATE__ = {…}` blob, dev-only `<script type=\"module\">`\ninit code, etc. Now first-class.\n\n### Source map real parsing (`--sourcemap`)\n\n`//# sourceMappingURL=` markers now drive a real fetch+parse pipeline:\n\n- HTTP(S) maps fetched through the same hardened client (host limiter,\n  max-bytes, SSRF guard).\n- `data:application/json;base64,...` inline maps decoded.\n- `data:application/json,...` percent-encoded maps unescaped.\n- Each entry in `sourcesContent[]` is scanned as its own source under\n  `<URL>.map#<original-path>`.\n\nModern bundlers (Vite, esbuild, webpack 5, Turbopack, Rspack) routinely\nship pre-minification sources verbatim — comments, dev-only code paths,\noriginal variable names. 
This is the highest-leverage signal for secret\nrecon on a production site.\n\n### CSP origin extraction (`--csp-origins`)\n\n`Content-Security-Policy` response headers and `<meta http-equiv>` tags\nare parsed; the host origins (excluding `'self'`, `'unsafe-*'`, `nonce-*`,\nhash sources, `data:`, `blob:`, `ws:`, …) are emitted as\n`[CSP] <source>\\t<origin>` lines suitable for piping back into the URL\nqueue of a follow-up scan.\n\n### robots.txt ingest (`--robots`)\n\nFetches `/robots.txt` for every unique host in the input and prints\n`Disallow`, `Allow`, and `Sitemap` lines. Pure recon helper — JSHunter\ndoes NOT respect robots.txt for its own crawling. Operators wanting\ncompliance pipe these paths back as the input list.\n\n### Disk cache (`--cache-dir`)\n\nPer-URL SHA-256 keyed on-disk cache. Two files per URL:\n\n- `<hash>.body` — the response body\n- `<hash>.meta` — JSON: status, content-type, ETag, Last-Modified, fetched_at\n\nRe-runs attach `If-None-Match` / `If-Modified-Since`; 304 responses serve\nfrom disk. Set-Cookie / Authorization-bearing responses are skipped\n(security hazard). Mode 0600 on disk because cached bodies may carry\nsecrets.\n\n### Concurrent verifier worker pool (`--verify-workers`)\n\n`VerifyAllConcurrent` replaces the serial loop in `emitFinalOutput`. With\n50 findings × 10 s timeout, serial = 8+ minutes; pooled (default 8) =\n~1 min worst case. 
Per-host limiter still applies inside the workers, so\nno provider gets slammed.\n\n### SARIF partialFingerprints\n\nEach SARIF result now carries `partialFingerprints`:\n\n```json\n\"partialFingerprints\": {\n  \"jshunter/valueHash\": \"<sha256/16>\",\n  \"jshunter/ruleSecretType\": \"<rule_id>:<secret_type>\"\n}\n```\n\nGitHub Code Scanning uses these to persist dismissed/suppressed decisions\nacross runs even when the finding moves source/line.\n\n### Go 1.24 modernization\n\n- `ioutil.ReadAll` → `io.ReadAll` everywhere.\n- `ioutil.ReadFile/WriteFile` → `os.ReadFile/WriteFile`.\n- `rand.Seed(time.Now().UnixNano())` removed (Go 1.20+ auto-seeds the\n  global source).\n\n### Files added\n\n- `html_extract.go` — `golang.org/x/net/html` tokenizer-based extractor.\n- `csp.go` — Content-Security-Policy origin parser.\n- `sourcemap.go` — sourceMappingURL fetch + JSON parse + sourcesContent walk.\n- `cache.go` — `DiskCache` with ETag/Last-Modified revalidation.\n- `robots.go` — RFC 9309 subset parser.\n- `concurrent_verify.go` — bounded worker pool for liveness probes.\n\n## [v0.6 — outputs, suppressions, AWS pair, registry CLI] — 2026-05-08\n\nThe \"make it ship-ready\" iteration. Output formats for CI, persistent\nsuppressions, registry introspection, alternative inputs, and the AWS pair\nverifier that closes the verification gap left in the previous slice.\n\n### AWS pair verifier (SigV4)\n\nWhen the registry detects an Access Key ID and a Secret Access Key in the\n**same source**, JSHunter pairs them and runs `sts:GetCallerIdentity` via\nSigV4 — pure-stdlib HMAC-SHA256 signing, no aws-sdk dependency. A live\nresponse sets `verified=true` on **both** findings and surfaces the IAM\nARN as `verify.account`. 
Strict pairing: same-source-only with single\nAKID + single secret per source — multi-of-either is left to manual\ntriage to avoid mis-attribution.\n\n### Output formats\n\n| Flag       | Format                                              | Use case                          |\n|------------|-----------------------------------------------------|-----------------------------------|\n| `--sarif`  | SARIF 2.1.0 envelope                                | GitHub code-scanning upload       |\n| `--ndjson` | One Finding per line, `json.Encoder` (no HTML escape) | jq, mlr, SIEM streaming         |\n\nWhen either is set, per-source console output is suppressed so the\nstructured stream stays parseable.\n\n### Suppressions\n\n`--ignore-file PATH` — `.jshunterignore` syntax:\n\n```\nhash:<value_hash_hex>           # single finding by hash\nrule:<rule_id_or_glob>          # entire rule or family\nsource:<glob>                   # all findings from matching source\nrule_value:<rule>:<value-glob>  # rule + value-glob combo\n```\n\nGlobs use `filepath.Match`. Applied at `recordFinding` so suppressions\nwork across both registry and legacy paths.\n\n`--diff PREVIOUS.json` — reads a previous schema-v2 envelope, computes\nthe set of `value_hash` values already reported, and reports only\nfindings NOT in that set. Schema-version mismatch is a hard error.\n\n### Registry introspection / selection\n\n| Flag                  | Effect                                                 |\n|-----------------------|--------------------------------------------------------|\n| `--list-rules`        | Tabular dump of `rule_id severity provider name [flags]` |\n| `--explain RULE_ID`   | Full rule JSON (incl. 
TP/FP fixtures)                  |\n| `--only-rules a,b,*c` | Run only matching rules (glob suffix supported)        |\n| `--disable-rule x,y`  | Drop matching rules from the registry                  |\n\nSelection is applied **before** `--list-rules` / `--explain` /\n`--self-test`, so an operator can scope CI gates to specific rule families.\n\n### Alternative inputs\n\n`--har FILE` — ingest a Chrome DevTools HAR archive directly, skipping\nthe fetcher. Only entries with JS-typed Content-Type (or `.js` URL\nsuffix) and 2xx/3xx status are scanned. base64-encoded response bodies\nare decoded automatically (std/URL/raw variants tolerated).\n\n### Quality of life\n\n`--no-color` disables ANSI color; if stdout is not a TTY,\n`disableColors()` runs automatically so piping to a file produces clean\ntext.\n\n### Files added\n\n- `aws_pair.go` — SigV4 + pair verifier.\n- `sarif.go` — SARIF 2.1.0 envelope builder.\n- `ndjson.go` — streaming output.\n- `har.go` — HAR ingestion.\n- `ignore.go` — `.jshunterignore` loader and matcher.\n- `diff.go` — previous-envelope baseline.\n- `rules_cli.go` — `--list-rules`, `--explain`, registry selection.\n- `.jshunterignore.example` — operator template.\n\n## [v0.6 — verifier + observability + crawler hardening] — 2026-05-08\n\nThe \"trust the output\" iteration on top of the v0.6 FP pipeline.\n\n### Live verification (`--verify`)\n\nOff-by-default, opt-in liveness probes against documented read-only\nendpoints. A verified secret carries `verified=true` and confidence is\nelevated to `1.0`. 
Per-host limiter + bounded timeout per probe; secrets\nare never leaked into transport-error strings (sanitized).\n\n| Provider     | Endpoint                                      | Auth                          |\n|--------------|-----------------------------------------------|-------------------------------|\n| Stripe       | `GET /v1/balance`                             | `Authorization: Bearer …`     |\n| GitHub       | `GET /user`                                   | `Authorization: token …`      |\n| OpenAI       | `GET /v1/models`                              | `Authorization: Bearer …`     |\n| Anthropic    | `GET /v1/models`                              | `x-api-key` + `anthropic-version` |\n| Slack        | `GET /api/auth.test`                          | `Authorization: Bearer …`     |\n| SendGrid     | `GET /v3/scopes`                              | `Authorization: Bearer …`     |\n| Mailgun      | `GET /v3/domains`                             | HTTP Basic `api:<key>`        |\n| HuggingFace  | `GET /api/whoami-v2`                          | `Authorization: Bearer …`     |\n\nCitations live next to each verifier in `verify.go`.\n\n### Operator observability (`--stats`)\n\nPer-stage atomic counters with a fresh run-id per process:\n\n- `URLsFetched`, `URLsBlocked`, `BytesParsed`, `BytesTruncated`\n- `RegistryHits`, `LegacyMatchesRaw`\n- `DroppedVendorNoise`, `DroppedFixture`, `DroppedSourcemap`\n- `DroppedLowEntropy`, `DroppedNoContext`, `DroppedBelowConf`, `DroppedRegistryDup`\n- `FindingsAfterFilter`, `FindingsAfterDedupe`\n- `VerifyAttempts`, `VerifyAlive`, `VerifyDead`, `VerifyError`\n\nPrinted to stderr at end of run when `--stats` is set, so stdout pipelines\nstay clean.\n\n### Crawler hardening\n\n- Per-host outbound concurrency cap (default 4, configurable via `--per-host`).\n- Exponential backoff with ±25% jitter between retries.\n- `Retry-After` header (seconds and HTTP-date forms) is honored.\n- 429/5xx circuit breaker: after 5 consecutive bad 
responses on a host, all\n  requests to that host are dropped for 30 s (or the longest `Retry-After`\n  observed, whichever is greater).\n\n### Output schema\n\n- `Finding` now carries `line`, `column`, and `verify{alive,status,account,note}`.\n- `Location[]` carries `line` and `column` per occurrence — operators can\n  paste the JSON straight into `vim file:line:col`.\n\n### Console redaction (`--show-secrets`)\n\nBy default the console prints redacted values (`AKIA****GHIJ`); the full\nvalue is written to the `-o` output file because that's what the operator\nexplicitly asked for. `--show-secrets` reverts to v0.6.0 behavior.\n\n### Extensibility (`--rules-file`)\n\nOperators ship custom rules at runtime via JSON pack. Format documented in\n`RULES.md`. External rules participate in `--self-test` automatically.\nLoader rejects the whole pack on any validation failure (no partial loads).\n\n### Tests\n\n`detection_test.go` ships with:\n- Property tests for `shannonEntropy` (bounds, monotonicity).\n- Length and middle-mask tests for `redactValue`.\n- Round-trip CRC32 base62 test for `validateGitHubToken`.\n- Structural tests for `validateJWT`, `validateAWSAccessKeyID`,\n  `validateStripeKey`.\n- Vendor-noise denylist coverage (canonical AWS docs example).\n- Schema-version assertion test (golden-file in spirit).\n- Loader contract tests (missing/duplicate id, bad regex, oversized regex).\n- Backoff-bounds and `parseRetryAfter` smoke tests.\n- `runSelfTest` is invoked by `TestRegistry_AllRules_FixturesPass` so any\n  rule whose TP fixture stops matching, or whose FP fixture starts being\n  reported, fails CI.\n\n### Documentation\n\n- New `RULES.md` covering the full schema, confidence model, and rule\n  authoring contract.\n- New `CREDITS.md` honestly naming TruffleHog, Gitleaks, detect-secrets,\n  Nosey Parker, secretlint, Semgrep secrets pack as inspirations.\n\n## [v0.6 — initial false-positive surgery] — 2026-05-08\n\nThe \"kill the false positives\" release. 
Every secret-class match now flows\nthrough a confidence-scoring pipeline before it is reported, the highest-volume\nproviders get format-and-checksum validators, and the JSON output is\nschema-versioned so downstream tools can detect breaking changes.\n\n### Detector additions\n\n- New curated rule registry (`detection.go`) for highest-precision providers:\n  AWS access keys (prefix family + 16-char base32 body), AWS secret keys,\n  Stripe (`sk/rk/pk_(live|test)_` + clean base62), GitHub PATs (CRC32 base62\n  checksum verified), GitHub fine-grained PATs, OpenAI legacy/project/svcacct,\n  Anthropic, Google API + ya29 OAuth, Slack token family + app + webhook,\n  Discord webhook + bot token, Twilio SK + AC, SendGrid, Mailgun, Mailchimp,\n  GitHub App installation tokens, GitLab PAT + pipeline trigger, Vercel,\n  Doppler, DigitalOcean, Shopify (access + shared secret), npm, PyPI, JWT\n  (with structural decode), private keys (RSA/OpenSSH/EC/PGP), Facebook\n  access token, Linear, HuggingFace, Supabase service-role JWT.\n\n### False-positive fixes\n\n- Vendor-noise denylist: canonical AWS docs example\n  (`AKIAIOSFODNN7EXAMPLE`), `wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY`,\n  Stripe public test fixtures, the canonical 3-part `eyJ...` example JWT.\n- Substring denylist for placeholder fragments (`YOUR_API_KEY`,\n  `REPLACE_ME`, `PLACEHOLDER`, `XXXXXXXX`, etc.).\n- Sourcemap-line skip: any match on a `//# sourceMappingURL=` line is\n  dropped — that line is a build artifact, not a secret.\n- Vendor/chunk filename gate: matches in `vendor*.js`, `chunk-*.js`,\n  `runtime-*.js`, `polyfill*.js`, `framework*.js`, `node_modules/*` paths\n  start with a confidence penalty.\n- Fixture-context penalty: surrounding text containing `example`,\n  `dummy`, `sample`, `placeholder`, `mock_`, `stub_`, `fake_`, `lorem`,\n  `FIXME` lowers the score by 0.30.\n- Generic-rule context gate: `Generic Api Key`, `Generic Secret`,\n  `Quickbooks Api Key`, `Cisco Access Token`, `Sanity Token`,\n  
`Atlassian Access Token`, `Heroku Api Key 2/3` now require a\n  `key|token|secret|auth|bearer|api|password|...` keyword within ±96\n  chars and entropy ≥ 3.2 with character-class diversity ≥ 2.\n- Provider validators: AWS access key (prefix + base32 body), Stripe\n  (prefix family + clean base62), GitHub (CRC32 base62 checksum), Slack\n  (hyphen-segment shape), JWT (header decodes to JSON with `alg`),\n  Twilio (32-hex body + entropy gate).\n\n### Broken patterns repaired\n\nThe following v0.5 patterns could **never match** in a body because they\nwere anchored with `^...$` (which only match a complete one-line input)\nor had escape errors. v0.6 corrects them:\n\n- `Dropbox Access Token` — was `^sl\\.…$`, now `\\bsl\\.…\\b`.\n- `Twitter Bearer Token` — was `^AAAA…$`, now bounded `\\bAAAA…\\b`.\n- `Username Password Combo` — was `^[a-z]+://…@`, now scoped to body matches.\n- `Crowdstrike Api Key` — was `^…$`, now requires a `crowdstrike/cs` keyword.\n- `Azure Storage Account Key` — was `^…={0,43}$`, now anchored to\n  `AccountKey=`/`azure_storage_key` context.\n- `Phone Number` — was `^\\+\\d{9,14}$`, now matches inside text bodies.\n- `Ali Cloud Access Key` / `Tencent Cloud Access Key` — anchors removed.\n- `Json Web Token` — was `ey…\\.…\\.…$` (trailing-only), now `\\beyJ…\\.eyJ…\\.…\\b`.\n- `Github Access Token` — `com*` typo (matched `co`, `com`, `comm…`)\n  replaced with proper `\\.com\\b`.\n- `Password in Url` — broken `\\\\s` / `\\\\\\\\` escapes replaced with valid\n  regex character classes.\n- `Amazon Mws Auth Token` — `\\\\.` escape errors replaced with `\\.`.\n- `Heroku Api Key 3` — unbounded `.*` (ReDoS hazard) replaced with\n  bounded `[^\\n]{0,80}`.\n\n### High-FP rules tightened\n\n- `Quickbooks Api Key` (`A[0-9a-f]{32}` matched any commit hash starting\n  with A) — now requires `quickbooks|qbo|intuit` context keyword.\n- `Cisco Api Key` (`cisco[A-Za-z0-9]{30}`) — now requires\n  `cisco_api_key=` style assignment.\n- `Cisco Access Token` 
(`access_token=\\w+`) — now requires `cisco_`\n  keyword to avoid generic OAuth flows.\n- `Sanity Token` (`sk[a-zA-Z0-9]{32,}`) — now requires `sanity_token=`\n  context.\n- `Atlassian Access Token` (`{20,}\\.{6,}\\.{25,}`) — replaced with the\n  documented Atlassian `ATATT3…` token shape, gated by context keyword.\n- `Heroku Api Key 2` (`heroku[A-Za-z0-9]{32}`) — replaced with the\n  proper `heroku_api_key=UUID` shape.\n\n### CLI additions\n\n- `--min-confidence FLOAT` / `-mc` (default `0.50`) — gate on per-finding\n  confidence.\n- `--show-confidence` / `-sc` — print `[conf=X.XX]` next to each finding.\n- `--no-fp-filter` — disable the FP filter (debug; v0.5-compatible output).\n- `--self-test` — run the rule registry against its built-in TP/FP\n  fixtures and exit non-zero on regression. Suitable for CI.\n- `--max-bytes N` (default 32 MiB) — cap response body reads to defend\n  against gzip bombs and pathological streaming.\n- `--allow-internal` — permit `localhost`, `127.0.0.0/8`, RFC1918, and\n  link-local targets. **Off by default** to prevent SSRF when piping\n  untrusted URL lists.\n\n### Output schema\n\n- Top-level `schema_version: 2` field added to all `--json` output.\n- New `findings[]` array carries: `rule_id`, `name`, `provider`,\n  `secret_type`, `severity`, `value`, `redacted`, `value_hash`,\n  `source`, `confidence`, `entropy`, `verified`, `reasons[]`,\n  `locations[]`. Same secret seen in N sources collapses to one\n  Finding with `locations[]` listing all N.\n- The legacy `matches{name: [value, …]}` map is retained for backward\n  compatibility within schema v2.\n\n### Operational hardening\n\n- Target URL validation refuses non-HTTP(S) schemes (no `file://`),\n  loopback, RFC1918, and link-local hosts unless `--allow-internal` is\n  passed. The intent is making JSHunter safe to run against\n  user-supplied URL lists.\n- Response body reads are now bounded by `--max-bytes` via\n  `io.LimitReader`. 
\n\n## [v0.5] — 2026-01-22\n\nPre-release baseline. Single-file `jshunter.go`, ~190 regex patterns,\nbasic match/no-match output, no confidence scoring.\n"
  },
  {
    "path": "CREDITS.md",
    "content": "# Credits\n\nJSHunter is a competitive recon tool. Pretending it sprang from nowhere would\nbe dishonest — the secret-detection space has years of public work that\ninspired both the rule shapes and the false-positive techniques baked into\nv0.6. This file names them.\n\n## Prior art that shaped the v0.6 detection layer\n\n- **[TruffleHog](https://github.com/trufflesecurity/trufflehog)** — the\n  reference for \"regex match, then live-verify against the provider.\" The\n  per-provider verifier endpoints (Stripe `/v1/balance`, GitHub `/user`,\n  Slack `auth.test`, SendGrid `/v3/scopes`, Mailgun `/v3/domains`) used in\n  JSHunter's `--verify` flow are the same lightweight, read-only endpoints\n  TruffleHog adopted; we cite them in `verify.go` so a reviewer can audit.\n- **[Gitleaks](https://github.com/gitleaks/gitleaks)** — TOML rule pack\n  shape, the idea of explicit per-rule TP/FP fixtures, and many of the\n  long-tail provider regexes JSHunter inherited. Gitleaks's \"rules.toml\"\n  was the model for JSHunter's external `--rules-file` JSON loader.\n- **[Yelp/detect-secrets](https://github.com/Yelp/detect-secrets)** — entropy\n  thresholds and the \"high-entropy-string\" plugin family. 
The Shannon-entropy +\n  character-class-diversity gate in `detection.go::scoreFinding` is in the\n  spirit of detect-secrets's filters.\n- **[praetorian-inc/noseyparker](https://github.com/praetorian-inc/noseyparker)** —\n  performance reference for high-volume scanning; the multi-pattern ideas\n  that will land in v0.7 trace back here.\n- **[secretlint](https://github.com/secretlint/secretlint)** — provider\n  rotation tracking; their issue tracker is the canonical place to learn\n  when a provider has changed token format.\n- **[Semgrep secrets pack](https://semgrep.dev/p/secrets)** — context-aware\n  rule construction, especially the \"shape + context window\" pattern that\n  JSHunter's `RequiresContext` + `ContextKeywords` per-rule fields encode.\n- **GitHub Engineering blog: \"Behind GitHub's new authentication token formats\"**\n  ([link](https://github.blog/engineering/platform-security/behind-githubs-new-authentication-token-formats/))\n  — source for the CRC32 base62 checksum used in\n  `detection.go::validateGitHubToken`.\n- **AWS access key bitwise analysis (WithSecure Labs)**\n  ([link](https://labs.withsecure.com/publications/a-bitwise-analysis-of-aws-access-key-identifiers))\n  — base32 alphabet (A–Z, 2–7) + prefix family encoded in\n  `validateAWSAccessKeyID`.\n\n## Vendor & provider documentation cited inline\n\nEvery verifier in `verify.go` carries a vendor docs link as a comment.\nWhen a provider rotates token format or moves an endpoint, those comments\nare the single source of truth for what to update.\n\n## License\n\nJSHunter is MIT-licensed. The works above retain their respective licenses;\nJSHunter does not vendor source from any of them. Where a regex is similar\nto a Gitleaks or detect-secrets pattern, that is convergent design on\nprovider-published shapes, not a direct copy.\n\n— Hussain Alsharman, JSHunter author\n"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2024-2026 Hussain Alsharman\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "# JSHunter\n\n<div align=\"center\">\n\n[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)\n[![Go Version](https://img.shields.io/badge/Go-1.22.5+-00ADD8?style=flat&logo=go)](https://golang.org)\n[![Release](https://img.shields.io/github/release/cc1a2b/jshunter.svg)](https://github.com/cc1a2b/jshunter/releases)\n[![GitHub stars](https://img.shields.io/github/stars/cc1a2b/jshunter)](https://github.com/cc1a2b/jshunter/stargazers)\n[![Platform](https://img.shields.io/badge/platform-Linux%20%7C%20macOS%20%7C%20Windows-lightgrey)](https://github.com/cc1a2b/jshunter/releases)\n\n**🔍 Professional JavaScript Security Analysis Tool**\n\n*Complete endpoint discovery, sensitive data detection, and advanced code analysis for security professionals*\n\n</div>\n\n## 📖 About\n\n**JSHunter** is a comprehensive command-line tool for JavaScript security analysis and endpoint discovery. Built for security professionals, penetration testers, and developers, it delivers enterprise-grade analysis capabilities with high accuracy detection algorithms and professional reporting features.\n\n> **Surgical false-positive reduction.** Every secret-class match flows through a confidence-scoring pipeline (entropy gate, character-class diversity, vendor-noise denylist, fixture-context penalty, sourcemap-line skip) before it is reported. Highest-volume providers (AWS, Stripe, GitHub, OpenAI, Slack, JWT) get format-and-checksum validators. Live verification is opt-in via `--verify`. Page-aware crawling, source-map analysis, HAR ingestion, and on-disk response cache are first-class. Output is JSON (`schema_version: 2`), NDJSON, SARIF 2.1.0, CSV, or Burp-compatible. 
Run `jshunter --self-test` to exercise the rule registry against its built-in TP/FP fixtures.\n\n<div align=\"center\">\n<img alt=\"JSHunter Demo Screenshot\" src=\"https://github.com/user-attachments/assets/f0197c36-c40b-48e9-bec5-c306acd4a613\" width=\"100%\">\n\n*JSHunter in action - Professional JavaScript security analysis*\n</div>\n\n---\n\n## 📑 Table of Contents\n\n- [About](#-about)\n- [Features](#-features)\n- [Installation](#-installation)  \n- [Quick Start](#-quick-start)\n- [Usage Examples](#-usage-examples)\n- [Command Reference](#-command-reference)\n- [Advanced Usage](#-advanced-usage)\n- [Contributing](#-contributing)\n- [License](#-license)\n- [Support](#-support)\n\n---\n\n## ✨ Features\n\n### 🎯 Core Capabilities\n- **🔍 Comprehensive Endpoint Discovery**: Automatically extracts URLs, API endpoints, and hidden parameters from JavaScript files\n- **🔐 Advanced Security Analysis**: Identifies API keys, JWT tokens, credentials, and potential vulnerabilities with high accuracy  \n- **📥 Flexible Input Methods**: Supports URLs, file lists, local files, stdin piping, and recursive discovery\n- **⚡ High-Performance Architecture**: Multi-threaded concurrent processing with intelligent rate limiting\n- **🎭 Professional Stealth Features**: Proxy support, custom headers, user-agent rotation, and bypass detection\n\n### 🎯 Intelligent Detection Engine\n> **Enterprise-grade accuracy with advanced analysis algorithms**\n\n- **🎯 Smart Base64 Detection**: High-accuracy filtering eliminates false positives from media content and encoded data\n- **🏢 Professional Interface**: Enterprise-ready terminology, documentation, and comprehensive reporting formats\n- **🧠 Context-Aware Analysis**: Advanced algorithms distinguish real security tokens from encoded media data\n- **📊 Entropy Analysis**: Mathematical algorithms identify genuine security tokens and credentials with precision\n\n### 🌐 Professional HTTP & Networking Suite\n<details>\n<summary><strong>Enterprise-Grade 
Network Configuration</strong></summary>\n\n**Authentication & Headers:**\n- **🔧 Custom Headers** (`-H`): Repeatable authentication headers and custom request headers\n- **🍪 Cookie Management** (`-c`): Session cookies for accessing protected resources\n- **🎭 User-Agent Control** (`-U`): Custom UA strings or file-based rotation for stealth\n\n**Performance & Reliability:**\n- **⏱️ Rate Limiting** (`-R`): Configurable request delays (milliseconds) to avoid detection\n- **⏰ Smart Timeouts** (`-T`): Custom timeout settings for different network conditions\n- **🔄 Intelligent Retry** (`-y`): Automatic retry mechanism with exponential backoff for failed requests\n\n**Professional Integration:**\n- **🔗 Proxy Support** (`-p`): Full Burp Suite and custom proxy integration (HTTP/HTTPS/SOCKS5)\n- **🔒 TLS Flexibility** (`-k`): Optional certificate verification bypass for testing environments\n- **🎯 Thread Control** (`-t`): Configurable concurrent request handling for optimal performance\n\n> **🔒 Security Professional Features**: Designed for penetration testing and security assessments  \n> **Example**: `jshunter -l targets.txt -p 127.0.0.1:8080 -H \"Authorization: Bearer token\" -R 1000`\n\n</details>\n\n### 📝 Advanced JavaScript Analysis\n<details>\n<summary><strong>Complete Code Analysis & Deobfuscation Suite</strong></summary>\n\n**Core Analysis Tools:**\n- **🧩 Deobfuscation Engine** (`-d`): Unpacks minified and obfuscated JavaScript for deep analysis\n- **🗺️ Source Map Parser** (`-m`): Extracts and analyzes original source code from source maps\n- **🔍 Obfuscation Detection** (`-z`): Identifies and classifies obfuscation techniques and patterns\n\n**Dynamic Analysis:**\n- **⚡ Eval Analysis** (`-e`): Analyzes dynamic code execution (`eval()`, `Function()`, runtime generation)\n\n**Code Intelligence:**\n- **🔍 Pattern Recognition**: Identifies common JavaScript frameworks and libraries\n- **📊 Code Structure Analysis**: Maps application architecture and data flows\n- **🎯 
Context-Aware Detection**: Understands code context to reduce false positives\n\n> **💡 Professional Usage**: Combine analysis tools with security detection for maximum coverage  \n> **Example**: `jshunter -u target.js -d -m -e -s -g` (full deobfuscation + security analysis)\n\n</details>\n\n### 🔐 Security Analysis Suite\n<details>\n<summary><strong>Complete Security Assessment Toolkit</strong></summary>\n\n**Core Security Detection:**\n- **🔑 Secrets Detection** (`-s`): API keys, access tokens, passwords, and hardcoded credentials\n- **🎫 JWT Token Analysis** (`-x`): Authentication token extraction, validation, and payload inspection\n- **🔥 Firebase Security** (`-F`): Configuration analysis, API keys, and database URL detection\n\n**Advanced Analysis:**\n- **📋 Parameter Discovery** (`-P`): Hidden form parameters, variables, and configuration keys\n- **🔗 URL Parameter Extraction** (`-PU`): Advanced parameter analysis with full URL context\n- **📊 GraphQL Analysis** (`-g`): Schema detection, query extraction, and endpoint discovery\n- **🛡️ WAF Bypass Detection** (`-B`): Security bypass patterns and evasion techniques\n\n**Scope & Context:**\n- **🏠 Internal Endpoint Filtering** (`-i`): Private/internal resource identification and classification\n- **🌐 Link Analysis** (`-L`): Comprehensive URL extraction and relationship mapping\n\n> **🎯 Professional Tip**: Combine flags for comprehensive analysis (e.g., `jshunter -u target.js -s -x -F -g`)\n\n</details>\n\n### 🎯 Scope & Discovery\n<details>\n<summary><strong>Intelligent Crawling & Targeting</strong></summary>\n\n- **🔍 Recursive Discovery**: Multi-depth JavaScript file crawling\n- **🌍 Domain Scoping**: Focus analysis on specific domains\n- **📂 Extension Filtering**: Target specific JavaScript file types\n\n</details>\n\n### 📤 Professional Reporting & Export Suite\n<details>\n<summary><strong>Enterprise-Grade Output & Integration</strong></summary>\n\n**Core Output Formats:**\n- **🖥️ Console Display**: Color-coded terminal 
output with professional formatting and clear categorization\n- **📄 File Export** (`-o`): Save comprehensive results to custom file locations\n- **📊 JSON Export** (`-j`): Structured data format for automation and programmatic processing\n- **📈 CSV Export** (`-C`): Spreadsheet-compatible format for executive reporting and analysis\n\n**Professional Integration:**\n- **🔴 Burp Suite Export** (`-n`): Direct integration with Burp Suite Professional for immediate testing\n- **🎯 Regex Filtering** (`-r`): Custom pattern matching for targeted result filtering\n- **🔍 Verbose Analysis** (`-v`): Detailed analysis output with debugging information and context\n\n**Result Management:**\n- **✨ Clean Mode** (`--found-only`): Hide empty results for focused security reporting\n- **🤫 Quiet Mode** (`-q`): Suppress banner for automated scripting and CI/CD integration\n\n> **📋 Reporting Workflow**: Use JSON for automation, CSV for management reports, Burp export for immediate testing  \n> **Example**: `jshunter -l targets.txt -s -j -o security-findings.json` (structured security report)\n\n</details>\n\n---\n\n## 📦 Installation\n\n### Go Install (Recommended)\n```bash\n# Install JSHunter\ngo install -v github.com/cc1a2b/jshunter/cmd/jshunter@latest\n\n# Verify installation\njshunter --help\n```\n\n### Build from Source\n```bash\ngit clone https://github.com/cc1a2b/jshunter.git\ncd jshunter\ngo build -o jshunter ./cmd/jshunter\n```\n\n### System Requirements\n- **Go 1.22.5+** (for building from source)\n- **Linux, macOS, or Windows** (64-bit architecture)\n- **Network connectivity** for remote JavaScript analysis\n\n---\n\n## 🚀 Quick Start\n\n### Basic Analysis\n```bash\n# Analyze a single JavaScript file\njshunter -u \"https://example.com/app.js\"\n\n# Scan multiple URLs from file\njshunter -l urls.txt\n\n# Analyze local JavaScript file\njshunter -f app.js\n```\n\n### Complete Security Analysis\n```bash\n# Find API keys, secrets, and credentials\njshunter -u 
\"https://target.com/app.js\" -s\n\n# Full analysis with deobfuscation, GraphQL, and Firebase detection\njshunter -u \"https://target.com/app.js\" -d -s -g -F -x -L\n\n# Professional security assessment with all tools\njshunter -u \"https://target.com/app.js\" -d -m -e -s -x -P -g -F -B -L\n\n# Export comprehensive results for reporting\njshunter -l targets.txt -s -g -F -j -o security_findings.json\n```\n\n---\n\n## 💡 Usage Examples\n\n```bash\n# Analyze single URL\njshunter -u \"https://example.com/app.js\"\n\n# Analyze multiple URLs from file\njshunter -l urls.txt\n\n# Pipe URLs from stdin\ncat urls.txt | grep \"\\.js\" | jshunter\n\n# Complete security analysis - find secrets, API keys, and credentials\njshunter -u \"https://example.com/app.js\" -s -x -F\n\n# Full analysis suite with deobfuscation and all security tools\njshunter -u \"https://target.com/app.js\" -d -m -e -s -x -P -g -F -B -L\n\n# Professional assessment with source map analysis\njshunter -u \"https://target.com/bundle.js\" -d -m -s -g -F\n\n# Export comprehensive results to structured formats\njshunter -l targets.txt -s -x -F -g -j -o security_findings.json\n\n# Stealth scanning with Burp Suite integration\njshunter -l targets.txt -p 127.0.0.1:8080 -s -g -F -n -o burp_findings.txt\n\n# Scanning through SOCKS5 proxy (Tor, SSH tunnel, etc.)\njshunter -l targets.txt -p socks5://127.0.0.1:9050 -s -x -F\n\n# Rate-limited professional scanning with authentication\njshunter -l urls.txt -R 2000 -H \"Authorization: Bearer token\" -s -x -F -g -q\n\n# Complete endpoint and parameter discovery\njshunter -l urls.txt -ep -P -PU -L -w 2\n\n# Advanced obfuscation analysis with context detection\njshunter -f obfuscated.js -d -z -e -s -v\n```\n\n---\n\n## 📋 Command Reference\n\nGet the complete help anytime with `jshunter --help`\n\n```\nUsage:\n  -u,  --url URL                Input a URL\n  -l,  --list FILE.txt          Input a file with URLs (.txt)\n  -f,  --file FILE.js           Path to JavaScript file\n      
 --har FILE               Ingest a Chrome DevTools HAR archive\n\nBasic Options:\n  -t,  --threads INT            Number of concurrent threads (default: 5)\n  -c,  --cookies <cookies>      Authentication cookies for protected resources\n  -p,  --proxy host:port        HTTP/SOCKS5 proxy (e.g., 127.0.0.1:8080 for Burp Suite)\n  -q,  --quiet                  Suppress ASCII art output\n       --no-color               Disable ANSI color (auto-off when not a TTY)\n  -o,  --output FILENAME        Output file path\n  -r,  --regex <pattern>        RegEx for filtering results\n       --update, --up           Update the tool to latest version\n  -ep, --end-point              Extract endpoints from JavaScript files\n  -k,  --skip-tls               Skip TLS certificate verification\n  -fo, --found-only             Only show results when sensitive data is found\n\nHTTP Configuration:\n  -H,  --header \"Key: Value\"    Custom HTTP headers (repeatable, including Auth)\n  -U,  --user-agent UA          Custom User-Agent string or file path\n  -R,  --rate-limit MS          Request rate limiting delay (milliseconds)\n  -T,  --timeout SEC            HTTP request timeout (seconds)\n  -y,  --retry INT              Retry attempts for failed requests (default: 2)\n       --per-host INT           Per-host outbound concurrency cap (default: 4)\n       --max-bytes N            Cap response body read in bytes (default: 32MiB)\n       --allow-internal         Permit localhost / RFC1918 / link-local targets\n       --cache-dir DIR          Persist responses on disk; revalidate via ETag\n\nJavaScript Analysis:\n  -d,  --deobfuscate            Deobfuscate minified and obfuscated JavaScript\n  -m,  --sourcemap              Fetch and parse source maps + sourcesContent[]\n  -e,  --eval                   Analyze dynamic code execution (eval, Function)\n  -z,  --obfs-detect            Detect code obfuscation patterns and techniques\n       --inline-html            Scan inline <script> tags + SRI/CSP in 
HTML responses\n       --csp-origins            Emit CSP-allowed origins as candidate endpoints\n\nSecurity Analysis:\n  -s,  --secrets                Detect API keys, tokens, and credentials\n  -x,  --tokens                 Extract JWT and authentication tokens\n  -P,  --params                 Discover hidden parameters and variables\n  -PU, --param-urls             Advanced parameter extraction with URL context\n  -i,  --internal               Filter for internal/private endpoints\n  -g,  --graphql                Analyze GraphQL endpoints and queries\n  -B,  --bypass                 Detect WAF bypass patterns and techniques\n  -F,  --firebase               Analyze Firebase configurations and keys\n  -L,  --links                  Extract and analyze all embedded links\n\nDetection Tuning:\n  -mc, --min-confidence FLOAT   Minimum confidence (0.0-1.0) for a finding (default: 0.50)\n  -sc, --show-confidence        Print [conf=X.XX] alongside each finding\n       --no-fp-filter           Disable the false-positive filter (debug)\n       --ignore-file FILE       Permanent suppressions (.jshunterignore)\n       --diff PREVIOUS.json     Report only NEW findings vs previous JSON envelope\n       --rules-file FILE.json   Load an external JSON rule pack\n       --only-rules id,glob     Run only matching rules (supports * glob)\n       --disable-rule id,glob   Disable matching rules (supports * glob)\n\nVerification:\n       --verify                 Probe findings against provider read-only endpoints\n       --verify-timeout SEC     Timeout per verification probe (default: 10)\n       --verify-workers INT     Concurrent verifier worker pool (default: 8)\n\nScope & Discovery:\n  -w,  --crawl DEPTH            Recursive JavaScript discovery depth (default: 1)\n  -D,  --domain DOMAIN          Limit analysis to specific domain\n  -E,  --ext                    Filter by JavaScript file extensions\n       --robots                 Fetch /robots.txt for each input host and 
exit\n\nOutput Formats:\n  -j,  --json                   Structured JSON output (schema_version 2)\n       --ndjson                 Newline-delimited JSON (jq / SIEM streaming)\n       --sarif                  SARIF 2.1.0 (GitHub code-scanning compatible)\n  -C,  --csv                    CSV format for spreadsheet analysis\n  -v,  --verbose                Detailed analysis and debug output\n  -n,  --burp                   Burp Suite compatible export format\n       --stats                  Per-stage counters on stderr at end of run\n\nRegistry:\n       --list-rules             Print the rule registry as a table and exit\n       --explain RULE_ID        Print full rule details and exit\n       --self-test              Run rule registry against built-in TP/FP fixtures\n\n  -h,  --help                   Display this help message\n```\n\n### Confidence model\n\nEvery secret-class match is scored in `[0.0, 1.0]`. The score starts from a per-rule prior and is adjusted by:\n\n| Signal                                         | Effect                       |\n|------------------------------------------------|------------------------------|\n| Source path looks like a vendor/chunk bundle   | −0.15                        |\n| Surrounding context contains fixture wording   | −0.30                        |\n| Provider-specific validator passed             | +0.10                        |\n| Required context keyword present (generic rule)| +0.05                        |\n| Shannon entropy ≥ 4.5                          | +0.05                        |\n| Character-class diversity ≥ 3                  | +0.05                        |\n| Match in the vendor-noise denylist             | dropped before scoring       |\n| Length / entropy below rule floor              | dropped before scoring       |\n| Line is a `//# sourceMappingURL=` marker       | dropped before scoring       |\n\nThe default `--min-confidence 0.50` filters out the long tail of pattern-only matches. 
Use `--min-confidence 0.80` for high-precision triage, `--no-fp-filter` for raw, unfiltered output.\n\n### Provider validators\n\n| Provider | Validator                                                                |\n|----------|--------------------------------------------------------------------------|\n| AWS      | Prefix family (`AKIA/ASIA/A3T…`) + 16-char base32 body                   |\n| Stripe   | Prefix family (`sk/rk/pk_live/test_`) + clean base62 body                |\n| GitHub   | CRC32 base62 checksum verified against random body                        |\n| OpenAI   | Family prefix + length window (`sk-/sk-proj-/sk-svcacct-`)                |\n| Slack    | Hyphen-segment shape (numeric inner segments, alphanumeric tail)         |\n| JWT      | base64url-decoded JSON header with `alg` field + JSON payload            |\n| Twilio   | 32-hex body + entropy gate                                               |\n\n---\n\n## 🔧 Advanced Usage\n\n### Professional Security Assessment\n```bash\n# Complete security analysis with all tools\njshunter -l targets.txt -d -m -e -z -s -x -P -PU -g -F -B -L -j -v -o complete_assessment.json\n\n# Advanced deobfuscation and analysis pipeline\njshunter -l targets.txt -d -m -z -e -s -g -F --found-only -o deobfuscated_findings.json\n\n# Stealth reconnaissance with rate limiting and custom headers\njshunter -l targets.txt -R 2000 -U \"Mozilla/5.0...\" -H \"X-Forwarded-For: 1.1.1.1\" -s -x -F -q\n\n# Professional penetration testing through proxy\njshunter -l targets.txt -p 127.0.0.1:8080 -s -x -g -F -B -n -o burp_comprehensive.txt\n\n# Deep parameter and endpoint discovery\njshunter -l targets.txt -ep -P -PU -L -w 3 -i -j -o endpoint_discovery.json\n```\n\n### Enterprise & Automation Integration\n```bash\n# CI/CD Security Pipeline Integration\njshunter -f dist/bundle.js -d -s -x -F -j --found-only > security-scan.json\n\n# Comprehensive automated security reporting\njshunter -l production-js.txt -d -s -x -P -g -F -B -C -o 
enterprise-security-report.csv\n\n# Source map analysis for development security\njshunter -f app.js -m -s -x -F -v -o sourcemap-analysis.json\n\n# Firebase and GraphQL focused assessment\njshunter -l targets.txt -g -F -L -j -o api_security_findings.json\n```\n\n---\n\n## 🤝 Contributing\n\nWe welcome contributions! Here's how you can help:\n\n- **🐛 Report bugs** via [GitHub Issues](https://github.com/cc1a2b/jshunter/issues)\n- **💡 Suggest features** or improvements\n- **📝 Improve documentation**\n- **🔧 Submit pull requests** with enhancements\n\n### Development Setup\n```bash\ngit clone https://github.com/cc1a2b/jshunter.git\ncd jshunter\ngo mod tidy\ngo build -o jshunter ./cmd/jshunter\n```\n\n---\n\n## 📄 License\n\nJSHunter is released under the **MIT License**. See [LICENSE](https://github.com/cc1a2b/jshunter/blob/master/LICENSE) for details.\n\n```\nCopyright (c) 2024-2026 Hussain Alsharman\nLicensed under MIT License - free for commercial and personal use\n```\n\n---\n\n## Support\n\nIf JSHunter helps with your security research or professional work:\n\n<div align=\"center\">\n\n[![Buy Me A Coffee](https://cdn.buymeacoffee.com/buttons/default-orange.png)](https://www.buymeacoffee.com/cc1a2b)\n\n**⭐ Star this repo** • **🐦 Follow [@cc1a2b](https://twitter.com/cc1a2b)** • **📢 Share with others**\n\n</div>\n\n---\n\n<div align=\"center\">\n\n**🔍 JSHunter - Professional JavaScript Security Analysis**\n\n*Built with ❤️ by [cc1a2b](https://github.com/cc1a2b) for the security community*\n\n</div>\n"
  },
  {
    "path": "RULES.md",
    "content": "# Rule schema\n\nJSHunter v0.6 ships with two rule sources: the **built-in registry** (Go code\nin `detection.go`) and **external rule packs** loaded at runtime via\n`--rules-file` (JSON). This doc covers both, plus the contract every new rule\nmust meet to ship.\n\n## Mental model\n \n```\nfetch  →  parse  →  rule match  →  scoreFinding  →  recordFinding  →  output\n                                       │\n                       (vendor-noise gate, entropy gate, context gate,\n                        provider validator, fixture-context penalty,\n                        sourcemap-line skip, vendor-chunk penalty)\n```\n\nEvery rule is just a regex paired with a per-rule **confidence prior** and\na set of **gates** that adjust or reject the score. The gates are the same\nfor built-in and external rules; the only thing external rules can't do is\nregister a Go-coded `Validate` function (we don't run user-supplied code).\n\n## Rule fields\n\n| Field              | Type        | Required | Notes                                                     |\n|--------------------|-------------|----------|-----------------------------------------------------------|\n| `id`               | string      | yes      | Stable, namespaced (`provider.subtype`). Used for dedupe. |\n| `name`             | string      | yes      | Human label shown in output.                              |\n| `provider`         | string      | no       | Vendor name (`AWS`, `Stripe`, …).                         |\n| `secret_type`      | string      | no       | `api_key`, `pat`, `webhook`, `private_key`, …             |\n| `severity`         | enum        | yes      | `critical|high|medium|low|info`.                          |\n| `pattern`          | regex       | yes      | RE2 syntax. ≤ 4096 bytes.                                 |\n| `group`            | int         | no       | Capture-group index (default 0 = full match).             
|\n| `confidence_prior` | float [0,1] | no       | Default 0.55.                                             |\n| `requires_context` | bool        | no       | If true, drops match when no context keyword in ±96 chars.|\n| `context_keywords` | []string    | no       | Keywords required if `requires_context: true`.            |\n| `min_entropy`      | float       | no       | Drop match when Shannon entropy < this.                   |\n| `min_len`          | int         | no       | Drop match shorter than this.                             |\n| `max_len`          | int         | no       | Drop match longer than this.                              |\n| `high_fp_prone`    | bool        | no       | Apply stricter entropy + char-class gates.                |\n| `tp_examples`      | []string    | yes¹     | Example values the rule MUST match.                       |\n| `fp_examples`      | []string    | yes¹     | Example values the rule MUST NOT match.                   |\n\n¹ Required *contractually* (R6). 
`--self-test` walks every rule's TP and FP\nfixtures; CI gates merges on `--self-test` exit code.\n\n## Confidence model\n\nScore starts at `confidence_prior`, then is adjusted:\n\n| Adjustment                                     | Delta  |\n|------------------------------------------------|--------|\n| Source path matches `vendor/chunk/runtime/…`   | −0.15  |\n| Provider validator passed                      | +0.10  |\n| Context keyword present (`requires_context`)   | +0.05  |\n| Surrounding text contains fixture keywords     | −0.30  |\n| Shannon entropy ≥ 4.5                          | +0.05  |\n| Character-class diversity ≥ 3                  | +0.05  |\n\nHard rejects (no score, drop the match):\n\n- Match in `vendorNoiseExact` or contains a `vendorNoiseSubstr` fragment.\n- Length below `min_len` or above `max_len`.\n- Entropy below `min_entropy`.\n- `high_fp_prone` rule with character-class diversity < 2 or entropy < 3.0.\n- `requires_context` rule with no keyword hit.\n- `Validate` function returns false (built-in rules only).\n- Line is a `//# sourceMappingURL=` marker.\n\n`--min-confidence` (default 0.50) gates the final score.\n\n## External rule pack format\n\nA pack is a JSON file containing an array of rule objects:\n\n```json\n[\n  {\n    \"id\": \"acme.api_key\",\n    \"name\": \"Acme API Key\",\n    \"provider\": \"Acme\",\n    \"secret_type\": \"api_key\",\n    \"severity\": \"high\",\n    \"pattern\": \"\\\\bacme_[A-Za-z0-9]{32}\\\\b\",\n    \"confidence_prior\": 0.85,\n    \"min_len\": 37,\n    \"max_len\": 37,\n    \"tp_examples\": [\"acme_aBcDeFgHiJkLmNoPqRsTuVwXyZ012345\"],\n    \"fp_examples\": [\"acme_placeholder_____xxxxxxxxxxxxxx\"]\n  }\n]\n```\n\nLoad with:\n\n```bash\njshunter --rules-file /path/to/rules.json -u https://target.com/app.js\n```\n\nValidation is strict: any rule that fails to compile or misses a required\nfield rejects the whole pack. 
That keeps \"why didn't my rule fire?\" out of\nthe support queue.\n\n## Adding a built-in rule\n\n1. Add a `Rule{}` literal to `registerRules()` in `detection.go`.\n2. If the provider has a stable read-only liveness endpoint, register a\n   verifier in `registerVerifiers()` in `verify.go` keyed by the rule ID.\n3. Add `TPExamples` and `FPExamples` lists. **No rule ships without an FP\n   fixture**; one of the most common failure modes is shipping a rule that\n   flags a famous open-source bundle.\n4. If your rule is `high_fp_prone` (matches a generic shape like 32-hex),\n   either set `requires_context: true` with provider-specific\n   `context_keywords`, or set `min_entropy` ≥ 3.5 and `max_len` to the\n   actual provider format. **Both are usually correct.**\n5. Run `go test ./...` and `./jshunter --self-test`. CI must be green.\n\n## Provider validator contract\n\nValidators are pure Go functions of signature\n`func(value string) (ok bool, reasons []string)`. They MUST be deterministic\nand offline; never call the network from a validator (use the verifier in\n`verify.go` for that). They MAY be slow per call (e.g., CRC32) — they run on\nmatched candidates only, not every byte of the body.\n\nExamples:\n\n- `validateAWSAccessKeyID` — prefix family + base32 alphabet check.\n- `validateGitHubToken` — CRC32 base62 trailing checksum verification.\n- `validateJWT` — base64url-decode header, parse JSON, require `alg` field.\n- `validateStripeKey` — prefix family + clean base62 body (no `_`).\n\n## Anti-patterns\n\n- ❌ A rule whose pattern is `(?i).*key.*=.*[A-Za-z0-9]{8,}.*`. This will\n  flood the operator. Be specific. Use the provider's documented prefix.\n- ❌ A rule with no FP fixture. You have a regex that flagged a vendor\n  bundle once; that's the FP fixture. Add it.\n- ❌ Calling `regexp.MustCompile` inside a hot loop. Compile once at\n  registration time.\n- ❌ A rule whose `severity` is `critical` for a publishable key. 
Public\n  keys aren't credentials; they're configuration. `low` or `info` is right.\n"
  },
  {
    "path": "cmd/jshunter/main.go",
    "content": "package main\n\nimport \"github.com/cc1a2b/jshunter/internal/jshunter\"\n\nfunc main() {\n\tjshunter.Run()\n}\n"
  },
  {
    "path": "go.mod",
    "content": "module github.com/cc1a2b/jshunter\n\ngo 1.24.0\n\ntoolchain go1.24.5\n\nrequire golang.org/x/net v0.49.0\n"
  },
  {
    "path": "go.sum",
    "content": "golang.org/x/net v0.49.0 h1:eeHFmOGUTtaaPSGNmjBKpbng9MulQsJURQUAfUwY++o=\ngolang.org/x/net v0.49.0/go.mod h1:/ysNB2EvaqvesRkuLAyjI1ycPZlQHM3q01F02UY/MV8=\n"
  },
  {
    "path": "internal/jshunter/aws_pair.go",
    "content": "package jshunter\n\nimport (\n\t\"context\"\n\t\"crypto/hmac\"\n\t\"crypto/sha256\"\n\t\"encoding/hex\"\n\t\"fmt\"\n\t\"io\"\n\t\"net/http\"\n\t\"strings\"\n\t\"time\"\n)\n\n// AWS SigV4 verifier for sts:GetCallerIdentity.\n//\n// AWS credentials come in pairs (Access Key ID + Secret Access Key); a single\n// AKID can't be verified alone — the signing process requires both. When\n// JSHunter detects both in the same source, this verifier signs a minimal\n// read-only POST to sts.amazonaws.com and reports back account/ARN.\n//\n// Region is fixed at us-east-1 because the global STS endpoint is\n// regionalized as us-east-1; service is \"sts\". No SDK dependency.\n\nconst (\n\tawsService = \"sts\"\n\tawsRegion  = \"us-east-1\"\n\tawsHost    = \"sts.amazonaws.com\"\n)\n\n// AWSPair is a (AKID, SecretKey) tuple discovered in the same source.\ntype AWSPair struct {\n\tAccessKeyID     string\n\tSecretAccessKey string\n\tSource          string\n\tLine            int\n\tColumn          int\n}\n\n// pairAWSCredentials walks the dedupe map and returns pairs that share a\n// Source. The pairing is conservative: same source, both findings present.\n// Cross-source pairing risks attaching the wrong secret to the wrong AKID.\nfunc pairAWSCredentials() []AWSPair {\n\tfindingsMutex.Lock()\n\tdefer findingsMutex.Unlock()\n\n\tbySource := map[string]struct {\n\t\takids   []*Finding\n\t\tsecrets []*Finding\n\t}{}\n\tfor _, f := range findingsByHash {\n\t\ts := bySource[f.Source]\n\t\tswitch f.RuleID {\n\t\tcase \"aws.access_key_id\":\n\t\t\ts.akids = append(s.akids, f)\n\t\tcase \"aws.secret_access_key\":\n\t\t\ts.secrets = append(s.secrets, f)\n\t\t}\n\t\tbySource[f.Source] = s\n\t}\n\n\tpairs := []AWSPair{}\n\tfor src, s := range bySource {\n\t\t// Single AKID + single secret in the same source is the only case\n\t\t// we can pair with confidence. 
Multiple of either yield ambiguity;\n\t\t// skip those — operator can run --no-fp-filter and triage manually.\n\t\tif len(s.akids) == 1 && len(s.secrets) == 1 {\n\t\t\ta, sec := s.akids[0], s.secrets[0]\n\t\t\tpairs = append(pairs, AWSPair{\n\t\t\t\tAccessKeyID:     a.Value,\n\t\t\t\tSecretAccessKey: sec.Value,\n\t\t\t\tSource:          src,\n\t\t\t\tLine:            a.Line,\n\t\t\t\tColumn:          a.Column,\n\t\t\t})\n\t\t}\n\t}\n\treturn pairs\n}\n\n// verifyAWSPair calls sts:GetCallerIdentity with SigV4. Returns alive=true\n// and the ARN of the caller on success, alive=false on any 4xx, error string\n// on transport failure. Sanitizes any leaked secret from the error.\nfunc verifyAWSPair(ctx context.Context, client *http.Client, p AWSPair) VerifyResult {\n\tbody := \"Action=GetCallerIdentity&Version=2011-06-15\"\n\tnow := time.Now().UTC()\n\tdateStr := now.Format(\"20060102\")\n\ttimeStr := now.Format(\"20060102T150405Z\")\n\n\tbodyHash := sha256Hex([]byte(body))\n\tcanonicalReq := strings.Join([]string{\n\t\t\"POST\",\n\t\t\"/\",\n\t\t\"\",\n\t\t\"content-type:application/x-www-form-urlencoded; charset=utf-8\",\n\t\t\"host:\" + awsHost,\n\t\t\"x-amz-content-sha256:\" + bodyHash,\n\t\t\"x-amz-date:\" + timeStr,\n\t\t\"\",\n\t\t\"content-type;host;x-amz-content-sha256;x-amz-date\",\n\t\tbodyHash,\n\t}, \"\\n\")\n\n\tcredScope := dateStr + \"/\" + awsRegion + \"/\" + awsService + \"/aws4_request\"\n\tstringToSign := strings.Join([]string{\n\t\t\"AWS4-HMAC-SHA256\",\n\t\ttimeStr,\n\t\tcredScope,\n\t\tsha256Hex([]byte(canonicalReq)),\n\t}, \"\\n\")\n\n\tsigningKey := awsDeriveSigningKey(p.SecretAccessKey, dateStr, awsRegion, awsService)\n\tsignature := hmacHex(signingKey, []byte(stringToSign))\n\n\tauthHeader := fmt.Sprintf(\n\t\t\"AWS4-HMAC-SHA256 Credential=%s/%s, SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date, Signature=%s\",\n\t\tp.AccessKeyID, credScope, signature,\n\t)\n\n\treq, err := http.NewRequestWithContext(ctx, \"POST\", 
\"https://\"+awsHost+\"/\", strings.NewReader(body))\n\tif err != nil {\n\t\treturn VerifyResult{Error: err.Error()}\n\t}\n\treq.Host = awsHost\n\treq.Header.Set(\"Content-Type\", \"application/x-www-form-urlencoded; charset=utf-8\")\n\treq.Header.Set(\"X-Amz-Date\", timeStr)\n\treq.Header.Set(\"X-Amz-Content-Sha256\", bodyHash)\n\treq.Header.Set(\"Authorization\", authHeader)\n\n\thost := req.URL.Host\n\trelease := verifyHostLimiter.acquire(host)\n\tdefer release()\n\n\tresp, err := client.Do(req)\n\tif err != nil {\n\t\treturn VerifyResult{Error: sanitizeNetErr(err.Error())}\n\t}\n\tdefer resp.Body.Close()\n\trespBody, _ := io.ReadAll(&capReader{r: resp.Body, max: 32 * 1024})\n\n\tres := VerifyResult{Status: resp.StatusCode}\n\tif resp.StatusCode == http.StatusOK {\n\t\tres.Alive = true\n\t\ts := string(respBody)\n\t\t// STS returns XML with <Arn>arn:aws:iam::123456789012:user/Foo</Arn>.\n\t\t// We use substring extraction rather than a full XML decoder — the\n\t\t// expected response shape is stable and small.\n\t\tif i := strings.Index(s, \"<Arn>\"); i != -1 {\n\t\t\tj := strings.Index(s[i+5:], \"</Arn>\")\n\t\t\tif j != -1 {\n\t\t\t\tres.Account = s[i+5 : i+5+j]\n\t\t\t}\n\t\t}\n\t\tres.Note = \"sts:GetCallerIdentity returned 200\"\n\t} else {\n\t\t// Don't leak the response body — STS error replies can echo the\n\t\t// AKID. 
Sanitize aggressively.\n\t\tres.Note = fmt.Sprintf(\"sts returned %d\", resp.StatusCode)\n\t}\n\treturn res\n}\n\n// awsDeriveSigningKey computes the SigV4 signing key:\n//\n//\tkDate    = HMAC(\"AWS4\" + secret, dateStr)\n//\tkRegion  = HMAC(kDate, region)\n//\tkService = HMAC(kRegion, service)\n//\tkSigning = HMAC(kService, \"aws4_request\")\nfunc awsDeriveSigningKey(secret, dateStr, region, service string) []byte {\n\tk := hmacBytes([]byte(\"AWS4\"+secret), []byte(dateStr))\n\tk = hmacBytes(k, []byte(region))\n\tk = hmacBytes(k, []byte(service))\n\treturn hmacBytes(k, []byte(\"aws4_request\"))\n}\n\nfunc sha256Hex(b []byte) string {\n\th := sha256.Sum256(b)\n\treturn hex.EncodeToString(h[:])\n}\n\nfunc hmacBytes(key, msg []byte) []byte {\n\tm := hmac.New(sha256.New, key)\n\tm.Write(msg)\n\treturn m.Sum(nil)\n}\n\nfunc hmacHex(key, msg []byte) string {\n\treturn hex.EncodeToString(hmacBytes(key, msg))\n}\n"
  },
  {
    "path": "internal/jshunter/cache.go",
    "content": "package jshunter\n\nimport (\n\t\"crypto/sha256\"\n\t\"encoding/hex\"\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"net/http\"\n\t\"os\"\n\t\"path/filepath\"\n\t\"sync\"\n\t\"time\"\n)\n\n// On-disk cache for HTTP responses, keyed by URL hash. The point is twofold:\n//\n//  1. Re-running JSHunter on the same target during triage shouldn't\n//     re-pull megabytes of bundles we already saw.\n//  2. With ETag / Last-Modified, the second run becomes mostly 304s on\n//     the wire — kinder to targets, faster on the operator's laptop.\n//\n// On disk:\n//   <cache-dir>/<sha256(url)>.body  — response body\n//   <cache-dir>/<sha256(url)>.meta  — JSON metadata (Etag, Last-Modified, status, fetchedAt, contentType)\n//\n// We do NOT cache responses with set-cookie or auth headers — those are\n// session-specific and caching them is a security hazard.\n\ntype cacheMeta struct {\n\tURL          string    `json:\"url\"`\n\tStatus       int       `json:\"status\"`\n\tContentType  string    `json:\"content_type,omitempty\"`\n\tETag         string    `json:\"etag,omitempty\"`\n\tLastModified string    `json:\"last_modified,omitempty\"`\n\tFetchedAt    time.Time `json:\"fetched_at\"`\n\tSize         int       `json:\"size\"`\n}\n\ntype DiskCache struct {\n\tdir string\n\tmu  sync.Mutex\n}\n\nfunc NewDiskCache(dir string) (*DiskCache, error) {\n\tif dir == \"\" {\n\t\treturn nil, nil\n\t}\n\tif err := os.MkdirAll(dir, 0o755); err != nil {\n\t\treturn nil, fmt.Errorf(\"create cache dir: %w\", err)\n\t}\n\treturn &DiskCache{dir: dir}, nil\n}\n\nfunc (c *DiskCache) keyFor(u string) string {\n\th := sha256.Sum256([]byte(u))\n\treturn hex.EncodeToString(h[:])\n}\n\nfunc (c *DiskCache) bodyPath(u string) string {\n\treturn filepath.Join(c.dir, c.keyFor(u)+\".body\")\n}\n\nfunc (c *DiskCache) metaPath(u string) string {\n\treturn filepath.Join(c.dir, c.keyFor(u)+\".meta\")\n}\n\n// Lookup returns the cached entry if present. 
Caller decides whether to\n// short-circuit (use as-is) or revalidate via If-None-Match.\nfunc (c *DiskCache) Lookup(u string) (body []byte, meta *cacheMeta, ok bool) {\n\tif c == nil {\n\t\treturn nil, nil, false\n\t}\n\tc.mu.Lock()\n\tdefer c.mu.Unlock()\n\n\trawMeta, err := os.ReadFile(c.metaPath(u))\n\tif err != nil {\n\t\treturn nil, nil, false\n\t}\n\tvar m cacheMeta\n\tif err := json.Unmarshal(rawMeta, &m); err != nil {\n\t\treturn nil, nil, false\n\t}\n\tbody, err = os.ReadFile(c.bodyPath(u))\n\tif err != nil {\n\t\treturn nil, nil, false\n\t}\n\treturn body, &m, true\n}\n\n// Store writes body + metadata. Skipped silently when the response carries\n// `Set-Cookie`, when the originating request carried `Authorization`\n// (security hazard), or when `body` is empty.\nfunc (c *DiskCache) Store(u string, resp *http.Response, body []byte) error {\n\tif c == nil || resp == nil || len(body) == 0 {\n\t\treturn nil\n\t}\n\tif resp.Header.Get(\"Set-Cookie\") != \"\" {\n\t\treturn nil\n\t}\n\t// Authorization is a request header, so inspect the request that produced\n\t// this response rather than the response headers.\n\tif resp.Request != nil && resp.Request.Header.Get(\"Authorization\") != \"\" {\n\t\treturn nil\n\t}\n\tc.mu.Lock()\n\tdefer c.mu.Unlock()\n\n\tmeta := cacheMeta{\n\t\tURL:          u,\n\t\tStatus:       resp.StatusCode,\n\t\tContentType:  resp.Header.Get(\"Content-Type\"),\n\t\tETag:         resp.Header.Get(\"ETag\"),\n\t\tLastModified: resp.Header.Get(\"Last-Modified\"),\n\t\tFetchedAt:    time.Now().UTC(),\n\t\tSize:         len(body),\n\t}\n\trawMeta, err := json.Marshal(meta)\n\tif err != nil {\n\t\treturn err\n\t}\n\tif err := os.WriteFile(c.metaPath(u), rawMeta, 0o600); err != nil {\n\t\treturn err\n\t}\n\tif err := os.WriteFile(c.bodyPath(u), body, 0o600); err != nil {\n\t\treturn err\n\t}\n\treturn nil\n}\n\n// AttachConditional sets If-None-Match / If-Modified-Since on a request when\n// we have a cached entry. 
The caller observes a 304 in makeRequestWithRetry\n// and substitutes the cached body.\nfunc (c *DiskCache) AttachConditional(req *http.Request) {\n\tif c == nil {\n\t\treturn\n\t}\n\t_, m, ok := c.Lookup(req.URL.String())\n\tif !ok {\n\t\treturn\n\t}\n\tif m.ETag != \"\" {\n\t\treq.Header.Set(\"If-None-Match\", m.ETag)\n\t}\n\tif m.LastModified != \"\" {\n\t\treq.Header.Set(\"If-Modified-Since\", m.LastModified)\n\t}\n}\n"
  },
  {
    "path": "internal/jshunter/concurrent_verify.go",
    "content": "package jshunter\n\nimport (\n\t\"context\"\n\t\"net/http\"\n\t\"sync\"\n\t\"time\"\n)\n\n// VerifyAllConcurrent runs liveness probes against every Finding that has\n// a registered verifier, using a bounded worker pool. Per-host concurrency\n// is still capped by `verifyHostLimiter`; the worker pool here is a\n// global ceiling on simultaneous outbound HTTP calls so a scan with 200\n// findings doesn't open 200 sockets at once.\n//\n// Each probe is bounded by `timeout`. Findings missing a verifier are\n// silently skipped. Mutates findings in place (sets Verified, Verify,\n// Confidence).\nfunc VerifyAllConcurrent(findings []*Finding, client *http.Client, timeout time.Duration, workers int) {\n\tif workers <= 0 {\n\t\tworkers = 8\n\t}\n\tif timeout <= 0 {\n\t\ttimeout = 10 * time.Second\n\t}\n\tregisterVerifiers()\n\n\tjobs := make(chan *Finding)\n\tvar wg sync.WaitGroup\n\twg.Add(workers)\n\tfor i := 0; i < workers; i++ {\n\t\tgo func() {\n\t\t\tdefer wg.Done()\n\t\t\tfor f := range jobs {\n\t\t\t\tv, ok := verifierRegistry[f.RuleID]\n\t\t\t\tif !ok {\n\t\t\t\t\tcontinue\n\t\t\t\t}\n\t\t\t\tif globalStats != nil {\n\t\t\t\t\tstatInc(&globalStats.VerifyAttempts)\n\t\t\t\t}\n\t\t\t\tctx, cancel := context.WithTimeout(context.Background(), timeout)\n\t\t\t\tres := v(ctx, client, f.Value)\n\t\t\t\tcancel()\n\t\t\t\tfindingsMutex.Lock()\n\t\t\t\tf.Verify = &res\n\t\t\t\tif res.Alive {\n\t\t\t\t\tf.Verified = true\n\t\t\t\t\tf.Confidence = 1.0\n\t\t\t\t}\n\t\t\t\tfindingsMutex.Unlock()\n\t\t\t\tswitch {\n\t\t\t\tcase res.Alive && globalStats != nil:\n\t\t\t\t\tstatInc(&globalStats.VerifyAlive)\n\t\t\t\tcase res.Error != \"\" && globalStats != nil:\n\t\t\t\t\tstatInc(&globalStats.VerifyError)\n\t\t\t\tcase globalStats != nil:\n\t\t\t\t\tstatInc(&globalStats.VerifyDead)\n\t\t\t\t}\n\t\t\t}\n\t\t}()\n\t}\n\tfor _, f := range findings {\n\t\tjobs <- f\n\t}\n\tclose(jobs)\n\twg.Wait()\n}\n"
  },
  {
    "path": "internal/jshunter/crawler.go",
    "content": "package jshunter\n\nimport (\n\t\"fmt\"\n\t\"math/rand\"\n\t\"net/http\"\n\t\"net/url\"\n\t\"strconv\"\n\t\"sync\"\n\t\"time\"\n)\n\n// Per-host concurrency cap. Recon tools that hit one host with N parallel\n// goroutines get banned. The default is intentionally conservative; operators\n// who own the target can raise it via --threads (which becomes a global cap)\n// without changing the per-host floor.\nconst (\n\tdefaultPerHostConcurrency = 4\n\tdefaultBreakerThreshold   = 5\n\tdefaultBreakerCooldown    = 30 * time.Second\n)\n\n// hostController bounds outbound concurrency per host AND tracks consecutive\n// 429/5xx responses for a circuit breaker. When the breaker trips, all\n// subsequent requests to that host are dropped until cooldown elapses.\ntype hostController struct {\n\tperHost int\n\tmu      sync.Mutex\n\tstate   map[string]*hostState\n}\n\ntype hostState struct {\n\tsem        chan struct{}\n\tfailStreak int\n\ttripUntil  time.Time\n}\n\nvar (\n\tglobalHostController *hostController\n\thostControllerOnce   sync.Once\n)\n\nfunc getHostController() *hostController {\n\thostControllerOnce.Do(func() {\n\t\tglobalHostController = &hostController{\n\t\t\tperHost: defaultPerHostConcurrency,\n\t\t\tstate:   map[string]*hostState{},\n\t\t}\n\t})\n\treturn globalHostController\n}\n\n// host returns or creates the per-host bookkeeping struct.\nfunc (c *hostController) host(h string) *hostState {\n\tc.mu.Lock()\n\tdefer c.mu.Unlock()\n\ts, ok := c.state[h]\n\tif !ok {\n\t\ts = &hostState{sem: make(chan struct{}, c.perHost)}\n\t\tc.state[h] = s\n\t}\n\treturn s\n}\n\n// acquire blocks until a token is available for the host. 
Returns a release\n// closure and a bool; false means the breaker is tripped and the caller\n// should NOT make the request.\nfunc (c *hostController) acquire(host string) (release func(), allowed bool) {\n\tif host == \"\" {\n\t\treturn func() {}, true\n\t}\n\ts := c.host(host)\n\tc.mu.Lock()\n\tif !s.tripUntil.IsZero() && time.Now().Before(s.tripUntil) {\n\t\tc.mu.Unlock()\n\t\treturn func() {}, false\n\t}\n\tc.mu.Unlock()\n\ts.sem <- struct{}{}\n\treturn func() { <-s.sem }, true\n}\n\n// recordOutcome teaches the circuit breaker. Any 2xx or 3xx clears the streak;\n// 429 or 5xx increments it; other statuses leave it unchanged. Once the streak\n// reaches the threshold the host is benched for the cooldown duration, or for\n// Retry-After when that is longer.\nfunc (c *hostController) recordOutcome(host string, status int, retryAfter time.Duration) {\n\tif host == \"\" {\n\t\treturn\n\t}\n\ts := c.host(host)\n\tc.mu.Lock()\n\tdefer c.mu.Unlock()\n\tif status >= 200 && status < 400 {\n\t\ts.failStreak = 0\n\t\treturn\n\t}\n\tif status == http.StatusTooManyRequests || status >= 500 {\n\t\ts.failStreak++\n\t\tif s.failStreak >= defaultBreakerThreshold {\n\t\t\tcd := defaultBreakerCooldown\n\t\t\tif retryAfter > cd {\n\t\t\t\tcd = retryAfter\n\t\t\t}\n\t\t\ts.tripUntil = time.Now().Add(cd)\n\t\t\ts.failStreak = 0\n\t\t}\n\t}\n}\n\n// parseRetryAfter returns the duration the server asked us to wait. Honors\n// both seconds-form (\"Retry-After: 30\") and HTTP-date form. 
Returns 0 when\n// absent or unparseable.\nfunc parseRetryAfter(h http.Header) time.Duration {\n\tv := h.Get(\"Retry-After\")\n\tif v == \"\" {\n\t\treturn 0\n\t}\n\tif secs, err := strconv.Atoi(v); err == nil && secs >= 0 {\n\t\treturn time.Duration(secs) * time.Second\n\t}\n\tif t, err := http.ParseTime(v); err == nil {\n\t\td := time.Until(t)\n\t\tif d > 0 {\n\t\t\treturn d\n\t\t}\n\t}\n\treturn 0\n}\n\n// backoffWithJitter returns the v0.6 retry sleep — exponential base with\n// ±25% jitter to avoid thundering-herd when many concurrent crawlers hit the\n// same backoff schedule on the same host.\nfunc backoffWithJitter(attempt int) time.Duration {\n\tif attempt < 0 {\n\t\tattempt = 0\n\t}\n\tif attempt > 6 {\n\t\tattempt = 6\n\t}\n\tbase := time.Duration(1<<uint(attempt)) * time.Second\n\tjitter := time.Duration(rand.Int63n(int64(base) / 2))\n\treturn base + jitter - time.Duration(int64(base)/4)\n}\n\n// hostOf is a tiny helper for the controller; tolerates malformed URLs.\nfunc hostOf(rawURL string) string {\n\tu, err := url.Parse(rawURL)\n\tif err != nil || u == nil {\n\t\treturn \"\"\n\t}\n\treturn u.Host\n}\n\n// describeBreaker is used by --verbose to explain why a request was dropped.\nfunc describeBreaker(host string) string {\n\tc := getHostController()\n\tc.mu.Lock()\n\tdefer c.mu.Unlock()\n\ts, ok := c.state[host]\n\tif !ok || s.tripUntil.IsZero() {\n\t\treturn \"\"\n\t}\n\tleft := time.Until(s.tripUntil).Round(time.Second)\n\treturn fmt.Sprintf(\"breaker tripped for %s, %s remaining\", host, left)\n}\n"
  },
  {
    "path": "internal/jshunter/csp.go",
    "content": "package jshunter\n\nimport (\n\t\"strings\"\n)\n\n// ParseCSPOrigins extracts host origins from a Content-Security-Policy\n// header (or http-equiv meta value). Recon use-case: the allow-list of\n// hosts a site loads from is a fast list of subdomains and third-party\n// vendors to investigate. The remaining source tokens are returned as\n// written (typically scheme://host[:port] or a bare host), de-duplicated.\n// Keywords (`'self'`, `'unsafe-inline'`), nonce and hash sources, tokens\n// starting with `*`, and the data:, blob:, mediastream:, filesystem:, ws:\n// and wss: schemes are filtered out.\nfunc ParseCSPOrigins(policy string) []string {\n\tif policy == \"\" {\n\t\treturn nil\n\t}\n\tseen := map[string]struct{}{}\n\tout := []string{}\n\tfor _, dir := range strings.Split(policy, \";\") {\n\t\tdir = strings.TrimSpace(dir)\n\t\tif dir == \"\" {\n\t\t\tcontinue\n\t\t}\n\t\tfields := strings.Fields(dir)\n\t\tif len(fields) < 2 {\n\t\t\tcontinue\n\t\t}\n\t\t// Skip directive name (default-src, script-src, …); iterate sources.\n\t\tfor _, src := range fields[1:] {\n\t\t\tsrc = strings.Trim(src, \"\\\"'\")\n\t\t\tif src == \"\" {\n\t\t\t\tcontinue\n\t\t\t}\n\t\t\tif strings.HasPrefix(src, \"'\") || strings.HasPrefix(src, \"*\") {\n\t\t\t\tcontinue\n\t\t\t}\n\t\t\tlow := strings.ToLower(src)\n\t\t\tif strings.HasPrefix(low, \"data:\") || strings.HasPrefix(low, \"blob:\") ||\n\t\t\t\tstrings.HasPrefix(low, \"mediastream:\") || strings.HasPrefix(low, \"filesystem:\") ||\n\t\t\t\tstrings.HasPrefix(low, \"ws:\") || strings.HasPrefix(low, \"wss:\") ||\n\t\t\t\tstrings.HasPrefix(low, \"self\") || strings.HasPrefix(low, \"none\") ||\n\t\t\t\tstrings.HasPrefix(low, \"nonce-\") || strings.HasPrefix(low, \"sha256-\") ||\n\t\t\t\tstrings.HasPrefix(low, \"sha384-\") || strings.HasPrefix(low, \"sha512-\") ||\n\t\t\t\tstrings.HasPrefix(low, \"strict-dynamic\") || strings.HasPrefix(low, \"report-sample\") ||\n\t\t\t\tstrings.HasPrefix(low, \"unsafe-\") {\n\t\t\t\tcontinue\n\t\t\t}\n\t\t\tif _, ok := seen[src]; !ok {\n\t\t\t\tseen[src] = struct{}{}\n\t\t\t\tout = append(out, src)\n\t\t\t}\n\t\t}\n\t}\n\treturn out\n}\n"
  },
  {
    "path": "internal/jshunter/detection.go",
    "content": "package jshunter\n\nimport (\n\t\"crypto/sha256\"\n\t\"encoding/base64\"\n\t\"encoding/hex\"\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"hash/crc32\"\n\t\"math\"\n\t\"regexp\"\n\t\"sort\"\n\t\"strings\"\n\t\"sync\"\n)\n\n// SchemaVersion tags every JSON finding so downstream tools can detect\n// breaking changes in the JSHunter output contract.\nconst (\n\tSchemaVersion        = 2\n\tDefaultMinConfidence = 0.50\n\tDefaultMaxBytes      = 32 * 1024 * 1024\n\tcontextWindow        = 96\n)\n\ntype Severity string\n\nconst (\n\tSevCritical Severity = \"critical\"\n\tSevHigh     Severity = \"high\"\n\tSevMedium   Severity = \"medium\"\n\tSevLow      Severity = \"low\"\n\tSevInfo     Severity = \"info\"\n)\n\n// Rule is a single secret-class detector with all signals the FP pipeline needs.\ntype Rule struct {\n\tID              string\n\tName            string\n\tProvider        string\n\tSecretType      string\n\tSeverity        Severity\n\tPattern         *regexp.Regexp\n\tGroup           int\n\tConfidencePrior float64\n\tRequiresContext bool\n\tContextKeywords []string\n\tMinEntropy      float64\n\tMinLen          int\n\tMaxLen          int\n\tHighFPProne     bool\n\tValidate        func(string) (bool, []string)\n\tTPExamples      []string\n\tFPExamples      []string\n}\n\n// Location records every distinct site at which the same secret value was seen.\ntype Location struct {\n\tSource string `json:\"source\"`\n\tLine   int    `json:\"line,omitempty\"`\n\tColumn int    `json:\"column,omitempty\"`\n}\n\n// Finding is the v0.6 unit of output: scored, deduped, and self-describing.\n// Line/Column carry the position of the *first* occurrence; Locations[]\n// mirrors all subsequent occurrences after dedupe.\ntype Finding struct {\n\tSchemaVersion int           `json:\"schema_version\"`\n\tRuleID        string        `json:\"rule_id,omitempty\"`\n\tName          string        `json:\"name\"`\n\tProvider      string        `json:\"provider,omitempty\"`\n\tSecretType   
 string        `json:\"secret_type,omitempty\"`\n\tSeverity      Severity      `json:\"severity,omitempty\"`\n\tValue         string        `json:\"value,omitempty\"`\n\tRedacted      string        `json:\"redacted\"`\n\tValueHash     string        `json:\"value_hash\"`\n\tSource        string        `json:\"source\"`\n\tLine          int           `json:\"line,omitempty\"`\n\tColumn        int           `json:\"column,omitempty\"`\n\tConfidence    float64       `json:\"confidence\"`\n\tEntropy       float64       `json:\"entropy\"`\n\tVerified      bool          `json:\"verified\"`\n\tVerify        *VerifyResult `json:\"verify,omitempty\"`\n\tReasons       []string      `json:\"reasons,omitempty\"`\n\tLocations     []Location    `json:\"locations,omitempty\"`\n}\n\nvar (\n\trulesRegistry  []Rule\n\trulesIndex     = make(map[string]*Rule)\n\trulesOnce      sync.Once\n\tfindingsByHash = make(map[string]*Finding)\n\tfindingsMutex  sync.Mutex\n\n\t// Operator-managed suppression hooks. Set from main() once at startup,\n\t// consulted on every recordFinding. Both are nil-safe.\n\tactiveIgnoreList *IgnoreList\n\tactiveDiffSeen   map[string]bool\n)\n\n// Famous false positives the recon community has burned cycles on.\n// Exact-match denylist; never a true positive.\n//\n// Provider sample values are split into prefix + body fragments so this\n// source file does not itself trigger upstream secret-scanning systems\n// (GitHub Push Protection, etc.). 
The runtime value is identical — Go\n// folds the constant concatenation at compile time.\nvar vendorNoiseExact = map[string]struct{}{\n\t\"AKIA\" + \"IOSFODNN7EXAMPLE\":                  {},\n\t\"wJalrXUtnFEMI/K7MDENG/bPxRfi\" + \"CYEXAMPLEKEY\": {},\n\t\"sk_\" + \"test_\" + \"BQokikJOvBiI2HlWgH4olfQ2\": {},\n\t\"pk_\" + \"test_\" + \"TYooMQauvdEDq54NiTphI7jx\": {},\n\t\"sk_\" + \"test_\" + \"4eC39HqLyjWDarjtT1zdp7dc\": {},\n\t\"pk_\" + \"test_\" + \"6pRNAsCfBOKtIshFeQd4XMUh\": {},\n\t\"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9\" +\n\t\t\".eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ\" +\n\t\t\".SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c\": {},\n}\n\n// Substring denylist: any match containing one of these is sample/placeholder.\nvar vendorNoiseSubstr = []string{\n\t\"EXAMPLEKEY\", \"EXAMPLEEXAMPLE\", \"YOUR_API_KEY\", \"YOURAPIKEY\",\n\t\"REPLACEME\", \"REPLACE_ME\", \"PLACEHOLDER\", \"XXXXXXXX\",\n\t\"INSERT_KEY_HERE\", \"PUT_KEY_HERE\", \"ENTER_YOUR_KEY\",\n\t\"my_secret_key\", \"test-secret-key\",\n\t\"'password'\", `\"password\"`, \"'PASSWORD'\", `\"PASSWORD\"`, \"'Password'\", `\"Password\"`,\n\t\"'passwd'\", `\"passwd\"`, \"'pwd'\", `\"pwd\"`,\n\t\"'changeme'\", `\"changeme\"`, \"'CHANGEME'\", `\"CHANGEME\"`,\n\t\"'admin'\", `\"admin\"`, \"'admin123'\", `\"admin123\"`,\n\t\"'12345678'\", `\"12345678\"`, \"'123456789'\", `\"123456789\"`,\n\t\"'qwerty'\", `\"qwerty\"`, \"'qwerty123'\", `\"qwerty123\"`,\n\t\"'letmein'\", `\"letmein\"`, \"'test123'\", `\"test123\"`,\n\t\"'secret'\", `\"secret\"`, \"'default'\", `\"default\"`,\n}\n\n// Surrounding-context tokens that lower the score (looks like a fixture/sample).\nvar fixtureKeywords = []string{\n\t\"example\", \"fixture\", \"dummy\", \"sample\",\n\t\"placeholder\", \"fake_\", \"mock_\", \"stub_\", \"lorem\",\n\t\"FIXME\", \"TODO\", \"// e.g.\", \"for example\",\n}\n\n// Generic-rule context: at least one of these must appear within ±contextWindow\n// chars when a rule is flagged 
RequiresContext=true.\nvar contextKeywordsGeneric = []string{\n\t\"key\", \"token\", \"secret\", \"auth\", \"bearer\", \"api\",\n\t\"private\", \"credential\", \"password\", \"pwd\", \"session\", \"access\",\n}\n\n// Sourcemap signature on the line is an instant skip - it's a build artifact.\nvar sourcemapMarkerRe = regexp.MustCompile(`(?i)//[#@]\\s*source(?:mapping)?URL=`)\n\n// Vendor chunk filename hint; raises the FP threshold.\nvar vendorChunkRe = regexp.MustCompile(`(?i)(?:vendor|chunk|runtime|polyfill|framework|webpack|node_modules)[-_./~]`)\n\n// registerRules wires the curated provider registry. Called lazily so users\n// who only consume legacy regexPatterns pay nothing.\nfunc registerRules() {\n\trulesOnce.Do(func() {\n\t\trulesRegistry = append(rulesRegistry, []Rule{\n\t\t\t{\n\t\t\t\tID:              \"aws.access_key_id\",\n\t\t\t\tName:            \"AWS Access Key ID\",\n\t\t\t\tProvider:        \"AWS\",\n\t\t\t\tSecretType:      \"access_key_id\",\n\t\t\t\tSeverity:        SevCritical,\n\t\t\t\tPattern:         regexp.MustCompile(`\\b(?:AKIA|ASIA|A3T[A-Z0-9]|AGPA|AIDA|AROA|AIPA|ANPA|ANVA)[A-Z2-7]{16}\\b`),\n\t\t\t\tConfidencePrior: 0.85,\n\t\t\t\tMinLen:          20, MaxLen: 20,\n\t\t\t\tValidate:   validateAWSAccessKeyID,\n\t\t\t\tTPExamples: []string{\"AKIA2OGYBAH6STMMNXWG\"},\n\t\t\t\tFPExamples: []string{\"AKIAIOSFODNN7EXAMPLE\"},\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"aws.secret_access_key\",\n\t\t\t\tName:            \"AWS Secret Access Key\",\n\t\t\t\tProvider:        \"AWS\",\n\t\t\t\tSecretType:      \"secret_key\",\n\t\t\t\tSeverity:        SevCritical,\n\t\t\t\tPattern:         regexp.MustCompile(`(?i)\\b(?:aws[_-]?(?:secret|sk))[_-]?(?:access[_-]?)?key\\s*[:=]\\s*[\"']([A-Za-z0-9/+=]{40})[\"']`),\n\t\t\t\tGroup:           1,\n\t\t\t\tConfidencePrior: 0.80,\n\t\t\t\tMinEntropy:      4.2,\n\t\t\t\tMinLen:          40, MaxLen: 40,\n\t\t\t\tHighFPProne: true,\n\t\t\t\tValidate:    validateAWSSecretKey,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:  
            \"stripe.secret_key\",\n\t\t\t\tName:            \"Stripe Secret Key\",\n\t\t\t\tProvider:        \"Stripe\",\n\t\t\t\tSecretType:      \"api_key\",\n\t\t\t\tSeverity:        SevCritical,\n\t\t\t\tPattern:         regexp.MustCompile(`\\b(?:sk|rk)_live_[0-9A-Za-z]{20,247}\\b`),\n\t\t\t\tConfidencePrior: 0.95,\n\t\t\t\tMinLen:          28,\n\t\t\t\tValidate:        validateStripeKey,\n\t\t\t\tTPExamples:      []string{\"sk_\" + \"live_\" + \"51HVFjkJK29bs8Hjk39MeOpqRsTuVwXyZ\"},\n\t\t\t\tFPExamples:      []string{\"sk_\" + \"test_\" + \"BQokikJOvBiI2HlWgH4olfQ2\"},\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"stripe.restricted_key\",\n\t\t\t\tName:            \"Stripe Restricted Key\",\n\t\t\t\tProvider:        \"Stripe\",\n\t\t\t\tSecretType:      \"api_key\",\n\t\t\t\tSeverity:        SevCritical,\n\t\t\t\tPattern:         regexp.MustCompile(`\\brk_live_[0-9A-Za-z]{20,247}\\b`),\n\t\t\t\tConfidencePrior: 0.95,\n\t\t\t\tMinLen:          28,\n\t\t\t\tValidate:        validateStripeKey,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"stripe.publishable_key\",\n\t\t\t\tName:            \"Stripe Publishable Key\",\n\t\t\t\tProvider:        \"Stripe\",\n\t\t\t\tSecretType:      \"publishable_key\",\n\t\t\t\tSeverity:        SevLow,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bpk_live_[0-9A-Za-z]{20,247}\\b`),\n\t\t\t\tConfidencePrior: 0.95,\n\t\t\t\tMinLen:          28,\n\t\t\t\tValidate:        validateStripeKey,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"github.pat_classic\",\n\t\t\t\tName:            \"GitHub Personal Access Token\",\n\t\t\t\tProvider:        \"GitHub\",\n\t\t\t\tSecretType:      \"pat\",\n\t\t\t\tSeverity:        SevCritical,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bgh[oprsu]_[A-Za-z0-9]{36,251}\\b`),\n\t\t\t\tConfidencePrior: 0.85,\n\t\t\t\tMinLen:          40,\n\t\t\t\tValidate:        validateGitHubToken,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"github.fine_grained_pat\",\n\t\t\t\tName:            \"GitHub 
Fine-Grained PAT\",\n\t\t\t\tProvider:        \"GitHub\",\n\t\t\t\tSecretType:      \"pat\",\n\t\t\t\tSeverity:        SevCritical,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bgithub_pat_[0-9A-Za-z_]{82}\\b`),\n\t\t\t\tConfidencePrior: 0.95,\n\t\t\t\tMinLen:          93, MaxLen: 93,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"openai.legacy_key\",\n\t\t\t\tName:            \"OpenAI API Key (legacy)\",\n\t\t\t\tProvider:        \"OpenAI\",\n\t\t\t\tSecretType:      \"api_key\",\n\t\t\t\tSeverity:        SevCritical,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bsk-[A-Za-z0-9]{20}T3BlbkFJ[A-Za-z0-9]{20}\\b`),\n\t\t\t\tConfidencePrior: 0.95,\n\t\t\t\tMinLen:          51, MaxLen: 51,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"openai.project_key\",\n\t\t\t\tName:            \"OpenAI Project Key\",\n\t\t\t\tProvider:        \"OpenAI\",\n\t\t\t\tSecretType:      \"api_key\",\n\t\t\t\tSeverity:        SevCritical,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bsk-proj-[A-Za-z0-9_\\-]{40,200}\\b`),\n\t\t\t\tConfidencePrior: 0.92,\n\t\t\t\tMinLen:          48,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"openai.svcacct_key\",\n\t\t\t\tName:            \"OpenAI Service Account Key\",\n\t\t\t\tProvider:        \"OpenAI\",\n\t\t\t\tSecretType:      \"api_key\",\n\t\t\t\tSeverity:        SevCritical,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bsk-svcacct-[A-Za-z0-9_\\-]{40,200}\\b`),\n\t\t\t\tConfidencePrior: 0.92,\n\t\t\t\tMinLen:          48,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"anthropic.api_key\",\n\t\t\t\tName:            \"Anthropic API Key\",\n\t\t\t\tProvider:        \"Anthropic\",\n\t\t\t\tSecretType:      \"api_key\",\n\t\t\t\tSeverity:        SevCritical,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bsk-ant-(?:api|admin)\\d{2}-[A-Za-z0-9_\\-]{86,200}\\b`),\n\t\t\t\tConfidencePrior: 0.95,\n\t\t\t\tMinLen:          93,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"google.api_key\",\n\t\t\t\tName:            \"Google API 
Key\",\n\t\t\t\tProvider:        \"Google\",\n\t\t\t\tSecretType:      \"api_key\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bAIza[0-9A-Za-z_\\-]{35}\\b`),\n\t\t\t\tConfidencePrior: 0.85,\n\t\t\t\tMinLen:          39, MaxLen: 39,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"google.oauth_token\",\n\t\t\t\tName:            \"Google OAuth Access Token\",\n\t\t\t\tProvider:        \"Google\",\n\t\t\t\tSecretType:      \"oauth\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bya29\\.[0-9A-Za-z_\\-]{40,200}\\b`),\n\t\t\t\tConfidencePrior: 0.90,\n\t\t\t\tMinLen:          45,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"slack.user_or_bot_token\",\n\t\t\t\tName:            \"Slack Token\",\n\t\t\t\tProvider:        \"Slack\",\n\t\t\t\tSecretType:      \"api_token\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bxox[bopsare]-(?:\\d+-){1,4}[A-Za-z0-9]{16,40}\\b`),\n\t\t\t\tConfidencePrior: 0.92,\n\t\t\t\tMinLen:          24,\n\t\t\t\tValidate:        validateSlackToken,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"slack.app_token\",\n\t\t\t\tName:            \"Slack App Token\",\n\t\t\t\tProvider:        \"Slack\",\n\t\t\t\tSecretType:      \"api_token\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bxapp-\\d-[A-Z0-9]+-\\d+-[A-Za-z0-9]{40,80}\\b`),\n\t\t\t\tConfidencePrior: 0.92,\n\t\t\t\tMinLen:          50,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"slack.webhook\",\n\t\t\t\tName:            \"Slack Incoming Webhook\",\n\t\t\t\tProvider:        \"Slack\",\n\t\t\t\tSecretType:      \"webhook\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bhttps://hooks\\.slack\\.com/services/T[A-Z0-9]{8,12}/B[A-Z0-9]{8,12}/[A-Za-z0-9]{20,40}\\b`),\n\t\t\t\tConfidencePrior: 0.97,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"discord.webhook\",\n\t\t\t\tName:            \"Discord 
Webhook\",\n\t\t\t\tProvider:        \"Discord\",\n\t\t\t\tSecretType:      \"webhook\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bhttps://(?:discord|discordapp)\\.com/api/webhooks/\\d{17,20}/[A-Za-z0-9_\\-]{60,80}\\b`),\n\t\t\t\tConfidencePrior: 0.97,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"discord.bot_token\",\n\t\t\t\tName:            \"Discord Bot Token\",\n\t\t\t\tProvider:        \"Discord\",\n\t\t\t\tSecretType:      \"bot_token\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\b[MN][A-Za-z\\d]{23,25}\\.[\\w-]{6}\\.[\\w-]{27,38}\\b`),\n\t\t\t\tConfidencePrior: 0.85,\n\t\t\t\tMinLen:          59,\n\t\t\t\tHighFPProne:     true,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"twilio.api_key\",\n\t\t\t\tName:            \"Twilio API Key (SK)\",\n\t\t\t\tProvider:        \"Twilio\",\n\t\t\t\tSecretType:      \"api_key\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bSK[0-9a-fA-F]{32}\\b`),\n\t\t\t\tConfidencePrior: 0.75,\n\t\t\t\tMinLen:          34, MaxLen: 34,\n\t\t\t\tHighFPProne:     true,\n\t\t\t\tValidate:        validateTwilioSK,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"twilio.account_sid\",\n\t\t\t\tName:            \"Twilio Account SID (AC)\",\n\t\t\t\tProvider:        \"Twilio\",\n\t\t\t\tSecretType:      \"account_id\",\n\t\t\t\tSeverity:        SevMedium,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bAC[a-f0-9]{32}\\b`),\n\t\t\t\tConfidencePrior: 0.70,\n\t\t\t\tMinLen:          34, MaxLen: 34,\n\t\t\t\tHighFPProne:     true,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"sendgrid.api_key\",\n\t\t\t\tName:            \"SendGrid API Key\",\n\t\t\t\tProvider:        \"SendGrid\",\n\t\t\t\tSecretType:      \"api_key\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bSG\\.[A-Za-z0-9_\\-]{22}\\.[A-Za-z0-9_\\-]{43}\\b`),\n\t\t\t\tConfidencePrior: 0.97,\n\t\t\t\tMinLen:          69, MaxLen: 
69,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"mailgun.api_key\",\n\t\t\t\tName:            \"Mailgun API Key\",\n\t\t\t\tProvider:        \"Mailgun\",\n\t\t\t\tSecretType:      \"api_key\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bkey-[0-9a-zA-Z]{32}\\b`),\n\t\t\t\tConfidencePrior: 0.85,\n\t\t\t\tMinLen:          36, MaxLen: 36,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"mailchimp.api_key\",\n\t\t\t\tName:            \"Mailchimp API Key\",\n\t\t\t\tProvider:        \"Mailchimp\",\n\t\t\t\tSecretType:      \"api_key\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\b[0-9a-f]{32}-us[0-9]{1,3}\\b`),\n\t\t\t\tConfidencePrior: 0.85,\n\t\t\t\tMinLen:          35,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"github.app_install_token\",\n\t\t\t\tName:            \"GitHub App Installation Token\",\n\t\t\t\tProvider:        \"GitHub\",\n\t\t\t\tSecretType:      \"installation_token\",\n\t\t\t\tSeverity:        SevCritical,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bv1\\.[a-f0-9]{40,}\\b`),\n\t\t\t\tConfidencePrior: 0.55,\n\t\t\t\tMinLen:          43,\n\t\t\t\tHighFPProne:     true,\n\t\t\t\tRequiresContext: true,\n\t\t\t\tContextKeywords: []string{\"github\", \"token\", \"install\", \"app\"},\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"gitlab.pat\",\n\t\t\t\tName:            \"GitLab Personal Access Token\",\n\t\t\t\tProvider:        \"GitLab\",\n\t\t\t\tSecretType:      \"pat\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bglpat-[0-9A-Za-z_\\-]{20,40}\\b`),\n\t\t\t\tConfidencePrior: 0.95,\n\t\t\t\tMinLen:          26,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"gitlab.pipeline_token\",\n\t\t\t\tName:            \"GitLab Pipeline Trigger Token\",\n\t\t\t\tProvider:        \"GitLab\",\n\t\t\t\tSecretType:      \"pipeline_token\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         
regexp.MustCompile(`\\bglptt-[a-f0-9]{40}\\b`),\n\t\t\t\tConfidencePrior: 0.97,\n\t\t\t\tMinLen:          46, MaxLen: 46,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"vercel.token\",\n\t\t\t\tName:            \"Vercel Token\",\n\t\t\t\tProvider:        \"Vercel\",\n\t\t\t\tSecretType:      \"token\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\b(?:vercel_)?[A-Za-z0-9]{24}\\b`),\n\t\t\t\tConfidencePrior: 0.45,\n\t\t\t\tHighFPProne:     true,\n\t\t\t\tRequiresContext: true,\n\t\t\t\tContextKeywords: []string{\"vercel\", \"VERCEL_TOKEN\"},\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"doppler.token\",\n\t\t\t\tName:            \"Doppler Token\",\n\t\t\t\tProvider:        \"Doppler\",\n\t\t\t\tSecretType:      \"token\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bdp\\.(?:pt|st|sa|ct|scim|audit)\\.[A-Za-z0-9]{40,44}\\b`),\n\t\t\t\tConfidencePrior: 0.97,\n\t\t\t\tMinLen:          47,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"digitalocean.token\",\n\t\t\t\tName:            \"DigitalOcean Token\",\n\t\t\t\tProvider:        \"DigitalOcean\",\n\t\t\t\tSecretType:      \"token\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bdop_v1_[a-f0-9]{64}\\b`),\n\t\t\t\tConfidencePrior: 0.97,\n\t\t\t\tMinLen:          71, MaxLen: 71,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"shopify.access_token\",\n\t\t\t\tName:            \"Shopify Access Token\",\n\t\t\t\tProvider:        \"Shopify\",\n\t\t\t\tSecretType:      \"access_token\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bshpat_[a-fA-F0-9]{32}\\b`),\n\t\t\t\tConfidencePrior: 0.97,\n\t\t\t\tMinLen:          38, MaxLen: 38,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"shopify.shared_secret\",\n\t\t\t\tName:            \"Shopify Shared Secret\",\n\t\t\t\tProvider:        \"Shopify\",\n\t\t\t\tSecretType:      \"shared_secret\",\n\t\t\t\tSeverity:        
SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bshpss_[a-fA-F0-9]{32}\\b`),\n\t\t\t\tConfidencePrior: 0.97,\n\t\t\t\tMinLen:          38, MaxLen: 38,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"npm.token\",\n\t\t\t\tName:            \"npm Access Token\",\n\t\t\t\tProvider:        \"npm\",\n\t\t\t\tSecretType:      \"token\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bnpm_[A-Za-z0-9]{36}\\b`),\n\t\t\t\tConfidencePrior: 0.95,\n\t\t\t\tMinLen:          40, MaxLen: 40,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"pypi.token\",\n\t\t\t\tName:            \"PyPI Token\",\n\t\t\t\tProvider:        \"PyPI\",\n\t\t\t\tSecretType:      \"token\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bpypi-AgEIcHlwaS5vcmc[A-Za-z0-9_\\-]{50,200}\\b`),\n\t\t\t\tConfidencePrior: 0.97,\n\t\t\t\tMinLen:          70,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"jwt.token\",\n\t\t\t\tName:            \"JSON Web Token\",\n\t\t\t\tProvider:        \"JWT\",\n\t\t\t\tSecretType:      \"jwt\",\n\t\t\t\tSeverity:        SevMedium,\n\t\t\t\tPattern:         regexp.MustCompile(`\\beyJ[A-Za-z0-9_\\-]{8,}\\.eyJ[A-Za-z0-9_\\-]{8,}\\.[A-Za-z0-9_\\-]{8,}\\b`),\n\t\t\t\tConfidencePrior: 0.70,\n\t\t\t\tMinLen:          40,\n\t\t\t\tValidate:        validateJWT,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"rsa.private_key\",\n\t\t\t\tName:            \"RSA Private Key\",\n\t\t\t\tProvider:        \"PKI\",\n\t\t\t\tSecretType:      \"private_key\",\n\t\t\t\tSeverity:        SevCritical,\n\t\t\t\tPattern:         regexp.MustCompile(`-----BEGIN RSA PRIVATE KEY-----`),\n\t\t\t\tConfidencePrior: 0.99,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"openssh.private_key\",\n\t\t\t\tName:            \"OpenSSH Private Key\",\n\t\t\t\tProvider:        \"PKI\",\n\t\t\t\tSecretType:      \"private_key\",\n\t\t\t\tSeverity:        SevCritical,\n\t\t\t\tPattern:         regexp.MustCompile(`-----BEGIN OPENSSH PRIVATE 
KEY-----`),\n\t\t\t\tConfidencePrior: 0.99,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"ec.private_key\",\n\t\t\t\tName:            \"EC Private Key\",\n\t\t\t\tProvider:        \"PKI\",\n\t\t\t\tSecretType:      \"private_key\",\n\t\t\t\tSeverity:        SevCritical,\n\t\t\t\tPattern:         regexp.MustCompile(`-----BEGIN EC PRIVATE KEY-----`),\n\t\t\t\tConfidencePrior: 0.99,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"pgp.private_key\",\n\t\t\t\tName:            \"PGP Private Key\",\n\t\t\t\tProvider:        \"PKI\",\n\t\t\t\tSecretType:      \"private_key\",\n\t\t\t\tSeverity:        SevCritical,\n\t\t\t\tPattern:         regexp.MustCompile(`-----BEGIN PGP PRIVATE KEY BLOCK-----`),\n\t\t\t\tConfidencePrior: 0.99,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"facebook.access_token\",\n\t\t\t\tName:            \"Facebook Access Token\",\n\t\t\t\tProvider:        \"Meta\",\n\t\t\t\tSecretType:      \"access_token\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bEAA[A-Za-z0-9]{20,200}\\b`),\n\t\t\t\tConfidencePrior: 0.65,\n\t\t\t\tMinLen:          25,\n\t\t\t\tHighFPProne:     true,\n\t\t\t\tRequiresContext: true,\n\t\t\t\tContextKeywords: []string{\"facebook\", \"fb\", \"meta\", \"graph.facebook\"},\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"linear.api_key\",\n\t\t\t\tName:            \"Linear API Key\",\n\t\t\t\tProvider:        \"Linear\",\n\t\t\t\tSecretType:      \"api_key\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\blin_(?:api|oauth)_[A-Za-z0-9]{40,80}\\b`),\n\t\t\t\tConfidencePrior: 0.97,\n\t\t\t\tMinLen:          47,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"huggingface.token\",\n\t\t\t\tName:            \"HuggingFace Token\",\n\t\t\t\tProvider:        \"HuggingFace\",\n\t\t\t\tSecretType:      \"token\",\n\t\t\t\tSeverity:        SevHigh,\n\t\t\t\tPattern:         regexp.MustCompile(`\\bhf_[A-Za-z0-9]{34,80}\\b`),\n\t\t\t\tConfidencePrior: 0.92,\n\t\t\t\tMinLen: 
         37,\n\t\t\t},\n\t\t\t{\n\t\t\t\tID:              \"supabase.service_role\",\n\t\t\t\tName:            \"Supabase Service Role JWT\",\n\t\t\t\tProvider:        \"Supabase\",\n\t\t\t\tSecretType:      \"service_role\",\n\t\t\t\tSeverity:        SevCritical,\n\t\t\t\tPattern:         regexp.MustCompile(`\\beyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9\\.[A-Za-z0-9_\\-]+\\.[A-Za-z0-9_\\-]+\\b`),\n\t\t\t\tConfidencePrior: 0.75,\n\t\t\t\tMinLen:          60,\n\t\t\t\tValidate:        validateJWT,\n\t\t\t},\n\t\t}...)\n\n\t\tfor i := range rulesRegistry {\n\t\t\tr := &rulesRegistry[i]\n\t\t\trulesIndex[r.ID] = r\n\t\t}\n\t})\n}\n\n// shannonEntropy is the standard bits-per-symbol entropy, computed over runes\n// so the symbol probabilities sum to 1 for non-ASCII input as well. Vendored\n// here to avoid pulling another dep.\nfunc shannonEntropy(s string) float64 {\n\tif s == \"\" {\n\t\treturn 0\n\t}\n\tfreq := make(map[rune]int, len(s))\n\ttotal := 0 // rune count; len(s) would count bytes\n\tfor _, r := range s {\n\t\tfreq[r]++\n\t\ttotal++\n\t}\n\tlength := float64(total)\n\tentropy := 0.0\n\tfor _, c := range freq {\n\t\tp := float64(c) / length\n\t\tentropy -= p * math.Log2(p)\n\t}\n\treturn entropy\n}\n\n// charClassDiversity counts how many of the four shape classes the string uses.\n// Real secrets tend to use 3+; minified identifiers use 1-2.\nfunc charClassDiversity(s string) int {\n\tvar hasLower, hasUpper, hasDigit, hasSym bool\n\tfor _, r := range s {\n\t\tswitch {\n\t\tcase r >= 'a' && r <= 'z':\n\t\t\thasLower = true\n\t\tcase r >= 'A' && r <= 'Z':\n\t\t\thasUpper = true\n\t\tcase r >= '0' && r <= '9':\n\t\t\thasDigit = true\n\t\tcase r == '_' || r == '-' || r == '.' 
|| r == '+' || r == '/' || r == '=':\n\t\t\thasSym = true\n\t\t}\n\t}\n\td := 0\n\tif hasLower {\n\t\td++\n\t}\n\tif hasUpper {\n\t\td++\n\t}\n\tif hasDigit {\n\t\td++\n\t}\n\tif hasSym {\n\t\td++\n\t}\n\treturn d\n}\n\n// redactValue masks the middle of a secret while keeping enough head/tail to\n// disambiguate findings without leaking the secret to logs.\nfunc redactValue(v string) string {\n\tn := len(v)\n\tswitch {\n\tcase n <= 8:\n\t\treturn strings.Repeat(\"*\", n)\n\tcase n <= 16:\n\t\treturn v[:2] + strings.Repeat(\"*\", n-4) + v[n-2:]\n\tdefault:\n\t\treturn v[:4] + strings.Repeat(\"*\", n-8) + v[n-4:]\n\t}\n}\n\n// hashValue gives a stable, short identifier for dedup that can't be reversed\n// to the secret itself when emitted in summary output.\nfunc hashValue(v string) string {\n\th := sha256.Sum256([]byte(v))\n\treturn hex.EncodeToString(h[:8])\n}\n\n// looksLikeFixture returns true if the surrounding context smells like docs/example code.\nfunc looksLikeFixture(context string) bool {\n\tlow := strings.ToLower(context)\n\tfor _, kw := range fixtureKeywords {\n\t\tif strings.Contains(low, strings.ToLower(kw)) {\n\t\t\treturn true\n\t\t}\n\t}\n\treturn false\n}\n\n// hasContextKeyword checks whether at least one of `kws` appears (case-insensitive)\n// in the given context window.\nfunc hasContextKeyword(context string, kws []string) bool {\n\tif len(kws) == 0 {\n\t\treturn true\n\t}\n\tlow := strings.ToLower(context)\n\tfor _, kw := range kws {\n\t\tif strings.Contains(low, strings.ToLower(kw)) {\n\t\t\treturn true\n\t\t}\n\t}\n\treturn false\n}\n\n// isInVendorNoise screens canonical sample/placeholder values.\nfunc isInVendorNoise(v string) (bool, string) {\n\tif _, ok := vendorNoiseExact[v]; ok {\n\t\treturn true, \"exact-match in vendor-noise corpus\"\n\t}\n\tfor _, sub := range vendorNoiseSubstr {\n\t\tif strings.Contains(v, sub) {\n\t\t\treturn true, \"contains placeholder fragment '\" + sub + \"'\"\n\t\t}\n\t}\n\treturn false, \"\"\n}\n\n// 
extractContextWindow returns the slice of body around `start..end` for context analysis.\nfunc extractContextWindow(body string, start, end int) string {\n\ta := start - contextWindow\n\tif a < 0 {\n\t\ta = 0\n\t}\n\tb := end + contextWindow\n\tif b > len(body) {\n\t\tb = len(body)\n\t}\n\treturn body[a:b]\n}\n\n// scoreFinding runs the FP pipeline against (rule, value, context) and returns\n// (keep, score, reasons). Score is in [0,1]. Caller uses the configured\n// minimum-confidence threshold to gate output. Counters are incremented when\n// a match is dropped at a known stage so --stats can audit the pipeline.\nfunc scoreFinding(rule *Rule, value, context, source string) (bool, float64, []string) {\n\treasons := []string{}\n\tscore := rule.ConfidencePrior\n\tif score == 0 {\n\t\tscore = 0.5\n\t}\n\n\tif vendorChunkRe.MatchString(source) {\n\t\tscore -= 0.15\n\t\treasons = append(reasons, \"source looks like a vendor/chunk bundle\")\n\t}\n\n\tif drop, why := isInVendorNoise(value); drop {\n\t\tif globalStats != nil {\n\t\t\tstatInc(&globalStats.DroppedVendorNoise)\n\t\t}\n\t\treturn false, 0, []string{why}\n\t}\n\n\tif rule.MinLen > 0 && len(value) < rule.MinLen {\n\t\treturn false, 0, []string{fmt.Sprintf(\"length %d < MinLen %d\", len(value), rule.MinLen)}\n\t}\n\tif rule.MaxLen > 0 && len(value) > rule.MaxLen {\n\t\treturn false, 0, []string{fmt.Sprintf(\"length %d > MaxLen %d\", len(value), rule.MaxLen)}\n\t}\n\n\tentropy := shannonEntropy(value)\n\tif rule.MinEntropy > 0 && entropy < rule.MinEntropy {\n\t\tif globalStats != nil {\n\t\t\tstatInc(&globalStats.DroppedLowEntropy)\n\t\t}\n\t\treturn false, 0, []string{fmt.Sprintf(\"entropy %.2f < required %.2f\", entropy, rule.MinEntropy)}\n\t}\n\n\tif rule.HighFPProne {\n\t\tdiversity := charClassDiversity(value)\n\t\tif diversity < 2 {\n\t\t\tif globalStats != nil {\n\t\t\t\tstatInc(&globalStats.DroppedLowEntropy)\n\t\t\t}\n\t\t\treturn false, 0, []string{\"insufficient character-class diversity for high-FP 
rule\"}\n\t\t}\n\t\tif entropy < 3.0 {\n\t\t\tif globalStats != nil {\n\t\t\t\tstatInc(&globalStats.DroppedLowEntropy)\n\t\t\t}\n\t\t\treturn false, 0, []string{fmt.Sprintf(\"entropy %.2f too low for high-FP rule\", entropy)}\n\t\t}\n\t}\n\n\tif rule.RequiresContext {\n\t\tkws := rule.ContextKeywords\n\t\tif len(kws) == 0 {\n\t\t\tkws = contextKeywordsGeneric\n\t\t}\n\t\tif !hasContextKeyword(context, kws) {\n\t\t\tif globalStats != nil {\n\t\t\t\tstatInc(&globalStats.DroppedNoContext)\n\t\t\t}\n\t\t\treturn false, 0, []string{\"missing required context keyword(s)\"}\n\t\t}\n\t\tscore += 0.05\n\t\treasons = append(reasons, \"context keyword present\")\n\t}\n\n\tif rule.Validate != nil {\n\t\tok, vReasons := rule.Validate(value)\n\t\tif !ok {\n\t\t\treturn false, 0, append([]string{\"provider validator rejected\"}, vReasons...)\n\t\t}\n\t\tscore += 0.10\n\t\treasons = append(reasons, vReasons...)\n\t}\n\n\tif looksLikeFixture(context) {\n\t\tscore -= 0.30\n\t\tif globalStats != nil {\n\t\t\tstatInc(&globalStats.DroppedFixture)\n\t\t}\n\t\treasons = append(reasons, \"surrounded by fixture/example wording\")\n\t}\n\n\tif entropy >= 4.5 {\n\t\tscore += 0.05\n\t\treasons = append(reasons, fmt.Sprintf(\"high entropy %.2f\", entropy))\n\t}\n\n\tif charClassDiversity(value) >= 3 {\n\t\tscore += 0.05\n\t\treasons = append(reasons, \"diverse character classes\")\n\t}\n\n\tif score < 0 {\n\t\tscore = 0\n\t}\n\tif score > 1 {\n\t\tscore = 1\n\t}\n\n\treturn true, score, reasons\n}\n\n// recordFinding inserts or merges a finding into the dedupe map keyed by\n// (value_hash, secret_type). Same secret seen in many sources collapses to a\n// single Finding with a Locations[] list, each location carrying its own\n// line:column pair so an operator can `vim file:line:col` directly.\n//\n// Returns nil when the finding is suppressed by the active ignore list or\n// the active diff baseline. 
Callers must check for nil.\nfunc recordFinding(f *Finding) *Finding {\n\tif activeIgnoreList != nil && activeIgnoreList.ShouldIgnore(f) {\n\t\treturn nil\n\t}\n\tif activeDiffSeen != nil && activeDiffSeen[f.ValueHash] {\n\t\treturn nil\n\t}\n\tfindingsMutex.Lock()\n\tdefer findingsMutex.Unlock()\n\tkey := f.ValueHash + \"|\" + f.SecretType\n\tloc := Location{Source: f.Source, Line: f.Line, Column: f.Column}\n\tif existing, ok := findingsByHash[key]; ok {\n\t\texisting.Locations = append(existing.Locations, loc)\n\t\tif f.Confidence > existing.Confidence {\n\t\t\texisting.Confidence = f.Confidence\n\t\t\texisting.Reasons = f.Reasons\n\t\t}\n\t\tif f.Verified && !existing.Verified {\n\t\t\texisting.Verified = true\n\t\t\texisting.Verify = f.Verify\n\t\t}\n\t\treturn existing\n\t}\n\tf.Locations = []Location{loc}\n\tfindingsByHash[key] = f\n\tif globalStats != nil {\n\t\tstatInc(&globalStats.FindingsAfterDedupe)\n\t}\n\treturn f\n}\n\n// flushFindings returns a snapshot of all unique findings, sorted by severity then confidence.\nfunc flushFindings() []*Finding {\n\tfindingsMutex.Lock()\n\tdefer findingsMutex.Unlock()\n\tout := make([]*Finding, 0, len(findingsByHash))\n\tfor _, f := range findingsByHash {\n\t\tout = append(out, f)\n\t}\n\tsevRank := map[Severity]int{\n\t\tSevCritical: 5, SevHigh: 4, SevMedium: 3, SevLow: 2, SevInfo: 1,\n\t}\n\tsort.Slice(out, func(i, j int) bool {\n\t\tri, rj := sevRank[out[i].Severity], sevRank[out[j].Severity]\n\t\tif ri != rj {\n\t\t\treturn ri > rj\n\t\t}\n\t\tif out[i].Confidence != out[j].Confidence {\n\t\t\treturn out[i].Confidence > out[j].Confidence\n\t\t}\n\t\treturn out[i].Name < out[j].Name\n\t})\n\treturn out\n}\n\n// resetFindings clears the dedupe state between independent runs.\nfunc resetFindings() {\n\tfindingsMutex.Lock()\n\tfindingsByHash = make(map[string]*Finding)\n\tfindingsMutex.Unlock()\n}\n\n// analyzeBody runs the curated registry against `body` and returns scored findings.\n// Source is the URL or 
filepath. minConfidence gates which findings pass.\n// Each Finding records the byte offset, line, and column of the match so\n// downstream tools can anchor results back to the exact source location.\nfunc analyzeBody(source string, body []byte, minConfidence float64) []*Finding {\n\tregisterRules()\n\tbodyStr := string(body)\n\tout := []*Finding{}\n\n\tfor i := range rulesRegistry {\n\t\trule := &rulesRegistry[i]\n\t\tmatches := rule.Pattern.FindAllStringSubmatchIndex(bodyStr, -1)\n\t\tfor _, m := range matches {\n\t\t\tstart, end := m[0], m[1]\n\t\t\tvalue := bodyStr[start:end]\n\t\t\tif rule.Group > 0 && len(m) > 2*rule.Group+1 {\n\t\t\t\tgs, ge := m[2*rule.Group], m[2*rule.Group+1]\n\t\t\t\tif gs >= 0 && ge >= 0 {\n\t\t\t\t\tstart, end = gs, ge\n\t\t\t\t\tvalue = bodyStr[start:end]\n\t\t\t\t}\n\t\t\t}\n\n\t\t\tlineCtx := bodyStr[lineStartIndex(bodyStr, start):lineEndIndex(bodyStr, end)]\n\t\t\tif sourcemapMarkerRe.MatchString(lineCtx) {\n\t\t\t\tif globalStats != nil {\n\t\t\t\t\tstatInc(&globalStats.DroppedSourcemap)\n\t\t\t\t}\n\t\t\t\tcontinue\n\t\t\t}\n\n\t\t\tctx := extractContextWindow(bodyStr, start, end)\n\t\t\tkeep, score, reasons := scoreFinding(rule, value, ctx, source)\n\t\t\tif !keep {\n\t\t\t\tcontinue\n\t\t\t}\n\t\t\tif score < minConfidence {\n\t\t\t\tif globalStats != nil {\n\t\t\t\t\tstatInc(&globalStats.DroppedBelowConf)\n\t\t\t\t}\n\t\t\t\tcontinue\n\t\t\t}\n\n\t\t\tline, col := positionAt(bodyStr, start)\n\t\t\tf := &Finding{\n\t\t\t\tSchemaVersion: SchemaVersion,\n\t\t\t\tRuleID:        rule.ID,\n\t\t\t\tName:          rule.Name,\n\t\t\t\tProvider:      rule.Provider,\n\t\t\t\tSecretType:    rule.SecretType,\n\t\t\t\tSeverity:      rule.Severity,\n\t\t\t\tValue:         value,\n\t\t\t\tRedacted:      redactValue(value),\n\t\t\t\tValueHash:     hashValue(value),\n\t\t\t\tSource:        source,\n\t\t\t\tConfidence:    score,\n\t\t\t\tEntropy:       shannonEntropy(value),\n\t\t\t\tReasons:       reasons,\n\t\t\t\tLine:          
line,\n\t\t\t\tColumn:        col,\n\t\t\t}\n\t\t\tif globalStats != nil {\n\t\t\t\tstatInc(&globalStats.RegistryHits)\n\t\t\t\tstatInc(&globalStats.FindingsAfterFilter)\n\t\t\t}\n\t\t\tif rec := recordFinding(f); rec != nil {\n\t\t\t\tout = append(out, rec)\n\t\t\t}\n\t\t}\n\t}\n\treturn out\n}\n\n// positionAt returns the 1-indexed (line, column) of byte offset `idx` in s.\n// Cheap O(idx) scan; called once per finding so the cost is negligible\n// relative to the regex evaluation that produced the offset.\nfunc positionAt(s string, idx int) (line, col int) {\n\tif idx < 0 {\n\t\tidx = 0\n\t}\n\tif idx > len(s) {\n\t\tidx = len(s)\n\t}\n\tline, col = 1, 1\n\tfor i := 0; i < idx; i++ {\n\t\tif s[i] == '\\n' {\n\t\t\tline++\n\t\t\tcol = 1\n\t\t} else {\n\t\t\tcol++\n\t\t}\n\t}\n\treturn line, col\n}\n\nfunc lineStartIndex(s string, idx int) int {\n\tif idx <= 0 {\n\t\treturn 0\n\t}\n\tif idx >= len(s) {\n\t\tidx = len(s) - 1\n\t}\n\tfor i := idx; i > 0; i-- {\n\t\tif s[i-1] == '\\n' {\n\t\t\treturn i\n\t\t}\n\t}\n\treturn 0\n}\n\nfunc lineEndIndex(s string, idx int) int {\n\tif idx >= len(s) {\n\t\treturn len(s)\n\t}\n\tfor i := idx; i < len(s); i++ {\n\t\tif s[i] == '\\n' {\n\t\t\treturn i\n\t\t}\n\t}\n\treturn len(s)\n}\n\n// applyLegacyFPFilter wraps the existing regexPatterns dictionary so that\n// every legacy hit is also scored. 
Returns (keep, confidence, reasons) so the\n// caller can decide to print or drop, and to optionally show the score.\n//\n// This is the bridge that brings v0.6 quality to rules we have not yet\n// migrated into the curated registry.\nfunc applyLegacyFPFilter(name, value, body, source string, start, end int) (bool, float64, []string) {\n\tif globalStats != nil {\n\t\tstatInc(&globalStats.LegacyMatchesRaw)\n\t}\n\tif value == \"\" {\n\t\treturn false, 0, []string{\"empty match\"}\n\t}\n\tif drop, why := isInVendorNoise(value); drop {\n\t\tif globalStats != nil {\n\t\t\tstatInc(&globalStats.DroppedVendorNoise)\n\t\t}\n\t\treturn false, 0, []string{why}\n\t}\n\n\tctx := extractContextWindow(body, start, end)\n\tif sourcemapMarkerRe.MatchString(body[lineStartIndex(body, start):lineEndIndex(body, end)]) {\n\t\tif globalStats != nil {\n\t\t\tstatInc(&globalStats.DroppedSourcemap)\n\t\t}\n\t\treturn false, 0, []string{\"line is a //# sourceMappingURL marker\"}\n\t}\n\n\tscore := 0.55\n\treasons := []string{}\n\n\tlow := strings.ToLower(name)\n\thighFP := strings.HasPrefix(low, \"generic \") || strings.Contains(low, \"quickbooks\") ||\n\t\tstrings.Contains(low, \"cisco access\") || strings.Contains(low, \"sanity\") ||\n\t\tstrings.Contains(low, \"atlassian access\") || strings.Contains(low, \"heroku\")\n\n\tentropy := shannonEntropy(value)\n\tdiversity := charClassDiversity(value)\n\n\tif highFP {\n\t\tif diversity < 2 {\n\t\t\tif globalStats != nil {\n\t\t\t\tstatInc(&globalStats.DroppedLowEntropy)\n\t\t\t}\n\t\t\treturn false, 0, []string{\"high-FP-prone rule: low character-class diversity\"}\n\t\t}\n\t\tif entropy < 3.2 {\n\t\t\tif globalStats != nil {\n\t\t\t\tstatInc(&globalStats.DroppedLowEntropy)\n\t\t\t}\n\t\t\treturn false, 0, []string{fmt.Sprintf(\"high-FP-prone rule: entropy %.2f too low\", entropy)}\n\t\t}\n\t\tif !hasContextKeyword(ctx, contextKeywordsGeneric) {\n\t\t\tif globalStats != nil 
{\n\t\t\t\tstatInc(&globalStats.DroppedNoContext)\n\t\t\t}\n\t\t\treturn false, 0, []string{\"high-FP-prone rule: no key/token/secret context keyword\"}\n\t\t}\n\t\treasons = append(reasons, \"context keyword present (generic-rule gate)\")\n\t}\n\n\tif vendorChunkRe.MatchString(source) {\n\t\tscore -= 0.15\n\t\treasons = append(reasons, \"vendor/chunk bundle\")\n\t}\n\n\tif looksLikeFixture(ctx) {\n\t\tscore -= 0.30\n\t\tif globalStats != nil {\n\t\t\tstatInc(&globalStats.DroppedFixture)\n\t\t}\n\t\treasons = append(reasons, \"context looks like a fixture/example\")\n\t}\n\n\tif entropy >= 4.5 {\n\t\tscore += 0.05\n\t\treasons = append(reasons, fmt.Sprintf(\"high entropy %.2f\", entropy))\n\t}\n\n\tif score < 0 {\n\t\tscore = 0\n\t}\n\tif score > 1 {\n\t\tscore = 1\n\t}\n\treturn true, score, reasons\n}\n\n// validateAWSAccessKeyID enforces the documented prefix family and pure-base32\n// body alphabet (A-Z, 2-7). Excludes 0,1,8,9 which AWS deliberately omits.\nfunc validateAWSAccessKeyID(v string) (bool, []string) {\n\tif len(v) != 20 {\n\t\treturn false, []string{\"length != 20\"}\n\t}\n\tprefix := v[:4]\n\tswitch prefix {\n\tcase \"AKIA\", \"ASIA\", \"AGPA\", \"AIDA\", \"AROA\", \"AIPA\", \"ANPA\", \"ANVA\":\n\tdefault:\n\t\tif !(strings.HasPrefix(v, \"A3T\") && v[3] >= 'A' && v[3] <= 'Z') {\n\t\t\treturn false, []string{\"unknown AWS access key prefix\"}\n\t\t}\n\t}\n\tfor i := 4; i < 20; i++ {\n\t\tc := v[i]\n\t\tok := (c >= 'A' && c <= 'Z') || (c >= '2' && c <= '7')\n\t\tif !ok {\n\t\t\treturn false, []string{\"non-base32 character in body\"}\n\t\t}\n\t}\n\treturn true, []string{\"AWS prefix + base32 body OK\"}\n}\n\n// validateAWSSecretKey enforces 40-char body and high entropy on the captured group.\nfunc validateAWSSecretKey(v string) (bool, []string) {\n\tif len(v) != 40 {\n\t\treturn false, []string{\"length != 40\"}\n\t}\n\tif shannonEntropy(v) < 4.0 {\n\t\treturn false, []string{\"entropy below 4.0\"}\n\t}\n\tif charClassDiversity(v) < 3 {\n\t\treturn 
false, []string{\"low character-class diversity\"}\n\t}\n\treturn true, []string{\"40-char base64 body, high entropy\"}\n}\n\n// validateStripeKey verifies the prefix family and that the body is clean base62\n// (no underscores), which Stripe uses to avoid colliding with random hashes.\nfunc validateStripeKey(v string) (bool, []string) {\n\tprefixes := []string{\"sk_live_\", \"sk_test_\", \"rk_live_\", \"rk_test_\", \"pk_live_\", \"pk_test_\"}\n\tmatched := \"\"\n\tfor _, p := range prefixes {\n\t\tif strings.HasPrefix(v, p) {\n\t\t\tmatched = p\n\t\t\tbreak\n\t\t}\n\t}\n\tif matched == \"\" {\n\t\treturn false, []string{\"unknown Stripe key prefix\"}\n\t}\n\tbody := v[len(matched):]\n\tif len(body) < 20 {\n\t\treturn false, []string{\"Stripe body too short\"}\n\t}\n\tfor _, c := range body {\n\t\tif !((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {\n\t\t\treturn false, []string{\"non-base62 char in Stripe key body\"}\n\t\t}\n\t}\n\treturn true, []string{\"Stripe \" + matched + \" key, base62 body\"}\n}\n\n// validateGitHubToken implements the documented CRC32-base62 tail checksum.\n// GitHub tokens of the form ghp_/gho_/ghu_/ghs_/ghr_ embed a 6-char checksum\n// computed over the random body. 
Verifying it is one of the highest-precision\n// signals available without a network call.\nfunc validateGitHubToken(v string) (bool, []string) {\n\tif len(v) < 40 {\n\t\treturn false, []string{\"too short for GitHub token\"}\n\t}\n\tif !strings.HasPrefix(v, \"ghp_\") && !strings.HasPrefix(v, \"gho_\") &&\n\t\t!strings.HasPrefix(v, \"ghu_\") && !strings.HasPrefix(v, \"ghs_\") &&\n\t\t!strings.HasPrefix(v, \"ghr_\") {\n\t\treturn false, []string{\"unknown GitHub token prefix\"}\n\t}\n\tbody := v[4:]\n\tif len(body) < 6 {\n\t\treturn false, []string{\"body too short for checksum\"}\n\t}\n\trandom := body[:len(body)-6]\n\tchecksum := body[len(body)-6:]\n\twant := base62EncodeCRC32(crc32.ChecksumIEEE([]byte(random)))\n\tif !strings.EqualFold(want, checksum) {\n\t\treturn false, []string{\"CRC32 checksum mismatch (cannot verify; treat as suspect)\"}\n\t}\n\treturn true, []string{\"GitHub CRC32 checksum verified\"}\n}\n\n// base62EncodeCRC32 encodes a uint32 as 6-character base62, left-padded with '0'.\nfunc base62EncodeCRC32(n uint32) string {\n\tconst alphabet = \"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\"\n\tif n == 0 {\n\t\treturn \"000000\"\n\t}\n\tbuf := make([]byte, 0, 6)\n\tfor n > 0 {\n\t\tbuf = append([]byte{alphabet[n%62]}, buf...)\n\t\tn /= 62\n\t}\n\tfor len(buf) < 6 {\n\t\tbuf = append([]byte{'0'}, buf...)\n\t}\n\treturn string(buf)\n}\n\n// validateSlackToken enforces the hyphenated segment shape used across the\n// Slack token family (xoxb/xoxp/xoxa/xoxr/xoxs/xoxe/xapp).\nfunc validateSlackToken(v string) (bool, []string) {\n\tparts := strings.Split(v, \"-\")\n\tif len(parts) < 3 {\n\t\treturn false, []string{\"too few hyphen-segments for Slack token\"}\n\t}\n\tfor _, p := range parts[1 : len(parts)-1] {\n\t\tfor _, c := range p {\n\t\t\tif c < '0' || c > '9' {\n\t\t\t\treturn false, []string{\"non-numeric inner segment\"}\n\t\t\t}\n\t\t}\n\t}\n\ttail := parts[len(parts)-1]\n\tif len(tail) < 16 {\n\t\treturn false, []string{\"tail 
segment too short\"}\n\t}\n\treturn true, []string{\"Slack hyphen-segment structure OK\"}\n}\n\n// validateTwilioSK enforces the 32-hex body and rejects pure-zero or repeating runs.\nfunc validateTwilioSK(v string) (bool, []string) {\n\tif len(v) != 34 || !strings.HasPrefix(v, \"SK\") {\n\t\treturn false, []string{\"bad Twilio SK shape\"}\n\t}\n\tbody := v[2:]\n\tfor _, c := range body {\n\t\tif !((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')) {\n\t\t\treturn false, []string{\"non-hex char in Twilio SK body\"}\n\t\t}\n\t}\n\tif shannonEntropy(body) < 3.5 {\n\t\treturn false, []string{\"Twilio SK entropy too low\"}\n\t}\n\treturn true, []string{\"32-hex Twilio SK body\"}\n}\n\n// validateJWT decodes the header and payload as base64url-encoded JSON and\n// requires that the header carries an `alg` field. Catches the very common\n// \"JWT-shaped strings that aren't JWTs\" FP class.\nfunc validateJWT(v string) (bool, []string) {\n\tparts := strings.Split(v, \".\")\n\tif len(parts) != 3 {\n\t\treturn false, []string{\"JWT must have 3 dot-separated segments\"}\n\t}\n\theaderBytes, err := base64.RawURLEncoding.DecodeString(parts[0])\n\tif err != nil {\n\t\t// some JWT libs emit padded base64; tolerate it\n\t\theaderBytes, err = base64.URLEncoding.DecodeString(parts[0])\n\t\tif err != nil {\n\t\t\treturn false, []string{\"JWT header is not base64url\"}\n\t\t}\n\t}\n\tvar header map[string]any\n\tif err := json.Unmarshal(headerBytes, &header); err != nil {\n\t\treturn false, []string{\"JWT header is not JSON\"}\n\t}\n\tif _, ok := header[\"alg\"]; !ok {\n\t\treturn false, []string{\"JWT header missing alg\"}\n\t}\n\tpayloadBytes, err := base64.RawURLEncoding.DecodeString(parts[1])\n\tif err != nil {\n\t\tpayloadBytes, err = base64.URLEncoding.DecodeString(parts[1])\n\t\tif err != nil {\n\t\t\treturn false, []string{\"JWT payload is not base64url\"}\n\t\t}\n\t}\n\tvar payload map[string]any\n\tif err := json.Unmarshal(payloadBytes, &payload); err 
!= nil {\n\t\treturn false, []string{\"JWT payload is not JSON\"}\n\t}\n\treturn true, []string{\"JWT structurally valid (alg present, JSON header+payload)\"}\n}\n\n// SelfTestResult is the per-rule outcome of `--self-test`.\ntype SelfTestResult struct {\n\tRuleID   string `json:\"rule_id\"`\n\tName     string `json:\"name\"`\n\tTPPassed int    `json:\"tp_passed\"`\n\tTPTotal  int    `json:\"tp_total\"`\n\tFPCaught int    `json:\"fp_caught\"`\n\tFPTotal  int    `json:\"fp_total\"`\n\tOK       bool   `json:\"ok\"`\n\tNotes    []string `json:\"notes,omitempty\"`\n}\n\n// runSelfTest exercises every registered rule against its embedded TP/FP\n// fixtures and reports a precision/recall summary. A rule is OK when it\n// catches all TPs and rejects all FPs.\nfunc runSelfTest() []SelfTestResult {\n\tregisterRules()\n\tout := make([]SelfTestResult, 0, len(rulesRegistry))\n\tfor i := range rulesRegistry {\n\t\tr := &rulesRegistry[i]\n\t\tres := SelfTestResult{RuleID: r.ID, Name: r.Name, OK: true}\n\t\tfor _, tp := range r.TPExamples {\n\t\t\tres.TPTotal++\n\t\t\tfakeBody := \"const apiKey = \\\"\" + tp + \"\\\";\"\n\t\t\tfs := analyzeBody(\"self-test://\"+r.ID, []byte(fakeBody), 0.0)\n\t\t\tok := false\n\t\t\tfor _, f := range fs {\n\t\t\t\tif f.RuleID == r.ID && f.Value == tp {\n\t\t\t\t\tok = true\n\t\t\t\t\tbreak\n\t\t\t\t}\n\t\t\t}\n\t\t\tif ok {\n\t\t\t\tres.TPPassed++\n\t\t\t} else {\n\t\t\t\tres.OK = false\n\t\t\t\tres.Notes = append(res.Notes, \"missed TP: \"+redactValue(tp))\n\t\t\t}\n\t\t\tresetFindings()\n\t\t}\n\t\tfor _, fp := range r.FPExamples {\n\t\t\tres.FPTotal++\n\t\t\tfakeBody := \"const apiKey = \\\"\" + fp + \"\\\";\"\n\t\t\tfs := analyzeBody(\"self-test://\"+r.ID, []byte(fakeBody), 0.0)\n\t\t\tcaught := true\n\t\t\tfor _, f := range fs {\n\t\t\t\tif f.RuleID == r.ID && f.Value == fp {\n\t\t\t\t\tcaught = false\n\t\t\t\t\tbreak\n\t\t\t\t}\n\t\t\t}\n\t\t\tif caught {\n\t\t\t\tres.FPCaught++\n\t\t\t} else {\n\t\t\t\tres.OK = false\n\t\t\t\tres.Notes = 
append(res.Notes, \"leaked FP: \"+redactValue(fp))\n\t\t\t}\n\t\t\tresetFindings()\n\t\t}\n\t\tout = append(out, res)\n\t}\n\treturn out\n}\n"
  },
  {
    "path": "internal/jshunter/diff.go",
    "content": "package jshunter\n\nimport (\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"os\"\n)\n\n// DiffPrevious reads a previous schema-v2 envelope and returns the set of\n// value_hashes already reported. Operators run:\n//\n//\tjshunter ... -j -o yesterday.json\n//\tjshunter ... --diff yesterday.json -j -o today-new.json\n//\n// and only see findings that weren't there yesterday. Anchored on\n// value_hash because secrets that move between sources or that match\n// different rules across releases must still dedupe consistently.\nfunc DiffPrevious(path string) (map[string]bool, error) {\n\tif path == \"\" {\n\t\treturn nil, nil\n\t}\n\traw, err := os.ReadFile(path)\n\tif err != nil {\n\t\treturn nil, fmt.Errorf(\"read previous: %w\", err)\n\t}\n\tvar env struct {\n\t\tSchemaVersion int       `json:\"schema_version\"`\n\t\tFindings      []Finding `json:\"findings\"`\n\t}\n\tif err := json.Unmarshal(raw, &env); err != nil {\n\t\treturn nil, fmt.Errorf(\"parse previous: %w\", err)\n\t}\n\tif env.SchemaVersion != SchemaVersion {\n\t\treturn nil, fmt.Errorf(\"--diff requires schema_version=%d; previous file has %d\", SchemaVersion, env.SchemaVersion)\n\t}\n\tseen := make(map[string]bool, len(env.Findings))\n\tfor _, f := range env.Findings {\n\t\tif f.ValueHash != \"\" {\n\t\t\tseen[f.ValueHash] = true\n\t\t}\n\t}\n\treturn seen, nil\n}\n"
  },
  {
    "path": "internal/jshunter/har.go",
    "content": "package jshunter\n\nimport (\n\t\"encoding/base64\"\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"os\"\n\t\"strings\"\n)\n\n// HAR (HTTP Archive) ingestion. Operators who already have a Burp/Chrome\n// devtools archive don't need JSHunter to re-fetch — feeding the HAR\n// directly is faster and reproducible.\n\ntype harFile struct {\n\tLog struct {\n\t\tEntries []harEntry `json:\"entries\"`\n\t} `json:\"log\"`\n}\n\ntype harEntry struct {\n\tRequest struct {\n\t\tURL string `json:\"url\"`\n\t} `json:\"request\"`\n\tResponse struct {\n\t\tStatus  int `json:\"status\"`\n\t\tContent struct {\n\t\t\tMimeType string `json:\"mimeType\"`\n\t\t\tText     string `json:\"text\"`\n\t\t\tEncoding string `json:\"encoding\"`\n\t\t} `json:\"content\"`\n\t} `json:\"response\"`\n}\n\n// IngestHAR reads a HAR file and runs the v0.6 detection pipeline against\n// every JS-typed response within it. Non-JS entries are silently skipped.\n// Returns the number of entries scanned (useful for --stats and CI gating).\nfunc IngestHAR(path string, config *Config) (int, error) {\n\traw, err := os.ReadFile(path)\n\tif err != nil {\n\t\treturn 0, fmt.Errorf(\"read har: %w\", err)\n\t}\n\tvar h harFile\n\tif err := json.Unmarshal(raw, &h); err != nil {\n\t\treturn 0, fmt.Errorf(\"parse har: %w\", err)\n\t}\n\n\tscanned := 0\n\tfor _, e := range h.Log.Entries {\n\t\tif e.Response.Status < 200 || e.Response.Status >= 400 {\n\t\t\tcontinue\n\t\t}\n\t\tmt := strings.ToLower(e.Response.Content.MimeType)\n\t\turlLower := strings.ToLower(e.Request.URL)\n\t\tisJS := strings.Contains(mt, \"javascript\") ||\n\t\t\tstrings.Contains(mt, \"ecmascript\") ||\n\t\t\tstrings.HasSuffix(urlLower, \".js\") ||\n\t\t\tstrings.Contains(urlLower, \".js?\")\n\t\tif !isJS {\n\t\t\tcontinue\n\t\t}\n\n\t\tbody := []byte(e.Response.Content.Text)\n\t\tif e.Response.Content.Encoding == \"base64\" {\n\t\t\tif dec, derr := harBase64Decode(body); derr == nil {\n\t\t\t\tbody = dec\n\t\t\t}\n\t\t}\n\t\tif config.MaxBytes > 0 
&& int64(len(body)) > config.MaxBytes {\n\t\t\tbody = body[:config.MaxBytes]\n\t\t\tif globalStats != nil {\n\t\t\t\tstatInc(&globalStats.BytesTruncated)\n\t\t\t}\n\t\t}\n\t\tif globalStats != nil {\n\t\t\tstatInc(&globalStats.URLsFetched)\n\t\t\tstatAdd(&globalStats.BytesParsed, int64(len(body)))\n\t\t}\n\t\tprocessed := processJSAnalysis(body, config)\n\t\treportMatchesWithConfig(e.Request.URL, processed, config)\n\t\tscanned++\n\t}\n\treturn scanned, nil\n}\n\n// harBase64Decode is tolerant of std/URL/raw base64 variants (HAR exporters\n// disagree). We try each in order and return the first decode that succeeds.\nfunc harBase64Decode(b []byte) ([]byte, error) {\n\ts := strings.TrimSpace(string(b))\n\tfor _, dec := range []func(string) ([]byte, error){\n\t\tbase64.StdEncoding.DecodeString,\n\t\tbase64.URLEncoding.DecodeString,\n\t\tbase64.RawStdEncoding.DecodeString,\n\t\tbase64.RawURLEncoding.DecodeString,\n\t} {\n\t\tif out, err := dec(s); err == nil {\n\t\t\treturn out, nil\n\t\t}\n\t}\n\treturn nil, fmt.Errorf(\"har: not decodable as any base64 variant\")\n}\n"
  },
  {
    "path": "internal/jshunter/html_extract.go",
    "content": "package jshunter\n\nimport (\n\t\"bytes\"\n\t\"fmt\"\n\t\"io\"\n\t\"strings\"\n\n\t\"golang.org/x/net/html\"\n)\n\n// HTMLArtifacts is the structured slice of recon-relevant payloads\n// extracted from one HTML response. Inline scripts and SRI hashes are not\n// available in JS-only crawls — operators routinely miss secrets that live\n// in the homepage's `<script>` tag rather than an external bundle.\ntype HTMLArtifacts struct {\n\tInlineScripts []InlineScript\n\tExternalJS    []ExternalJS\n\tCSPOrigins    []string\n\tSourcemaps    []string\n}\n\ntype InlineScript struct {\n\t// Index is the zero-based position of the script tag in the document\n\t// so we can synthesize a stable per-script source id (`page#script[3]`).\n\tIndex    int\n\tBody     string\n\tType     string // \"module\" | \"\" | \"application/json\" | …\n\tNonce    string\n\tIsLDJSON bool\n}\n\ntype ExternalJS struct {\n\tURL       string\n\tIntegrity string // SRI: \"sha384-...\"\n\tAsync     bool\n\tDefer     bool\n\tType      string\n}\n\n// ExtractFromHTML parses an HTML body and returns the extractable\n// artifacts. 
Robust to malformed input — `golang.org/x/net/html` recovers\n// from broken markup the way browsers do.\nfunc ExtractFromHTML(body []byte) (*HTMLArtifacts, error) {\n\tout := &HTMLArtifacts{}\n\tz := html.NewTokenizer(bytes.NewReader(body))\n\tscriptIdx := 0\n\n\tfor {\n\t\ttt := z.Next()\n\t\tswitch tt {\n\t\tcase html.ErrorToken:\n\t\t\tif err := z.Err(); err != nil && err != io.EOF {\n\t\t\t\treturn out, fmt.Errorf(\"html tokenizer: %w\", err)\n\t\t\t}\n\t\t\treturn out, nil\n\n\t\tcase html.StartTagToken, html.SelfClosingTagToken:\n\t\t\tt := z.Token()\n\t\t\tswitch strings.ToLower(t.Data) {\n\t\t\tcase \"script\":\n\t\t\t\tattrs := tagAttrs(t)\n\t\t\t\tsrc := attrs[\"src\"]\n\t\t\t\tif src != \"\" {\n\t\t\t\t\tout.ExternalJS = append(out.ExternalJS, ExternalJS{\n\t\t\t\t\t\tURL:       src,\n\t\t\t\t\t\tIntegrity: attrs[\"integrity\"],\n\t\t\t\t\t\tAsync:     hasAttr(t, \"async\"),\n\t\t\t\t\t\tDefer:     hasAttr(t, \"defer\"),\n\t\t\t\t\t\tType:      attrs[\"type\"],\n\t\t\t\t\t})\n\t\t\t\t} else if tt == html.StartTagToken {\n\t\t\t\t\t// Capture inline script body up to </script>.\n\t\t\t\t\tbody, err := readUntilEndTag(z, \"script\")\n\t\t\t\t\tif err == nil && strings.TrimSpace(body) != \"\" {\n\t\t\t\t\t\tscript := InlineScript{\n\t\t\t\t\t\t\tIndex: scriptIdx,\n\t\t\t\t\t\t\tBody:  body,\n\t\t\t\t\t\t\tType:  attrs[\"type\"],\n\t\t\t\t\t\t\tNonce: attrs[\"nonce\"],\n\t\t\t\t\t\t}\n\t\t\t\t\t\tscript.IsLDJSON = strings.EqualFold(script.Type, \"application/ld+json\")\n\t\t\t\t\t\tout.InlineScripts = append(out.InlineScripts, script)\n\t\t\t\t\t}\n\t\t\t\t\tscriptIdx++\n\t\t\t\t}\n\n\t\t\tcase \"meta\":\n\t\t\t\t// CSP via http-equiv (some sites prefer this over header).\n\t\t\t\tif strings.EqualFold(tagAttrs(t)[\"http-equiv\"], \"Content-Security-Policy\") {\n\t\t\t\t\tcontent := tagAttrs(t)[\"content\"]\n\t\t\t\t\tout.CSPOrigins = append(out.CSPOrigins, ParseCSPOrigins(content)...)\n\t\t\t\t}\n\n\t\t\tcase \"link\":\n\t\t\t\tattrs := 
tagAttrs(t)\n\t\t\t\trel := strings.ToLower(attrs[\"rel\"])\n\t\t\t\thref := attrs[\"href\"]\n\t\t\t\tif href != \"\" {\n\t\t\t\t\tswitch rel {\n\t\t\t\t\tcase \"preload\", \"modulepreload\", \"prefetch\":\n\t\t\t\t\t\tif strings.EqualFold(attrs[\"as\"], \"script\") || rel == \"modulepreload\" {\n\t\t\t\t\t\t\tout.ExternalJS = append(out.ExternalJS, ExternalJS{\n\t\t\t\t\t\t\t\tURL:       href,\n\t\t\t\t\t\t\t\tIntegrity: attrs[\"integrity\"],\n\t\t\t\t\t\t\t\tType:      \"module\",\n\t\t\t\t\t\t\t})\n\t\t\t\t\t\t}\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\t}\n}\n\n// readUntilEndTag consumes tokens up to and including the closing tag,\n// returning the concatenated text content. Used to capture inline script\n// bodies which the tokenizer reports as a separate Text token.\nfunc readUntilEndTag(z *html.Tokenizer, tag string) (string, error) {\n\tvar buf bytes.Buffer\n\tfor {\n\t\ttt := z.Next()\n\t\tswitch tt {\n\t\tcase html.ErrorToken:\n\t\t\treturn buf.String(), z.Err()\n\t\tcase html.TextToken:\n\t\t\tbuf.Write(z.Text())\n\t\tcase html.EndTagToken:\n\t\t\tt := z.Token()\n\t\t\tif strings.EqualFold(t.Data, tag) {\n\t\t\t\treturn buf.String(), nil\n\t\t\t}\n\t\t}\n\t}\n}\n\nfunc tagAttrs(t html.Token) map[string]string {\n\tm := make(map[string]string, len(t.Attr))\n\tfor _, a := range t.Attr {\n\t\tm[strings.ToLower(a.Key)] = a.Val\n\t}\n\treturn m\n}\n\nfunc hasAttr(t html.Token, name string) bool {\n\tfor _, a := range t.Attr {\n\t\tif strings.EqualFold(a.Key, name) {\n\t\t\treturn true\n\t\t}\n\t}\n\treturn false\n}\n\n// looksLikeHTML returns true when the response body is HTML rather than JS.\n// We use a tiny prefix sniff rather than the full encoding/sniff implementation\n// because the body is already bounded by --max-bytes.\nfunc looksLikeHTML(body []byte, contentType string) bool {\n\tif strings.Contains(strings.ToLower(contentType), \"html\") {\n\t\treturn true\n\t}\n\thead := body\n\tif len(head) > 512 {\n\t\thead = head[:512]\n\t}\n\tlow := 
strings.ToLower(string(head))\n\tlow = strings.TrimSpace(low)\n\treturn strings.HasPrefix(low, \"<!doctype html\") ||\n\t\tstrings.HasPrefix(low, \"<html\") ||\n\t\tstrings.HasPrefix(low, \"<head\") ||\n\t\tstrings.HasPrefix(low, \"<body\")\n}\n"
  },
  {
    "path": "internal/jshunter/ignore.go",
    "content": "package jshunter\n\nimport (\n\t\"bufio\"\n\t\"fmt\"\n\t\"io\"\n\t\"os\"\n\t\"path/filepath\"\n\t\"strings\"\n)\n\n// .jshunterignore is the operator's permanent suppression list. Format is\n// one entry per line, blank/`#` lines ignored. Supported kinds:\n//\n//\thash:<value_hash_hex>           # suppress one specific finding\n//\trule:<rule_id|rule_id_glob>     # suppress an entire rule (or family)\n//\tsource:<glob>                   # suppress all findings whose source matches\n//\trule_value:<rule>:<value_glob>  # suppress findings where rule matches and\n//\t                                # value matches the glob (after rule)\n//\n// Globs use the standard filepath.Match syntax (`*`, `?`, `[abc]`).\n\ntype IgnoreEntry struct {\n\tKind string\n\tA    string\n\tB    string\n}\n\ntype IgnoreList struct {\n\tEntries []IgnoreEntry\n}\n\n// LoadIgnoreFile reads and parses an ignore file. A missing file is NOT an\n// error — operators expect to run with or without one — but a malformed\n// file is, because silently ignoring bad rules invites \"why didn't my\n// suppression work?\" tickets.\nfunc LoadIgnoreFile(path string) (*IgnoreList, error) {\n\tif path == \"\" {\n\t\treturn nil, nil\n\t}\n\tf, err := os.Open(path)\n\tif err != nil {\n\t\tif os.IsNotExist(err) {\n\t\t\treturn nil, nil\n\t\t}\n\t\treturn nil, fmt.Errorf(\"open ignore file: %w\", err)\n\t}\n\tdefer f.Close()\n\treturn parseIgnoreReader(f)\n}\n\nfunc parseIgnoreReader(r io.Reader) (*IgnoreList, error) {\n\til := &IgnoreList{}\n\tsc := bufio.NewScanner(r)\n\tlineNo := 0\n\tfor sc.Scan() {\n\t\tlineNo++\n\t\tline := strings.TrimSpace(sc.Text())\n\t\tif line == \"\" || strings.HasPrefix(line, \"#\") {\n\t\t\tcontinue\n\t\t}\n\t\tidx := strings.Index(line, \":\")\n\t\tif idx == -1 {\n\t\t\treturn nil, fmt.Errorf(\"ignore line %d: missing ':' separator\", lineNo)\n\t\t}\n\t\tkind := strings.TrimSpace(line[:idx])\n\t\trest := strings.TrimSpace(line[idx+1:])\n\t\tif rest == \"\" 
{\n\t\t\treturn nil, fmt.Errorf(\"ignore line %d: empty value\", lineNo)\n\t\t}\n\t\tswitch kind {\n\t\tcase \"hash\", \"rule\", \"source\":\n\t\t\til.Entries = append(il.Entries, IgnoreEntry{Kind: kind, A: rest})\n\t\tcase \"rule_value\":\n\t\t\tparts := strings.SplitN(rest, \":\", 2)\n\t\t\tif len(parts) != 2 || parts[0] == \"\" || parts[1] == \"\" {\n\t\t\t\treturn nil, fmt.Errorf(\"ignore line %d: rule_value needs <rule>:<value-glob>\", lineNo)\n\t\t\t}\n\t\t\til.Entries = append(il.Entries, IgnoreEntry{Kind: \"rule_value\", A: parts[0], B: parts[1]})\n\t\tdefault:\n\t\t\treturn nil, fmt.Errorf(\"ignore line %d: unknown kind %q (want hash|rule|source|rule_value)\", lineNo, kind)\n\t\t}\n\t}\n\tif err := sc.Err(); err != nil {\n\t\treturn nil, err\n\t}\n\treturn il, nil\n}\n\n// ShouldIgnore returns true if any entry matches this finding.\nfunc (il *IgnoreList) ShouldIgnore(f *Finding) bool {\n\tif il == nil {\n\t\treturn false\n\t}\n\tfor _, e := range il.Entries {\n\t\tswitch e.Kind {\n\t\tcase \"hash\":\n\t\t\tif f.ValueHash == e.A {\n\t\t\t\treturn true\n\t\t\t}\n\t\tcase \"rule\":\n\t\t\tif f.RuleID == e.A || globMatch(e.A, f.RuleID) {\n\t\t\t\treturn true\n\t\t\t}\n\t\tcase \"source\":\n\t\t\tif globMatch(e.A, f.Source) {\n\t\t\t\treturn true\n\t\t\t}\n\t\tcase \"rule_value\":\n\t\t\tif (f.RuleID == e.A || globMatch(e.A, f.RuleID)) && globMatch(e.B, f.Value) {\n\t\t\t\treturn true\n\t\t\t}\n\t\t}\n\t}\n\treturn false\n}\n\nfunc globMatch(pattern, s string) bool {\n\tif pattern == \"*\" {\n\t\treturn true\n\t}\n\tok, _ := filepath.Match(pattern, s)\n\treturn ok\n}\n"
  },
  {
    "path": "internal/jshunter/jshunter.go",
    "content": "package jshunter\n\nimport (\n    \"bufio\"\n    \"context\"\n    \"crypto/tls\"\n    \"encoding/csv\"\n    \"encoding/json\"\n    \"flag\"\n    \"fmt\"\n    \"io\"\n    \"math\"\n    \"net\"\n    \"net/http\"\n    \"net/url\"\n    \"os\"\n    \"regexp\"\n    \"runtime\"\n    \"sort\"\n    \"strings\"\n    \"sync\"\n    \"time\"\n    \"math/rand\"\n\n    \"golang.org/x/net/proxy\"\n)\n\n\nvar (\n    version = \"v0.7.5\"\n    colors = map[string]string{\n        \"RED\":    \"\\033[0;31m\",\n        \"GREEN\":  \"\\033[0;32m\",\n        \"BLUE\":   \"\\033[0;34m\",\n        \"YELLOW\": \"\\033[0;33m\",\n        \"CYAN\":   \"\\033[0;36m\",\n        \"PURPLE\": \"\\033[0;35m\",\n        \"NC\":     \"\\033[0m\",\n    }\n    // Global deduplication for all outputs\n    globalSeenParams = make(map[string]bool)\n    globalSeenAll    = make(map[string]bool)\n    globalSeenMutex  sync.Mutex\n    globalFoundAny   = false // Track if any findings were made across all files\n    missingMessages  = make([]string, 0) // Buffer for MISSING messages\n    missingMutex     sync.Mutex\n)\n\n\n\nvar (\n    //regex-cc1a2b\n    regexPatterns = map[string]*regexp.Regexp{\n\t\"Google API\":                    regexp.MustCompile(`AIza[0-9A-Za-z-_]{35}`),\n\t\"Firebase\":                      regexp.MustCompile(`AAAA[A-Za-z0-9_-]{7}:[A-Za-z0-9_-]{140}(?:\\s|$|[^A-Za-z0-9_-])`),\n\t\"Amazon Aws Access Key ID\":      regexp.MustCompile(`A[SK]IA[0-9A-Z]{16}`),\n\t\"Amazon Mws Auth Token\":         regexp.MustCompile(`\\bamzn\\.mws\\.[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\\b`),\n\t\"Amazon Aws Url\":                regexp.MustCompile(`s3\\.amazonaws.com[/]+|[a-zA-Z0-9_-]*\\.s3\\.amazonaws.com`),\n\t\"Amazon Aws Url2\":               
regexp.MustCompile(`([a-zA-Z0-9-._]+\\.s3\\.amazonaws\\.com|s3://[a-zA-Z0-9-._]+|s3-[a-z]{2}-[a-z]+-[0-9]+\\.amazonaws\\.com|s3.amazonaws.com/[a-zA-Z0-9-._]+|s3.console.aws.amazon.com/s3/buckets/[a-zA-Z0-9-._]+)`),\n\t\"Facebook Access Token\":         regexp.MustCompile(`EAACEdEose0cBA[0-9A-Za-z]+`),\n\t\"Authorization Basic\":           regexp.MustCompile(`(?i)\\bauthorization\\s*:\\s*basic\\s+[a-zA-Z0-9=:_\\+\\/-]{20,100}`),\n\t\"Authorization Bearer\":          regexp.MustCompile(`(?i)\\bauthorization\\s*:\\s*bearer\\s+[a-zA-Z0-9_\\-\\.=:_\\+\\/]{20,100}`),\n    \"Authorization Api\":             regexp.MustCompile(`(?i)\\bapi[_-]?key\\s*[:=]\\s*[\"']?[a-zA-Z0-9_\\-]{20,100}[\"']?`),\n\t\"Twilio Api Key\":                regexp.MustCompile(`SK[0-9a-fA-F]{32}`),\n\t\"Twilio Account Sid\":            regexp.MustCompile(`(?i)\\b(?:twilio|tw)\\s*[_-]?account[_-]?sid\\s*[:=]\\s*[\"']?AC[a-zA-Z0-9_\\-]{32}[\"']?`),\n\t\"Twilio App Sid\":                regexp.MustCompile(`\\bAP[a-fA-F0-9]{32}\\b`),\n\t\"Paypal Braintre Access Token\":  regexp.MustCompile(`access_token\\$production\\$[0-9a-z]{16}\\$[0-9a-f]{32}`),\n\t\"Square Oauth Secret\":           regexp.MustCompile(`sq0csp-[0-9A-Za-z\\-_]{43}|sq0[a-z]{3}-[0-9A-Za-z\\-_]{22,43}`),\n\t\"Square Access Token\":           regexp.MustCompile(`sq0atp-[0-9A-Za-z\\-_]{22}`),\n\t\"Stripe Standard Api\":           regexp.MustCompile(`sk_live_[0-9a-zA-Z]{24}`),\n\t\"Stripe Restricted Api\":         regexp.MustCompile(`rk_live_[0-9a-zA-Z]{24}`),\n\t\"Authorization Github Token\":    regexp.MustCompile(`\\bghp_[a-zA-Z0-9]{36}\\b`),\n\t\"Github Access Token\":           regexp.MustCompile(`[a-zA-Z0-9_-]+:[a-zA-Z0-9_\\-]{20,}@github\\.com\\b`),\n\t\"Rsa Private Key\":               regexp.MustCompile(`-----BEGIN RSA PRIVATE KEY-----`),\n\t\"Ssh Dsa Private Key\":           regexp.MustCompile(`-----BEGIN DSA PRIVATE KEY-----`),\n\t\"Ssh Dc Private Key\":            regexp.MustCompile(`-----BEGIN EC PRIVATE KEY-----`),\n\t\"Pgp 
Private Block\":             regexp.MustCompile(`-----BEGIN PGP PRIVATE KEY BLOCK-----`),\n\t\"Ssh Private Key\":               regexp.MustCompile(`(?s)-----BEGIN OPENSSH PRIVATE KEY-----[a-zA-Z0-9+\\/=\\n]+-----END OPENSSH PRIVATE KEY-----`),\n\t\"Json Web Token\":                regexp.MustCompile(`\\beyJ[A-Za-z0-9_\\-]{8,}\\.eyJ[A-Za-z0-9_\\-]{8,}\\.[A-Za-z0-9_\\-]{8,}\\b`),\n    \"Putty Private Key\":             regexp.MustCompile(`(?s)PuTTY-User-Key-File-2.*?-----END`),\n    \"Ssh2 Encrypted Private Key\":    regexp.MustCompile(`(?s)-----BEGIN SSH2 ENCRYPTED PRIVATE KEY-----[a-zA-Z0-9+\\/=\\n]+-----END SSH2 ENCRYPTED PRIVATE KEY-----`),\n    \"Generic Private Key\":           regexp.MustCompile(`(?s)-----BEGIN.*PRIVATE KEY-----[a-zA-Z0-9+\\/=\\n]+-----END.*PRIVATE KEY-----`),\n    \"Username Password Combo\":       regexp.MustCompile(`(?i)\\b[a-z]+://[^/\\s:@\"']{1,64}:[^/\\s:@\"']{1,128}@[a-zA-Z0-9.\\-]{3,255}`),\n    \"Facebook Oauth\":                regexp.MustCompile(`(?i)(?:facebook|fb)[_\\-]?(?:app[_\\-]?)?(?:secret|client[_\\-]?secret|oauth)\\s*[:=]\\s*['\\\"]?[0-9a-f]{32}['\\\"]?`),\n    \"Twitter Oauth\":                 regexp.MustCompile(`(?i)\\b(?:twitter|tw)\\s*[_-]?oauth[_-]?token\\s*[:=]\\s*[\"']?[0-9a-zA-Z]{35,44}[\"']?`),\n    \"Github Token\":                  regexp.MustCompile(`(?i)\\b(gh[pousr]_[0-9a-zA-Z]{36})\\b`),\n    \"Google Oauth Client Secret\":    regexp.MustCompile(`\\\"client_secret\\\":\\\"[a-zA-Z0-9-_]{24}\\\"`),\n    \"Aws Api Key\":                   regexp.MustCompile(`\\bAKIA[0-9A-Z]{16}\\b`),\n\t\"Slack Token\":                   regexp.MustCompile(`\\\"api_token\\\":\\\"(xox[a-zA-Z]-[a-zA-Z0-9-]+)\\\"`),\n\t\"Ssh Priv Key\":                  regexp.MustCompile(`([-]+BEGIN [^\\s]+ PRIVATE KEY[-]+[\\s]*[^-]*[-]+END [^\\s]+ PRIVATE KEY[-]+)`),\n\t\"Slack Webhook Url\":             regexp.MustCompile(`https://hooks.slack.com/services/[A-Za-z0-9]+/[A-Za-z0-9]+/[A-Za-z0-9]+`),\n\t\"Heroku Api Key 2\":              
regexp.MustCompile(`(?i)\\bheroku[_-]?(?:api[_-]?)?key\\s*[:=]\\s*[\"']?[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}[\"']?`),\n\t\"Dropbox Access Token\":          regexp.MustCompile(`\\bsl\\.[A-Za-z0-9_-]{64,200}\\b`),\n\t\"Salesforce Access Token\":       regexp.MustCompile(`00D[0-9A-Za-z]{15,18}![A-Za-z0-9]{40}`),\n\t\"Twitter Bearer Token\":          regexp.MustCompile(`\\bAAAAAAAAAAAAAAAAAAAAA[A-Za-z0-9%]{30,80}\\b`),\n\t\"Firebase Url\":                  regexp.MustCompile(`https://[a-z0-9-]+\\.firebaseio\\.com`),\n\t\"Pem Private Key\":               regexp.MustCompile(`-----BEGIN (?:[A-Z ]+ )?PRIVATE KEY-----`),\n\t\"Google Cloud Sa Key\":           regexp.MustCompile(`\"type\": \"service_account\"`),\n\t\"Stripe Publishable Key\":        regexp.MustCompile(`pk_live_[0-9a-zA-Z]{24}`),\n\t\"Azure Storage Account Key\":     regexp.MustCompile(`(?i)\\b(?:AccountKey|azure[_-]?storage[_-]?key)\\s*[:=]\\s*[\"']?[A-Za-z0-9+/]{86}==[\"']?`),\n\t\"Instagram Access Token\":        regexp.MustCompile(`IGQV[A-Za-z0-9._-]{10,}`),\n\t\"Stripe Test Publishable Key\":   regexp.MustCompile(`pk_test_[0-9a-zA-Z]{24}`),\n\t\"Stripe Test Secret Key\":        regexp.MustCompile(`sk_test_[0-9a-zA-Z]{24}`),\n\t\"Slack Bot Token\":               regexp.MustCompile(`xoxb-[A-Za-z0-9-]{24,34}`),\n\t\"Slack User Token\":              regexp.MustCompile(`xoxp-[A-Za-z0-9-]{24,34}`),\n    \"Google Gmail Api Key\":          regexp.MustCompile(`\\bAIza[0-9A-Za-z_\\-]{35}\\b`),\n    \"Google Gmail Oauth\":            regexp.MustCompile(`\\b[0-9]+-[0-9A-Za-z_]{32}\\.apps\\.googleusercontent\\.com\\b`),\n    \"Google Oauth Access Token\":     regexp.MustCompile(`\\bya29\\.[0-9A-Za-z_\\-]{40,}\\b`),\n    \"Mailchimp Api Key\":             regexp.MustCompile(`[0-9a-f]{32}-us[0-9]{1,2}`),\n    \"Mailgun Api Key\":               regexp.MustCompile(`key-[0-9a-zA-Z]{32}`),\n    \"Google Drive Oauth\":            
regexp.MustCompile(`\\b[0-9]+-[0-9A-Za-z_]{32}\\.apps\\.googleusercontent\\.com\\b`),\n    \"Paypal Braintree Access Token\": regexp.MustCompile(`access_token\\$production\\$[0-9a-z]{16}\\$[0-9a-f]{32}`),\n    \"Picatic Api Key\":               regexp.MustCompile(`sk_live_[0-9a-z]{32}`),\n    \"Stripe Api Key\":                regexp.MustCompile(`sk_live_[0-9a-zA-Z]{24}`),\n    \"Stripe Restricted Api Key\":     regexp.MustCompile(`rk_live_[0-9a-zA-Z]{24}`),\n    \"Square Access Token 2\":         regexp.MustCompile(`\\bsq0atp-[0-9A-Za-z_\\-]{22}\\b`),\n    \"Square Oauth Secret 2\":         regexp.MustCompile(`\\bsq0csp-[0-9A-Za-z_\\-]{43}\\b`),\n    \"Twitter Access Token\":          regexp.MustCompile(`(?i)\\b(?:twitter|tw)\\s*[_-]?access[_-]?token\\s*[:=]\\s*[\"']?[0-9]+-[0-9a-zA-Z]{40}[\"']?`),\n\t\"Heroku Api Key 3\":              regexp.MustCompile(`(?i)\\bheroku\\b[^\\n]{0,80}\\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\\b`),\n    \"Generic Api Key\":               regexp.MustCompile(`(?i)\\bapi[_-]?key\\s*[:=]\\s*['\\\"]?[0-9a-zA-Z]{32,45}['\\\"]?`),\n    \"Generic Secret\":                regexp.MustCompile(`(?i)\\bsecret\\s*[:=]\\s*['\\\"]?[0-9a-zA-Z]{32,45}['\\\"]?`),\n    \"Slack Webhook\":                 regexp.MustCompile(`https://hooks[.]slack[.]com/services/T[a-zA-Z0-9_]{8}/B[a-zA-Z0-9_]{8}/[a-zA-Z0-9_]{24}`),\n    \"Gcp Service Account\":           regexp.MustCompile(`\\\"type\\\": \\\"service_account\\\"`),\n    \"Password in Url\":               regexp.MustCompile(`[a-zA-Z]{3,10}://[^/\\s:@\"']{3,32}:[^/\\s:@\"']{3,128}@[a-zA-Z0-9.\\-]{3,200}`),\n\t\"Discord Webhook url\":           regexp.MustCompile(`https://discord(?:app)?\\.com/api/webhooks/[0-9]{18,20}/[A-Za-z0-9_-]{64,}`),\n\t\"Discord bot Token\":             regexp.MustCompile(`[MN][A-Za-z\\d]{23}\\.[\\w-]{6}\\.[\\w-]{27}`),\n\t\"Okta Api Token\":                regexp.MustCompile(`00[a-zA-Z0-9]{30}\\.[a-zA-Z0-9\\-_]{30,}\\.[a-zA-Z0-9\\-_]{30,}`),\n\t\"Sendgrid Api 
Key\":              regexp.MustCompile(`SG\\.[A-Za-z0-9_-]{22}\\.[A-Za-z0-9_-]{43}`),\n\t\"Mapbox Access Token\":           regexp.MustCompile(`pk\\.[a-zA-Z0-9]{60}\\.[a-zA-Z0-9]{22}`),\n\t\"Gitlab Personal Access token\":  regexp.MustCompile(`glpat-[A-Za-z0-9\\-]{20}`),\n\t\"Datadog Api Key\":               regexp.MustCompile(`ddapi_[a-zA-Z0-9]{32}`),\n\t\"shopify Access Token\":          regexp.MustCompile(`shpat_[A-Za-z0-9]{32}`),\n    \"Atlassian Access Token\":        regexp.MustCompile(`(?i)\\b(?:atlassian|jira|confluence)[_-]?(?:api[_-]?)?token\\s*[:=]\\s*[\"']?ATATT3[A-Za-z0-9_\\-]{180,250}[\"']?`),\n\t\"Crowdstrike Api Key\":           regexp.MustCompile(`(?i)\\b(?:crowdstrike|cs)[_-]?(?:api[_-]?)?(?:key|token)\\s*[:=]\\s*[\"']?[A-Za-z0-9]{32}\\.[A-Za-z0-9]{16}[\"']?`),\n\t\"Quickbooks Api Key\":            regexp.MustCompile(`(?i)\\b(?:quickbooks|qbo|intuit)[_-]?(?:api[_-]?)?(?:key|token)\\s*[:=]\\s*[\"']?A[0-9a-f]{32}[\"']?`),\n\t\"Cisco Api Key\":                 regexp.MustCompile(`(?i)\\bcisco[_-]?(?:api[_-]?)?key\\s*[:=]\\s*[\"']?[A-Za-z0-9]{30,}[\"']?`),\n\t\"Cisco Access Token\":            regexp.MustCompile(`(?i)\\bcisco[_-]?access[_-]?token\\s*[:=]\\s*[\"']?[A-Za-z0-9_\\-]{20,}[\"']?`),\n\t\"Segment Write Key\":             regexp.MustCompile(`(?i)\\b(?:segment[_-]?)?writeKey\\s*[:=]\\s*[\"']?[A-Za-z0-9]{32}[\"']?`),\n\t\"Tiktok Access Token\":           regexp.MustCompile(`\\btiktok_access_token=[a-zA-Z0-9_]{20,}\\b`),\n\t\"Slack Client Secret\":           regexp.MustCompile(`xoxs-[0-9]{1,9}.[0-9A-Za-z]{1,12}.[0-9A-Za-z]{24,64}`),\n    \"Phone Number\":                  regexp.MustCompile(`(?:^|[\\s\"'<>:,;(\\[])\\+\\d{9,14}(?:[\\s\"'<>,;.!?)\\]]|$)`),\n    \"Email\":                         regexp.MustCompile(`[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}`),\n\t\"Ali Cloud Access Key\":\t\t     regexp.MustCompile(`\\bLTAI[A-Za-z0-9]{12,20}\\b`),\n\t\"Tencent Cloud Access Key\":\t     regexp.MustCompile(`\\bAKID[A-Za-z0-9]{13,20}\\b`),\n      
  \"OpenAI API Key\":                regexp.MustCompile(`sk-[a-zA-Z0-9]{20}T3BlbkFJ[a-zA-Z0-9]{20}`),\n        \"OpenAI API Key Project\":        regexp.MustCompile(`sk-proj-[a-zA-Z0-9]{48,}`),\n        \"OpenAI API Key Svc\":            regexp.MustCompile(`sk-svcacct-[a-zA-Z0-9_-]{80,}`),\n        \"Anthropic API Key\":             regexp.MustCompile(`sk-ant-api[a-zA-Z0-9-]{37,}`),\n        \"HuggingFace Token\":             regexp.MustCompile(`hf_[a-zA-Z0-9]{34,}`),\n        \"Cohere API Key\":                regexp.MustCompile(`(?i)cohere[_-]?api[_-]?key\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{40}[\"']?`),\n        \"Replicate API Token\":           regexp.MustCompile(`r8_[a-zA-Z0-9]{40}`),\n        \"Google AI API Key\":             regexp.MustCompile(`(?i)(?:gemini|palm|bard)[_-]?api[_-]?key\\s*[:=]\\s*[\"']?AIza[a-zA-Z0-9_-]{35}[\"']?`),\n        \"AWS Secret Access Key\":         regexp.MustCompile(`(?i)(?:aws)?[_-]?secret[_-]?(?:access)?[_-]?key\\s*[:=]\\s*[\"']?[A-Za-z0-9/+=]{40}[\"']?`),\n        \"AWS Session Token\":             regexp.MustCompile(`(?i)aws[_-]?session[_-]?token\\s*[:=]\\s*[\"']?[A-Za-z0-9/+=]{100,}[\"']?`),\n        \"MongoDB Connection String\":     regexp.MustCompile(`mongodb(?:\\+srv)?://[a-zA-Z0-9._-]+:[^@\\s\"']+@[a-zA-Z0-9._-]+`),\n        \"PostgreSQL Connection String\":  regexp.MustCompile(`postgres(?:ql)?://[a-zA-Z0-9._-]+:[^@\\s\"']+@[a-zA-Z0-9._-]+`),\n        \"MySQL Connection String\":       regexp.MustCompile(`mysql://[a-zA-Z0-9._-]+:[^@\\s\"']+@[a-zA-Z0-9._-]+`),\n        \"Redis Connection String\":       regexp.MustCompile(`redis://[a-zA-Z0-9._-]+:[^@\\s\"']+@[a-zA-Z0-9._-]+`),\n        \"MSSQL Connection String\":       regexp.MustCompile(`(?i)(?:server|data source)=[^;]+;.*(?:password|pwd)=[^;]+`),\n        \"Database URL Generic\":          regexp.MustCompile(`(?i)(?:database|db)[_-]?url\\s*[:=]\\s*[\"']?[a-z]+://[^:]+:[^@]+@[^\\s\"']+[\"']?`),\n        \"Azure Client Secret\":           
regexp.MustCompile(`(?i)(?:azure|ad)[_-]?(?:client)?[_-]?secret\\s*[:=]\\s*[\"']?[a-zA-Z0-9~._-]{34,}[\"']?`),\n        \"Azure Storage Connection\":      regexp.MustCompile(`DefaultEndpointsProtocol=https?;AccountName=[^;]+;AccountKey=[a-zA-Z0-9+/=]{86,}`),\n        \"Azure SAS Token\":               regexp.MustCompile(`(?i)[?&]sig=[a-zA-Z0-9%]{43,}`),\n        \"Azure SQL Connection\":          regexp.MustCompile(`(?i)Server=tcp:[^;]+;.*Password=[^;]+`),\n        \"DigitalOcean Token\":            regexp.MustCompile(`dop_v1_[a-f0-9]{64}`),\n        \"DigitalOcean OAuth\":            regexp.MustCompile(`doo_v1_[a-f0-9]{64}`),\n        \"DigitalOcean Refresh\":          regexp.MustCompile(`dor_v1_[a-f0-9]{64}`),\n        \"Linode API Token\":              regexp.MustCompile(`(?i)linode[_-]?(?:api)?[_-]?token\\s*[:=]\\s*[\"']?[a-f0-9]{64}[\"']?`),\n        \"Vultr API Key\":                 regexp.MustCompile(`(?i)vultr[_-]?api[_-]?key\\s*[:=]\\s*[\"']?[A-Z0-9]{36}[\"']?`),\n        \"Hetzner API Token\":             regexp.MustCompile(`(?i)hetzner[_-]?(?:api)?[_-]?token\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{64}[\"']?`),\n        \"Oracle Cloud API Key\":          regexp.MustCompile(`(?i)oci[_-]?api[_-]?key\\s*[:=]\\s*[\"']?-----BEGIN (?:RSA )?PRIVATE KEY-----`),\n        \"IBM Cloud API Key\":             regexp.MustCompile(`(?i)ibm[_-]?(?:cloud)?[_-]?api[_-]?key\\s*[:=]\\s*[\"']?[a-zA-Z0-9_-]{44}[\"']?`),\n        \"NPM Access Token\":              regexp.MustCompile(`npm_[a-zA-Z0-9]{36}`),\n        \"PyPI API Token\":                regexp.MustCompile(`pypi-[a-zA-Z0-9_-]{100,}`),\n        \"NuGet API Key\":                 regexp.MustCompile(`oy2[a-z0-9]{43}`),\n        \"RubyGems API Key\":              regexp.MustCompile(`rubygems_[a-f0-9]{48}`),\n        \"CircleCI Token\":                regexp.MustCompile(`(?i)circle[_-]?(?:ci)?[_-]?token\\s*[:=]\\s*[\"']?[a-f0-9]{40}[\"']?`),\n        \"Travis CI Token\":               
regexp.MustCompile(`(?i)travis[_-]?(?:ci)?[_-]?token\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{22}[\"']?`),\n        \"Jenkins API Token\":             regexp.MustCompile(`(?i)jenkins[_-]?(?:api)?[_-]?token\\s*[:=]\\s*[\"']?[a-f0-9]{32,}[\"']?`),\n        \"Bitbucket App Password\":        regexp.MustCompile(`(?i)bitbucket[_-]?(?:app)?[_-]?(?:password|secret)\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{18,}[\"']?`),\n        \"Codecov Token\":                 regexp.MustCompile(`(?i)codecov[_-]?token\\s*[:=]\\s*[\"']?[a-f0-9-]{36}[\"']?`),\n        \"Vercel Token\":                  regexp.MustCompile(`(?i)vercel[_-]?token\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{24}[\"']?`),\n        \"Netlify Token\":                 regexp.MustCompile(`(?i)netlify[_-]?(?:auth)?[_-]?token\\s*[:=]\\s*[\"']?[a-zA-Z0-9_-]{40,}[\"']?`),\n        \"Vault Token\":                   regexp.MustCompile(`(?i)(?:vault[_-]?token|hvs)\\s*[:=]?\\s*[\"']?(?:hvs\\.)?[a-zA-Z0-9_-]{24,}[\"']?`),\n        \"Kubernetes Token\":              regexp.MustCompile(`eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9\\.[a-zA-Z0-9_-]+\\.[a-zA-Z0-9_-]+`),\n        \"Docker Registry Password\":      regexp.MustCompile(`(?i)docker[_-]?(?:registry)?[_-]?(?:password|pass|pwd)\\s*[:=]\\s*[\"']?[^\\s\"']{8,}[\"']?`),\n        \"Terraform Cloud Token\":         regexp.MustCompile(`(?i)(?:tfe|terraform)[_-]?token\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{14}\\.[a-zA-Z0-9_-]{67}[\"']?`),\n        \"Pulumi Access Token\":           regexp.MustCompile(`pul-[a-f0-9]{40}`),\n        \"Adyen API Key\":                 regexp.MustCompile(`(?i)adyen[_-]?api[_-]?key\\s*[:=]\\s*[\"']?AQE[a-zA-Z0-9_-]{50,}[\"']?`),\n        \"Klarna API Key\":                regexp.MustCompile(`(?i)klarna[_-]?api[_-]?(?:key|secret)\\s*[:=]\\s*[\"']?[a-zA-Z0-9_-]{30,}[\"']?`),\n        \"Razorpay Key\":                  regexp.MustCompile(`rzp_(?:live|test)_[a-zA-Z0-9]{14}`),\n        \"Coinbase API Secret\":           
regexp.MustCompile(`(?i)coinbase[_-]?(?:api)?[_-]?secret\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{64}[\"']?`),\n        \"Binance API Secret\":            regexp.MustCompile(`(?i)binance[_-]?(?:api)?[_-]?secret\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{64}[\"']?`),\n        \"Twilio Auth Token\":             regexp.MustCompile(`(?i)twilio[_-]?auth[_-]?token\\s*[:=]\\s*[\"']?[a-f0-9]{32}[\"']?`),\n        \"Pusher Secret\":                 regexp.MustCompile(`(?i)pusher[_-]?(?:app)?[_-]?secret\\s*[:=]\\s*[\"']?[a-f0-9]{20}[\"']?`),\n        \"Vonage API Secret\":             regexp.MustCompile(`(?i)(?:vonage|nexmo)[_-]?(?:api)?[_-]?secret\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{16}[\"']?`),\n        \"Plivo Auth Token\":              regexp.MustCompile(`(?i)plivo[_-]?auth[_-]?(?:token|id)\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{40,}[\"']?`),\n        \"MessageBird API Key\":           regexp.MustCompile(`(?i)messagebird[_-]?(?:api)?[_-]?key\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{25}[\"']?`),\n        \"Intercom Access Token\":         regexp.MustCompile(`(?i)intercom[_-]?(?:access)?[_-]?token\\s*[:=]\\s*[\"']?[a-zA-Z0-9=_-]{60,}[\"']?`),\n        \"Zendesk API Token\":             regexp.MustCompile(`(?i)zendesk[_-]?(?:api)?[_-]?token\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{40}[\"']?`),\n        \"Algolia Admin API Key\":         regexp.MustCompile(`(?i)algolia[_-]?(?:admin)?[_-]?(?:api)?[_-]?key\\s*[:=]\\s*[\"']?[a-f0-9]{32}[\"']?`),\n        \"Elasticsearch API Key\":         regexp.MustCompile(`(?i)(?:elastic|es)[_-]?(?:api)?[_-]?key\\s*[:=]\\s*[\"']?[a-zA-Z0-9_-]{50,}[\"']?`),\n        \"Mixpanel API Secret\":           regexp.MustCompile(`(?i)mixpanel[_-]?(?:api)?[_-]?secret\\s*[:=]\\s*[\"']?[a-f0-9]{32}[\"']?`),\n        \"Amplitude API Key\":             regexp.MustCompile(`(?i)amplitude[_-]?(?:api)?[_-]?key\\s*[:=]\\s*[\"']?[a-f0-9]{32}[\"']?`),\n        \"Segment Write Key Alt\":         regexp.MustCompile(`(?i)segment[_-]?(?:write)?[_-]?key\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{32}[\"']?`),\n        \"New Relic License Key\":       
  regexp.MustCompile(`(?i)new[_-]?relic[_-]?license[_-]?key\\s*[:=]\\s*[\"']?[a-f0-9]{40}[\"']?`),\n        \"New Relic API Key\":             regexp.MustCompile(`NRAK-[A-Z0-9]{27}`),\n        \"New Relic Insights Key\":        regexp.MustCompile(`NRI[IQ]-[a-zA-Z0-9_-]{32}`),\n        \"Loggly Token\":                  regexp.MustCompile(`(?i)loggly[_-]?(?:customer)?[_-]?token\\s*[:=]\\s*[\"']?[a-f0-9-]{36}[\"']?`),\n        \"Splunk HEC Token\":              regexp.MustCompile(`(?i)splunk[_-]?(?:hec)?[_-]?token\\s*[:=]\\s*[\"']?[a-f0-9-]{36}[\"']?`),\n        \"Sumo Logic Access Key\":         regexp.MustCompile(`(?i)sumo[_-]?logic[_-]?(?:access)?[_-]?(?:key|id)\\s*[:=]\\s*[\"']?su[a-zA-Z0-9]{12}[\"']?`),\n        \"Grafana API Key\":               regexp.MustCompile(`eyJr[a-zA-Z0-9_-]{50,}={0,2}`),\n        \"PagerDuty API Key\":             regexp.MustCompile(`(?i)pagerduty[_-]?(?:api)?[_-]?key\\s*[:=]\\s*[\"']?[a-zA-Z0-9+/=_-]{20}[\"']?`),\n        \"Supabase Service Role Key\":     regexp.MustCompile(`eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9\\.[a-zA-Z0-9_-]+\\.[a-zA-Z0-9_-]+`),\n        \"Firebase Admin SDK Key\":        regexp.MustCompile(`(?i)firebase[_-]?(?:admin)?[_-]?sdk[_-]?key\\s*[:=]\\s*[\"']?[a-zA-Z0-9_-]{100,}[\"']?`),\n        \"Auth0 Client Secret\":           regexp.MustCompile(`(?i)auth0[_-]?(?:client)?[_-]?secret\\s*[:=]\\s*[\"']?[a-zA-Z0-9_-]{64,}[\"']?`),\n        \"Okta API Token Alt\":            regexp.MustCompile(`(?i)okta[_-]?(?:api)?[_-]?token\\s*[:=]\\s*[\"']?00[a-zA-Z0-9_-]{40}[\"']?`),\n        \"Cloudinary Secret\":             regexp.MustCompile(`(?i)cloudinary[_-]?(?:api)?[_-]?secret\\s*[:=]\\s*[\"']?[a-zA-Z0-9_-]{27}[\"']?`),\n        \"Cloudinary URL\":                regexp.MustCompile(`cloudinary://[0-9]+:[a-zA-Z0-9_-]+@[a-z]+`),\n        \"Backblaze Application Key\":     regexp.MustCompile(`(?i)b2[_-]?(?:application)?[_-]?key\\s*[:=]\\s*[\"']?K[a-zA-Z0-9]{30,}[\"']?`),\n        \"Wasabi Access Key\":             
regexp.MustCompile(`(?i)wasabi[_-]?(?:access)?[_-]?key\\s*[:=]\\s*[\"']?[A-Z0-9]{20}[\"']?`),\n        \"LaunchDarkly SDK Key\":          regexp.MustCompile(`(?i)(?:ld)?[_-]?sdk[_-]?key\\s*[:=]\\s*[\"']?sdk-[a-f0-9-]{36}[\"']?`),\n        \"LaunchDarkly API Key\":          regexp.MustCompile(`(?i)launchdarkly[_-]?(?:api)?[_-]?key\\s*[:=]\\s*[\"']?api-[a-f0-9-]{36}[\"']?`),\n        \"Split.io API Key\":              regexp.MustCompile(`(?i)split[_-]?(?:io)?[_-]?(?:api)?[_-]?key\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{50,}[\"']?`),\n        \"Statsig Secret\":                regexp.MustCompile(`(?i)statsig[_-]?(?:secret)?[_-]?key\\s*[:=]\\s*[\"']?secret-[a-zA-Z0-9]{50,}[\"']?`),\n        \"GitLab Pipeline Token\":         regexp.MustCompile(`glptt-[a-f0-9]{40}`),\n        \"GitLab Runner Token\":           regexp.MustCompile(`GR1348941[a-zA-Z0-9_-]{20}`),\n        \"GitHub App Private Key\":        regexp.MustCompile(`-----BEGIN RSA PRIVATE KEY-----[\\s\\S]+?-----END RSA PRIVATE KEY-----`),\n        \"Bitbucket OAuth Secret\":        regexp.MustCompile(`(?i)bitbucket[_-]?(?:oauth)?[_-]?secret\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{32,}[\"']?`),\n        \"Contentful Management Token\":   regexp.MustCompile(`CFPAT-[a-zA-Z0-9_-]{43}`),\n        \"Contentful Delivery Token\":     regexp.MustCompile(`(?i)contentful[_-]?(?:delivery)?[_-]?token\\s*[:=]\\s*[\"']?[a-zA-Z0-9_-]{43}[\"']?`),\n        \"Sanity Token\":                  regexp.MustCompile(`(?i)\\bsanity[_-]?(?:api[_-]?)?token\\s*[:=]\\s*[\"']?sk[a-zA-Z0-9]{32,}[\"']?`),\n        \"Strapi API Token\":              regexp.MustCompile(`(?i)strapi[_-]?(?:api)?[_-]?token\\s*[:=]\\s*[\"']?[a-f0-9]{256}[\"']?`),\n        \"Postmark Server Token\":         regexp.MustCompile(`(?i)postmark[_-]?(?:server)?[_-]?token\\s*[:=]\\s*[\"']?[a-f0-9-]{36}[\"']?`),\n        \"SparkPost API Key\":             regexp.MustCompile(`(?i)sparkpost[_-]?(?:api)?[_-]?key\\s*[:=]\\s*[\"']?[a-f0-9]{40}[\"']?`),\n        \"Mailjet API Secret\":            
regexp.MustCompile(`(?i)mailjet[_-]?(?:api)?[_-]?secret\\s*[:=]\\s*[\"']?[a-f0-9]{32}[\"']?`),\n        \"Mandrill API Key\":              regexp.MustCompile(`(?i)mandrill[_-]?(?:api)?[_-]?key\\s*[:=]\\s*[\"']?[a-zA-Z0-9_-]{22}[\"']?`),\n        \"Customer.io API Key\":           regexp.MustCompile(`(?i)customer[_-]?io[_-]?(?:api)?[_-]?key\\s*[:=]\\s*[\"']?[a-f0-9]{32}[\"']?`),\n        \"Mapbox Secret Token\":           regexp.MustCompile(`sk\\.[a-zA-Z0-9]{60,}\\.[a-zA-Z0-9_-]{22,}`),\n        \"Here API Key\":                  regexp.MustCompile(`(?i)here[_-]?(?:api)?[_-]?key\\s*[:=]\\s*[\"']?[a-zA-Z0-9_-]{43}[\"']?`),\n        \"TomTom API Key\":                regexp.MustCompile(`(?i)tomtom[_-]?(?:api)?[_-]?key\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{32}[\"']?`),\n        \"LinkedIn Client Secret\":        regexp.MustCompile(`(?i)linkedin[_-]?(?:client)?[_-]?secret\\s*[:=]\\s*[\"']?[a-zA-Z0-9]{16}[\"']?`),\n        \"Spotify Client Secret\":         regexp.MustCompile(`(?i)spotify[_-]?(?:client)?[_-]?secret\\s*[:=]\\s*[\"']?[a-f0-9]{32}[\"']?`),\n        \"Dropbox App Secret\":            regexp.MustCompile(`(?i)dropbox[_-]?(?:app)?[_-]?secret\\s*[:=]\\s*[\"']?[a-z0-9]{15}[\"']?`),\n        \"Private Key Inline\":            regexp.MustCompile(`(?i)(?:private[_-]?key|priv[_-]?key)\\s*[:=]\\s*[\"'][a-zA-Z0-9+/=\\n]{100,}[\"']`),\n        \"Password Hardcoded\":            regexp.MustCompile(`(?i)(?:password|passwd|pwd)\\s*[:=]\\s*[\"'][^\"']{8,50}[\"']`),\n        \"Secret Key Hardcoded\":          regexp.MustCompile(`(?i)(?:secret[_-]?key|signing[_-]?key|encryption[_-]?key)\\s*[:=]\\s*[\"'][a-zA-Z0-9+/=_-]{20,}[\"']`),\n    }\n\n    asciiArt = `\n         ________             __         \n     __ / / __/ /  __ _____  / /____ ____\n    / // /\\ \\/ _ \\/ // / _ \\/ __/ -_) __/\n    \\___/___/_//_/\\_,_/_//_/\\__/\\__/_/  \n\n     ` + version + `                         Created by cc1a2b\n    `\n)\n\n// progressReader wraps an io.Reader to track download progress\ntype 
progressReader struct {\n    reader     io.Reader\n    total      int64\n    current    int64\n    lastUpdate time.Time\n    onProgress func(int64)\n}\n\nfunc (pr *progressReader) Read(p []byte) (int, error) {\n    n, err := pr.reader.Read(p)\n    pr.current += int64(n)\n    \n    // Only update progress every 100ms to avoid too many updates\n    if pr.onProgress != nil && time.Since(pr.lastUpdate) > 100*time.Millisecond {\n        pr.onProgress(pr.current)\n        pr.lastUpdate = time.Now()\n    }\n    \n    return n, err\n}\n\n// flagList is a custom type for handling multiple header flags\ntype flagList []string\n\nfunc (f *flagList) String() string {\n    return strings.Join(*f, \", \")\n}\n\nfunc (f *flagList) Set(value string) error {\n    *f = append(*f, value)\n    return nil\n}\n\n// Config holds all configuration options\ntype Config struct {\n    // Basic options\n    URL, List, JSFile, Output, Regex, Cookies, Proxy string\n    Threads                                           int\n    Quiet, Help, Update, ExtractEndpoints, SkipTLS, FoundOnly bool\n    \n    // Advanced HTTP\n    Headers    []string // Custom HTTP headers\n    UserAgent  string   // Custom User-Agent (single string or randomly selected from file)\n    UserAgents []string // List of User-Agents (when loaded from file)\n    RateLimit int      // Delay between requests (ms)\n    Timeout    int      // Request timeout (seconds)\n    Retry      int      // Retry failed requests\n    \n    // JS Analysis\n    Deobfuscate, SourceMap, Eval, ObfsDetect bool\n    \n    // Security Analysis\n    Secrets, Tokens, Params, ParamURLs, Internal, GraphQL, Bypass, Firebase, Links bool\n    \n    // Crawling & Scope\n    CrawlDepth int    // Recursive JS crawling depth\n    Domain     string // Scope to specific domain\n    Ext        string // Match specific JS file extensions\n    \n    // Output\n    JSON, CSV, Verbose, Burp bool\n\n    // v0.6: false-positive pipeline controls.\n    // MinConfidence 
gates findings; ShowConfidence prints the score inline.\n    // NoFPFilter disables the legacy FP filter (debug only). SelfTest runs\n    // the rule registry against its TP/FP fixtures and exits. MaxBytes caps\n    // response body reads (gzip-bomb defense). AllowInternal opts in to\n    // file://, localhost, and RFC1918 targets — off by default to avoid SSRF.\n    MinConfidence  float64\n    ShowConfidence bool\n    NoFPFilter     bool\n    SelfTest       bool\n    MaxBytes       int64\n    AllowInternal  bool\n\n    // v0.6+: live verification + observability + extensibility.\n    // Verify enables read-only liveness probes against provider endpoints\n    // (Stripe /v1/balance, GitHub /user, OpenAI /v1/models, Slack auth.test, etc.).\n    // VerifyTimeout bounds each probe; PerHost caps outbound concurrency per host.\n    // Stats prints per-stage counters at end of run.\n    // RulesFile loads an external JSON rule pack at startup.\n    Verify         bool\n    VerifyTimeout  int\n    PerHost        int\n    Stats          bool\n    RulesFile      string\n\n    // v0.6++: I/O formats, suppressions, registry introspection, deltas.\n    // SARIF and NDJSON are alternative output modes; IgnoreFile is a\n    // permanent suppression list; DiffFile takes a previous JSON envelope\n    // and reports only new findings; OnlyRules/DisableRule apply a registry\n    // filter; HARFile bypasses the fetcher and reads from a Chrome HAR\n    // archive directly; NoColor disables ANSI color (also auto when stdout\n    // is not a TTY).\n    SARIF          bool\n    NDJSON         bool\n    IgnoreFile     string\n    DiffFile       string\n    OnlyRules      string\n    DisableRule    string\n    HARFile        string\n    NoColor        bool\n    IgnoreSet      *IgnoreList\n    DiffSeen       map[string]bool\n\n    // v0.6+++: page-aware crawling, source maps, cache, robots, concurrent verify.\n    CacheDir       string\n    Robots         bool\n    InlineHTML     bool\n    
CSPOrigins     bool\n    VerifyWorkers  int\n    Cache          *DiskCache\n}\n\nfunc Run() {\n    var (\n        url, list, jsFile, output, regex, cookies, proxy string\n        threads                                           int\n        quiet, help, update, extractEndpoints, skipTLS, foundOnly bool\n    )\n    \n    // Advanced HTTP\n    var headers flagList\n    var userAgent string\n    var rateLimit, timeout, retry int\n    \n    // JS Analysis\n    var deobfuscate, sourceMap, eval, obfsDetect bool\n    \n    // Security Analysis\n    var secrets, tokens, params, paramURLs, internal, graphql, bypass, firebase, links bool\n    \n    // Crawling & Scope\n    var crawlDepth int\n    var domain, ext string\n    \n    // Output\n    var jsonOut, csvOut, verbose, burp bool\n\n    flag.StringVar(&url, \"u\", \"\", \"Input a URL\")\n    flag.StringVar(&url, \"url\", \"\", \"Input a URL\")\n    flag.StringVar(&list, \"l\", \"\", \"Input a file with URLs (.txt)\")\n    flag.StringVar(&list, \"list\", \"\", \"Input a file with URLs (.txt)\")\n    flag.StringVar(&jsFile, \"f\", \"\", \"Path to JavaScript file\")\n    flag.StringVar(&jsFile, \"file\", \"\", \"Path to JavaScript file\")\n    flag.StringVar(&output, \"o\", \"\", \"Output file path\")\n    flag.StringVar(&output, \"output\", \"\", \"Output file path\")\n    flag.StringVar(&regex, \"r\", \"\", \"RegEx for filtering results (endpoints and sensitive data)\")\n    flag.StringVar(&regex, \"regex\", \"\", \"RegEx for filtering results (endpoints and sensitive data)\")\n    flag.StringVar(&cookies, \"c\", \"\", \"Cookies for authenticated JS files\")\n    flag.StringVar(&cookies, \"cookies\", \"\", \"Cookies for authenticated JS files\")\n    flag.StringVar(&proxy, \"p\", \"\", \"Set proxy (host:port)\")\n    flag.StringVar(&proxy, \"proxy\", \"\", \"Set proxy (host:port)\")\n    flag.IntVar(&threads, \"t\", 5, \"Number of concurrent threads\")\n    flag.IntVar(&threads, \"threads\", 5, \"Number of concurrent 
threads\")\n    flag.BoolVar(&quiet, \"q\", false, \"Quiet mode: suppress ASCII art output\")\n    flag.BoolVar(&quiet, \"quiet\", false, \"Quiet mode: suppress ASCII art output\")\n    flag.BoolVar(&help, \"h\", false, \"Display help message\")\n    flag.BoolVar(&help, \"help\", false, \"Display help message\")\n    flag.BoolVar(&update, \"update\", false, \"Update the tool with latest patterns\")\n    flag.BoolVar(&update, \"up\", false, \"Update the tool to latest version\")\n    flag.BoolVar(&extractEndpoints, \"ep\", false, \"Extract endpoints from JavaScript files\")\n    flag.BoolVar(&extractEndpoints, \"end-point\", false, \"Extract endpoints from JavaScript files\")\n    flag.BoolVar(&skipTLS, \"k\", false, \"Skip TLS certificate verification\")\n    flag.BoolVar(&skipTLS, \"skip-tls\", false, \"Skip TLS certificate verification\")\n    flag.BoolVar(&foundOnly, \"fo\", false, \"Only show results when sensitive data is found (hide MISSING messages)\")\n    flag.BoolVar(&foundOnly, \"found-only\", false, \"Only show results when sensitive data is found (hide MISSING messages)\")\n    \n    // Advanced HTTP flags\n    flag.Var(&headers, \"H\", \"Custom HTTP headers (repeatable, format: 'Key: Value')\")\n    flag.Var(&headers, \"header\", \"Custom HTTP headers (repeatable, format: 'Key: Value')\")\n    flag.StringVar(&userAgent, \"U\", \"\", \"Custom User-Agent string or path to file containing user agents (one per line)\")\n    flag.StringVar(&userAgent, \"user-agent\", \"\", \"Custom User-Agent string or path to file containing user agents (one per line)\")\n    flag.IntVar(&rateLimit, \"R\", 0, \"Delay between requests (ms)\")\n    flag.IntVar(&rateLimit, \"rate-limit\", 0, \"Delay between requests (ms)\")\n    flag.IntVar(&timeout, \"T\", 30, \"Request timeout (seconds)\")\n    flag.IntVar(&timeout, \"timeout\", 30, \"Request timeout (seconds)\")\n    flag.IntVar(&retry, \"y\", 2, \"Retry failed requests\")\n    flag.IntVar(&retry, \"retry\", 2, \"Retry 
failed requests\")\n    \n    // JS Analysis flags\n    flag.BoolVar(&deobfuscate, \"d\", false, \"Deobfuscate minified/obfuscated code\")\n    flag.BoolVar(&deobfuscate, \"deobfuscate\", false, \"Deobfuscate minified/obfuscated code\")\n    flag.BoolVar(&sourceMap, \"m\", false, \"Parse source maps for original JS\")\n    flag.BoolVar(&sourceMap, \"sourcemap\", false, \"Parse source maps for original JS\")\n    flag.BoolVar(&eval, \"e\", false, \"Analyze eval() & dynamic code\")\n    flag.BoolVar(&eval, \"eval\", false, \"Analyze eval() & dynamic code\")\n    flag.BoolVar(&obfsDetect, \"z\", false, \"Detect obfuscation techniques\")\n    flag.BoolVar(&obfsDetect, \"obfs-detect\", false, \"Detect obfuscation techniques\")\n    \n    // Security Analysis flags\n    flag.BoolVar(&secrets, \"s\", false, \"API keys, tokens, credentials detection\")\n    flag.BoolVar(&secrets, \"secrets\", false, \"API keys, tokens, credentials detection\")\n    flag.BoolVar(&tokens, \"x\", false, \"JWT/auth tokens extraction\")\n    flag.BoolVar(&tokens, \"tokens\", false, \"JWT/auth tokens extraction\")\n    flag.BoolVar(&params, \"P\", false, \"Hidden parameters discovery\")\n    flag.BoolVar(&params, \"params\", false, \"Hidden parameters discovery\")\n    flag.BoolVar(&paramURLs, \"PU\", false, \"Advanced URL parameter extraction with base URLs\")\n    flag.BoolVar(&paramURLs, \"param-urls\", false, \"Advanced URL parameter extraction with base URLs\")\n    flag.BoolVar(&internal, \"i\", false, \"Internal/private endpoints only\")\n    flag.BoolVar(&internal, \"internal\", false, \"Internal/private endpoints only\")\n    flag.BoolVar(&graphql, \"g\", false, \"GraphQL endpoints & queries\")\n    flag.BoolVar(&graphql, \"graphql\", false, \"GraphQL endpoints & queries\")\n    flag.BoolVar(&bypass, \"B\", false, \"WAF bypass patterns detection\")\n    flag.BoolVar(&bypass, \"bypass\", false, \"WAF bypass patterns detection\")\n    flag.BoolVar(&firebase, \"F\", false, \"Firebase 
config/secrets detection\")\n    flag.BoolVar(&firebase, \"firebase\", false, \"Firebase config/secrets detection\")\n    flag.BoolVar(&links, \"L\", false, \"Extract all links/URLs from JS\")\n    flag.BoolVar(&links, \"links\", false, \"Extract all links/URLs from JS\")\n    \n    // Crawling & Scope flags\n    flag.IntVar(&crawlDepth, \"w\", 1, \"Recursive JS crawling depth\")\n    flag.IntVar(&crawlDepth, \"crawl\", 1, \"Recursive JS crawling depth\")\n    flag.StringVar(&domain, \"D\", \"\", \"Scope to specific domain\")\n    flag.StringVar(&domain, \"domain\", \"\", \"Scope to specific domain\")\n    flag.StringVar(&ext, \"E\", \"\", \"Match specific JS file extensions (comma-separated)\")\n    flag.StringVar(&ext, \"ext\", \"\", \"Match specific JS file extensions (comma-separated)\")\n    \n    // Output flags\n    flag.BoolVar(&jsonOut, \"j\", false, \"Structured JSON output\")\n    flag.BoolVar(&jsonOut, \"json\", false, \"Structured JSON output\")\n    flag.BoolVar(&csvOut, \"C\", false, \"CSV for Excel/Sheets import\")\n    flag.BoolVar(&csvOut, \"csv\", false, \"CSV for Excel/Sheets import\")\n    flag.BoolVar(&verbose, \"v\", false, \"Detailed analysis output\")\n    flag.BoolVar(&verbose, \"verbose\", false, \"Detailed analysis output\")\n    flag.BoolVar(&burp, \"n\", false, \"Burp Suite export format\")\n    flag.BoolVar(&burp, \"burp\", false, \"Burp Suite export format\")\n\n    // v0.6: FP pipeline controls\n    var minConfidence float64\n    var showConfidence, noFPFilter, selfTest, allowInternal bool\n    var maxBytes int64\n    flag.Float64Var(&minConfidence, \"mc\", DefaultMinConfidence, \"Minimum confidence (0.0-1.0) for a finding to be reported\")\n    flag.Float64Var(&minConfidence, \"min-confidence\", DefaultMinConfidence, \"Minimum confidence (0.0-1.0) for a finding to be reported\")\n    flag.BoolVar(&showConfidence, \"sc\", false, \"Show confidence score on each printed finding\")\n    flag.BoolVar(&showConfidence, 
\"show-confidence\", false, \"Show confidence score on each printed finding\")\n    flag.BoolVar(&noFPFilter, \"no-fp-filter\", false, \"Disable false-positive filter (debug; keep all matches)\")\n    flag.BoolVar(&selfTest, \"self-test\", false, \"Run the rule registry against its built-in TP/FP fixtures and exit\")\n    flag.Int64Var(&maxBytes, \"max-bytes\", DefaultMaxBytes, \"Cap response body read size in bytes (gzip-bomb defense)\")\n    flag.BoolVar(&allowInternal, \"allow-internal\", false, \"Allow localhost, RFC1918, and link-local targets (off by default to prevent SSRF; file:// is always rejected)\")\n\n    // v0.6+: verifier, stats, extensibility\n    var verify, stats bool\n    var verifyTimeout, perHost int\n    var rulesFile string\n    flag.BoolVar(&verify, \"verify\", false, \"Probe each finding against the provider's read-only endpoint (off by default; opt-in)\")\n    flag.IntVar(&verifyTimeout, \"verify-timeout\", 10, \"Timeout in seconds for each verification probe\")\n    flag.IntVar(&perHost, \"per-host\", defaultPerHostConcurrency, \"Per-host outbound concurrency cap (avoids getting banned)\")\n    flag.BoolVar(&stats, \"stats\", false, \"Print per-stage counters (URLs fetched, FP-drops by reason, findings) on stderr at end of run\")\n    flag.StringVar(&rulesFile, \"rules-file\", \"\", \"Load an external JSON rule pack at startup (additive to built-in registry)\")\n\n    // v0.6++: I/O formats, suppressions, deltas, registry introspection\n    var sarifOut, ndjsonOut, listRules, noColor bool\n    var explainID, ignoreFile, diffFile, onlyRules, disableRule, harFile string\n    flag.BoolVar(&sarifOut, \"sarif\", false, \"Emit SARIF 2.1.0 (suitable for GitHub code-scanning)\")\n    flag.BoolVar(&ndjsonOut, \"ndjson\", false, \"Stream findings as newline-delimited JSON\")\n    flag.StringVar(&ignoreFile, \"ignore-file\", \"\", \"Path to .jshunterignore for permanent suppression\")\n    flag.StringVar(&diffFile, \"diff\", \"\", \"Diff against a previous JSON envelope; only 
NEW findings reported\")\n    flag.BoolVar(&listRules, \"list-rules\", false, \"Print the rule registry as a table and exit\")\n    flag.StringVar(&explainID, \"explain\", \"\", \"Print the full rule definition (incl. TP/FP fixtures) and exit\")\n    flag.StringVar(&onlyRules, \"only-rules\", \"\", \"Comma-separated rule_id patterns; only matching rules run (supports * glob)\")\n    flag.StringVar(&disableRule, \"disable-rule\", \"\", \"Comma-separated rule_id patterns to disable (supports * glob)\")\n    flag.StringVar(&harFile, \"har\", \"\", \"Ingest a Chrome DevTools HAR file instead of fetching URLs\")\n    flag.BoolVar(&noColor, \"no-color\", false, \"Disable ANSI color (auto-disabled when stdout is not a TTY)\")\n\n    // v0.6+++ — page-aware crawling, sourcemaps, cache, robots, concurrent verify\n    var cacheDir string\n    var robotsMode, inlineHTML, cspOrigins bool\n    var verifyWorkers int\n    flag.StringVar(&cacheDir, \"cache-dir\", \"\", \"Persist HTTP responses on disk for ETag-based revalidation\")\n    flag.BoolVar(&robotsMode, \"robots\", false, \"Fetch /robots.txt for the target host(s) and print Disallow paths\")\n    flag.BoolVar(&inlineHTML, \"inline-html\", false, \"Scan inline <script> tags and SRI/CSP from HTML responses\")\n    flag.BoolVar(&cspOrigins, \"csp-origins\", false, \"Extract Content-Security-Policy origins as candidate endpoints\")\n    flag.IntVar(&verifyWorkers, \"verify-workers\", 8, \"Worker pool size for concurrent --verify probes\")\n\n    flag.Parse()\n\n    // Apply rule-registry selection BEFORE any subcommand that depends on\n    // the rule set (--list-rules, --explain, --self-test).\n    if onlyRules != \"\" || disableRule != \"\" {\n        kept := applyRuleSelection(onlyRules, disableRule)\n        if !quiet {\n            fmt.Fprintf(os.Stderr, \"[%sINFO%s] rule selection applied: %d rules active\\n\",\n                colors[\"CYAN\"], colors[\"NC\"], kept)\n        }\n    }\n\n    if listRules {\n        
runListRules()\n        return\n    }\n    if explainID != \"\" {\n        runExplainRule(explainID)\n        return\n    }\n\n    // TTY autodetect: disable colors when stdout is not a terminal so piped\n    // output stays clean. --no-color forces disable in any case.\n    if noColor || !isStdoutTTY() {\n        disableColors()\n    }\n\n    if rulesFile != \"\" {\n        n, err := LoadRulesFile(rulesFile)\n        if err != nil {\n            fmt.Fprintf(os.Stderr, \"[%sERROR%s] rules file: %v\\n\", colors[\"RED\"], colors[\"NC\"], err)\n            os.Exit(2)\n        }\n        if !quiet {\n            fmt.Fprintf(os.Stderr, \"[%sINFO%s] loaded %d external rules from %s\\n\", colors[\"CYAN\"], colors[\"NC\"], n, rulesFile)\n        }\n    }\n\n    if perHost > 0 {\n        getHostController().perHost = perHost\n    }\n\n    if selfTest {\n        runSelfTestCLI()\n        return\n    }\n\n    // Process User-Agent: check if it's a file path or a string\n    var userAgentsList []string\n    finalUserAgent := userAgent\n    if userAgent != \"\" {\n        // Check if it looks like a file path (contains path separators or common file extensions)\n        if strings.Contains(userAgent, \"/\") || strings.Contains(userAgent, \"\\\\\") || \n           strings.HasSuffix(userAgent, \".txt\") || strings.HasSuffix(userAgent, \".list\") {\n            // Try to read as file\n            if fileInfo, err := os.Stat(userAgent); err == nil && !fileInfo.IsDir() {\n                // It's a file, read user agents from it\n                file, err := os.Open(userAgent)\n                if err == nil {\n                    defer file.Close()\n                    scanner := bufio.NewScanner(file)\n                    for scanner.Scan() {\n                        line := strings.TrimSpace(scanner.Text())\n                        if line != \"\" && !strings.HasPrefix(line, \"#\") {\n                            userAgentsList = append(userAgentsList, line)\n                        
}\n                    }\n                    if len(userAgentsList) > 0 {\n                        // Select a random user agent from the list\n                        finalUserAgent = userAgentsList[rand.Intn(len(userAgentsList))]\n                        if !quiet {\n                            fmt.Printf(\"[%sINFO%s] Loaded %d user agents from file, using: %s\\n\", \n                                colors[\"CYAN\"], colors[\"NC\"], len(userAgentsList), finalUserAgent)\n                        }\n                    } else {\n                        if !quiet {\n                            fmt.Printf(\"[%sWARN%s] User-Agent file is empty or contains no valid entries, using as string\\n\", \n                                colors[\"YELLOW\"], colors[\"NC\"])\n                        }\n                    }\n                } else {\n                    if !quiet {\n                        fmt.Printf(\"[%sWARN%s] Could not read User-Agent file, using as string: %v\\n\", \n                            colors[\"YELLOW\"], colors[\"NC\"], err)\n                    }\n                }\n            }\n        }\n    }\n\n    // Create config object\n    config := Config{\n        URL: url, List: list, JSFile: jsFile, Output: output, Regex: regex,\n        Cookies: cookies, Proxy: proxy, Threads: threads,\n        Quiet: quiet, Help: help, Update: update, ExtractEndpoints: extractEndpoints,\n        SkipTLS: skipTLS, FoundOnly: foundOnly,\n        Headers: headers, UserAgent: finalUserAgent, UserAgents: userAgentsList, RateLimit: rateLimit,\n        Timeout: timeout, Retry: retry,\n        Deobfuscate: deobfuscate, SourceMap: sourceMap, Eval: eval, ObfsDetect: obfsDetect,\n        Secrets: secrets, Tokens: tokens, Params: params, ParamURLs: paramURLs, Internal: internal,\n        GraphQL: graphql, Bypass: bypass, Firebase: firebase, Links: links,\n        CrawlDepth: crawlDepth, Domain: domain, Ext: ext,\n        JSON: jsonOut, CSV: csvOut, Verbose: verbose, Burp: 
burp,\n        MinConfidence: minConfidence, ShowConfidence: showConfidence,\n        NoFPFilter: noFPFilter, SelfTest: selfTest,\n        MaxBytes: maxBytes, AllowInternal: allowInternal,\n        Verify: verify, VerifyTimeout: verifyTimeout, PerHost: perHost,\n        Stats: stats, RulesFile: rulesFile,\n        SARIF: sarifOut, NDJSON: ndjsonOut,\n        IgnoreFile: ignoreFile, DiffFile: diffFile,\n        OnlyRules: onlyRules, DisableRule: disableRule,\n        HARFile: harFile, NoColor: noColor,\n        CacheDir: cacheDir, Robots: robotsMode,\n        InlineHTML: inlineHTML, CSPOrigins: cspOrigins,\n        VerifyWorkers: verifyWorkers,\n    }\n\n    // Initialize the run-wide stats struct lazily; counters are no-op when\n    // --stats isn't requested but Inc() calls remain cheap and uniform.\n    initStats()\n\n    // Disk cache: enabled when --cache-dir is set. Failure to mkdir is\n    // a hard error — silent fallback would surprise operators expecting\n    // 304s and finding full re-downloads instead.\n    if config.CacheDir != \"\" {\n        dc, err := NewDiskCache(config.CacheDir)\n        if err != nil {\n            fmt.Fprintf(os.Stderr, \"[%sERROR%s] cache-dir: %v\\n\", colors[\"RED\"], colors[\"NC\"], err)\n            os.Exit(2)\n        }\n        config.Cache = dc\n        if !quiet {\n            fmt.Fprintf(os.Stderr, \"[%sINFO%s] disk cache active at %s\\n\",\n                colors[\"CYAN\"], colors[\"NC\"], config.CacheDir)\n        }\n    }\n\n    // --robots: opt-in fetch of /robots.txt for each unique host in the\n    // input. 
Prints disallowed paths and sitemap references; does NOT\n    // make JSHunter respect them on subsequent fetches.\n    if config.Robots {\n        runRobotsCLI(&config)\n        return\n    }\n\n    // Load .jshunterignore if specified — operator-managed permanent\n    // suppression of known-noise findings.\n    if config.IgnoreFile != \"\" {\n        il, err := LoadIgnoreFile(config.IgnoreFile)\n        if err != nil {\n            fmt.Fprintf(os.Stderr, \"[%sERROR%s] ignore file: %v\\n\", colors[\"RED\"], colors[\"NC\"], err)\n            os.Exit(2)\n        }\n        config.IgnoreSet = il\n        activeIgnoreList = il\n        if !quiet && il != nil {\n            fmt.Fprintf(os.Stderr, \"[%sINFO%s] loaded %d ignore entries\\n\",\n                colors[\"CYAN\"], colors[\"NC\"], len(il.Entries))\n        }\n    }\n\n    // Load --diff baseline. New findings only — anything in the previous\n    // envelope (by value_hash) is suppressed so CI gates can fail on\n    // genuine regressions.\n    if config.DiffFile != \"\" {\n        seen, err := DiffPrevious(config.DiffFile)\n        if err != nil {\n            fmt.Fprintf(os.Stderr, \"[%sERROR%s] diff: %v\\n\", colors[\"RED\"], colors[\"NC\"], err)\n            os.Exit(2)\n        }\n        config.DiffSeen = seen\n        activeDiffSeen = seen\n        if !quiet {\n            fmt.Fprintf(os.Stderr, \"[%sINFO%s] diff baseline carries %d known finding hashes\\n\",\n                colors[\"CYAN\"], colors[\"NC\"], len(seen))\n        }\n    }\n\n    // HAR ingestion is mutually exclusive with URL/list/file fetch paths;\n    // when --har is set we shortcut the dispatch entirely.\n    if config.HARFile != \"\" {\n        n, err := IngestHAR(config.HARFile, &config)\n        if err != nil {\n            fmt.Fprintf(os.Stderr, \"[%sERROR%s] har: %v\\n\", colors[\"RED\"], colors[\"NC\"], err)\n            os.Exit(2)\n        }\n        if !quiet {\n            fmt.Fprintf(os.Stderr, \"[%sINFO%s] HAR scan 
complete: %d JS entries\\n\",\n                colors[\"CYAN\"], colors[\"NC\"], n)\n        }\n        emitFinalOutput(&config)\n        return\n    }\n\n    if help {\n        customHelp()\n        return\n    }\n\n    if update {\n        updateTool()\n        return\n    }\n\n    if config.URL == \"\" && config.List == \"\" && config.JSFile == \"\" {\n        if isInputFromStdin() {\n            // Show ASCII art before processing stdin if not quiet\n            if !config.Quiet {\n                time.Sleep(100 * time.Millisecond)\n                displayAsciiArt()\n            }\n            \n            // Read all stdin content\n            stdinContent, err := io.ReadAll(os.Stdin)\n            if err != nil {\n                if !config.Quiet {\n                    fmt.Fprintf(os.Stderr, \"Error reading from stdin: %v\\n\", err)\n                }\n                return\n            }\n            \n            content := string(stdinContent)\n            \n            // Check if it looks like a list of URLs (each line is a URL)\n            lines := strings.Split(content, \"\\n\")\n            urlCount := 0\n            jsLineCount := 0\n            totalLines := 0\n            \n            for _, line := range lines {\n                line = strings.TrimSpace(line)\n                if line == \"\" {\n                    continue\n                }\n                totalLines++\n                \n                // Check if line looks like a URL\n                if strings.HasPrefix(line, \"http://\") || strings.HasPrefix(line, \"https://\") {\n                    urlCount++\n                }\n                // Check if line looks like JavaScript\n                if strings.Contains(line, \"function\") || \n                   strings.Contains(line, \"const \") || \n                   strings.Contains(line, \"let \") || \n                   strings.Contains(line, \"var \") ||\n                   strings.Contains(line, \"URLSearchParams\") ||\n        
           strings.Contains(line, \".get(\") ||\n                   strings.Contains(line, \"fetch(\") ||\n                   strings.Contains(line, \"axios.\") ||\n                   strings.Contains(line, \"//\") ||\n                   strings.Contains(line, \"/*\") {\n                    jsLineCount++\n                }\n            }\n            \n            // Determine if it's JavaScript or URL list\n            // Priority: If most lines are URLs, always treat as URL list (process each URL)\n            isJavaScript := false\n            \n            if totalLines > 0 {\n                urlRatio := float64(urlCount) / float64(totalLines)\n                \n                // If more than 50% are URLs, treat as URL list (process each URL individually)\n                if urlRatio > 0.5 {\n                    isJavaScript = false\n                } else if config.ParamURLs || config.Params {\n                    // With -PU/-P, stdin that is not mostly URLs is always treated as\n                    // JavaScript, whether or not it matches the JS heuristics.\n                    isJavaScript = true\n                } else {\n                    // Without -PU/-P, check if it's JavaScript\n                    if jsLineCount > 5 || strings.Contains(content, \"function \") {\n                        isJavaScript = true\n                    }\n                }\n            }\n            \n            if isJavaScript {\n                // Process as JavaScript content directly\n              
  source := \"stdin\"\n                bodyBytes := []byte(content)\n                \n                if config.ParamURLs {\n                    paramURLs := extractURLParamsWithBaseURLs(content, source)\n                    if len(paramURLs) > 0 {\n                        globalSeenMutex.Lock()\n                        globalFoundAny = true // Mark that we found something\n                        for _, paramURL := range paramURLs {\n                            if !globalSeenAll[paramURL] {\n                                globalSeenAll[paramURL] = true\n                                fmt.Println(paramURL)\n                            }\n                        }\n                        globalSeenMutex.Unlock()\n                    }\n                } else if config.ExtractEndpoints {\n                    endpoints := extractEndpointsFromContent(content, config.Regex, \"\")\n                    displayEndpoints(endpoints, source)\n                } else {\n                    // Process as sensitive data search - use reportMatchesWithConfig directly\n                    reportMatchesWithConfig(source, bodyBytes, &config)\n                }\n            } else {\n                // Treat each line as URL/file path (old behavior)\n                scanner := bufio.NewScanner(strings.NewReader(content))\n                for scanner.Scan() {\n                    inputURL := strings.TrimSpace(scanner.Text())\n                    if inputURL == \"\" {\n                        continue\n                    }\n                    \n                    if config.ExtractEndpoints {\n                        processInputsForEndpointsWithConfig(inputURL, &config)\n                    } else {\n                        processInputsWithConfig(inputURL, &config)\n                    }\n                }\n                if err := scanner.Err(); err != nil {\n                    if !config.Quiet {\n                        fmt.Fprintln(os.Stderr, \"Error reading from stdin:\", 
err)\n                    }\n                }\n            }\n            return\n        }\n        customHelp()\n        os.Exit(1)\n    }\n\n    if !config.Quiet {\n        time.Sleep(100 * time.Millisecond)\n        displayAsciiArt()\n    }\n\n    if config.Quiet {\n        disableColors()\n    }\n\n    if config.JSFile != \"\" {\n        if config.ExtractEndpoints {\n            processJSFileForEndpointsWithConfig(config.JSFile, &config)\n        } else {\n            processJSFileWithConfig(config.JSFile, &config)\n        }\n        return \n    }\n\n    if config.ExtractEndpoints && (config.URL != \"\" || config.List != \"\") {\n        processInputsForEndpointsWithConfig(config.URL, &config)\n    } else {\n        processInputsWithConfig(config.URL, &config)\n    }\n}\n\n\n// runRobotsCLI fetches /robots.txt for each unique host in --url/--list\n// input and prints the parsed Disallow / Allow / Sitemap lines. This is a\n// pure recon helper: JSHunter does not honor robots.txt for its own fetch\n// path — operators who want compliance can pipe these paths back in.\nfunc runRobotsCLI(config *Config) {\n    client := createHTTPClientWithConfig(config)\n    hosts := collectHostsFromInput(config)\n    if len(hosts) == 0 {\n        fmt.Fprintf(os.Stderr, \"[%sWARN%s] --robots: no input URLs to inspect (use -u or -l)\\n\",\n            colors[\"YELLOW\"], colors[\"NC\"])\n        return\n    }\n    for _, h := range hosts {\n        r, err := FetchRobots(client, h, config.UserAgent)\n        if err != nil {\n            fmt.Fprintf(os.Stderr, \"[%sROBOTS%s] %s: %v\\n\", colors[\"YELLOW\"], colors[\"NC\"], h, err)\n            continue\n        }\n        if r == nil {\n            fmt.Printf(\"# %s — no robots.txt\\n\", h)\n            continue\n        }\n        fmt.Printf(\"# %s\\n\", r.URL)\n        for _, p := range r.Disallow {\n            fmt.Printf(\"Disallow %s%s\\n\", strings.TrimRight(h, \"/\"), p)\n        }\n        for _, p := range r.Allow {\n    
        fmt.Printf(\"Allow    %s%s\\n\", strings.TrimRight(h, \"/\"), p)\n        }\n        for _, s := range r.Sitemaps {\n            fmt.Printf(\"Sitemap  %s\\n\", s)\n        }\n    }\n}\n\n// collectHostsFromInput dedupes the scheme://host roots from --url and --list\n// so each host is asked for /robots.txt exactly once.\nfunc collectHostsFromInput(config *Config) []string {\n    seen := map[string]struct{}{}\n    out := []string{}\n    add := func(u string) {\n        u = strings.TrimSpace(u)\n        if !strings.HasPrefix(u, \"http://\") && !strings.HasPrefix(u, \"https://\") {\n            return\n        }\n        parsed, err := url.Parse(u)\n        if err != nil {\n            return\n        }\n        root := parsed.Scheme + \"://\" + parsed.Host\n        if _, ok := seen[root]; !ok {\n            seen[root] = struct{}{}\n            out = append(out, root)\n        }\n    }\n    if config.URL != \"\" {\n        add(config.URL)\n    }\n    if config.List != \"\" {\n        f, err := os.Open(config.List)\n        if err != nil {\n            // Report the failure so an unreadable list is not mistaken for empty input.\n            fmt.Fprintf(os.Stderr, \"[%sWARN%s] --robots: cannot open list %s: %v\\n\",\n                colors[\"YELLOW\"], colors[\"NC\"], config.List, err)\n            return out\n        }\n        defer f.Close()\n        sc := bufio.NewScanner(f)\n        for sc.Scan() {\n            add(sc.Text())\n        }\n        if err := sc.Err(); err != nil {\n            fmt.Fprintf(os.Stderr, \"[%sWARN%s] --robots: error reading list %s: %v\\n\",\n                colors[\"YELLOW\"], colors[\"NC\"], config.List, err)\n        }\n    }\n    return out\n}\n\n// emitFinalOutput is the run-shutdown hook. Order matters:\n//  1. Verification (if --verify): per-finding probes first, then AWS pair\n//     verification, which needs the dedupe table to be fully populated so\n//     AKID + secret in the same source can be paired.\n//  2. SARIF output (if --sarif).\n//  3. NDJSON output (if --ndjson).\n//  4. 
Stats summary (if --stats).\n// SARIF and NDJSON are mutually permissive — both can be emitted in the\n// same run, but operators typically pick one.\nfunc emitFinalOutput(config *Config) {\n    if config.Verify {\n        verifyClient := createHTTPClientWithConfig(config)\n        timeout := time.Duration(config.VerifyTimeout) * time.Second\n        if timeout <= 0 {\n            timeout = 10 * time.Second\n        }\n\n        // Per-finding verifiers run concurrently across a bounded worker\n        // pool. Per-host limits still apply via verifyHostLimiter.\n        all := flushFindings()\n        if len(all) > 0 {\n            VerifyAllConcurrent(all, verifyClient, timeout, config.VerifyWorkers)\n        }\n\n        // AWS pair verification is separate: the SigV4 path is paired\n        // (AKID + secret) and lives outside the per-finding verifier map.\n        for _, p := range pairAWSCredentials() {\n            if globalStats != nil {\n                statInc(&globalStats.VerifyAttempts)\n            }\n            ctx, cancel := context.WithTimeout(context.Background(), timeout)\n            res := verifyAWSPair(ctx, verifyClient, p)\n            cancel()\n            findingsMutex.Lock()\n            for _, f := range findingsByHash {\n                if (f.RuleID == \"aws.access_key_id\" && f.Value == p.AccessKeyID) ||\n                    (f.RuleID == \"aws.secret_access_key\" && f.Value == p.SecretAccessKey) {\n                    f.Verify = &res\n                    if res.Alive {\n                        f.Verified = true\n                        f.Confidence = 1.0\n                    }\n                }\n            }\n            findingsMutex.Unlock()\n            switch {\n            case res.Alive && globalStats != nil:\n                statInc(&globalStats.VerifyAlive)\n            case res.Error != \"\" && globalStats != nil:\n                statInc(&globalStats.VerifyError)\n            case globalStats != nil:\n                
statInc(&globalStats.VerifyDead)\n            }\n        }\n    }\n    if config.SARIF {\n        outputSARIF()\n    }\n    if config.NDJSON {\n        outputNDJSON()\n    }\n    if config.Stats {\n        printStats(globalStats)\n    }\n}\n\n// runSelfTestCLI exercises the curated rule registry against its embedded\n// TP/FP fixtures and prints per-rule counts of TP fixtures matched and FP\n// fixtures rejected. Exits with non-zero status if any rule fails, which is\n// useful in CI to gate detector regressions.\nfunc runSelfTestCLI() {\n    results := runSelfTest()\n    overallOK := true\n    fmt.Printf(\"[%sSELF-TEST%s] JSHunter %s rule registry\\n\", colors[\"BLUE\"], colors[\"NC\"], version)\n    for _, r := range results {\n        statusColor := colors[\"GREEN\"]\n        statusText := \"PASS\"\n        if !r.OK {\n            statusColor = colors[\"RED\"]\n            statusText = \"FAIL\"\n            overallOK = false\n        }\n        fmt.Printf(\"  [%s%s%s] %-40s  TP %d/%d  FP %d/%d\\n\",\n            statusColor, statusText, colors[\"NC\"],\n            r.Name, r.TPPassed, r.TPTotal, r.FPCaught, r.FPTotal)\n        for _, n := range r.Notes {\n            fmt.Printf(\"        %s%s%s %s\\n\", colors[\"YELLOW\"], \"·\", colors[\"NC\"], n)\n        }\n    }\n    if !overallOK {\n        os.Exit(1)\n    }\n}\n\n// looksLikeHTMLContentType is the content-type-only sibling of\n// looksLikeHTML; used pre-body-read so we can decide whether to take the\n// inline-HTML path before allocating the body slice.\nfunc looksLikeHTMLContentType(ct string) bool {\n    if ct == \"\" {\n        return false\n    }\n    low := strings.ToLower(ct)\n    if i := strings.Index(low, \";\"); i != -1 {\n        low = low[:i]\n    }\n    low = strings.TrimSpace(low)\n    return low == \"text/html\" || low == \"application/xhtml+xml\"\n}\n\n// scanHTMLArtifacts pulls inline <script> bodies and SRI/CSP metadata out\n// of an HTML response and feeds each through reportMatchesWithConfig under\n// a synthetic source label 
so operators can locate the exact tag.\nfunc scanHTMLArtifacts(pageURL string, body []byte, config *Config) {\n    arts, err := ExtractFromHTML(body)\n    if err != nil {\n        if config.Verbose {\n            fmt.Printf(\"[%sHTML%s] %s: %v\\n\", colors[\"YELLOW\"], colors[\"NC\"], pageURL, err)\n        }\n        return\n    }\n    for _, sc := range arts.InlineScripts {\n        // application/ld+json is structured data; still scan because it\n        // sometimes carries access tokens or webhook URLs.\n        src := fmt.Sprintf(\"%s#inline[%d]\", pageURL, sc.Index)\n        if globalStats != nil {\n            statAdd(&globalStats.BytesParsed, int64(len(sc.Body)))\n        }\n        processed := processJSAnalysis([]byte(sc.Body), config)\n        reportMatchesWithConfig(src, processed, config)\n    }\n    if config.CSPOrigins && len(arts.CSPOrigins) > 0 {\n        emitCSPOrigins(pageURL+\"#meta-csp\", arts.CSPOrigins)\n    }\n    if config.Verbose && len(arts.ExternalJS) > 0 {\n        fmt.Printf(\"[%sHTML%s] %s: %d external scripts referenced\\n\",\n            colors[\"CYAN\"], colors[\"NC\"], pageURL, len(arts.ExternalJS))\n    }\n}\n\n// emitCSPOrigins prints CSP-allowed origins one per line, prefixed with\n// the source for grep-friendly piping into the URL queue of a follow-up\n// scan. Concise and pipeline-friendly.\nfunc emitCSPOrigins(source string, origins []string) {\n    for _, o := range origins {\n        fmt.Printf(\"[CSP] %s\\t%s\\n\", source, o)\n    }\n}\n\n// validateTargetURL refuses internal/loopback/private targets unless the user\n// explicitly opts in. Recon tools that follow links are SSRF-prone if they\n// blindly fetch any URL the input feeds them; this gate is the smallest\n// useful guard. 
Only http/https are allowed; file:// is always rejected.\nfunc validateTargetURL(urlStr string, allowInternal bool) error {\n    if !strings.HasPrefix(urlStr, \"http://\") && !strings.HasPrefix(urlStr, \"https://\") {\n        return fmt.Errorf(\"only http(s) URLs are permitted; got %q\", urlStr)\n    }\n    parsed, err := url.Parse(urlStr)\n    if err != nil {\n        return fmt.Errorf(\"malformed URL: %w\", err)\n    }\n    host := parsed.Hostname()\n    if host == \"\" {\n        return fmt.Errorf(\"URL has no host\")\n    }\n    if allowInternal {\n        return nil\n    }\n    lowHost := strings.ToLower(host)\n    if lowHost == \"localhost\" || lowHost == \"ip6-localhost\" || lowHost == \"ip6-loopback\" {\n        return fmt.Errorf(\"internal target %q blocked (use --allow-internal to override)\", host)\n    }\n    if ip := net.ParseIP(host); ip != nil {\n        if ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast() ||\n            ip.IsLinkLocalMulticast() || ip.IsUnspecified() {\n            return fmt.Errorf(\"internal IP %q blocked (use --allow-internal to override)\", host)\n        }\n    }\n    return nil\n}\n\nfunc displayAsciiArt() {\n    versionStatus := getVersionStatus()\n    var statusColor string\n    var statusText string\n    \n    switch versionStatus {\n    case \"latest\":\n        statusColor = colors[\"GREEN\"]\n        statusText = \"latest\"\n    case \"outdated\":\n        statusColor = colors[\"RED\"]\n        statusText = \"outdated\"\n    default:\n        statusColor = colors[\"YELLOW\"]\n        statusText = \"Unknown\"\n    }\n    \n    fmt.Printf(`\n         ________             __         \n     __ / / __/ /  __ _____  / /____ ____\n    / // /\\ \\/ _ \\/ // / _ \\/ __/ -_) __/\n    \\___/___/_//_/\\_,_/_//_/\\__/\\__/_/  \n\n     %s (%s%s%s%s)                         Created by cc1a2b\n`, version, statusColor, statusText, colors[\"NC\"], \"\")\n}\n\nfunc customHelp() {\n    displayAsciiArt()\n    
fmt.Println(\"Usage:\")\n    fmt.Println(\"  -u,  --url URL                Input a URL\")\n    fmt.Println(\"  -l,  --list FILE.txt          Input a file with URLs (.txt)\")\n    fmt.Println(\"  -f,  --file FILE.js           Path to JavaScript file\")\n    fmt.Println(\"       --har FILE               Ingest a Chrome DevTools HAR archive\")\n    fmt.Println()\n    fmt.Println(\"Basic Options:\")\n    fmt.Println(\"  -t,  --threads INT            Number of concurrent threads (default: 5)\")\n    fmt.Println(\"  -c,  --cookies <cookies>      Authentication cookies for protected resources\")\n    fmt.Println(\"  -p,  --proxy host:port        HTTP/SOCKS5 proxy (e.g., 127.0.0.1:8080 for Burp Suite)\")\n    fmt.Println(\"  -q,  --quiet                  Suppress ASCII art output\")\n    fmt.Println(\"       --no-color               Disable ANSI color (auto-off when not a TTY)\")\n    fmt.Println(\"  -o,  --output FILENAME        Output file path (full values, not redacted)\")\n    fmt.Println(\"  -r,  --regex <pattern>        RegEx for filtering results\")\n    fmt.Println(\"       --update, --up           Update the tool to latest version\")\n    fmt.Println(\"  -ep, --end-point              Extract endpoints from JavaScript files\")\n    fmt.Println(\"  -k,  --skip-tls               Skip TLS certificate verification\")\n    fmt.Println(\"  -fo, --found-only             Only show results when sensitive data is found\")\n    fmt.Println()\n    fmt.Println(\"HTTP Configuration:\")\n    fmt.Println(\"  -H,  --header \\\"Key: Value\\\"    Custom HTTP headers (repeatable, including Auth)\")\n    fmt.Println(\"  -U,  --user-agent UA          Custom User-Agent string or file path\")\n    fmt.Println(\"  -R,  --rate-limit MS          Request rate limiting delay (milliseconds)\")\n    fmt.Println(\"  -T,  --timeout SEC            HTTP request timeout (seconds)\")\n    fmt.Println(\"  -y,  --retry INT              Retry attempts for failed requests (default: 2)\")\n    
fmt.Println(\"       --per-host INT           Per-host outbound concurrency cap (default: 4)\")\n    fmt.Println(\"       --max-bytes N            Cap response body read in bytes (default: 32MiB)\")\n    fmt.Println(\"       --allow-internal         Permit localhost / RFC1918 / link-local targets\")\n    fmt.Println(\"       --cache-dir DIR          Persist responses on disk; revalidate via ETag\")\n    fmt.Println()\n    fmt.Println(\"JavaScript Analysis:\")\n    fmt.Println(\"  -d,  --deobfuscate            Deobfuscate minified and obfuscated JavaScript\")\n    fmt.Println(\"  -m,  --sourcemap              Fetch and parse source maps + sourcesContent[]\")\n    fmt.Println(\"  -e,  --eval                   Analyze dynamic code execution (eval, Function)\")\n    fmt.Println(\"  -z,  --obfs-detect            Detect code obfuscation patterns and techniques\")\n    fmt.Println(\"       --inline-html            Scan inline <script> tags + SRI/CSP in HTML responses\")\n    fmt.Println(\"       --csp-origins            Emit CSP-allowed origins as candidate endpoints\")\n    fmt.Println()\n    fmt.Println(\"Security Analysis:\")\n    fmt.Println(\"  -s,  --secrets                Detect API keys, tokens, and credentials\")\n    fmt.Println(\"  -x,  --tokens                 Extract JWT and authentication tokens\")\n    fmt.Println(\"  -P,  --params                 Discover hidden parameters and variables\")\n    fmt.Println(\"  -PU, --param-urls             Advanced parameter extraction with URL context\")\n    fmt.Println(\"  -i,  --internal               Filter for internal/private endpoints\")\n    fmt.Println(\"  -g,  --graphql                Analyze GraphQL endpoints and queries\")\n    fmt.Println(\"  -B,  --bypass                 Detect WAF bypass patterns and techniques\")\n    fmt.Println(\"  -F,  --firebase               Analyze Firebase configurations and keys\")\n    fmt.Println(\"  -L,  --links                  Extract and analyze all embedded links\")\n    
fmt.Println()\n    fmt.Println(\"Detection Tuning:\")\n    fmt.Println(\"  -mc, --min-confidence FLOAT   Minimum confidence (0.0-1.0) for a finding (default: 0.50)\")\n    fmt.Println(\"  -sc, --show-confidence        Print [conf=X.XX] alongside each finding\")\n    fmt.Println(\"       --no-fp-filter           Disable the false-positive filter (debug)\")\n    fmt.Println(\"       --ignore-file FILE       Permanent suppressions (.jshunterignore)\")\n    fmt.Println(\"       --diff PREVIOUS.json     Report only NEW findings vs previous JSON envelope\")\n    fmt.Println(\"       --rules-file FILE.json   Load an external JSON rule pack\")\n    fmt.Println(\"       --only-rules id,glob     Run only matching rules (supports * glob)\")\n    fmt.Println(\"       --disable-rule id,glob   Disable matching rules (supports * glob)\")\n    fmt.Println()\n    fmt.Println(\"Verification:\")\n    fmt.Println(\"       --verify                 Probe findings against provider read-only endpoints\")\n    fmt.Println(\"       --verify-timeout SEC     Timeout per verification probe (default: 10)\")\n    fmt.Println(\"       --verify-workers INT     Concurrent verifier worker pool (default: 8)\")\n    fmt.Println()\n    fmt.Println(\"Scope & Discovery:\")\n    fmt.Println(\"  -w,  --crawl DEPTH            Recursive JavaScript discovery depth (default: 1)\")\n    fmt.Println(\"  -D,  --domain DOMAIN          Limit analysis to specific domain\")\n    fmt.Println(\"  -E,  --ext                    Filter by JavaScript file extensions\")\n    fmt.Println(\"       --robots                 Fetch /robots.txt for each input host and exit\")\n    fmt.Println()\n    fmt.Println(\"Output Formats:\")\n    fmt.Println(\"  -j,  --json                   Structured JSON output (schema_version 2)\")\n    fmt.Println(\"       --ndjson                 Newline-delimited JSON (jq / SIEM streaming)\")\n    fmt.Println(\"       --sarif                  SARIF 2.1.0 (GitHub code-scanning compatible)\")\n    
fmt.Println(\"  -C,  --csv                    CSV format for spreadsheet analysis\")\n    fmt.Println(\"  -v,  --verbose                Detailed analysis and debug output\")\n    fmt.Println(\"  -n,  --burp                   Burp Suite compatible export format\")\n    fmt.Println(\"       --stats                  Per-stage counters on stderr at end of run\")\n    fmt.Println()\n    fmt.Println(\"Registry:\")\n    fmt.Println(\"       --list-rules             Print the rule registry as a table and exit\")\n    fmt.Println(\"       --explain RULE_ID        Print full rule details and exit\")\n    fmt.Println(\"       --self-test              Run rule registry against built-in TP/FP fixtures\")\n    fmt.Println()\n    fmt.Println(\"  -h,  --help                   Display this help message\")\n}\n\nfunc processStdin(output, regex, cookies, proxy string, threads int) {\n    scanner := bufio.NewScanner(os.Stdin)\n    for scanner.Scan() {\n        line := scanner.Text()\n        fmt.Println(\"Processing line from stdin:\", line)\n\n    }\n    if err := scanner.Err(); err != nil {\n        fmt.Fprintln(os.Stderr, \"Error reading from stdin:\", err)\n    }\n}\n\n\nfunc isInputFromStdin() bool {\n    fi, err := os.Stdin.Stat()\n    if err != nil {\n        fmt.Println(\"Error checking stdin:\", err)\n        return false\n    }\n    return fi.Mode()&os.ModeCharDevice == 0\n}\n\n// isStdoutTTY returns true when stdout is a terminal — used to auto-disable\n// ANSI color when the operator is piping or redirecting output. 
The\n// `os.ModeCharDevice` check approximates POSIX `isatty(3)`; other character\n// devices such as /dev/null also pass it.\nfunc isStdoutTTY() bool {\n    fi, err := os.Stdout.Stat()\n    if err != nil {\n        return false\n    }\n    return fi.Mode()&os.ModeCharDevice != 0\n}\n\nfunc disableColors() {\n    for k := range colors {\n        colors[k] = \"\"\n    }\n}\n\n\nfunc processJSFile(jsFile, regex string) {\n    // Create minimal config for backward compatibility\n    config := &Config{\n        Regex: regex,\n    }\n    processJSFileWithConfig(jsFile, config)\n}\n\n\nfunc processInputs(url, list, output, regex, cookie, proxy string, threads int, skipTLS, foundOnly bool) {\n    // Create config for backward compatibility\n    config := &Config{\n        URL: url, List: list, Output: output, Regex: regex,\n        Cookies: cookie, Proxy: proxy, Threads: threads,\n        SkipTLS: skipTLS, FoundOnly: foundOnly,\n        Timeout: 30, Retry: 2,\n    }\n    processInputsWithConfig(url, config)\n}\n\nfunc processInputsOld(url, list, output, regex, cookie, proxy string, threads int, skipTLS, foundOnly bool) {\n    var wg sync.WaitGroup\n    urlChannel := make(chan string)\n\n    var fileWriter *os.File\n    if output != \"\" {\n        var err error\n        fileWriter, err = os.Create(output)\n        if err != nil {\n            fmt.Printf(\"Error creating output file: %v\\n\", err)\n            return\n        }\n        defer fileWriter.Close()\n    }\n\n    for i := 0; i < threads; i++ {\n        wg.Add(1)\n        go func() {\n            defer wg.Done()\n            for u := range urlChannel {\n                // Create minimal config for each request\n                config := &Config{\n                    Regex: regex, Cookies: cookie, Proxy: proxy,\n                    SkipTLS: skipTLS, FoundOnly: foundOnly,\n                    Timeout: 30, Retry: 1,\n                }\n                _, sensitiveData := searchForSensitiveDataWithConfig(u, config)\n\n                // Don't 
print sensitive data if ParamURLs flag is set (user only wants URL params)\n                if !config.ParamURLs {\n                    if fileWriter != nil {\n                        fmt.Fprintln(fileWriter, \"URL:\", u)\n                        for name, matches := range sensitiveData {\n                            for _, match := range matches {\n                                fmt.Fprintf(fileWriter, \"Sensitive Data [%s%s%s]: %s\\n\", colors[\"YELLOW\"], name, colors[\"NC\"], match)\n                            }\n                        }\n                    } else {\n                        for name, matches := range sensitiveData {\n                            for _, match := range matches {\n                                fmt.Printf(\"Sensitive Data [%s%s%s]: %s\\n\", colors[\"YELLOW\"], name, colors[\"NC\"], match)\n                            }\n                        }\n                    }\n                }\n            }\n        }()\n    }\n\n    if err := enqueueURLs(url, list, urlChannel, regex); err != nil {\n        fmt.Printf(\"Error in input processing: %v\\n\", err)\n        close(urlChannel)\n        return\n    }\n\n    close(urlChannel)\n    wg.Wait()\n    \n    // Print buffered MISSING messages only if no findings were made\n    // This is for the old/legacy function - always clear buffer\n    globalSeenMutex.Lock()\n    foundAny := globalFoundAny\n    globalSeenMutex.Unlock()\n    \n    if !foundAny && !foundOnly {\n        missingMutex.Lock()\n        for _, msg := range missingMessages {\n            fmt.Printf(\"[%sMISSING%s] No sensitive data found at: %s\\n\", colors[\"BLUE\"], colors[\"NC\"], msg)\n        }\n        missingMessages = missingMessages[:0] // Clear the buffer\n        missingMutex.Unlock()\n    } else {\n        // Clear the buffer if findings were made\n        missingMutex.Lock()\n        missingMessages = missingMessages[:0]\n        missingMutex.Unlock()\n    }\n}\n\n\nfunc enqueueURLs(url, list string, 
urlChannel chan<- string, regex string) error {\n    if list != \"\" {\n        return enqueueFromFile(list, urlChannel)\n    } else if url != \"\" {\n        enqueueSingleURL(url, urlChannel, regex)\n    } else {\n        enqueueFromStdin(urlChannel)\n    }\n    return nil\n}\n\nfunc enqueueFromFile(filename string, urlChannel chan<- string) error {\n    file, err := os.Open(filename)\n    if err != nil {\n        return fmt.Errorf(\"error opening file: %w\", err)\n    }\n    defer file.Close()\n\n    scanner := bufio.NewScanner(file)\n    for scanner.Scan() {\n        urlChannel <- scanner.Text()\n    }\n    return scanner.Err()\n}\n\nfunc enqueueSingleURL(url string, urlChannel chan<- string, regex string) {\n    if strings.HasPrefix(url, \"http://\") || strings.HasPrefix(url, \"https://\") {\n        urlChannel <- url\n    } else {\n        processJSFile(url, regex)\n    }\n}\n\nfunc enqueueFromStdin(urlChannel chan<- string) {\n    scanner := bufio.NewScanner(os.Stdin)\n    for scanner.Scan() {\n        urlChannel <- scanner.Text()\n    }\n    if err := scanner.Err(); err != nil {\n        fmt.Printf(\"Error reading from stdin: %v\\n\", err)\n    }\n}\n\n\n// isTLSCanceledError checks if an error is a TLS cancellation error (common with proxy interception)\nfunc isTLSCanceledError(err error) bool {\n    if err == nil {\n        return false\n    }\n    errStr := strings.ToLower(err.Error())\n    // Check for various TLS and connection errors that can occur with proxy interception\n    return strings.Contains(errStr, \"tls: user canceled\") || \n           strings.Contains(errStr, \"user canceled\") ||\n           strings.Contains(errStr, \"tls: handshake failure\") ||\n           strings.Contains(errStr, \"remote error: tls\") ||\n           strings.Contains(errStr, \"connection reset\") ||\n           err == io.EOF // EOF can occur when proxy closes connection\n}\n\n// isJavaScriptContentType checks if the Content-Type header indicates JavaScript 
content\nfunc isJavaScriptContentType(contentType string) bool {\n    if contentType == \"\" {\n        return false\n    }\n    contentType = strings.ToLower(strings.TrimSpace(contentType))\n    // Remove charset and other parameters (e.g., \"application/javascript; charset=utf-8\")\n    if idx := strings.Index(contentType, \";\"); idx != -1 {\n        contentType = contentType[:idx]\n    }\n    contentType = strings.TrimSpace(contentType)\n    \n    // Common JavaScript MIME types\n    jsTypes := []string{\n        \"application/javascript\",\n        \"application/x-javascript\",\n        \"text/javascript\",\n        \"text/ecmascript\",\n        \"application/ecmascript\",\n    }\n    \n    for _, jsType := range jsTypes {\n        if contentType == jsType {\n            return true\n        }\n    }\n    \n    return false\n}\n\n// isValidStatusCode checks if the HTTP status code indicates a successful response\nfunc isValidStatusCode(statusCode int) bool {\n    // Accept 2xx status codes (successful responses)\n    return statusCode >= 200 && statusCode < 300\n}\n\n// isNonJavaScriptContentType checks if Content-Type indicates non-JavaScript content that should be filtered out\nfunc isNonJavaScriptContentType(contentType string) bool {\n    if contentType == \"\" {\n        return false // Unknown content type, check URL extension instead\n    }\n    contentType = strings.ToLower(strings.TrimSpace(contentType))\n    // Remove charset and other parameters (e.g., \"text/html; charset=utf-8\")\n    if idx := strings.Index(contentType, \";\"); idx != -1 {\n        contentType = contentType[:idx]\n    }\n    contentType = strings.TrimSpace(contentType)\n    \n    // Non-JavaScript content types to filter out\n    nonJSTypes := []string{\n        // HTML\n        \"text/html\",\n        \"application/xhtml+xml\",\n        // CSS\n        \"text/css\",\n        // Plain text\n        \"text/plain\",\n        // JSON (unless it's a JS file with wrong content-type)\n 
       \"application/json\",\n        // XML\n        \"text/xml\",\n        \"application/xml\",\n        \"application/rss+xml\",\n        \"application/atom+xml\",\n        // Images\n        \"image/jpeg\",\n        \"image/jpg\",\n        \"image/png\",\n        \"image/gif\",\n        \"image/webp\",\n        \"image/svg+xml\",\n        \"image/x-icon\",\n        \"image/vnd.microsoft.icon\",\n        // Fonts\n        \"font/woff\",\n        \"font/woff2\",\n        \"application/font-woff\",\n        \"application/font-woff2\",\n        // Video\n        \"video/mp4\",\n        \"video/webm\",\n        \"video/ogg\",\n        // Audio\n        \"audio/mpeg\",\n        \"audio/ogg\",\n        \"audio/wav\",\n        // Documents\n        \"application/pdf\",\n        \"application/msword\",\n        \"application/vnd.ms-excel\",\n        // Other\n        \"application/octet-stream\",\n        \"application/x-www-form-urlencoded\",\n        \"multipart/form-data\",\n    }\n    \n    for _, nonJSType := range nonJSTypes {\n        if contentType == nonJSType {\n            return true\n        }\n    }\n    \n    // Filter out any text/* types that aren't JavaScript\n    if strings.HasPrefix(contentType, \"text/\") && !isJavaScriptContentType(contentType) {\n        return true\n    }\n    \n    return false\n}\n\n// shouldProcessResponse checks if the response should be processed based on Content-Type and status code\nfunc shouldProcessResponse(resp *http.Response, urlStr string, config *Config) bool {\n    // Check status code first - silently skip invalid status codes\n    if !isValidStatusCode(resp.StatusCode) {\n        return false\n    }\n    \n    // Check Content-Type\n    contentType := resp.Header.Get(\"Content-Type\")\n    \n    // If Content-Type explicitly indicates non-JavaScript, skip it\n    if isNonJavaScriptContentType(contentType) {\n        return false\n    }\n    \n    // If Content-Type is JavaScript, process it\n    if 
isJavaScriptContentType(contentType) {\n        return true\n    }\n    \n    // If Content-Type is unknown or missing, check URL extension as fallback\n    urlLower := strings.ToLower(urlStr)\n    hasJSExtension := strings.HasSuffix(urlLower, \".js\") || \n                     strings.Contains(urlLower, \".js?\") ||\n                     strings.Contains(urlLower, \".js&\") ||\n                     strings.Contains(urlLower, \".js#\")\n    \n    // Only process if URL has .js extension\n    return hasJSExtension\n}\n\nfunc searchForSensitiveData(urlStr, regex, cookie, proxyStr string, skipTLS, foundOnly bool) (string, map[string][]string) {\n    var client *http.Client\n\n    transport := &http.Transport{\n        TLSClientConfig: &tls.Config{InsecureSkipVerify: skipTLS},\n        DisableKeepAlives: false,\n        MaxIdleConns: 10,\n        IdleConnTimeout: 30 * time.Second,\n    }\n\n    var clientTimeout time.Duration = 30 * time.Second\n\n    if proxyStr != \"\" {\n        // Check if it's a SOCKS5 proxy\n        if strings.HasPrefix(proxyStr, \"socks5://\") || strings.HasPrefix(proxyStr, \"socks5h://\") {\n            // Parse SOCKS5 proxy\n            proxyURL, err := url.Parse(proxyStr)\n            if err != nil {\n                fmt.Printf(\"Invalid SOCKS5 proxy URL: %v\\n\", err)\n                return urlStr, nil\n            }\n\n            // Create SOCKS5 dialer\n            var auth *proxy.Auth\n            if proxyURL.User != nil {\n                password, _ := proxyURL.User.Password()\n                auth = &proxy.Auth{\n                    User:     proxyURL.User.Username(),\n                    Password: password,\n                }\n            }\n\n            dialer, err := proxy.SOCKS5(\"tcp\", proxyURL.Host, auth, proxy.Direct)\n            if err != nil {\n                fmt.Printf(\"Failed to create SOCKS5 dialer: %v\\n\", err)\n                return urlStr, nil\n            }\n\n            // Use context dialer for SOCKS5\n      
      transport.DialContext = func(ctx context.Context, network, addr string) (net.Conn, error) {\n                return dialer.Dial(network, addr)\n            }\n            transport.TLSClientConfig = &tls.Config{InsecureSkipVerify: true}\n            clientTimeout = 60 * time.Second\n        } else {\n            // HTTP/HTTPS proxy - Normalize proxy URL - add http:// if not present\n            proxyURLStr := proxyStr\n            if !strings.HasPrefix(proxyStr, \"http://\") && !strings.HasPrefix(proxyStr, \"https://\") {\n                proxyURLStr = \"http://\" + proxyStr\n            }\n\n            proxyURL, err := url.Parse(proxyURLStr)\n            if err != nil {\n                fmt.Printf(\"Invalid proxy URL: %v\\n\", err)\n                return urlStr, nil\n            }\n            transport.Proxy = http.ProxyURL(proxyURL)\n\n            // When using proxy (like Burp), skip TLS verification to avoid certificate issues\n            transport.TLSClientConfig = &tls.Config{InsecureSkipVerify: true}\n\n            // Increase timeout for proxy connections (Burp interception may take time)\n            clientTimeout = 60 * time.Second\n        }\n    }\n\n    client = &http.Client{\n        Transport: transport,\n        Timeout: clientTimeout,\n    }\n\n    var sensitiveData map[string][]string\n\n    if strings.HasPrefix(urlStr, \"http://\") || strings.HasPrefix(urlStr, \"https://\") {\n        req, err := http.NewRequest(\"GET\", urlStr, nil)\n        if err != nil {\n            fmt.Printf(\"Failed to create request for URL %s: %v\\n\", urlStr, err)\n            return urlStr, nil\n        }\n\n        if cookie != \"\" {\n            req.Header.Set(\"Cookie\", cookie)\n        }\n\n        resp, err := client.Do(req)\n        if err != nil {\n            // Suppress TLS user canceled errors when using proxy (Burp interception)\n            if proxyStr == \"\" || !isTLSCanceledError(err) {\n                // Only print if not using proxy or if 
it's a different error\n                fmt.Printf(\"Request failed for %s: %v\\n\", urlStr, err)\n            }\n            return urlStr, nil\n        }\n        defer resp.Body.Close()\n\n        // Filter: Only process JavaScript content (create minimal config for filtering)\n        minimalConfig := &Config{Verbose: false}\n        if !shouldProcessResponse(resp, urlStr, minimalConfig) {\n            return urlStr, nil\n        }\n\n        // Try to read the body, even if there might be an error (partial reads are useful)\n        body, err := io.ReadAll(resp.Body)\n        if err != nil {\n            // Suppress TLS user canceled errors when using proxy (Burp may close connection)\n            if proxyStr == \"\" || !isTLSCanceledError(err) {\n                // Only show error if we got no data at all\n                if len(body) == 0 {\n                    fmt.Printf(\"Error reading response body: %v\\n\", err)\n                    return urlStr, nil\n                }\n                // If we got some data, continue processing it despite the error\n            } else if len(body) == 0 {\n                // Proxy error and no data - silently return\n                return urlStr, nil\n            }\n            // If we have data (even partial), continue to process it\n        }\n\n        // Process the body even if there was a read error (might have partial data)\n        if len(body) > 0 {\n            sensitiveData = reportMatches(urlStr, body, regexPatterns, regex, foundOnly)\n        } else {\n            sensitiveData = make(map[string][]string)\n        }\n    } else {\n        body, err := os.ReadFile(urlStr)\n        if err != nil {\n            fmt.Printf(\"Error reading local file %s: %v\\n\", urlStr, err)\n            return urlStr, nil\n        }\n\n        sensitiveData = reportMatches(urlStr, body, regexPatterns, regex, foundOnly)\n    }\n\n    return urlStr, sensitiveData\n}\n\n\nfunc isUnwantedEmail(email string) bool {\n    unwantedPrefixes := []string{\n        \"info@\", \"career@\", \"careers@\", 
\"jobs@\", \"admin@\", \"support@\", \"contact@\",\n        \"help@\", \"noreply@\", \"no-reply@\", \"test@\", \"demo@\", \"example@\",\n        \"sales@\", \"marketing@\", \"press@\", \"media@\", \"feedback@\", \"hello@\",\n        \"team@\", \"hr@\", \"legal@\", \"privacy@\", \"abuse@\", \"postmaster@\",\n        \"webmaster@\", \"hostmaster@\", \"security@\", \"compliance@\", \"billing@\",\n        \"service@\", \"newsletter@\", \"notifications@\", \"alerts@\", \"noemail@\",\n        \"donotreply@\", \"do-not-reply@\", \"mailer@\", \"mail@\", \"email@\",\n        \"integration@\", \"api@\", \"dev@\", \"developer@\", \"developers@\",\n    }\n\n    unwantedDomains := []string{\n        \"example.com\", \"test.com\", \"localhost\", \"example.org\", \"example.net\",\n        \"domain.com\", \"email.com\", \"mail.com\", \"yoursite.com\", \"yourdomain.com\",\n        \"sentry.io\", \"sentry-next.wixpress.com\",\n    }\n    \n    email = strings.ToLower(email)\n    \n    // Check unwanted prefixes\n    for _, prefix := range unwantedPrefixes {\n        if strings.HasPrefix(email, prefix) {\n            return true\n        }\n    }\n    \n    // Check unwanted domains\n    for _, domain := range unwantedDomains {\n        if strings.HasSuffix(email, \"@\"+domain) {\n            return true\n        }\n    }\n    \n    return false\n}\n\nfunc reportMatches(source string, body []byte, regexPatterns map[string]*regexp.Regexp, filterRegex string, foundOnly bool) map[string][]string {\n    matchesMap := make(map[string][]string)\n\n    for name, pattern := range regexPatterns {\n        if pattern.Match(body) {\n            matches := pattern.FindAllString(string(body), -1)\n            if len(matches) > 0 {\n                // Apply regex filter if provided\n                if filterRegex != \"\" {\n                    filterPattern, err := regexp.Compile(filterRegex)\n                    if err == nil {\n                        filteredMatches := []string{}\n              
          for _, match := range matches {\n                            if filterPattern.MatchString(match) {\n                                filteredMatches = append(filteredMatches, match)\n                            }\n                        }\n                        if len(filteredMatches) > 0 {\n                            matchesMap[name] = append(matchesMap[name], filteredMatches...)\n                        }\n                    }\n                } else {\n                    // Special filtering for emails\n                    if name == \"Email\" {\n                        filteredMatches := []string{}\n                        for _, match := range matches {\n                            if !isUnwantedEmail(match) {\n                                filteredMatches = append(filteredMatches, match)\n                            }\n                        }\n                        if len(filteredMatches) > 0 {\n                            matchesMap[name] = append(matchesMap[name], filteredMatches...)\n                        }\n                    } else {\n                        matchesMap[name] = append(matchesMap[name], matches...)\n                    }\n                }\n            }\n        }\n    }\n\n    if len(matchesMap) > 0 {\n        fmt.Printf(\"[%s FOUND %s] Sensitive data at: %s\\n\", colors[\"RED\"], colors[\"NC\"], source)\n        // Mark that we found something\n        globalSeenMutex.Lock()\n        globalFoundAny = true\n        globalSeenMutex.Unlock()\n    } else {\n        // Buffer MISSING messages instead of printing immediately\n        if !foundOnly {\n            missingMutex.Lock()\n            missingMessages = append(missingMessages, source)\n            missingMutex.Unlock()\n        }\n    }\n\n    return matchesMap\n}\n\nfunc getVersionStatus() string {\n    currentVersion := version\n    \n    resp, err := http.Get(\"https://api.github.com/repos/cc1a2b/jshunter/releases/latest\")\n    if err != nil {\n        
return \"Unknown\"\n    }\n    defer resp.Body.Close()\n    \n    if resp.StatusCode != 200 {\n        return \"Unknown\"\n    }\n    \n    body, err := io.ReadAll(resp.Body)\n    if err != nil {\n        return \"Unknown\"\n    }\n    \n    var release struct {\n        TagName string `json:\"tag_name\"`\n    }\n    \n    err = json.Unmarshal(body, &release)\n    if err != nil {\n        return \"Unknown\"\n    }\n    \n    latestVersion := release.TagName\n    \n    if latestVersion == currentVersion {\n        return \"latest\"\n    }\n    \n    return \"outdated\"\n}\n\nfunc updateTool() {\n    fmt.Printf(\"[%sINFO%s] Checking for updates...\\n\", colors[\"BLUE\"], colors[\"NC\"])\n    \n    currentVersion := version\n    \n    resp, err := http.Get(\"https://api.github.com/repos/cc1a2b/jshunter/releases/latest\")\n    if err != nil {\n        fmt.Printf(\"[%sERROR%s] Failed to check for updates: %v\\n\", colors[\"RED\"], colors[\"NC\"], err)\n        fmt.Printf(\"[%sINFO%s] You can manually update from: https://github.com/cc1a2b/jshunter/releases\\n\", colors[\"YELLOW\"], colors[\"NC\"])\n        return\n    }\n    defer resp.Body.Close()\n    \n    if resp.StatusCode != 200 {\n        fmt.Printf(\"[%sERROR%s] Failed to fetch release information\\n\", colors[\"RED\"], colors[\"NC\"])\n        fmt.Printf(\"[%sINFO%s] You can manually update from: https://github.com/cc1a2b/jshunter/releases\\n\", colors[\"YELLOW\"], colors[\"NC\"])\n        return\n    }\n    \n    body, err := io.ReadAll(resp.Body)\n    if err != nil {\n        fmt.Printf(\"[%sERROR%s] Failed to read response: %v\\n\", colors[\"RED\"], colors[\"NC\"], err)\n        return\n    }\n    \n    var release struct {\n        TagName string `json:\"tag_name\"`\n        Assets  []struct {\n            Name               string `json:\"name\"`\n            BrowserDownloadURL string `json:\"browser_download_url\"`\n        } `json:\"assets\"`\n    }\n    \n    err = json.Unmarshal(body, &release)\n    if 
err != nil {\n        fmt.Printf(\"[%sERROR%s] Failed to parse release information: %v\\n\", colors[\"RED\"], colors[\"NC\"], err)\n        return\n    }\n    \n    latestVersion := release.TagName\n    \n    if latestVersion == currentVersion {\n        fmt.Printf(\"[%sINFO%s] You are already running the latest version: %s\\n\", colors[\"GREEN\"], colors[\"NC\"], currentVersion)\n        return\n    }\n    \n    fmt.Printf(\"[%sINFO%s] New version available: %s (current: %s)\\n\", colors[\"YELLOW\"], colors[\"NC\"], latestVersion, currentVersion)\n    \n    var downloadURL string\n    var binaryName string\n    \n    \n    goos := runtime.GOOS\n    goarch := runtime.GOARCH\n    \n    binaryName = fmt.Sprintf(\"jshunter_%s_%s\", goos, goarch)\n    \n    // First try to find platform-specific binary\n    for _, asset := range release.Assets {\n        if strings.Contains(asset.Name, goos) && strings.Contains(asset.Name, goarch) {\n            downloadURL = asset.BrowserDownloadURL\n            binaryName = asset.Name\n            break\n        }\n    }\n    \n    // If no platform-specific binary found, look for generic binary\n    if downloadURL == \"\" {\n        for _, asset := range release.Assets {\n            if asset.Name == \"jshunter\" || strings.HasPrefix(asset.Name, \"jshunter\") {\n                downloadURL = asset.BrowserDownloadURL\n                binaryName = asset.Name\n                break\n            }\n        }\n    }\n    \n    if downloadURL == \"\" {\n        fmt.Printf(\"[%sERROR%s] No suitable binary found for your platform (%s_%s)\\n\", colors[\"RED\"], colors[\"NC\"], goos, goarch)\n        fmt.Printf(\"[%sINFO%s] Please download manually from: https://github.com/cc1a2b/jshunter/releases/tag/%s\\n\", colors[\"YELLOW\"], colors[\"NC\"], latestVersion)\n        return\n    }\n    \n    resp, err = http.Get(downloadURL)\n    if err != nil {\n        fmt.Printf(\"[%sERROR%s] Failed to download update: %v\\n\", colors[\"RED\"], 
colors[\"NC\"], err)\n        return\n    }\n    defer resp.Body.Close()\n    \n    if resp.StatusCode != 200 {\n        fmt.Printf(\"[%sERROR%s] Failed to download update (status: %d)\\n\", colors[\"RED\"], colors[\"NC\"], resp.StatusCode)\n        return\n    }\n    \n    // Get content length for progress bar\n    contentLength := resp.ContentLength\n    if contentLength <= 0 {\n        contentLength = 0\n    }\n    \n    // Create smart loader\n    var binaryData []byte\n    if contentLength > 0 {\n        binaryData = make([]byte, 0, contentLength)\n        reader := &progressReader{\n            reader: resp.Body,\n            total:  contentLength,\n            onProgress: func(current int64) {\n                // Smart progress bar like httpx\n                progress := float64(current) / float64(contentLength)\n                barWidth := 20\n                filled := int(progress * float64(barWidth))\n                \n                bar := strings.Repeat(\"#\", filled) + strings.Repeat(\" \", barWidth-filled)\n                percentage := int(progress * 100)\n                \n                // Format file size\n                currentMB := float64(current) / (1024 * 1024)\n                totalMB := float64(contentLength) / (1024 * 1024)\n                \n                fmt.Printf(\"\\r[%sINFO%s] Downloading %s [%s%s%s] %d%% (%.1f/%.1f MB)\", \n                    colors[\"BLUE\"], colors[\"NC\"], binaryName,\n                    colors[\"BLUE\"], bar, colors[\"NC\"], \n                    percentage, currentMB, totalMB)\n            },\n        }\n        \n        binaryData, err = io.ReadAll(reader)\n        \n        // Final progress update to show 100%\n        if err == nil {\n            progress := float64(reader.total) / float64(reader.total)\n            barWidth := 20\n            filled := int(progress * float64(barWidth))\n            bar := strings.Repeat(\"#\", filled) + strings.Repeat(\" \", barWidth-filled)\n            
percentage := int(progress * 100)\n            currentMB := float64(reader.total) / (1024 * 1024)\n            totalMB := float64(reader.total) / (1024 * 1024)\n            \n            fmt.Printf(\"\\r[%sINFO%s] Downloading %s [%s%s%s] %d%% (%.1f/%.1f MB)\\n\", \n                colors[\"BLUE\"], colors[\"NC\"], binaryName,\n                colors[\"BLUE\"], bar, colors[\"NC\"], \n                percentage, currentMB, totalMB)\n        } else {\n            fmt.Println() // New line after loader\n        }\n    } else {\n        binaryData, err = io.ReadAll(resp.Body)\n    }\n    \n    if err != nil {\n        fmt.Printf(\"[%sERROR%s] Failed to read binary data: %v\\n\", colors[\"RED\"], colors[\"NC\"], err)\n        return\n    }\n    \n    currentPath, err := os.Executable()\n    if err != nil {\n        fmt.Printf(\"[%sERROR%s] Failed to get current executable path: %v\\n\", colors[\"RED\"], colors[\"NC\"], err)\n        return\n    }\n    \n    backupPath := currentPath + \".backup\"\n    err = os.Rename(currentPath, backupPath)\n    if err != nil {\n        fmt.Printf(\"[%sERROR%s] Failed to create backup: %v\\n\", colors[\"RED\"], colors[\"NC\"], err)\n        return\n    }\n    \n    err = os.WriteFile(currentPath, binaryData, 0755)\n    if err != nil {\n        fmt.Printf(\"[%sERROR%s] Failed to write new binary: %v\\n\", colors[\"RED\"], colors[\"NC\"], err)\n        os.Rename(backupPath, currentPath)\n        return\n    }\n    \n    os.Remove(backupPath)\n    \n    fmt.Printf(\"[%sSUCCESS%s] Successfully updated to %s!\\n\", colors[\"GREEN\"], colors[\"NC\"], latestVersion)\n    fmt.Printf(\"[%sINFO%s] Restart the tool to use the new version.\\n\", colors[\"BLUE\"], colors[\"NC\"])\n}\n\nfunc processJSFileForEndpoints(jsFile, regex, output string) {\n    if _, err := os.Stat(jsFile); os.IsNotExist(err) {\n        fmt.Printf(\"[%sERROR%s] File not found: %s\\n\", colors[\"RED\"], colors[\"NC\"], jsFile)\n        return\n    } else if err != nil {\n     
   fmt.Printf(\"[%sERROR%s] Unable to access file %s: %v\\n\", colors[\"RED\"], colors[\"NC\"], jsFile, err)\n        return\n    }\n    \n    endpoints := extractEndpointsFromFile(jsFile, regex)\n    \n    if output != \"\" {\n        writeEndpointsToFile(endpoints, output, jsFile)\n    } else {\n        displayEndpoints(endpoints, jsFile)\n    }\n}\n\nfunc processInputsForEndpoints(url, list, output, regex, cookie, proxy string, threads int, skipTLS, foundOnly bool) {\n    // Create config for backward compatibility\n    config := &Config{\n        URL: url, List: list, Output: output, Regex: regex,\n        Cookies: cookie, Proxy: proxy, Threads: threads,\n        SkipTLS: skipTLS, FoundOnly: foundOnly,\n        Timeout: 30, Retry: 2,\n    }\n    processInputsForEndpointsWithConfig(url, config)\n    return\n}\n\nfunc processInputsForEndpointsOld(url, list, output, regex, cookie, proxy string, threads int, skipTLS, foundOnly bool) {\n    var wg sync.WaitGroup\n    urlChannel := make(chan string)\n    \n    var fileWriter *os.File\n    if output != \"\" {\n        var err error\n        fileWriter, err = os.Create(output)\n        if err != nil {\n            fmt.Printf(\"Error creating output file: %v\\n\", err)\n            return\n        }\n        defer fileWriter.Close()\n    }\n    \n    for i := 0; i < threads; i++ {\n        wg.Add(1)\n        go func() {\n            defer wg.Done()\n            for u := range urlChannel {\n                // Create minimal config for each request\n                config := &Config{\n                    Regex: regex, Cookies: cookie, Proxy: proxy,\n                    SkipTLS: skipTLS,\n                    Timeout: 30, Retry: 1,\n                }\n                endpoints := extractEndpointsFromURLWithConfig(u, config)\n                \n                if fileWriter != nil {\n                    fmt.Fprintf(fileWriter, \"URL: %s\\n\", u)\n                    for _, endpoint := range endpoints {\n                       
 fmt.Fprintf(fileWriter, \"ENDPOINT: %s\\n\", endpoint)\n                    }\n                    fmt.Fprintln(fileWriter, \"\")\n                } else {\n                    for _, endpoint := range endpoints {\n                        fmt.Println(endpoint)\n                    }\n                }\n            }\n        }()\n    }\n    \n    if err := enqueueURLs(url, list, urlChannel, regex); err != nil {\n        fmt.Printf(\"Error in input processing: %v\\n\", err)\n        close(urlChannel)\n        // Let the workers drain before the deferred fileWriter.Close() runs.\n        wg.Wait()\n        return\n    }\n    \n    close(urlChannel)\n    wg.Wait()\n}\n\nfunc extractEndpointsFromFile(filePath, regex string) []string {\n    body, err := os.ReadFile(filePath)\n    if err != nil {\n        fmt.Printf(\"Error reading file %s: %v\\n\", filePath, err)\n        return nil\n    }\n    \n    return extractEndpointsFromContent(string(body), regex, \"\")\n}\n\nfunc extractEndpointsFromURL(urlStr, regex, cookie, proxy string, skipTLS bool) []string {\n    // Create a minimal config for backward compatibility\n    config := &Config{\n        Regex:   regex,\n        Proxy:   proxy,\n        Cookies: cookie,\n        SkipTLS: skipTLS,\n        Timeout: 30,\n        Retry:   1,\n    }\n    return extractEndpointsFromURLWithConfig(urlStr, config)\n}\n\nfunc extractEndpointsFromContent(content, regex, targetDomain string) []string {\n    content = string(stripJSComments([]byte(content)))\n    var endpoints []string\n    var baseURLs []string\n\n    baseURLPatterns := map[string]*regexp.Regexp{\n        \"base_url\":        regexp.MustCompile(`baseURL\\s*[:=]\\s*[\"']([^\"']*)[\"']`),\n        \"api_base\":        regexp.MustCompile(`apiBase\\s*[:=]\\s*[\"']([^\"']*)[\"']`),\n        \"api_url\":         regexp.MustCompile(`API_URL\\s*[:=]\\s*[\"']([^\"']*)[\"']`),\n        \"server_url\":      regexp.MustCompile(`SERVER_URL\\s*[:=]\\s*[\"']([^\"']*)[\"']`),\n        \"endpoint_base\":   regexp.MustCompile(`endpointBase\\s*[:=]\\s*[\"']([^\"']*)[\"']`),\n    }\n    \n    for _, pattern := 
range baseURLPatterns {\n        matches := pattern.FindAllStringSubmatch(content, -1)\n        for _, match := range matches {\n            if len(match) > 1 {\n                baseURL := strings.Trim(match[1], `\"'`)\n                if baseURL != \"\" && !contains(baseURLs, baseURL) {\n                    baseURLs = append(baseURLs, baseURL)\n                }\n            }\n        }\n    }\n    \n    endpointPatterns := map[string]*regexp.Regexp{\n        \"ajax_url\":        regexp.MustCompile(`\\.ajax\\s*\\(\\s*[\"']([^\"']*)[\"']`),\n        \"fetch_url\":       regexp.MustCompile(`fetch\\s*\\(\\s*[\"']([^\"']*)[\"']`),\n        \"xhr_url\":         regexp.MustCompile(`\\.open\\s*\\(\\s*[\"'][^\"']*[\"']\\s*,\\s*[\"']([^\"']*)[\"']`),\n        \"axios_url\":       regexp.MustCompile(`axios\\.[a-z]+\\s*\\(\\s*[\"']([^\"']*)[\"']`),\n        \"request_url\":     regexp.MustCompile(`request\\.[a-z]+\\s*\\(\\s*[\"']([^\"']*)[\"']`),\n        \"api_endpoint\":    regexp.MustCompile(`[\"'](/api/[a-zA-Z0-9._~:/?#[\\]@!$&'()*+,;=%\\-]*)[\"']`),\n        \"rest_endpoint\":   regexp.MustCompile(`[\"'](/[a-zA-Z0-9._~:/?#[\\]@!$&'()*+,;=%\\-]*)[\"']`),\n        \"graphql_endpoint\": regexp.MustCompile(`[\"'](/graphql[^\"']*)[\"']`),\n    }\n    \n    var relativeEndpoints []string\n    for _, pattern := range endpointPatterns {\n        matches := pattern.FindAllStringSubmatch(content, -1)\n        for _, match := range matches {\n            if len(match) > 1 {\n                // Clean before the duplicate check so variants that normalize to\n                // the same endpoint are stored once.\n                endpoint := cleanEndpoint(match[1])\n                if isValidEndpoint(endpoint) && !contains(relativeEndpoints, endpoint) {\n                    relativeEndpoints = append(relativeEndpoints, endpoint)\n                }\n            }\n        }\n    }\n    \n    fullURLPatterns := map[string]*regexp.Regexp{\n        \"full_url\":        
regexp.MustCompile(`https?://[a-zA-Z0-9.-]+/[a-zA-Z0-9._~:/?#[\\]@!$&'()*+,;=%\\-]+`),\n        \"websocket_url\":   regexp.MustCompile(`wss?://[a-zA-Z0-9.-]+/[a-zA-Z0-9._~:/?#[\\]@!$&'()*+,;=%\\-]+`),\n    }\n    \n    for _, pattern := range fullURLPatterns {\n        matches := pattern.FindAllString(content, -1)\n        for _, match := range matches {\n            match = cleanEndpoint(match)\n            if match != \"\" && !contains(endpoints, match) && isValidEndpoint(match) {\n                endpoints = append(endpoints, match)\n            }\n        }\n    }\n    \n    for _, baseURL := range baseURLs {\n        baseURL = strings.TrimRight(baseURL, \"/\")\n        for _, relEndpoint := range relativeEndpoints {\n            if strings.HasPrefix(relEndpoint, \"/\") {\n                fullEndpoint := baseURL + relEndpoint\n                if !contains(endpoints, fullEndpoint) {\n                    endpoints = append(endpoints, fullEndpoint)\n                }\n            }\n        }\n    }\n    \n    if targetDomain != \"\" {\n        if !strings.HasPrefix(targetDomain, \"http\") {\n            targetDomain = \"https://\" + targetDomain\n        }\n        targetDomain = strings.TrimRight(targetDomain, \"/\")\n        \n        for _, relEndpoint := range relativeEndpoints {\n            fullEndpoint := targetDomain + relEndpoint\n            if !contains(endpoints, fullEndpoint) {\n                endpoints = append(endpoints, fullEndpoint)\n            }\n        }\n    } else {\n        if len(baseURLs) > 0 {\n            baseURL := strings.TrimRight(baseURLs[0], \"/\")\n            for _, relEndpoint := range relativeEndpoints {\n                fullEndpoint := baseURL + relEndpoint\n                if !contains(endpoints, fullEndpoint) {\n                    endpoints = append(endpoints, fullEndpoint)\n                }\n            }\n        } else {\n            for _, relEndpoint := range relativeEndpoints {\n                if 
!contains(endpoints, relEndpoint) {\n                    endpoints = append(endpoints, relEndpoint)\n                }\n            }\n        }\n    }\n    \n    if regex != \"\" {\n        filteredEndpoints := []string{}\n        customPattern, err := regexp.Compile(regex)\n        if err != nil {\n            fmt.Printf(\"Invalid regex pattern: %v\\n\", err)\n            return endpoints\n        }\n        \n        for _, endpoint := range endpoints {\n            if customPattern.MatchString(endpoint) {\n                filteredEndpoints = append(filteredEndpoints, endpoint)\n            }\n        }\n        endpoints = filteredEndpoints\n    }\n    \n    return endpoints\n}\n\n\nfunc cleanEndpoint(endpoint string) string {\n\n    endpoint = strings.Trim(endpoint, `\"'`)\n    endpoint = strings.TrimSpace(endpoint)\n    \n    endpoint = strings.TrimRight(endpoint, \";,)\")\n    endpoint = strings.TrimRight(endpoint, `\"'`)\n    \n\n    if strings.Contains(endpoint, \"${\") {\n        return \"\"\n    }\n    \n\n    endpoint = strings.Trim(endpoint, `\"'`)\n    \n\n    endpoint = strings.TrimRight(endpoint, \";,)\")\n    endpoint = strings.TrimRight(endpoint, `\"'`)\n    \n    return endpoint\n}\n\n\nvar endpointBackrefRe = regexp.MustCompile(`\\$\\d`)\n\nfunc isValidEndpoint(endpoint string) bool {\n\n    if endpoint == \"\" {\n        return false\n    }\n\n\n    if strings.Contains(endpoint, \"${\") || strings.Contains(endpoint, \"+\") {\n        return false\n    }\n\n    // Regex backreferences ($1, $2, $N) — synthetic, never a real URL.\n    if endpointBackrefRe.MatchString(endpoint) {\n        return false\n    }\n\n    // Protocol-relative or double-slash leading paths cause garbage joins\n    // (host + \"//foo\" → \"host//foo\"). 
Comments, CDN refs, JS literals.\n    if strings.HasPrefix(endpoint, \"//\") {\n        return false\n    }\n\n    // Truncated templated fragments — value/key placeholder was meant to be\n    // substituted at runtime (\"?id=\", \"maps.google.\", \"tel:\").\n    if n := len(endpoint); n > 0 {\n        switch endpoint[n-1] {\n        case '=', '.', ':', '&', '?':\n            return false\n        }\n    }\n    \n   \n    if len(endpoint) < 2 {\n        return false\n    }\n    \n    \n    skipWords := []string{\"GET\", \"POST\", \"PUT\", \"DELETE\", \"PATCH\", \"HEAD\", \"OPTIONS\", \"true\", \"false\", \"null\", \"undefined\"}\n    for _, word := range skipWords {\n        if endpoint == word {\n            return false\n        }\n    }\n    \n  \n    if strings.HasSuffix(endpoint, \"'\") || strings.HasSuffix(endpoint, \"\\\"\") || \n       strings.HasSuffix(endpoint, \";\") || strings.HasSuffix(endpoint, \")\") ||\n       strings.HasSuffix(endpoint, \"';\") || strings.HasSuffix(endpoint, \"\\\";\") ||\n       strings.HasSuffix(endpoint, \"')\") || strings.HasSuffix(endpoint, \"\\\")\") {\n        return false\n    }\n    \n    \n    if strings.Contains(endpoint, \"';\") || strings.Contains(endpoint, \"\\\";\") ||\n       strings.Contains(endpoint, \"')\") || strings.Contains(endpoint, \"\\\")\") {\n        return false\n    }\n    \n\n    if strings.Contains(endpoint, \",\") || strings.Contains(endpoint, \"(\") || \n       strings.Contains(endpoint, \"Y=\") || strings.Contains(endpoint, \"&\") {\n        return false\n    }\n    \n\n    if strings.HasSuffix(endpoint, \"/a\") || strings.HasSuffix(endpoint, \"/g\") ||\n       strings.HasSuffix(endpoint, \"//\") || strings.HasSuffix(endpoint, \"/\") {\n        return false\n    }\n    \n\n    if !strings.HasPrefix(endpoint, \"/\") && !strings.HasPrefix(endpoint, \"http\") {\n        return false\n    }\n    \n\n    externalDomains := []string{\n        \"fonts.googleapis.com\",\n        \"fonts.gstatic.com\", \n   
     \"www.googletagmanager.com\",\n        \"www.google-analytics.com\",\n        \"static.hotjar.com\",\n        \"www.hotjar.com\",\n        \"cdnjs.cloudflare.com\",\n        \"unpkg.com\",\n        \"cdn.jsdelivr.net\",\n        \"ajax.googleapis.com\",\n        \"code.jquery.com\",\n        \"maxcdn.bootstrapcdn.com\",\n        \"stackpath.bootstrapcdn.com\",\n        \"www.opensource.org\",\n        \"flowplayer.org\",\n        \"docs.jquery.com\",\n        \"www.adobe.com\",\n        \"www.w3.org\",\n        \"jquery.com\",\n        \"github.com\",\n        \"raw.githubusercontent.com\",\n    }\n    \n    for _, domain := range externalDomains {\n        if strings.Contains(endpoint, domain) {\n            return false\n        }\n    }\n    \n   \n    if strings.HasPrefix(endpoint, \"http\") {\n\n        parts := strings.Split(endpoint, \"/\")\n        if len(parts) < 4 || parts[3] == \"\" {\n            return false\n        }\n        \n\n        if strings.Contains(endpoint, \"?family=\") || strings.Contains(endpoint, \"?id=\") ||\n           strings.Contains(endpoint, \"&display=\") || strings.Contains(endpoint, \"&version=\") {\n            return false\n        }\n    }\n    \n    return true\n}\n\n\nfunc displayEndpoints(endpoints []string, source string) {\n    if len(endpoints) > 0 {\n        for _, endpoint := range endpoints {\n            fmt.Println(endpoint)\n        }\n    }\n}\n\n\nfunc writeEndpointsToFile(endpoints []string, outputFile, source string) {\n    file, err := os.OpenFile(outputFile, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)\n    if err != nil {\n        fmt.Printf(\"Error opening output file: %v\\n\", err)\n        return\n    }\n    defer file.Close()\n    \n    fmt.Fprintf(file, \"SOURCE: %s\\n\", source)\n    for _, endpoint := range endpoints {\n        fmt.Fprintf(file, \"ENDPOINT: %s\\n\", endpoint)\n    }\n    fmt.Fprintln(file, \"\")\n    \n    fmt.Printf(\"[%sSUCCESS%s] Endpoints saved to: %s\\n\", colors[\"GREEN\"], 
colors[\"NC\"], outputFile)\n}\n\n\nfunc contains(slice []string, item string) bool {\n    for _, s := range slice {\n        if s == item {\n            return true\n        }\n    }\n    return false\n}\n\n// createHTTPClientWithConfig creates an HTTP client with all advanced options\nfunc createHTTPClientWithConfig(config *Config) *http.Client {\n    transport := &http.Transport{\n        TLSClientConfig: &tls.Config{InsecureSkipVerify: config.SkipTLS},\n        DisableKeepAlives: false,\n        MaxIdleConns: 10,\n        IdleConnTimeout: 30 * time.Second,\n    }\n\n    clientTimeout := time.Duration(config.Timeout) * time.Second\n\n    if config.Proxy != \"\" {\n        proxyStr := config.Proxy\n\n        // Check if it's a SOCKS5 proxy\n        if strings.HasPrefix(proxyStr, \"socks5://\") || strings.HasPrefix(proxyStr, \"socks5h://\") {\n            // Parse SOCKS5 proxy\n            proxyURL, err := url.Parse(proxyStr)\n            if err != nil {\n                fmt.Printf(\"[%sERROR%s] Invalid SOCKS5 proxy URL %s: %v\\n\", colors[\"RED\"], colors[\"NC\"], proxyStr, err)\n            } else {\n                // Create SOCKS5 dialer\n                var auth *proxy.Auth\n                if proxyURL.User != nil {\n                    password, _ := proxyURL.User.Password()\n                    auth = &proxy.Auth{\n                        User:     proxyURL.User.Username(),\n                        Password: password,\n                    }\n                }\n\n                dialer, err := proxy.SOCKS5(\"tcp\", proxyURL.Host, auth, proxy.Direct)\n                if err != nil {\n                    fmt.Printf(\"[%sERROR%s] Failed to create SOCKS5 dialer: %v\\n\", colors[\"RED\"], colors[\"NC\"], err)\n                } else {\n                    // Use context dialer for SOCKS5\n                    transport.DialContext = func(ctx context.Context, network, addr string) (net.Conn, error) {\n                        return dialer.Dial(network, addr)\n      
              }\n                    transport.TLSClientConfig = &tls.Config{InsecureSkipVerify: true}\n                    clientTimeout = 60 * time.Second\n                    if config.Verbose {\n                        fmt.Printf(\"[%sINFO%s] SOCKS5 proxy configured: %s\\n\", colors[\"BLUE\"], colors[\"NC\"], proxyStr)\n                    }\n                }\n            }\n        } else {\n            // HTTP/HTTPS proxy\n            proxyURLStr := proxyStr\n            if !strings.HasPrefix(proxyStr, \"http://\") && !strings.HasPrefix(proxyStr, \"https://\") {\n                proxyURLStr = \"http://\" + proxyStr\n            }\n\n            proxyURL, err := url.Parse(proxyURLStr)\n            if err != nil {\n                fmt.Printf(\"[%sERROR%s] Invalid proxy URL %s: %v\\n\", colors[\"RED\"], colors[\"NC\"], proxyURLStr, err)\n            } else {\n                transport.Proxy = http.ProxyURL(proxyURL)\n                transport.TLSClientConfig = &tls.Config{InsecureSkipVerify: true}\n                clientTimeout = 60 * time.Second\n                if config.Verbose {\n                    fmt.Printf(\"[%sINFO%s] HTTP proxy configured: %s\\n\", colors[\"BLUE\"], colors[\"NC\"], proxyURLStr)\n                }\n            }\n        }\n    }\n\n    return &http.Client{\n        Transport: transport,\n        Timeout: clientTimeout,\n    }\n}\n\n// makeRequestWithRetry makes an HTTP request with retry logic, per-host\n// concurrency cap, exponential backoff with jitter, and a 429/5xx circuit\n// breaker. 
The host controller's recordOutcome teaches the breaker after\n// every response, so a host that starts misbehaving is benched cooperatively\n// across the whole run rather than per-goroutine.\nfunc makeRequestWithRetry(client *http.Client, req *http.Request, config *Config) (*http.Response, error) {\n    if config.RateLimit > 0 {\n        time.Sleep(time.Duration(config.RateLimit) * time.Millisecond)\n    }\n\n    host := req.URL.Host\n    ctrl := getHostController()\n    release, allowed := ctrl.acquire(host)\n    defer release()\n    if !allowed {\n        if config.Verbose {\n            fmt.Printf(\"[%sBREAKER%s] %s\\n\", colors[\"YELLOW\"], colors[\"NC\"], describeBreaker(host))\n        }\n        return nil, fmt.Errorf(\"host %s circuit-broken; cooldown in effect\", host)\n    }\n\n    var resp *http.Response\n    var err error\n\n    maxRetries := config.Retry\n    if maxRetries < 1 {\n        maxRetries = 1\n    }\n\n    for attempt := 0; attempt < maxRetries; attempt++ {\n        resp, err = client.Do(req)\n        if err == nil {\n            ctrl.recordOutcome(host, resp.StatusCode, parseRetryAfter(resp.Header))\n            // 429 / 503 with Retry-After: honor before retry; otherwise return.\n            if (resp.StatusCode == http.StatusTooManyRequests || resp.StatusCode == http.StatusServiceUnavailable) && attempt < maxRetries-1 {\n                ra := parseRetryAfter(resp.Header)\n                if ra == 0 {\n                    ra = backoffWithJitter(attempt)\n                }\n                resp.Body.Close()\n                time.Sleep(ra)\n                continue\n            }\n            return resp, nil\n        }\n\n        // Don't retry on TLS cancellation errors (proxy interception)\n        if config.Proxy != \"\" && isTLSCanceledError(err) {\n            return nil, err\n        }\n\n        if attempt < maxRetries-1 {\n            time.Sleep(backoffWithJitter(attempt))\n        }\n    }\n\n    if err != nil {\n        
ctrl.recordOutcome(host, 0, 0)\n    }\n    return nil, err\n}\n\n// searchForSensitiveDataWithConfig enhanced version with all new features\nfunc searchForSensitiveDataWithConfig(urlStr string, config *Config) (string, map[string][]string) {\n    client := createHTTPClientWithConfig(config)\n    var sensitiveData map[string][]string\n\n    if strings.HasPrefix(urlStr, \"http://\") || strings.HasPrefix(urlStr, \"https://\") {\n        if err := validateTargetURL(urlStr, config.AllowInternal); err != nil {\n            if globalStats != nil {\n                statInc(&globalStats.URLsBlocked)\n            }\n            if !config.Quiet {\n                fmt.Printf(\"[%sBLOCK%s] %v\\n\", colors[\"YELLOW\"], colors[\"NC\"], err)\n            }\n            return urlStr, nil\n        }\n        if globalStats != nil {\n            statInc(&globalStats.URLsFetched)\n        }\n        req, err := http.NewRequest(\"GET\", urlStr, nil)\n        if err != nil {\n            if config.Verbose {\n                fmt.Printf(\"Failed to create request for URL %s: %v\\n\", urlStr, err)\n            }\n            return urlStr, nil\n        }\n\n        // Apply custom headers\n        for _, header := range config.Headers {\n            parts := strings.SplitN(header, \":\", 2)\n            if len(parts) == 2 {\n                key := strings.TrimSpace(parts[0])\n                value := strings.TrimSpace(parts[1])\n                req.Header.Set(key, value)\n                if config.Verbose {\n                    fmt.Printf(\"[%sINFO%s] Added header: %s: %s\\n\", colors[\"CYAN\"], colors[\"NC\"], key, value)\n                }\n            } else if config.Verbose {\n                fmt.Printf(\"[%sWARN%s] Invalid header format (expected 'Key: Value'): %s\\n\", colors[\"YELLOW\"], colors[\"NC\"], header)\n            }\n        }\n\n        // Apply custom User-Agent (randomly select from list if available)\n        if len(config.UserAgents) > 0 {\n            // Randomly 
select a user agent from the list for each request\n            selectedUA := config.UserAgents[rand.Intn(len(config.UserAgents))]\n            req.Header.Set(\"User-Agent\", selectedUA)\n        } else if config.UserAgent != \"\" {\n            req.Header.Set(\"User-Agent\", config.UserAgent)\n        }\n\n        // Apply cookies\n        if config.Cookies != \"\" {\n            req.Header.Set(\"Cookie\", config.Cookies)\n        }\n\n        // Cache-aware fetch: attach If-None-Match / If-Modified-Since so a\n        // re-run revalidates instead of re-downloading. The 304 path is\n        // handled below in the response branch.\n        if config.Cache != nil {\n            config.Cache.AttachConditional(req)\n        }\n\n        resp, err := makeRequestWithRetry(client, req, config)\n        if err != nil {\n            return urlStr, nil\n        }\n        \n        if config.Verbose {\n            fmt.Printf(\"[%sINFO%s] Successfully fetched %s (Status: %d)\\n\", colors[\"GREEN\"], colors[\"NC\"], urlStr, resp.StatusCode)\n        }\n        defer resp.Body.Close()\n\n        // 304 Not Modified: serve from disk cache if we have a body for\n        // this URL. Cache headers were attached pre-flight when --cache-dir\n        // was set; this branch closes the loop.\n        if resp.StatusCode == http.StatusNotModified && config.Cache != nil {\n            if cb, _, ok := config.Cache.Lookup(urlStr); ok {\n                if config.Verbose {\n                    fmt.Printf(\"[%sCACHE%s] 304 hit for %s (%d bytes from disk)\\n\",\n                        colors[\"CYAN\"], colors[\"NC\"], urlStr, len(cb))\n                }\n                processedBody := processJSAnalysis(cb, config)\n                sensitiveData = reportMatchesWithConfig(urlStr, processedBody, config)\n                return urlStr, sensitiveData\n            }\n        }\n\n        // Decide whether to process the response. 
v0.6+++ accepts HTML when\n        // --inline-html is set so homepage `<script>` tags are scanned.\n        contentType := resp.Header.Get(\"Content-Type\")\n        isHTML := looksLikeHTMLContentType(contentType)\n        if !shouldProcessResponse(resp, urlStr, config) && !(config.InlineHTML && isHTML) {\n            return urlStr, nil\n        }\n\n        // v0.6 — bound the body read so a hostile target can't exhaust memory\n        // via gzip bomb or pathological streaming response.\n        var bodyReader io.Reader = resp.Body\n        if config.MaxBytes > 0 {\n            bodyReader = io.LimitReader(resp.Body, config.MaxBytes)\n        }\n        body, err := io.ReadAll(bodyReader)\n        if err != nil {\n            if config.Proxy == \"\" || !isTLSCanceledError(err) {\n                if len(body) == 0 && config.Verbose {\n                    fmt.Printf(\"Error reading response body: %v\\n\", err)\n                }\n            }\n            if len(body) == 0 {\n                return urlStr, nil\n            }\n        }\n        if config.MaxBytes > 0 && int64(len(body)) >= config.MaxBytes {\n            if globalStats != nil {\n                statInc(&globalStats.BytesTruncated)\n            }\n            if config.Verbose {\n                fmt.Printf(\"[%sWARN%s] Truncated response from %s at %d bytes (--max-bytes)\\n\",\n                    colors[\"YELLOW\"], colors[\"NC\"], urlStr, config.MaxBytes)\n            }\n        }\n        if globalStats != nil {\n            statAdd(&globalStats.BytesParsed, int64(len(body)))\n        }\n\n        // Persist to disk cache when enabled. Skipped silently for\n        // session-bearing responses (Set-Cookie / Authorization).\n        if config.Cache != nil {\n            _ = config.Cache.Store(urlStr, resp, body)\n        }\n\n        // CSP origins from response header (pure recon: third-party hosts\n        // the page is allowed to load from). 
Only collected when opted in.\n        if config.CSPOrigins {\n            if csp := resp.Header.Get(\"Content-Security-Policy\"); csp != \"\" {\n                emitCSPOrigins(urlStr, ParseCSPOrigins(csp))\n            }\n        }\n\n        // HTML branch: extract inline scripts and SRI/CSP from the body.\n        // Each inline script is scanned as its own source so findings are\n        // attributable to the specific tag index.\n        if config.InlineHTML && (isHTML || looksLikeHTML(body, contentType)) {\n            scanHTMLArtifacts(urlStr, body, config)\n        }\n\n        // Process JS analysis features\n        if len(body) > 0 {\n            processedBody := processJSAnalysis(body, config)\n            sensitiveData = reportMatchesWithConfig(urlStr, processedBody, config)\n\n            // Source-map ingestion — fetch <body>.map (or decode inline data\n            // URI), scan every entry in sourcesContent[]. Gated by --sourcemap.\n            if config.SourceMap {\n                n, err := FetchAndScanSourceMap(client, urlStr, body, config)\n                if err != nil && config.Verbose {\n                    fmt.Printf(\"[%sSOURCEMAP%s] %s: %v\\n\", colors[\"YELLOW\"], colors[\"NC\"], urlStr, err)\n                } else if n > 0 && config.Verbose {\n                    fmt.Printf(\"[%sSOURCEMAP%s] %s: scanned %d original sources\\n\",\n                        colors[\"CYAN\"], colors[\"NC\"], urlStr, n)\n                }\n            }\n        } else {\n            sensitiveData = make(map[string][]string)\n        }\n    } else {\n        body, err := os.ReadFile(urlStr)\n        if err != nil {\n            if config.Verbose {\n                fmt.Printf(\"Error reading local file %s: %v\\n\", urlStr, err)\n            }\n            return urlStr, nil\n        }\n\n        processedBody := processJSAnalysis(body, config)\n        sensitiveData = reportMatchesWithConfig(urlStr, processedBody, config)\n    }\n\n    return urlStr, 
sensitiveData\n}\n\n// stripJSComments replaces JS line (// ...) and block (/* ... */) comments\n// with spaces, preserving newlines and byte offsets. String literals\n// (\", ', `) are copied verbatim so URLs and secrets that legitimately\n// contain // (e.g. \"https://...\") are kept intact.\nfunc stripJSComments(body []byte) []byte {\n    out := make([]byte, len(body))\n    i := 0\n    for i < len(body) {\n        c := body[i]\n\n        if c == '\"' || c == '\\'' || c == '`' {\n            quote := c\n            out[i] = c\n            i++\n            for i < len(body) {\n                ch := body[i]\n                out[i] = ch\n                if ch == '\\\\' && i+1 < len(body) {\n                    out[i+1] = body[i+1]\n                    i += 2\n                    continue\n                }\n                i++\n                if ch == quote {\n                    break\n                }\n            }\n            continue\n        }\n\n        if c == '/' && i+1 < len(body) && body[i+1] == '/' {\n            for i < len(body) && body[i] != '\\n' {\n                out[i] = ' '\n                i++\n            }\n            continue\n        }\n\n        if c == '/' && i+1 < len(body) && body[i+1] == '*' {\n            out[i] = ' '\n            out[i+1] = ' '\n            i += 2\n            for i < len(body) {\n                if i+1 < len(body) && body[i] == '*' && body[i+1] == '/' {\n                    out[i] = ' '\n                    out[i+1] = ' '\n                    i += 2\n                    break\n                }\n                if body[i] == '\\n' {\n                    out[i] = '\\n'\n                } else {\n                    out[i] = ' '\n                }\n                i++\n            }\n            continue\n        }\n\n        out[i] = c\n        i++\n    }\n    return out\n}\n\n// processJSAnalysis applies JS analysis features (deobfuscation, sourcemap, etc.)\nfunc processJSAnalysis(body []byte, config *Config) 
[]byte {\n    content := string(body)\n    \n    // Obfuscation detection runs first: basicDeobfuscate strips the \\x\n    // prefixes that isObfuscated counts, so detecting afterwards would miss them.\n    if config.ObfsDetect {\n        if isObfuscated(content) && config.Verbose {\n            fmt.Printf(\"[%sOBFS%s] Obfuscated code detected\\n\", colors[\"YELLOW\"], colors[\"NC\"])\n        }\n    }\n    \n    // Deobfuscation (basic - can be enhanced)\n    if config.Deobfuscate {\n        content = basicDeobfuscate(content)\n    }\n    \n    // Source map hook. extractSourceMap is currently a no-op; for fetched URLs the\n    // maps are ingested by FetchAndScanSourceMap in searchForSensitiveDataWithConfig.\n    if config.SourceMap {\n        content = extractSourceMap(content)\n    }\n    \n    // Eval analysis - extract strings from eval() calls\n    if config.Eval {\n        content = extractEvalContent(content)\n    }\n    \n    return []byte(content)\n}\n\n// basicDeobfuscate strips the \\x and \\u escape prefixes. It does not decode\n// the escaped bytes; the hex digits are left in place.\nfunc basicDeobfuscate(content string) string {\n    content = strings.ReplaceAll(content, \"\\\\x\", \"\")\n    content = strings.ReplaceAll(content, \"\\\\u\", \"\")\n    return content\n}\n\n// extractSourceMap locates sourceMappingURL markers but does not fetch or\n// parse them; it returns content unchanged.\nfunc extractSourceMap(content string) string {\n    re := regexp.MustCompile(`//# sourceMappingURL=([^\\s]+)`)\n    matches := re.FindAllStringSubmatch(content, -1)\n    if len(matches) > 0 {\n        // Would fetch and parse sourcemap here\n    }\n    return content\n}\n\nfunc extractEvalContent(content string) string {\n    // Extract content from eval() calls for analysis\n    re := regexp.MustCompile(`eval\\s*\\(\\s*[\"']([^\"']+)[\"']`)\n    matches := re.FindAllStringSubmatch(content, -1)\n    for _, match := range matches {\n        if len(match) > 1 {\n            content += \"\\n// EVAL: \" + match[1]\n        }\n    }\n    return content\n}\n\nfunc isObfuscated(content string) bool {\n    // Simple heuristics for obfuscation detection\n    if len(content) > 1000 && strings.Count(content, \"\\\\x\") > 
50 {\n        return true\n    }\n    if strings.Contains(content, \"eval(\") && strings.Count(content, \"String.fromCharCode\") > 10 {\n        return true\n    }\n    return false\n}\n\n// extractURLParamsWithBaseURLs - Advanced extraction of GET parameters with their base URLs\nfunc extractURLParamsWithBaseURLs(content, source string) []string {\n    var resultURLs []string\n    seenURLs := make(map[string]bool)\n    \n    // Extract base URL from source if it's a URL\n    var baseURL string\n    var sourceDomain string // Store the main domain for fallback\n    if strings.HasPrefix(source, \"http://\") || strings.HasPrefix(source, \"https://\") {\n        parsedURL, err := url.Parse(source)\n        if err == nil {\n            baseURL = parsedURL.Scheme + \"://\" + parsedURL.Host\n            sourceDomain = parsedURL.Scheme + \"://\" + parsedURL.Host\n        }\n    }\n    \n    // Pattern 1: URLSearchParams.get() - Extract parameter names\n    // Match: urlParams.get('param_name') or searchParams.get(\"param_name\")\n    urlParamsGetPattern := regexp.MustCompile(`(?:urlParams|searchParams|params|urlSearchParams|queryParams|locationParams)\\.get\\([\"']([a-zA-Z0-9_\\-\\[\\]]+)[\"']\\)`)\n    matches := urlParamsGetPattern.FindAllStringSubmatch(content, -1)\n    paramSet := make(map[string]bool)\n    for _, match := range matches {\n        if len(match) > 1 {\n            param := strings.TrimSpace(match[1])\n            if len(param) > 0 && len(param) < 100 {\n                paramSet[param] = true\n            }\n        }\n    }\n    \n    // Pattern 2: URLSearchParams.getAll() - Extract parameter names\n    urlParamsGetAllPattern := regexp.MustCompile(`(?:urlParams|searchParams|params|urlSearchParams|queryParams)\\.getAll\\([\"']([a-zA-Z0-9_\\-\\[\\]]+)[\"']\\)`)\n    matches = urlParamsGetAllPattern.FindAllStringSubmatch(content, -1)\n    for _, match := range matches {\n        if len(match) > 1 {\n            param := strings.TrimSpace(match[1])\n       
     if len(param) > 0 && len(param) < 100 {\n                paramSet[param] = true\n            }\n        }\n    }\n    \n    // Pattern 3: URL.searchParams.get() - Extract parameter names\n    urlSearchParamsPattern := regexp.MustCompile(`(?:new\\s+URL\\([^)]+\\)|currentUrl|url|apiUrl|baseUrl)\\.searchParams\\.get\\([\"']([a-zA-Z0-9_\\-\\[\\]]+)[\"']\\)`)\n    matches = urlSearchParamsPattern.FindAllStringSubmatch(content, -1)\n    for _, match := range matches {\n        if len(match) > 1 {\n            param := strings.TrimSpace(match[1])\n            if len(param) > 0 && len(param) < 100 {\n                paramSet[param] = true\n            }\n        }\n    }\n    \n    // Pattern 4: Manual string parsing - Extract from split('&') patterns\n    manualParsePattern := regexp.MustCompile(`pair\\[0\\]\\s*===\\s*[\"']([a-zA-Z0-9_\\-]+)[\"']`)\n    matches = manualParsePattern.FindAllStringSubmatch(content, -1)\n    for _, match := range matches {\n        if len(match) > 1 {\n            param := strings.TrimSpace(match[1])\n            if len(param) > 0 && len(param) < 100 {\n                paramSet[param] = true\n            }\n        }\n    }\n    \n    // Pattern 5: Custom getParam() function calls\n    customGetParamPattern := regexp.MustCompile(`getParam\\([\"']([a-zA-Z0-9_\\-]+)[\"']\\)`)\n    matches = customGetParamPattern.FindAllStringSubmatch(content, -1)\n    for _, match := range matches {\n        if len(match) > 1 {\n            param := strings.TrimSpace(match[1])\n            if len(param) > 0 && len(param) < 100 {\n                paramSet[param] = true\n            }\n        }\n    }\n    \n    // Pattern 5b: URLSearchParams.has() - parameters that are checked\n    urlParamsHasPattern := regexp.MustCompile(`(?:urlParams|searchParams|params|urlSearchParams|queryParams)\\.has\\([\"']([a-zA-Z0-9_\\-\\[\\]]+)[\"']\\)`)\n    matches = urlParamsHasPattern.FindAllStringSubmatch(content, -1)\n    for _, match := range matches {\n        if 
len(match) > 1 {\n            param := strings.TrimSpace(match[1])\n            if len(param) > 0 && len(param) < 100 {\n                paramSet[param] = true\n            }\n        }\n    }\n    \n    // Pattern 5c: Direct URL parameter extraction from query strings in code\n    directQueryPattern := regexp.MustCompile(`[\"']([a-zA-Z0-9_\\-]+)[\"']\\s*[:=]\\s*(?:urlParams|searchParams|params)\\.get\\(`)\n    matches = directQueryPattern.FindAllStringSubmatch(content, -1)\n    for _, match := range matches {\n        if len(match) > 1 {\n            param := strings.TrimSpace(match[1])\n            if len(param) > 0 && len(param) < 100 {\n                paramSet[param] = true\n            }\n        }\n    }\n    \n    // Pattern 6: Fetch/Axios with URLSearchParams in URL (template literals and strings)\n    fetchWithParamsPattern := regexp.MustCompile(`fetch\\([\"'` + \"`\" + `]([^\"'` + \"`\" + `]+)\\?[^\"'` + \"`\" + `]*[\"'` + \"`\" + `]`)\n    matches = fetchWithParamsPattern.FindAllStringSubmatch(content, -1)\n    for _, match := range matches {\n        if len(match) > 1 {\n            fullURL := match[1]\n            // Extract base URL\n            if strings.HasPrefix(fullURL, \"http\") {\n                parsedURL, err := url.Parse(fullURL)\n                if err == nil {\n                    base := parsedURL.Scheme + \"://\" + parsedURL.Host + parsedURL.Path\n                    // Extract params from query string\n                    if parsedURL.RawQuery != \"\" {\n                        queryParams, _ := url.ParseQuery(parsedURL.RawQuery)\n                        var params []string\n                        for key := range queryParams {\n                            if len(key) > 0 && len(key) < 100 {\n                                params = append(params, key)\n                            }\n                        }\n                        if len(params) > 0 {\n                            // Check if URL is from same base domain\n           
                 urlDomain := extractBaseDomain(parsedURL.Host)\n                            sourceBaseDomain := extractBaseDomain(extractDomain(source))\n                            if sourceBaseDomain == \"\" || urlDomain == sourceBaseDomain {\n                                sort.Strings(params)\n                                queryStr := strings.Join(params, \"=&\") + \"=\"\n                                resultURL := base + \"?\" + queryStr\n                                if !seenURLs[resultURL] {\n                                    seenURLs[resultURL] = true\n                                    resultURLs = append(resultURLs, resultURL)\n                                }\n                            }\n                        }\n                    }\n                }\n            }\n        }\n    }\n    \n    // Pattern 6b: Fetch with template literals containing parameters\n    fetchTemplatePattern := regexp.MustCompile(`fetch\\(` + \"`\" + `([^` + \"`\" + `]+)\\$\\{[^}]+\\}[^` + \"`\" + `]*` + \"`\" + `\\)`)\n    matches = fetchTemplatePattern.FindAllStringSubmatch(content, -1)\n    for _, match := range matches {\n        if len(match) > 1 {\n            urlPart := match[1]\n            // Extract base URL from template\n            if strings.Contains(urlPart, \"?\") {\n                parts := strings.Split(urlPart, \"?\")\n                if len(parts) > 0 {\n                    basePart := parts[0]\n                    if strings.HasPrefix(basePart, \"http\") {\n                        parsedURL, err := url.Parse(basePart)\n                        if err == nil {\n                            base := parsedURL.Scheme + \"://\" + parsedURL.Host + parsedURL.Path\n                            // Extract parameter names from template (look for ${var} patterns)\n                            paramVarPattern := regexp.MustCompile(`\\$\\{([a-zA-Z_][a-zA-Z0-9_]*)\\}`)\n                            varMatches := paramVarPattern.FindAllStringSubmatch(match[0], 
-1)\n                            var params []string\n                            for _, vm := range varMatches {\n                                if len(vm) > 1 {\n                                    // Try to find what this variable represents (might be a param name)\n                                    varName := vm[1]\n                                    // Look for this variable being assigned from urlParams.get()\n                                    varPattern := regexp.MustCompile(varName + `\\s*=\\s*(?:urlParams|searchParams|params)\\.get\\([\"']([^\"']+)[\"']\\)`)\n                                    varAssignMatches := varPattern.FindAllStringSubmatch(content, -1)\n                                    if len(varAssignMatches) > 0 {\n                                        paramName := varAssignMatches[0][1]\n                                        params = append(params, paramName)\n                                    }\n                                }\n                            }\n                            // Also extract from query string part if present\n                            if len(parts) > 1 {\n                                queryPart := parts[1]\n                                queryParamPattern := regexp.MustCompile(`([a-zA-Z0-9_\\-]+)=`)\n                                queryMatches := queryParamPattern.FindAllStringSubmatch(queryPart, -1)\n                                for _, qm := range queryMatches {\n                                    if len(qm) > 1 {\n                                        params = append(params, qm[1])\n                                    }\n                                }\n                            }\n                            if len(params) > 0 {\n                                // Check if URL is from same base domain\n                                urlDomain := extractBaseDomain(parsedURL.Host)\n                                sourceBaseDomain := extractBaseDomain(extractDomain(source))\n           
                     if sourceBaseDomain == \"\" || urlDomain == sourceBaseDomain {\n                                    sort.Strings(params)\n                                    queryStr := strings.Join(params, \"=&\") + \"=\"\n                                    resultURL := base + \"?\" + queryStr\n                                    if !seenURLs[resultURL] {\n                                        seenURLs[resultURL] = true\n                                        resultURLs = append(resultURLs, resultURL)\n                                    }\n                                }\n                            }\n                        }\n                    }\n                }\n            }\n        }\n    }\n    \n    // Pattern 7: Axios params object\n    axiosParamsPattern := regexp.MustCompile(`axios\\.(?:get|post|put|delete|patch)\\([\"']([^\"']+)[\"'][^)]*params\\s*:\\s*\\{([^}]+)\\}`)\n    matches = axiosParamsPattern.FindAllStringSubmatch(content, -1)\n    for _, match := range matches {\n        if len(match) > 2 {\n            apiURL := match[1]\n            paramsStr := match[2]\n            // Extract parameter names from params object\n            paramNamePattern := regexp.MustCompile(`([a-zA-Z0-9_\\-]+)\\s*:`)\n            paramMatches := paramNamePattern.FindAllStringSubmatch(paramsStr, -1)\n            var params []string\n            for _, pm := range paramMatches {\n                if len(pm) > 1 {\n                    param := strings.TrimSpace(pm[1])\n                    if len(param) > 0 && len(param) < 100 {\n                        params = append(params, param)\n                    }\n                }\n            }\n            if len(params) > 0 {\n                // Build full URL\n                if !strings.HasPrefix(apiURL, \"http\") && baseURL != \"\" {\n                    apiURL = baseURL + apiURL\n                } else if !strings.HasPrefix(apiURL, \"http\") {\n                    // Use source domain if available, 
otherwise skip\n                    if sourceDomain != \"\" {\n                        apiURL = sourceDomain + apiURL\n                    } else {\n                        continue // Skip if no domain available\n                    }\n                }\n                // Check if URL is from same base domain\n                parsedAPIURL, err := url.Parse(apiURL)\n                if err == nil {\n                    urlDomain := extractBaseDomain(parsedAPIURL.Host)\n                    sourceBaseDomain := extractBaseDomain(extractDomain(source))\n                    if sourceBaseDomain == \"\" || urlDomain == sourceBaseDomain {\n                        sort.Strings(params)\n                        queryStr := strings.Join(params, \"=&\") + \"=\"\n                        // Check if apiURL already has a query string\n                        separator := \"?\"\n                        if strings.Contains(apiURL, \"?\") {\n                            separator = \"&\"\n                        }\n                        resultURL := apiURL + separator + queryStr\n                        if !seenURLs[resultURL] {\n                            seenURLs[resultURL] = true\n                            resultURLs = append(resultURLs, resultURL)\n                        }\n                    }\n                }\n            }\n        }\n    }\n    \n    // Pattern 8: URLSearchParams constructor with object\n    urlSearchParamsObjPattern := regexp.MustCompile(`new\\s+URLSearchParams\\([^)]*\\{([^}]+)\\}[^)]*\\)`)\n    matches = urlSearchParamsObjPattern.FindAllStringSubmatch(content, -1)\n    for _, match := range matches {\n        if len(match) > 1 {\n            paramsStr := match[1]\n            paramNamePattern := regexp.MustCompile(`([a-zA-Z0-9_\\-]+)\\s*:`)\n            paramMatches := paramNamePattern.FindAllStringSubmatch(paramsStr, -1)\n            var params []string\n            for _, pm := range paramMatches {\n                if len(pm) > 1 {\n              
      param := strings.TrimSpace(pm[1])\n                    if len(param) > 0 && len(param) < 100 {\n                        params = append(params, param)\n                    }\n                }\n            }\n            if len(params) > 0 {\n                // Find associated URL in nearby context\n                contextStart := strings.LastIndex(content[:strings.Index(content, match[0])], \"fetch(\")\n                contextStart2 := strings.LastIndex(content[:strings.Index(content, match[0])], \"axios.\")\n                if contextStart2 > contextStart {\n                    contextStart = contextStart2\n                }\n                // LastIndex returns -1 when neither call is found; index 0 is a valid hit\n                if contextStart >= 0 {\n                    // Extract URL from context\n                    urlPattern := regexp.MustCompile(`[\"']([^\"']+)[\"']`)\n                    context := content[contextStart:strings.Index(content, match[0])]\n                    urlMatches := urlPattern.FindAllStringSubmatch(context, -1)\n                    if len(urlMatches) > 0 {\n                        apiURL := urlMatches[0][1]\n                        if !strings.HasPrefix(apiURL, \"http\") && baseURL != \"\" {\n                            apiURL = baseURL + apiURL\n                        } else if !strings.HasPrefix(apiURL, \"http\") {\n                            // Use source domain if available, otherwise skip\n                            if sourceDomain != \"\" {\n                                apiURL = sourceDomain + apiURL\n                            } else {\n                                continue // Skip if no domain available\n                            }\n                        }\n                        // Check if URL is from same base domain\n                        parsedAPIURL, err := url.Parse(apiURL)\n                        if err == nil {\n                            urlDomain := extractBaseDomain(parsedAPIURL.Host)\n                            sourceBaseDomain := extractBaseDomain(extractDomain(source))\n     
                       if sourceBaseDomain == \"\" || urlDomain == sourceBaseDomain {\n                                sort.Strings(params)\n                                queryStr := strings.Join(params, \"=&\") + \"=\"\n                                // Check if apiURL already has a query string\n                                separator := \"?\"\n                                if strings.Contains(apiURL, \"?\") {\n                                    separator = \"&\"\n                                }\n                                resultURL := apiURL + separator + queryStr\n                                if !seenURLs[resultURL] {\n                                    seenURLs[resultURL] = true\n                                    resultURLs = append(resultURLs, resultURL)\n                                }\n                            }\n                        }\n                    }\n                }\n            }\n        }\n    }\n    \n    // Pattern 9: Direct URL with query parameters in strings\n    directURLPattern := regexp.MustCompile(`[\"'](https?://[^\"']+\\?[^\"']+)[\"']`)\n    matches = directURLPattern.FindAllStringSubmatch(content, -1)\n    for _, match := range matches {\n        if len(match) > 1 {\n            fullURL := match[1]\n            parsedURL, err := url.Parse(fullURL)\n            if err == nil {\n                base := parsedURL.Scheme + \"://\" + parsedURL.Host + parsedURL.Path\n                if parsedURL.RawQuery != \"\" {\n                    queryParams, _ := url.ParseQuery(parsedURL.RawQuery)\n                    var params []string\n                    for key := range queryParams {\n                        if len(key) > 0 && len(key) < 100 {\n                            params = append(params, key)\n                        }\n                    }\n                    if len(params) > 0 {\n                        // Check if URL is from same base domain\n                        urlDomain := 
extractBaseDomain(parsedURL.Host)\n                        sourceBaseDomain := extractBaseDomain(extractDomain(source))\n                        if sourceBaseDomain == \"\" || urlDomain == sourceBaseDomain {\n                            sort.Strings(params)\n                            queryStr := strings.Join(params, \"=&\") + \"=\"\n                            resultURL := base + \"?\" + queryStr\n                            if !seenURLs[resultURL] {\n                                seenURLs[resultURL] = true\n                                resultURLs = append(resultURLs, resultURL)\n                            }\n                        }\n                    }\n                }\n            }\n        }\n    }\n    \n    // Pattern 10: XHR.open() with parameters\n    xhrPattern := regexp.MustCompile(`\\.open\\([\"'](?:GET|POST|PUT|DELETE|PATCH)[\"']\\s*,\\s*[\"']([^\"']+\\?[^\"']+)[\"']`)\n    matches = xhrPattern.FindAllStringSubmatch(content, -1)\n    for _, match := range matches {\n        if len(match) > 1 {\n            fullURL := match[1]\n            parsedURL, err := url.Parse(fullURL)\n            if err == nil {\n                base := parsedURL.Scheme + \"://\" + parsedURL.Host + parsedURL.Path\n                if parsedURL.RawQuery != \"\" {\n                    queryParams, _ := url.ParseQuery(parsedURL.RawQuery)\n                    var params []string\n                    for key := range queryParams {\n                        if len(key) > 0 && len(key) < 100 {\n                            params = append(params, key)\n                        }\n                    }\n                    if len(params) > 0 {\n                        // Check if URL is from same base domain\n                        urlDomain := extractBaseDomain(parsedURL.Host)\n                        sourceBaseDomain := extractBaseDomain(extractDomain(source))\n                        if sourceBaseDomain == \"\" || urlDomain == sourceBaseDomain {\n                           
 sort.Strings(params)\n                            queryStr := strings.Join(params, \"=&\") + \"=\"\n                            resultURL := base + \"?\" + queryStr\n                            if !seenURLs[resultURL] {\n                                seenURLs[resultURL] = true\n                                resultURLs = append(resultURLs, resultURL)\n                            }\n                        }\n                    }\n                }\n            }\n        }\n    }\n    \n    // Pattern 11: Template literals with parameters\n    templateLiteralPattern := regexp.MustCompile(`\\$\\{([^}]+)\\?[^}]*\\}`)\n    matches = templateLiteralPattern.FindAllStringSubmatch(content, -1)\n    for _, match := range matches {\n        if len(match) > 1 {\n            // This is complex, skip for now or extract base URL\n        }\n    }\n    \n    // Always create URLs from all collected parameters (even if some URLs were found from fetch/axios)\n    if len(paramSet) > 0 {\n        // Group parameters by context - find parameters used together in same function\n        paramGroups := groupParamsByContext(content, paramSet)\n        \n        // Try to find base URLs in the content\n        baseURLPatterns := []*regexp.Regexp{\n            regexp.MustCompile(`[\"'](https?://[^\"']+/[^\"']*)[\"']`),\n            regexp.MustCompile(`baseURL\\s*[:=]\\s*[\"']([^\"']+)[\"']`),\n            regexp.MustCompile(`apiBase\\s*[:=]\\s*[\"']([^\"']+)[\"']`),\n            regexp.MustCompile(`API_URL\\s*[:=]\\s*[\"']([^\"']+)[\"']`),\n            regexp.MustCompile(`endpointBase\\s*[:=]\\s*[\"']([^\"']+)[\"']`),\n        }\n        \n        var foundBaseURL string\n        for _, pattern := range baseURLPatterns {\n            urlMatches := pattern.FindAllStringSubmatch(content, -1)\n            if len(urlMatches) > 0 {\n                foundBaseURL = urlMatches[0][1]\n                // Remove query string if present\n                if idx := strings.Index(foundBaseURL, 
\"?\"); idx != -1 {\n                    foundBaseURL = foundBaseURL[:idx]\n                }\n                // Strip trailing slashes; \"/?\" is appended when the URLs are built below\n                foundBaseURL = strings.TrimRight(foundBaseURL, \"/\")\n                break\n            }\n        }\n        \n        if foundBaseURL == \"\" {\n            if baseURL != \"\" {\n                // Use the source domain root path when full path is unknown\n                foundBaseURL = baseURL\n            } else if sourceDomain != \"\" {\n                // Use source domain root when no base URL found\n                foundBaseURL = sourceDomain\n            } else {\n                // Skip if no domain available\n                return resultURLs\n            }\n        }\n        \n        // Create URLs for each parameter group\n        for _, group := range paramGroups {\n            if len(group) > 0 {\n                sort.Strings(group)\n                // Remove duplicates from group\n                uniqueGroup := []string{}\n                seenInGroup := make(map[string]bool)\n                for _, p := range group {\n                    if !seenInGroup[p] {\n                        seenInGroup[p] = true\n                        uniqueGroup = append(uniqueGroup, p)\n                    }\n                }\n                \n                if len(uniqueGroup) == 1 {\n                    // Single parameter - use root path\n                    singleURL := foundBaseURL + \"/?\" + uniqueGroup[0] + \"=\"\n                    if !seenURLs[singleURL] {\n                        seenURLs[singleURL] = true\n                        resultURLs = append(resultURLs, singleURL)\n                    }\n                } else if len(uniqueGroup) > 1 {\n                    // Multiple parameters - join with &, use root path\n                    queryStr := strings.Join(uniqueGroup, \"=&\") + \"=\"\n                    resultURL := foundBaseURL + \"/?\" + queryStr\n    
                if !seenURLs[resultURL] {\n                        seenURLs[resultURL] = true\n                        resultURLs = append(resultURLs, resultURL)\n                    }\n                }\n            }\n        }\n        \n        // Also create individual URLs for each parameter (if not already created)\n        paramsList := make([]string, 0, len(paramSet))\n        for param := range paramSet {\n            paramsList = append(paramsList, param)\n        }\n        sort.Strings(paramsList)\n        for _, param := range paramsList {\n            // Use root path when full path is unknown\n            singleURL := foundBaseURL + \"/?\" + param + \"=\"\n            if !seenURLs[singleURL] {\n                seenURLs[singleURL] = true\n                resultURLs = append(resultURLs, singleURL)\n            }\n        }\n    }\n    \n    // Filter URLs by domain - only include URLs from the same base domain as source\n    if sourceDomain != \"\" {\n        sourceBaseDomain := extractBaseDomain(extractDomain(source))\n        if sourceBaseDomain != \"\" {\n            filteredURLs := []string{}\n            for _, resultURL := range resultURLs {\n                // Extract domain from result URL\n                urlDomain := extractDomain(resultURL)\n                if urlDomain != \"\" {\n                    urlBaseDomain := extractBaseDomain(urlDomain)\n                    // Only include if it's from the same base domain\n                    if urlBaseDomain == sourceBaseDomain {\n                        filteredURLs = append(filteredURLs, resultURL)\n                    }\n                } else {\n                    // If we can't extract domain (relative URL), include it (it's from source domain)\n                    filteredURLs = append(filteredURLs, resultURL)\n                }\n            }\n            return filteredURLs\n        }\n    }\n    \n    return resultURLs\n}\n\n// groupParamsByContext groups parameters that are used 
together in the same function/context\nfunc groupParamsByContext(content string, paramSet map[string]bool) [][]string {\n    var groups [][]string\n    usedParams := make(map[string]bool)\n    \n    // Find function blocks and group parameters within them\n    // Look for common patterns where multiple params are used together\n    \n    // Pattern: Multiple .get() calls in sequence (likely same function)\n    urlParamsPattern := regexp.MustCompile(`(?:urlParams|searchParams|params|urlSearchParams|queryParams)\\.get\\([\"']([a-zA-Z0-9_\\-\\[\\]]+)[\"']\\)`)\n    allMatches := urlParamsPattern.FindAllStringSubmatchIndex(content, -1)\n    \n    // Group consecutive parameter extractions (within 200 chars)\n    var currentGroup []string\n    lastPos := -1\n    \n    for _, match := range allMatches {\n        if len(match) >= 4 {\n            paramName := content[match[2]:match[3]]\n            currentPos := match[0]\n            \n            if paramSet[paramName] && !usedParams[paramName] {\n                if lastPos == -1 || (currentPos - lastPos) < 200 {\n                    // Same context\n                    currentGroup = append(currentGroup, paramName)\n                    usedParams[paramName] = true\n                } else {\n                    // New context\n                    if len(currentGroup) > 0 {\n                        groups = append(groups, currentGroup)\n                    }\n                    currentGroup = []string{paramName}\n                    usedParams[paramName] = true\n                }\n                lastPos = currentPos\n            }\n        }\n    }\n    \n    if len(currentGroup) > 0 {\n        groups = append(groups, currentGroup)\n    }\n    \n    // Also look for URLSearchParams object creation with multiple params\n    urlSearchParamsObjPattern := regexp.MustCompile(`new\\s+URLSearchParams\\([^)]*\\{([^}]+)\\}`)\n    matches := urlSearchParamsObjPattern.FindAllStringSubmatch(content, -1)\n    for _, match := range 
matches {\n        if len(match) > 1 {\n            paramsStr := match[1]\n            paramNamePattern := regexp.MustCompile(`([a-zA-Z0-9_\\-]+)\\s*:`)\n            paramMatches := paramNamePattern.FindAllStringSubmatch(paramsStr, -1)\n            var group []string\n            for _, pm := range paramMatches {\n                if len(pm) > 1 {\n                    param := strings.TrimSpace(pm[1])\n                    if paramSet[param] && !usedParams[param] {\n                        group = append(group, param)\n                        usedParams[param] = true\n                    }\n                }\n            }\n            if len(group) > 0 {\n                groups = append(groups, group)\n            }\n        }\n    }\n    \n    return groups\n}\n\n// cleanURL removes trailing punctuation and invalid characters from URLs\nfunc cleanURL(urlStr string) string {\n    // Remove trailing punctuation: , ; \\ ) | etc.\n    urlStr = strings.TrimRight(urlStr, \",;\\\\|)\")\n    \n    // Remove any trailing quotes\n    urlStr = strings.Trim(urlStr, `\"'`)\n    \n    // Remove any trailing special characters that are not valid in URLs\n    urlStr = strings.TrimRight(urlStr, \" \\t\\n\\r\")\n    \n    return urlStr\n}\n\n// isValidURL checks if a URL is valid (proper format, not malformed)\nfunc isValidURL(urlStr string) bool {\n    // Parse the URL\n    parsedURL, err := url.Parse(urlStr)\n    if err != nil {\n        return false\n    }\n    \n    // Must have a valid scheme\n    if parsedURL.Scheme != \"http\" && parsedURL.Scheme != \"https\" {\n        return false\n    }\n    \n    // Must have a host\n    if parsedURL.Host == \"\" {\n        return false\n    }\n    \n    // Check for malformed port (e.g., :80x)\n    if strings.Contains(parsedURL.Host, \":\") {\n        hostParts := strings.Split(parsedURL.Host, \":\")\n        if len(hostParts) == 2 {\n            // Port must be numeric\n            port := hostParts[1]\n            for _, char := range 
port {\n                if char < '0' || char > '9' {\n                    return false // Invalid port (contains non-numeric characters)\n                }\n            }\n        }\n    }\n    \n    return true\n}\n\n// isPlaceholderURL checks if a URL is a placeholder/template (not a real URL)\nfunc isPlaceholderURL(urlStr string) bool {\n    urlLower := strings.ToLower(urlStr)\n    \n    // Common placeholder patterns\n    placeholders := []string{\n        \"servername\", \"hostname\", \"domain.com\", \"example.com\", \"example.org\",\n        \"yourserver\", \"yourdomain\", \"localhost\", \"127.0.0.1\", \"0.0.0.0\",\n        \"server.com\", \"host.com\", \"domain.org\", \"site.com\", \"mydomain.com\",\n        \":port/\", \"accounturl\", \"username\", \"password\",\n    }\n    \n    for _, placeholder := range placeholders {\n        if strings.Contains(urlLower, placeholder) {\n            return true\n        }\n    }\n    \n    // Check for template variables like ${variable} or {variable} or %variable%\n    if strings.Contains(urlStr, \"${\") || strings.Contains(urlStr, \"%{\") || \n       strings.Contains(urlStr, \"{{\") || strings.Contains(urlStr, \"<%\") {\n        return true\n    }\n    \n    return false\n}\n\n// isURLInComment checks if a URL appears to be in a JavaScript comment\nfunc isURLInComment(context, match string) bool {\n    // Find the position of the match in the context\n    matchPos := strings.Index(context, match)\n    if matchPos == -1 {\n        return false\n    }\n    \n    // Look backwards from the match to check for comment markers\n    beforeMatch := context[:matchPos]\n    \n    // Check for single-line comment (//)\n    // Find the last newline before the match\n    lastNewline := strings.LastIndex(beforeMatch, \"\\n\")\n    if lastNewline != -1 {\n        lineBeforeMatch := beforeMatch[lastNewline+1:]\n        // If there's a // before the match on the same line, it's in a comment\n        if 
strings.Contains(lineBeforeMatch, \"//\") {\n            return true\n        }\n    } else {\n        // No newline found, check entire beforeMatch\n        if strings.Contains(beforeMatch, \"//\") {\n            return true\n        }\n    }\n    \n    // Check for multi-line comment (/* ... */)\n    // Find the last /* and */ before the match\n    lastCommentStart := strings.LastIndex(beforeMatch, \"/*\")\n    lastCommentEnd := strings.LastIndex(beforeMatch, \"*/\")\n    \n    // If /* is found and there's no */ after it (or */ comes before /*), we're in a comment\n    if lastCommentStart != -1 {\n        if lastCommentEnd == -1 || lastCommentEnd < lastCommentStart {\n            // We're inside a multi-line comment\n            return true\n        }\n    }\n    \n    return false\n}\n\n// isMatchInBase64DataURI checks if a match is inside a base64 data URI (e.g., data:image/png;base64,...)\nfunc isMatchInBase64DataURI(context, match string) bool {\n    // Find the position of the match in the context\n    matchPos := strings.Index(context, match)\n    if matchPos == -1 {\n        return false\n    }\n    \n    // Look backwards from the match position to find \"base64,\"\n    // This is more reliable than looking for the full data URI pattern\n    searchStart := matchPos - 300 // Look back up to 300 characters\n    if searchStart < 0 {\n        searchStart = 0\n    }\n    searchContext := context[searchStart:matchPos]\n    \n    // Find the last occurrence of \"base64,\" before the match\n    base64Pos := strings.LastIndex(searchContext, \"base64,\")\n    if base64Pos == -1 {\n        return false\n    }\n    \n    // Check if there's a data URI pattern before \"base64,\"\n    // Pattern: data:image/[type];base64, or data:[type];base64,\n    dataURIPattern := regexp.MustCompile(`data:(?:image/[a-zA-Z0-9+\\-]+|application/[a-zA-Z0-9+\\-]+|text/[a-zA-Z0-9+\\-]+);base64,`)\n    \n    // Get the text before \"base64,\" to check for data URI pattern\n    
beforeBase64 := searchContext[:base64Pos+7] // Include \"base64,\" in the check\n    \n    // Check if we have a valid data URI pattern ending with \"base64,\"\n    // Look backwards from \"base64,\" to find \"data:\"\n    dataPos := strings.LastIndex(beforeBase64, \"data:\")\n    if dataPos == -1 {\n        return false\n    }\n    \n    // Extract the potential data URI\n    potentialDataURI := searchContext[dataPos:base64Pos+7]\n    \n    // Check if it matches the data URI pattern\n    if dataURIPattern.MatchString(potentialDataURI) {\n        // The match is after \"base64,\", so it's part of base64 encoded data\n        return true\n    }\n    \n    return false\n}\n\n// isLikelyBase64MediaData checks if a match looks like base64-encoded media content\nfunc isLikelyBase64MediaData(context, match string) bool {\n    // Check if the match itself looks like base64 data\n    if !looksLikeBase64(match) {\n        return false\n    }\n    \n    // Find the position of the match in the context\n    matchPos := strings.Index(context, match)\n    if matchPos == -1 {\n        return false\n    }\n    \n    // Get surrounding context for analysis\n    contextStart := matchPos - 200\n    if contextStart < 0 {\n        contextStart = 0\n    }\n    contextEnd := matchPos + len(match) + 200\n    if contextEnd > len(context) {\n        contextEnd = len(context)\n    }\n    surroundingContext := context[contextStart:contextEnd]\n    \n    // Check for media-related indicators in surrounding context\n    mediaIndicators := []string{\n        \"data:image\", \"data:video\", \"data:audio\",\n        \"base64,\", \"data:application/octet-stream\",\n        \"png\", \"jpg\", \"jpeg\", \"gif\", \"webp\", \"svg\",\n        \"mp4\", \"webm\", \"ogg\", \"wav\", \"mp3\",\n        \"font\", \"woff\", \"woff2\", \"ttf\", \"otf\",\n        \"modernizr\", \"polyfill\", \"encoded\", \"binary\",\n    }\n    \n    lowerContext := strings.ToLower(surroundingContext)\n    for _, indicator := 
range mediaIndicators {\n        if strings.Contains(lowerContext, indicator) {\n            return true\n        }\n    }\n    \n    // Check for long base64 strings (likely media content)\n    if len(match) > 100 && hasHighBase64Entropy(match) {\n        return true\n    }\n    \n    // Check if it's part of a larger base64 string\n    if isPartOfLargerBase64String(context, matchPos, len(match)) {\n        return true\n    }\n    \n    return false\n}\n\n// looksLikeBase64 checks if a string looks like base64 encoded data\nfunc looksLikeBase64(s string) bool {\n    if len(s) < 16 { // Too short to be meaningful base64\n        return false\n    }\n    \n    // Base64 uses A-Z, a-z, 0-9, +, /, and = for padding\n    validChars := 0\n    for _, r := range s {\n        if (r >= 'A' && r <= 'Z') || (r >= 'a' && r <= 'z') || \n           (r >= '0' && r <= '9') || r == '+' || r == '/' || r == '=' {\n            validChars++\n        }\n    }\n    \n    // Should be mostly valid base64 characters\n    ratio := float64(validChars) / float64(len(s))\n    return ratio > 0.95\n}\n\n// hasHighBase64Entropy checks if the string has high entropy typical of encoded data\nfunc hasHighBase64Entropy(s string) bool {\n    if len(s) < 32 {\n        return false\n    }\n    \n    // Count character frequency\n    charCount := make(map[rune]int)\n    for _, r := range s {\n        charCount[r]++\n    }\n    \n    // Calculate entropy\n    entropy := 0.0\n    length := float64(len(s))\n    for _, count := range charCount {\n        if count > 0 {\n            p := float64(count) / length\n            entropy -= p * math.Log2(p)\n        }\n    }\n    \n    // Base64 encoded data typically has entropy > 4.5\n    // Media files when base64 encoded usually have high entropy\n    return entropy > 4.5\n}\n\n// isPartOfLargerBase64String checks if the match is part of a larger base64 encoded string\nfunc isPartOfLargerBase64String(context string, matchPos, matchLen int) bool {\n    // Look 
at characters before and after the match\n    expandedStart := matchPos - 50\n    if expandedStart < 0 {\n        expandedStart = 0\n    }\n    expandedEnd := matchPos + matchLen + 50\n    if expandedEnd > len(context) {\n        expandedEnd = len(context)\n    }\n    \n    expandedString := context[expandedStart:expandedEnd]\n    \n    // Check if the expanded string looks like base64\n    if len(expandedString) > len(context[matchPos:matchPos+matchLen])*2 && looksLikeBase64(expandedString) {\n        return true\n    }\n    \n    return false\n}\n\n// extractDomain extracts the domain from a URL string\nfunc extractDomain(urlStr string) string {\n    if !strings.HasPrefix(urlStr, \"http://\") && !strings.HasPrefix(urlStr, \"https://\") {\n        return \"\"\n    }\n    \n    parsedURL, err := url.Parse(urlStr)\n    if err != nil {\n        return \"\"\n    }\n    \n    host := parsedURL.Host\n    // Remove port if present\n    if idx := strings.Index(host, \":\"); idx != -1 {\n        host = host[:idx]\n    }\n    \n    return host\n}\n\n// extractBaseDomain extracts the base domain (e.g., \"target.com\" from \"assets.target.com\")\nfunc extractBaseDomain(domain string) string {\n    if domain == \"\" {\n        return \"\"\n    }\n    \n    // Handle IP addresses - return as is\n    if net.ParseIP(domain) != nil {\n        return domain\n    }\n    \n    // Handle localhost and single-label domains\n    if !strings.Contains(domain, \".\") {\n        return domain\n    }\n    \n    parts := strings.Split(domain, \".\")\n    if len(parts) < 2 {\n        return domain\n    }\n    \n    // For most cases, base domain is last 2 parts (e.g., target.com).\n    // Multi-part suffixes like .co.uk or .com.au are NOT handled yet:\n    // for simplicity, we use the last 2 parts for now.\n    // This works for most common cases: target.com, example.org, etc.\n    if len(parts) >= 2 {\n        return parts[len(parts)-2] + \".\" + parts[len(parts)-1]\n    }\n    \n    return domain\n}\n\n// 
isSameBaseDomain checks if two domains share the same base domain (handles subdomains)\nfunc isSameBaseDomain(domain1, domain2 string) bool {\n    if domain1 == \"\" || domain2 == \"\" {\n        return false\n    }\n    \n    base1 := extractBaseDomain(domain1)\n    base2 := extractBaseDomain(domain2)\n    \n    return base1 == base2 && base1 != \"\"\n}\n\n// isMatchInURL checks if a match appears to be part of a URL from a different domain\nfunc isMatchInURL(context, match, sourceDomain string) bool {\n    if sourceDomain == \"\" {\n        return false // Can't compare if source is not a URL\n    }\n    \n    sourceBaseDomain := extractBaseDomain(sourceDomain)\n    if sourceBaseDomain == \"\" {\n        return false\n    }\n    \n    // Find all URLs in the context\n    urlPattern := regexp.MustCompile(`https?://[^\\s\"'<>\\)]+`)\n    urls := urlPattern.FindAllString(context, -1)\n    \n    for _, urlStr := range urls {\n        // Check if the match is contained within this URL\n        if strings.Contains(urlStr, match) {\n            urlDomain := extractDomain(urlStr)\n            if urlDomain != \"\" {\n                urlBaseDomain := extractBaseDomain(urlDomain)\n                // If the URL's base domain doesn't match the source base domain, filter it out\n                if urlBaseDomain != \"\" && urlBaseDomain != sourceBaseDomain {\n                    return true // Match is part of a URL from a different base domain\n                }\n            }\n        }\n    }\n    \n    return false\n}\n\n// filterMatchesByDomain filters out matches that are from URLs on different domains\nfunc filterMatchesByDomain(matches []string, sourceURL string) []string {\n    sourceDomain := extractDomain(sourceURL)\n    if sourceDomain == \"\" {\n        return matches // Can't filter if source is not a URL\n    }\n    \n    sourceBaseDomain := extractBaseDomain(sourceDomain)\n    if sourceBaseDomain == \"\" {\n        return matches\n    }\n    \n    filtered := 
[]string{}\n    urlPattern := regexp.MustCompile(`https?://[^\\s\"'<>]+`)\n    \n    for _, match := range matches {\n        shouldInclude := true\n        \n        // Check if match is a complete URL\n        if urlPattern.MatchString(match) {\n            matchDomain := extractDomain(match)\n            if matchDomain != \"\" {\n                matchBaseDomain := extractBaseDomain(matchDomain)\n                // Only include if it's from the same base domain\n                if matchBaseDomain != \"\" && matchBaseDomain != sourceBaseDomain {\n                    shouldInclude = false // Different base domain URL\n                }\n            }\n        } else {\n            // Check if match appears to be part of a URL by looking for common URL indicators\n            // This handles cases where the regex matched part of a URL string\n            \n            // Check for email addresses that might be part of URLs\n            if strings.Contains(match, \"@\") {\n                // Try to extract domain from email\n                emailParts := strings.Split(match, \"@\")\n                if len(emailParts) == 2 {\n                    emailDomain := emailParts[1]\n                    emailBaseDomain := extractBaseDomain(emailDomain)\n                    // If email domain is from different base domain, filter it out\n                    if emailBaseDomain != \"\" && emailBaseDomain != sourceBaseDomain {\n                        shouldInclude = false\n                    }\n                }\n            }\n            \n            // For other patterns (UUIDs, etc.), we rely on the context check in isMatchInURL\n            // This secondary filter is mainly for additional safety\n        }\n        \n        if shouldInclude {\n            filtered = append(filtered, match)\n        }\n    }\n    \n    return filtered\n}\n\n// reportMatchesWithConfig enhanced reporting with all security analysis features\nfunc reportMatchesWithConfig(source string, body 
[]byte, config *Config) map[string][]string {\n    body = stripJSComments(body)\n    matchesMap := make(map[string][]string)\n    \n    // Select patterns based on config\n    patternsToUse := make(map[string]*regexp.Regexp)\n    \n    // Check if any Security Analysis flag is set\n    hasSecurityFlag := config.Secrets || config.Tokens || config.GraphQL || \n                       config.Firebase || config.Links || config.Internal || \n                       config.Bypass || config.Params || config.ParamURLs\n    \n    // JS Analysis flags (-d, -m, -e, -z) are modifiers that work WITH pattern detection\n    // They don't disable pattern detection, they just modify the JS before analysis\n    // So if ONLY JS Analysis flags are set, we should still run ALL patterns (normal mode)\n    \n    // If NO Security Analysis flags are set, use all basic patterns (normal mode)\n    // If Security Analysis flags ARE set, ONLY use those specific patterns\n    if !hasSecurityFlag {\n        // Normal mode: include all basic patterns\n        // This includes when ONLY JS Analysis flags are set (like -d alone)\n        for name, pattern := range regexPatterns {\n            patternsToUse[name] = pattern\n        }\n    }\n    \n    // Add specialized patterns based on flags (ONLY if flag is set)\n    if config.Secrets {\n        // Add only secret-related patterns from regexPatterns\n        secretPatterns := []string{\n            // Original patterns\n            \"Google API\", \"Firebase\", \"Amazon Aws Access Key ID\", \"Amazon Mws Auth Token\",\n            \"Facebook Access Token\", \"Authorization Basic\", \"Authorization Bearer\", \"Authorization Api\",\n            \"Twilio Api Key\", \"Twilio Account Sid\", \"Twilio App Sid\", \"Paypal Braintre Access Token\",\n            \"Square Oauth Secret\", \"Square Access Token\", \"Stripe Standard Api\", \"Stripe Restricted Api\",\n            \"Authorization Github Token\", \"Github Access Token\", \"Rsa Private Key\", \"Ssh 
Dsa Private Key\",\n            \"Ssh Dc Private Key\", \"Pgp Private Block\", \"Ssh Private Key\", \"Aws Api Key\", \"Slack Token\",\n            \"Ssh Priv Key\", \"Heroku Api Key 2\", \"Heroku Api Key 3\", \"Slack Webhook Url\", \"Dropbox Access Token\",\n            \"Salesforce Access Token\", \"Pem Private Key\", \"Google Cloud Sa Key\", \"Stripe Publishable Key\",\n            \"Azure Storage Account Key\", \"Instagram Access Token\", \"Generic Api Key\", \"Generic Secret\",\n\n            // AI/LLM API Keys (CRITICAL)\n            \"OpenAI API Key\", \"OpenAI API Key Project\", \"OpenAI API Key Svc\", \"Anthropic API Key\",\n            \"HuggingFace Token\", \"Cohere API Key\", \"Replicate API Token\", \"Google AI API Key\",\n\n            // AWS Secrets (CRITICAL)\n            \"AWS Secret Access Key\", \"AWS Session Token\",\n\n            // Database Connection Strings (CRITICAL)\n            \"MongoDB Connection String\", \"PostgreSQL Connection String\", \"MySQL Connection String\",\n            \"Redis Connection String\", \"MSSQL Connection String\", \"Database URL Generic\",\n\n            // Azure Secrets (HIGH)\n            \"Azure Client Secret\", \"Azure Storage Connection\", \"Azure SAS Token\", \"Azure SQL Connection\",\n\n            // Cloud Providers (HIGH)\n            \"DigitalOcean Token\", \"DigitalOcean OAuth\", \"DigitalOcean Refresh\", \"Linode API Token\",\n            \"Vultr API Key\", \"Hetzner API Token\", \"Oracle Cloud API Key\", \"IBM Cloud API Key\",\n\n            // CI/CD Tokens (HIGH - supply chain)\n            \"NPM Access Token\", \"PyPI API Token\", \"NuGet API Key\", \"RubyGems API Key\",\n            \"CircleCI Token\", \"Travis CI Token\", \"Jenkins API Token\", \"Bitbucket App Password\",\n            \"Codecov Token\", \"Vercel Token\", \"Netlify Token\",\n\n            // Infrastructure (CRITICAL)\n            \"Vault Token\", \"Kubernetes Token\", \"Docker Registry Password\", \"Terraform Cloud Token\", 
\"Pulumi Access Token\",\n\n            // Payment Processors (CRITICAL)\n            \"Adyen API Key\", \"Klarna API Key\", \"Razorpay Key\", \"Coinbase API Secret\", \"Binance API Secret\",\n\n            // Communication Services (HIGH)\n            \"Twilio Auth Token\", \"Pusher Secret\", \"Vonage API Secret\", \"Plivo Auth Token\",\n            \"MessageBird API Key\", \"Intercom Access Token\", \"Zendesk API Token\",\n\n            // Search/Analytics (HIGH)\n            \"Algolia Admin API Key\", \"Elasticsearch API Key\", \"Mixpanel API Secret\", \"Amplitude API Key\",\n\n            // Monitoring/Logging (HIGH)\n            \"New Relic License Key\", \"New Relic API Key\", \"New Relic Insights Key\",\n            \"Loggly Token\", \"Splunk HEC Token\", \"Sumo Logic Access Key\", \"Grafana API Key\", \"PagerDuty API Key\",\n\n            // Backend as a Service (HIGH)\n            \"Supabase Service Role Key\", \"Firebase Admin SDK Key\", \"Auth0 Client Secret\", \"Okta API Token Alt\",\n\n            // Cloud Storage (HIGH)\n            \"Cloudinary Secret\", \"Cloudinary URL\", \"Backblaze Application Key\", \"Wasabi Access Key\",\n\n            // Feature Flags\n            \"LaunchDarkly SDK Key\", \"LaunchDarkly API Key\", \"Split.io API Key\", \"Statsig Secret\",\n\n            // Version Control (HIGH)\n            \"GitLab Pipeline Token\", \"GitLab Runner Token\", \"GitHub App Private Key\", \"Bitbucket OAuth Secret\",\n\n            // CMS/Content\n            \"Contentful Management Token\", \"Contentful Delivery Token\", \"Sanity Token\", \"Strapi API Token\",\n\n            // Email Services (HIGH)\n            \"Postmark Server Token\", \"SparkPost API Key\", \"Mailjet API Secret\", \"Mandrill API Key\", \"Customer.io API Key\",\n\n            // Maps/Location\n            \"Mapbox Secret Token\", \"Here API Key\", \"TomTom API Key\",\n\n            // Social/OAuth Secrets\n            \"LinkedIn Client Secret\", \"Spotify Client Secret\", 
\"Dropbox App Secret\",\n\n            // Hardcoded Credentials\n            \"Private Key Inline\", \"Password Hardcoded\", \"Secret Key Hardcoded\",\n        }\n        for _, name := range secretPatterns {\n            if pattern, exists := regexPatterns[name]; exists {\n                patternsToUse[name] = pattern\n            }\n        }\n    }\n    \n    if config.Tokens {\n        // JWT patterns\n        jwtPattern := regexp.MustCompile(`ey[A-Za-z0-9-_=]+\\.[A-Za-z0-9-_=]+\\.?[A-Za-z0-9-_.+/=]*`)\n        patternsToUse[\"JWT Token\"] = jwtPattern\n    }\n    \n    if config.Firebase {\n        // Firebase patterns\n        if pattern, exists := regexPatterns[\"Firebase\"]; exists {\n            patternsToUse[\"Firebase\"] = pattern\n        }\n        if pattern, exists := regexPatterns[\"Firebase Url\"]; exists {\n            patternsToUse[\"Firebase Url\"] = pattern\n        }\n    }\n    \n    if config.GraphQL {\n        // GraphQL patterns - more specific to avoid jQuery false positives\n        // Pattern 1: URLs containing /graphql path\n        graphqlPattern1 := regexp.MustCompile(`(?i)[\"']([^\"']*\\/graphql[^\"']*)[\"']`)\n        patternsToUse[\"GraphQL URL\"] = graphqlPattern1\n        \n        // Pattern 2: GraphQL endpoint in fetch/axios calls\n        graphqlPattern2 := regexp.MustCompile(`(?i)(?:fetch|axios|request|post|get)\\s*\\([^)]*[\"']([^\"']*\\/graphql[^\"']*)[\"']`)\n        patternsToUse[\"GraphQL API Call\"] = graphqlPattern2\n        \n        // Pattern 3: GraphQL variable assignments\n        graphqlPattern3 := regexp.MustCompile(`(?i)(?:graphql|gql)[\\s]*[:=][\\s]*[\"']([^\"']+)[\"']`)\n        patternsToUse[\"GraphQL Endpoint\"] = graphqlPattern3\n        \n        // Pattern 4: GraphQL query/mutation (NOT jQuery - must have query/mutation keyword)\n        graphqlPattern4 := regexp.MustCompile(`(?i)\\b(?:query|mutation|subscription)\\s+\\w+\\s*\\{[^}]+\\}`)\n        patternsToUse[\"GraphQL Query\"] = graphqlPattern4\n     
   \n        // Pattern 5: GraphQL endpoint in config objects\n        graphqlPattern5 := regexp.MustCompile(`(?i)[\"'](?:graphql_?endpoint|graphql_?url|graphql_?api|gql_?endpoint)[\"']\\s*[:=]\\s*[\"']([^\"']+)[\"']`)\n        patternsToUse[\"GraphQL Config\"] = graphqlPattern5\n    }\n    \n    if config.Links {\n        // Extract URLs but exclude common trailing punctuation\n        linkPattern := regexp.MustCompile(`(https?://[^\\s\"'<>,;\\\\()]+)`)\n        patternsToUse[\"Link/URL\"] = linkPattern\n    }\n    \n    // Extract parameters - Advanced URL parameter detection with base URLs (new -PU flag)\n    if config.ParamURLs {\n        // Use advanced extraction that associates parameters with URLs\n        paramURLs := extractURLParamsWithBaseURLs(string(body), source)\n        \n        // Global deduplication across all files\n        globalSeenMutex.Lock()\n        if len(paramURLs) > 0 {\n            globalFoundAny = true // Mark that we found something\n        }\n        for _, paramURL := range paramURLs {\n            // Only print if we haven't seen this URL before globally\n            if !globalSeenAll[paramURL] {\n                globalSeenAll[paramURL] = true\n                fmt.Println(paramURL)\n            }\n        }\n        globalSeenMutex.Unlock()\n        // Return early - don't run sensitive data detection when using -PU flag\n        // Return empty map to prevent any sensitive data from being printed\n        return make(map[string][]string)\n    }\n    \n    // Extract parameters - Basic parameter discovery (old -P flag behavior)\n    if config.Params {\n        paramSet := make(map[string]bool) // Use set to deduplicate\n        \n        // URL parameters: ?param=value or &param=value\n        urlParamPattern := regexp.MustCompile(`[?&]([a-zA-Z0-9_\\-]+)\\s*=`)\n        matches := urlParamPattern.FindAllStringSubmatch(string(body), -1)\n        for _, match := range matches {\n            if len(match) > 1 {\n                
param := strings.TrimSpace(match[1])\n                if len(param) > 0 && len(param) < 100 {\n                    paramSet[param] = true\n                }\n            }\n        }\n        \n        // Function parameters in API calls: apiCall({param: value}) or apiCall(\"param\", \"value\")\n        funcParamPattern := regexp.MustCompile(`(?:get|post|put|delete|patch|fetch|axios|request)\\s*\\([^)]*[\"']([a-zA-Z0-9_\\-]+)[\"']\\s*[:=]`)\n        matches = funcParamPattern.FindAllStringSubmatch(string(body), -1)\n        for _, match := range matches {\n            if len(match) > 1 {\n                param := strings.TrimSpace(match[1])\n                if len(param) > 0 && len(param) < 100 {\n                    paramSet[param] = true\n                }\n            }\n        }\n        \n        // Query string parameters: ?key=value patterns\n        queryPattern := regexp.MustCompile(`[\"']([^\"']*\\?[a-zA-Z0-9_\\-]+=)`)\n        matches = queryPattern.FindAllStringSubmatch(string(body), -1)\n        for _, match := range matches {\n            if len(match) > 1 {\n                queryStr := match[1]\n                // Extract individual params from query string\n                paramParts := regexp.MustCompile(`([a-zA-Z0-9_\\-]+)=`).FindAllStringSubmatch(queryStr, -1)\n                for _, part := range paramParts {\n                    if len(part) > 1 {\n                        param := strings.TrimSpace(part[1])\n                        if len(param) > 0 && len(param) < 100 {\n                            paramSet[param] = true\n                        }\n                    }\n                }\n            }\n        }\n        \n        // Also look for common parameter patterns: paramName: or \"paramName\":\n        commonParamPattern := regexp.MustCompile(`[\"']?([a-zA-Z0-9_\\-]{2,50})[\"']?\\s*[:=]\\s*[\"']?[^\"',}\\s)]`)\n        matches = commonParamPattern.FindAllStringSubmatch(string(body), -1)\n        for _, match := range matches {\n    
        if len(match) > 1 {\n                param := strings.TrimSpace(match[1])\n                // Filter out common JS keywords\n                if len(param) > 1 && len(param) < 100 && \n                   param != \"function\" && param != \"return\" && param != \"var\" && \n                   param != \"let\" && param != \"const\" && param != \"if\" && \n                   param != \"else\" && param != \"for\" && param != \"while\" {\n                    paramSet[param] = true\n                }\n            }\n        }\n        \n        // Convert set to slice\n        for param := range paramSet {\n            matchesMap[\"Parameter\"] = append(matchesMap[\"Parameter\"], param)\n        }\n    }\n    \n    // v0.6 — run the curated provider registry first. Findings here have\n    // provider-specific validators (AWS prefix+base32, Stripe base62, GitHub\n    // CRC32 checksum, JWT structural decode, Slack hyphen-segment) and are\n    // strictly higher-precision than the legacy regex map. Values detected\n    // here are tracked so the legacy loop below does not re-report them.\n    coveredByRegistry := make(map[string]struct{})\n    if !config.ParamURLs && !config.Params {\n        registryFindings := analyzeBody(source, body, config.MinConfidence)\n\n        // v0.6+ — optional liveness verification. Off by default. Each call\n        // is bounded by --verify-timeout and goes through the host limiter\n        // in verify.go to avoid stampeding a provider's API. 
The verify\n        // result is folded back into the Finding so JSON consumers can see\n        // alive/dead/unknown alongside the confidence score.\n        if config.Verify {\n            verifyClient := createHTTPClientWithConfig(config)\n            timeout := time.Duration(config.VerifyTimeout) * time.Second\n            if timeout <= 0 {\n                timeout = 10 * time.Second\n            }\n            for _, f := range registryFindings {\n                if _, has := verifierRegistry[f.RuleID]; !has {\n                    registerVerifiers()\n                    if _, has := verifierRegistry[f.RuleID]; !has {\n                        continue\n                    }\n                }\n                if globalStats != nil {\n                    statInc(&globalStats.VerifyAttempts)\n                }\n                vr := runVerify(f.RuleID, f.Value, verifyClient, timeout)\n                f.Verify = &vr\n                if vr.Alive {\n                    f.Verified = true\n                    f.Confidence = 1.0\n                    if globalStats != nil {\n                        statInc(&globalStats.VerifyAlive)\n                    }\n                } else if vr.Error != \"\" {\n                    if globalStats != nil {\n                        statInc(&globalStats.VerifyError)\n                    }\n                } else {\n                    if globalStats != nil {\n                        statInc(&globalStats.VerifyDead)\n                    }\n                }\n            }\n        }\n\n        for _, f := range registryFindings {\n            label := f.Name\n            display := f.Value\n            if config.ShowConfidence {\n                tag := fmt.Sprintf(\" [conf=%.2f\", f.Confidence)\n                if f.Verified {\n                    tag += \" VERIFIED\"\n                }\n                tag += \"]\"\n                display = display + tag\n            }\n            matchesMap[label] = append(matchesMap[label], 
display)\n            coveredByRegistry[f.Value] = struct{}{}\n        }\n    }\n\n    // Filter by domain scope\n    if config.Domain != \"\" {\n        if !strings.Contains(source, config.Domain) {\n            return matchesMap\n        }\n    }\n    \n    // Filter by extension\n    if config.Ext != \"\" {\n        extList := strings.Split(config.Ext, \",\")\n        matched := false\n        for _, ext := range extList {\n            ext = strings.TrimSpace(ext)\n            if strings.HasSuffix(source, ext) {\n                matched = true\n                break\n            }\n        }\n        if !matched {\n            return matchesMap\n        }\n    }\n    \n    // Run pattern matching\n    bodyStr := string(body)\n    sourceDomain := extractDomain(source)\n    \n    for name, pattern := range patternsToUse {\n        if pattern.Match(body) {\n            // Find all matches with their positions to check context\n            allMatches := pattern.FindAllStringSubmatchIndex(bodyStr, -1)\n            matches := []string{}\n            \n            for _, matchIndex := range allMatches {\n                if len(matchIndex) >= 2 {\n                    // Extract the match - use capture group if available, otherwise use full match\n                    var match string\n                    var start, end int\n                    if len(matchIndex) >= 4 && matchIndex[2] != -1 && matchIndex[3] != -1 {\n                        // Use first capture group if available\n                        match = bodyStr[matchIndex[2]:matchIndex[3]]\n                        start = matchIndex[2]\n                        end = matchIndex[3]\n                    } else {\n                        // Use full match\n                        match = bodyStr[matchIndex[0]:matchIndex[1]]\n                        start = matchIndex[0]\n                        end = matchIndex[1]\n                    }\n                    \n                    // Check context around the match to 
see if it's part of a URL\n                    \n                    // Look at surrounding context (50 chars before and after for URL check)\n                    contextStart := start - 50\n                    if contextStart < 0 {\n                        contextStart = 0\n                    }\n                    contextEnd := end + 50\n                    if contextEnd > len(bodyStr) {\n                        contextEnd = len(bodyStr)\n                    }\n                    context := bodyStr[contextStart:contextEnd]\n                    \n                    // Check if match is part of a URL in the context\n                    if isMatchInURL(context, match, sourceDomain) {\n                        continue // Skip this match - it's from a different domain URL\n                    }\n                    \n                    // For base64 check, we need more context (200 chars before)\n                    base64ContextStart := start - 200\n                    if base64ContextStart < 0 {\n                        base64ContextStart = 0\n                    }\n                    base64ContextEnd := end + 50\n                    if base64ContextEnd > len(bodyStr) {\n                        base64ContextEnd = len(bodyStr)\n                    }\n                    base64Context := bodyStr[base64ContextStart:base64ContextEnd]\n                    \n                    // Check if match is inside a base64 data URI (e.g., data:image/png;base64,...)\n                    if isMatchInBase64DataURI(base64Context, match) {\n                        continue // Skip this match - it's part of base64 encoded image/data\n                    }\n                    \n                    // Check if match looks like base64-encoded media data (improved detection)\n                    if isLikelyBase64MediaData(base64Context, match) {\n                        continue // Skip this match - it's likely base64 encoded media content\n                    }\n                    
\n                    // For Links flag, clean up URLs and filter\n                    if config.Links && (name == \"Link/URL\") {\n                        // Clean trailing punctuation and invalid characters\n                        match = cleanURL(match)\n                        \n                        // Skip if URL is empty after cleaning\n                        if match == \"\" {\n                            continue\n                        }\n                        \n                        // Validate URL format (skip malformed URLs like http://example.com:80x/)\n                        if !isValidURL(match) {\n                            continue\n                        }\n                        \n                        // Skip placeholder/template URLs (like http://servername:port/accountURL)\n                        if isPlaceholderURL(match) {\n                            continue\n                        }\n                        \n                        // Check if URL is in a comment - skip if it is\n                        if isURLInComment(context, match) {\n                            continue\n                        }\n                        \n                        // Filter: only show URLs from SAME base domain (user wants their own domain URLs)\n                        // Skip external domains (like ad360plus.com, MuazKhan.com)\n                        matchDomain := extractDomain(match)\n                        if matchDomain != \"\" && sourceDomain != \"\" {\n                            matchBaseDomain := extractBaseDomain(matchDomain)\n                            sourceBaseDomain := extractBaseDomain(sourceDomain)\n                            // Skip URLs from DIFFERENT base domains (external URLs)\n                            if matchBaseDomain != sourceBaseDomain {\n                                continue\n                            }\n                        }\n                    }\n\n                    // Filter 
unwanted emails\n                    if name == \"Email\" && isUnwantedEmail(match) {\n                        continue\n                    }\n\n                    // v0.6 — skip if curated registry already reported this value.\n                    if _, already := coveredByRegistry[match]; already {\n                        continue\n                    }\n\n                    // v0.6 — false-positive pipeline.\n                    // Skip filter for URL/Link/GraphQL/Param classes which have\n                    // their own dedicated cleanup paths above; they are not\n                    // secret-class detections.\n                    skipFP := name == \"Link/URL\" || strings.HasPrefix(name, \"GraphQL \") ||\n                        name == \"Parameter\" || name == \"Email\" || strings.HasPrefix(name, \"Firebase Url\")\n                    if !config.NoFPFilter && !skipFP {\n                        keep, score, _ := applyLegacyFPFilter(name, match, bodyStr, source, start, end)\n                        if !keep {\n                            continue\n                        }\n                        if config.MinConfidence > 0 && score < config.MinConfidence {\n                            if globalStats != nil {\n                                statInc(&globalStats.DroppedBelowConf)\n                            }\n                            continue\n                        }\n                        // Apply ignore-list and diff-baseline to the legacy\n                        // path too. 
Hash the raw value (before any\n                        // confidence-suffix mutation below).\n                        if activeIgnoreList != nil || activeDiffSeen != nil {\n                            rawHash := hashValue(match)\n                            syn := &Finding{\n                                ValueHash: rawHash,\n                                RuleID:    \"legacy:\" + name,\n                                Value:     match,\n                                Source:    source,\n                            }\n                            if activeIgnoreList != nil && activeIgnoreList.ShouldIgnore(syn) {\n                                continue\n                            }\n                            if activeDiffSeen != nil && activeDiffSeen[rawHash] {\n                                continue\n                            }\n                        }\n                        // matchesMap holds the FULL value (file output relies on\n                        // it). Redaction happens at console-print time only.\n                        if config.ShowConfidence {\n                            match = fmt.Sprintf(\"%s [conf=%.2f]\", match, score)\n                        }\n                        if globalStats != nil {\n                            statInc(&globalStats.FindingsAfterFilter)\n                        }\n                    }\n\n                    matches = append(matches, match)\n                }\n            }\n            \n            if len(matches) > 0 {\n                // Additional filtering for known false positives\n                matches = filterMatchesByDomain(matches, source)\n                \n                if len(matches) > 0 {\n                    if config.Regex != \"\" {\n                        filterPattern, err := regexp.Compile(config.Regex)\n                        if err == nil {\n                            filteredMatches := []string{}\n                            for _, match := range matches {\n             
                   if filterPattern.MatchString(match) {\n                                    filteredMatches = append(filteredMatches, match)\n                                }\n                            }\n                            if len(filteredMatches) > 0 {\n                                matchesMap[name] = append(matchesMap[name], filteredMatches...)\n                            }\n                        }\n                    } else {\n                        matchesMap[name] = append(matchesMap[name], matches...)\n                    }\n                }\n            }\n        }\n    }\n    \n    // Filter internal endpoints only\n    if config.Internal {\n        filtered := make(map[string][]string)\n        for name, matches := range matchesMap {\n            for _, match := range matches {\n                if strings.Contains(match, \"internal\") || strings.Contains(match, \"private\") ||\n                   strings.Contains(match, \"127.0.0.1\") || strings.Contains(match, \"localhost\") {\n                    filtered[name] = append(filtered[name], match)\n                }\n            }\n        }\n        matchesMap = filtered\n    }\n    \n    // Output formatting\n    if len(matchesMap) > 0 {\n        // Special handling for Params flag - just show parameter names, one per line\n        if config.Params && len(matchesMap[\"Parameter\"]) > 0 {\n            // Global deduplication across all files\n            globalSeenMutex.Lock()\n            globalFoundAny = true // Mark that we found something\n            for _, param := range matchesMap[\"Parameter\"] {\n                // Only print if we haven't seen this parameter before globally\n                if !globalSeenParams[param] {\n                    globalSeenParams[param] = true\n                    fmt.Println(param)\n                }\n            }\n            globalSeenMutex.Unlock()\n            return matchesMap\n        }\n        \n        if config.JSON {\n            
outputJSON(source, matchesMap)\n        } else if config.CSV {\n            outputCSV(source, matchesMap)\n        } else if config.Burp {\n            outputBurp(source, matchesMap)\n        } else if config.SARIF || config.NDJSON {\n            // Per-source console output is suppressed; structured output\n            // is emitted exactly once at end of run via emitFinalOutput.\n        } else {\n            globalSeenMutex.Lock()\n            globalFoundAny = true\n            headerPrinted := false\n            for name, matches := range matchesMap {\n                for _, match := range matches {\n                    key := name + \":\" + match\n                    if !globalSeenAll[key] {\n                        globalSeenAll[key] = true\n                        if !headerPrinted && !config.Quiet {\n                            fmt.Printf(\"[%s FOUND %s] Sensitive data at: %s\\n\", colors[\"RED\"], colors[\"NC\"], source)\n                            headerPrinted = true\n                        }\n                        fmt.Printf(\"Sensitive Data [%s%s%s]: %s\\n\", colors[\"YELLOW\"], name, colors[\"NC\"], match)\n                    }\n                }\n            }\n            globalSeenMutex.Unlock()\n        }\n    } else {\n        // Don't show MISSING if:\n        // 1. FoundOnly flag is set\n        // 2. ANY flag is set (Security Analysis OR JS Analysis flags)\n        // 3. 
Quiet mode is enabled\n        // MISSING messages should only show in pure \"normal\" mode (no flags at all)\n        hasAnyFlag := config.Params || config.ParamURLs || config.Secrets || config.Tokens || \n                     config.GraphQL || config.Firebase || config.Links || config.Internal || \n                     config.Bypass || config.ExtractEndpoints || config.Deobfuscate || \n                     config.SourceMap || config.Eval || config.ObfsDetect\n        \n        // Buffer MISSING messages only for pure normal mode (no flags at all)\n        if !config.FoundOnly && !hasAnyFlag && !config.Quiet {\n            globalSeenMutex.Lock()\n            foundAny := globalFoundAny\n            globalSeenMutex.Unlock()\n            \n            // Only buffer if no findings have been made yet\n            if !foundAny {\n                missingMutex.Lock()\n                missingMessages = append(missingMessages, source)\n                missingMutex.Unlock()\n            }\n        }\n    }\n    \n    return matchesMap\n}\n\n// Output formatters\n//\n// v0.6 — JSON output is now schema_version 2. Downstream parsers should key\n// off the top-level `schema_version` field. 
Within a major schema version\n// changes are additive only.\nfunc outputJSON(source string, matchesMap map[string][]string) {\n    findings := flushFindings()\n    result := map[string]interface{}{\n        \"schema_version\": SchemaVersion,\n        \"source\":         source,\n        \"matches\":        matchesMap,\n        \"findings\":       findings,\n        \"tool\":           map[string]string{\"name\": \"jshunter\", \"version\": version},\n    }\n    jsonData, _ := json.MarshalIndent(result, \"\", \"  \")\n    fmt.Println(string(jsonData))\n}\n\nfunc outputCSV(source string, matchesMap map[string][]string) {\n    writer := csv.NewWriter(os.Stdout)\n    writer.Write([]string{\"Source\", \"Type\", \"Value\"})\n    for name, matches := range matchesMap {\n        for _, match := range matches {\n            writer.Write([]string{source, name, match})\n        }\n    }\n    writer.Flush()\n}\n\nfunc outputBurp(source string, matchesMap map[string][]string) {\n    // Burp Suite format (simplified)\n    for name, matches := range matchesMap {\n        for _, match := range matches {\n            fmt.Printf(\"%s\\t%s\\t%s\\n\", source, name, match)\n        }\n    }\n}\n\n// Wrapper functions using Config - enhanced versions\nfunc processInputsWithConfig(url string, config *Config) {\n    // Reset global state for new processing session\n    globalSeenMutex.Lock()\n    globalFoundAny = false\n    globalSeenMutex.Unlock()\n    missingMutex.Lock()\n    missingMessages = missingMessages[:0]\n    missingMutex.Unlock()\n    \n    // Use crawling if depth > 1\n    if config.CrawlDepth > 1 && url != \"\" {\n        visited := make(map[string]bool)\n        crawlAndProcessJS(url, config, config.CrawlDepth, visited)\n        return\n    }\n    \n    var wg sync.WaitGroup\n    urlChannel := make(chan string)\n\n    var fileWriter *os.File\n    if config.Output != \"\" {\n        var err error\n        fileWriter, err = os.Create(config.Output)\n        if err != nil {\n  
          fmt.Printf(\"Error creating output file: %v\\n\", err)\n            return\n        }\n        defer fileWriter.Close()\n    }\n\n    for i := 0; i < config.Threads; i++ {\n        wg.Add(1)\n        go func() {\n            defer wg.Done()\n            for u := range urlChannel {\n                _, sensitiveData := searchForSensitiveDataWithConfig(u, config)\n\n                // Don't print sensitive data if ParamURLs flag is set (user only wants URL params)\n                if !config.ParamURLs {\n                    if fileWriter != nil {\n                        fmt.Fprintln(fileWriter, \"URL:\", u)\n                        for name, matches := range sensitiveData {\n                            for _, match := range matches {\n                                fmt.Fprintf(fileWriter, \"Sensitive Data [%s%s%s]: %s\\n\", colors[\"YELLOW\"], name, colors[\"NC\"], match)\n                            }\n                        }\n                    }\n                }\n            }\n        }()\n    }\n\n    if err := enqueueURLs(url, config.List, urlChannel, config.Regex); err != nil {\n        fmt.Printf(\"Error in input processing: %v\\n\", err)\n        close(urlChannel)\n        return\n    }\n\n    close(urlChannel)\n    wg.Wait()\n    \n    // Print buffered MISSING messages only if no findings were made\n    // AND no flags are set (pure normal mode only)\n    globalSeenMutex.Lock()\n    foundAny := globalFoundAny\n    globalSeenMutex.Unlock()\n    \n    hasAnyFlag := config.Params || config.ParamURLs || config.Secrets || config.Tokens || \n                 config.GraphQL || config.Firebase || config.Links || config.Internal || \n                 config.Bypass || config.ExtractEndpoints || config.Deobfuscate || \n                 config.SourceMap || config.Eval || config.ObfsDetect\n    \n    if !foundAny && !config.FoundOnly && !hasAnyFlag && !config.Quiet {\n        missingMutex.Lock()\n        for _, msg := range missingMessages {\n           
 fmt.Printf(\"[%sMISSING%s] No sensitive data found at: %s\\n\", colors[\"BLUE\"], colors[\"NC\"], msg)\n        }\n        missingMessages = missingMessages[:0] // Clear the buffer\n        missingMutex.Unlock()\n    } else {\n        // Clear the buffer if findings were made or specific flags are set\n        missingMutex.Lock()\n        missingMessages = missingMessages[:0]\n        missingMutex.Unlock()\n    }\n\n    emitFinalOutput(config)\n}\n\nfunc processInputsForEndpointsWithConfig(url string, config *Config) {\n    // Use enhanced endpoint extraction with config\n    var wg sync.WaitGroup\n    urlChannel := make(chan string)\n    \n    var fileWriter *os.File\n    if config.Output != \"\" {\n        var err error\n        fileWriter, err = os.Create(config.Output)\n        if err != nil {\n            fmt.Printf(\"Error creating output file: %v\\n\", err)\n            return\n        }\n        defer fileWriter.Close()\n    }\n    \n    for i := 0; i < config.Threads; i++ {\n        wg.Add(1)\n        go func() {\n            defer wg.Done()\n            for u := range urlChannel {\n                endpoints := extractEndpointsFromURLWithConfig(u, config)\n                \n                if fileWriter != nil {\n                    fmt.Fprintf(fileWriter, \"URL: %s\\n\", u)\n                    for _, endpoint := range endpoints {\n                        fmt.Fprintf(fileWriter, \"ENDPOINT: %s\\n\", endpoint)\n                    }\n                    fmt.Fprintln(fileWriter, \"\")\n                } else {\n                    for _, endpoint := range endpoints {\n                        fmt.Println(endpoint)\n                    }\n                }\n            }\n        }()\n    }\n    \n    if err := enqueueURLs(url, config.List, urlChannel, config.Regex); err != nil {\n        fmt.Printf(\"Error in input processing: %v\\n\", err)\n        close(urlChannel)\n        return\n    }\n    \n    close(urlChannel)\n    wg.Wait()\n}\n\nfunc 
processJSFileWithConfig(jsFile string, config *Config) {\n    // Reset global state for new processing session\n    globalSeenMutex.Lock()\n    globalFoundAny = false\n    globalSeenMutex.Unlock()\n    missingMutex.Lock()\n    missingMessages = missingMessages[:0]\n    missingMutex.Unlock()\n\n    if _, err := os.Stat(jsFile); os.IsNotExist(err) {\n        fmt.Printf(\"[%sERROR%s] File not found: %s\\n\", colors[\"RED\"], colors[\"NC\"], jsFile)\n    } else if err != nil {\n        if !config.Quiet {\n            fmt.Printf(\"[%sERROR%s] Unable to access file %s: %v\\n\", colors[\"RED\"], colors[\"NC\"], jsFile, err)\n        }\n    } else {\n        if !config.Quiet {\n            fmt.Printf(\"[%sFOUND%s] FILE: %s\\n\", colors[\"RED\"], colors[\"NC\"], jsFile)\n        }\n        _, sensitiveData := searchForSensitiveDataWithConfig(jsFile, config)\n\n        // If user specified -o flag, write results to output file\n        if config.Output != \"\" {\n            f, err := os.OpenFile(config.Output, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)\n            if err != nil {\n                fmt.Printf(\"[%sERROR%s] Error writing to output file: %v\\n\", colors[\"RED\"], colors[\"NC\"], err)\n            } else {\n                defer f.Close()\n                fmt.Fprintf(f, \"FILE: %s\\n\", jsFile)\n                for name, matches := range sensitiveData {\n                    for _, match := range matches {\n                        fmt.Fprintf(f, \"Sensitive Data [%s]: %s\\n\", name, match)\n                    }\n                }\n                if len(sensitiveData) > 0 {\n                    fmt.Fprintln(f, \"\") // Add blank line between files\n                }\n            }\n        }\n\n        // Print buffered MISSING messages only if no findings were made\n        // AND no flags are set (pure normal mode only)\n        globalSeenMutex.Lock()\n        foundAny := globalFoundAny\n        globalSeenMutex.Unlock()\n\n        hasAnyFlag := config.Params 
|| config.ParamURLs || config.Secrets || config.Tokens ||\n            config.GraphQL || config.Firebase || config.Links || config.Internal ||\n            config.Bypass || config.ExtractEndpoints || config.Deobfuscate ||\n            config.SourceMap || config.Eval || config.ObfsDetect\n\n        if !foundAny && !config.FoundOnly && !hasAnyFlag && !config.Quiet {\n            missingMutex.Lock()\n            for _, msg := range missingMessages {\n                fmt.Printf(\"[%sMISSING%s] No sensitive data found at: %s\\n\", colors[\"BLUE\"], colors[\"NC\"], msg)\n            }\n            missingMessages = missingMessages[:0] // Clear the buffer\n            missingMutex.Unlock()\n        } else {\n            // Clear the buffer if findings were made or specific flags are set\n            missingMutex.Lock()\n            missingMessages = missingMessages[:0]\n            missingMutex.Unlock()\n        }\n    }\n    emitFinalOutput(config)\n}\n\nfunc processJSFileForEndpointsWithConfig(jsFile string, config *Config) {\n    if _, err := os.Stat(jsFile); os.IsNotExist(err) {\n        fmt.Printf(\"[%sERROR%s] File not found: %s\\n\", colors[\"RED\"], colors[\"NC\"], jsFile)\n        return\n    } else if err != nil {\n        fmt.Printf(\"[%sERROR%s] Unable to access file %s: %v\\n\", colors[\"RED\"], colors[\"NC\"], jsFile, err)\n        return\n    }\n    \n    endpoints := extractEndpointsFromFile(jsFile, config.Regex)\n    \n    if config.Output != \"\" {\n        writeEndpointsToFile(endpoints, config.Output, jsFile)\n    } else {\n        displayEndpoints(endpoints, jsFile)\n    }\n}\n\n// extractEndpointsFromURLWithConfig enhanced endpoint extraction with config\nfunc extractEndpointsFromURLWithConfig(urlStr string, config *Config) []string {\n    client := createHTTPClientWithConfig(config)\n    \n    req, err := http.NewRequest(\"GET\", urlStr, nil)\n    if err != nil {\n        return nil \n    }\n    \n    // Apply custom headers\n    for _, header := 
range config.Headers {\n        parts := strings.SplitN(header, \":\", 2)\n        if len(parts) == 2 {\n            key := strings.TrimSpace(parts[0])\n            value := strings.TrimSpace(parts[1])\n            req.Header.Set(key, value)\n            if config.Verbose {\n                fmt.Printf(\"[%sINFO%s] Added header: %s: %s\\n\", colors[\"CYAN\"], colors[\"NC\"], key, value)\n            }\n        } else if config.Verbose {\n            fmt.Printf(\"[%sWARN%s] Invalid header format (expected 'Key: Value'): %s\\n\", colors[\"YELLOW\"], colors[\"NC\"], header)\n        }\n    }\n    \n    // Apply custom User-Agent (randomly select from list if available)\n    if len(config.UserAgents) > 0 {\n        // Randomly select a user agent from the list for each request\n        selectedUA := config.UserAgents[rand.Intn(len(config.UserAgents))]\n        req.Header.Set(\"User-Agent\", selectedUA)\n        if config.Verbose {\n            fmt.Printf(\"[%sINFO%s] Using User-Agent: %s\\n\", colors[\"CYAN\"], colors[\"NC\"], selectedUA)\n        }\n    } else if config.UserAgent != \"\" {\n        req.Header.Set(\"User-Agent\", config.UserAgent)\n        if config.Verbose {\n            fmt.Printf(\"[%sINFO%s] Using User-Agent: %s\\n\", colors[\"CYAN\"], colors[\"NC\"], config.UserAgent)\n        }\n    }\n    \n    if config.Cookies != \"\" {\n        req.Header.Set(\"Cookie\", config.Cookies)\n    }\n    \n    resp, err := makeRequestWithRetry(client, req, config)\n    if err != nil {\n        return nil\n    }\n    \n    if config.Verbose {\n        fmt.Printf(\"[%sINFO%s] Successfully fetched %s (Status: %d)\\n\", colors[\"GREEN\"], colors[\"NC\"], urlStr, resp.StatusCode)\n    }\n    defer resp.Body.Close()\n    \n    // Filter: Only process JavaScript content\n    if !shouldProcessResponse(resp, urlStr, config) {\n        return nil\n    }\n    \n    body, err := io.ReadAll(resp.Body)\n    if err != nil {\n        if len(body) == 0 {\n            return nil \n   
     }\n    }\n    \n    // Process JS analysis\n    processedBody := processJSAnalysis(body, config)\n    \n    parsedURL, err := url.Parse(urlStr)\n    if err != nil {\n        return nil\n    }\n    baseURL := parsedURL.Scheme + \"://\" + parsedURL.Host\n    \n    return extractEndpointsFromContent(string(processedBody), config.Regex, baseURL)\n}\n\n// crawlAndProcessJS recursively crawls and processes JS files\nfunc crawlAndProcessJS(initialURL string, config *Config, depth int, visited map[string]bool) {\n    if depth <= 0 || visited[initialURL] {\n        return\n    }\n    visited[initialURL] = true\n    \n    client := createHTTPClientWithConfig(config)\n    req, err := http.NewRequest(\"GET\", initialURL, nil)\n    if err != nil {\n        return\n    }\n    \n    // Apply headers\n    for _, header := range config.Headers {\n        parts := strings.SplitN(header, \":\", 2)\n        if len(parts) == 2 {\n            req.Header.Set(strings.TrimSpace(parts[0]), strings.TrimSpace(parts[1]))\n        }\n    }\n    \n    // Apply custom User-Agent (randomly select from list if available)\n    if len(config.UserAgents) > 0 {\n        // Randomly select a user agent from the list for each request\n        selectedUA := config.UserAgents[rand.Intn(len(config.UserAgents))]\n        req.Header.Set(\"User-Agent\", selectedUA)\n    } else if config.UserAgent != \"\" {\n        req.Header.Set(\"User-Agent\", config.UserAgent)\n    }\n    \n    if config.Cookies != \"\" {\n        req.Header.Set(\"Cookie\", config.Cookies)\n    }\n    \n    resp, err := makeRequestWithRetry(client, req, config)\n    if err != nil {\n        return\n    }\n    defer resp.Body.Close()\n    \n    body, err := io.ReadAll(resp.Body)\n    if err != nil {\n        return\n    }\n    \n    // Process current page\n    searchForSensitiveDataWithConfig(initialURL, config)\n    \n    // Find JS file references\n    jsPattern := 
regexp.MustCompile(`(?:src|href)\\s*=\\s*[\"']([^\"']+\\.js[^\"']*)[\"']`)\n    matches := jsPattern.FindAllStringSubmatch(string(body), -1)\n    \n    parsedURL, err := url.Parse(initialURL)\n    if err != nil {\n        return\n    }\n    \n    for _, match := range matches {\n        if len(match) > 1 {\n            // Resolve the reference against the page URL so that absolute,\n            // protocol-relative, root-relative, and path-relative forms\n            // are all handled the way a browser would (RFC 3986).\n            ref, err := url.Parse(match[1])\n            if err != nil {\n                continue\n            }\n            jsURL := parsedURL.ResolveReference(ref).String()\n            \n            // Check domain scope\n            if config.Domain != \"\" && !strings.Contains(jsURL, config.Domain) {\n                continue\n            }\n            \n            // Check extension filter\n            if config.Ext != \"\" {\n                extList := strings.Split(config.Ext, \",\")\n                matched := false\n                for _, ext := range extList {\n                    ext = strings.TrimSpace(ext)\n                    if strings.HasSuffix(jsURL, ext) {\n                        matched = true\n                        break\n                    }\n                }\n                if !matched {\n                    continue\n                }\n            }\n            \n            // Recursively process\n            if !visited[jsURL] {\n                crawlAndProcessJS(jsURL, config, depth-1, visited)\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "internal/jshunter/ndjson.go",
    "content": "package jshunter\n\nimport (\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"os\"\n)\n\n// outputNDJSON streams the dedupe-table snapshot one finding per line. Stable\n// ordering: severity desc, then confidence desc, then name. This is the\n// preferred shape for `jq -c`, `mlr`, and SIEM ingestion paths.\nfunc outputNDJSON() {\n\tenc := json.NewEncoder(os.Stdout)\n\tenc.SetEscapeHTML(false)\n\tfor _, f := range flushFindings() {\n\t\tif err := enc.Encode(f); err != nil {\n\t\t\tfmt.Fprintf(os.Stderr, \"[ndjson] encode: %v\\n\", err)\n\t\t\treturn\n\t\t}\n\t}\n}\n"
  },
  {
    "path": "internal/jshunter/robots.go",
    "content": "package jshunter\n\nimport (\n\t\"bufio\"\n\t\"context\"\n\t\"fmt\"\n\t\"io\"\n\t\"net/http\"\n\t\"strings\"\n\t\"time\"\n)\n\n// robots.txt is recon gold: targets explicitly list their non-crawlable\n// paths. JSHunter's `--robots` mode fetches the file, returns the\n// disallowed paths, and the operator can feed those back as targets.\n//\n// We don't honor robots.txt for our own crawling by default — that's a\n// recon-tool decision, not a library decision — but operators on\n// engagements with explicit rules-of-engagement can pipe the disallow\n// list into the URL queue.\n//\n// Spec: https://www.rfc-editor.org/rfc/rfc9309\n\ntype RobotsResult struct {\n\tURL       string\n\tDisallow  []string\n\tAllow     []string\n\tSitemaps  []string\n\tUserAgent string\n}\n\n// FetchRobots fetches `<base>/robots.txt` and parses the rules for the\n// configured user-agent (or `*` if none). Returns nil result when the\n// file is absent or unreachable — robots.txt absence is the most common\n// case and not an error.\nfunc FetchRobots(client *http.Client, baseURL string, ua string) (*RobotsResult, error) {\n\ttarget := strings.TrimRight(baseURL, \"/\") + \"/robots.txt\"\n\tif err := validateTargetURL(target, false); err != nil {\n\t\treturn nil, fmt.Errorf(\"robots: %w\", err)\n\t}\n\tctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)\n\tdefer cancel()\n\treq, err := http.NewRequestWithContext(ctx, \"GET\", target, nil)\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\tif ua != \"\" {\n\t\treq.Header.Set(\"User-Agent\", ua)\n\t}\n\tresp, err := client.Do(req)\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\tdefer resp.Body.Close()\n\tif resp.StatusCode == http.StatusNotFound {\n\t\treturn nil, nil\n\t}\n\tif resp.StatusCode < 200 || resp.StatusCode >= 300 {\n\t\treturn nil, fmt.Errorf(\"robots fetch returned %d\", resp.StatusCode)\n\t}\n\n\traw, err := io.ReadAll(io.LimitReader(resp.Body, 1024*1024))\n\tif err != nil {\n\t\treturn nil, 
err\n\t}\n\treturn parseRobots(target, ua, raw), nil\n}\n\n// parseRobots implements a defensive subset of RFC 9309. It does not model\n// `Crawl-delay` or wildcard groups perfectly (operators wanting strict\n// compliance should use a dedicated robots library), but for \"give me\n// the disallow paths\" the simple group walk is sufficient.\nfunc parseRobots(target, ua string, body []byte) *RobotsResult {\n\tres := &RobotsResult{URL: target, UserAgent: ua}\n\tsc := bufio.NewScanner(strings.NewReader(string(body)))\n\tcurrentGroup := []string{}\n\tmatchActive := false\n\tsawRule := false // true once the current group has an allow/disallow line\n\tif ua == \"\" {\n\t\tua = \"*\"\n\t}\n\tuaLower := strings.ToLower(ua)\n\n\tresetGroup := func() {\n\t\t// Start a fresh group: drop the collected user-agents and match state.\n\t\tcurrentGroup = currentGroup[:0]\n\t\tmatchActive = false\n\t\tsawRule = false\n\t}\n\n\tfor sc.Scan() {\n\t\tline := sc.Text()\n\t\tif i := strings.Index(line, \"#\"); i >= 0 {\n\t\t\tline = line[:i]\n\t\t}\n\t\tline = strings.TrimSpace(line)\n\t\tif line == \"\" {\n\t\t\t// A blank line ends the current group.\n\t\t\tresetGroup()\n\t\t\tcontinue\n\t\t}\n\t\tcolon := strings.Index(line, \":\")\n\t\tif colon < 0 {\n\t\t\tcontinue\n\t\t}\n\t\tfield := strings.TrimSpace(strings.ToLower(line[:colon]))\n\t\tval := strings.TrimSpace(line[colon+1:])\n\n\t\tswitch field {\n\t\tcase \"user-agent\":\n\t\t\t// A user-agent line that follows a rule starts a new group,\n\t\t\t// even when no blank line separates the two groups.\n\t\t\tif sawRule {\n\t\t\t\tresetGroup()\n\t\t\t}\n\t\t\tcurrentGroup = append(currentGroup, strings.ToLower(val))\n\t\t\tmatchActive = false\n\t\t\tfor _, g := range currentGroup {\n\t\t\t\tif g == uaLower || g == \"*\" {\n\t\t\t\t\tmatchActive = true\n\t\t\t\t\tbreak\n\t\t\t\t}\n\t\t\t}\n\t\tcase \"disallow\":\n\t\t\tsawRule = true\n\t\t\tif matchActive && val != \"\" {\n\t\t\t\tres.Disallow = append(res.Disallow, val)\n\t\t\t}\n\t\tcase \"allow\":\n\t\t\tsawRule = true\n\t\t\tif matchActive && val != \"\" {\n\t\t\t\tres.Allow = append(res.Allow, val)\n\t\t\t}\n\t\tcase \"sitemap\":\n\t\t\tres.Sitemaps = append(res.Sitemaps, val)\n\t\t}\n\t}\n\treturn res\n}\n"
  },
  {
    "path": "internal/jshunter/rules_cli.go",
    "content": "package jshunter\n\nimport (\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"os\"\n\t\"sort\"\n\t\"strings\"\n)\n\n// runListRules prints every registered rule as a single-line entry. Sorted\n// by rule_id so output is grep-friendly. ctx-required and validator\n// annotations help operators understand which rules are high-precision.\nfunc runListRules() {\n\tregisterRules()\n\tregisterVerifiers()\n\trules := make([]Rule, len(rulesRegistry))\n\tcopy(rules, rulesRegistry)\n\tsort.Slice(rules, func(i, j int) bool { return rules[i].ID < rules[j].ID })\n\n\tfmt.Printf(\"%-32s %-9s %-20s %s\\n\", \"RULE_ID\", \"SEVERITY\", \"PROVIDER\", \"NAME\")\n\tfor _, r := range rules {\n\t\tflags := []string{}\n\t\tif r.RequiresContext {\n\t\t\tflags = append(flags, \"ctx\")\n\t\t}\n\t\tif r.Validate != nil {\n\t\t\tflags = append(flags, \"validate\")\n\t\t}\n\t\tif r.HighFPProne {\n\t\t\tflags = append(flags, \"high-fp\")\n\t\t}\n\t\tif _, ok := verifierRegistry[r.ID]; ok {\n\t\t\tflags = append(flags, \"verify\")\n\t\t}\n\t\tflagStr := \"\"\n\t\tif len(flags) > 0 {\n\t\t\tflagStr = \" [\" + strings.Join(flags, \",\") + \"]\"\n\t\t}\n\t\tfmt.Printf(\"%-32s %-9s %-20s %s%s\\n\", r.ID, r.Severity, r.Provider, r.Name, flagStr)\n\t}\n}\n\n// runExplainRule prints a JSON dump of a single rule, including its TP/FP\n// fixtures so operators can see what the rule was designed to catch.\nfunc runExplainRule(id string) {\n\tregisterRules()\n\tregisterVerifiers()\n\tr, ok := rulesIndex[id]\n\tif !ok {\n\t\tfmt.Fprintf(os.Stderr, \"rule %q not found; try --list-rules\\n\", id)\n\t\tos.Exit(2)\n\t}\n\ttype explain struct {\n\t\tID              string   `json:\"id\"`\n\t\tName            string   `json:\"name\"`\n\t\tProvider        string   `json:\"provider,omitempty\"`\n\t\tSecretType      string   `json:\"secret_type,omitempty\"`\n\t\tSeverity        Severity `json:\"severity\"`\n\t\tPattern         string   `json:\"pattern\"`\n\t\tConfidencePrior float64  
`json:\"confidence_prior\"`\n\t\tRequiresContext bool     `json:\"requires_context\"`\n\t\tContextKeywords []string `json:\"context_keywords,omitempty\"`\n\t\tMinEntropy      float64  `json:\"min_entropy,omitempty\"`\n\t\tMinLen          int      `json:\"min_len,omitempty\"`\n\t\tMaxLen          int      `json:\"max_len,omitempty\"`\n\t\tHighFPProne     bool     `json:\"high_fp_prone\"`\n\t\tHasValidator    bool     `json:\"has_validator\"`\n\t\tHasVerifier     bool     `json:\"has_verifier\"`\n\t\tTPExamples      []string `json:\"tp_examples,omitempty\"`\n\t\tFPExamples      []string `json:\"fp_examples,omitempty\"`\n\t}\n\t_, hasV := verifierRegistry[id]\n\tout := explain{\n\t\tID:              r.ID,\n\t\tName:            r.Name,\n\t\tProvider:        r.Provider,\n\t\tSecretType:      r.SecretType,\n\t\tSeverity:        r.Severity,\n\t\tPattern:         r.Pattern.String(),\n\t\tConfidencePrior: r.ConfidencePrior,\n\t\tRequiresContext: r.RequiresContext,\n\t\tContextKeywords: r.ContextKeywords,\n\t\tMinEntropy:      r.MinEntropy,\n\t\tMinLen:          r.MinLen,\n\t\tMaxLen:          r.MaxLen,\n\t\tHighFPProne:     r.HighFPProne,\n\t\tHasValidator:    r.Validate != nil,\n\t\tHasVerifier:     hasV,\n\t\tTPExamples:      r.TPExamples,\n\t\tFPExamples:      r.FPExamples,\n\t}\n\tb, _ := json.MarshalIndent(out, \"\", \"  \")\n\tfmt.Println(string(b))\n}\n\n// applyRuleSelection mutates rulesRegistry to honor --only-rules and\n// --disable-rule, both comma-separated lists with glob support\n// (`aws.*`, `*.api_key`).\n//\n//\tonly:    if non-empty, ONLY rules matching at least one pattern are kept\n//\tdisable: rules matching any pattern are dropped (applied after `only`)\n//\n// Mutating the registry rather than gating at match time keeps the regex\n// loop tight on big bundles.\nfunc applyRuleSelection(only, disable string) int {\n\tregisterRules()\n\tif only == \"\" && disable == \"\" {\n\t\treturn len(rulesRegistry)\n\t}\n\tkeep := func(id string) bool {\n\t\tif 
only != \"\" {\n\t\t\tmatched := false\n\t\t\tfor _, p := range strings.Split(only, \",\") {\n\t\t\t\tp = strings.TrimSpace(p)\n\t\t\t\tif p == \"\" {\n\t\t\t\t\tcontinue\n\t\t\t\t}\n\t\t\t\tif id == p || globMatch(p, id) {\n\t\t\t\t\tmatched = true\n\t\t\t\t\tbreak\n\t\t\t\t}\n\t\t\t}\n\t\t\tif !matched {\n\t\t\t\treturn false\n\t\t\t}\n\t\t}\n\t\tif disable != \"\" {\n\t\t\tfor _, p := range strings.Split(disable, \",\") {\n\t\t\t\tp = strings.TrimSpace(p)\n\t\t\t\tif p == \"\" {\n\t\t\t\t\tcontinue\n\t\t\t\t}\n\t\t\t\tif id == p || globMatch(p, id) {\n\t\t\t\t\treturn false\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\t\treturn true\n\t}\n\n\tout := make([]Rule, 0, len(rulesRegistry))\n\tfor i := range rulesRegistry {\n\t\tif keep(rulesRegistry[i].ID) {\n\t\t\tout = append(out, rulesRegistry[i])\n\t\t}\n\t}\n\trulesRegistry = out\n\trulesIndex = map[string]*Rule{}\n\tfor i := range rulesRegistry {\n\t\trulesIndex[rulesRegistry[i].ID] = &rulesRegistry[i]\n\t}\n\treturn len(rulesRegistry)\n}\n"
  },
  {
    "path": "internal/jshunter/rules_loader.go",
    "content": "package jshunter\n\nimport (\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"os\"\n\t\"regexp\"\n\t\"strings\"\n)\n\n// ExternalRule is the JSON-friendly serialization of a Rule. Validators are\n// not serializable (they are Go funcs), so external rules get scoring only,\n// no provider validator. That is the right trade: contributors can ship\n// detectors quickly, but provider-specific liveness checks stay first-party\n// to avoid pasting attack code into a JSON loader.\ntype ExternalRule struct {\n\tID              string   `json:\"id\"`\n\tName            string   `json:\"name\"`\n\tProvider        string   `json:\"provider\"`\n\tSecretType      string   `json:\"secret_type\"`\n\tSeverity        string   `json:\"severity\"`\n\tPattern         string   `json:\"pattern\"`\n\tGroup           int      `json:\"group,omitempty\"`\n\tConfidencePrior float64  `json:\"confidence_prior,omitempty\"`\n\tRequiresContext bool     `json:\"requires_context,omitempty\"`\n\tContextKeywords []string `json:\"context_keywords,omitempty\"`\n\tMinEntropy      float64  `json:\"min_entropy,omitempty\"`\n\tMinLen          int      `json:\"min_len,omitempty\"`\n\tMaxLen          int      `json:\"max_len,omitempty\"`\n\tHighFPProne     bool     `json:\"high_fp_prone,omitempty\"`\n\tTPExamples      []string `json:\"tp_examples,omitempty\"`\n\tFPExamples      []string `json:\"fp_examples,omitempty\"`\n}\n\n// LoadRulesFile reads, validates, compiles, and registers an external rule\n// pack. Pack format: a top-level JSON array of ExternalRule objects.\n//\n// On any validation failure for a single rule, the loader rejects the WHOLE\n// file, because partial loads invite \"why didn't my rule fire?\" support\n// questions. Returns the count of rules successfully appended.\nfunc LoadRulesFile(path string) (int, error) {\n\tregisterRules() // ensure built-in registry is materialized first\n\n\traw, err := os.ReadFile(path)\n\tif err != nil {\n\t\treturn 0, fmt.Errorf(\"read rules file: %w\", err)\n\t}\n\n\tvar ext []ExternalRule\n\tif err := json.Unmarshal(raw, &ext); err != nil {\n\t\treturn 0, fmt.Errorf(\"parse rules file: %w\", err)\n\t}\n\n\tcompiled, err := validateAndCompileExternalRules(ext)\n\tif err != nil {\n\t\treturn 0, err\n\t}\n\n\trulesRegistry = append(rulesRegistry, compiled...)\n\t// append may reallocate the backing array, which would leave earlier\n\t// rulesIndex pointers aimed at the old copy. Rebuild the index so every\n\t// entry points into the live slice.\n\trulesIndex = make(map[string]*Rule, len(rulesRegistry))\n\tfor i := range rulesRegistry {\n\t\trulesIndex[rulesRegistry[i].ID] = &rulesRegistry[i]\n\t}\n\treturn len(compiled), nil\n}\n\n// validateAndCompileExternalRules enforces the rule contract (id, name, and\n// pattern present, regex compiles, capture group in range, severity is one of\n// the documented values, no id clashes with the existing registry) and\n// returns the resulting Rule slice ready to register.\nfunc validateAndCompileExternalRules(ext []ExternalRule) ([]Rule, error) {\n\tout := make([]Rule, 0, len(ext))\n\tseenIDs := map[string]struct{}{}\n\tfor i, e := range ext {\n\t\tif strings.TrimSpace(e.ID) == \"\" {\n\t\t\treturn nil, fmt.Errorf(\"rule[%d]: id is required\", i)\n\t\t}\n\t\tif _, dup := seenIDs[e.ID]; dup {\n\t\t\treturn nil, fmt.Errorf(\"rule[%d]: duplicate id %q within file\", i, e.ID)\n\t\t}\n\t\tseenIDs[e.ID] = struct{}{}\n\t\tif _, dup := rulesIndex[e.ID]; dup {\n\t\t\treturn nil, fmt.Errorf(\"rule[%d]: id %q clashes with built-in rule\", i, e.ID)\n\t\t}\n\t\tif strings.TrimSpace(e.Name) == \"\" {\n\t\t\treturn nil, fmt.Errorf(\"rule %q: name is required\", e.ID)\n\t\t}\n\t\tif strings.TrimSpace(e.Pattern) == \"\" {\n\t\t\treturn nil, fmt.Errorf(\"rule %q: pattern is required\", e.ID)\n\t\t}\n\n\t\t// Length sanity: refuse a pattern over 4 KiB. v0.6 uses Go's RE2\n\t\t// engine which is ReDoS-safe by construction, but a 100 KiB regex\n\t\t// is still a memory hazard at compile time.\n\t\tif len(e.Pattern) > 4096 {\n\t\t\treturn nil, fmt.Errorf(\"rule %q: pattern exceeds 4096 bytes\", e.ID)\n\t\t}\n\n\t\tre, err := regexp.Compile(e.Pattern)\n\t\tif err != nil {\n\t\t\treturn nil, fmt.Errorf(\"rule %q: pattern does not compile: %w\", e.ID, err)\n\t\t}\n\n\t\t// Group must name a capture group that exists in the pattern\n\t\t// (0 means the whole match); reject it here rather than at scan time.\n\t\tif e.Group < 0 || e.Group > re.NumSubexp() {\n\t\t\treturn nil, fmt.Errorf(\"rule %q: group %d out of range (pattern has %d capture groups)\", e.ID, e.Group, re.NumSubexp())\n\t\t}\n\n\t\tsev := normalizeSeverity(e.Severity)\n\t\tif sev == \"\" {\n\t\t\treturn nil, fmt.Errorf(\"rule %q: severity must be one of critical|high|medium|low|info\", e.ID)\n\t\t}\n\n\t\tprior := e.ConfidencePrior\n\t\tif prior == 0 {\n\t\t\tprior = 0.55\n\t\t}\n\t\tif prior < 0 || prior > 1 {\n\t\t\treturn nil, fmt.Errorf(\"rule %q: confidence_prior must be in [0,1]\", e.ID)\n\t\t}\n\n\t\tout = append(out, Rule{\n\t\t\tID:              e.ID,\n\t\t\tName:            e.Name,\n\t\t\tProvider:        e.Provider,\n\t\t\tSecretType:      e.SecretType,\n\t\t\tSeverity:        sev,\n\t\t\tPattern:         re,\n\t\t\tGroup:           e.Group,\n\t\t\tConfidencePrior: prior,\n\t\t\tRequiresContext: e.RequiresContext,\n\t\t\tContextKeywords: e.ContextKeywords,\n\t\t\tMinEntropy:      e.MinEntropy,\n\t\t\tMinLen:          e.MinLen,\n\t\t\tMaxLen:          e.MaxLen,\n\t\t\tHighFPProne:     e.HighFPProne,\n\t\t\tTPExamples:      e.TPExamples,\n\t\t\tFPExamples:      e.FPExamples,\n\t\t})\n\t}\n\treturn out, nil\n}\n\n// normalizeSeverity accepts the documented spellings (case-insensitive) and\n// maps them to the Severity enum used everywhere else. It returns \"\" for\n// anything it does not recognize.\nfunc normalizeSeverity(s string) Severity {\n\tswitch strings.ToLower(strings.TrimSpace(s)) {\n\tcase \"critical\", \"crit\":\n\t\treturn SevCritical\n\tcase \"high\":\n\t\treturn SevHigh\n\tcase \"medium\", \"med\":\n\t\treturn SevMedium\n\tcase \"low\":\n\t\treturn SevLow\n\tcase \"info\", \"informational\":\n\t\treturn SevInfo\n\t}\n\treturn \"\"\n}\n"
  },
  {
    "path": "internal/jshunter/sarif.go",
    "content": "package jshunter\n\nimport (\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"os\"\n)\n\n// SARIF 2.1.0 output. Lets JSHunter feed GitHub Code Scanning, Azure Defender,\n// any other consumer that accepts the SARIF tool format. Spec:\n// https://docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html\n\ntype SARIFEnvelope struct {\n\tVersion string     `json:\"version\"`\n\tSchema  string     `json:\"$schema\"`\n\tRuns    []SARIFRun `json:\"runs\"`\n}\n\ntype SARIFRun struct {\n\tTool    SARIFTool     `json:\"tool\"`\n\tResults []SARIFResult `json:\"results\"`\n}\n\ntype SARIFTool struct {\n\tDriver SARIFDriver `json:\"driver\"`\n}\n\ntype SARIFDriver struct {\n\tName           string      `json:\"name\"`\n\tVersion        string      `json:\"version\"`\n\tInformationURI string      `json:\"informationUri,omitempty\"`\n\tRules          []SARIFRule `json:\"rules\"`\n}\n\ntype SARIFRule struct {\n\tID                   string                  `json:\"id\"`\n\tName                 string                  `json:\"name,omitempty\"`\n\tShortDescription     SARIFText               `json:\"shortDescription\"`\n\tFullDescription      *SARIFText              `json:\"fullDescription,omitempty\"`\n\tHelpURI              string                  `json:\"helpUri,omitempty\"`\n\tDefaultConfiguration *SARIFRuleConfiguration `json:\"defaultConfiguration,omitempty\"`\n\tProperties           map[string]string       `json:\"properties,omitempty\"`\n}\n\ntype SARIFText struct {\n\tText string `json:\"text\"`\n}\n\ntype SARIFRuleConfiguration struct {\n\tLevel string `json:\"level\"`\n}\n\ntype SARIFResult struct {\n\tRuleID              string                 `json:\"ruleId\"`\n\tLevel               string                 `json:\"level\"`\n\tMessage             SARIFText              `json:\"message\"`\n\tLocations           []SARIFLocation        `json:\"locations\"`\n\tPartialFingerprints map[string]string      `json:\"partialFingerprints,omitempty\"`\n\tProperties          
map[string]interface{} `json:\"properties,omitempty\"`\n}\n\ntype SARIFLocation struct {\n\tPhysicalLocation SARIFPhysicalLocation `json:\"physicalLocation\"`\n}\n\ntype SARIFPhysicalLocation struct {\n\tArtifactLocation SARIFArtifactLocation `json:\"artifactLocation\"`\n\tRegion           *SARIFRegion          `json:\"region,omitempty\"`\n}\n\ntype SARIFArtifactLocation struct {\n\tURI string `json:\"uri\"`\n}\n\ntype SARIFRegion struct {\n\tStartLine   int `json:\"startLine,omitempty\"`\n\tStartColumn int `json:\"startColumn,omitempty\"`\n}\n\n// severityToSARIFLevel maps JSHunter severities to the four levels SARIF\n// recognizes. Critical and High both become \"error\" because GitHub\n// code-scanning treats anything below \"error\" as informational.\nfunc severityToSARIFLevel(sev Severity) string {\n\tswitch sev {\n\tcase SevCritical, SevHigh:\n\t\treturn \"error\"\n\tcase SevMedium:\n\t\treturn \"warning\"\n\tcase SevLow:\n\t\treturn \"note\"\n\t}\n\treturn \"none\"\n}\n\n// ToSARIF converts the dedupe-table snapshot into a SARIF 2.1.0 envelope.\n// One result per Location so downstream tools can highlight every occurrence\n// rather than collapsing them under one finding.\nfunc ToSARIF() *SARIFEnvelope {\n\tregisterRules()\n\tdriverRules := make([]SARIFRule, 0, len(rulesRegistry))\n\tfor _, r := range rulesRegistry {\n\t\tdriverRules = append(driverRules, SARIFRule{\n\t\t\tID:               r.ID,\n\t\t\tName:             r.Name,\n\t\t\tShortDescription: SARIFText{Text: r.Name},\n\t\t\tDefaultConfiguration: &SARIFRuleConfiguration{\n\t\t\t\tLevel: severityToSARIFLevel(r.Severity),\n\t\t\t},\n\t\t\tProperties: map[string]string{\n\t\t\t\t\"provider\":    r.Provider,\n\t\t\t\t\"secret_type\": r.SecretType,\n\t\t\t\t\"severity\":    string(r.Severity),\n\t\t\t},\n\t\t})\n\t}\n\n\tresults := []SARIFResult{}\n\tfor _, f := range flushFindings() {\n\t\tlocs := f.Locations\n\t\tif len(locs) == 0 {\n\t\t\tlocs = []Location{{Source: f.Source, Line: f.Line, Column: 
f.Column}}\n\t\t}\n\t\tfor _, loc := range locs {\n\t\t\tregion := &SARIFRegion{StartLine: loc.Line, StartColumn: loc.Column}\n\t\t\tif loc.Line == 0 && loc.Column == 0 {\n\t\t\t\tregion = nil\n\t\t\t}\n\t\t\tresults = append(results, SARIFResult{\n\t\t\t\tRuleID: f.RuleID,\n\t\t\t\tLevel:  severityToSARIFLevel(f.Severity),\n\t\t\t\tMessage: SARIFText{\n\t\t\t\t\tText: fmt.Sprintf(\"%s detected (confidence=%.2f, verified=%v)\", f.Name, f.Confidence, f.Verified),\n\t\t\t\t},\n\t\t\t\tLocations: []SARIFLocation{{\n\t\t\t\t\tPhysicalLocation: SARIFPhysicalLocation{\n\t\t\t\t\t\tArtifactLocation: SARIFArtifactLocation{URI: loc.Source},\n\t\t\t\t\t\tRegion:           region,\n\t\t\t\t\t},\n\t\t\t\t}},\n\t\t\t\t// partialFingerprints lets GitHub Code Scanning persist\n\t\t\t\t// dismiss/suppress decisions across runs even when the\n\t\t\t\t// finding moves source/line. value_hash is stable per\n\t\t\t\t// secret value; ruleId+secretType disambiguates classes.\n\t\t\t\tPartialFingerprints: map[string]string{\n\t\t\t\t\t\"jshunter/valueHash\":      f.ValueHash,\n\t\t\t\t\t\"jshunter/ruleSecretType\": f.RuleID + \":\" + f.SecretType,\n\t\t\t\t},\n\t\t\t\tProperties: map[string]interface{}{\n\t\t\t\t\t\"confidence\": f.Confidence,\n\t\t\t\t\t\"verified\":   f.Verified,\n\t\t\t\t\t\"value_hash\": f.ValueHash,\n\t\t\t\t\t\"redacted\":   f.Redacted,\n\t\t\t\t\t\"entropy\":    f.Entropy,\n\t\t\t\t\t\"reasons\":    f.Reasons,\n\t\t\t\t},\n\t\t\t})\n\t\t}\n\t}\n\n\treturn &SARIFEnvelope{\n\t\tVersion: \"2.1.0\",\n\t\tSchema:  \"https://json.schemastore.org/sarif-2.1.0.json\",\n\t\tRuns: []SARIFRun{{\n\t\t\tTool: SARIFTool{Driver: SARIFDriver{\n\t\t\t\tName:           \"JSHunter\",\n\t\t\t\tVersion:        version,\n\t\t\t\tInformationURI: \"https://github.com/cc1a2b/jshunter\",\n\t\t\t\tRules:          driverRules,\n\t\t\t}},\n\t\t\tResults: results,\n\t\t}},\n\t}\n}\n\n// outputSARIF writes the envelope to stdout. Operators piping to a file via\n// `jshunter ... 
--sarif > findings.sarif` get a clean JSON document.\nfunc outputSARIF() {\n\tenv := ToSARIF()\n\tb, err := json.MarshalIndent(env, \"\", \"  \")\n\tif err != nil {\n\t\tfmt.Fprintf(os.Stderr, \"[sarif] marshal: %v\\n\", err)\n\t\treturn\n\t}\n\tfmt.Println(string(b))\n}\n"
  },
  {
    "path": "internal/jshunter/sourcemap.go",
    "content": "package jshunter\n\nimport (\n\t\"context\"\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"io\"\n\t\"net/http\"\n\t\"net/url\"\n\t\"regexp\"\n\t\"strings\"\n\t\"time\"\n)\n\n// Source-map ingestion. v0.5's `--sourcemap` flag was a stub — it found\n// the `//# sourceMappingURL=` reference and did nothing with it. v0.6+\n// fetches the map (or decodes the inline data URI), parses the JSON, and\n// scans every entry in `sourcesContent[]` as its own source. Modern\n// bundlers (Vite, esbuild, webpack 5, Turbopack, Rspack) ship the\n// pre-minification source verbatim in this array — including comments,\n// dev-only branches, and the original variable names — which is the\n// single highest-leverage signal for secret recon on production sites.\n//\n// Specs:\n//   https://sourcemaps.info/spec.html\n//   https://tc39.es/ecma426/\n\ntype sourceMap struct {\n\tVersion        int      `json:\"version\"`\n\tFile           string   `json:\"file,omitempty\"`\n\tSourceRoot     string   `json:\"sourceRoot,omitempty\"`\n\tSources        []string `json:\"sources\"`\n\tSourcesContent []string `json:\"sourcesContent,omitempty\"`\n}\n\n// sourceMappingURL captures both header form and inline-data form. The\n// `//#` and the older `//@` markers are equivalent per the spec.\nvar sourceMappingURLRe = regexp.MustCompile(`(?m)^[\\t ]*//[#@]\\s*sourceMappingURL=([^\\s]+)\\s*$`)\n\n// FetchAndScanSourceMap looks for a sourceMappingURL marker in `body`,\n// resolves it relative to `baseURL`, fetches (or decodes) the map, and\n// runs the v0.6 detection pipeline against every entry in\n// sourcesContent[]. Returns the count of original sources scanned.\n//\n// data: URI inlining is supported. 
http(s): is fetched via the same\n// hardened client (host limiter, max-bytes, SSRF guard).\nfunc FetchAndScanSourceMap(client *http.Client, baseURL string, body []byte, config *Config) (int, error) {\n\tm := sourceMappingURLRe.FindSubmatch(body)\n\tif m == nil {\n\t\treturn 0, nil\n\t}\n\tmapRef := strings.TrimSpace(string(m[1]))\n\tif mapRef == \"\" {\n\t\treturn 0, nil\n\t}\n\n\tmapBytes, err := fetchSourceMapPayload(client, baseURL, mapRef, config)\n\tif err != nil {\n\t\treturn 0, fmt.Errorf(\"fetch sourcemap: %w\", err)\n\t}\n\tif config.MaxBytes > 0 && int64(len(mapBytes)) > config.MaxBytes {\n\t\tmapBytes = mapBytes[:config.MaxBytes]\n\t}\n\n\tvar sm sourceMap\n\tif err := json.Unmarshal(mapBytes, &sm); err != nil {\n\t\treturn 0, fmt.Errorf(\"parse sourcemap: %w\", err)\n\t}\n\tif sm.Version != 3 {\n\t\t// Source-map v3 is the only deployed version; v1/v2 were proposals.\n\t\t// Treat unknown as best-effort.\n\t}\n\n\tscanned := 0\n\tfor i, content := range sm.SourcesContent {\n\t\tif content == \"\" {\n\t\t\tcontinue\n\t\t}\n\t\tsrc := sourceLabel(baseURL, &sm, i)\n\t\tif globalStats != nil {\n\t\t\tstatAdd(&globalStats.BytesParsed, int64(len(content)))\n\t\t}\n\t\tprocessed := processJSAnalysis([]byte(content), config)\n\t\treportMatchesWithConfig(src, processed, config)\n\t\tscanned++\n\t}\n\treturn scanned, nil\n}\n\n// sourceLabel builds a stable identifier for an in-map source so the\n// operator can locate it from the output (`vim` won't open it, but the\n// hash and path are consistent across runs).\nfunc sourceLabel(baseURL string, sm *sourceMap, idx int) string {\n\tif idx < len(sm.Sources) && sm.Sources[idx] != \"\" {\n\t\ts := sm.Sources[idx]\n\t\tif sm.SourceRoot != \"\" && !strings.HasPrefix(s, \"/\") && !strings.Contains(s, \"://\") {\n\t\t\ts = strings.TrimRight(sm.SourceRoot, \"/\") + \"/\" + s\n\t\t}\n\t\treturn baseURL + \".map#\" + s\n\t}\n\treturn fmt.Sprintf(\"%s.map#sources[%d]\", baseURL, idx)\n}\n\n// fetchSourceMapPayload resolves 
the map reference. Three forms supported:\n//   1. data:application/json[;base64],{...}  — inline (Vite dev, webpack devtool)\n//   2. /static/app.js.map  — root-relative\n//   3. https://cdn/app.js.map  — absolute\nfunc fetchSourceMapPayload(client *http.Client, baseURL, ref string, config *Config) ([]byte, error) {\n\tif strings.HasPrefix(ref, \"data:\") {\n\t\treturn decodeDataURI(ref)\n\t}\n\n\tmapURL := ref\n\tif !strings.Contains(ref, \"://\") {\n\t\tbase, err := url.Parse(baseURL)\n\t\tif err != nil {\n\t\t\treturn nil, fmt.Errorf(\"parse base url: %w\", err)\n\t\t}\n\t\trel, err := url.Parse(ref)\n\t\tif err != nil {\n\t\t\treturn nil, fmt.Errorf(\"parse map ref: %w\", err)\n\t\t}\n\t\tmapURL = base.ResolveReference(rel).String()\n\t}\n\n\tif err := validateTargetURL(mapURL, config.AllowInternal); err != nil {\n\t\treturn nil, err\n\t}\n\n\tctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)\n\tdefer cancel()\n\treq, err := http.NewRequestWithContext(ctx, \"GET\", mapURL, nil)\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\tif config.UserAgent != \"\" {\n\t\treq.Header.Set(\"User-Agent\", config.UserAgent)\n\t}\n\tresp, err := client.Do(req)\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\tdefer resp.Body.Close()\n\tif resp.StatusCode < 200 || resp.StatusCode >= 300 {\n\t\treturn nil, fmt.Errorf(\"sourcemap fetch returned %d\", resp.StatusCode)\n\t}\n\n\tlimit := config.MaxBytes\n\tif limit <= 0 {\n\t\tlimit = DefaultMaxBytes\n\t}\n\treturn io.ReadAll(io.LimitReader(resp.Body, limit))\n}\n\n// decodeDataURI handles both base64 and percent-encoded data: URIs.\n// data:[<mediatype>][;base64],<data>\nfunc decodeDataURI(uri string) ([]byte, error) {\n\tconst prefix = \"data:\"\n\tif !strings.HasPrefix(uri, prefix) {\n\t\treturn nil, fmt.Errorf(\"not a data URI\")\n\t}\n\trest := uri[len(prefix):]\n\tcomma := strings.Index(rest, \",\")\n\tif comma < 0 {\n\t\treturn nil, fmt.Errorf(\"data URI missing comma\")\n\t}\n\tmeta, body := 
rest[:comma], rest[comma+1:]\n\tif strings.Contains(meta, \";base64\") {\n\t\tdec, err := harBase64Decode([]byte(body))\n\t\tif err != nil {\n\t\t\treturn nil, fmt.Errorf(\"data URI base64: %w\", err)\n\t\t}\n\t\treturn dec, nil\n\t}\n\t// Percent-encoded payload. PathUnescape is used instead of QueryUnescape\n\t// because a literal '+' in a data URI is a plus sign, not an encoded space.\n\tout, err := url.PathUnescape(body)\n\tif err != nil {\n\t\treturn nil, fmt.Errorf(\"data URI url-unescape: %w\", err)\n\t}\n\treturn []byte(out), nil\n}\n"
  },
  {
    "path": "internal/jshunter/stats.go",
    "content": "package jshunter\n\nimport (\n\t\"crypto/rand\"\n\t\"encoding/hex\"\n\t\"fmt\"\n\t\"os\"\n\t\"sync/atomic\"\n\t\"time\"\n)\n\n// Stats is the operator's audit trail for a JSHunter run. Without these\n// counters, \"the FP filter dropped 800 things\" is opaque; with them, the\n// operator can answer \"did the filter eat a real key?\" by re-running with\n// --no-fp-filter and diffing.\ntype Stats struct {\n\tRunID                string\n\tStartedAt            time.Time\n\tURLsQueued           int64\n\tURLsFetched          int64\n\tURLsBlocked          int64\n\tBytesParsed          int64\n\tBytesTruncated       int64\n\tRegistryHits         int64\n\tLegacyMatchesRaw     int64\n\tDroppedVendorNoise   int64\n\tDroppedFixture       int64\n\tDroppedSourcemap     int64\n\tDroppedLowEntropy    int64\n\tDroppedNoContext     int64\n\tDroppedBelowConf     int64\n\tDroppedRegistryDup   int64\n\tFindingsAfterFilter  int64\n\tFindingsAfterDedupe  int64\n\tVerifyAttempts       int64\n\tVerifyAlive          int64\n\tVerifyDead           int64\n\tVerifyError          int64\n}\n\nvar globalStats *Stats\n\n// initStats creates the process-wide stats struct with a freshly minted\n// run-id. Repeated calls return the existing struct, but the first call is\n// not synchronized and must not race with other callers. The run-id makes\n// log lines correlatable across stages.\nfunc initStats() *Stats {\n\tif globalStats != nil {\n\t\treturn globalStats\n\t}\n\tglobalStats = &Stats{\n\t\tRunID:     newRunID(),\n\t\tStartedAt: time.Now(),\n\t}\n\treturn globalStats\n}\n\n// newRunID returns \"rid-\" plus 6 random bytes in hex, falling back to a\n// nanosecond timestamp if the system random source fails.\nfunc newRunID() string {\n\tb := make([]byte, 6)\n\tif _, err := rand.Read(b); err != nil {\n\t\treturn fmt.Sprintf(\"rid-%d\", time.Now().UnixNano())\n\t}\n\treturn \"rid-\" + hex.EncodeToString(b)\n}\n\n// statInc adds 1 to a counter via atomic ops. The counter pointer is passed\n// directly so callers can use it from any goroutine without taking a lock.\n// A nil pointer is a no-op.\nfunc statInc(p *int64) {\n\tif p != nil {\n\t\tatomic.AddInt64(p, 1)\n\t}\n}\n\n// statAdd adds n to a counter atomically; a nil pointer is a no-op.\nfunc statAdd(p *int64, n int64) {\n\tif p != nil {\n\t\tatomic.AddInt64(p, n)\n\t}\n}\n\n// printStats emits a human-friendly summary to stderr (so it doesn't pollute\n// the stdout pipeline operators feed into other tools). When --json is set,\n// the same numbers ride the JSON envelope under \"stats\".\nfunc printStats(s *Stats) {\n\tif s == nil {\n\t\treturn\n\t}\n\tdur := time.Since(s.StartedAt).Round(time.Millisecond)\n\tfmt.Fprintf(os.Stderr, \"\\n[%sSTATS%s] run=%s duration=%s\\n\", colors[\"BLUE\"], colors[\"NC\"], s.RunID, dur)\n\tfmt.Fprintf(os.Stderr, \"  fetched         : %d (%d blocked, %d truncated)\\n\",\n\t\tatomic.LoadInt64(&s.URLsFetched),\n\t\tatomic.LoadInt64(&s.URLsBlocked),\n\t\tatomic.LoadInt64(&s.BytesTruncated))\n\tfmt.Fprintf(os.Stderr, \"  bytes parsed    : %d\\n\", atomic.LoadInt64(&s.BytesParsed))\n\tfmt.Fprintf(os.Stderr, \"  registry hits   : %d\\n\", atomic.LoadInt64(&s.RegistryHits))\n\tfmt.Fprintf(os.Stderr, \"  legacy raw      : %d\\n\", atomic.LoadInt64(&s.LegacyMatchesRaw))\n\tfmt.Fprintf(os.Stderr, \"  dropped/vendor  : %d\\n\", atomic.LoadInt64(&s.DroppedVendorNoise))\n\tfmt.Fprintf(os.Stderr, \"  dropped/fixture : %d\\n\", atomic.LoadInt64(&s.DroppedFixture))\n\tfmt.Fprintf(os.Stderr, \"  dropped/srcmap  : %d\\n\", atomic.LoadInt64(&s.DroppedSourcemap))\n\tfmt.Fprintf(os.Stderr, \"  dropped/entropy : %d\\n\", atomic.LoadInt64(&s.DroppedLowEntropy))\n\tfmt.Fprintf(os.Stderr, \"  dropped/context : %d\\n\", atomic.LoadInt64(&s.DroppedNoContext))\n\tfmt.Fprintf(os.Stderr, \"  dropped/conf    : %d\\n\", atomic.LoadInt64(&s.DroppedBelowConf))\n\tfmt.Fprintf(os.Stderr, \"  dropped/dup     : %d\\n\", atomic.LoadInt64(&s.DroppedRegistryDup))\n\tfmt.Fprintf(os.Stderr, \"  findings post   : %d (after dedupe %d)\\n\",\n\t\tatomic.LoadInt64(&s.FindingsAfterFilter),\n\t\tatomic.LoadInt64(&s.FindingsAfterDedupe))\n\tif atomic.LoadInt64(&s.VerifyAttempts) > 0 {\n\t\tfmt.Fprintf(os.Stderr, \"  verify          : %d alive / %d dead / %d error\\n\",\n\t\t\tatomic.LoadInt64(&s.VerifyAlive),\n\t\t\tatomic.LoadInt64(&s.VerifyDead),\n\t\t\tatomic.LoadInt64(&s.VerifyError))\n\t}\n}\n"
  },
  {
    "path": "internal/jshunter/verify.go",
    "content": "package jshunter\n\nimport (\n\t\"context\"\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"io\"\n\t\"net/http\"\n\t\"strings\"\n\t\"sync\"\n\t\"time\"\n)\n\n// Verifier is the read-only liveness check for a discovered secret. Verifiers\n// never POST, never mutate, never list resources beyond the smallest possible\n// scope. Off by default — gated behind --verify — to keep recon legal and quiet.\ntype Verifier func(ctx context.Context, client *http.Client, value string) VerifyResult\n\n// VerifyResult is the structured outcome of a single liveness probe.\ntype VerifyResult struct {\n\tAlive   bool   `json:\"alive\"`\n\tStatus  int    `json:\"status,omitempty\"`\n\tAccount string `json:\"account,omitempty\"`\n\tNote    string `json:\"note,omitempty\"`\n\tError   string `json:\"error,omitempty\"`\n}\n\n// verifierRegistry maps rule_id -> liveness check.\nvar (\n\tverifierRegistry  = map[string]Verifier{}\n\tverifierRegOnce   sync.Once\n\tverifyHostLimiter = newHostLimiter(2, 250*time.Millisecond)\n)\n\n// registerVerifiers wires every rule that has a documented, read-only,\n// no-cost endpoint we can use to confirm liveness without taking a destructive\n// action. Per the brief: never auto-verify; always opt-in. 
Source citations\n// for each endpoint are inline so a reviewer can audit at a glance.\nfunc registerVerifiers() {\n\tverifierRegOnce.Do(func() {\n\t\t// Stripe — GET /v1/balance is the documented health-check endpoint:\n\t\t// https://docs.stripe.com/keys\n\t\tverifierRegistry[\"stripe.secret_key\"] = stripeVerify\n\t\tverifierRegistry[\"stripe.restricted_key\"] = stripeVerify\n\n\t\t// GitHub — GET /user with `Authorization: token <pat>`:\n\t\t// https://docs.github.com/en/rest/users/users\n\t\tverifierRegistry[\"github.pat_classic\"] = githubVerify\n\t\tverifierRegistry[\"github.fine_grained_pat\"] = githubVerify\n\n\t\t// OpenAI — GET /v1/models, no token cost:\n\t\t// https://platform.openai.com/docs/api-reference/models/list\n\t\tverifierRegistry[\"openai.legacy_key\"] = openaiVerify\n\t\tverifierRegistry[\"openai.project_key\"] = openaiVerify\n\t\tverifierRegistry[\"openai.svcacct_key\"] = openaiVerify\n\n\t\t// Anthropic — GET /v1/models with x-api-key + anthropic-version:\n\t\t// https://platform.claude.com/docs/en/api/overview\n\t\tverifierRegistry[\"anthropic.api_key\"] = anthropicVerify\n\n\t\t// Slack — GET auth.test, hundreds-rpm rate limit, returns ok+team:\n\t\t// https://docs.slack.dev/reference/methods/auth.test\n\t\tverifierRegistry[\"slack.user_or_bot_token\"] = slackVerify\n\t\tverifierRegistry[\"slack.app_token\"] = slackVerify\n\n\t\t// SendGrid — GET /v3/scopes returns the scopes the key has:\n\t\t// https://docs.sendgrid.com/api-reference/api-key-permissions\n\t\tverifierRegistry[\"sendgrid.api_key\"] = sendgridVerify\n\n\t\t// Mailgun — GET /v3/domains with HTTP basic api:<key>:\n\t\t// https://documentation.mailgun.com/docs/mailgun/api-reference/openapi-final/tag/Domains/\n\t\tverifierRegistry[\"mailgun.api_key\"] = mailgunVerify\n\n\t\t// HuggingFace — GET /api/whoami-v2 returns the user record:\n\t\t// https://huggingface.co/docs/api-inference/quicktour\n\t\tverifierRegistry[\"huggingface.token\"] = huggingfaceVerify\n\t})\n}\n\n// 
hostLimiter bounds outbound calls per provider host so a verify pass over\n// many findings doesn't trip rate limits and get the operator's IP banned.\ntype hostLimiter struct {\n\tmu      sync.Mutex\n\ttokens  map[string]chan struct{}\n\tper     int\n\tcooldown time.Duration\n}\n\nfunc newHostLimiter(per int, cooldown time.Duration) *hostLimiter {\n\treturn &hostLimiter{\n\t\ttokens:   map[string]chan struct{}{},\n\t\tper:      per,\n\t\tcooldown: cooldown,\n\t}\n}\n\nfunc (h *hostLimiter) acquire(host string) func() {\n\th.mu.Lock()\n\tch, ok := h.tokens[host]\n\tif !ok {\n\t\tch = make(chan struct{}, h.per)\n\t\th.tokens[host] = ch\n\t}\n\th.mu.Unlock()\n\tch <- struct{}{}\n\treturn func() {\n\t\ttime.Sleep(h.cooldown)\n\t\t<-ch\n\t}\n}\n\n// runVerify dispatches `value` to the rule's verifier with a bounded timeout.\n// Returns a redacted-friendly result; never returns the raw secret in errors.\nfunc runVerify(ruleID, value string, client *http.Client, timeout time.Duration) VerifyResult {\n\tregisterVerifiers()\n\tv, ok := verifierRegistry[ruleID]\n\tif !ok {\n\t\treturn VerifyResult{Note: \"no verifier registered for rule\"}\n\t}\n\tctx, cancel := context.WithTimeout(context.Background(), timeout)\n\tdefer cancel()\n\treturn v(ctx, client, value)\n}\n\n// doVerifyRequest is the shared HTTP path. 
It enforces the per-host limiter,\n// reads at most 64 KiB of the response body (we only need status + small JSON),\n// and translates any low-level error into a VerifyResult.Error string.\nfunc doVerifyRequest(ctx context.Context, client *http.Client, req *http.Request) (*http.Response, []byte, VerifyResult) {\n\thost := req.URL.Host\n\trelease := verifyHostLimiter.acquire(host)\n\tdefer release()\n\n\tresp, err := client.Do(req.WithContext(ctx))\n\tif err != nil {\n\t\treturn nil, nil, VerifyResult{Error: sanitizeNetErr(err.Error())}\n\t}\n\tdefer resp.Body.Close()\n\tbody, _ := io.ReadAll(&capReader{r: resp.Body, max: 64 * 1024})\n\treturn resp, body, VerifyResult{Status: resp.StatusCode}\n}\n\n// sanitizeNetErr ensures we never leak a secret value in an error message.\n// Common transport errors include the URL with query/path; some tokens (e.g.,\n// Slack legacy) can be passed as ?token=, so we redact aggressively.\nfunc sanitizeNetErr(msg string) string {\n\tif i := strings.Index(msg, \"token=\"); i != -1 {\n\t\tmsg = msg[:i] + \"token=***REDACTED***\"\n\t}\n\tif i := strings.Index(msg, \"Bearer \"); i != -1 {\n\t\tmsg = msg[:i] + \"Bearer ***REDACTED***\"\n\t}\n\treturn msg\n}\n\n// capReader caps a stream at `max` bytes — verify endpoints return small JSON;\n// a hostile or misconfigured proxy could otherwise stream arbitrary content.\ntype capReader struct {\n\tr   interface{ Read([]byte) (int, error) }\n\tmax int64\n\tn   int64\n}\n\nfunc (c *capReader) Read(p []byte) (int, error) {\n\tif c.n >= c.max {\n\t\treturn 0, fmt.Errorf(\"verify: response exceeded %d bytes\", c.max)\n\t}\n\tif int64(len(p)) > c.max-c.n {\n\t\tp = p[:c.max-c.n]\n\t}\n\tn, err := c.r.Read(p)\n\tc.n += int64(n)\n\treturn n, err\n}\n\n// stripeVerify uses the documented lightweight `/v1/balance` endpoint.\n// 200 → live, account info isn't returned by /v1/balance so we leave Account empty.\nfunc stripeVerify(ctx context.Context, client *http.Client, value string) VerifyResult 
{\n\treq, err := http.NewRequest(\"GET\", \"https://api.stripe.com/v1/balance\", nil)\n\tif err != nil {\n\t\treturn VerifyResult{Error: err.Error()}\n\t}\n\treq.Header.Set(\"Authorization\", \"Bearer \"+value)\n\tresp, _, res := doVerifyRequest(ctx, client, req)\n\tif resp == nil {\n\t\treturn res\n\t}\n\tswitch {\n\tcase resp.StatusCode == http.StatusOK:\n\t\tres.Alive = true\n\t\tres.Note = \"stripe /v1/balance returned 200\"\n\tcase resp.StatusCode == http.StatusUnauthorized || resp.StatusCode == http.StatusForbidden:\n\t\tres.Note = \"stripe rejected the key\"\n\tdefault:\n\t\tres.Note = fmt.Sprintf(\"stripe returned %d\", resp.StatusCode)\n\t}\n\treturn res\n}\n\n// githubVerify hits /user. A successful response carries `login` which we\n// surface as Account so the operator knows whose token they captured.\nfunc githubVerify(ctx context.Context, client *http.Client, value string) VerifyResult {\n\treq, err := http.NewRequest(\"GET\", \"https://api.github.com/user\", nil)\n\tif err != nil {\n\t\treturn VerifyResult{Error: err.Error()}\n\t}\n\treq.Header.Set(\"Authorization\", \"token \"+value)\n\treq.Header.Set(\"Accept\", \"application/vnd.github+json\")\n\treq.Header.Set(\"X-GitHub-Api-Version\", \"2022-11-28\")\n\tresp, body, res := doVerifyRequest(ctx, client, req)\n\tif resp == nil {\n\t\treturn res\n\t}\n\tswitch resp.StatusCode {\n\tcase http.StatusOK:\n\t\tres.Alive = true\n\t\tvar u struct {\n\t\t\tLogin string `json:\"login\"`\n\t\t}\n\t\t_ = json.Unmarshal(body, &u)\n\t\tres.Account = u.Login\n\t\tres.Note = \"github /user returned 200\"\n\tcase http.StatusUnauthorized:\n\t\tres.Note = \"github rejected the token\"\n\tdefault:\n\t\tres.Note = fmt.Sprintf(\"github returned %d\", resp.StatusCode)\n\t}\n\treturn res\n}\n\n// openaiVerify uses GET /v1/models. 
No token cost.\nfunc openaiVerify(ctx context.Context, client *http.Client, value string) VerifyResult {\n\treq, err := http.NewRequest(\"GET\", \"https://api.openai.com/v1/models\", nil)\n\tif err != nil {\n\t\treturn VerifyResult{Error: err.Error()}\n\t}\n\treq.Header.Set(\"Authorization\", \"Bearer \"+value)\n\tresp, _, res := doVerifyRequest(ctx, client, req)\n\tif resp == nil {\n\t\treturn res\n\t}\n\tswitch resp.StatusCode {\n\tcase http.StatusOK:\n\t\tres.Alive = true\n\t\tres.Note = \"openai /v1/models returned 200\"\n\tcase http.StatusUnauthorized:\n\t\tres.Note = \"openai rejected the key\"\n\tdefault:\n\t\tres.Note = fmt.Sprintf(\"openai returned %d\", resp.StatusCode)\n\t}\n\treturn res\n}\n\n// anthropicVerify uses x-api-key + anthropic-version on GET /v1/models.\nfunc anthropicVerify(ctx context.Context, client *http.Client, value string) VerifyResult {\n\treq, err := http.NewRequest(\"GET\", \"https://api.anthropic.com/v1/models\", nil)\n\tif err != nil {\n\t\treturn VerifyResult{Error: err.Error()}\n\t}\n\treq.Header.Set(\"x-api-key\", value)\n\treq.Header.Set(\"anthropic-version\", \"2023-06-01\")\n\tresp, _, res := doVerifyRequest(ctx, client, req)\n\tif resp == nil {\n\t\treturn res\n\t}\n\tswitch resp.StatusCode {\n\tcase http.StatusOK:\n\t\tres.Alive = true\n\t\tres.Note = \"anthropic /v1/models returned 200\"\n\tcase http.StatusUnauthorized:\n\t\tres.Note = \"anthropic rejected the key\"\n\tdefault:\n\t\tres.Note = fmt.Sprintf(\"anthropic returned %d\", resp.StatusCode)\n\t}\n\treturn res\n}\n\n// slackVerify hits auth.test which returns ok=true plus team/user metadata.\n// We surface team_id/user as Account for triage.\nfunc slackVerify(ctx context.Context, client *http.Client, value string) VerifyResult {\n\treq, err := http.NewRequest(\"GET\", \"https://slack.com/api/auth.test\", nil)\n\tif err != nil {\n\t\treturn VerifyResult{Error: err.Error()}\n\t}\n\treq.Header.Set(\"Authorization\", \"Bearer \"+value)\n\tresp, body, res := 
doVerifyRequest(ctx, client, req)\n\tif resp == nil {\n\t\treturn res\n\t}\n\tif resp.StatusCode != http.StatusOK {\n\t\tres.Note = fmt.Sprintf(\"slack returned %d\", resp.StatusCode)\n\t\treturn res\n\t}\n\tvar s struct {\n\t\tOK     bool   `json:\"ok\"`\n\t\tTeam   string `json:\"team\"`\n\t\tUser   string `json:\"user\"`\n\t\tError  string `json:\"error\"`\n\t\tTeamID string `json:\"team_id\"`\n\t\tUserID string `json:\"user_id\"`\n\t}\n\tif err := json.Unmarshal(body, &s); err != nil {\n\t\tres.Note = \"slack: cannot parse auth.test response\"\n\t\treturn res\n\t}\n\tif !s.OK {\n\t\tres.Note = \"slack auth.test ok=false: \" + s.Error\n\t\treturn res\n\t}\n\tres.Alive = true\n\tres.Account = fmt.Sprintf(\"%s/%s (team=%s user=%s)\", s.TeamID, s.UserID, s.Team, s.User)\n\tres.Note = \"slack auth.test ok=true\"\n\treturn res\n}\n\n// sendgridVerify uses /v3/scopes to confirm the key works.\nfunc sendgridVerify(ctx context.Context, client *http.Client, value string) VerifyResult {\n\treq, err := http.NewRequest(\"GET\", \"https://api.sendgrid.com/v3/scopes\", nil)\n\tif err != nil {\n\t\treturn VerifyResult{Error: err.Error()}\n\t}\n\treq.Header.Set(\"Authorization\", \"Bearer \"+value)\n\tresp, _, res := doVerifyRequest(ctx, client, req)\n\tif resp == nil {\n\t\treturn res\n\t}\n\tswitch resp.StatusCode {\n\tcase http.StatusOK:\n\t\tres.Alive = true\n\t\tres.Note = \"sendgrid /v3/scopes returned 200\"\n\tcase http.StatusUnauthorized, http.StatusForbidden:\n\t\tres.Note = \"sendgrid rejected the key\"\n\tdefault:\n\t\tres.Note = fmt.Sprintf(\"sendgrid returned %d\", resp.StatusCode)\n\t}\n\treturn res\n}\n\n// mailgunVerify uses HTTP Basic api:<key> against /v3/domains.\nfunc mailgunVerify(ctx context.Context, client *http.Client, value string) VerifyResult {\n\treq, err := http.NewRequest(\"GET\", \"https://api.mailgun.net/v3/domains\", nil)\n\tif err != nil {\n\t\treturn VerifyResult{Error: err.Error()}\n\t}\n\treq.SetBasicAuth(\"api\", value)\n\tresp, _, res := 
doVerifyRequest(ctx, client, req)\n\tif resp == nil {\n\t\treturn res\n\t}\n\tswitch resp.StatusCode {\n\tcase http.StatusOK:\n\t\tres.Alive = true\n\t\tres.Note = \"mailgun /v3/domains returned 200\"\n\tcase http.StatusUnauthorized:\n\t\tres.Note = \"mailgun rejected the key\"\n\tdefault:\n\t\tres.Note = fmt.Sprintf(\"mailgun returned %d\", resp.StatusCode)\n\t}\n\treturn res\n}\n\n// huggingfaceVerify uses /api/whoami-v2 which returns the account record.\n// Account is best-effort: it is only set when the response body parses.\nfunc huggingfaceVerify(ctx context.Context, client *http.Client, value string) VerifyResult {\n\treq, err := http.NewRequest(\"GET\", \"https://huggingface.co/api/whoami-v2\", nil)\n\tif err != nil {\n\t\treturn VerifyResult{Error: err.Error()}\n\t}\n\treq.Header.Set(\"Authorization\", \"Bearer \"+value)\n\tresp, body, res := doVerifyRequest(ctx, client, req)\n\tif resp == nil {\n\t\treturn res\n\t}\n\tif resp.StatusCode != http.StatusOK {\n\t\tres.Note = fmt.Sprintf(\"huggingface returned %d\", resp.StatusCode)\n\t\treturn res\n\t}\n\tvar u struct {\n\t\tName string `json:\"name\"`\n\t\tType string `json:\"type\"`\n\t}\n\tres.Alive = true\n\tres.Note = \"huggingface /api/whoami-v2 returned 200\"\n\t// A 200 already proves the token is live; skip Account rather than emit \" ()\" on a bad body.\n\tif err := json.Unmarshal(body, &u); err == nil && u.Name != \"\" {\n\t\tres.Account = fmt.Sprintf(\"%s (%s)\", u.Name, u.Type)\n\t}\n\treturn res\n}\n"
  },
  {
    "path": "patterns.json",
    "content": "{\n  \"ajax_url\": \"\\\\.ajax\\\\s*\\\\(\\\\s*[\\\"'][^\\\"']*[\\\"']\",\n  \"api_endpoint\": \"[\\\"']/api/[a-zA-Z0-9._~:/?#[\\\\]@!$\\u0026'()*+,;=%\\\\-]*[\\\"']\",\n  \"endpoint_url\": \"https?://[a-zA-Z0-9.-]+/[a-zA-Z0-9._~:/?#[\\\\]@!$\\u0026'()*+,;=%\\\\-]*\",\n  \"fetch_url\": \"fetch\\\\s*\\\\(\\\\s*[\\\"'][^\\\"']*[\\\"']\",\n  \"graphql_endpoint\": \"[\\\"']/graphql[\\\"']\",\n  \"rest_endpoint\": \"[\\\"']/[a-zA-Z0-9._~:/?#[\\\\]@!$\\u0026'()*+,;=%\\\\-]*[\\\"']\",\n  \"websocket_url\": \"new\\\\s+WebSocket\\\\s*\\\\(\\\\s*[\\\"'][^\\\"']*[\\\"']\",\n  \"xhr_url\": \"\\\\.open\\\\s*\\\\(\\\\s*[\\\"'][^\\\"']*[\\\"']\\\\s*,\\\\s*[\\\"'][^\\\"']*[\\\"']\"\n}"
  }
]