Repository: uhop/node-re2
Branch: master
Commit: 209646f8995e
Files: 103
Total size: 248.6 KB
Directory structure:
gitextract_8333oox3/
├── .clinerules
├── .cursorrules
├── .editorconfig
├── .github/
│ ├── COPILOT-INSTRUCTIONS.md
│ ├── FUNDING.yml
│ ├── actions/
│ │ ├── linux-alpine-node-20/
│ │ │ ├── Dockerfile
│ │ │ ├── action.yml
│ │ │ └── entrypoint.sh
│ │ ├── linux-alpine-node-22/
│ │ │ ├── Dockerfile
│ │ │ ├── action.yml
│ │ │ └── entrypoint.sh
│ │ ├── linux-alpine-node-24/
│ │ │ ├── Dockerfile
│ │ │ ├── action.yml
│ │ │ └── entrypoint.sh
│ │ ├── linux-alpine-node-25/
│ │ │ ├── Dockerfile
│ │ │ ├── action.yml
│ │ │ └── entrypoint.sh
│ │ ├── linux-node-20/
│ │ │ ├── Dockerfile
│ │ │ ├── action.yml
│ │ │ └── entrypoint.sh
│ │ ├── linux-node-22/
│ │ │ ├── Dockerfile
│ │ │ ├── action.yml
│ │ │ └── entrypoint.sh
│ │ ├── linux-node-24/
│ │ │ ├── Dockerfile
│ │ │ ├── action.yml
│ │ │ └── entrypoint.sh
│ │ └── linux-node-25/
│ │     ├── Dockerfile
│ │     ├── action.yml
│ │     └── entrypoint.sh
│ ├── dependabot.yml
│ └── workflows/
│     ├── build.yml
│     └── tests.yml
├── .gitignore
├── .gitmodules
├── .prettierignore
├── .prettierrc
├── .vscode/
│ ├── c_cpp_properties.json
│ ├── launch.json
│ ├── settings.json
│ └── tasks.json
├── .windsurf/
│ ├── skills/
│ │ ├── docs-review/
│ │ │ └── SKILL.md
│ │ └── write-tests/
│ │     └── SKILL.md
│ └── workflows/
│     ├── add-module.md
│     ├── ai-docs-update.md
│     └── release-check.md
├── .windsurfrules
├── AGENTS.md
├── ARCHITECTURE.md
├── CLAUDE.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── bench/
│ ├── bad-pattern.mjs
│ └── set-match.mjs
├── binding.gyp
├── lib/
│ ├── accessors.cc
│ ├── addon.cc
│ ├── exec.cc
│ ├── isolate_data.h
│ ├── match.cc
│ ├── new.cc
│ ├── pattern.cc
│ ├── pattern.h
│ ├── replace.cc
│ ├── search.cc
│ ├── set.cc
│ ├── split.cc
│ ├── test.cc
│ ├── to_string.cc
│ ├── util.cc
│ ├── util.h
│ ├── wrapped_re2.h
│ └── wrapped_re2_set.h
├── llms-full.txt
├── llms.txt
├── package.json
├── re2.d.ts
├── re2.js
├── scripts/
│ └── verify-build.js
├── tests/
│ ├── manual/
│ │ ├── matchall-bench.js
│ │ ├── memory-check.js
│ │ ├── memory-monitor.js
│ │ ├── test-unicode-warning.mjs
│ │ └── worker.js
│ ├── test-cjs.cjs
│ ├── test-exec.mjs
│ ├── test-general.mjs
│ ├── test-groups.mjs
│ ├── test-invalid.mjs
│ ├── test-match.mjs
│ ├── test-matchAll.mjs
│ ├── test-prototype.mjs
│ ├── test-replace.mjs
│ ├── test-search.mjs
│ ├── test-set.mjs
│ ├── test-source.mjs
│ ├── test-split.mjs
│ ├── test-symbols.mjs
│ ├── test-test.mjs
│ ├── test-toString.mjs
│ └── test-unicode-classes.mjs
├── ts-tests/
│ └── test-types.ts
└── tsconfig.json
================================================
FILE CONTENTS
================================================
================================================
FILE: .clinerules
================================================
<!-- Canonical source: AGENTS.md — keep this file in sync -->
# node-re2 — AI Agent Rules
## Project identity
node-re2 provides Node.js bindings for RE2: a fast, safe alternative to backtracking regular expression engines. The npm package name is `re2`. It is a C++ native addon built with `node-gyp` and `nan`.
## Critical rules
- **CommonJS.** The project is `"type": "commonjs"`. Use `require()` in source, `import` in tests (`.mjs`).
- **No transpilation.** JavaScript code runs directly.
- **Do not modify vendored code.** Never edit files under `vendor/`. They are git submodules.
- **Do not modify or delete test expectations** without understanding why they changed.
- **Do not add comments or remove comments** unless explicitly asked.
- **Keep `re2.js` and `re2.d.ts` in sync.** All public API exposed from `re2.js` must be typed in `re2.d.ts`.
- **The addon must build on all supported platforms:** Linux (x64, arm64, Alpine), macOS (x64, arm64), Windows (x64, arm64).
- **RE2 is always Unicode-mode.** The `u` flag is always added implicitly.
- **Buffer support is a first-class feature.** All methods that accept strings must also accept Buffers, returning Buffers when given Buffer input.
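The Buffer rule above can be illustrated with a minimal sketch. It uses the built-in `RegExp` as a stand-in, so it only shows the input/output type contract (Buffer in, Buffer out), not how the addon processes bytes. The helper name `replaceLike` is hypothetical and is not part of the `re2` API.

```javascript
// Hypothetical helper: shows the type contract only. The built-in RegExp
// stands in for the native addon, which is not loaded here.
function replaceLike(regexp, input, replacement) {
  const isBuffer = Buffer.isBuffer(input);
  // Assumption for this sketch: Buffers hold UTF-8 text.
  const text = isBuffer ? input.toString('utf8') : input;
  const result = text.replace(regexp, replacement);
  // Buffer in, Buffer out; string in, string out.
  return isBuffer ? Buffer.from(result, 'utf8') : result;
}

const fromString = replaceLike(/a/g, 'banana', 'o');
const fromBuffer = replaceLike(/a/g, Buffer.from('banana'), 'o');
console.log(typeof fromString, Buffer.isBuffer(fromBuffer));
```

A test for any string-accepting method could assert both halves of this contract: the result type follows the input type, and the decoded content is the same either way.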
## Code style
- C++ code: tabs, 4-wide indentation. JavaScript: 2-space indentation.
- Prettier: 80 char width, single quotes, no bracket spacing, no trailing commas, arrow parens "avoid" (see `.prettierrc`).
- nan (Native Abstractions for Node.js) for the C++ addon API.
- Semicolons are enforced by Prettier (default `semi: true`).
## Architecture quick reference
- `re2.js` is the main entry point. Loads `build/Release/re2.node`, sets up Symbol aliases (`Symbol.match`, `Symbol.search`, `Symbol.replace`, `Symbol.split`, `Symbol.matchAll`).
- C++ addon (`lib/*.cc`) wraps Google's RE2 via nan. Each RegExp method has its own `.cc` file.
- `lib/new.cc` handles construction: parse pattern/flags, translate RegExp → RE2 syntax (via `lib/pattern.cc`).
- `lib/pattern.cc` translates Unicode class names (`\p{Letter}` → `\p{L}`, `\p{Script=Latin}` → `\p{Latin}`).
- `lib/set.cc` implements `RE2.Set` for multi-pattern matching.
- `lib/util.cc` provides UTF-8 ↔ UTF-16 conversion and buffer helpers.
- Prebuilt artifacts downloaded at install time via `install-artifact-from-github`.
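As a rough illustration of what "Symbol aliases" means in the list above, the sketch below attaches well-known symbols to a small stand-in class so that `String.prototype.match()` and `String.prototype.search()` delegate to it. `TinyMatcher` is hypothetical and far simpler than the real `RE2` object; the actual wiring in `re2.js` may differ.

```javascript
// Hypothetical stand-in for a regex-like class; not the real RE2 object.
class TinyMatcher {
  constructor(word) {
    this.word = word;
  }
  // Named methods, loosely mirroring RegExp-style APIs.
  match(str) {
    return str.indexOf(this.word) < 0 ? null : [this.word];
  }
  search(str) {
    return str.indexOf(this.word);
  }
}

// Alias the named methods under well-known symbols so that string
// methods dispatch to them.
TinyMatcher.prototype[Symbol.match] = TinyMatcher.prototype.match;
TinyMatcher.prototype[Symbol.search] = TinyMatcher.prototype.search;

const matcher = new TinyMatcher('re2');
const matched = 'node-re2 bindings'.match(matcher);
const position = 'node-re2 bindings'.search(matcher);
console.log(matched, position);
```

With the aliases in place, callers can pass the object anywhere a string method accepts a `RegExp`, without calling the named methods directly.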
## Verification commands
- `npm test` — run the full test suite (worker threads)
- `node tests/test-<name>.mjs` — run a single test file directly
- `npm run test:seq` — run sequentially
- `npm run test:proc` — run multi-process
- `npm run ts-check` — TypeScript type checking
- `npm run lint` — Prettier check
- `npm run lint:fix` — Prettier write
- `npm run verify-build` — quick smoke test
- `npm run rebuild` — rebuild the native addon (release)
- `npm run rebuild:dev` — rebuild the native addon (debug)
## File layout
- Entry point: `re2.js` + `re2.d.ts`
- C++ addon: `lib/*.cc`, `lib/*.h`
- Build config: `binding.gyp`
- Tests: `tests/test-*.mjs`
- TypeScript tests: `ts-tests/test-*.ts`
- Benchmarks: `bench/`
- Vendored deps: `vendor/re2/`, `vendor/abseil-cpp/` (git submodules)
- CI: `.github/workflows/`, `.github/actions/`
## When reading the codebase
- Start with `ARCHITECTURE.md` for the module map and dependency graph.
- `re2.d.ts` is the best API reference for the public API. It includes `internalSource` and Buffer overloads.
- `re2.js` is tiny — read it first for the JS-side setup.
- `lib/addon.cc` shows how all C++ methods are registered.
- `lib/wrapped_re2.h` defines the core C++ class.
================================================
FILE: .cursorrules
================================================
<!-- Canonical source: AGENTS.md — keep this file in sync -->
# node-re2 — AI Agent Rules
## Project identity
node-re2 provides Node.js bindings for RE2: a fast, safe alternative to backtracking regular expression engines. The npm package name is `re2`. It is a C++ native addon built with `node-gyp` and `nan`.
## Critical rules
- **CommonJS.** The project is `"type": "commonjs"`. Use `require()` in source, `import` in tests (`.mjs`).
- **No transpilation.** JavaScript code runs directly.
- **Do not modify vendored code.** Never edit files under `vendor/`. They are git submodules.
- **Do not modify or delete test expectations** without understanding why they changed.
- **Do not add comments or remove comments** unless explicitly asked.
- **Keep `re2.js` and `re2.d.ts` in sync.** All public API exposed from `re2.js` must be typed in `re2.d.ts`.
- **The addon must build on all supported platforms:** Linux (x64, arm64, Alpine), macOS (x64, arm64), Windows (x64, arm64).
- **RE2 is always Unicode-mode.** The `u` flag is always added implicitly.
- **Buffer support is a first-class feature.** All methods that accept strings must also accept Buffers, returning Buffers when given Buffer input.
## Code style
- C++ code: tabs, 4-wide indentation. JavaScript: 2-space indentation.
- Prettier: 80 char width, single quotes, no bracket spacing, no trailing commas, arrow parens "avoid" (see `.prettierrc`).
- nan (Native Abstractions for Node.js) for the C++ addon API.
- Semicolons are enforced by Prettier (default `semi: true`).
## Architecture quick reference
- `re2.js` is the main entry point. Loads `build/Release/re2.node`, sets up Symbol aliases (`Symbol.match`, `Symbol.search`, `Symbol.replace`, `Symbol.split`, `Symbol.matchAll`).
- C++ addon (`lib/*.cc`) wraps Google's RE2 via nan. Each RegExp method has its own `.cc` file.
- `lib/new.cc` handles construction: parse pattern/flags, translate RegExp → RE2 syntax (via `lib/pattern.cc`).
- `lib/pattern.cc` translates Unicode class names (`\p{Letter}` → `\p{L}`, `\p{Script=Latin}` → `\p{Latin}`).
- `lib/set.cc` implements `RE2.Set` for multi-pattern matching.
- `lib/util.cc` provides UTF-8 ↔ UTF-16 conversion and buffer helpers.
- Prebuilt artifacts downloaded at install time via `install-artifact-from-github`.
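The Unicode class name translation mentioned above can be sketched in JavaScript. This is only an illustration of the idea: the real translation lives in C++ in `lib/pattern.cc`, and the lookup table, the function name `translateUnicodeClasses`, and the handling of unknown names below are all assumptions. The sketch also ignores escaped backslashes before `p`.

```javascript
// Hypothetical lookup table: a tiny subset of long names and short names.
const GENERAL_CATEGORIES = new Map([
  ['Letter', 'L'],
  ['Uppercase_Letter', 'Lu'],
  ['Lowercase_Letter', 'Ll'],
  ['Number', 'N'],
  ['Punctuation', 'P']
]);

// Rewrites \p{...} and \P{...} names into their shorter forms.
function translateUnicodeClasses(pattern) {
  return pattern.replace(/\\([pP])\{([^}]+)\}/g, (whole, kind, name) => {
    // \p{Script=Latin} keeps only the script name.
    const script = /^Script=(.+)$/.exec(name);
    if (script) return `\\${kind}{${script[1]}}`;
    const short = GENERAL_CATEGORIES.get(name);
    // Assumption: names missing from the table pass through unchanged.
    return short ? `\\${kind}{${short}}` : whole;
  });
}

console.log(translateUnicodeClasses('\\p{Letter}+\\p{Script=Latin}'));
```

The call above prints `\p{L}+\p{Latin}`, matching the two examples given in the `lib/pattern.cc` bullet.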
## Verification commands
- `npm test` — run the full test suite (worker threads)
- `node tests/test-<name>.mjs` — run a single test file directly
- `npm run test:seq` — run sequentially
- `npm run test:proc` — run multi-process
- `npm run ts-check` — TypeScript type checking
- `npm run lint` — Prettier check
- `npm run lint:fix` — Prettier write
- `npm run verify-build` — quick smoke test
- `npm run rebuild` — rebuild the native addon (release)
- `npm run rebuild:dev` — rebuild the native addon (debug)
## File layout
- Entry point: `re2.js` + `re2.d.ts`
- C++ addon: `lib/*.cc`, `lib/*.h`
- Build config: `binding.gyp`
- Tests: `tests/test-*.mjs`
- TypeScript tests: `ts-tests/test-*.ts`
- Benchmarks: `bench/`
- Vendored deps: `vendor/re2/`, `vendor/abseil-cpp/` (git submodules)
- CI: `.github/workflows/`, `.github/actions/`
## When reading the codebase
- Start with `ARCHITECTURE.md` for the module map and dependency graph.
- `re2.d.ts` is the best API reference for the public API. It includes `internalSource` and Buffer overloads.
- `re2.js` is tiny — read it first for the JS-side setup.
- `lib/addon.cc` shows how all C++ methods are registered.
- `lib/wrapped_re2.h` defines the core C++ class.
================================================
FILE: .editorconfig
================================================
root = true
[*]
charset = utf-8
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true
indent_style = space
indent_size = 2
[*.{h,cc,cpp}]
indent_style = tab
indent_size = 4
================================================
FILE: .github/COPILOT-INSTRUCTIONS.md
================================================
<!-- GitHub Copilot project instructions — canonical source is AGENTS.md -->
See [AGENTS.md](../AGENTS.md) for all AI agent rules and project conventions.
================================================
FILE: .github/FUNDING.yml
================================================
github: uhop
buy_me_a_coffee: uhop
================================================
FILE: .github/actions/linux-alpine-node-20/Dockerfile
================================================
FROM node:20-alpine
RUN apk add --no-cache python3 make gcc g++ linux-headers
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
================================================
FILE: .github/actions/linux-alpine-node-20/action.yml
================================================
name: 'Create a binary artifact for Node 20 on Alpine Linux'
description: 'Create a binary artifact for Node 20 on Alpine Linux using musl'
runs:
  using: 'docker'
  image: 'Dockerfile'
  args:
    - ${{inputs.node-version}}
================================================
FILE: .github/actions/linux-alpine-node-20/entrypoint.sh
================================================
#!/bin/sh
set -e
export USERNAME=`whoami`
export DEVELOPMENT_SKIP_GETTING_ASSET=true
npm i
npm run build --if-present
npm test
npm run save-to-github
================================================
FILE: .github/actions/linux-alpine-node-22/Dockerfile
================================================
FROM node:22-alpine
RUN apk add --no-cache python3 make gcc g++ linux-headers
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
================================================
FILE: .github/actions/linux-alpine-node-22/action.yml
================================================
name: 'Create a binary artifact for Node 22 on Alpine Linux'
description: 'Create a binary artifact for Node 22 on Alpine Linux using musl'
runs:
  using: 'docker'
  image: 'Dockerfile'
  args:
    - ${{inputs.node-version}}
================================================
FILE: .github/actions/linux-alpine-node-22/entrypoint.sh
================================================
#!/bin/sh
set -e
export USERNAME=`whoami`
export DEVELOPMENT_SKIP_GETTING_ASSET=true
npm i
npm run build --if-present
npm test
npm run save-to-github
================================================
FILE: .github/actions/linux-alpine-node-24/Dockerfile
================================================
FROM node:24-alpine
RUN apk add --no-cache python3 make gcc g++ linux-headers
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
================================================
FILE: .github/actions/linux-alpine-node-24/action.yml
================================================
name: 'Create a binary artifact for Node 24 on Alpine Linux'
description: 'Create a binary artifact for Node 24 on Alpine Linux using musl'
runs:
  using: 'docker'
  image: 'Dockerfile'
  args:
    - ${{inputs.node-version}}
================================================
FILE: .github/actions/linux-alpine-node-24/entrypoint.sh
================================================
#!/bin/sh
set -e
export USERNAME=`whoami`
export DEVELOPMENT_SKIP_GETTING_ASSET=true
npm i
npm run build --if-present
npm test
npm run save-to-github
================================================
FILE: .github/actions/linux-alpine-node-25/Dockerfile
================================================
FROM node:25-alpine
RUN apk add --no-cache python3 make gcc g++ linux-headers
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
================================================
FILE: .github/actions/linux-alpine-node-25/action.yml
================================================
name: 'Create a binary artifact for Node 25 on Alpine Linux'
description: 'Create a binary artifact for Node 25 on Alpine Linux using musl'
runs:
  using: 'docker'
  image: 'Dockerfile'
  args:
    - ${{inputs.node-version}}
================================================
FILE: .github/actions/linux-alpine-node-25/entrypoint.sh
================================================
#!/bin/sh
set -e
export USERNAME=`whoami`
export DEVELOPMENT_SKIP_GETTING_ASSET=true
npm i
npm run build --if-present
npm test
npm run save-to-github
================================================
FILE: .github/actions/linux-node-20/Dockerfile
================================================
FROM node:20-bullseye
RUN apt-get update && apt-get install -y --no-install-recommends python3 make gcc g++
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
================================================
FILE: .github/actions/linux-node-20/action.yml
================================================
name: 'Create a binary artifact for Node 20 on Debian Bullseye Linux'
description: 'Create a binary artifact for Node 20 on Debian Bullseye Linux'
inputs:
  node-version:
    description: 'Node.js version'
    required: false
    default: '20'
runs:
  using: 'docker'
  image: 'Dockerfile'
  args:
    - ${{inputs.node-version}}
================================================
FILE: .github/actions/linux-node-20/entrypoint.sh
================================================
#!/bin/sh
set -e
export USERNAME=`whoami`
export DEVELOPMENT_SKIP_GETTING_ASSET=true
npm i
npm run build --if-present
npm test
npm run save-to-github
================================================
FILE: .github/actions/linux-node-22/Dockerfile
================================================
FROM node:22-bullseye
RUN apt-get update && apt-get install -y --no-install-recommends python3 make gcc g++
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
================================================
FILE: .github/actions/linux-node-22/action.yml
================================================
name: 'Create a binary artifact for Node 22 on Debian Bullseye Linux'
description: 'Create a binary artifact for Node 22 on Debian Bullseye Linux'
inputs:
  node-version:
    description: 'Node.js version'
    required: false
    default: '22'
runs:
  using: 'docker'
  image: 'Dockerfile'
  args:
    - ${{inputs.node-version}}
================================================
FILE: .github/actions/linux-node-22/entrypoint.sh
================================================
#!/bin/sh
set -e
export USERNAME=`whoami`
export DEVELOPMENT_SKIP_GETTING_ASSET=true
npm i
npm run build --if-present
npm test
npm run save-to-github
================================================
FILE: .github/actions/linux-node-24/Dockerfile
================================================
FROM node:24-bullseye
RUN apt-get update && apt-get install -y --no-install-recommends python3 make gcc g++
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
================================================
FILE: .github/actions/linux-node-24/action.yml
================================================
name: 'Create a binary artifact for Node 24 on Debian Bullseye Linux'
description: 'Create a binary artifact for Node 24 on Debian Bullseye Linux'
inputs:
  node-version:
    description: 'Node.js version'
    required: false
    default: '24'
runs:
  using: 'docker'
  image: 'Dockerfile'
  args:
    - ${{inputs.node-version}}
================================================
FILE: .github/actions/linux-node-24/entrypoint.sh
================================================
#!/bin/sh
set -e
export USERNAME=`whoami`
export DEVELOPMENT_SKIP_GETTING_ASSET=true
npm i
npm run build --if-present
npm test
npm run save-to-github
================================================
FILE: .github/actions/linux-node-25/Dockerfile
================================================
FROM node:25-trixie
RUN apt-get update && apt-get install -y --no-install-recommends python3 make gcc g++
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
================================================
FILE: .github/actions/linux-node-25/action.yml
================================================
name: 'Create a binary artifact for Node 25 on Debian Trixie Linux'
description: 'Create a binary artifact for Node 25 on Debian Trixie Linux'
inputs:
  node-version:
    description: 'Node.js version'
    required: false
    default: '25'
runs:
  using: 'docker'
  image: 'Dockerfile'
  args:
    - ${{inputs.node-version}}
================================================
FILE: .github/actions/linux-node-25/entrypoint.sh
================================================
#!/bin/sh
set -e
export USERNAME=`whoami`
export DEVELOPMENT_SKIP_GETTING_ASSET=true
npm i
npm run build --if-present
npm test
npm run save-to-github
================================================
FILE: .github/dependabot.yml
================================================
# To get started with Dependabot version updates, you'll need to specify which
# package ecosystems to update and where the package manifests are located.
# Please see the documentation for all configuration options:
# https://help.github.com/github/administering-a-repository/configuration-options-for-dependency-updates
version: 2
updates:
  - package-ecosystem: "npm" # See documentation for possible values
    directory: "/" # Location of package manifests
    schedule:
      interval: "weekly"
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
================================================
FILE: .github/workflows/build.yml
================================================
name: Node.js builds
on:
  push:
    tags:
      - v?[0-9]+.[0-9]+.[0-9]+.[0-9]+
      - v?[0-9]+.[0-9]+.[0-9]+
      - v?[0-9]+.[0-9]+
permissions:
  id-token: write
  contents: write
  attestations: write
jobs:
  create-release:
    name: Create release
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - env:
          GH_TOKEN: ${{github.token}}
        run: |
          REF=${{github.ref}}
          TAG=${REF#"refs/tags/"}
          gh release create -t "Release ${TAG}" -n "" "${{github.ref}}"
  build:
    name: Node.js ${{matrix.node-version}} on ${{matrix.os}}
    needs: create-release
    runs-on: ${{matrix.os}}
    strategy:
      matrix:
        os: [macos-latest, windows-latest, macos-15-intel, windows-11-arm]
        node-version: [20, 22, 24, 25]
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Setup Node.js ${{matrix.node-version}}
        uses: actions/setup-node@v6
        with:
          node-version: ${{matrix.node-version}}
      - name: Install the package and run tests
        env:
          DEVELOPMENT_SKIP_GETTING_ASSET: true
        run: |
          npm i
          npm run build --if-present
          npm test
      - name: Save to GitHub
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
        run: npm run save-to-github
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
  build-linux-node-20:
    name: Node.js 20 on Bullseye
    needs: create-release
    runs-on: ubuntu-latest
    continue-on-error: true
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Install, test, and create artifact
        uses: ./.github/actions/linux-node-20/
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
  build-linux-node-22:
    name: Node.js 22 on Bullseye
    needs: create-release
    runs-on: ubuntu-latest
    continue-on-error: true
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Install, test, and create artifact
        uses: ./.github/actions/linux-node-22/
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
  build-linux-alpine-node-20:
    name: Node.js 20 on Alpine
    needs: create-release
    runs-on: ubuntu-latest
    continue-on-error: true
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Install, test, and create artifact
        uses: ./.github/actions/linux-alpine-node-20/
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
  build-linux-alpine-node-22:
    name: Node.js 22 on Alpine
    needs: create-release
    runs-on: ubuntu-latest
    continue-on-error: true
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Install, test, and create artifact
        uses: ./.github/actions/linux-alpine-node-22/
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
  build-linux-arm64-node-20:
    name: Node.js 20 on Bullseye ARM64
    needs: create-release
    runs-on: ubuntu-24.04-arm
    continue-on-error: true
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Install, test, and create artifact
        uses: ./.github/actions/linux-node-20/
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
  build-linux-arm64-node-22:
    name: Node.js 22 on Bullseye ARM64
    needs: create-release
    runs-on: ubuntu-24.04-arm
    continue-on-error: true
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Install, test, and create artifact
        uses: ./.github/actions/linux-node-22/
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
  build-linux-arm64-alpine-node-20:
    name: Node.js 20 on Alpine ARM64
    needs: create-release
    runs-on: ubuntu-24.04-arm
    continue-on-error: true
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Install, test, and create artifact
        uses: ./.github/actions/linux-alpine-node-20/
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
  build-linux-arm64-alpine-node-22:
    name: Node.js 22 on Alpine ARM64
    needs: create-release
    runs-on: ubuntu-24.04-arm
    continue-on-error: true
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Install, test, and create artifact
        uses: ./.github/actions/linux-alpine-node-22/
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
  build-linux-node-24:
    name: Node.js 24 on Bullseye
    needs: create-release
    runs-on: ubuntu-latest
    continue-on-error: true
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Install, test, and create artifact
        uses: ./.github/actions/linux-node-24/
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
  build-linux-alpine-node-24:
    name: Node.js 24 on Alpine
    needs: create-release
    runs-on: ubuntu-latest
    continue-on-error: true
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Install, test, and create artifact
        uses: ./.github/actions/linux-alpine-node-24/
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
  build-linux-arm64-node-24:
    name: Node.js 24 on Bullseye ARM64
    needs: create-release
    runs-on: ubuntu-24.04-arm
    continue-on-error: true
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Install, test, and create artifact
        uses: ./.github/actions/linux-node-24/
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
  build-linux-arm64-alpine-node-24:
    name: Node.js 24 on Alpine ARM64
    needs: create-release
    runs-on: ubuntu-24.04-arm
    continue-on-error: true
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Install, test, and create artifact
        uses: ./.github/actions/linux-alpine-node-24/
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
  build-linux-node-25:
    name: Node.js 25 on Trixie
    needs: create-release
    runs-on: ubuntu-latest
    continue-on-error: true
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Install, test, and create artifact
        uses: ./.github/actions/linux-node-25/
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
  build-linux-alpine-node-25:
    name: Node.js 25 on Alpine
    needs: create-release
    runs-on: ubuntu-latest
    continue-on-error: true
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Install, test, and create artifact
        uses: ./.github/actions/linux-alpine-node-25/
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
  build-linux-arm64-node-25:
    name: Node.js 25 on Trixie ARM64
    needs: create-release
    runs-on: ubuntu-24.04-arm
    continue-on-error: true
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Install, test, and create artifact
        uses: ./.github/actions/linux-node-25/
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
  build-linux-arm64-alpine-node-25:
    name: Node.js 25 on Alpine ARM64
    needs: create-release
    runs-on: ubuntu-24.04-arm
    continue-on-error: true
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Install, test, and create artifact
        uses: ./.github/actions/linux-alpine-node-25/
        env:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
      - name: Attest
        if: env.CREATED_ASSET_NAME != ''
        uses: actions/attest-build-provenance@v4
        with:
          subject-name: '${{ env.CREATED_ASSET_NAME }}'
          subject-path: '${{ github.workspace }}/build/Release/re2.node'
================================================
FILE: .github/workflows/tests.yml
================================================
name: Node.js CI
on:
  push:
    branches: ['*']
  pull_request:
    branches: [master]
jobs:
  tests:
    name: Node.js ${{matrix.node-version}} on ${{matrix.os}}
    permissions:
      contents: read
    runs-on: ${{matrix.os}}
    strategy:
      matrix:
        os: [ubuntu-latest, macOS-latest, windows-latest]
        node-version: [20, 22, 24, 25]
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: true
      - name: Setup Node.js ${{matrix.node-version}}
        uses: actions/setup-node@v6
        with:
          node-version: ${{matrix.node-version}}
      - name: Install the package and run tests
        env:
          DEVELOPMENT_SKIP_GETTING_ASSET: true
        run: |
          npm i
          npm run build --if-present
          npm test
================================================
FILE: .gitignore
================================================
node_modules/
build/
report/
coverage/
.AppleDouble
/.development
/.developmentx
/.xdevelopment
/scripts/save-local.sh
================================================
FILE: .gitmodules
================================================
[submodule "vendor/re2"]
path = vendor/re2
url = https://github.com/google/re2
[submodule "vendor/abseil-cpp"]
path = vendor/abseil-cpp
url = https://github.com/abseil/abseil-cpp
[submodule "wiki"]
path = wiki
url = git@github.com:uhop/node-re2.wiki.git
================================================
FILE: .prettierignore
================================================
/.windsurf/workflows
================================================
FILE: .prettierrc
================================================
{
"printWidth": 80,
"singleQuote": true,
"bracketSpacing": false,
"arrowParens": "avoid",
"trailingComma": "none"
}
================================================
FILE: .vscode/c_cpp_properties.json
================================================
{
"configurations": [
{
"name": "Mac",
"includePath": [
"${workspaceFolder}/**",
"/${env.NVM_INC}/**"
],
"defines": [],
"macFrameworkPath": [
"/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/System/Library/Frameworks"
],
"compilerPath": "/usr/bin/clang",
"cStandard": "c17",
"cppStandard": "c++17",
"intelliSenseMode": "macos-clang-arm64"
}
],
"version": 4
}
================================================
FILE: .vscode/launch.json
================================================
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"type": "lldb",
"request": "launch",
"name": "Debug tests",
"preLaunchTask": "npm: build:dev",
"program": "${env:NVM_BIN}/node",
"args": ["${workspaceFolder}/tests/tests.js"],
"cwd": "${workspaceFolder}"
}
]
}
================================================
FILE: .vscode/settings.json
================================================
{
"cSpell.words": [
"heya",
"PCRE",
"replacee",
"Submatch"
]
}
================================================
FILE: .vscode/tasks.json
================================================
{
"version": "2.0.0",
"tasks": [
{
"type": "npm",
"script": "build:dev",
"group": "build",
"problemMatcher": [],
"label": "npm: build:dev",
"detail": "node-gyp -j max build --debug"
}
]
}
================================================
FILE: .windsurf/skills/docs-review/SKILL.md
================================================
---
name: docs-review
description: Review and improve English in documentation files for brevity and clarity. Use when asked to review docs, improve documentation writing, or edit prose for clarity.
---
# Review Documentation for node-re2
Review and improve English in documentation files for brevity, clarity, and correctness.
## Steps
1. Read the target documentation file(s).
2. Check for:
- Grammatical errors and awkward phrasing.
- Verbose or redundant sentences — prefer concise, direct language.
- Consistency with existing project terminology (RE2, RegExp, Buffer, nan, node-gyp, etc.).
- Correct code examples that match the current API.
- Accurate links (wiki, npm, GitHub).
3. Make edits directly in the file:
- Preserve the existing structure and headings.
- Do not add or remove comments in code examples unless explicitly asked.
- Keep technical accuracy — do not change meaning.
4. If reviewing `README.md`, cross-check API descriptions against `re2.d.ts`.
5. If reviewing `llms.txt` or `llms-full.txt`, ensure examples are runnable and API signatures match `re2.d.ts`.
6. Report a summary of changes made.
## Style guidelines
- Use active voice.
- Prefer short sentences.
- Use "RE2" (not "re2" or "Re2") when referring to the engine or the JS object.
- Use backticks for code references: `RE2`, `Buffer`, `exec()`, etc.
- Use "e.g." and "i.e." sparingly — prefer "for example" and "that is" in longer prose.
- American English spelling.
================================================
FILE: .windsurf/skills/write-tests/SKILL.md
================================================
---
name: write-tests
description: Write or update tape-six tests for a module or feature. Use when asked to write tests, add test coverage, or create typing tests for node-re2.
---
# Write Tests for node-re2
Write or update tests using the tape-six testing library.
## Steps
1. Read `node_modules/tape-six/TESTING.md` for the full tape-six API reference (assertions, hooks, patterns, configuration).
2. Identify the module or feature to test. Read its source code to understand the public API.
3. Check existing tests in `tests/` for node-re2 conventions and patterns.
4. Create or update the test file in `tests/`:
- For runtime tests use `.mjs`.
- Import RE2 with: `import {RE2} from '../re2.js';`
- Import tape-six with: `import test from 'tape-six';`
- Test with both **string** and **Buffer** inputs — Buffer support is a first-class feature.
- Test edge cases: empty strings, no match, global flag behavior, lastIndex, Unicode input.
5. For TypeScript typing tests, update `ts-tests/test-types.ts`:
- Verify typed usage patterns compile correctly.
// turbo
6. Run the new test file directly to verify: `node tests/test-<name>.mjs`
// turbo
7. Run the full test suite to check for regressions: `npm test`
- If debugging, use `npm run test:seq` (runs sequentially, easier to trace issues).
8. Report results and any failures.
## node-re2 test conventions
- Test file naming: `test-*.mjs` in `tests/`.
- TypeScript typing tests: `test-*.ts` in `ts-tests/`.
- Runtime tests (`.mjs`): ESM imports, `import test from 'tape-six'`.
- Tests are configured in `package.json` under the `"tape6"` section.
- Test files should be directly executable: `node tests/test-foo.mjs`.
- Existing tests use synchronous `t => { ... }` style (not async/promise-based).
- Always test both string and Buffer variants of methods.
- Use `t.ok()`, `t.equal()`, `t.deepEqual()`, `t.fail()` for assertions.
- Use try/catch blocks to test error conditions (e.g., invalid patterns throwing `SyntaxError`).
================================================
FILE: .windsurf/workflows/add-module.md
================================================
---
description: Checklist for adding a new C++ method or JS feature to node-re2
---
# Add a New Module
Follow these steps when adding a new method, feature, or C++ implementation.
## New C++ method (e.g., `lib/foo.cc`)
1. Create `lib/foo.cc` with the implementation.
- Use nan for the Node.js addon API.
- Follow existing patterns in `lib/exec.cc` or `lib/test.cc`.
- Tabs for indentation, 4-wide.
- Include `lib/wrapped_re2.h` and `lib/util.h` as needed.
2. Register the method in `lib/addon.cc`:
- Add `Nan::SetPrototypeMethod(tpl, "foo", Foo);` or equivalent.
3. Add the method to `lib/wrapped_re2.h` if it needs a static declaration.
4. Add the source file to `binding.gyp` in the `"sources"` array.
// turbo
5. Rebuild the addon: `npm run rebuild`
6. Update `re2.js` if JS-side setup is needed (e.g., Symbol aliases).
7. Update `re2.d.ts` with TypeScript declarations for the new method.
- Keep `re2.js` and `re2.d.ts` in sync.
8. Create `tests/test-foo.mjs` with automated tests (tape-six, ESM):
- `import {RE2} from '../re2.js';`
- Test with strings and Buffers.
- Test edge cases (empty input, no match, global flag, etc.).
// turbo
9. Run the new test: `node tests/test-foo.mjs`
10. Update TypeScript tests in `ts-tests/test-types.ts` if the public API changed.
11. Update `README.md` with documentation for the new feature.
12. Update `ARCHITECTURE.md` — add to project layout and C++ addon table.
13. Update `llms.txt` and `llms-full.txt` with a description and examples.
14. Update `AGENTS.md` if the architecture quick reference needs updating.
// turbo
15. Verify: `npm test`
// turbo
16. Verify: `npm run ts-check`
// turbo
17. Verify: `npm run lint`
## JS-only feature (e.g., new Symbol alias, helper)
1. Add the implementation to `re2.js`.
2. Update `re2.d.ts` with TypeScript declarations.
3. Create or update tests in `tests/`.
// turbo
4. Run the new test: `node tests/test-<name>.mjs`
5. Update `README.md`, `llms.txt`, `llms-full.txt`.
6. Update `AGENTS.md` and `ARCHITECTURE.md` if needed.
// turbo
7. Verify: `npm test`
// turbo
8. Verify: `npm run ts-check`
// turbo
9. Verify: `npm run lint`
================================================
FILE: .windsurf/workflows/ai-docs-update.md
================================================
---
description: Update AI-facing documentation files after API or architecture changes
---
# AI Documentation Update
Update all AI-facing files after changes to the public API, modules, or project structure.
## Steps
1. Read `re2.js` and `re2.d.ts` to identify the current public API.
2. Read `AGENTS.md` and `ARCHITECTURE.md` for current state.
3. Identify what changed (new methods, new flags, new C++ files, renamed exports, removed features, etc.).
4. Update `llms.txt`:
- Ensure the API section matches `re2.d.ts`.
- Update common patterns if new features were added.
- Keep it concise — this is for quick LLM consumption.
5. Update `llms-full.txt`:
- Full API reference with all methods, options, and examples.
- Include any new features, RE2.Set changes, or Buffer behavior.
6. Update `ARCHITECTURE.md` if project structure or module dependencies changed.
7. Update `AGENTS.md` if critical rules, commands, or architecture quick reference changed.
8. Sync `.windsurfrules`, `.cursorrules`, `.clinerules` if `AGENTS.md` changed:
- These three files should be identical copies of the condensed rules.
9. Update `README.md` if the public-facing docs need to reflect new features.
10. Track progress with the todo list and provide a summary when done.
================================================
FILE: .windsurf/workflows/release-check.md
================================================
---
description: Pre-release verification checklist for node-re2
---
# Release Check
Run through this checklist before publishing a new version.
## Steps
1. Check that `re2.js` and `re2.d.ts` are in sync (all exports, all types).
2. Check that `ARCHITECTURE.md` reflects any structural changes.
3. Check that `AGENTS.md` is up to date with any rule or workflow changes.
4. Check that `.windsurfrules`, `.clinerules`, `.cursorrules` are in sync with `AGENTS.md`.
5. Check that `llms.txt` and `llms-full.txt` are up to date with any API changes.
6. Verify `package.json`:
- `files` array includes all necessary entries (`binding.gyp`, `lib`, `re2.d.ts`, `scripts/*.js`, `vendor`).
- `main` points to `re2.js`.
- `types` points to `re2.d.ts`.
7. Check that the copyright year in `LICENSE` includes the current year.
8. Bump `version` in `package.json`.
9. Update release history in `README.md`.
10. Run `npm install` to regenerate `package-lock.json`.
// turbo
11. Rebuild the native addon: `npm run rebuild`
// turbo
12. Run the quick smoke test: `npm run verify-build`
// turbo
13. Run the full test suite: `npm test`
// turbo
14. Run TypeScript check: `npm run ts-check`
// turbo
15. Run lint: `npm run lint`
// turbo
16. Dry-run publish to verify package contents: `npm pack --dry-run`
================================================
FILE: .windsurfrules
================================================
<!-- Canonical source: AGENTS.md — keep this file in sync -->
# node-re2 — AI Agent Rules
## Project identity
node-re2 provides Node.js bindings for RE2: a fast, safe alternative to backtracking regular expression engines. The npm package name is `re2`. It is a C++ native addon built with `node-gyp` and `nan`.
## Critical rules
- **CommonJS.** The project is `"type": "commonjs"`. Use `require()` in source, `import` in tests (`.mjs`).
- **No transpilation.** JavaScript code runs directly.
- **Do not modify vendored code.** Never edit files under `vendor/`. They are git submodules.
- **Do not modify or delete test expectations** without understanding why they changed.
- **Do not add or remove comments** unless explicitly asked.
- **Keep `re2.js` and `re2.d.ts` in sync.** All public API exposed from `re2.js` must be typed in `re2.d.ts`.
- **The addon must build on all supported platforms:** Linux (x64, arm64, Alpine), macOS (x64, arm64), Windows (x64, arm64).
- **RE2 is always Unicode-mode.** The `u` flag is always added implicitly.
- **Buffer support is a first-class feature.** All methods that accept strings must also accept Buffers, returning Buffers when given Buffer input.
## Code style
- C++ code: tabs, 4-wide indentation. JavaScript: 2-space indentation.
- Prettier: 80 char width, single quotes, no bracket spacing, no trailing commas, arrow parens "avoid" (see `.prettierrc`).
- nan (Native Abstractions for Node.js) for the C++ addon API.
- Semicolons are enforced by Prettier (default `semi: true`).
## Architecture quick reference
- `re2.js` is the main entry point. Loads `build/Release/re2.node`, sets up Symbol aliases (`Symbol.match`, `Symbol.search`, `Symbol.replace`, `Symbol.split`, `Symbol.matchAll`).
- C++ addon (`lib/*.cc`) wraps Google's RE2 via nan. Each RegExp method has its own `.cc` file.
- `lib/new.cc` handles construction: parse pattern/flags, translate RegExp → RE2 syntax (via `lib/pattern.cc`).
- `lib/pattern.cc` translates Unicode class names (`\p{Letter}` → `\p{L}`, `\p{Script=Latin}` → `\p{Latin}`).
- `lib/set.cc` implements `RE2.Set` for multi-pattern matching.
- `lib/util.cc` provides UTF-8 ↔ UTF-16 conversion and buffer helpers.
- Prebuilt artifacts downloaded at install time via `install-artifact-from-github`.
## Verification commands
- `npm test` — run the full test suite (worker threads)
- `node tests/test-<name>.mjs` — run a single test file directly
- `npm run test:seq` — run sequentially
- `npm run test:proc` — run multi-process
- `npm run ts-check` — TypeScript type checking
- `npm run lint` — Prettier check
- `npm run lint:fix` — Prettier write
- `npm run verify-build` — quick smoke test
- `npm run rebuild` — rebuild the native addon (release)
- `npm run rebuild:dev` — rebuild the native addon (debug)
## File layout
- Entry point: `re2.js` + `re2.d.ts`
- C++ addon: `lib/*.cc`, `lib/*.h`
- Build config: `binding.gyp`
- Tests: `tests/test-*.mjs`
- TypeScript tests: `ts-tests/test-*.ts`
- Benchmarks: `bench/`
- Vendored deps: `vendor/re2/`, `vendor/abseil-cpp/` (git submodules)
- CI: `.github/workflows/`, `.github/actions/`
## When reading the codebase
- Start with `ARCHITECTURE.md` for the module map and dependency graph.
- `re2.d.ts` is the best API reference for the public API. It includes `internalSource` and Buffer overloads.
- `re2.js` is tiny — read it first for the JS-side setup.
- `lib/addon.cc` shows how all C++ methods are registered.
- `lib/wrapped_re2.h` defines the core C++ class.
================================================
FILE: AGENTS.md
================================================
# AGENTS.md — node-re2
> `node-re2` provides Node.js bindings for [RE2](https://github.com/google/re2): a fast, safe alternative to backtracking regular expression engines. The npm package name is `re2`. It is a C++ native addon built with `node-gyp` and `nan`.
For project structure, module dependencies, and the architecture overview see [ARCHITECTURE.md](./ARCHITECTURE.md).
For detailed usage docs see the [README](./README.md) and the [wiki](https://github.com/uhop/node-re2/wiki).
## Setup
This project uses git submodules for vendored dependencies (RE2 and Abseil):
```bash
git clone --recursive git@github.com:uhop/node-re2.git
cd node-re2
npm install
```
If `npm install` cannot download a prebuilt artifact, it builds the addon locally via `node-gyp`.
## Commands
- **Install:** `npm install` (downloads prebuilt artifact or builds from source)
- **Build (release):** `npm run rebuild` (or `node-gyp -j max rebuild`)
- **Build (debug):** `npm run rebuild:dev` (or `node-gyp -j max rebuild --debug`)
- **Test:** `npm test` (runs `tape6 --flags FO`, worker threads)
- **Test (sequential):** `npm run test:seq`
- **Test (multi-process):** `npm run test:proc`
- **Test (single file):** `node tests/test-<name>.mjs`
- **TypeScript check:** `npm run ts-check`
- **Lint:** `npm run lint` (Prettier check)
- **Lint fix:** `npm run lint:fix` (Prettier write)
- **Verify build:** `npm run verify-build`
## Project structure
```
node-re2/
├── package.json # Package config; "tape6" section configures test discovery
├── binding.gyp # node-gyp build configuration for the C++ addon
├── re2.js # Main entry point: loads native addon, sets up Symbol aliases
├── re2.d.ts # TypeScript declarations for the public API
├── tsconfig.json # TypeScript config (noEmit, strict, types: ["node"])
├── lib/ # C++ source code (native addon)
│ ├── addon.cc # Node.js addon initialization, method registration
│ ├── wrapped_re2.h # WrappedRE2 class definition (core C++ wrapper)
│ ├── wrapped_re2_set.h # WrappedRE2Set class definition (RE2.Set wrapper)
│ ├── isolate_data.h # Per-isolate data struct for thread-safe addon state
│ ├── new.cc # Constructor: parse pattern/flags, create RE2 instance
│ ├── exec.cc # RE2.prototype.exec() implementation
│ ├── test.cc # RE2.prototype.test() implementation
│ ├── match.cc # RE2.prototype.match() implementation
│ ├── replace.cc # RE2.prototype.replace() implementation
│ ├── search.cc # RE2.prototype.search() implementation
│ ├── split.cc # RE2.prototype.split() implementation
│ ├── to_string.cc # RE2.prototype.toString() implementation
│ ├── accessors.cc # Property accessors (source, flags, lastIndex, etc.)
│ ├── pattern.cc # Pattern translation (RegExp → RE2 syntax, Unicode classes)
│ ├── set.cc # RE2.Set implementation (multi-pattern matching)
│ ├── util.cc # Shared utilities (UTF-8/UTF-16 conversion, buffer helpers)
│ ├── util.h # Utility declarations
│ └── pattern.h # Pattern translation declarations
├── scripts/
│ └── verify-build.js # Quick smoke test for the built addon
├── tests/ # Test files (test-*.mjs using tape-six)
├── ts-tests/ # TypeScript type-checking tests
│ └── test-types.ts # Verifies type declarations compile correctly
├── bench/ # Benchmarks
├── vendor/ # Vendored C++ dependencies (git submodules)
│ ├── re2/ # Google RE2 library source
│ └── abseil-cpp/ # Abseil C++ library (RE2 dependency)
└── .github/ # CI workflows, Dependabot config, actions
```
## Code style
- **CommonJS** throughout (`"type": "commonjs"` in package.json).
- **No transpilation** — JavaScript code runs directly.
- **C++ code** uses tabs for indentation, 4-wide. JavaScript uses 2-space indentation.
- **Prettier** for JS/TS formatting (see `.prettierrc`): 80 char width, single quotes, no bracket spacing, no trailing commas, arrow parens "avoid".
- **nan** (Native Abstractions for Node.js) for the C++ addon API.
- Semicolons are enforced by Prettier (default `semi: true`).
- Imports use `require()` syntax in source, `import` in tests (`.mjs`).
## Critical rules
- **Do not modify vendored code.** Never edit files under `vendor/`. They are git submodules.
- **Do not modify or delete test expectations** without understanding why they changed.
- **Do not add or remove comments** unless explicitly asked.
- **Keep `re2.js` and `re2.d.ts` in sync.** All public API exposed from `re2.js` must be typed in `re2.d.ts`.
- **The addon must build on all supported platforms:** Linux (x64, arm64, Alpine), macOS (x64, arm64), Windows (x64, arm64).
- **RE2 is always Unicode-mode.** The `u` flag is always added implicitly.
- **Buffer support is a first-class feature.** All methods that accept strings must also accept Buffers, returning Buffers when given Buffer input.
## Architecture
- `re2.js` is the main entry point. It loads the native C++ addon from `build/Release/re2.node` and sets up `Symbol.match`, `Symbol.search`, `Symbol.replace`, `Symbol.split`, and `Symbol.matchAll` on the prototype.
- The C++ addon (`lib/*.cc`) wraps Google's RE2 library via nan. Each RegExp method has its own `.cc` file.
- `lib/new.cc` handles construction: parsing patterns, translating RegExp syntax to RE2 syntax (via `lib/pattern.cc`), and creating the underlying `re2::RE2` instance.
- `lib/pattern.cc` translates JavaScript RegExp features to RE2 equivalents, including Unicode class names (`\p{Letter}` → `\p{L}`, `\p{Script=Latin}` → `\p{Latin}`).
- `lib/set.cc` implements `RE2.Set` for multi-pattern matching using `re2::RE2::Set`.
- `lib/util.cc` provides UTF-8 ↔ UTF-16 conversion helpers and buffer utilities.
- Prebuilt native artifacts are hosted on GitHub Releases and downloaded at install time via `install-artifact-from-github`.
## Writing tests
```js
import test from 'tape-six';
import {RE2} from '../re2.js';
test('example', t => {
const re = new RE2('a(b*)', 'i');
const result = re.exec('aBbC');
t.ok(result);
t.equal(result[0], 'aBb');
t.equal(result[1], 'Bb');
});
```
- Test files use `tape-six`: `.mjs` for runtime tests, `.ts` for TypeScript typing tests.
- Test file naming convention: `test-*.mjs` in `tests/`, `test-*.ts` in `ts-tests/`.
- Tests are configured in `package.json` under the `"tape6"` section.
- Test files should be directly executable: `node tests/test-foo.mjs`.
## Key conventions
- The library is a drop-in replacement for `RegExp` — the `RE2` object emulates the standard `RegExp` API.
- `RE2.Set` provides multi-pattern matching: `new RE2.Set(patterns, flags, options)`.
- Static helpers: `RE2.getUtf8Length(str)`, `RE2.getUtf16Length(buf)`.
- `RE2.unicodeWarningLevel` controls behavior when non-Unicode regexps are created.
- The `install` script tries to download a prebuilt `.node` artifact before falling back to `node-gyp rebuild`.
- All C++ source is in `lib/`, all vendored third-party C++ is in `vendor/`.
================================================
FILE: ARCHITECTURE.md
================================================
# Architecture
`node-re2` provides Node.js bindings for Google's [RE2](https://github.com/google/re2) regular expression engine. It is a C++ native addon built with `node-gyp` and `nan`. The `RE2` object is a drop-in replacement for `RegExp` with guaranteed linear-time matching (no ReDoS).
## Project layout
```
package.json # Package config; "tape6" section configures test discovery
binding.gyp # node-gyp build configuration for the C++ addon
re2.js # Main entry point: loads native addon, sets up Symbol aliases
re2.d.ts # TypeScript declarations for the public API
tsconfig.json # TypeScript config (noEmit, strict, types: ["node"])
lib/ # C++ source code (native addon)
├── addon.cc # Node.js addon initialization, method registration
├── wrapped_re2.h # WrappedRE2 class definition (core C++ wrapper)
├── wrapped_re2_set.h # WrappedRE2Set class definition (RE2.Set wrapper)
├── isolate_data.h # Per-isolate data struct for thread-safe addon state
├── new.cc # Constructor: parse pattern/flags, create RE2 instance
├── exec.cc # RE2.prototype.exec() implementation
├── test.cc # RE2.prototype.test() implementation
├── match.cc # RE2.prototype.match() implementation
├── replace.cc # RE2.prototype.replace() implementation
├── search.cc # RE2.prototype.search() implementation
├── split.cc # RE2.prototype.split() implementation
├── to_string.cc # RE2.prototype.toString() implementation
├── accessors.cc # Property accessors (source, flags, lastIndex, etc.)
├── pattern.cc # Pattern translation (RegExp → RE2 syntax, Unicode classes)
├── pattern.h # Pattern translation declarations
├── set.cc # RE2.Set implementation (multi-pattern matching)
├── util.cc # Shared utilities (UTF-8/UTF-16 conversion, buffer helpers)
└── util.h # Utility declarations
scripts/
└── verify-build.js # Quick smoke test for the built addon
tests/ # Test files (test-*.mjs using tape-six)
ts-tests/ # TypeScript type-checking tests
└── test-types.ts # Verifies type declarations compile correctly
bench/ # Benchmarks
vendor/ # Vendored C++ dependencies (git submodules) — DO NOT MODIFY
├── re2/ # Google RE2 library source
└── abseil-cpp/ # Abseil C++ library (RE2 dependency)
.github/ # CI workflows, Dependabot config, actions
```
## Core concepts
### How the addon works
1. `re2.js` is the entry point. It loads the compiled C++ addon from `build/Release/re2.node`.
2. The addon exposes an `RE2` constructor that wraps `re2::RE2` from Google's RE2 library.
3. `re2.js` adds `Symbol.match`, `Symbol.search`, `Symbol.replace`, `Symbol.split`, and `Symbol.matchAll` to the prototype so `RE2` instances work with ES6 string methods.
4. The `RE2` constructor can be called with or without `new` (factory mode).
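As a rough illustration of steps 3 and 4, the sketch below uses a hypothetical `Finder` constructor (not part of node-re2) to show how aliasing a method under a well-known symbol lets an object be passed to `String.prototype` methods, and how a constructor can support factory mode. The actual setup lives in `re2.js`; this stand-in only relies on plain JavaScript.

```js
// Hypothetical stand-in for illustration; not the RE2 implementation.
function Finder(word) {
  // Factory mode: allow calling without `new`.
  if (!(this instanceof Finder)) return new Finder(word);
  this.word = word;
}

// An ordinary method that receives the string to search.
Finder.prototype.search = function (str) {
  return str.indexOf(this.word);
};

// Alias it under the well-known symbol so that str.search(finder) works.
Finder.prototype[Symbol.search] = Finder.prototype.search;

const finder = Finder('world');
console.log(finder.search('hello world')); // 6
console.log('hello world'.search(finder)); // 6
```

`String.prototype.search` looks up `Symbol.search` on its argument and calls it with the string, which is why the alias is enough to make both call styles agree.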
### C++ addon structure
Each RegExp method has its own `.cc` file for maintainability:
| File | Purpose |
| --------------- | ---------------------------------------------------------------- |
| `addon.cc` | Node.js module initialization, registers all methods/accessors |
| `isolate_data.h` | Per-isolate data struct (`AddonData`) for thread-safe addon state |
| `wrapped_re2.h` | `WrappedRE2` class: holds `re2::RE2*`, flags, lastIndex, source |
| `new.cc` | Constructor: parses pattern + flags, translates syntax, creates RE2 instance |
| `exec.cc` | `exec()` — find match with capture groups |
| `test.cc` | `test()` — boolean match check |
| `match.cc` | `match()` — String.prototype.match equivalent |
| `replace.cc` | `replace()` — substitution with string or function replacer |
| `search.cc` | `search()` — find index of first match |
| `split.cc` | `split()` — split string by pattern |
| `to_string.cc` | `toString()` — `/pattern/flags` representation |
| `accessors.cc` | Property getters: `source`, `flags`, `lastIndex`, `global`, `ignoreCase`, `multiline`, `dotAll`, `unicode`, `sticky`, `hasIndices`, `internalSource` |
| `pattern.cc` | Translates JS RegExp syntax to RE2 syntax, maps Unicode property names |
| `set.cc` | `RE2.Set` — multi-pattern matching via `re2::RE2::Set` |
| `util.cc` | UTF-8 ↔ UTF-16 conversion, buffer/string helpers |
### Pattern translation (pattern.cc)
JavaScript RegExp features are translated to RE2 equivalents:
- Named groups: `(?<name>...)` syntax is preserved (RE2 supports it natively).
- Unicode classes: long names like `\p{Letter}` are mapped to short names `\p{L}`. Script names like `\p{Script=Latin}` are mapped to `\p{Latin}`.
- Backreferences and lookahead assertions are **not supported** — RE2 throws `SyntaxError`.
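To make the Unicode class mapping concrete, here is a minimal JavaScript sketch of the idea. The real translation is implemented in C++ in `lib/pattern.cc`; the function name `translateUnicodeClasses` and the one-entry table are assumptions made for illustration, and the sketch ignores details such as escaped backslashes.

```js
// Illustrative only: the real translation lives in lib/pattern.cc (C++).
// This table holds just the long name mentioned above; it is not the full list.
const LONG_TO_SHORT = new Map([['Letter', 'L']]);

function translateUnicodeClasses(pattern) {
  // Match \p{...} or \P{...} and rewrite the name inside the braces.
  return pattern.replace(/\\([pP])\{([^}]+)\}/g, (whole, kind, name) => {
    if (name.startsWith('Script=')) {
      // \p{Script=Latin} becomes \p{Latin}.
      return '\\' + kind + '{' + name.slice('Script='.length) + '}';
    }
    const short = LONG_TO_SHORT.get(name);
    // Unknown names are passed through unchanged.
    return short ? '\\' + kind + '{' + short + '}' : whole;
  });
}

console.log(translateUnicodeClasses('\\p{Letter}+')); // \p{L}+
console.log(translateUnicodeClasses('\\p{Script=Latin}+')); // \p{Latin}+
```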
### Buffer support
All methods accept both strings and Node.js Buffers:
- Buffer inputs are assumed UTF-8 encoded.
- Buffer inputs produce Buffer outputs (in composite result objects too).
- Offsets and lengths are in bytes (not characters) when using Buffers.
- The `useBuffers` property on replacer functions controls offset reporting in `replace()`.
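The difference between byte offsets and character offsets can be seen with Node.js built-ins alone. The sketch below does not call the RE2 addon; it only shows why the same position is reported differently for a string and for its UTF-8 Buffer.

```js
// Illustration only: uses Node.js built-ins, not the RE2 addon.
const text = 'h\u00e9llo w\u00f6rld'; // "héllo wörld"
const buf = Buffer.from(text, 'utf8');

// Offset in UTF-16 code units (what string-based results use).
const charIndex = text.indexOf('w\u00f6rld');
// Offset in bytes (what Buffer-based results use).
const byteIndex = buf.indexOf(Buffer.from('w\u00f6rld', 'utf8'));

console.log(charIndex, byteIndex); // 6 7
console.log(text.length, buf.length); // 11 13
```

Each of the two accented letters takes two bytes in UTF-8, so byte offsets run ahead of character offsets after the first non-ASCII character.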
### RE2.Set (set.cc)
Multi-pattern matching using `re2::RE2::Set`:
- `new RE2.Set(patterns, flags?, options?)` — compile multiple patterns into a single automaton.
- `set.test(str)` — returns `true` if any pattern matches.
- `set.match(str)` — returns array of indices of matching patterns.
- Properties: `size`, `source`, `sources`, `flags`, `anchor`.
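The semantics of `test()` and `match()` described above can be sketched with a hypothetical `NaiveSet` class built on the standard `RegExp`. This is only a model of the observable behavior: it loops over the patterns one by one, whereas `RE2.Set` compiles them into a single automaton. It also assumes flags without `g` or `y`, because those make `RegExp.prototype.test` stateful.

```js
// Hypothetical stand-in built on RegExp; not the RE2.Set implementation.
class NaiveSet {
  constructor(patterns, flags = '') {
    this.regexps = patterns.map(p => new RegExp(p, flags));
  }
  get size() {
    return this.regexps.length;
  }
  // True if at least one pattern matches.
  test(str) {
    return this.regexps.some(re => re.test(str));
  }
  // Indices of all patterns that match, in pattern order.
  match(str) {
    const indices = [];
    this.regexps.forEach((re, i) => {
      if (re.test(str)) indices.push(i);
    });
    return indices;
  }
}

const set = new NaiveSet(['^error', 'timeout', '\\d+ms'], 'i');
console.log(set.test('Error: timeout after 30ms')); // true
console.log(set.match('Error: timeout after 30ms')); // [ 0, 1, 2 ]
console.log(set.match('all good')); // []
```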
### Build system
- `binding.gyp` defines the node-gyp build: compiles all `.cc` files in `lib/` plus vendored RE2 and Abseil sources.
- Platform-specific compiler flags are set for GCC, Clang, and MSVC.
- The `install` npm script first tries to download a prebuilt `re2.node` from GitHub Releases via `install-artifact-from-github`, falling back to a local `node-gyp rebuild`.
- Prebuilt artifacts cover: Linux (x64, arm64, Alpine/musl), macOS (x64, arm64), Windows (x64, arm64).
## Module dependency graph
```
re2.js ──→ build/Release/re2.node (compiled C++ addon)
│
├── lib/addon.cc (init)
│ ├── lib/new.cc ──→ lib/pattern.cc
│ ├── lib/exec.cc
│ ├── lib/test.cc
│ ├── lib/match.cc
│ ├── lib/replace.cc
│ ├── lib/search.cc
│ ├── lib/split.cc
│ ├── lib/to_string.cc
│ ├── lib/accessors.cc
│ └── lib/set.cc
│
├── lib/wrapped_re2.h (shared class definition)
├── lib/wrapped_re2_set.h (RE2.Set class)
├── lib/util.cc / lib/util.h (shared utilities)
│
└── vendor/ (re2 + abseil-cpp)
```
## Testing
- **Framework**: tape-six (`tape6`)
- **Run all**: `npm test` (worker threads via `tape6 --flags FO`)
- **Run sequential**: `npm run test:seq`
- **Run multi-process**: `npm run test:proc`
- **Run single file**: `node tests/test-<name>.mjs`
- **TypeScript check**: `npm run ts-check`
- **Lint**: `npm run lint` (Prettier check)
- **Lint fix**: `npm run lint:fix` (Prettier write)
- **Verify build**: `npm run verify-build` (quick smoke test)
## Import paths
```js
// CommonJS (source, scripts)
const RE2 = require('re2');
// ESM (tests)
import {RE2} from '../re2.js';
```
================================================
FILE: CLAUDE.md
================================================
<!-- Claude Code project instructions — canonical source is AGENTS.md -->
See [AGENTS.md](./AGENTS.md) for all AI agent rules and project conventions.
================================================
FILE: CONTRIBUTING.md
================================================
# Contributing to node-re2
Thank you for your interest in contributing!
## Getting started
This project uses git submodules for vendored dependencies (RE2 and Abseil). Clone recursively:
```bash
git clone --recursive git@github.com:uhop/node-re2.git
cd node-re2
npm install
```
See [ARCHITECTURE.md](./ARCHITECTURE.md) for the module map and dependency graph.
## Development workflow
1. Make your changes.
2. Rebuild the addon: `npm run rebuild`
3. Lint: `npm run lint:fix`
4. Test: `npm test`
5. Type-check: `npm run ts-check`
## Code style
- CommonJS (`require()`/`module.exports`) in JavaScript source, ESM (`import`) in tests (`.mjs`).
- C++ code uses tabs (4-wide indentation). JavaScript uses 2-space indentation.
- Formatted with Prettier — see `.prettierrc` for settings.
- C++ addon API uses nan (Native Abstractions for Node.js).
- Keep `re2.js` and `re2.d.ts` in sync.
## Important notes
- Never edit files under `vendor/` — they are git submodules.
- RE2 always operates in Unicode mode — the `u` flag is added implicitly.
- Buffer support is a first-class feature — all methods must handle both strings and Buffers.
## AI agents
If you are an AI coding agent, see [AGENTS.md](./AGENTS.md) for detailed project conventions, commands, and architecture.
================================================
FILE: LICENSE
================================================
This library is available under the terms of the modified BSD license. No external contributions
are allowed under licenses which are fundamentally incompatible with the BSD license that this library is distributed under.
The text of the BSD license is reproduced below.
-------------------------------------------------------------------------------
The "New" BSD License:
**********************
Copyright (c) 2005-2026, Eugene Lazutkin
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of Eugene Lazutkin nor the names of other contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
================================================
FILE: README.md
================================================
# node-re2 [![NPM version][npm-img]][npm-url]
[npm-img]: https://img.shields.io/npm/v/re2.svg
[npm-url]: https://npmjs.org/package/re2
This project provides Node.js bindings for [RE2](https://github.com/google/re2):
a fast, safe alternative to backtracking regular expression engines. RE2 was written in C++ by [Russ Cox](http://swtch.com/~rsc/).
To learn more about RE2, start with [Regular Expression Matching in the Wild](http://swtch.com/~rsc/regexp/regexp3.html). More resources are on his [Implementing Regular Expressions](http://swtch.com/~rsc/regexp/) page.
`RE2`'s regular expression language is almost a superset of what `RegExp` provides
(see [Syntax](https://github.com/google/re2/wiki/Syntax)),
but it lacks backreferences and lookahead assertions. See below for details.
`RE2` always works in [Unicode mode](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) — character codes are interpreted as Unicode code points, not as binary values of UTF-16.
See `RE2.unicodeWarningLevel` below for details.
`RE2` emulates standard `RegExp`, making it a practical drop-in replacement in most cases.
It also provides `String`-based regular expression methods. The constructor accepts `RegExp` directly, honoring all properties.
It can work with [Node.js Buffers](https://nodejs.org/api/buffer.html) directly, reducing overhead and making processing of long files fast.
The project is a C++ addon built with [nan](https://github.com/nodejs/nan). It cannot be used in web browsers.
All documentation is in this README and in the [wiki](https://github.com/uhop/node-re2/wiki).
## Why use node-re2?
The built-in Node.js regular expression engine can run in exponential time with a special combination:
- A vulnerable regular expression
- "Evil input"
This can lead to what is known as a [Regular Expression Denial of Service (ReDoS)](https://www.owasp.org/index.php/Regular_expression_Denial_of_Service_-_ReDoS).
To check if your regular expressions are vulnerable, try one of these projects:
- [rxxr2](http://www.cs.bham.ac.uk/~hxt/research/rxxr2/)
- [safe-regex](https://github.com/substack/safe-regex)
Neither project is perfect.
node-re2 protects against ReDoS by evaluating patterns in `RE2` instead of the built-in regex engine.
To run the bundled benchmark (make sure node-re2 is built first):
```bash
npx nano-bench bench/bad-pattern.mjs
```
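The exponential growth can also be observed without node-re2. The following is a minimal sketch that times the built-in engine on the pattern used by `bench/bad-pattern.mjs`; the helper name `timeFailingMatch` and the chosen input lengths are illustrative, and absolute timings depend on the machine and Node.js version:

```js
// Nested quantifiers force the backtracking engine to try many splits.
const vulnerable = /([a-z]+)+$/;

// Time a single failing match; the trailing '!' guarantees failure.
function timeFailingMatch(length) {
  const input = 'a'.repeat(length) + '!';
  const start = process.hrtime.bigint();
  const matched = vulnerable.test(input);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return {length, matched, elapsedMs};
}

// Keep the lengths small: each extra character roughly doubles the work.
const timings = [14, 16, 18, 20].map(timeFailingMatch);
for (const t of timings) {
  console.log(`${t.length} chars: ${t.elapsedMs.toFixed(2)} ms`);
}
```

Raising the lengths by a few characters at a time should make the trend visible; lengths in the 30s can already take a very long time with a backtracking engine.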
## Standard features
`RE2` objects are created just like `RegExp`:
* [`new RE2(pattern[, flags])`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp)
Supported flags: `g` (global), `i` (ignoreCase), `m` (multiline), `s` (dotAll), `u` (unicode, always on), `y` (sticky), `d` (hasIndices).
Supported properties:
* [`re2.lastIndex`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/lastIndex)
* [`re2.global`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/global)
* [`re2.ignoreCase`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/ignoreCase)
* [`re2.multiline`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/multiline)
* [`re2.dotAll`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/dotAll)
* [`re2.unicode`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode) — always `true`; see details below.
* [`re2.sticky`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/sticky)
* [`re2.hasIndices`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/hasIndices)
* [`re2.source`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/source)
* [`re2.flags`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/flags)
Supported methods:
* [`re2.exec(str)`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/exec)
* [`re2.test(str)`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/test)
* [`re2.toString()`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/toString)
Well-known symbol-based methods are supported (see [Symbols](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol)):
* [`re2[Symbol.match](str)`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol/match)
* [`re2[Symbol.matchAll](str)`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol/matchAll)
* [`re2[Symbol.search](str)`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol/search)
* [`re2[Symbol.replace](str, newSubStr|function)`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol/replace)
* [`re2[Symbol.split](str[, limit])`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol/split)
This lets you use `RE2` instances on strings directly, just like `RegExp`:
```js
const re = new RE2('1');
'213'.match(re); // [ '1', index: 1, input: '213' ]
'213'.search(re); // 1
'213'.replace(re, '+'); // '2+3'
'213'.split(re); // [ '2', '3' ]
Array.from('2131'.matchAll(new RE2('1', 'g'))); // matchAll requires the g flag
// [['1', index: 1, input: '2131'], ['1', index: 3, input: '2131']]
```
[Named groups](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Named_capturing_group) are supported.
## Extensions
### Shortcut construction
`RE2` can be created from a regular expression:
```js
const re1 = new RE2(/ab*/ig); // from a RegExp object
const re2 = new RE2(re1); // from another RE2 object
```
### `String` methods
`RE2` provides the standard `String` regex methods with swapped receiver and argument:
* `re2.match(str)`
* See [`str.match(regexp)`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/match)
* `re2.replace(str, newSubStr|function)`
* See [`str.replace(regexp, newSubStr|function)`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace)
* `re2.search(str)`
* See [`str.search(regexp)`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/search)
* `re2.split(str[, limit])`
* See [`str.split(regexp[, limit])`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/split)
These methods are also available as well-known symbol-based methods for transparent use with ES6 string/regex machinery.
### `Buffer` support
Most methods accept Buffers instead of strings for direct UTF-8 processing:
* `re2.exec(buf)`
* `re2.test(buf)`
* `re2.match(buf)`
* `re2.search(buf)`
* `re2.split(buf[, limit])`
* `re2.replace(buf, replacer)`
Differences from string-based versions:
* All buffers are assumed to be encoded as [UTF-8](https://en.wikipedia.org/wiki/UTF-8)
(ASCII is a proper subset of UTF-8).
* Results are `Buffer` objects, even in composite objects. Convert with
[`buf.toString()`](https://nodejs.org/api/buffer.html#buffer_buf_tostring_encoding_start_end).
* All offsets and lengths are in bytes, not characters (each UTF-8 character occupies 1–4 bytes).
This lets you slice buffers directly without costly character-to-byte recalculations.
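The gap between character offsets and byte offsets can be illustrated with Node.js built-ins only. This sketch does not call `RE2`; it shows why a byte offset can be used on a buffer directly while a character offset needs a conversion pass:

```js
const text = 'абв'; // three Cyrillic letters, 2 bytes each in UTF-8
const buf = Buffer.from(text, 'utf8');

const charOffset = text.indexOf('б');                     // 1
const byteOffset = buf.indexOf(Buffer.from('б', 'utf8')); // 2

// Converting a character offset to a byte offset needs a pass over the prefix.
const convertedOffset = Buffer.byteLength(text.slice(0, charOffset), 'utf8'); // 2

// A byte offset slices the buffer directly, with no conversion.
const tail = buf.subarray(byteOffset).toString('utf8'); // 'бв'

console.log({charOffset, byteOffset, convertedOffset, tail});
```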
When `re2.replace()` is used with a replacer function, the replacer receives string arguments and character offsets by default. Set `useBuffers` to `true` on the function to receive byte offsets instead:
```js
function strReplacer(match, offset, input) {
// typeof match == "string"
return "<= " + offset + " characters|";
}
RE2("б").replace("абв", strReplacer);
// "а<= 1 characters|в"
function bufReplacer(match, offset, input) {
// typeof match == "object" (a Buffer)
return "<= " + offset + " bytes|";
}
bufReplacer.useBuffers = true;
RE2("б").replace("абв", bufReplacer);
// "а<= 2 bytes|в"
```
This works for both string and buffer inputs. Buffer input produces buffer output; string input produces string output.
### `RE2.Set`
Use `RE2.Set` when the same string must be tested against many patterns. It builds a single automaton and frequently beats running individual regular expressions one by one.
While `test()` can be simulated by combining patterns with `|`, `match()` returns which patterns matched — something a single regular expression cannot do.
* `new RE2.Set(patterns[, flagsOrOptions][, options])`
* `patterns` is any iterable of strings, `Buffer`s, `RegExp`, or `RE2` instances; flags (if provided) apply to the whole set.
* `flagsOrOptions` can be a string/`Buffer` with standard flags (`i`, `m`, `s`, `u`, `g`, `y`, `d`).
* `options.anchor` can be `'unanchored'` (default), `'start'`, or `'both'`.
* `set.test(str)` returns `true` if any pattern matches and `false` otherwise.
* `set.match(str)` returns an array of indexes of matching patterns.
* The indexes are integers, sorted in ascending order.
* If no patterns matched, an empty array is returned.
* Read-only properties:
* `set.size` (number of patterns), `set.flags` (`RegExp` flags as a string), `set.anchor` (anchor mode as a string)
* `set.source` (all patterns joined with `|` as a string), `set.sources` (individual pattern sources as an array of strings)
It is based on [RE2::Set](https://github.com/google/re2/blob/main/re2/set.h).
Example:
```js
const routes = new RE2.Set([
'^/users/\\d+$',
'^/posts/\\d+$'
], 'i', {anchor: 'start'});
routes.test('/users/7'); // true
routes.match('/posts/42'); // [1]
routes.sources; // ['^/users/\\d+$', '^/posts/\\d+$']
routes.toString(); // '/^/users/\\d+$|^/posts/\\d+$/iu'
```
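For comparison, the result shape of `set.match()` can be emulated with plain `RegExp` objects. This is only a sketch of the semantics: `matchIndexes` is a hypothetical helper that runs every pattern separately, which is exactly the per-pattern loop that `RE2.Set` is meant to replace with a single automaton.

```js
// Hypothetical helper: tests each pattern separately and collects indexes.
function matchIndexes(patterns, str) {
  const indexes = [];
  patterns.forEach((pattern, index) => {
    if (pattern.test(str)) indexes.push(index);
  });
  return indexes; // ascending, because patterns are visited in order
}

// No g or y flag here, so test() keeps no state between calls.
const routePatterns = [/^\/users\/\d+$/i, /^\/posts\/\d+$/i];

console.log(matchIndexes(routePatterns, '/posts/42')); // [ 1 ]
console.log(matchIndexes(routePatterns, '/about'));    // []
```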
To run the bundled benchmark (make sure node-re2 is built first):
```bash
npx nano-bench bench/set-match.mjs
```
### Calculate length
Two helpers convert between UTF-8 and UTF-16 sizes:
* `RE2.getUtf8Length(str)` — byte size needed to encode a string as a UTF-8 buffer.
* `RE2.getUtf16Length(buf)` — character count needed to decode a UTF-8 buffer as a string.
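To show what these two sizes mean, here is a pure JavaScript sketch that computes them for well-formed input. `utf16LengthOfUtf8` is a hypothetical helper written for illustration; it is not the addon's implementation, and whether the native helpers agree with it on malformed input is not checked here.

```js
// Hypothetical helper: counts UTF-16 code units by inspecting UTF-8 bytes.
// Assumes the buffer holds well-formed UTF-8.
function utf16LengthOfUtf8(buf) {
  let units = 0;
  for (const byte of buf) {
    if ((byte & 0xc0) === 0x80) continue; // continuation byte: counted with its lead byte
    units += byte >= 0xf0 ? 2 : 1;        // 4-byte sequences need a surrogate pair
  }
  return units;
}

const sample = 'aб€\u{1F600}'; // 1-, 2-, 3-, and 4-byte characters
const encoded = Buffer.from(sample, 'utf8');

console.log(Buffer.byteLength(sample, 'utf8')); // 10
console.log(utf16LengthOfUtf8(encoded));        // 5
```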
### Property: `internalSource`
`source` emulates the standard `RegExp` property and can recreate an identical `RE2` or `RegExp` instance. To inspect the RE2-translated pattern (useful for debugging), use the read-only `internalSource` property.
### Unicode warning level
`RE2` always works in Unicode mode. In most cases this is either invisible or preferred. For applications that need tight control, the static property `RE2.unicodeWarningLevel` governs what happens when a non-Unicode regular expression is created.
If a regular expression lacks the `u` flag, it is added silently by default:
```js
const x = /./;
x.flags; // ''
const y = new RE2(x);
y.flags; // 'u'
```
Values of `RE2.unicodeWarningLevel`:
* `'nothing'` (default) — silently add `u`.
* `'warnOnce'` — warn once, then silently add `u`. Assigning this value resets the one-time flag.
* `'warn'` — warn every time, still add `u`.
* `'throw'` — throw `SyntaxError`.
* Any other value is silently ignored, leaving the previous value unchanged.
Warnings and exceptions help audit an application for stray non-Unicode regular expressions.
`RE2.unicodeWarningLevel` is global. Be careful in multi-threaded environments — it is shared across threads.
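One possible way to use the `'throw'` level for an audit is sketched below. `findNonUnicode` is a hypothetical helper, not part of node-re2; it takes the `RE2` constructor as an argument (for example, the value of `require('re2')`) and restores the previous level afterwards because the setting is global. It assumes that a `SyntaxError` raised during construction signals either a missing `u` flag or unsupported syntax.

```js
// Hypothetical helper; pass the RE2 constructor in, e.g. require('re2').
function findNonUnicode(RE2, regexps) {
  const previous = RE2.unicodeWarningLevel;
  RE2.unicodeWarningLevel = 'throw';
  const offenders = [];
  try {
    for (const regexp of regexps) {
      try {
        new RE2(regexp);
      } catch (error) {
        // Collect patterns rejected with SyntaxError; rethrow anything else.
        if (error instanceof SyntaxError) offenders.push(regexp);
        else throw error;
      }
    }
  } finally {
    RE2.unicodeWarningLevel = previous; // the level is global, so restore it
  }
  return offenders;
}

// Usage (requires node-re2 to be installed):
// const offenders = findNonUnicode(require('re2'), [/a/, /b/u]);
```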
## How to install
```bash
npm install re2
```
The project works with other package managers but is not tested with them.
See the wiki for notes on [yarn](https://github.com/uhop/node-re2/wiki/Using-with-yarn) and [pnpm](https://github.com/uhop/node-re2/wiki/Using-with-pnpm).
### Precompiled artifacts
The [install script](https://github.com/uhop/install-artifact-from-github/blob/master/bin/install-from-cache.js) attempts to download a prebuilt artifact from GitHub Releases. Override the download location with the `RE2_DOWNLOAD_MIRROR` environment variable.
If the download fails, the script builds RE2 locally using [node-gyp](https://github.com/nodejs/node-gyp).
## How to use
It is used just like `RegExp`.
```js
const RE2 = require('re2');
// with default flags
let re = new RE2('a(b*)');
let result = re.exec('abbc');
console.log(result[0]); // 'abb'
console.log(result[1]); // 'bb'
result = re.exec('aBbC');
console.log(result[0]); // 'a'
console.log(result[1]); // ''
// with explicit flags
re = new RE2('a(b*)', 'i');
result = re.exec('aBbC');
console.log(result[0]); // 'aBb'
console.log(result[1]); // 'Bb'
// from regular expression object
const regexp = new RegExp('a(b*)', 'i');
re = new RE2(regexp);
result = re.exec('aBbC');
console.log(result[0]); // 'aBb'
console.log(result[1]); // 'Bb'
// from regular expression literal
re = new RE2(/a(b*)/i);
result = re.exec('aBbC');
console.log(result[0]); // 'aBb'
console.log(result[1]); // 'Bb'
// from another RE2 object
const rex = new RE2(re);
result = rex.exec('aBbC');
console.log(result[0]); // 'aBb'
console.log(result[1]); // 'Bb'
// shortcut
result = new RE2('ab*').exec('abba');
// factory
result = RE2('ab*').exec('abba');
```
## Limitations (things RE2 does not support)
`RE2` avoids any regular expression features that require worst-case exponential time to evaluate.
The most notable missing features are backreferences and lookahead assertions.
If your application uses them, you should continue to use `RegExp` —
but since they are fundamentally vulnerable to
[ReDoS](https://www.owasp.org/index.php/Regular_expression_Denial_of_Service_-_ReDoS),
consider replacing them.
`RE2` throws `SyntaxError` for unsupported features.
Wrap `RE2` declarations in a try-catch to fall back to `RegExp`:
```js
let re = /(a)+(b)*/;
try {
re = new RE2(re);
// use RE2 as a drop-in replacement
} catch (e) {
// use the original RegExp
}
const result = re.exec(sample);
```
`RE2` may also behave differently from the built-in engine in corner cases.
### Backreferences
`RE2` does not support backreferences — numbered references to previously
matched groups (`\1`, `\2`, etc.). Example:
```js
/(cat|dog)\1/.test("catcat"); // true
/(cat|dog)\1/.test("dogdog"); // true
/(cat|dog)\1/.test("catdog"); // false
/(cat|dog)\1/.test("dogcat"); // false
```
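When the referenced group can only match a small, fixed set of alternatives, the backreference can be expanded by hand. The sketch below uses the built-in `RegExp` to demonstrate the rewrite; it relies only on alternation and a non-capturing group, and it gives up the capture of the repeated word:

```js
// Backreference version (RegExp only): /(cat|dog)\1/
// Expanded version: each alternative is spelled out twice.
const repeated = /(?:catcat|dogdog)/;

console.log(repeated.test('catcat')); // true
console.log(repeated.test('dogdog')); // true
console.log(repeated.test('catdog')); // false
console.log(repeated.test('dogcat')); // false
```

This does not scale to open-ended groups such as `(\w+)\1`, which have no finite expansion.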
### Lookahead assertions
`RE2` does not support lookahead assertions, which make a match depend on subsequent contents.
```js
/abc(?=def)/; // match abc only if it is followed by def
/abc(?!def)/; // match abc only if it is not followed by def
```
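Simple lookaheads can often be replaced by a capture group plus a check in code. The sketch below uses the built-in `RegExp` and two hypothetical helpers to show the idea for the two patterns above; it is not a general translation, and the positive variant consumes the `def` tail, which matters when iterating over several matches.

```js
// Positive lookahead /abc(?=def)/: match the tail too, keep only group 1.
function matchAbcBeforeDef(input) {
  const m = /(abc)def/.exec(input);
  return m ? {match: m[1], index: m.index} : null;
}

// Negative lookahead /abc(?!def)/: find each abc, then check what follows in code.
function matchAbcNotBeforeDef(input) {
  const re = /abc/g;
  let m;
  while ((m = re.exec(input)) !== null) {
    const end = m.index + m[0].length;
    if (!input.startsWith('def', end)) return {match: m[0], index: m.index};
  }
  return null;
}

console.log(matchAbcBeforeDef('xabcdef'));        // { match: 'abc', index: 1 }
console.log(matchAbcNotBeforeDef('abcdef abcx')); // { match: 'abc', index: 7 }
```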
### Mismatched behavior
`RE2` and the built-in engine may disagree in edge cases. Verify your regular expressions before switching. They should work in the vast majority of cases.
Example:
```js
const RE2 = require('re2');
const pattern = '(?:(a)|(b)|(c))+';
const built_in = new RegExp(pattern);
const re2 = new RE2(pattern);
const input = 'abc';
const bi_res = built_in.exec(input);
const re2_res = re2.exec(input);
console.log('bi_res: ' + bi_res); // prints: bi_res: abc,,,c
console.log('re2_res : ' + re2_res); // prints: re2_res : abc,a,b,c
```
### Unicode
`RE2` always works in Unicode mode. See `RE2.unicodeWarningLevel` above for details.
#### Unicode classes `\p{...}` and `\P{...}`
`RE2` supports a subset of Unicode classes as defined in [RE2 Syntax](https://github.com/google/re2/wiki/Syntax). Google RE2 natively supports only short names (e.g., `L` for `Letter`). Like `RegExp`, node-re2 also accepts long names by translating them to short names.
Only the `\p{name}` form is supported, not `\p{name=value}` in general.
The exception is `Script` and `sc`, e.g., `\p{Script=Latin}` and `\p{sc=Cyrillic}`.
The same applies to `\P{...}`.
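The long and short spellings mentioned above are equivalent in the built-in `RegExp`, which makes it a convenient reference when porting patterns. The sketch below uses `RegExp` only and does not exercise node-re2's translation:

```js
const letterLong = /^\p{Letter}+$/u;
const letterShort = /^\p{L}+$/u;
const cyrillicLong = /^\p{Script=Cyrillic}+$/u;
const cyrillicShort = /^\p{sc=Cyrillic}+$/u;

const word = 'абв';
console.log(letterLong.test(word), letterShort.test(word));     // true true
console.log(cyrillicLong.test(word), cyrillicShort.test(word)); // true true
console.log(cyrillicLong.test('abc'));                          // false
```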
## Release history
- 1.24.0 *Fixed multi-threaded crash in worker threads (#235). Added named import: `import {RE2} from 're2'`. Added CJS test. Updated docs and dependencies.*
- 1.23.3 *Updated Abseil and dev dependencies.*
- 1.23.2 *Updated dev dependencies.*
- 1.23.1 *Updated Abseil and dev dependencies.*
- 1.23.0 *Updated all dependencies, upgraded tooling. New feature: `RE2.Set` (thx, [Wes](https://github.com/wrmedford)).*
- 1.22.3 *Technical release: upgraded QEMU emulations to native ARM runners to speed up the build process.*
- 1.22.2 *Updated all dependencies and the list of pre-compiled targets: Node 20, 22, 24, 25 (thx, [Jiayu Liu](https://github.com/jimexist)).*
- 1.22.1 *Added support for translation of scripts as Unicode classes.*
- 1.22.0 *Added support for translation of Unicode classes (thx, [John Livingston](https://github.com/JohnXLivingston)). Added [attestations](https://github.com/uhop/node-re2/attestations).*
- 1.21.5 *Updated all dependencies and the list of pre-compiled targets. Fixed minor bugs. C++ style fix (thx, [Benjamin Brienen](https://github.com/BenjaminBrienen)). Added Windows 11 ARM build runner (thx, [Kagami Sascha Rosylight](https://github.com/saschanaz)).*
- 1.21.4 *Fixed a regression reported by [caroline-matsec](https://github.com/caroline-matsec), thx! Added pre-compilation targets for Alpine Linux on ARM. Updated deps.*
- 1.21.3 *Fixed an empty string regression reported by [Rhys Arkins](https://github.com/rarkins), thx! Updated deps.*
- 1.21.2 *Fixed another memory regression reported by [matthewvalentine](https://github.com/matthewvalentine), thx! Updated deps. Added more tests and benchmarks.*
- 1.21.1 *Fixed a memory regression reported by [matthewvalentine](https://github.com/matthewvalentine), thx! Updated deps.*
- 1.21.0 *Fixed the performance problem reported by [matthewvalentine](https://github.com/matthewvalentine) (thx!). The change improves performance for multiple use cases.*
- 1.20.12 *Updated deps. Maintenance chores. Fixes for buffer-related bugs: `exec()` index (reported by [matthewvalentine](https://github.com/matthewvalentine), thx) and `match()` index.*
- 1.20.11 *Updated deps. Added support for Node 22 (thx, [Elton Leong](https://github.com/eltonkl)).*
- 1.20.10 *Updated deps. Removed development-only files from the package (thx, [Haruaki OTAKE](https://github.com/aaharu)). Added arm64 Linux prebuilds (thx, [Christopher M](https://github.com/cmanou)). Fixed non-`npm` `corepack` problem (thx, [Steven](https://github.com/styfle)).*
- 1.20.9 *Updated deps. Added more `abseil-cpp` files; their absence manifested itself on NixOS. Thx, [Laura Hausmann](https://github.com/zotanmew).*
- 1.20.8 *Updated deps: `install-artifact-from-github`. A default HTTPS agent is used for fetching precompiled artifacts avoiding unnecessary long wait times.*
- 1.20.7 *Added more `abseil-cpp` files; their absence manifested itself on ARM Alpine. Thx, [Laura Hausmann](https://github.com/zotanmew).*
- 1.20.6 *Updated deps, notably `node-gyp`.*
- 1.20.5 *Updated deps, added Node 21 and retired Node 16 as pre-compilation targets.*
- 1.20.4 *Updated deps. Fix: the 2nd argument of the constructor overrides flags. Thx, [gost-serb](https://github.com/gost-serb).*
- 1.20.3 *Fix: subsequent numbers are incorporated into group if they would form a legal group reference. Thx, [Oleksii Vasyliev](https://github.com/le0pard).*
- 1.20.2 *Fix: added a missing C++ file, which caused a bug on Alpine Linux. Thx, [rbitanga-manticore](https://github.com/rbitanga-manticore).*
- 1.20.1 *Fix: files included in the npm package to build the C++ code.*
- 1.20.0 *Updated RE2. The new version uses `abseil-cpp` and required adaptation work. Thx, [Stefano Rivera](https://github.com/stefanor).*
The rest can be consulted in the project's wiki [Release history](https://github.com/uhop/node-re2/wiki/Release-history).
## License
BSD-3-Clause
================================================
FILE: bench/bad-pattern.mjs
================================================
import {RE2} from '../re2.js';
const BAD_PATTERN = '([a-z]+)+$';
const BAD_INPUT = 'a'.repeat(10) + '!';
const regExp = new RegExp(BAD_PATTERN);
const re2 = new RE2(BAD_PATTERN);
export default {
RegExp: n => {
let count = 0;
for (let i = 0; i < n; ++i) {
if (regExp.test(BAD_INPUT)) ++count;
}
return count;
},
RE2: n => {
let count = 0;
for (let i = 0; i < n; ++i) {
if (re2.test(BAD_INPUT)) ++count;
}
return count;
}
};
================================================
FILE: bench/set-match.mjs
================================================
import {RE2} from '../re2.js';
const PATTERN_COUNT = 200;
const patterns = [];
for (let i = 0; i < PATTERN_COUNT; ++i) {
patterns.push('token' + i + '(?:[a-z]+)?');
}
const INPUT_COUNT = 500;
const inputs = [];
for (let j = 0; j < INPUT_COUNT; ++j) {
inputs.push(
'xx' +
(j % PATTERN_COUNT) +
' ' +
(j & 7) +
' token' +
(j % PATTERN_COUNT) +
' tail'
);
}
const re2Set = new RE2.Set(patterns);
const re2List = patterns.map(p => new RE2(p));
const jsList = patterns.map(p => new RegExp(p));
export default {
RegExp: n => {
let count = 0;
for (let i = 0; i < n; ++i) {
for (const input of inputs) {
const matches = [];
for (const pattern of jsList) {
if (pattern.test(input)) matches.push(pattern);
}
count += matches.length;
}
}
return count;
},
RE2: n => {
let count = 0;
for (let i = 0; i < n; ++i) {
for (const input of inputs) {
const matches = [];
for (const pattern of re2List) {
if (pattern.test(input)) matches.push(pattern);
}
count += matches.length;
}
}
return count;
},
'RE2.Set': n => {
let count = 0;
for (let i = 0; i < n; ++i) {
for (const input of inputs) {
const matches = re2Set.match(input);
count += matches.length;
}
}
return count;
}
};
================================================
FILE: binding.gyp
================================================
{
"targets": [
{
"target_name": "re2",
"sources": [
"lib/addon.cc",
"lib/accessors.cc",
"lib/pattern.cc",
"lib/util.cc",
"lib/new.cc",
"lib/exec.cc",
"lib/test.cc",
"lib/match.cc",
"lib/replace.cc",
"lib/search.cc",
"lib/split.cc",
"lib/to_string.cc",
"lib/set.cc",
"vendor/re2/re2/bitmap256.cc",
"vendor/re2/re2/bitstate.cc",
"vendor/re2/re2/compile.cc",
"vendor/re2/re2/dfa.cc",
"vendor/re2/re2/filtered_re2.cc",
"vendor/re2/re2/mimics_pcre.cc",
"vendor/re2/re2/nfa.cc",
"vendor/re2/re2/onepass.cc",
"vendor/re2/re2/parse.cc",
"vendor/re2/re2/perl_groups.cc",
"vendor/re2/re2/prefilter.cc",
"vendor/re2/re2/prefilter_tree.cc",
"vendor/re2/re2/prog.cc",
"vendor/re2/re2/re2.cc",
"vendor/re2/re2/regexp.cc",
"vendor/re2/re2/set.cc",
"vendor/re2/re2/simplify.cc",
"vendor/re2/re2/tostring.cc",
"vendor/re2/re2/unicode_casefold.cc",
"vendor/re2/re2/unicode_groups.cc",
"vendor/re2/util/pcre.cc",
"vendor/re2/util/rune.cc",
"vendor/re2/util/strutil.cc",
"vendor/abseil-cpp/absl/base/internal/cycleclock.cc",
"vendor/abseil-cpp/absl/base/internal/low_level_alloc.cc",
"vendor/abseil-cpp/absl/base/internal/raw_logging.cc",
"vendor/abseil-cpp/absl/base/internal/spinlock.cc",
"vendor/abseil-cpp/absl/base/internal/spinlock_wait.cc",
"vendor/abseil-cpp/absl/base/internal/strerror.cc",
"vendor/abseil-cpp/absl/base/internal/sysinfo.cc",
"vendor/abseil-cpp/absl/base/internal/thread_identity.cc",
"vendor/abseil-cpp/absl/base/internal/throw_delegate.cc",
"vendor/abseil-cpp/absl/base/internal/unscaledcycleclock.cc",
"vendor/abseil-cpp/absl/container/internal/hashtablez_sampler.cc",
"vendor/abseil-cpp/absl/container/internal/hashtablez_sampler_force_weak_definition.cc",
"vendor/abseil-cpp/absl/container/internal/raw_hash_set.cc",
"vendor/abseil-cpp/absl/debugging/internal/borrowed_fixup_buffer.cc",
"vendor/abseil-cpp/absl/debugging/internal/decode_rust_punycode.cc",
"vendor/abseil-cpp/absl/debugging/internal/demangle.cc",
"vendor/abseil-cpp/absl/debugging/internal/demangle_rust.cc",
"vendor/abseil-cpp/absl/debugging/internal/address_is_readable.cc",
"vendor/abseil-cpp/absl/debugging/internal/elf_mem_image.cc",
"vendor/abseil-cpp/absl/debugging/internal/examine_stack.cc",
"vendor/abseil-cpp/absl/debugging/internal/utf8_for_code_point.cc",
"vendor/abseil-cpp/absl/debugging/internal/vdso_support.cc",
"vendor/abseil-cpp/absl/debugging/stacktrace.cc",
"vendor/abseil-cpp/absl/debugging/symbolize.cc",
"vendor/abseil-cpp/absl/flags/commandlineflag.cc",
"vendor/abseil-cpp/absl/flags/internal/commandlineflag.cc",
"vendor/abseil-cpp/absl/flags/internal/flag.cc",
"vendor/abseil-cpp/absl/flags/internal/private_handle_accessor.cc",
"vendor/abseil-cpp/absl/flags/internal/program_name.cc",
"vendor/abseil-cpp/absl/flags/marshalling.cc",
"vendor/abseil-cpp/absl/flags/reflection.cc",
"vendor/abseil-cpp/absl/flags/usage_config.cc",
"vendor/abseil-cpp/absl/hash/internal/city.cc",
"vendor/abseil-cpp/absl/hash/internal/hash.cc",
"vendor/abseil-cpp/absl/log/internal/globals.cc",
"vendor/abseil-cpp/absl/log/internal/log_format.cc",
"vendor/abseil-cpp/absl/log/internal/log_message.cc",
"vendor/abseil-cpp/absl/log/internal/log_sink_set.cc",
"vendor/abseil-cpp/absl/log/internal/nullguard.cc",
"vendor/abseil-cpp/absl/log/internal/proto.cc",
"vendor/abseil-cpp/absl/log/internal/structured_proto.cc",
"vendor/abseil-cpp/absl/log/globals.cc",
"vendor/abseil-cpp/absl/log/log_sink.cc",
"vendor/abseil-cpp/absl/numeric/int128.cc",
"vendor/abseil-cpp/absl/strings/ascii.cc",
"vendor/abseil-cpp/absl/strings/charconv.cc",
"vendor/abseil-cpp/absl/strings/internal/charconv_bigint.cc",
"vendor/abseil-cpp/absl/strings/internal/charconv_parse.cc",
"vendor/abseil-cpp/absl/strings/internal/memutil.cc",
"vendor/abseil-cpp/absl/strings/internal/str_format/arg.cc",
"vendor/abseil-cpp/absl/strings/internal/str_format/bind.cc",
"vendor/abseil-cpp/absl/strings/internal/str_format/extension.cc",
"vendor/abseil-cpp/absl/strings/internal/str_format/float_conversion.cc",
"vendor/abseil-cpp/absl/strings/internal/str_format/output.cc",
"vendor/abseil-cpp/absl/strings/internal/str_format/parser.cc",
"vendor/abseil-cpp/absl/strings/internal/utf8.cc",
"vendor/abseil-cpp/absl/strings/match.cc",
"vendor/abseil-cpp/absl/strings/numbers.cc",
"vendor/abseil-cpp/absl/strings/str_cat.cc",
"vendor/abseil-cpp/absl/strings/str_split.cc",
"vendor/abseil-cpp/absl/synchronization/internal/create_thread_identity.cc",
"vendor/abseil-cpp/absl/synchronization/internal/graphcycles.cc",
"vendor/abseil-cpp/absl/synchronization/internal/futex_waiter.cc",
"vendor/abseil-cpp/absl/synchronization/internal/kernel_timeout.cc",
"vendor/abseil-cpp/absl/synchronization/internal/per_thread_sem.cc",
"vendor/abseil-cpp/absl/synchronization/internal/waiter_base.cc",
"vendor/abseil-cpp/absl/synchronization/mutex.cc",
"vendor/abseil-cpp/absl/time/clock.cc",
"vendor/abseil-cpp/absl/time/duration.cc",
"vendor/abseil-cpp/absl/time/internal/cctz/src/time_zone_fixed.cc",
"vendor/abseil-cpp/absl/time/internal/cctz/src/time_zone_if.cc",
"vendor/abseil-cpp/absl/time/internal/cctz/src/time_zone_impl.cc",
"vendor/abseil-cpp/absl/time/internal/cctz/src/time_zone_info.cc",
"vendor/abseil-cpp/absl/time/internal/cctz/src/time_zone_libc.cc",
"vendor/abseil-cpp/absl/time/internal/cctz/src/time_zone_lookup.cc",
"vendor/abseil-cpp/absl/time/internal/cctz/src/time_zone_posix.cc",
"vendor/abseil-cpp/absl/time/internal/cctz/src/zone_info_source.cc",
"vendor/abseil-cpp/absl/time/time.cc",
],
"cflags": [
"-std=c++2a",
"-Wall",
"-Wextra",
"-Wno-sign-compare",
"-Wno-unused-parameter",
"-Wno-missing-field-initializers",
"-Wno-cast-function-type",
"-O3",
"-g"
],
"defines": [
"NDEBUG",
"NOMINMAX"
],
"include_dirs": [
"<!(node -e \"require('nan')\")",
"vendor/re2",
"vendor/abseil-cpp",
],
"xcode_settings": {
"MACOSX_DEPLOYMENT_TARGET": "10.15",
"CLANG_CXX_LANGUAGE_STANDARD": "c++2a",
"CLANG_CXX_LIBRARY": "libc++",
"OTHER_CFLAGS": [
"-std=c++2a",
"-Wall",
"-Wextra",
"-Wno-sign-compare",
"-Wno-unused-parameter",
"-Wno-missing-field-initializers",
"-O3",
"-g"
]
},
"conditions": [
["OS==\"linux\"", {
"cflags": [
"-pthread"
],
"ldflags": [
"-pthread"
]
}],
["OS==\"win\"", {
"sources": [
"vendor/abseil-cpp/absl/synchronization/internal/win32_waiter.cc",
"vendor/abseil-cpp/absl/time/internal/cctz/src/time_zone_name_win.cc"
]
}]
]
}
]
}
================================================
FILE: lib/accessors.cc
================================================
#include "./wrapped_re2.h"
#include <cstring>
#include <string>
#include <vector>
NAN_GETTER(WrappedRE2::GetSource)
{
if (!WrappedRE2::HasInstance(info.This()))
{
info.GetReturnValue().Set(Nan::New("(?:)").ToLocalChecked());
return;
}
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
info.GetReturnValue().Set(Nan::New(re2->source).ToLocalChecked());
}
NAN_GETTER(WrappedRE2::GetInternalSource)
{
if (!WrappedRE2::HasInstance(info.This()))
{
info.GetReturnValue().Set(Nan::New("(?:)").ToLocalChecked());
return;
}
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
info.GetReturnValue().Set(Nan::New(re2->regexp.pattern()).ToLocalChecked());
}
NAN_GETTER(WrappedRE2::GetFlags)
{
if (!WrappedRE2::HasInstance(info.This()))
{
info.GetReturnValue().Set(Nan::New("").ToLocalChecked());
return;
}
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
std::string flags;
if (re2->hasIndices)
{
flags += "d";
}
if (re2->global)
{
flags += "g";
}
if (re2->ignoreCase)
{
flags += "i";
}
if (re2->multiline)
{
flags += "m";
}
if (re2->dotAll)
{
flags += "s";
}
flags += "u";
if (re2->sticky)
{
flags += "y";
}
info.GetReturnValue().Set(Nan::New(flags).ToLocalChecked());
}
NAN_GETTER(WrappedRE2::GetGlobal)
{
if (!WrappedRE2::HasInstance(info.This()))
{
info.GetReturnValue().SetUndefined();
return;
}
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
info.GetReturnValue().Set(re2->global);
}
NAN_GETTER(WrappedRE2::GetIgnoreCase)
{
if (!WrappedRE2::HasInstance(info.This()))
{
info.GetReturnValue().SetUndefined();
return;
}
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
info.GetReturnValue().Set(re2->ignoreCase);
}
NAN_GETTER(WrappedRE2::GetMultiline)
{
if (!WrappedRE2::HasInstance(info.This()))
{
info.GetReturnValue().SetUndefined();
return;
}
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
info.GetReturnValue().Set(re2->multiline);
}
NAN_GETTER(WrappedRE2::GetDotAll)
{
if (!WrappedRE2::HasInstance(info.This()))
{
info.GetReturnValue().SetUndefined();
return;
}
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
info.GetReturnValue().Set(re2->dotAll);
}
NAN_GETTER(WrappedRE2::GetUnicode)
{
if (!WrappedRE2::HasInstance(info.This()))
{
info.GetReturnValue().SetUndefined();
return;
}
info.GetReturnValue().Set(true);
}
NAN_GETTER(WrappedRE2::GetSticky)
{
if (!WrappedRE2::HasInstance(info.This()))
{
info.GetReturnValue().SetUndefined();
return;
}
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
info.GetReturnValue().Set(re2->sticky);
}
NAN_GETTER(WrappedRE2::GetHasIndices)
{
if (!WrappedRE2::HasInstance(info.This()))
{
info.GetReturnValue().SetUndefined();
return;
}
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
info.GetReturnValue().Set(re2->hasIndices);
}
NAN_GETTER(WrappedRE2::GetLastIndex)
{
if (!WrappedRE2::HasInstance(info.This()))
{
info.GetReturnValue().SetUndefined();
return;
}
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
info.GetReturnValue().Set(static_cast<int>(re2->lastIndex));
}
NAN_SETTER(WrappedRE2::SetLastIndex)
{
if (!WrappedRE2::HasInstance(info.This()))
{
return Nan::ThrowTypeError("Cannot set lastIndex of an invalid RE2 object.");
}
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
if (value->IsNumber())
{
int n = value->NumberValue(Nan::GetCurrentContext()).FromMaybe(0);
re2->lastIndex = n <= 0 ? 0 : n;
}
}
std::atomic<WrappedRE2::UnicodeWarningLevels> WrappedRE2::unicodeWarningLevel{WrappedRE2::NOTHING};
NAN_GETTER(WrappedRE2::GetUnicodeWarningLevel)
{
std::string level;
switch (unicodeWarningLevel)
{
case THROW:
level = "throw";
break;
case WARN:
level = "warn";
break;
case WARN_ONCE:
level = "warnOnce";
break;
default:
level = "nothing";
break;
}
info.GetReturnValue().Set(Nan::New(level).ToLocalChecked());
}
NAN_SETTER(WrappedRE2::SetUnicodeWarningLevel)
{
if (value->IsString())
{
Nan::Utf8String s(value);
if (!strcmp(*s, "throw"))
{
unicodeWarningLevel = THROW;
return;
}
if (!strcmp(*s, "warn"))
{
unicodeWarningLevel = WARN;
return;
}
if (!strcmp(*s, "warnOnce"))
{
unicodeWarningLevel = WARN_ONCE;
alreadyWarnedAboutUnicode = false;
return;
}
if (!strcmp(*s, "nothing"))
{
unicodeWarningLevel = NOTHING;
return;
}
}
}
================================================
FILE: lib/addon.cc
================================================
#include "./wrapped_re2.h"
#include "./wrapped_re2_set.h"
#include "./isolate_data.h"
#include <mutex>
#include <unordered_map>
static std::mutex addonDataMutex;
static std::unordered_map<v8::Isolate *, AddonData *> addonDataMap;
AddonData *getAddonData(v8::Isolate *isolate)
{
std::lock_guard<std::mutex> lock(addonDataMutex);
auto it = addonDataMap.find(isolate);
return it != addonDataMap.end() ? it->second : nullptr;
}
void setAddonData(v8::Isolate *isolate, AddonData *data)
{
std::lock_guard<std::mutex> lock(addonDataMutex);
addonDataMap[isolate] = data;
}
void deleteAddonData(v8::Isolate *isolate)
{
std::lock_guard<std::mutex> lock(addonDataMutex);
auto it = addonDataMap.find(isolate);
if (it != addonDataMap.end())
{
delete it->second;
addonDataMap.erase(it);
}
}
static NAN_METHOD(GetUtf8Length)
{
auto t = info[0]->ToString(Nan::GetCurrentContext());
if (t.IsEmpty())
{
return;
}
auto s = t.ToLocalChecked();
info.GetReturnValue().Set(static_cast<int>(s->Utf8Length(v8::Isolate::GetCurrent())));
}
static NAN_METHOD(GetUtf16Length)
{
if (node::Buffer::HasInstance(info[0]))
{
const auto *s = node::Buffer::Data(info[0]);
info.GetReturnValue().Set(static_cast<int>(getUtf16Length(s, s + node::Buffer::Length(info[0]))));
return;
}
info.GetReturnValue().Set(-1);
}
static void cleanup(void *p)
{
v8::Isolate *isolate = static_cast<v8::Isolate *>(p);
deleteAddonData(isolate);
}
// NAN_MODULE_INIT(WrappedRE2::Init)
v8::Local<v8::Function> WrappedRE2::Init()
{
Nan::EscapableHandleScope scope;
// prepare constructor template
auto tpl = Nan::New<v8::FunctionTemplate>(New);
tpl->SetClassName(Nan::New("RE2").ToLocalChecked());
auto instanceTemplate = tpl->InstanceTemplate();
instanceTemplate->SetInternalFieldCount(1);
// save the template in per-isolate storage
auto isolate = v8::Isolate::GetCurrent();
auto data = new AddonData();
data->re2Tpl.Reset(tpl);
setAddonData(isolate, data);
node::AddEnvironmentCleanupHook(isolate, cleanup, isolate);
// prototype
Nan::SetPrototypeMethod(tpl, "toString", ToString);
Nan::SetPrototypeMethod(tpl, "exec", Exec);
Nan::SetPrototypeMethod(tpl, "test", Test);
Nan::SetPrototypeMethod(tpl, "match", Match);
Nan::SetPrototypeMethod(tpl, "replace", Replace);
Nan::SetPrototypeMethod(tpl, "search", Search);
Nan::SetPrototypeMethod(tpl, "split", Split);
Nan::SetPrototypeTemplate(tpl, "source", Nan::New("(?:)").ToLocalChecked());
Nan::SetPrototypeTemplate(tpl, "flags", Nan::New("").ToLocalChecked());
Nan::SetAccessor(instanceTemplate, Nan::New("source").ToLocalChecked(), GetSource);
Nan::SetAccessor(instanceTemplate, Nan::New("flags").ToLocalChecked(), GetFlags);
Nan::SetAccessor(instanceTemplate, Nan::New("global").ToLocalChecked(), GetGlobal);
Nan::SetAccessor(instanceTemplate, Nan::New("ignoreCase").ToLocalChecked(), GetIgnoreCase);
Nan::SetAccessor(instanceTemplate, Nan::New("multiline").ToLocalChecked(), GetMultiline);
Nan::SetAccessor(instanceTemplate, Nan::New("dotAll").ToLocalChecked(), GetDotAll);
Nan::SetAccessor(instanceTemplate, Nan::New("unicode").ToLocalChecked(), GetUnicode);
Nan::SetAccessor(instanceTemplate, Nan::New("sticky").ToLocalChecked(), GetSticky);
Nan::SetAccessor(instanceTemplate, Nan::New("hasIndices").ToLocalChecked(), GetHasIndices);
Nan::SetAccessor(instanceTemplate, Nan::New("lastIndex").ToLocalChecked(), GetLastIndex, SetLastIndex);
Nan::SetAccessor(instanceTemplate, Nan::New("internalSource").ToLocalChecked(), GetInternalSource);
auto ctr = Nan::GetFunction(tpl).ToLocalChecked();
auto setCtr = WrappedRE2Set::Init();
Nan::Set(ctr, Nan::New("Set").ToLocalChecked(), setCtr);
// properties
Nan::Export(ctr, "getUtf8Length", GetUtf8Length);
Nan::Export(ctr, "getUtf16Length", GetUtf16Length);
Nan::SetAccessor(v8::Local<v8::Object>(ctr), Nan::New("unicodeWarningLevel").ToLocalChecked(), GetUnicodeWarningLevel, SetUnicodeWarningLevel);
return scope.Escape(ctr);
}
NODE_MODULE_INIT()
{
Nan::HandleScope scope;
Nan::Set(module->ToObject(context).ToLocalChecked(), Nan::New("exports").ToLocalChecked(), WrappedRE2::Init());
}
WrappedRE2::~WrappedRE2()
{
dropCache();
}
// private methods
void WrappedRE2::dropCache()
{
if (!lastString.IsEmpty())
{
// lastString.ClearWeak();
lastString.Reset();
}
if (!lastCache.IsEmpty())
{
// lastCache.ClearWeak();
lastCache.Reset();
}
lastStringValue.clear();
}
const StrVal &WrappedRE2::prepareArgument(const v8::Local<v8::Value> &arg, bool ignoreLastIndex)
{
size_t startFrom = ignoreLastIndex ? 0 : lastIndex;
if (!lastString.IsEmpty())
{
lastString.ClearWeak();
}
if (!lastCache.IsEmpty())
{
lastCache.ClearWeak();
}
if (lastString == arg && !node::Buffer::HasInstance(arg) && !lastCache.IsEmpty())
{
// we have a properly cached string
lastStringValue.setIndex(startFrom);
return lastStringValue;
}
dropCache();
if (node::Buffer::HasInstance(arg))
{
// no need to cache buffers
lastString.Reset(arg);
auto argSize = node::Buffer::Length(arg);
lastStringValue.reset(arg, argSize, argSize, startFrom, true);
return lastStringValue;
}
// caching the string
auto t = arg->ToString(Nan::GetCurrentContext());
if (t.IsEmpty())
{
// do not process bad strings
lastStringValue.isBad = true;
return lastStringValue;
}
lastString.Reset(arg);
auto isolate = v8::Isolate::GetCurrent();
auto s = t.ToLocalChecked();
auto argLength = s->Utf8Length(isolate);
auto buffer = node::Buffer::New(isolate, s).ToLocalChecked();
lastCache.Reset(buffer);
auto argSize = node::Buffer::Length(buffer);
lastStringValue.reset(buffer, argSize, argLength, startFrom);
return lastStringValue;
}
void WrappedRE2::doneWithLastString()
{
if (!lastString.IsEmpty())
{
static_cast<v8::PersistentBase<v8::Value> &>(lastString).SetWeak();
}
if (!lastCache.IsEmpty())
{
static_cast<v8::PersistentBase<v8::Object> &>(lastCache).SetWeak();
}
}
// StrVal
void StrVal::setIndex(size_t newIndex)
{
isValidIndex = newIndex <= length;
if (!isValidIndex)
{
index = newIndex;
byteIndex = 0;
return;
}
if (newIndex == index)
return;
if (isBuffer)
{
byteIndex = index = newIndex;
return;
}
// String
if (!newIndex)
{
byteIndex = index = 0;
return;
}
if (newIndex == length)
{
byteIndex = size;
index = length;
return;
}
byteIndex = index < newIndex ? getUtf16PositionByCounter(data, byteIndex, newIndex - index) : getUtf16PositionByCounter(data, 0, newIndex);
index = newIndex;
}
static char null_buffer[] = {'\0'};
void StrVal::reset(const v8::Local<v8::Value> &arg, size_t argSize, size_t argLength, size_t newIndex, bool buffer)
{
clear();
isBuffer = buffer;
size = argSize;
length = argLength;
data = size ? node::Buffer::Data(arg) : null_buffer;
setIndex(newIndex);
}
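/*
Illustrative sketch, not part of the addon: StrVal::setIndex above turns a
JavaScript (UTF-16) index into a byte offset inside the cached UTF-8 buffer by
calling getUtf16PositionByCounter, which is defined elsewhere in the
repository. The standalone helper below shows one way such a mapping can work.
`utf8SequenceSize` and `byteOffsetForUtf16Index` are hypothetical names, the
input is assumed to be valid UTF-8, and stopping in front of a surrogate pair
when the index falls inside it is an assumption, not confirmed library
behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Size in bytes of the UTF-8 sequence introduced by `lead` (valid input assumed).
static std::size_t utf8SequenceSize(unsigned char lead)
{
    if (lead < 0x80) return 1;           // ASCII
    if ((lead & 0xE0) == 0xC0) return 2; // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3; // 1110xxxx
    return 4;                            // 11110xxx
}

// Byte offset reached after advancing `units` UTF-16 code units from the start.
static std::size_t byteOffsetForUtf16Index(const std::string &text, std::size_t units)
{
    std::size_t pos = 0;
    while (units > 0 && pos < text.size())
    {
        const std::size_t bytes = utf8SequenceSize(static_cast<unsigned char>(text[pos]));
        // A 4-byte sequence becomes a surrogate pair, i.e. two UTF-16 code units.
        const std::size_t cost = bytes == 4 ? 2 : 1;
        if (cost > units) break; // the index points into the middle of a pair
        units -= cost;
        pos = std::min(pos + bytes, text.size());
    }
    return pos;
}
```

The walk is linear in the number of bytes skipped, which is why the original
resumes from the previous byteIndex when the new index lies ahead of it.
*/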
================================================
FILE: lib/exec.cc
================================================
#include "./wrapped_re2.h"
#include <vector>
NAN_METHOD(WrappedRE2::Exec)
{
// unpack arguments
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
if (!re2)
{
info.GetReturnValue().SetNull();
return;
}
PrepareLastString prep(re2, info[0]);
StrVal& str = prep;
if (str.isBad) return; // throws an exception
if (re2->global || re2->sticky)
{
if (!str.isValidIndex)
{
re2->lastIndex = 0;
info.GetReturnValue().SetNull();
return;
}
}
// actual work
std::vector<re2::StringPiece> groups(re2->regexp.NumberOfCapturingGroups() + 1);
if (!re2->regexp.Match(str, str.byteIndex, str.size, re2->sticky ? re2::RE2::ANCHOR_START : re2::RE2::UNANCHORED, &groups[0], groups.size()))
{
if (re2->global || re2->sticky)
{
re2->lastIndex = 0;
}
info.GetReturnValue().SetNull();
return;
}
// form a result
auto result = Nan::New<v8::Array>(), indices = Nan::New<v8::Array>();
int indexOffset = re2->global || re2->sticky ? re2->lastIndex : 0;
if (str.isBuffer)
{
for (size_t i = 0, n = groups.size(); i < n; ++i)
{
const auto &item = groups[i];
const auto data = item.data();
if (data)
{
Nan::Set(result, i, Nan::CopyBuffer(data, item.size()).ToLocalChecked());
if (re2->hasIndices)
{
auto pair = Nan::New<v8::Array>();
auto offset = data - str.data - str.byteIndex;
auto length = item.size();
Nan::Set(pair, 0, Nan::New<v8::Integer>(indexOffset + static_cast<int>(offset)));
Nan::Set(pair, 1, Nan::New<v8::Integer>(indexOffset + static_cast<int>(offset + length)));
Nan::Set(indices, i, pair);
}
}
else
{
Nan::Set(result, i, Nan::Undefined());
if (re2->hasIndices)
{
Nan::Set(indices, i, Nan::Undefined());
}
}
}
Nan::Set(result, Nan::New("index").ToLocalChecked(), Nan::New<v8::Integer>(indexOffset + static_cast<int>(groups[0].data() - str.data - str.byteIndex)));
}
else
{
for (size_t i = 0, n = groups.size(); i < n; ++i)
{
const auto &item = groups[i];
const auto data = item.data();
if (data)
{
Nan::Set(result, i, Nan::New(data, item.size()).ToLocalChecked());
if (re2->hasIndices)
{
auto pair = Nan::New<v8::Array>();
auto offset = getUtf16Length(str.data + str.byteIndex, data);
auto length = getUtf16Length(data, data + item.size());
Nan::Set(pair, 0, Nan::New<v8::Integer>(indexOffset + static_cast<int>(offset)));
Nan::Set(pair, 1, Nan::New<v8::Integer>(indexOffset + static_cast<int>(offset + length)));
Nan::Set(indices, i, pair);
}
}
else
{
Nan::Set(result, i, Nan::Undefined());
if (re2->hasIndices)
{
Nan::Set(indices, i, Nan::Undefined());
}
}
}
Nan::Set(
result,
Nan::New("index").ToLocalChecked(),
Nan::New<v8::Integer>(indexOffset +
static_cast<int>(getUtf16Length(str.data + str.byteIndex, groups[0].data()))));
}
if (re2->global || re2->sticky)
{
re2->lastIndex +=
str.isBuffer ? groups[0].data() - str.data + groups[0].size() - str.byteIndex : getUtf16Length(str.data + str.byteIndex, groups[0].data() + groups[0].size());
}
Nan::Set(result, Nan::New("input").ToLocalChecked(), info[0]);
const auto &groupNames = re2->regexp.CapturingGroupNames();
if (!groupNames.empty())
{
auto groups = Nan::New<v8::Object>();
Nan::SetPrototype(groups, Nan::Null());
for (auto group : groupNames)
{
auto value = Nan::Get(result, group.first);
if (!value.IsEmpty())
{
Nan::Set(groups, Nan::New(group.second).ToLocalChecked(), value.ToLocalChecked());
}
}
Nan::Set(result, Nan::New("groups").ToLocalChecked(), groups);
if (re2->hasIndices)
{
auto indexGroups = Nan::New<v8::Object>();
Nan::SetPrototype(indexGroups, Nan::Null());
for (auto group : groupNames)
{
auto value = Nan::Get(indices, group.first);
if (!value.IsEmpty())
{
Nan::Set(indexGroups, Nan::New(group.second).ToLocalChecked(), value.ToLocalChecked());
}
}
Nan::Set(indices, Nan::New("groups").ToLocalChecked(), indexGroups);
}
}
else
{
Nan::Set(result, Nan::New("groups").ToLocalChecked(), Nan::Undefined());
if (re2->hasIndices)
{
Nan::Set(indices, Nan::New("groups").ToLocalChecked(), Nan::Undefined());
}
}
if (re2->hasIndices)
{
Nan::Set(result, Nan::New("indices").ToLocalChecked(), indices);
}
info.GetReturnValue().Set(result);
}
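/*
Illustrative sketch, not part of the addon: for string inputs, Exec above
reports `index`, `indices` and `lastIndex` in UTF-16 code units by calling
getUtf16Length over a range of UTF-8 bytes. That helper lives elsewhere in the
repository. The function below is a hypothetical equivalent
(`countUtf16Units` is a made-up name) that assumes valid UTF-8 and a range
that starts and ends on character boundaries.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of UTF-16 code units needed for the UTF-8 bytes in [from, to).
static std::size_t countUtf16Units(const char *from, const char *to)
{
    std::size_t units = 0;
    for (const char *p = from; p < to; ++p)
    {
        const unsigned char byte = static_cast<unsigned char>(*p);
        if ((byte & 0xC0) == 0x80)
            continue; // continuation byte: already counted with its lead byte
        // Lead bytes 11110xxx start a 4-byte sequence, i.e. a surrogate pair.
        units += (byte & 0xF8) == 0xF0 ? 2 : 1;
    }
    return units;
}
```

Counting from the search start to the beginning of the match yields the
offset that is added to indexOffset in the string branch of Exec.
*/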
================================================
FILE: lib/isolate_data.h
================================================
#pragma once
#include <nan.h>
struct AddonData {
Nan::Persistent<v8::FunctionTemplate> re2Tpl;
Nan::Persistent<v8::FunctionTemplate> re2SetTpl;
};
AddonData *getAddonData(v8::Isolate *isolate);
void setAddonData(v8::Isolate *isolate, AddonData *data);
void deleteAddonData(v8::Isolate *isolate);
================================================
FILE: lib/match.cc
================================================
#include "./wrapped_re2.h"
#include <vector>
NAN_METHOD(WrappedRE2::Match)
{
// unpack arguments
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
if (!re2)
{
info.GetReturnValue().SetNull();
return;
}
PrepareLastString prep(re2, info[0]);
StrVal& str = prep;
if (str.isBad) return; // throws an exception
if (!str.isValidIndex)
{
re2->lastIndex = 0;
info.GetReturnValue().SetNull();
return;
}
std::vector<re2::StringPiece> groups;
size_t byteIndex = 0;
auto anchor = re2::RE2::UNANCHORED;
// actual work
if (re2->global)
{
// global: collect all matches
re2::StringPiece match;
if (re2->sticky)
{
anchor = re2::RE2::ANCHOR_START;
}
while (re2->regexp.Match(str, byteIndex, str.size, anchor, &match, 1))
{
groups.push_back(match);
byteIndex = match.data() - str.data + match.size();
}
if (groups.empty())
{
info.GetReturnValue().SetNull();
return;
}
}
else
{
// non-global: just like exec()
if (re2->sticky)
{
byteIndex = str.byteIndex;
anchor = RE2::ANCHOR_START;
}
groups.resize(re2->regexp.NumberOfCapturingGroups() + 1);
if (!re2->regexp.Match(str, byteIndex, str.size, anchor, &groups[0], groups.size()))
{
if (re2->sticky)
re2->lastIndex = 0;
info.GetReturnValue().SetNull();
return;
}
}
// form a result
auto result = Nan::New<v8::Array>(), indices = Nan::New<v8::Array>();
if (str.isBuffer)
{
for (size_t i = 0, n = groups.size(); i < n; ++i)
{
const auto &item = groups[i];
const auto data = item.data();
if (data)
{
Nan::Set(result, i, Nan::CopyBuffer(data, item.size()).ToLocalChecked());
if (!re2->global && re2->hasIndices)
{
auto pair = Nan::New<v8::Array>();
auto offset = data - str.data - byteIndex;
auto length = item.size();
Nan::Set(pair, 0, Nan::New<v8::Integer>(static_cast<int>(offset)));
Nan::Set(pair, 1, Nan::New<v8::Integer>(static_cast<int>(offset + length)));
Nan::Set(indices, i, pair);
}
}
else
{
Nan::Set(result, i, Nan::Undefined());
if (!re2->global && re2->hasIndices)
Nan::Set(indices, i, Nan::Undefined());
}
}
if (!re2->global)
{
Nan::Set(result, Nan::New("index").ToLocalChecked(), Nan::New<v8::Integer>(static_cast<int>(groups[0].data() - str.data)));
Nan::Set(result, Nan::New("input").ToLocalChecked(), info[0]);
}
}
else
{
for (size_t i = 0, n = groups.size(); i < n; ++i)
{
const auto &item = groups[i];
const auto data = item.data();
if (data)
{
Nan::Set(result, i, Nan::New(data, item.size()).ToLocalChecked());
if (!re2->global && re2->hasIndices)
{
auto pair = Nan::New<v8::Array>();
auto offset = getUtf16Length(str.data + byteIndex, data);
auto length = getUtf16Length(data, data + item.size());
Nan::Set(pair, 0, Nan::New<v8::Integer>(static_cast<int>(offset)));
Nan::Set(pair, 1, Nan::New<v8::Integer>(static_cast<int>(offset + length)));
Nan::Set(indices, i, pair);
}
}
else
{
Nan::Set(result, i, Nan::Undefined());
if (!re2->global && re2->hasIndices)
{
Nan::Set(indices, i, Nan::Undefined());
}
}
}
if (!re2->global)
{
Nan::Set(result, Nan::New("index").ToLocalChecked(), Nan::New<v8::Integer>(static_cast<int>(getUtf16Length(str.data, groups[0].data()))));
Nan::Set(result, Nan::New("input").ToLocalChecked(), info[0]);
}
}
if (re2->global)
{
re2->lastIndex = 0;
}
else if (re2->sticky)
{
re2->lastIndex +=
str.isBuffer ? groups[0].data() - str.data + groups[0].size() - byteIndex : getUtf16Length(str.data + byteIndex, groups[0].data() + groups[0].size());
}
if (!re2->global)
{
const auto &groupNames = re2->regexp.CapturingGroupNames();
if (!groupNames.empty())
{
auto groups = Nan::New<v8::Object>();
Nan::SetPrototype(groups, Nan::Null());
for (auto group : groupNames)
{
auto value = Nan::Get(result, group.first);
if (!value.IsEmpty())
{
Nan::Set(groups, Nan::New(group.second).ToLocalChecked(), value.ToLocalChecked());
}
}
Nan::Set(result, Nan::New("groups").ToLocalChecked(), groups);
if (re2->hasIndices)
{
auto indexGroups = Nan::New<v8::Object>();
Nan::SetPrototype(indexGroups, Nan::Null());
for (auto group : groupNames)
{
auto value = Nan::Get(indices, group.first);
if (!value.IsEmpty())
{
Nan::Set(indexGroups, Nan::New(group.second).ToLocalChecked(), value.ToLocalChecked());
}
}
Nan::Set(indices, Nan::New("groups").ToLocalChecked(), indexGroups);
}
}
else
{
Nan::Set(result, Nan::New("groups").ToLocalChecked(), Nan::Undefined());
if (re2->hasIndices)
{
Nan::Set(indices, Nan::New("groups").ToLocalChecked(), Nan::Undefined());
}
}
if (re2->hasIndices)
{
Nan::Set(result, Nan::New("indices").ToLocalChecked(), indices);
}
}
info.GetReturnValue().Set(result);
}
================================================
FILE: lib/new.cc
================================================
#include "./wrapped_re2.h"
#include "./util.h"
#include "./pattern.h"
#include <map>
#include <memory>
#include <string>
#include <unordered_set>
#include <vector>
std::atomic<bool> WrappedRE2::alreadyWarnedAboutUnicode{false};
static const char *deprecationMessage = "BMP patterns aren't supported by node-re2. An implicit \"u\" flag is assumed by the RE2 constructor. In a future major version, calling the RE2 constructor without the \"u\" flag may become forbidden, or cause a different behavior. Please see https://github.com/uhop/node-re2/issues/21 for more information.";
inline bool ensureUniqueNamedGroups(const std::map<int, std::string> &groups)
{
std::unordered_set<std::string> names;
for (auto group : groups)
{
if (!names.insert(group.second).second)
{
return false;
}
}
return true;
}
NAN_METHOD(WrappedRE2::New)
{
if (!info.IsConstructCall())
{
// call a constructor and return the result
std::vector<v8::Local<v8::Value>> parameters(info.Length());
for (size_t i = 0, n = info.Length(); i < n; ++i)
{
parameters[i] = info[i];
}
auto isolate = v8::Isolate::GetCurrent();
auto data = getAddonData(isolate);
if (!data) return;
auto newObject = Nan::NewInstance(Nan::GetFunction(data->re2Tpl.Get(isolate)).ToLocalChecked(), parameters.size(), &parameters[0]);
if (!newObject.IsEmpty())
{
info.GetReturnValue().Set(newObject.ToLocalChecked());
}
return;
}
// process arguments
std::vector<char> buffer;
char *data = NULL;
size_t size = 0;
std::string source;
bool global = false;
bool ignoreCase = false;
bool multiline = false;
bool dotAll = false;
bool unicode = false;
bool sticky = false;
bool hasIndices = false;
auto context = Nan::GetCurrentContext();
bool needFlags = true;
if (info.Length() > 1)
{
if (info[1]->IsString())
{
auto isolate = v8::Isolate::GetCurrent();
auto t = info[1]->ToString(Nan::GetCurrentContext());
auto s = t.ToLocalChecked();
size = s->Utf8Length(isolate);
buffer.resize(size + 1);
data = &buffer[0];
s->WriteUtf8(isolate, data, buffer.size());
buffer[size] = '\0';
}
else if (node::Buffer::HasInstance(info[1]))
{
size = node::Buffer::Length(info[1]);
data = node::Buffer::Data(info[1]);
}
for (size_t i = 0; i < size; ++i)
{
switch (data[i])
{
case 'g':
global = true;
break;
case 'i':
ignoreCase = true;
break;
case 'm':
multiline = true;
break;
case 's':
dotAll = true;
break;
case 'u':
unicode = true;
break;
case 'y':
sticky = true;
break;
case 'd':
hasIndices = true;
break;
}
}
size = 0;
needFlags = false;
}
bool needConversion = true;
if (node::Buffer::HasInstance(info[0]))
{
size = node::Buffer::Length(info[0]);
data = node::Buffer::Data(info[0]);
source = escapeRegExp(data, size);
}
else if (info[0]->IsRegExp())
{
const auto *re = v8::RegExp::Cast(*info[0]);
auto isolate = v8::Isolate::GetCurrent();
auto t = re->GetSource()->ToString(Nan::GetCurrentContext());
auto s = t.ToLocalChecked();
size = s->Utf8Length(isolate);
buffer.resize(size + 1);
data = &buffer[0];
s->WriteUtf8(isolate, data, buffer.size());
buffer[size] = '\0';
source = escapeRegExp(data, size);
if (needFlags)
{
v8::RegExp::Flags flags = re->GetFlags();
global = bool(flags & v8::RegExp::kGlobal);
ignoreCase = bool(flags & v8::RegExp::kIgnoreCase);
multiline = bool(flags & v8::RegExp::kMultiline);
dotAll = bool(flags & v8::RegExp::kDotAll);
unicode = bool(flags & v8::RegExp::kUnicode);
sticky = bool(flags & v8::RegExp::kSticky);
hasIndices = bool(flags & v8::RegExp::kHasIndices);
needFlags = false;
}
}
else if (info[0]->IsObject() && !info[0]->IsString())
{
WrappedRE2 *re2 = nullptr;
auto object = info[0]->ToObject(context).ToLocalChecked();
if (!object.IsEmpty() && object->InternalFieldCount() > 0)
{
re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(object);
}
if (re2)
{
const auto &pattern = re2->regexp.pattern();
size = pattern.size();
buffer.resize(size);
data = &buffer[0];
memcpy(data, pattern.data(), size);
needConversion = false;
source = re2->source;
if (needFlags)
{
global = re2->global;
ignoreCase = re2->ignoreCase;
multiline = re2->multiline;
dotAll = re2->dotAll;
unicode = true;
sticky = re2->sticky;
hasIndices = re2->hasIndices;
needFlags = false;
}
}
}
else if (info[0]->IsString())
{
auto isolate = v8::Isolate::GetCurrent();
auto t = info[0]->ToString(Nan::GetCurrentContext());
auto s = t.ToLocalChecked();
size = s->Utf8Length(isolate);
buffer.resize(size + 1);
data = &buffer[0];
s->WriteUtf8(isolate, data, buffer.size());
buffer[size] = '\0';
source = escapeRegExp(data, size);
}
if (!data)
{
return Nan::ThrowTypeError("Expected string, Buffer, RegExp, or RE2 as the 1st argument.");
}
if (!unicode)
{
switch (unicodeWarningLevel)
{
case THROW:
return Nan::ThrowSyntaxError(deprecationMessage);
case WARN:
printDeprecationWarning(deprecationMessage);
break;
case WARN_ONCE:
if (!alreadyWarnedAboutUnicode)
{
printDeprecationWarning(deprecationMessage);
alreadyWarnedAboutUnicode = true;
}
break;
default:
break;
}
}
if (needConversion && translateRegExp(data, size, multiline, buffer))
{
size = buffer.size() - 1;
data = &buffer[0];
}
// create and return an object
re2::RE2::Options options;
options.set_case_sensitive(!ignoreCase);
options.set_one_line(!multiline); // to track this state, otherwise it is ignored
options.set_dot_nl(dotAll);
options.set_log_errors(false); // inappropriate when embedding
std::unique_ptr<WrappedRE2> re2(new WrappedRE2(re2::StringPiece(data, size), options, source, global, ignoreCase, multiline, dotAll, sticky, hasIndices));
if (!re2->regexp.ok())
{
return Nan::ThrowSyntaxError(re2->regexp.error().c_str());
}
if (!ensureUniqueNamedGroups(re2->regexp.CapturingGroupNames()))
{
return Nan::ThrowSyntaxError("duplicate capture group name");
}
re2->Wrap(info.This());
re2.release();
info.GetReturnValue().Set(info.This());
}
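/*
Illustrative sketch, not part of the addon: the constructor above scans the
flags argument one byte at a time and switches a boolean on for every known
letter, skipping anything else. The standalone version below isolates that
step. `ParsedFlags` and `parseFlags` are hypothetical names used only here.

```cpp
#include <cassert>
#include <string>

// Hypothetical container for the seven flags recognized by the constructor.
struct ParsedFlags
{
    bool global = false;
    bool ignoreCase = false;
    bool multiline = false;
    bool dotAll = false;
    bool unicode = false;
    bool sticky = false;
    bool hasIndices = false;
};

// Turns a flag string such as "gi" into booleans; unknown letters are skipped.
static ParsedFlags parseFlags(const std::string &text)
{
    ParsedFlags flags;
    for (char ch : text)
    {
        switch (ch)
        {
        case 'g': flags.global = true; break;
        case 'i': flags.ignoreCase = true; break;
        case 'm': flags.multiline = true; break;
        case 's': flags.dotAll = true; break;
        case 'u': flags.unicode = true; break;
        case 'y': flags.sticky = true; break;
        case 'd': flags.hasIndices = true; break;
        default: break; // no error for unrecognized letters, as in the loop above
        }
    }
    return flags;
}
```

Repeated letters are harmless in this scheme because each case only ever sets
its flag to true.
*/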
================================================
FILE: lib/pattern.cc
================================================
#include "./pattern.h"
#include "./wrapped_re2.h"
#include <cstring>
#include <map>
#include <string>
static char hex[] = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'};
inline bool isUpperCaseAlpha(char ch)
{
return 'A' <= ch && ch <= 'Z';
}
inline bool isHexadecimal(char ch)
{
return ('0' <= ch && ch <= '9') || ('A' <= ch && ch <= 'F') || ('a' <= ch && ch <= 'f');
}
static std::map<std::string, std::string> unicodeClasses = {
{"Uppercase_Letter", "Lu"},
{"Lowercase_Letter", "Ll"},
{"Titlecase_Letter", "Lt"},
{"Cased_Letter", "LC"},
{"Modifier_Letter", "Lm"},
{"Other_Letter", "Lo"},
{"Letter", "L"},
{"Nonspacing_Mark", "Mn"},
{"Spacing_Mark", "Mc"},
{"Enclosing_Mark", "Me"},
{"Mark", "M"},
{"Decimal_Number", "Nd"},
{"Letter_Number", "Nl"},
{"Other_Number", "No"},
{"Number", "N"},
{"Connector_Punctuation", "Pc"},
{"Dash_Punctuation", "Pd"},
{"Open_Punctuation", "Ps"},
{"Close_Punctuation", "Pe"},
{"Initial_Punctuation", "Pi"},
{"Final_Punctuation", "Pf"},
{"Other_Punctuation", "Po"},
{"Punctuation", "P"},
{"Math_Symbol", "Sm"},
{"Currency_Symbol", "Sc"},
{"Modifier_Symbol", "Sk"},
{"Other_Symbol", "So"},
{"Symbol", "S"},
{"Space_Separator", "Zs"},
{"Line_Separator", "Zl"},
{"Paragraph_Separator", "Zp"},
{"Separator", "Z"},
{"Control", "Cc"},
{"Format", "Cf"},
{"Surrogate", "Cs"},
{"Private_Use", "Co"},
{"Unassigned", "Cn"},
{"Other", "C"},
};
bool translateRegExp(const char *data, size_t size, bool multiline, std::vector<char> &buffer)
{
std::string result;
bool changed = false;
if (!size)
{
result = "(?:)";
changed = true;
}
else if (multiline)
{
result = "(?m)";
changed = true;
}
for (size_t i = 0; i < size;)
{
char ch = data[i];
if (ch == '\\')
{
if (i + 1 < size)
{
ch = data[i + 1];
switch (ch)
{
case '\\':
result += "\\\\";
i += 2;
continue;
case 'c':
if (i + 2 < size)
{
ch = data[i + 2];
if (isUpperCaseAlpha(ch))
{
result += "\\x";
result += hex[((ch - '@') / 16) & 15];
result += hex[(ch - '@') & 15];
i += 3;
changed = true;
continue;
}
}
result += "\\c";
i += 2;
continue;
case 'u':
if (i + 2 < size)
{
ch = data[i + 2];
if (isHexadecimal(ch))
{
result += "\\x{";
result += ch;
i += 3;
for (size_t j = 0; j < 3 && i < size; ++i, ++j)
{
ch = data[i];
if (!isHexadecimal(ch))
{
break;
}
result += ch;
}
result += '}';
changed = true;
continue;
}
else if (ch == '{')
{
result += "\\x";
i += 2;
changed = true;
continue;
}
}
result += "\\u";
i += 2;
continue;
case 'p':
case 'P':
if (i + 2 < size) {
if (data[i + 2] == '{') {
size_t j = i + 3;
while (j < size && data[j] != '}') ++j;
if (j < size) {
result += "\\";
result += data[i + 1];
std::string name(data + i + 3, j - i - 3);
if (unicodeClasses.find(name) != unicodeClasses.end()) {
name = unicodeClasses[name];
} else if (name.size() > 7 && !strncmp(name.c_str(), "Script=", 7)) {
name = name.substr(7);
} else if (name.size() > 3 && !strncmp(name.c_str(), "sc=", 3)) {
name = name.substr(3);
}
if (name.size() == 1) {
result += name;
} else {
result += "{";
result += name;
result += "}";
}
i = j + 1;
changed = true;
continue;
}
}
}
result += "\\";
result += data[i + 1];
i += 2;
continue;
default:
result += "\\";
size_t sym_size = getUtf8CharSize(ch);
result.append(data + i + 1, sym_size);
i += sym_size + 1;
continue;
}
}
}
else if (ch == '/')
{
result += "\\/";
i += 1;
changed = true;
continue;
}
else if (ch == '(' && i + 2 < size && data[i + 1] == '?' && data[i + 2] == '<')
{
if (i + 3 >= size || (data[i + 3] != '=' && data[i + 3] != '!'))
{
result += "(?P<";
i += 3;
changed = true;
continue;
}
}
size_t sym_size = getUtf8CharSize(ch);
result.append(data + i, sym_size);
i += sym_size;
}
if (!changed)
{
return false;
}
buffer.resize(0);
buffer.insert(buffer.end(), result.data(), result.data() + result.size());
buffer.push_back('\0');
return true;
}
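/*
Illustrative sketch, not part of the addon: translateRegExp above rewrites a
JavaScript control escape (a backslash, the letter c, and an uppercase letter)
into a two-digit hexadecimal escape, computing the code as the distance of the
letter from '@'. The helper below isolates that arithmetic.
`controlEscapeToHex` is a hypothetical name, and the caller is assumed to pass
an uppercase ASCII letter.

```cpp
#include <cassert>
#include <string>

// Builds the hexadecimal escape for the control character named by `letter`.
static std::string controlEscapeToHex(char letter)
{
    static const char digits[] = "0123456789ABCDEF";
    const int code = letter - '@'; // 'A' maps to 1, 'Z' maps to 26
    std::string out = "\\x";
    out += digits[(code / 16) & 15]; // high nibble
    out += digits[code & 15];        // low nibble
    return out;
}
```

For example, the letter J (line feed) has code 10, so the helper returns a
backslash followed by x0A.
*/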
std::string escapeRegExp(const char *data, size_t size)
{
std::string result;
if (!size)
{
result = "(?:)";
}
size_t prevBackSlashes = 0;
for (size_t i = 0; i < size;)
{
char ch = data[i];
if (ch == '\\')
{
++prevBackSlashes;
}
else if (ch == '/' && !(prevBackSlashes & 1))
{
result += "\\/";
i += 1;
prevBackSlashes = 0;
continue;
}
else
{
prevBackSlashes = 0;
}
size_t sym_size = getUtf8CharSize(ch);
result.append(data + i, sym_size);
i += sym_size;
}
return result;
}
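/*
Illustrative sketch, not part of the addon: escapeRegExp above decides whether
a slash needs a backslash by counting the backslashes directly in front of it.
An even count means the slash is bare, an odd count means it is already
escaped. The simplified version below (`escapeSlashes` is a hypothetical name)
works byte by byte, which is safe for UTF-8 because bytes of multi-byte
sequences never equal a slash or a backslash.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Escapes every bare slash so the result can be shown between slashes.
static std::string escapeSlashes(const std::string &pattern)
{
    if (pattern.empty())
        return "(?:)"; // same placeholder the original uses for an empty source

    std::string out;
    std::size_t backslashes = 0; // length of the current run of backslashes
    for (char ch : pattern)
    {
        if (ch == '\\')
        {
            ++backslashes;
            out += ch;
            continue;
        }
        // An odd run of backslashes means the slash is already escaped.
        if (ch == '/' && backslashes % 2 == 0)
            out += '\\';
        backslashes = 0;
        out += ch;
    }
    return out;
}
```

Tracking only the parity of the run is enough, since each pair of backslashes
stands for one literal backslash.
*/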
================================================
FILE: lib/pattern.h
================================================
#pragma once
#include <string>
#include <vector>
// Shared helpers for translating JavaScript-style regular expressions
// into RE2-compatible patterns.
bool translateRegExp(const char *data, size_t size, bool multiline, std::vector<char> &buffer);
std::string escapeRegExp(const char *data, size_t size);
================================================
FILE: lib/replace.cc
================================================
#include "./wrapped_re2.h"
#include <algorithm>
#include <memory>
#include <string>
#include <vector>
inline int getMaxSubmatch(
const char *data,
size_t size,
const std::map<std::string, int> &namedGroups)
{
int maxSubmatch = 0, index, index2;
const char *nameBegin;
const char *nameEnd;
for (size_t i = 0; i < size;)
{
char ch = data[i];
if (ch == '$')
{
if (i + 1 < size)
{
ch = data[i + 1];
switch (ch)
{
case '$':
case '&':
case '`':
case '\'':
i += 2;
continue;
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
index = ch - '0';
if (i + 2 < size)
{
ch = data[i + 2];
if ('0' <= ch && ch <= '9')
{
index2 = index * 10 + (ch - '0');
if (maxSubmatch < index2)
maxSubmatch = index2;
i += 3;
continue;
}
}
if (maxSubmatch < index)
maxSubmatch = index;
i += 2;
continue;
case '<':
nameBegin = data + i + 2;
nameEnd = (const char *)memchr(nameBegin, '>', size - i - 2);
if (nameEnd)
{
std::string name(nameBegin, nameEnd - nameBegin);
auto group = namedGroups.find(name);
if (group != namedGroups.end())
{
index = group->second;
if (maxSubmatch < index)
maxSubmatch = index;
}
i = nameEnd + 1 - data;
}
else
{
i += 2;
}
continue;
}
}
++i;
continue;
}
i += getUtf8CharSize(ch);
}
return maxSubmatch;
}
inline std::string replace(
const char *data,
size_t size,
const std::vector<re2::StringPiece> &groups,
const re2::StringPiece &str,
const std::map<std::string, int> &namedGroups)
{
std::string result;
size_t index, index2;
const char *nameBegin;
const char *nameEnd;
for (size_t i = 0; i < size;)
{
char ch = data[i];
if (ch == '$')
{
if (i + 1 < size)
{
ch = data[i + 1];
switch (ch)
{
case '$':
result += ch;
i += 2;
continue;
case '&':
result += (std::string)groups[0];
i += 2;
continue;
case '`':
result += std::string(str.data(), groups[0].data() - str.data());
i += 2;
continue;
case '\'':
result += std::string(groups[0].data() + groups[0].size(),
str.data() + str.size() - groups[0].data() - groups[0].size());
i += 2;
continue;
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
index = ch - '0';
if (i + 2 < size)
{
ch = data[i + 2];
if ('0' <= ch && ch <= '9')
{
i += 3;
index2 = index * 10 + (ch - '0');
if (index2 && index2 < groups.size())
{
result += (std::string)groups[index2];
continue;
}
else if (index && index < groups.size())
{
result += (std::string)groups[index];
result += ch;
continue;
}
result += '$';
result += '0' + index;
result += ch;
continue;
}
ch = '0' + index;
}
i += 2;
if (index && index < groups.size())
{
result += (std::string)groups[index];
continue;
}
result += '$';
result += ch;
continue;
case '<':
if (!namedGroups.empty())
{
nameBegin = data + i + 2;
nameEnd = (const char *)memchr(nameBegin, '>', size - i - 2);
if (nameEnd)
{
std::string name(nameBegin, nameEnd - nameBegin);
auto group = namedGroups.find(name);
if (group != namedGroups.end())
{
index = group->second;
result += (std::string)groups[index];
}
i = nameEnd + 1 - data;
}
else
{
result += "$<";
i += 2;
}
}
else
{
result += "$<";
i += 2;
}
continue;
}
}
result += '$';
++i;
continue;
}
size_t sym_size = getUtf8CharSize(ch);
result.append(data + i, sym_size);
i += sym_size;
}
return result;
}
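/*
Illustrative sketch, not part of the addon: the function above expands
replacement patterns such as a dollar sign followed by a group number. The
reduced version below covers only three cases: an escaped dollar sign, the
whole match, and single-digit group references. Two-digit references, named
groups, and the prefix and suffix forms handled by the original are left out.
`expandTemplate` is a hypothetical name, and `groups` is assumed to be
non-empty with the whole match at position 0.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Expands "$$", "$&" and "$1".."$9" in `tpl`; anything else is copied as is.
static std::string expandTemplate(const std::string &tpl, const std::vector<std::string> &groups)
{
    std::string out;
    for (std::size_t i = 0; i < tpl.size(); ++i)
    {
        const char ch = tpl[i];
        if (ch != '$' || i + 1 == tpl.size())
        {
            out += ch; // ordinary character, or a dollar sign at the very end
            continue;
        }
        const char next = tpl[i + 1];
        if (next == '$')
        {
            out += '$'; // escaped dollar sign
            ++i;
        }
        else if (next == '&')
        {
            out += groups[0]; // whole match
            ++i;
        }
        else if (next >= '1' && next <= '9' && static_cast<std::size_t>(next - '0') < groups.size())
        {
            out += groups[next - '0']; // existing capture group
            ++i;
        }
        else
        {
            out += '$'; // unknown reference: keep the text unchanged
        }
    }
    return out;
}
```

References to groups that do not exist are kept literally, which matches how
the original falls back to emitting the dollar sign and the digit.
*/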
static Nan::Maybe<std::string> replace(
WrappedRE2 *re2,
const StrVal &replacee,
const char *replacer,
size_t replacer_size)
{
const re2::StringPiece str = replacee;
const char *data = str.data();
size_t size = str.size();
const auto &namedGroups = re2->regexp.NamedCapturingGroups();
std::vector<re2::StringPiece> groups(std::min(re2->regexp.NumberOfCapturingGroups(), getMaxSubmatch(replacer, replacer_size, namedGroups)) + 1);
const auto &match = groups[0];
size_t byteIndex = 0;
std::string result;
auto anchor = re2::RE2::UNANCHORED;
if (re2->sticky)
{
if (!re2->global)
byteIndex = replacee.byteIndex;
anchor = re2::RE2::ANCHOR_START;
}
if (byteIndex)
{
result = std::string(data, byteIndex);
}
bool noMatch = true;
while (byteIndex <= size && re2->regexp.Match(str, byteIndex, size, anchor, &groups[0], groups.size()))
{
noMatch = false;
auto offset = match.data() - data;
if (!re2->global && re2->sticky)
{
re2->lastIndex +=
replacee.isBuffer ? offset + match.size() - byteIndex : getUtf16Length(data + byteIndex, match.data() + match.size());
}
if (match.data() == data || offset > static_cast<long>(byteIndex))
{
result += std::string(data + byteIndex, offset - byteIndex);
}
result += replace(replacer, replacer_size, groups, str, namedGroups);
if (match.size())
{
byteIndex = offset + match.size();
}
else if ((size_t)offset < size)
{
auto sym_size = getUtf8CharSize(data[offset]);
result.append(data + offset, sym_size);
byteIndex = offset + sym_size;
}
else
{
byteIndex = size;
break;
}
if (!re2->global)
{
break;
}
}
if (byteIndex < size)
{
result += std::string(data + byteIndex, size - byteIndex);
}
if (re2->global)
{
re2->lastIndex = 0;
}
else if (re2->sticky)
{
if (noMatch)
re2->lastIndex = 0;
}
return Nan::Just(result);
}
inline Nan::Maybe<std::string> replace(
const Nan::Callback *replacer,
const std::vector<re2::StringPiece> &groups,
const re2::StringPiece &str,
const v8::Local<v8::Value> &input,
bool useBuffers,
const std::map<std::string, int> &namedGroups)
{
std::vector<v8::Local<v8::Value>> argv;
auto context = Nan::GetCurrentContext();
if (useBuffers)
{
for (size_t i = 0, n = groups.size(); i < n; ++i)
{
const auto &item = groups[i];
const auto data = item.data();
if (data)
{
argv.push_back(Nan::CopyBuffer(data, item.size()).ToLocalChecked());
}
else
{
argv.push_back(Nan::Undefined());
}
}
argv.push_back(Nan::New(static_cast<int>(groups[0].data() - str.data())));
}
else
{
for (size_t i = 0, n = groups.size(); i < n; ++i)
{
const auto &item = groups[i];
const auto data = item.data();
if (data)
{
argv.push_back(Nan::New(data, item.size()).ToLocalChecked());
}
else
{
argv.push_back(Nan::Undefined());
}
}
argv.push_back(Nan::New(static_cast<int>(getUtf16Length(str.data(), groups[0].data()))));
}
argv.push_back(input);
if (!namedGroups.empty())
{
auto groups = Nan::New<v8::Object>();
Nan::SetPrototype(groups, Nan::Null());
for (std::pair<std::string, int> group : namedGroups)
{
Nan::Set(groups, Nan::New(group.first).ToLocalChecked(), argv[group.second]);
}
argv.push_back(groups);
}
auto maybeResult = Nan::CallAsFunction(replacer->GetFunction(), context->Global(), static_cast<int>(argv.size()), &argv[0]);
if (maybeResult.IsEmpty())
{
return Nan::Nothing<std::string>();
}
auto result = maybeResult.ToLocalChecked();
if (node::Buffer::HasInstance(result))
{
return Nan::Just(std::string(node::Buffer::Data(result), node::Buffer::Length(result)));
}
auto t = result->ToString(Nan::GetCurrentContext());
if (t.IsEmpty())
{
return Nan::Nothing<std::string>();
}
v8::String::Utf8Value s(v8::Isolate::GetCurrent(), t.ToLocalChecked());
return Nan::Just(std::string(*s));
}
static Nan::Maybe<std::string> replace(
WrappedRE2 *re2,
const StrVal &replacee,
const Nan::Callback *replacer,
const v8::Local<v8::Value> &input,
bool useBuffers)
{
const re2::StringPiece str = replacee;
const char *data = str.data();
size_t size = str.size();
std::vector<re2::StringPiece> groups(re2->regexp.NumberOfCapturingGroups() + 1);
const auto &match = groups[0];
size_t byteIndex = 0;
std::string result;
auto anchor = re2::RE2::UNANCHORED;
if (re2->sticky)
{
if (!re2->global)
byteIndex = replacee.byteIndex;
anchor = RE2::ANCHOR_START;
}
if (byteIndex)
{
result = std::string(data, byteIndex);
}
const auto &namedGroups = re2->regexp.NamedCapturingGroups();
bool noMatch = true;
while (byteIndex <= size && re2->regexp.Match(str, byteIndex, size, anchor, &groups[0], groups.size()))
{
noMatch = false;
auto offset = match.data() - data;
if (!re2->global && re2->sticky)
{
re2->lastIndex += replacee.isBuffer ? offset + match.size() - byteIndex : getUtf16Length(data + byteIndex, match.data() + match.size());
}
if (match.data() == data || offset > static_cast<long>(byteIndex))
{
result += std::string(data + byteIndex, offset - byteIndex);
}
const auto part = replace(replacer, groups, str, input, useBuffers, namedGroups);
if (part.IsNothing())
{
return part;
}
result += part.FromJust();
if (match.size())
{
byteIndex = offset + match.size();
}
else if ((size_t)offset < size)
{
auto sym_size = getUtf8CharSize(data[offset]);
result.append(data + offset, sym_size);
byteIndex = offset + sym_size;
}
else
{
byteIndex = size;
break;
}
if (!re2->global)
{
break;
}
}
if (byteIndex < size)
{
result += std::string(data + byteIndex, size - byteIndex);
}
if (re2->global)
{
re2->lastIndex = 0;
}
else if (re2->sticky)
{
if (noMatch)
{
re2->lastIndex = 0;
}
}
return Nan::Just(result);
}
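// Reads the replacer function's `useBuffers` property and coerces it to a boolean:
// undefined, null, false, 0, and the empty string mean false; anything else means true.
static bool requiresBuffers(const v8::Local<v8::Function> &f)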
{
auto flag(Nan::Get(f, Nan::New("useBuffers").ToLocalChecked()).ToLocalChecked());
if (flag->IsUndefined() || flag->IsNull() || flag->IsFalse())
{
return false;
}
if (flag->IsNumber())
{
return flag->NumberValue(Nan::GetCurrentContext()).FromMaybe(0) != 0;
}
if (flag->IsString())
{
return flag->ToString(Nan::GetCurrentContext()).ToLocalChecked()->Length() > 0;
}
return true;
}
NAN_METHOD(WrappedRE2::Replace)
{
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
if (!re2)
{
info.GetReturnValue().Set(info[0]);
return;
}
PrepareLastString prep(re2, info[0]);
StrVal& replacee = prep;
if (replacee.isBad) return; // throws an exception
if (!replacee.isValidIndex)
{
info.GetReturnValue().Set(info[0]);
return;
}
std::string result;
if (info[1]->IsFunction())
{
auto fun = info[1].As<v8::Function>();
const std::unique_ptr<const Nan::Callback> cb(new Nan::Callback(fun));
const auto replaced = replace(re2, replacee, cb.get(), info[0], requiresBuffers(fun));
if (replaced.IsNothing())
{
info.GetReturnValue().Set(info[0]);
return;
}
result = replaced.FromJust();
}
else
{
v8::Local<v8::Object> replacer;
if (node::Buffer::HasInstance(info[1]))
{
replacer = info[1].As<v8::Object>();
}
else
{
auto t = info[1]->ToString(Nan::GetCurrentContext());
if (t.IsEmpty())
return; // throws an exception
replacer = node::Buffer::New(v8::Isolate::GetCurrent(), t.ToLocalChecked()).ToLocalChecked();
}
auto data = node::Buffer::Data(replacer);
auto size = node::Buffer::Length(replacer);
const auto replaced = replace(re2, replacee, data, size);
if (replaced.IsNothing())
{
info.GetReturnValue().Set(info[0]);
return;
}
result = replaced.FromJust();
}
if (replacee.isBuffer)
{
info.GetReturnValue().Set(Nan::CopyBuffer(result.data(), result.size()).ToLocalChecked());
return;
}
info.GetReturnValue().Set(Nan::New(result).ToLocalChecked());
}
================================================
FILE: lib/search.cc
================================================
#include "./wrapped_re2.h"
NAN_METHOD(WrappedRE2::Search)
{
// unpack arguments
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
if (!re2)
{
info.GetReturnValue().Set(-1);
return;
}
PrepareLastString prep(re2, info[0]);
StrVal& str = prep;
if (str.isBad) return; // throws an exception
if (!str.data)
return;
// actual work
re2::StringPiece match;
if (re2->regexp.Match(str, 0, str.size, re2->sticky ? re2::RE2::ANCHOR_START : re2::RE2::UNANCHORED, &match, 1))
{
info.GetReturnValue().Set(static_cast<int>(str.isBuffer ? match.data() - str.data : getUtf16Length(str.data, match.data())));
return;
}
info.GetReturnValue().Set(-1);
}
================================================
FILE: lib/set.cc
================================================
#include "./wrapped_re2_set.h"
#include "./pattern.h"
#include "./util.h"
#include "./wrapped_re2.h"
#include <algorithm>
#include <memory>
#include <string>
#include <vector>
struct SetFlags
{
bool global = false;
bool ignoreCase = false;
bool multiline = false;
bool dotAll = false;
bool unicode = false;
bool sticky = false;
bool hasIndices = false;
};
static bool parseFlags(const v8::Local<v8::Value> &arg, SetFlags &flags)
{
const char *data = nullptr;
size_t size = 0;
std::vector<char> buffer;
if (arg->IsString())
{
auto isolate = v8::Isolate::GetCurrent();
auto t = arg->ToString(Nan::GetCurrentContext());
if (t.IsEmpty())
{
return false;
}
auto s = t.ToLocalChecked();
size = s->Utf8Length(isolate);
buffer.resize(size + 1);
s->WriteUtf8(isolate, &buffer[0], buffer.size());
buffer[buffer.size() - 1] = '\0';
data = &buffer[0];
}
else if (node::Buffer::HasInstance(arg))
{
size = node::Buffer::Length(arg);
data = node::Buffer::Data(arg);
}
else
{
return false;
}
for (size_t i = 0; i < size; ++i)
{
switch (data[i])
{
case 'd':
flags.hasIndices = true;
break;
case 'g':
flags.global = true;
break;
case 'i':
flags.ignoreCase = true;
break;
case 'm':
flags.multiline = true;
break;
case 's':
flags.dotAll = true;
break;
case 'u':
flags.unicode = true;
break;
case 'y':
flags.sticky = true;
break;
default:
return false;
}
}
return true;
}
static bool sameEffectiveOptions(const SetFlags &a, const SetFlags &b)
{
return a.ignoreCase == b.ignoreCase && a.multiline == b.multiline && a.dotAll == b.dotAll && a.unicode == b.unicode;
}
static std::string flagsToString(const SetFlags &flags)
{
std::string result;
if (flags.hasIndices)
{
result += 'd';
}
if (flags.global)
{
result += 'g';
}
if (flags.ignoreCase)
{
result += 'i';
}
if (flags.multiline)
{
result += 'm';
}
if (flags.dotAll)
{
result += 's';
}
result += 'u';
if (flags.sticky)
{
result += 'y';
}
return result;
}
static bool collectIterable(const v8::Local<v8::Value> &input, std::vector<v8::Local<v8::Value>> &items)
{
auto context = Nan::GetCurrentContext();
auto isolate = v8::Isolate::GetCurrent();
if (input->IsArray())
{
auto array = v8::Local<v8::Array>::Cast(input);
auto length = array->Length();
items.reserve(length);
for (uint32_t i = 0; i < length; ++i)
{
auto maybe = Nan::Get(array, i);
if (maybe.IsEmpty())
{
return false;
}
items.push_back(maybe.ToLocalChecked());
}
return true;
}
auto maybeObject = input->ToObject(context);
if (maybeObject.IsEmpty())
{
return false;
}
auto object = maybeObject.ToLocalChecked();
auto maybeIteratorFn = object->Get(context, v8::Symbol::GetIterator(isolate));
if (maybeIteratorFn.IsEmpty())
{
return false;
}
auto iteratorFn = maybeIteratorFn.ToLocalChecked();
if (!iteratorFn->IsFunction())
{
return false;
}
auto maybeIterator = iteratorFn.As<v8::Function>()->Call(context, object, 0, nullptr);
if (maybeIterator.IsEmpty())
{
return false;
}
auto iterator = maybeIterator.ToLocalChecked();
if (!iterator->IsObject())
{
return false;
}
auto nextKey = Nan::New("next").ToLocalChecked();
auto valueKey = Nan::New("value").ToLocalChecked();
auto doneKey = Nan::New("done").ToLocalChecked();
for (;;)
{
auto maybeNext = Nan::Get(iterator.As<v8::Object>(), nextKey);
if (maybeNext.IsEmpty())
{
return false;
}
auto next = maybeNext.ToLocalChecked();
if (!next->IsFunction())
{
return false;
}
auto maybeResult = next.As<v8::Function>()->Call(context, iterator, 0, nullptr);
if (maybeResult.IsEmpty())
{
return false;
}
auto result = maybeResult.ToLocalChecked();
if (!result->IsObject())
{
return false;
}
auto resultObj = result->ToObject(context).ToLocalChecked();
auto maybeDone = Nan::Get(resultObj, doneKey);
if (maybeDone.IsEmpty())
{
return false;
}
if (maybeDone.ToLocalChecked()->BooleanValue(isolate))
{
break;
}
auto maybeValue = Nan::Get(resultObj, valueKey);
if (maybeValue.IsEmpty())
{
return false;
}
items.push_back(maybeValue.ToLocalChecked());
}
return true;
}
static bool parseAnchor(const v8::Local<v8::Value> &arg, re2::RE2::Anchor &anchor)
{
if (arg.IsEmpty() || arg->IsUndefined() || arg->IsNull())
{
anchor = re2::RE2::UNANCHORED;
return true;
}
v8::Local<v8::Value> value = arg;
if (arg->IsObject() && !arg->IsString())
{
auto context = Nan::GetCurrentContext();
auto object = arg->ToObject(context).ToLocalChecked();
auto maybeAnchor = Nan::Get(object, Nan::New("anchor").ToLocalChecked());
if (maybeAnchor.IsEmpty())
{
return false;
}
value = maybeAnchor.ToLocalChecked();
if (value->IsUndefined() || value->IsNull())
{
anchor = re2::RE2::UNANCHORED;
return true;
}
}
if (!value->IsString())
{
return false;
}
Nan::Utf8String val(value);
std::string text(*val, val.length());
if (text == "unanchored")
{
anchor = re2::RE2::UNANCHORED;
return true;
}
if (text == "start")
{
anchor = re2::RE2::ANCHOR_START;
return true;
}
if (text == "both")
{
anchor = re2::RE2::ANCHOR_BOTH;
return true;
}
return false;
}
static bool fillInput(const v8::Local<v8::Value> &arg, StrVal &str, v8::Local<v8::Object> &keepAlive)
{
if (node::Buffer::HasInstance(arg))
{
auto size = node::Buffer::Length(arg);
str.reset(arg, size, size, 0, true);
return true;
}
auto context = Nan::GetCurrentContext();
auto isolate = v8::Isolate::GetCurrent();
auto t = arg->ToString(context);
if (t.IsEmpty())
{
return false;
}
auto s = t.ToLocalChecked();
auto utf8Length = s->Utf8Length(isolate);
auto buffer = node::Buffer::New(isolate, s).ToLocalChecked();
keepAlive = buffer;
str.reset(buffer, node::Buffer::Length(buffer), utf8Length, 0);
return true;
}
static std::string anchorToString(re2::RE2::Anchor anchor)
{
switch (anchor)
{
case re2::RE2::ANCHOR_BOTH:
return "both";
case re2::RE2::ANCHOR_START:
return "start";
default:
return "unanchored";
}
}
static std::string makeCombinedSource(const std::vector<std::string> &sources)
{
if (sources.empty())
{
return "(?:)";
}
std::string combined;
for (size_t i = 0, n = sources.size(); i < n; ++i)
{
if (i)
{
combined += '|';
}
combined += sources[i];
}
return combined;
}
static const char setDeprecationMessage[] = "BMP patterns aren't supported by node-re2. An implicit \"u\" flag is assumed by RE2.Set. In a future major version, calling RE2.Set without the \"u\" flag may become forbidden, or cause a different behavior. Please see https://github.com/uhop/node-re2/issues/21 for more information.";
NAN_METHOD(WrappedRE2Set::New)
{
auto context = Nan::GetCurrentContext();
auto isolate = context->GetIsolate();
if (!info.IsConstructCall())
{
std::vector<v8::Local<v8::Value>> parameters(info.Length());
for (size_t i = 0, n = info.Length(); i < n; ++i)
{
parameters[i] = info[i];
}
auto addonData = getAddonData(isolate);
if (!addonData) return;
auto maybeNew = Nan::NewInstance(Nan::GetFunction(addonData->re2SetTpl.Get(isolate)).ToLocalChecked(), parameters.size(), &parameters[0]);
if (!maybeNew.IsEmpty())
{
info.GetReturnValue().Set(maybeNew.ToLocalChecked());
}
return;
}
if (!info.Length())
{
return Nan::ThrowTypeError("Expected an iterable of patterns as the 1st argument.");
}
SetFlags flags;
bool haveFlags = false;
bool flagsFromArg = false;
v8::Local<v8::Value> flagsArg;
v8::Local<v8::Value> optionsArg;
if (info.Length() > 1)
{
if (info[1]->IsObject() && !info[1]->IsString() && !node::Buffer::HasInstance(info[1]))
{
optionsArg = info[1];
}
else
{
flagsArg = info[1];
if (info.Length() > 2)
{
optionsArg = info[2];
}
}
}
if (!flagsArg.IsEmpty())
{
if (!parseFlags(flagsArg, flags))
{
return Nan::ThrowTypeError("Invalid flags for RE2.Set.");
}
haveFlags = true;
flagsFromArg = true;
}
re2::RE2::Anchor anchor = re2::RE2::UNANCHORED;
if (!optionsArg.IsEmpty())
{
if (!parseAnchor(optionsArg, anchor))
{
return Nan::ThrowTypeError("Invalid anchor option for RE2.Set.");
}
}
std::vector<v8::Local<v8::Value>> patterns;
if (!collectIterable(info[0], patterns))
{
return Nan::ThrowTypeError("Expected an iterable of patterns as the 1st argument.");
}
auto mergeFlags = [&](const SetFlags &candidate) {
if (flagsFromArg)
{
return true;
}
if (!haveFlags)
{
flags = candidate;
haveFlags = true;
return true;
}
return sameEffectiveOptions(flags, candidate);
};
for (auto &value : patterns)
{
SetFlags patternFlags;
bool hasFlagsForPattern = false;
if (value->IsRegExp())
{
const auto *re = v8::RegExp::Cast(*value);
v8::RegExp::Flags reFlags = re->GetFlags();
patternFlags.global = bool(reFlags & v8::RegExp::kGlobal);
patternFlags.ignoreCase = bool(reFlags & v8::RegExp::kIgnoreCase);
patternFlags.multiline = bool(reFlags & v8::RegExp::kMultiline);
patternFlags.dotAll = bool(reFlags & v8::RegExp::kDotAll);
patternFlags.unicode = bool(reFlags & v8::RegExp::kUnicode);
patternFlags.sticky = bool(reFlags & v8::RegExp::kSticky);
patternFlags.hasIndices = bool(reFlags & v8::RegExp::kHasIndices);
hasFlagsForPattern = true;
}
else if (value->IsObject())
{
auto maybeObj = value->ToObject(context);
if (!maybeObj.IsEmpty())
{
auto obj = maybeObj.ToLocalChecked();
if (WrappedRE2::HasInstance(obj))
{
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(obj);
patternFlags.global = re2->global;
patternFlags.ignoreCase = re2->ignoreCase;
patternFlags.multiline = re2->multiline;
patternFlags.dotAll = re2->dotAll;
patternFlags.unicode = true;
patternFlags.sticky = re2->sticky;
patternFlags.hasIndices = re2->hasIndices;
hasFlagsForPattern = true;
}
}
}
if (hasFlagsForPattern && !mergeFlags(patternFlags))
{
return Nan::ThrowTypeError("All patterns in RE2.Set must use the same flags.");
}
}
if (!flags.unicode)
{
switch (WrappedRE2::unicodeWarningLevel)
{
case WrappedRE2::THROW:
return Nan::ThrowSyntaxError(setDeprecationMessage);
case WrappedRE2::WARN:
printDeprecationWarning(setDeprecationMessage);
break;
case WrappedRE2::WARN_ONCE:
if (!WrappedRE2::alreadyWarnedAboutUnicode)
{
printDeprecationWarning(setDeprecationMessage);
WrappedRE2::alreadyWarnedAboutUnicode = true;
}
break;
default:
break;
}
}
re2::RE2::Options options;
options.set_case_sensitive(!flags.ignoreCase);
options.set_one_line(!flags.multiline);
options.set_dot_nl(flags.dotAll);
options.set_log_errors(false);
std::unique_ptr<WrappedRE2Set> set(new WrappedRE2Set(options, anchor, flagsToString(flags)));
std::vector<char> buffer;
for (auto &value : patterns)
{
const char *data = nullptr;
size_t size = 0;
std::string source;
if (node::Buffer::HasInstance(value))
{
size = node::Buffer::Length(value);
data = node::Buffer::Data(value);
source = escapeRegExp(data, size);
}
else if (value->IsRegExp())
{
const auto *re = v8::RegExp::Cast(*value);
auto t = re->GetSource()->ToString(context);
if (t.IsEmpty())
{
return;
}
auto s = t.ToLocalChecked();
size = s->Utf8Length(isolate);
buffer.resize(size + 1);
s->WriteUtf8(isolate, &buffer[0], buffer.size());
buffer[size] = '\0';
data = &buffer[0];
source = escapeRegExp(data, size);
}
else if (value->IsString())
{
auto t = value->ToString(context);
if (t.IsEmpty())
{
return;
}
auto s = t.ToLocalChecked();
size = s->Utf8Length(isolate);
buffer.resize(size + 1);
s->WriteUtf8(isolate, &buffer[0], buffer.size());
buffer[size] = '\0';
data = &buffer[0];
source = escapeRegExp(data, size);
}
else if (value->IsObject())
{
auto maybeObj = value->ToObject(context);
if (maybeObj.IsEmpty())
{
return;
}
auto obj = maybeObj.ToLocalChecked();
if (!WrappedRE2::HasInstance(obj))
{
return Nan::ThrowTypeError("Expected a string, Buffer, RegExp, or RE2 instance in the pattern list.");
}
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(obj);
source = re2->source;
data = source.data();
size = source.size();
}
else
{
return Nan::ThrowTypeError("Expected a string, Buffer, RegExp, or RE2 instance in the pattern list.");
}
if (translateRegExp(data, size, flags.multiline, buffer))
{
data = &buffer[0];
size = buffer.size() - 1;
}
std::string error;
if (set->set.Add(re2::StringPiece(data, size), &error) < 0)
{
if (error.empty())
{
error = "Invalid pattern in RE2.Set.";
}
return Nan::ThrowSyntaxError(error.c_str());
}
set->sources.push_back(source);
}
if (!set->set.Compile())
{
return Nan::ThrowError("RE2.Set could not be compiled.");
}
set->combinedSource = makeCombinedSource(set->sources);
set->Wrap(info.This());
set.release();
info.GetReturnValue().Set(info.This());
}
NAN_METHOD(WrappedRE2Set::Test)
{
auto re2set = Nan::ObjectWrap::Unwrap<WrappedRE2Set>(info.This());
if (!re2set)
{
info.GetReturnValue().Set(false);
return;
}
StrVal str;
v8::Local<v8::Object> keepAlive;
if (!fillInput(info[0], str, keepAlive))
{
return;
}
re2::RE2::Set::ErrorInfo errorInfo{re2::RE2::Set::kNoError};
bool matched = re2set->set.Match(str, nullptr, &errorInfo);
if (!matched && errorInfo.kind != re2::RE2::Set::kNoError)
{
const char *message = "RE2.Set matching failed.";
switch (errorInfo.kind)
{
case re2::RE2::Set::kOutOfMemory:
message = "RE2.Set matching failed: out of memory.";
break;
case re2::RE2::Set::kInconsistent:
message = "RE2.Set matching failed: inconsistent result.";
break;
case re2::RE2::Set::kNotCompiled:
message = "RE2.Set matching failed: set is not compiled.";
break;
default:
break;
}
return Nan::ThrowError(message);
}
info.GetReturnValue().Set(matched);
}
NAN_METHOD(WrappedRE2Set::Match)
{
auto re2set = Nan::ObjectWrap::Unwrap<WrappedRE2Set>(info.This());
if (!re2set)
{
info.GetReturnValue().Set(Nan::New<v8::Array>(0));
return;
}
StrVal str;
v8::Local<v8::Object> keepAlive;
if (!fillInput(info[0], str, keepAlive))
{
return;
}
std::vector<int> matches;
re2::RE2::Set::ErrorInfo errorInfo{re2::RE2::Set::kNoError};
bool matched = re2set->set.Match(str, &matches, &errorInfo);
if (!matched && errorInfo.kind != re2::RE2::Set::kNoError)
{
const char *message = "RE2.Set matching failed.";
switch (errorInfo.kind)
{
case re2::RE2::Set::kOutOfMemory:
message = "RE2.Set matching failed: out of memory.";
break;
case re2::RE2::Set::kInconsistent:
message = "RE2.Set matching failed: inconsistent result.";
break;
case re2::RE2::Set::kNotCompiled:
message = "RE2.Set matching failed: set is not compiled.";
break;
default:
break;
}
return Nan::ThrowError(message);
}
std::sort(matches.begin(), matches.end());
auto result = Nan::New<v8::Array>(matches.size());
for (size_t i = 0, n = matches.size(); i < n; ++i)
{
Nan::Set(result, i, Nan::New(matches[i]));
}
info.GetReturnValue().Set(result);
}
NAN_METHOD(WrappedRE2Set::ToString)
{
auto re2set = Nan::ObjectWrap::Unwrap<WrappedRE2Set>(info.This());
if (!re2set)
{
info.GetReturnValue().SetEmptyString();
return;
}
std::string result = "/";
result += re2set->combinedSource;
result += "/";
result += re2set->flags;
info.GetReturnValue().Set(Nan::New(result).ToLocalChecked());
}
NAN_GETTER(WrappedRE2Set::GetFlags)
{
auto re2set = Nan::ObjectWrap::Unwrap<WrappedRE2Set>(info.This());
if (!re2set)
{
info.GetReturnValue().Set(Nan::New("u").ToLocalChecked());
return;
}
info.GetReturnValue().Set(Nan::New(re2set->flags).ToLocalChecked());
}
NAN_GETTER(WrappedRE2Set::GetSources)
{
auto re2set = Nan::ObjectWrap::Unwrap<WrappedRE2Set>(info.This());
if (!re2set)
{
info.GetReturnValue().Set(Nan::New<v8::Array>(0));
return;
}
auto result = Nan::New<v8::Array>(re2set->sources.size());
for (size_t i = 0, n = re2set->sources.size(); i < n; ++i)
{
Nan::Set(result, i, Nan::New(re2set->sources[i]).ToLocalChecked());
}
info.GetReturnValue().Set(result);
}
NAN_GETTER(WrappedRE2Set::GetSource)
{
auto re2set = Nan::ObjectWrap::Unwrap<WrappedRE2Set>(info.This());
if (!re2set)
{
info.GetReturnValue().Set(Nan::New("(?:)").ToLocalChecked());
return;
}
info.GetReturnValue().Set(Nan::New(re2set->combinedSource).ToLocalChecked());
}
NAN_GETTER(WrappedRE2Set::GetSize)
{
auto re2set = Nan::ObjectWrap::Unwrap<WrappedRE2Set>(info.This());
if (!re2set)
{
info.GetReturnValue().Set(0);
return;
}
info.GetReturnValue().Set(static_cast<uint32_t>(re2set->sources.size()));
}
NAN_GETTER(WrappedRE2Set::GetAnchor)
{
auto re2set = Nan::ObjectWrap::Unwrap<WrappedRE2Set>(info.This());
if (!re2set)
{
info.GetReturnValue().Set(Nan::New("unanchored").ToLocalChecked());
return;
}
info.GetReturnValue().Set(Nan::New(anchorToString(re2set->anchor)).ToLocalChecked());
}
v8::Local<v8::Function> WrappedRE2Set::Init()
{
Nan::EscapableHandleScope scope;
auto tpl = Nan::New<v8::FunctionTemplate>(New);
tpl->SetClassName(Nan::New("RE2Set").ToLocalChecked());
auto instanceTemplate = tpl->InstanceTemplate();
instanceTemplate->SetInternalFieldCount(1);
Nan::SetPrototypeMethod(tpl, "test", Test);
Nan::SetPrototypeMethod(tpl, "match", Match);
Nan::SetPrototypeMethod(tpl, "toString", ToString);
Nan::SetAccessor(instanceTemplate, Nan::New("flags").ToLocalChecked(), GetFlags);
Nan::SetAccessor(instanceTemplate, Nan::New("sources").ToLocalChecked(), GetSources);
Nan::SetAccessor(instanceTemplate, Nan::New("source").ToLocalChecked(), GetSource);
Nan::SetAccessor(instanceTemplate, Nan::New("size").ToLocalChecked(), GetSize);
Nan::SetAccessor(instanceTemplate, Nan::New("anchor").ToLocalChecked(), GetAnchor);
auto isolate = v8::Isolate::GetCurrent();
auto data = getAddonData(isolate);
if (data)
{
data->re2SetTpl.Reset(tpl);
}
return scope.Escape(Nan::GetFunction(tpl).ToLocalChecked());
}
================================================
FILE: lib/split.cc
================================================
#include "./wrapped_re2.h"
#include <algorithm>
#include <limits>
#include <vector>
NAN_METHOD(WrappedRE2::Split)
{
auto result = Nan::New<v8::Array>();
// unpack arguments
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
if (!re2)
{
Nan::Set(result, 0, info[0]);
info.GetReturnValue().Set(result);
return;
}
PrepareLastString prep(re2, info[0]);
StrVal& str = prep;
if (str.isBad) return; // throws an exception
size_t limit = std::numeric_limits<size_t>::max();
if (info.Length() > 1 && info[1]->IsNumber())
{
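// Read the limit as a double first: converting a negative, NaN, or out-of-range double straight to size_t is undefined behavior.
const double lim = info[1]->NumberValue(Nan::GetCurrentContext()).FromMaybe(0);
if (lim >= 1 && lim < static_cast<double>(std::numeric_limits<size_t>::max()))
{
limit = static_cast<size_t>(lim);
}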
}
// actual work
std::vector<re2::StringPiece> groups(re2->regexp.NumberOfCapturingGroups() + 1), pieces;
const auto &match = groups[0];
size_t byteIndex = 0;
while (byteIndex < str.size && re2->regexp.Match(str, byteIndex, str.size, RE2::UNANCHORED, &groups[0], groups.size()))
{
if (match.size())
{
pieces.push_back(re2::StringPiece(str.data + byteIndex, match.data() - str.data - byteIndex));
byteIndex = match.data() - str.data + match.size();
pieces.insert(pieces.end(), groups.begin() + 1, groups.end());
}
else
{
size_t sym_size = getUtf8CharSize(str.data[byteIndex]);
pieces.push_back(re2::StringPiece(str.data + byteIndex, sym_size));
byteIndex += sym_size;
}
if (pieces.size() >= limit)
{
break;
}
}
if (pieces.size() < limit && (byteIndex < str.size || (byteIndex == str.size && match.size())))
{
pieces.push_back(re2::StringPiece(str.data + byteIndex, str.size - byteIndex));
}
if (pieces.empty())
{
Nan::Set(result, 0, info[0]);
info.GetReturnValue().Set(result);
return;
}
// form a result
if (str.isBuffer)
{
for (size_t i = 0, n = std::min(pieces.size(), limit); i < n; ++i)
{
const auto &item = pieces[i];
if (item.data())
{
Nan::Set(result, i, Nan::CopyBuffer(item.data(), item.size()).ToLocalChecked());
}
else
{
Nan::Set(result, i, Nan::Undefined());
}
}
}
else
{
for (size_t i = 0, n = std::min(pieces.size(), limit); i < n; ++i)
{
const auto &item = pieces[i];
if (item.data())
{
Nan::Set(result, i, Nan::New(item.data(), item.size()).ToLocalChecked());
}
else
{
Nan::Set(result, i, Nan::Undefined());
}
}
}
info.GetReturnValue().Set(result);
}
================================================
FILE: lib/test.cc
================================================
#include "./wrapped_re2.h"
#include <vector>
NAN_METHOD(WrappedRE2::Test)
{
// unpack arguments
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
if (!re2)
{
info.GetReturnValue().Set(false);
return;
}
PrepareLastString prep(re2, info[0]);
StrVal& str = prep;
if (str.isBad) return; // throws an exception
if (!re2->global && !re2->sticky)
{
info.GetReturnValue().Set(re2->regexp.Match(str, 0, str.size, re2::RE2::UNANCHORED, NULL, 0));
return;
}
if (!str.isValidIndex)
{
re2->lastIndex = 0;
info.GetReturnValue().Set(false);
return;
}
// actual work
re2::StringPiece match;
if (re2->regexp.Match(str, str.byteIndex, str.size, re2->sticky ? re2::RE2::ANCHOR_START : re2::RE2::UNANCHORED, &match, 1))
{
re2->lastIndex +=
str.isBuffer ? match.data() - str.data + match.size() - str.byteIndex : getUtf16Length(str.data + str.byteIndex, match.data() + match.size());
info.GetReturnValue().Set(true);
return;
}
re2->lastIndex = 0;
info.GetReturnValue().Set(false);
}
================================================
FILE: lib/to_string.cc
================================================
#include "./wrapped_re2.h"
#include <string>
NAN_METHOD(WrappedRE2::ToString)
{
// unpack arguments
auto re2 = Nan::ObjectWrap::Unwrap<WrappedRE2>(info.This());
if (!re2)
{
info.GetReturnValue().SetEmptyString();
return;
}
// actual work
std::string buffer("/");
buffer += re2->source;
buffer += "/";
if (re2->hasIndices)
{
buffer += "d";
}
if (re2->global)
{
buffer += "g";
}
if (re2->ignoreCase)
{
buffer += "i";
}
if (re2->multiline)
{
buffer += "m";
}
if (re2->dotAll)
{
buffer += "s";
}
buffer += "u";
if (re2->sticky)
{
buffer += "y";
}
info.GetReturnValue().Set(Nan::New(buffer).ToLocalChecked());
}
================================================
FILE: lib/util.cc
================================================
#include "./util.h"
void consoleCall(const v8::Local<v8::String> &methodName, v8::Local<v8::Value> text)
{
auto context = Nan::GetCurrentContext();
auto maybeConsole = bind<v8::Object>(
Nan::Get(context->Global(), Nan::New("console").ToLocalChecked()),
[context](v8::Local<v8::Value> console) { return console->ToObject(context); });
if (maybeConsole.IsEmpty())
return;
auto console = maybeConsole.ToLocalChecked();
auto maybeMethod = bind<v8::Object>(
Nan::Get(console, methodName),
[context](v8::Local<v8::Value> method) { return method->ToObject(context); });
if (maybeMethod.IsEmpty())
return;
auto method = maybeMethod.ToLocalChecked();
if (!method->IsFunction())
return;
Nan::CallAsFunction(method, console, 1, &text);
}
void printDeprecationWarning(const char *warning)
{
std::string prefixedWarning = "DeprecationWarning: ";
prefixedWarning += warning;
consoleCall(Nan::New("error").ToLocalChecked(), Nan::New(prefixedWarning).ToLocalChecked());
}
v8::Local<v8::String> callToString(const v8::Local<v8::Object> &object)
{
auto context = Nan::GetCurrentContext();
auto maybeMethod = bind<v8::Object>(
Nan::Get(object, Nan::New("toString").ToLocalChecked()),
[context](v8::Local<v8::Value> method) { return method->ToObject(context); });
if (maybeMethod.IsEmpty())
return Nan::New("No toString() is found").ToLocalChecked();
auto method = maybeMethod.ToLocalChecked();
if (!method->IsFunction())
return Nan::New("No toString() is found").ToLocalChecked();
auto maybeResult = Nan::CallAsFunction(method, object, 0, nullptr);
if (maybeResult.IsEmpty())
{
return Nan::New("nothing was returned").ToLocalChecked();
}
auto result = maybeResult.ToLocalChecked();
if (result->IsObject())
{
return callToString(result->ToObject(context).ToLocalChecked());
}
Nan::Utf8String val(result->ToString(context).ToLocalChecked());
return Nan::New(std::string(*val, val.length())).ToLocalChecked();
}
================================================
FILE: lib/util.h
================================================
#pragma once
#include "./wrapped_re2.h"
template <typename R, typename P, typename L>
inline v8::MaybeLocal<R> bind(v8::MaybeLocal<P> param, L lambda)
{
return param.IsEmpty() ? v8::MaybeLocal<R>() : lambda(param.ToLocalChecked());
}
void consoleCall(const v8::Local<v8::String> &methodName, v8::Local<v8::Value> text);
void printDeprecationWarning(const char *warning);
v8::Local<v8::String> callToString(const v8::Local<v8::Object> &object);
================================================
FILE: lib/wrapped_re2.h
================================================
#pragma once
#include <atomic>
#include <string>
#include <nan.h>
#include <re2/re2.h>
#include "./isolate_data.h"
struct StrVal
{
char *data;
size_t size, length;
size_t index, byteIndex;
bool isBuffer, isValidIndex, isBad;
StrVal() : data(NULL), size(0), length(0), index(0), byteIndex(0), isBuffer(false), isValidIndex(false), isBad(false) {}
operator re2::StringPiece() const { return re2::StringPiece(data, size); }
void setIndex(size_t newIndex = 0);
void reset(const v8::Local<v8::Value> &arg, size_t size, size_t length, size_t newIndex = 0, bool buffer = false);
void clear()
{
isBad = isBuffer = isValidIndex = false;
size = length = index = byteIndex = 0;
data = nullptr;
}
};
class WrappedRE2 : public Nan::ObjectWrap
{
private:
WrappedRE2(
const re2::StringPiece &pattern,
const re2::RE2::Options &options,
const std::string &src,
const bool &g,
const bool &i,
const bool &m,
const bool &s,
const bool &y,
const bool &d) : regexp(pattern, options),
source(src),
global(g),
ignoreCase(i),
multiline(m),
dotAll(s),
sticky(y),
hasIndices(d),
lastIndex(0) {}
static NAN_METHOD(New);
static NAN_METHOD(ToString);
static NAN_GETTER(GetSource);
static NAN_GETTER(GetFlags);
static NAN_GETTER(GetGlobal);
static NAN_GETTER(GetIgnoreCase);
static NAN_GETTER(GetMultiline);
static NAN_GETTER(GetDotAll);
static NAN_GETTER(GetUnicode);
static NAN_GETTER(GetSticky);
static NAN_GETTER(GetHasIndices);
static NAN_GETTER(GetLastIndex);
static NAN_SETTER(SetLastIndex);
static NAN_GETTER(GetInternalSource);
// RegExp methods
static NAN_METHOD(Exec);
static NAN_METHOD(Test);
// String methods
static NAN_METHOD(Match);
static NAN_METHOD(Replace);
static NAN_METHOD(Search);
static NAN_METHOD(Split);
// strict Unicode warning support
static NAN_GETTER(GetUnicodeWarningLevel);
static NAN_SETTER(SetUnicodeWarningLevel);
public:
~WrappedRE2();
static v8::Local<v8::Function> Init();
static inline bool HasInstance(v8::Local<v8::Object> object)
{
auto isolate = v8::Isolate::GetCurrent();
auto data = getAddonData(isolate);
if (!data || data->re2Tpl.IsEmpty()) return false;
return data->re2Tpl.Get(isolate)->HasInstance(object);
}
enum UnicodeWarningLevels
{
NOTHING,
WARN_ONCE,
WARN,
THROW
};
static std::atomic<UnicodeWarningLevels> unicodeWarningLevel;
static std::atomic<bool> alreadyWarnedAboutUnicode;
re2::RE2 regexp;
std::string source;
bool global;
bool ignoreCase;
bool multiline;
bool dotAll;
bool sticky;
bool hasIndices;
size_t lastIndex;
friend struct PrepareLastString;
private:
Nan::Persistent<v8::Value> lastString; // weak pointer
Nan::Persistent<v8::Object> lastCache; // weak pointer
StrVal lastStringValue;
void dropCache();
const StrVal &prepareArgument(const v8::Local<v8::Value> &arg, bool ignoreLastIndex = false);
void doneWithLastString();
};
struct PrepareLastString
{
PrepareLastString(WrappedRE2 *re2, const v8::Local<v8::Value> &arg, bool ignoreLastIndex = false) : re2(re2) {
re2->prepareArgument(arg, ignoreLastIndex);
}
~PrepareLastString() {
re2->doneWithLastString();
}
operator const StrVal&() const {
return re2->lastStringValue;
}
operator StrVal&() {
return re2->lastStringValue;
}
WrappedRE2 *re2;
};
// utilities
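// Returns the number of UTF-8 bytes needed to encode the UTF-16 code units in [from, to).
// A surrogate is counted as a 4-byte sequence and the following code unit is skipped.
inline size_t getUtf8Length(const uint16_t *from, const uint16_t *to)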
{
size_t n = 0;
while (from != to)
{
uint16_t ch = *from++;
if (ch <= 0x7F)
++n;
else if (ch <= 0x7FF)
n += 2;
else if (0xD800 <= ch && ch <= 0xDFFF)
{
n += 4;
if (from == to)
break;
++from;
}
else
n += 3; // every remaining BMP code unit, including 0xFFFF, takes 3 bytes in UTF-8
}
return n;
}
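// Returns the number of UTF-16 code units needed to represent the UTF-8 bytes in [from, to).
// Sequences of 1 to 3 bytes count as one unit and 4-byte sequences as two;
// a sequence cut short by `to` is still counted.
inline size_t getUtf16Length(const char *from, const char *to)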
{
size_t n = 0;
while (from != to)
{
unsigned ch = *from & 0xFF;
if (ch < 0xF0)
{
if (ch < 0x80)
{
++from;
}
else
{
if (ch < 0xE0)
{
from += 2;
if (from == to + 1)
{
++n;
break;
}
}
else
{
from += 3;
if (from > to && from < to + 3)
{
++n;
break;
}
}
}
++n;
}
else
{
from += 4;
n += 2;
if (from > to && from < to + 4)
break;
}
}
return n;
}
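// Returns the byte length of the UTF-8 sequence that starts with lead byte `ch`.
// The constant 0xE5000000 is a lookup table of 2-bit entries indexed by the top nibble of `ch`:
// 0x0-0xB -> 1 byte (ASCII; continuation bytes also report 1), 0xC-0xD -> 2, 0xE -> 3, 0xF -> 4.
inline size_t getUtf8CharSize(char ch)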
{
return ((0xE5000000 >> ((ch >> 3) & 0x1E)) & 3) + 1;
}
inline size_t getUtf16PositionByCounter(const char *data, size_t from, size_t n)
{
for (; n > 0; --n)
{
size_t s = getUtf8CharSize(data[from]);
from += s;
if (s == 4 && n >= 2)
--n; // this utf8 character will take two utf16 characters
// the decrement above is protected to avoid an overflow of an unsigned integer
}
return from;
}
================================================
FILE: lib/wrapped_re2_set.h
================================================
#pragma once
#include <nan.h>
#include <re2/re2.h>
#include <re2/set.h>
#include "./isolate_data.h"
#include <string>
#include <vector>
class WrappedRE2Set : public Nan::ObjectWrap
{
public:
static v8::Local<v8::Function> Init();
static inline bool HasInstance(v8::Local<v8::Object> object)
{
auto isolate = v8::Isolate::GetCurrent();
auto data = getAddonData(isolate);
if (!data || data->re2SetTpl.IsEmpty()) return false;
return data->re2SetTpl.Get(isolate)->HasInstance(object);
}
private:
WrappedRE2Set(const re2::RE2::Options &options, re2::RE2::Anchor anchor, const std::string &flags) : set(options, anchor), flags(flags), anchor(anchor) {}
static NAN_METHOD(New);
static NAN_METHOD(Test);
static NAN_METHOD(Match);
static NAN_METHOD(ToString);
static NAN_GETTER(GetFlags);
static NAN_GETTER(GetSources);
static NAN_GETTER(GetSource);
static NAN_GETTER(GetSize);
static NAN_GETTER(GetAnchor);
re2::RE2::Set set;
std::vector<std::string> sources;
std::string combinedSource;
std::string flags;
re2::RE2::Anchor anchor;
};
================================================
FILE: llms-full.txt
================================================
# node-re2
> Node.js bindings for RE2: a fast, safe alternative to backtracking regular expression engines. Drop-in RegExp replacement that prevents ReDoS (Regular Expression Denial of Service). Works with strings and Buffers. C++ native addon built with node-gyp and nan.
- Drop-in replacement for RegExp with linear-time matching guarantee
- Prevents ReDoS by disallowing backreferences and lookahead assertions
- Full Unicode mode (always on)
- Buffer support for high-performance binary/UTF-8 processing
- Named capture groups
- Symbol-based methods (Symbol.match, Symbol.search, Symbol.replace, Symbol.split, Symbol.matchAll)
- RE2.Set for multi-pattern matching
- Prebuilt binaries for Linux, macOS, Windows (x64 + arm64)
- TypeScript declarations included
## Install
```bash
npm install re2
```
Prebuilt native binaries are downloaded automatically. If no prebuilt binary is available for the platform, the install script falls back to building from source with node-gyp.
## Quick start
```js
const RE2 = require('re2');
// Create and use like RegExp
const re = new RE2('a(b*)', 'i');
const result = re.exec('aBbC');
console.log(result[0]); // "aBb"
console.log(result[1]); // "Bb"
// Works with ES6 string methods
'hello world'.match(new RE2('\\w+', 'g')); // ['hello', 'world']
'hello world'.replace(new RE2('world'), 'RE2'); // 'hello RE2'
```
## Importing
```js
// CommonJS
const RE2 = require('re2');
// ESM
import { RE2 } from 're2';
```
## Construction
`new RE2(pattern[, flags])` or `RE2(pattern[, flags])` (factory mode).
Pattern can be:
- **String**: `new RE2('\\d+')`
- **String with flags**: `new RE2('\\d+', 'gi')`
- **RegExp**: `new RE2(/ab*/ig)` — copies pattern and flags.
- **RE2**: `new RE2(existingRE2)` — copies pattern and flags.
- **Buffer**: `new RE2(Buffer.from('pattern'))` — pattern from UTF-8 buffer.
Supported flags:
- `g` — global (find all matches)
- `i` — ignoreCase
- `m` — multiline (`^`/`$` match line boundaries)
- `s` — dotAll (`.` matches `\n`)
- `u` — unicode (always on, added implicitly)
- `y` — sticky (match at lastIndex only)
- `d` — hasIndices (include index info for capture groups)
Invalid patterns throw `SyntaxError`. Patterns with backreferences or lookahead throw `SyntaxError`.
## Properties
### Instance properties
- `re.source` (string) — the pattern string, escaped for use in `new RE2(re.source)` or `new RegExp(re.source)`.
- `re.flags` (string) — the flags string (e.g., `'giu'`).
- `re.lastIndex` (number) — the index at which to start the next match (used with `g` or `y` flags).
- `re.global` (boolean) — whether the `g` flag is set.
- `re.ignoreCase` (boolean) — whether the `i` flag is set.
- `re.multiline` (boolean) — whether the `m` flag is set.
- `re.dotAll` (boolean) — whether the `s` flag is set.
- `re.unicode` (boolean) — always `true` (RE2 always operates in Unicode mode).
- `re.sticky` (boolean) — whether the `y` flag is set.
- `re.hasIndices` (boolean) — whether the `d` flag is set.
- `re.internalSource` (string) — the RE2-translated pattern (for debugging; may differ from `source`).
### Static properties
- `RE2.unicodeWarningLevel` (string) — controls behavior when a non-Unicode regexp is created:
- `'nothing'` (default) — silently add `u` flag.
- `'warnOnce'` — warn once, then silently add `u`. Assigning resets the one-time flag.
- `'warn'` — warn every time.
- `'throw'` — throw `SyntaxError` every time.
## RegExp methods
### re.exec(str)
Executes a search for a match. Returns a result array or `null`.
```js
const re = new RE2('a(b+)', 'g');
const result = re.exec('abbc abbc');
// result[0] === 'abb'
// result[1] === 'bb'
// result.index === 0
// result.input === 'abbc abbc'
// re.lastIndex === 3
```
With `d` flag (hasIndices), result has `.indices` property with `[start, end]` pairs for each group.
With `g` or `y` flag, advances `lastIndex`. Call repeatedly to iterate matches.
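The iteration described above can be sketched as a small helper. This is an illustrative sketch rather than part of the package: `collectMatches` is a hypothetical name, and the `try`/`catch` around `require('re2')` is an assumption added so the snippet also runs on the built-in `RegExp` where the native addon is not installed.

```js
// Assumption: use RE2 when it is installed, otherwise the built-in RegExp.
let Engine;
try {
  Engine = require('re2');
} catch (e) {
  Engine = RegExp;
}

// Hypothetical helper: collect every match together with its [start, end) span.
function collectMatches(pattern, input) {
  const re = new Engine(pattern, 'dg'); // d: expose indices, g: advance lastIndex
  const found = [];
  let m;
  while ((m = re.exec(input)) !== null) {
    const [start, end] = m.indices[0];
    found.push({text: m[0], start, end});
    if (m[0] === '') ++re.lastIndex; // step past empty matches to avoid an endless loop
  }
  return found;
}

const matches = collectMatches('a(b+)', 'abbc abbbc');
console.log(matches);
```

The empty-match guard mirrors what the package's own `Symbol.matchAll` generator does for string input.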
### re.test(str)
Returns `true` if the pattern matches, `false` otherwise.
```js
new RE2('\\d+').test('abc123'); // true
new RE2('\\d+').test('abcdef'); // false
```
With `g` or `y` flag, advances `lastIndex`.
### re.toString()
Returns `'/pattern/flags'` string representation.
```js
new RE2('abc', 'gi').toString(); // '/abc/giu'
```
## String methods (via Symbol)
RE2 instances implement well-known symbols, so they work directly with ES6 string methods:
### str.match(re) / re[Symbol.match](str)
```js
'test 123 test 456'.match(new RE2('\\d+', 'g')); // ['123', '456']
'test 123'.match(new RE2('(\\d+)')); // ['123', '123', index: 5, input: 'test 123']
```
### str.matchAll(re) / re[Symbol.matchAll](str)
Returns an iterator of all matches. Requires the `g` flag; without it a `TypeError` is thrown.
```js
const re = new RE2('\\d+', 'g');
for (const m of '1a2b3c'.matchAll(re)) {
console.log(m[0]); // '1', '2', '3'
}
```
### str.search(re) / re[Symbol.search](str)
Returns the index of the first match, or `-1`.
```js
'hello world'.search(new RE2('world')); // 6
```
### str.replace(re, replacement) / re[Symbol.replace](str, replacement)
Returns a new string with matches replaced.
```js
'aabba'.replace(new RE2('b', 'g'), 'c'); // 'aacca'
```
Replacement string supports:
- `$1`, `$2`, ... — numbered capture groups.
- `$<name>` — named capture groups.
- `$&` — the matched substring.
- `` $` `` — portion before the match.
- `$'` — portion after the match.
- `$$` — literal `$`.
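A short sketch of the substitution patterns listed above. The `try`/`catch` fallback to the built-in `RegExp` is an assumption added so the snippet runs without the native addon; the variable names are illustrative only.

```js
// Assumption: use RE2 when it is installed, otherwise the built-in RegExp.
let Engine;
try {
  Engine = require('re2');
} catch (e) {
  Engine = RegExp;
}

// $1, $2: numbered capture groups.
const swapped = 'John Smith'.replace(new Engine('(\\w+) (\\w+)'), '$2, $1');

// $` (text before the match), $& (the match), $' (text after the match).
const context = 'abc'.replace(new Engine('b'), "[$`|$&|$']");

// $$ produces a literal dollar sign; here it is followed by $&.
const dollar = 'price: 5'.replace(new Engine('\\d+'), '$$$&');

console.log(swapped); // 'Smith, John'
console.log(context); // 'a[a|b|c]c'
console.log(dollar); // 'price: $5'
```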
Replacement function receives `(match, ...groups, offset, input)`:
```js
'abc'.replace(new RE2('(b)'), (match, g1, offset) => `[${g1}@${offset}]`);
// 'a[b@1]c'
```
### str.split(re[, limit]) / re[Symbol.split](str[, limit])
Splits string by pattern.
```js
'a1b2c3'.split(new RE2('\\d')); // ['a', 'b', 'c', '']
'a1b2c3'.split(new RE2('\\d'), 2); // ['a', 'b']
```
## String methods (direct)
These are convenience methods on the RE2 instance with swapped argument order:
- `re.match(str)` — equivalent to `str.match(re)`.
- `re.search(str)` — equivalent to `str.search(re)`.
- `re.replace(str, replacement)` — equivalent to `str.replace(re, replacement)`.
- `re.split(str[, limit])` — equivalent to `str.split(re, limit)`.
```js
const re = new RE2('\\d+', 'g');
re.match('test 123 test 456'); // ['123', '456']
re.search('test 123'); // 5
re.replace('test 1 and 2', 'N'); // 'test N and N' (global replaces all)
re.split('a1b2c'); // ['a', 'b', 'c']
```
## Buffer support
All methods accept Node.js Buffers (UTF-8) in place of strings. Given Buffer input, methods that return text (`exec`, `match`, `replace`, `split`) return Buffers.
```js
const re = new RE2('матч', 'g');
const buf = Buffer.from('тест матч тест');
const result = re.exec(buf);
// result[0] is a Buffer containing 'матч' in UTF-8
// result.index is in bytes (not characters)
```
Differences from string mode:
- All offsets and lengths are in **bytes**, not characters.
- Results contain Buffers instead of strings.
- Use `buf.toString()` to convert results back to strings.
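Because Buffer-mode offsets are in bytes, code that mixes Buffer results with string positions may need a conversion. The sketch below uses only Node.js built-ins; `byteToCharOffset` is a hypothetical helper, not part of the package, and it assumes the byte offset falls on a character boundary (as match offsets do).

```js
// Hypothetical helper: convert a byte offset in a UTF-8 Buffer
// into a UTF-16 character offset in the equivalent string.
function byteToCharOffset(buf, byteOffset) {
  // Decode only the prefix and count its UTF-16 code units.
  return buf.subarray(0, byteOffset).toString('utf8').length;
}

const buf = Buffer.from('тест матч тест');
const byteIndex = buf.indexOf('матч'); // byte offset: four 2-byte letters plus a space
const charIndex = byteToCharOffset(buf, byteIndex);

console.log(byteIndex); // 9
console.log(charIndex); // 5
```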
### useBuffers on replacer functions
When using `re.replace(buf, replacerFn)`, the replacer receives string arguments and character offsets by default. Set `replacerFn.useBuffers = true` to receive Buffer arguments and byte offsets instead:
```js
function replacer(match, offset, input) {
return '<' + offset + ' bytes>';
}
replacer.useBuffers = true;
new RE2('б').replace(Buffer.from('абв'), replacer);
```
## RE2.Set
Multi-pattern matching — compile many patterns into a single automaton and test/match against all of them at once. Faster than testing individual patterns when the number of patterns is large.
### Constructor
```js
new RE2.Set(patterns[, flagsOrOptions][, options])
```
- `patterns` — any iterable of strings, Buffers, RegExp, or RE2 instances.
- `flagsOrOptions` — optional string/Buffer with flags (apply to all patterns), or options object.
- `options.anchor` — `'unanchored'` (default), `'start'`, or `'both'`.
```js
const set = new RE2.Set([
'^/users/\\d+$',
'^/posts/\\d+$',
'^/api/.*$'
], 'i', {anchor: 'start'});
```
### set.test(str)
Returns `true` if any pattern matches, `false` otherwise.
```js
set.test('/users/42'); // true
set.test('/unknown'); // false
```
### set.match(str)
Returns an array of indices of matching patterns, sorted ascending. Empty array if none match.
```js
set.match('/users/42'); // [0]
set.match('/api/users'); // [2]
set.match('/unknown'); // []
```
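As a reference for the result shape only, the same indices can be produced by testing each pattern separately. This is a sketch under stated assumptions: `matchIndices` is a hypothetical stand-in built on the built-in `RegExp`, it ignores the `anchor` option, and it gives neither the single-pass behaviour nor the linear-time guarantee of `RE2.Set`.

```js
// Hypothetical stand-in: test each pattern on its own and collect matching indices.
function matchIndices(patterns, flags, str) {
  const hits = [];
  patterns.forEach((source, index) => {
    if (new RegExp(source, flags).test(str)) hits.push(index);
  });
  return hits; // ascending, because patterns are visited in order
}

const patterns = ['^/users/\\d+$', '^/posts/\\d+$', '^/api/.*$'];

console.log(matchIndices(patterns, 'i', '/users/42')); // [0]
console.log(matchIndices(patterns, 'i', '/api/users')); // [2]
console.log(matchIndices(patterns, 'i', '/unknown')); // []
```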
### Properties
- `set.size` (number) — number of patterns.
- `set.source` (string) — all patterns joined with `|`.
- `set.sources` (string[]) — individual pattern sources.
- `set.flags` (string) — flags string.
- `set.anchor` (string) — anchor mode.
### set.toString()
Returns `'/pattern1|pattern2|.../flags'`.
```js
set.toString(); // '/^/users/\\d+$|^/posts/\\d+$|^/api/.*$/iu'
```
## Static helpers
### RE2.getUtf8Length(str)
Calculate the byte size needed to encode a UTF-16 string as UTF-8.
```js
RE2.getUtf8Length('hello'); // 5
RE2.getUtf8Length('привет'); // 12
```
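For intuition, the count can be reproduced in plain JavaScript. This is a sketch, not the addon's implementation: `utf8Length` is a hypothetical name, and the result is only compared against `Buffer.byteLength` for well-formed strings.

```js
// Sketch only: count the UTF-8 bytes needed for a UTF-16 string.
function utf8Length(str) {
  let n = 0;
  // for...of iterates by code point, so a surrogate pair arrives as one item.
  for (const ch of str) {
    const cp = ch.codePointAt(0);
    if (cp <= 0x7f) n += 1; // ASCII
    else if (cp <= 0x7ff) n += 2; // e.g. Cyrillic
    else if (cp <= 0xffff) n += 3; // rest of the Basic Multilingual Plane
    else n += 4; // supplementary planes (surrogate pairs in UTF-16)
  }
  return n;
}

console.log(utf8Length('hello')); // 5
console.log(utf8Length('привет')); // 12
console.log(utf8Length('привет') === Buffer.byteLength('привет', 'utf8')); // true
```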
### RE2.getUtf16Length(buf)
Calculate the character count needed to encode a UTF-8 buffer as a UTF-16 string.
```js
RE2.getUtf16Length(Buffer.from('hello')); // 5
RE2.getUtf16Length(Buffer.from('привет')); // 6
```
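The reverse count can be sketched the same way by looking only at lead bytes. This is an illustration that assumes valid, complete UTF-8 input; `utf16Length` is a hypothetical name and not the addon's code.

```js
// Sketch only: count UTF-16 code units for a valid UTF-8 buffer.
function utf16Length(buf) {
  let n = 0;
  for (let i = 0; i < buf.length; ) {
    const lead = buf[i];
    if (lead < 0x80) {
      i += 1; // 1-byte sequence, one code unit
      n += 1;
    } else if (lead < 0xe0) {
      i += 2; // 2-byte sequence, one code unit
      n += 1;
    } else if (lead < 0xf0) {
      i += 3; // 3-byte sequence, one code unit
      n += 1;
    } else {
      i += 4; // 4-byte sequence becomes a surrogate pair
      n += 2;
    }
  }
  return n;
}

console.log(utf16Length(Buffer.from('hello'))); // 5
console.log(utf16Length(Buffer.from('привет'))); // 6
```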
## Named groups
Named capture groups are supported:
```js
const re = new RE2('(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})');
const result = re.exec('2024-01-15');
result.groups.year; // '2024'
result.groups.month; // '01'
result.groups.day; // '15'
```
Named group references in replacement strings (substitutions in the replacement text, not backreferences inside the pattern):
```js
'2024-01-15'.replace(
new RE2('(?<y>\\d{4})-(?<m>\\d{2})-(?<d>\\d{2})'),
'$<d>/$<m>/$<y>'
); // '15/01/2024'
```
## Unicode classes
RE2 supports Unicode property escapes. Long names are translated to RE2 short names:
```js
new RE2('\\p{Letter}+'); // same as \p{L}+
new RE2('\\p{Number}+'); // same as \p{N}+
new RE2('\\p{Script=Latin}+'); // same as \p{Latin}+
new RE2('\\p{sc=Cyrillic}+'); // same as \p{Cyrillic}+
new RE2('\\P{Letter}+'); // negated: non-letters
```
Only `\p{name}` form is supported (not `\p{name=value}` in general). Exception: `Script` and `sc` names.
## Limitations
RE2 does **not** support:
- **Backreferences** (`\1`, `\2`, etc.) — throw `SyntaxError`.
- **Lookahead assertions** (`(?=...)`, `(?!...)`) — throw `SyntaxError`.
- **Lookbehind assertions** (`(?<=...)`, `(?<!...)`) — throw `SyntaxError`.
Fallback pattern:
```js
let re = /pattern-with-lookahead(?=foo)/;
try {
re = new RE2(re);
} catch (e) {
// use original RegExp as fallback
}
const result = re.exec(input);
```
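The fallback above can be wrapped in a reusable helper that also reports which engine was chosen. This is a sketch under stated assumptions: `compile` is a hypothetical name, and the guarded `require` is added so the snippet still runs where the native addon is not installed.

```js
// Assumption: the addon may be missing, so load it defensively.
let RE2 = null;
try {
  RE2 = require('re2');
} catch (e) {
  // addon not installed: only the built-in RegExp is available
}

// Hypothetical helper: prefer RE2, fall back to RegExp for unsupported syntax.
function compile(pattern, flags = '') {
  if (RE2) {
    try {
      return {re: new RE2(pattern, flags), engine: 're2'};
    } catch (e) {
      // RE2 rejected the pattern (e.g. lookahead): fall through to RegExp
    }
  }
  // Throws SyntaxError if the pattern is invalid for RegExp as well.
  return {re: new RegExp(pattern, flags), engine: 'regexp'};
}

const lookahead = compile('foo(?=bar)');
console.log(lookahead.engine); // 'regexp'
console.log(lookahead.re.test('foobar')); // true
```

Note that a pattern which falls back to `RegExp` loses the linear-time guarantee, so callers handling untrusted patterns may prefer to reject when `engine` is `'regexp'`.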
## Common patterns
### Drop-in RegExp replacement
```js
const RE2 = require('re2');
// Before (vulnerable to ReDoS):
// const re = new RegExp(userInput);
// After (safe):
const re = new RE2(userInput);
```
### Process Buffer data efficiently
```js
const RE2 = require('re2');
const fs = require('fs');
const data = fs.readFileSync('large-file.txt');
const re = new RE2('pattern', 'g');
let match;
while ((match = re.exec(data)) !== null) {
console.log('Found at byte offset:', match.index);
}
```
### Route matching with RE2.Set
```js
const RE2 = require('re2');
const routes = new RE2.Set([
'^/users/\\d+$',
'^/posts/\\d+$',
'^/api/v\\d+/.*$'
], 'i');
function findRoute(path) {
const matches = routes.match(path);
return matches.length > 0 ? matches[0] : -1;
}
findRoute('/users/42'); // 0
findRoute('/posts/7'); // 1
findRoute('/api/v2/foo'); // 2
findRoute('/unknown'); // -1
```
### Validate user-supplied patterns safely
```js
const RE2 = require('re2');
function safeMatch(input, pattern, flags) {
try {
const re = new RE2(pattern, flags);
return re.test(input);
} catch (e) {
return false; // invalid pattern
}
}
```
## TypeScript
```ts
import RE2 from 're2';
const re: RE2 = new RE2('\\d+', 'g');
const result: RegExpExecArray | null = re.exec('test 123');
// Buffer overloads
const bufResult: RE2BufferExecArray | null = re.exec(Buffer.from('test 123'));
// RE2.Set
const set: RE2Set = new RE2.Set(['a', 'b'], 'i');
const matches: number[] = set.match('abc');
```
## Project structure notes
- Entry point: `re2.js` (loads native addon), types: `re2.d.ts`.
- C++ addon source: `lib/*.cc`, `lib/*.h`.
- Tests: `tests/test-*.mjs` (runtime), `ts-tests/test-*.ts` (type-checking).
- Vendored dependencies: `vendor/re2/`, `vendor/abseil-cpp/` (git submodules) — **never modify files under `vendor/`**.
## Links
- Docs: https://github.com/uhop/node-re2/wiki
- npm: https://www.npmjs.com/package/re2
- Repository: https://github.com/uhop/node-re2
- RE2 syntax: https://github.com/google/re2/wiki/Syntax
================================================
FILE: llms.txt
================================================
# node-re2
> Node.js bindings for RE2: a fast, safe alternative to backtracking regular expression engines. Drop-in RegExp replacement that prevents ReDoS. Works with strings and Buffers.
## Install
npm install re2
## Quick start
```js
// CommonJS
const RE2 = require('re2');
// ESM
import {RE2} from 're2';
const re = new RE2('a(b*)', 'i');
const result = re.exec('aBbC');
console.log(result[0]); // "aBb"
console.log(result[1]); // "Bb"
```
## Why use node-re2?
The built-in Node.js RegExp engine can run in exponential time with vulnerable patterns (ReDoS). RE2 guarantees linear-time matching by disallowing backreferences and lookahead assertions.
## API
### Construction
```js
const RE2 = require('re2');
const re1 = new RE2('\\d+'); // from string
const re2 = new RE2('\\d+', 'gi'); // with flags
const re3 = new RE2(/ab*/ig); // from RegExp
const re4 = new RE2(re3); // from another RE2
const re5 = RE2('\\d+'); // factory (no new)
```
Supported flags: `g` (global), `i` (ignoreCase), `m` (multiline), `s` (dotAll), `u` (unicode, always on), `y` (sticky), `d` (hasIndices).
### RegExp methods
- `re.exec(str)` — find match with capture groups.
- `re.test(str)` — boolean match check.
- `re.toString()` — `/pattern/flags` representation.
### String methods (via Symbol)
RE2 instances work with ES6 string methods:
```js
'abc'.match(re);
'abc'.search(re);
'abc'.replace(re, 'x');
'abc'.split(re);
Array.from('abc'.matchAll(re));
```
### String methods (direct)
- `re.match(str)` — equivalent to `str.match(re)`.
- `re.search(str)` — equivalent to `str.search(re)`.
- `re.replace(str, replacement)` — equivalent to `str.replace(re, replacement)`.
- `re.split(str[, limit])` — equivalent to `str.split(re, limit)`.
### Properties
- `re.source` — pattern string.
- `re.flags` — flags string.
- `re.lastIndex` — index for next match (with `g` or `y` flag).
- `re.global`, `re.ignoreCase`, `re.multiline`, `re.dotAll`, `re.unicode`, `re.sticky`, `re.hasIndices` — boolean flag accessors.
- `re.internalSource` — RE2-translated pattern (for debugging).
### Buffer support
All methods accept Buffers (UTF-8) instead of strings. Buffer input produces Buffer output. Offsets are in bytes.
```js
const re = new RE2('матч', 'g');
const buf = Buffer.from('тест матч тест');
const result = re.exec(buf);
// result[0] is a Buffer
```
### RE2.Set
Multi-pattern matching — test a string against many patterns at once.
```js
const set = new RE2.Set(['^/users/\\d+$', '^/posts/\\d+$'], 'i');
set.test('/users/7'); // true
set.match('/posts/42'); // [1]
set.sources; // ['^/users/\\d+$', '^/posts/\\d+$']
```
- `new RE2.Set(patterns[, flags][, options])` — compile patterns.
- `options.anchor`: `'unanchored'` (default), `'start'`, or `'both'`.
- `set.test(str)` — returns `true` if any pattern matches.
- `set.match(str)` — returns array of matching pattern indices.
- Properties: `size`, `source`, `sources`, `flags`, `anchor`.
### Static helpers
- `RE2.getUtf8Length(str)` — byte size of string as UTF-8.
- `RE2.getUtf16Length(buf)` — character count of UTF-8 buffer as UTF-16 string.
- `RE2.unicodeWarningLevel` — `'nothing'` (default), `'warnOnce'`, `'warn'`, or `'throw'`.
## Limitations
RE2 does not support:
- **Backreferences** (`\1`, `\2`, etc.)
- **Lookahead assertions** (`(?=...)`, `(?!...)`)
- **Lookbehind assertions** (`(?<=...)`, `(?<!...)`)
These throw `SyntaxError`. Use try-catch to fall back to RegExp when needed:
```js
let re = /pattern-with-lookahead/;
try { re = new RE2(re); } catch (e) { /* use original RegExp */ }
```
## Project notes
- C++ addon source is in `lib/`. Vendored deps (`vendor/re2/`, `vendor/abseil-cpp/`) are git submodules — **never modify files under `vendor/`**.
## Links
- Docs: https://github.com/uhop/node-re2/wiki
- npm: https://www.npmjs.com/package/re2
- Full LLM reference: https://github.com/uhop/node-re2/blob/master/llms-full.txt
================================================
FILE: package.json
================================================
{
"name": "re2",
"version": "1.24.0",
"description": "Bindings for RE2: fast, safe alternative to backtracking regular expression engines.",
"homepage": "https://github.com/uhop/node-re2",
"bugs": "https://github.com/uhop/node-re2/issues",
"type": "commonjs",
"main": "re2.js",
"types": "re2.d.ts",
"files": [
"binding.gyp",
"lib",
"re2.d.ts",
"scripts/*.js",
"vendor"
],
"dependencies": {
"install-artifact-from-github": "^1.4.0",
"nan": "^2.26.2",
"node-gyp": "^12.2.0"
},
"devDependencies": {
"@types/node": "^25.5.0",
"nano-benchmark": "^1.0.15",
"prettier": "^3.8.1",
"tape-six": "^1.7.13",
"tape-six-proc": "^1.2.8",
"typescript": "^6.0.2"
},
"scripts": {
"test": "tape6 --flags FO",
"test:seq": "tape6-seq --flags FO",
"test:proc": "tape6-proc --flags FO",
"save-to-github": "save-to-github-cache --artifact build/Release/re2.node",
"install": "install-from-cache --artifact build/Release/re2.node --host-var RE2_DOWNLOAD_MIRROR --skip-path-var RE2_DOWNLOAD_SKIP_PATH --skip-ver-var RE2_DOWNLOAD_SKIP_VER || node-gyp -j max rebuild",
"verify-build": "node scripts/verify-build.js",
"build:dev": "node-gyp -j max build --debug",
"build": "node-gyp -j max build",
"build1": "node-gyp build",
"rebuild:dev": "node-gyp -j max rebuild --debug",
"rebuild": "node-gyp -j max rebuild",
"rebuild1": "node-gyp rebuild",
"clean": "node-gyp clean && node-gyp configure",
"clean-build": "node-gyp clean",
"ts-check": "tsc --noEmit",
"lint": "prettier --check *.js *.ts tests/ bench/",
"lint:fix": "prettier --write *.js *.ts tests/ bench/"
},
"github": "https://github.com/uhop/node-re2",
"repository": {
"type": "git",
"url": "git://github.com/uhop/node-re2.git"
},
"keywords": [
"RegExp",
"RegEx",
"text processing",
"PCRE alternative"
],
"author": "Eugene Lazutkin <eugene.lazutkin@gmail.com> (https://lazutkin.com/)",
"funding": "https://github.com/sponsors/uhop",
"license": "BSD-3-Clause",
"tape6": {
"tests": [
"/tests/test-*.*js",
"/tests/test-*.*ts"
]
}
}
================================================
FILE: re2.d.ts
================================================
/// <reference types="node" />
declare module 're2' {
interface RE2BufferExecArray {
index: number;
input: Buffer;
0: Buffer;
groups?: {
[key: string]: Buffer;
};
indices?: RegExpIndicesArray;
}
interface RE2BufferMatchArray {
index?: number;
input?: Buffer;
0: Buffer;
groups?: {
[key: string]: Buffer;
};
}
interface RE2 extends RegExp {
readonly internalSource: string;
exec(str: string): RegExpExecArray | null;
exec(str: Buffer): RE2BufferExecArray | null;
match(str: string): RegExpMatchArray | null;
match(str: Buffer): RE2BufferMatchArray | null;
test(str: string | Buffer): boolean;
replace<K extends String | Buffer>(
str: K,
replaceValue: string | Buffer
): K;
replace<K extends String | Buffer>(
str: K,
replacer: (substring: string, ...args: any[]) => string | Buffer
): K;
search(str: string | Buffer): number;
split<K extends String | Buffer>(str: K, limit?: number): K[];
}
interface RE2SetOptions {
anchor?: 'unanchored' | 'start' | 'both';
}
interface RE2Set {
readonly size: number;
readonly source: string;
readonly sources: string[];
readonly flags: string;
readonly anchor: 'unanchored' | 'start' | 'both';
match(str: string | Buffer): number[];
test(str: string | Buffer): boolean;
toString(): string;
}
interface RE2SetConstructor {
new (
patterns: Iterable<Buffer | RegExp | RE2 | string>,
flagsOrOptions?: string | Buffer | RE2SetOptions,
options?: RE2SetOptions
): RE2Set;
(
patterns: Iterable<Buffer | RegExp | RE2 | string>,
flagsOrOptions?: string | Buffer | RE2SetOptions,
options?: RE2SetOptions
): RE2Set;
readonly prototype: RE2Set;
}
interface RE2Constructor extends RegExpConstructor {
new (pattern: Buffer | RegExp | RE2 | string): RE2;
new (pattern: Buffer | string, flags?: string | Buffer): RE2;
(pattern: Buffer | RegExp | RE2 | string): RE2;
(pattern: Buffer | string, flags?: string | Buffer): RE2;
readonly prototype: RE2;
unicodeWarningLevel: 'nothing' | 'warnOnce' | 'warn' | 'throw';
getUtf8Length(value: string): number;
getUtf16Length(value: Buffer): number;
Set: RE2SetConstructor;
RE2: RE2Constructor;
}
var RE2: RE2Constructor;
export = RE2;
}
================================================
FILE: re2.js
================================================
'use strict';
const RE2 = require('./build/Release/re2.node');
// const RE2 = require('./build/Debug/re2.node');
const setAliases = (object, dict) => {
for (let [name, alias] of Object.entries(dict)) {
Object.defineProperty(
object,
alias,
Object.getOwnPropertyDescriptor(object, name)
);
}
};
setAliases(RE2.prototype, {
match: Symbol.match,
search: Symbol.search,
replace: Symbol.replace,
split: Symbol.split
});
RE2.prototype[Symbol.matchAll] = function* (str) {
if (!this.global)
throw TypeError(
'String.prototype.matchAll() is called with a non-global RE2 argument'
);
const re = new RE2(this);
re.lastIndex = this.lastIndex;
for (;;) {
const result = re.exec(str);
if (!result) break;
if (result[0] === '') ++re.lastIndex;
yield result;
}
};
module.exports = RE2;
module.exports.RE2 = RE2;
================================================
FILE: scripts/verify-build.js
================================================
'use strict';
// This is a light-weight script to make sure that the package works.
const assert = require('assert').strict;
const RE2 = require("../re2");
const sample = "abbcdefabh";
const re1 = new RE2("ab*", "g");
assert(re1.test(sample));
const re2 = RE2("ab*");
assert(re2.test(sample));
const re3 = new RE2("abc");
assert(!re3.test(sample));
================================================
FILE: tests/manual/matchall-bench.js
================================================
'use strict';
const RE2 = require('../../re2');
const N = 1_000_000;
const s = 'a'.repeat(N),
re = new RE2('a', 'g'),
matches = s.matchAll(re);
let n = 0;
for (const _ of matches) ++n;
if (n !== s.length) console.log('Wrong result.');
console.log('Done.');
================================================
FILE: tests/manual/memory-check.js
================================================
'use strict';
const RE2 = require('../../re2.js');
const L = 20 * 1024 * 1024,
N = 100;
if (typeof globalThis.gc != 'function')
console.log(
"Warning: to run it with explicit gc() calls, you should use --expose-gc as a node's argument."
);
const gc = typeof globalThis.gc == 'function' ? globalThis.gc : () => {};
const s = 'a'.repeat(L),
objects = [];
for (let i = 0; i < N; ++i) {
const re2 = new RE2('x', 'g');
objects.push(re2);
const result = s.replace(re2, '');
if (result.length !== s.length) console.log('Wrong result.');
gc();
}
console.log(
'Done. Now it is spinning: check the memory consumption! To stop it, press Ctrl+C.'
);
for (;;);
================================================
FILE: tests/manual/memory-monitor.js
================================================
'use strict';
const RE2 = require('../../re2');
const N = 5_000_000;
console.log('Never-ending loop: exit with Ctrl+C.');
const aCharCode = 'a'.charCodeAt(0);
const randomAlpha = () =>
String.fromCharCode(aCharCode + Math.floor(Math.random() * 26));
const humanizeNumber = n => {
const negative = n < 0;
if (negative) n = -n;
const s = n.toFixed();
let group1 = s.length % 3;
if (!group1) group1 = 3;
let result = s.substring(0, group1);
for (let i = group1; i < s.length; i += 3) {
result += ',' + s.substring(i, i + 3);
}
return (negative ? '-' : '') + result;
};
const CSI = '\x1B[';
const cursorUp = (n = 1) => CSI + (n > 1 ? n.toFixed() : '') + 'A';
const sgr = (cmd = '') =>
CSI + (Array.isArray(cmd) ? cmd.join(';') : cmd) + 'm';
const RESET = sgr();
const NOTE = sgr(91);
let first = true;
const maxMemory = {
heapTotal: 0,
heapUsed: 0,
external: 0,
arrayBuffers: 0,
rss: 0
},
labels = {
heapTotal: 'heap total',
heapUsed: 'heap used',
external: 'external',
arrayBuffers: 'array buffers',
rss: 'resident set size'
},
maxLabelSize = Math.max(
...Array.from(Object.values(labels)).map(label => label.length)
);
const report = () => {
const memoryUsage = process.memoryUsage(),
previousMax = {...maxMemory};
console.log(
(first ? '' : '\r' + cursorUp(6)) + ''.padStart(maxLabelSize + 1),
'Current'.padStart(15),
'Max'.padStart(15)
);
for (const name in maxMemory) {
const prefix =
previousMax[name] && previousMax[name] < memoryUsage[name] ? NOTE : RESET;
console.log(
(labels[name] + ':').padStart(maxLabelSize + 1),
prefix + humanizeNumber(memoryUsage[name]).padStart(15) + RESET,
humanizeNumber(maxMemory[name]).padStart(15)
);
}
for (const [name, value] of Object.entries(maxMemory)) {
maxMemory[name] = Math.max(value, memoryUsage[name]);
}
first = false;
};
for (;;) {
const re2 = new RE2(randomAlpha(), 'g');
let s = '';
for (let i = 0; i < N; ++i) s += randomAlpha();
let n = 0;
for (const _ of s.matchAll(re2)) ++n;
re2.lastIndex = 0;
const r = s.replace(re2, '');
if (r.length + n != s.length) {
console.log(
'ERROR!',
's:',
s.length,
'r:',
r.length,
'n:',
n,
're2:',
re2.toString()
);
break;
}
report();
}
================================================
FILE: tests/manual/test-unicode-warning.mjs
================================================
import test from 'tape-six';
import {RE2} from '../../re2.js';
// tests
// these tests modify the global state of RE2 and cannot be run in parallel with other tests in the same process
test('test new unicode warnOnce', t => {
let errorMessage = '';
const oldConsole = console;
console = {error: msg => (errorMessage = msg)};
RE2.unicodeWarningLevel = 'warnOnce';
let a = new RE2('.*');
t.ok(errorMessage);
errorMessage = '';
a = new RE2('.?');
t.notOk(errorMessage);
RE2.unicodeWarningLevel = 'warnOnce';
a = new RE2('.+');
t.ok(errorMessage);
RE2.unicodeWarningLevel = 'nothing';
console = oldConsole;
});
test('test new unicode warn', t => {
let errorMessage = '';
const oldConsole = console;
console = {error: msg => (errorMessage = msg)};
RE2.unicodeWarningLevel = 'warn';
let a = new RE2('.*');
t.ok(errorMessage);
errorMessage = '';
a = new RE2('.?');
t.ok(errorMessage);
RE2.unicodeWarningLevel = 'nothing';
console = oldConsole;
});
test('test new
SYMBOL INDEX (122 symbols across 35 files)
FILE: bench/bad-pattern.mjs
constant BAD_PATTERN (line 3) | const BAD_PATTERN = '([a-z]+)+$';
constant BAD_INPUT (line 4) | const BAD_INPUT = 'a'.repeat(10) + '!';
FILE: bench/set-match.mjs
constant PATTERN_COUNT (line 3) | const PATTERN_COUNT = 200;
constant INPUT_COUNT (line 10) | const INPUT_COUNT = 500;
FILE: lib/accessors.cc
function NAN_GETTER (line 7) | NAN_GETTER(WrappedRE2::GetSource)
function NAN_GETTER (line 19) | NAN_GETTER(WrappedRE2::GetInternalSource)
function NAN_GETTER (line 31) | NAN_GETTER(WrappedRE2::GetFlags)
function NAN_GETTER (line 71) | NAN_GETTER(WrappedRE2::GetGlobal)
function NAN_GETTER (line 83) | NAN_GETTER(WrappedRE2::GetIgnoreCase)
function NAN_GETTER (line 95) | NAN_GETTER(WrappedRE2::GetMultiline)
function NAN_GETTER (line 107) | NAN_GETTER(WrappedRE2::GetDotAll)
function NAN_GETTER (line 119) | NAN_GETTER(WrappedRE2::GetUnicode)
function NAN_GETTER (line 130) | NAN_GETTER(WrappedRE2::GetSticky)
function NAN_GETTER (line 142) | NAN_GETTER(WrappedRE2::GetHasIndices)
function NAN_GETTER (line 154) | NAN_GETTER(WrappedRE2::GetLastIndex)
function NAN_SETTER (line 166) | NAN_SETTER(WrappedRE2::SetLastIndex)
function NAN_GETTER (line 183) | NAN_GETTER(WrappedRE2::GetUnicodeWarningLevel)
function NAN_SETTER (line 204) | NAN_SETTER(WrappedRE2::SetUnicodeWarningLevel)
FILE: lib/addon.cc
function AddonData (line 11) | AddonData *getAddonData(v8::Isolate *isolate)
function setAddonData (line 18) | void setAddonData(v8::Isolate *isolate, AddonData *data)
function deleteAddonData (line 24) | void deleteAddonData(v8::Isolate *isolate)
function NAN_METHOD (line 35) | static NAN_METHOD(GetUtf8Length)
function NAN_METHOD (line 46) | static NAN_METHOD(GetUtf16Length)
function cleanup (line 57) | static void cleanup(void *p)
function NODE_MODULE_INIT (line 123) | NODE_MODULE_INIT()
function StrVal (line 151) | const StrVal &WrappedRE2::prepareArgument(const v8::Local<v8::Value> &ar...
FILE: lib/exec.cc
function NAN_METHOD (line 5) | NAN_METHOD(WrappedRE2::Exec)
FILE: lib/isolate_data.h
type AddonData (line 5) | struct AddonData {
FILE: lib/match.cc
function NAN_METHOD (line 5) | NAN_METHOD(WrappedRE2::Match)
FILE: lib/new.cc
function ensureUniqueNamedGroups (line 15) | inline bool ensureUniqueNamedGroups(const std::map<int, std::string> &gr...
function NAN_METHOD (line 30) | NAN_METHOD(WrappedRE2::New)
FILE: lib/pattern.cc
function isUpperCaseAlpha (line 10) | inline bool isUpperCaseAlpha(char ch)
function isHexadecimal (line 15) | inline bool isHexadecimal(char ch)
function translateRegExp (line 61) | bool translateRegExp(const char *data, size_t size, bool multiline, std:...
function escapeRegExp (line 218) | std::string escapeRegExp(const char *data, size_t size)
FILE: lib/replace.cc
function getMaxSubmatch (line 8) | inline int getMaxSubmatch(
function replace (line 89) | inline std::string replace(
function replace (line 213) | static Nan::Maybe<std::string> replace(
function replace (line 297) | inline Nan::Maybe<std::string> replace(
function replace (line 382) | static Nan::Maybe<std::string> replace(
function requiresBuffers (line 473) | static bool requiresBuffers(const v8::Local<v8::Function> &f)
function NAN_METHOD (line 491) | NAN_METHOD(WrappedRE2::Replace)
FILE: lib/search.cc
function NAN_METHOD (line 3) | NAN_METHOD(WrappedRE2::Search)
FILE: lib/set.cc
type SetFlags (line 11) | struct SetFlags
function parseFlags (line 22) | static bool parseFlags(const v8::Local<v8::Value> &arg, SetFlags &flags)
function sameEffectiveOptions (line 86) | static bool sameEffectiveOptions(const SetFlags &a, const SetFlags &b)
function flagsToString (line 91) | static std::string flagsToString(const SetFlags &flags)
function collectIterable (line 122) | static bool collectIterable(const v8::Local<v8::Value> &input, std::vect...
function parseAnchor (line 220) | static bool parseAnchor(const v8::Local<v8::Value> &arg, re2::RE2::Ancho...
function fillInput (line 273) | static bool fillInput(const v8::Local<v8::Value> &arg, StrVal &str, v8::...
function anchorToString (line 297) | static std::string anchorToString(re2::RE2::Anchor anchor)
function makeCombinedSource (line 310) | static std::string makeCombinedSource(const std::vector<std::string> &so...
function NAN_METHOD (line 331) | NAN_METHOD(WrappedRE2Set::New)
function NAN_METHOD (line 591) | NAN_METHOD(WrappedRE2Set::Test)
function NAN_METHOD (line 632) | NAN_METHOD(WrappedRE2Set::Match)
function NAN_METHOD (line 681) | NAN_METHOD(WrappedRE2Set::ToString)
function NAN_GETTER (line 697) | NAN_GETTER(WrappedRE2Set::GetFlags)
function NAN_GETTER (line 708) | NAN_GETTER(WrappedRE2Set::GetSources)
function NAN_GETTER (line 724) | NAN_GETTER(WrappedRE2Set::GetSource)
function NAN_GETTER (line 735) | NAN_GETTER(WrappedRE2Set::GetSize)
function NAN_GETTER (line 746) | NAN_GETTER(WrappedRE2Set::GetAnchor)
FILE: lib/split.cc
function NAN_METHOD (line 7) | NAN_METHOD(WrappedRE2::Split)
FILE: lib/test.cc
function NAN_METHOD (line 5) | NAN_METHOD(WrappedRE2::Test)
FILE: lib/to_string.cc
function NAN_METHOD (line 5) | NAN_METHOD(WrappedRE2::ToString)
FILE: lib/util.cc
function consoleCall (line 3) | void consoleCall(const v8::Local<v8::String> &methodName, v8::Local<v8::...
function printDeprecationWarning (line 28) | void printDeprecationWarning(const char *warning)
function callToString (line 35) | v8::Local<v8::String> callToString(const v8::Local<v8::Object> &object)
FILE: lib/wrapped_re2.h
class WrappedRE2 (line 32) | class WrappedRE2 : public Nan::ObjectWrap
type PrepareLastString (line 129) | struct PrepareLastString
FILE: lib/wrapped_re2_set.h
class WrappedRE2Set (line 12) | class WrappedRE2Set : public Nan::ObjectWrap
FILE: re2.d.ts
type RE2BufferExecArray (line 4) | interface RE2BufferExecArray {
type RE2BufferMatchArray (line 14) | interface RE2BufferMatchArray {
type RE2 (line 23) | interface RE2 extends RegExp {
type RE2SetOptions (line 47) | interface RE2SetOptions {
type RE2Set (line 51) | interface RE2Set {
type RE2SetConstructor (line 63) | interface RE2SetConstructor {
type RE2Constructor (line 77) | interface RE2Constructor extends RegExpConstructor {
FILE: re2.js
constant RE2 (line 3) | const RE2 = require('./build/Release/re2.node');
FILE: scripts/verify-build.js
constant RE2 (line 7) | const RE2 = require("../re2");
FILE: tests/manual/matchall-bench.js
constant RE2 (line 3) | const RE2 = require('../../re2');
FILE: tests/manual/memory-check.js
constant RE2 (line 3) | const RE2 = require('../../re2.js');
FILE: tests/manual/memory-monitor.js
constant RE2 (line 3) | const RE2 = require('../../re2');
constant CSI (line 29) | const CSI = '\x1B[';
constant RESET (line 33) | const RESET = sgr();
constant NOTE (line 34) | const NOTE = sgr(91);
FILE: tests/manual/worker.js
constant RE2 (line 5) | const RE2 = require('../../re2');
function test (line 21) | function test(msg) {
FILE: tests/test-cjs.cjs
constant RE2 (line 2) | const RE2 = require('../re2.js');
FILE: tests/test-exec.mjs
method toString (line 86) | toString() {
FILE: tests/test-general.mjs
method toString (line 93) | toString() {
method toString (line 238) | toString() {
method toString (line 249) | toString() {
FILE: tests/test-groups.mjs
function replacerByNumbers (line 62) | function replacerByNumbers(match, group1, group2, index, source, groups) {
function replacerByNames (line 65) | function replacerByNames(match, group1, group2, index, source, groups) {
FILE: tests/test-match.mjs
method toString (line 50) | toString() {
FILE: tests/test-replace.mjs
function replacer (line 24) | function replacer(match, p1, p2, p3, offset, string) {
function upperToHyphenLower (line 35) | function upperToHyphenLower(match) {
function convert (line 45) | function convert(str, p1, offset, s) {
method toString (line 83) | toString() {
method toString (line 96) | toString() {
method toString (line 110) | toString() {
method toString (line 115) | toString() {
method toString (line 138) | toString() {
function replacer (line 169) | function replacer(match, offset, string) {
function replacer (line 206) | function replacer(match, offset, string) {
FILE: tests/test-search.mjs
method toString (line 31) | toString() {
FILE: tests/test-split.mjs
method toString (line 104) | toString() {
FILE: tests/test-test.mjs
method toString (line 73) | toString() {
FILE: ts-tests/test-types.ts
function assertType (line 3) | function assertType<T>(_val: T) {}
function test_constructors (line 5) | function test_constructors() {
function test_properties (line 28) | function test_properties() {
function test_execTypes (line 44) | function test_execTypes() {
function test_execBufferTypes (line 55) | function test_execBufferTypes() {
function test_matchTypes (line 66) | function test_matchTypes() {
function test_matchBufferTypes (line 77) | function test_matchBufferTypes() {
function test_testTypes (line 86) | function test_testTypes() {
function test_searchTypes (line 92) | function test_searchTypes() {
function test_replaceTypes (line 98) | function test_replaceTypes() {
function test_splitTypes (line 105) | function test_splitTypes() {
function test_toStringType (line 113) | function test_toStringType() {
function test_staticMembers (line 118) | function test_staticMembers() {
function test_setTypes (line 132) | function test_setTypes() {
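The `BAD_PATTERN` and `BAD_INPUT` constants listed for `bench/bad-pattern.mjs` in the symbol index describe a classic nested-quantifier pattern. The sketch below is not the benchmark itself (its body is not shown in this index). It only reuses those two constants with the built-in `RegExp` to illustrate why such a pattern is a problem for a backtracking engine. The `timeTest` helper and the chosen input lengths are assumptions made for illustration.

```javascript
// Constants as listed for bench/bad-pattern.mjs in the symbol index.
const BAD_PATTERN = '([a-z]+)+$';
const BAD_INPUT = 'a'.repeat(10) + '!';

// Built-in backtracking engine; the real benchmark presumably also uses RE2
// (assumption: only its first lines are visible in the preview).
const builtin = new RegExp(BAD_PATTERN);

// Hypothetical helper: times a single test() call in milliseconds.
function timeTest(re, input) {
  const start = process.hrtime.bigint();
  const matched = re.test(input);
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return {matched, ms};
}

// The trailing '!' prevents a match, so a backtracking engine tries many
// ways of splitting the run of letters between the nested quantifiers.
for (const n of [10, 14, 18]) {
  const {matched, ms} = timeTest(builtin, 'a'.repeat(n) + '!');
  console.log(`n=${n} matched=${matched} ms=${ms.toFixed(3)}`);
}
```

On a backtracking engine the measured time typically grows quickly as `n` increases, while the match result stays `false`. Exact timings depend on the machine and the Node.js version, so none are quoted here.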
Condensed preview: 103 files, each showing path, character count, and a content snippet (280K chars of full structured content).
[
{
"path": ".clinerules",
"chars": 3511,
"preview": "<!-- Canonical source: AGENTS.md — keep this file in sync -->\n# node-re2 — AI Agent Rules\n\n## Project identity\n\nnode-re2"
},
{
"path": ".cursorrules",
"chars": 3511,
"preview": "<!-- Canonical source: AGENTS.md — keep this file in sync -->\n# node-re2 — AI Agent Rules\n\n## Project identity\n\nnode-re2"
},
{
"path": ".editorconfig",
"chars": 198,
"preview": "root = true\n\n[*]\ncharset = utf-8\nend_of_line = lf\ninsert_final_newline = true\ntrim_trailing_whitespace = true\nindent_sty"
},
{
"path": ".github/COPILOT-INSTRUCTIONS.md",
"chars": 156,
"preview": "<!-- GitHub Copilot project instructions — canonical source is AGENTS.md -->\n\nSee [AGENTS.md](../AGENTS.md) for all AI a"
},
{
"path": ".github/FUNDING.yml",
"chars": 35,
"preview": "github: uhop\nbuy_me_a_coffee: uhop\n"
},
{
"path": ".github/actions/linux-alpine-node-20/Dockerfile",
"chars": 144,
"preview": "FROM node:20-alpine\n\nRUN apk add --no-cache python3 make gcc g++ linux-headers\n\nCOPY entrypoint.sh /entrypoint.sh\nENTRYP"
},
{
"path": ".github/actions/linux-alpine-node-20/action.yml",
"chars": 225,
"preview": "name: 'Create a binary artifact for Node 20 on Alpine Linux'\ndescription: 'Create a binary artifact for Node 20 on Alpin"
},
{
"path": ".github/actions/linux-alpine-node-20/entrypoint.sh",
"chars": 152,
"preview": "#!/bin/sh\n\nset -e\n\nexport USERNAME=`whoami`\nexport DEVELOPMENT_SKIP_GETTING_ASSET=true\nnpm i\nnpm run build --if-present\n"
},
{
"path": ".github/actions/linux-alpine-node-22/Dockerfile",
"chars": 144,
"preview": "FROM node:22-alpine\n\nRUN apk add --no-cache python3 make gcc g++ linux-headers\n\nCOPY entrypoint.sh /entrypoint.sh\nENTRYP"
},
{
"path": ".github/actions/linux-alpine-node-22/action.yml",
"chars": 225,
"preview": "name: 'Create a binary artifact for Node 22 on Alpine Linux'\ndescription: 'Create a binary artifact for Node 22 on Alpin"
},
{
"path": ".github/actions/linux-alpine-node-22/entrypoint.sh",
"chars": 152,
"preview": "#!/bin/sh\n\nset -e\n\nexport USERNAME=`whoami`\nexport DEVELOPMENT_SKIP_GETTING_ASSET=true\nnpm i\nnpm run build --if-present\n"
},
{
"path": ".github/actions/linux-alpine-node-24/Dockerfile",
"chars": 144,
"preview": "FROM node:24-alpine\n\nRUN apk add --no-cache python3 make gcc g++ linux-headers\n\nCOPY entrypoint.sh /entrypoint.sh\nENTRYP"
},
{
"path": ".github/actions/linux-alpine-node-24/action.yml",
"chars": 225,
"preview": "name: 'Create a binary artifact for Node 24 on Alpine Linux'\ndescription: 'Create a binary artifact for Node 24 on Alpin"
},
{
"path": ".github/actions/linux-alpine-node-24/entrypoint.sh",
"chars": 152,
"preview": "#!/bin/sh\n\nset -e\n\nexport USERNAME=`whoami`\nexport DEVELOPMENT_SKIP_GETTING_ASSET=true\nnpm i\nnpm run build --if-present\n"
},
{
"path": ".github/actions/linux-alpine-node-25/Dockerfile",
"chars": 144,
"preview": "FROM node:25-alpine\n\nRUN apk add --no-cache python3 make gcc g++ linux-headers\n\nCOPY entrypoint.sh /entrypoint.sh\nENTRYP"
},
{
"path": ".github/actions/linux-alpine-node-25/action.yml",
"chars": 225,
"preview": "name: 'Create a binary artifact for Node 25 on Alpine Linux'\ndescription: 'Create a binary artifact for Node 25 on Alpin"
},
{
"path": ".github/actions/linux-alpine-node-25/entrypoint.sh",
"chars": 152,
"preview": "#!/bin/sh\n\nset -e\n\nexport USERNAME=`whoami`\nexport DEVELOPMENT_SKIP_GETTING_ASSET=true\nnpm i\nnpm run build --if-present\n"
},
{
"path": ".github/actions/linux-node-20/Dockerfile",
"chars": 125,
"preview": "FROM node:20-bullseye\n\nRUN apt install python3 make gcc g++\n\nCOPY entrypoint.sh /entrypoint.sh\nENTRYPOINT [\"/entrypoint."
},
{
"path": ".github/actions/linux-node-20/action.yml",
"chars": 329,
"preview": "name: 'Create a binary artifact for Node 20 on Debian Bullseye Linux'\ndescription: 'Create a binary artifact for Node 20"
},
{
"path": ".github/actions/linux-node-20/entrypoint.sh",
"chars": 152,
"preview": "#!/bin/sh\n\nset -e\n\nexport USERNAME=`whoami`\nexport DEVELOPMENT_SKIP_GETTING_ASSET=true\nnpm i\nnpm run build --if-present\n"
},
{
"path": ".github/actions/linux-node-22/Dockerfile",
"chars": 125,
"preview": "FROM node:22-bullseye\n\nRUN apt install python3 make gcc g++\n\nCOPY entrypoint.sh /entrypoint.sh\nENTRYPOINT [\"/entrypoint."
},
{
"path": ".github/actions/linux-node-22/action.yml",
"chars": 329,
"preview": "name: 'Create a binary artifact for Node 22 on Debian Bullseye Linux'\ndescription: 'Create a binary artifact for Node 22"
},
{
"path": ".github/actions/linux-node-22/entrypoint.sh",
"chars": 152,
"preview": "#!/bin/sh\n\nset -e\n\nexport USERNAME=`whoami`\nexport DEVELOPMENT_SKIP_GETTING_ASSET=true\nnpm i\nnpm run build --if-present\n"
},
{
"path": ".github/actions/linux-node-24/Dockerfile",
"chars": 125,
"preview": "FROM node:24-bullseye\n\nRUN apt install python3 make gcc g++\n\nCOPY entrypoint.sh /entrypoint.sh\nENTRYPOINT [\"/entrypoint."
},
{
"path": ".github/actions/linux-node-24/action.yml",
"chars": 329,
"preview": "name: 'Create a binary artifact for Node 24 on Debian Bullseye Linux'\ndescription: 'Create a binary artifact for Node 24"
},
{
"path": ".github/actions/linux-node-24/entrypoint.sh",
"chars": 152,
"preview": "#!/bin/sh\n\nset -e\n\nexport USERNAME=`whoami`\nexport DEVELOPMENT_SKIP_GETTING_ASSET=true\nnpm i\nnpm run build --if-present\n"
},
{
"path": ".github/actions/linux-node-25/Dockerfile",
"chars": 123,
"preview": "FROM node:25-trixie\n\nRUN apt install python3 make gcc g++\n\nCOPY entrypoint.sh /entrypoint.sh\nENTRYPOINT [\"/entrypoint.sh"
},
{
"path": ".github/actions/linux-node-25/action.yml",
"chars": 325,
"preview": "name: 'Create a binary artifact for Node 25 on Debian Trixie Linux'\ndescription: 'Create a binary artifact for Node 25 o"
},
{
"path": ".github/actions/linux-node-25/entrypoint.sh",
"chars": 152,
"preview": "#!/bin/sh\n\nset -e\n\nexport USERNAME=`whoami`\nexport DEVELOPMENT_SKIP_GETTING_ASSET=true\nnpm i\nnpm run build --if-present\n"
},
{
"path": ".github/dependabot.yml",
"chars": 600,
"preview": "# To get started with Dependabot version updates, you'll need to specify which\n# package ecosystems to update and where "
},
{
"path": ".github/workflows/build.yml",
"chars": 12037,
"preview": "name: Node.js builds\n\non:\n push:\n tags:\n - v?[0-9]+.[0-9]+.[0-9]+.[0-9]+\n - v?[0-9]+.[0-9]+.[0-9]+\n -"
},
{
"path": ".github/workflows/tests.yml",
"chars": 792,
"preview": "name: Node.js CI\n\non:\n push:\n branches: ['*']\n pull_request:\n branches: [master]\n\njobs:\n tests:\n name: Node."
},
{
"path": ".gitignore",
"chars": 120,
"preview": "node_modules/\nbuild/\nreport/\ncoverage/\n.AppleDouble\n/.development\n/.developmentx\n/.xdevelopment\n\n/scripts/save-local.sh\n"
},
{
"path": ".gitmodules",
"chars": 260,
"preview": "[submodule \"vendor/re2\"]\n\tpath = vendor/re2\n\turl = https://github.com/google/re2\n[submodule \"vendor/abseil-cpp\"]\n\tpath ="
},
{
"path": ".prettierignore",
"chars": 21,
"preview": "/.windsurf/workflows\n"
},
{
"path": ".prettierrc",
"chars": 126,
"preview": "{\n \"printWidth\": 80,\n \"singleQuote\": true,\n \"bracketSpacing\": false,\n \"arrowParens\": \"avoid\",\n \"trailingComma\": \"no"
},
{
"path": ".vscode/c_cpp_properties.json",
"chars": 558,
"preview": "{\n \"configurations\": [\n {\n \"name\": \"Mac\",\n \"includePath\": [\n \"${workspace"
},
{
"path": ".vscode/launch.json",
"chars": 505,
"preview": "{\n // Use IntelliSense to learn about possible attributes.\n // Hover to view descriptions of existing attributes.\n //"
},
{
"path": ".vscode/settings.json",
"chars": 83,
"preview": "{\n \"cSpell.words\": [\n \"heya\",\n \"PCRE\",\n \"replacee\",\n \"Submatch\"\n ]\n}\n"
},
{
"path": ".vscode/tasks.json",
"chars": 213,
"preview": "{\n\t\"version\": \"2.0.0\",\n\t\"tasks\": [\n\t\t{\n\t\t\t\"type\": \"npm\",\n\t\t\t\"script\": \"build:dev\",\n\t\t\t\"group\": \"build\",\n\t\t\t\"problemMatch"
},
{
"path": ".windsurf/skills/docs-review/SKILL.md",
"chars": 1485,
"preview": "---\nname: docs-review\ndescription: Review and improve English in documentation files for brevity and clarity. Use when a"
},
{
"path": ".windsurf/skills/write-tests/SKILL.md",
"chars": 2014,
"preview": "---\nname: write-tests\ndescription: Write or update tape-six tests for a module or feature. Use when asked to write tests"
},
{
"path": ".windsurf/workflows/add-module.md",
"chars": 2182,
"preview": "---\ndescription: Checklist for adding a new C++ method or JS feature to node-re2\n---\n\n# Add a New Module\n\nFollow these s"
},
{
"path": ".windsurf/workflows/ai-docs-update.md",
"chars": 1278,
"preview": "---\ndescription: Update AI-facing documentation files after API or architecture changes\n---\n\n# AI Documentation Update\n\n"
},
{
"path": ".windsurf/workflows/release-check.md",
"chars": 1324,
"preview": "---\ndescription: Pre-release verification checklist for node-re2\n---\n\n# Release Check\n\nRun through this checklist before"
},
{
"path": ".windsurfrules",
"chars": 3511,
"preview": "<!-- Canonical source: AGENTS.md — keep this file in sync -->\n# node-re2 — AI Agent Rules\n\n## Project identity\n\nnode-re2"
},
{
"path": "AGENTS.md",
"chars": 7180,
"preview": "# AGENTS.md — node-re2\n\n> `node-re2` provides Node.js bindings for [RE2](https://github.com/google/re2): a fast, safe al"
},
{
"path": "ARCHITECTURE.md",
"chars": 7833,
"preview": "# Architecture\n\n`node-re2` provides Node.js bindings for Google's [RE2](https://github.com/google/re2) regular expressio"
},
{
"path": "CLAUDE.md",
"chars": 152,
"preview": "<!-- Claude Code project instructions — canonical source is AGENTS.md -->\n\nSee [AGENTS.md](./AGENTS.md) for all AI agent"
},
{
"path": "CONTRIBUTING.md",
"chars": 1277,
"preview": "# Contributing to node-re2\n\nThank you for your interest in contributing!\n\n## Getting started\n\nThis project uses git subm"
},
{
"path": "LICENSE",
"chars": 1909,
"preview": "This library is available under the terms of the modified BSD license. No external contributions\nare allowed under licen"
},
{
"path": "README.md",
"chars": 19758,
"preview": "# node-re2 [![NPM version][npm-img]][npm-url]\n\n[npm-img]: https://img.shields.io/npm/v/re2.svg\n[npm-url]: https://npmjs."
},
{
"path": "bench/bad-pattern.mjs",
"chars": 479,
"preview": "import {RE2} from '../re2.js';\n\nconst BAD_PATTERN = '([a-z]+)+$';\nconst BAD_INPUT = 'a'.repeat(10) + '!';\n\nconst regExp "
},
{
"path": "bench/set-match.mjs",
"chars": 1406,
"preview": "import {RE2} from '../re2.js';\n\nconst PATTERN_COUNT = 200;\n\nconst patterns = [];\nfor (let i = 0; i < PATTERN_COUNT; ++i)"
},
{
"path": "binding.gyp",
"chars": 7665,
"preview": "{\n \"targets\": [\n {\n \"target_name\": \"re2\",\n \"sources\": [\n \"lib/addon.cc\",\n \"lib/accessors.cc\""
},
{
"path": "lib/accessors.cc",
"chars": 4463,
"preview": "#include \"./wrapped_re2.h\"\n\n#include <cstring>\n#include <string>\n#include <vector>\n\nNAN_GETTER(WrappedRE2::GetSource)\n{\n"
},
{
"path": "lib/addon.cc",
"chars": 6824,
"preview": "#include \"./wrapped_re2.h\"\n#include \"./wrapped_re2_set.h\"\n#include \"./isolate_data.h\"\n\n#include <mutex>\n#include <unorde"
},
{
"path": "lib/exec.cc",
"chars": 4405,
"preview": "#include \"./wrapped_re2.h\"\n\n#include <vector>\n\nNAN_METHOD(WrappedRE2::Exec)\n{\n\n\t// unpack arguments\n\n\tauto re2 = Nan::Ob"
},
{
"path": "lib/isolate_data.h",
"chars": 301,
"preview": "#pragma once\n\n#include <nan.h>\n\nstruct AddonData {\n\tNan::Persistent<v8::FunctionTemplate> re2Tpl;\n\tNan::Persistent<v8::F"
},
{
"path": "lib/match.cc",
"chars": 4950,
"preview": "#include \"./wrapped_re2.h\"\n\n#include <vector>\n\nNAN_METHOD(WrappedRE2::Match)\n{\n\n\t// unpack arguments\n\n\tauto re2 = Nan::O"
},
{
"path": "lib/new.cc",
"chars": 6214,
"preview": "#include \"./wrapped_re2.h\"\n#include \"./util.h\"\n#include \"./pattern.h\"\n\n#include <map>\n#include <memory>\n#include <string"
},
{
"path": "lib/pattern.cc",
"chars": 5100,
"preview": "#include \"./pattern.h\"\n#include \"./wrapped_re2.h\"\n\n#include <cstring>\n#include <map>\n#include <string>\n\nstatic char hex["
},
{
"path": "lib/pattern.h",
"chars": 309,
"preview": "#pragma once\n\n#include <string>\n#include <vector>\n\n// Shared helpers for translating JavaScript-style regular expression"
},
{
"path": "lib/replace.cc",
"chars": 11993,
"preview": "#include \"./wrapped_re2.h\"\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <vector>\n\ninline int getMa"
},
{
"path": "lib/search.cc",
"chars": 678,
"preview": "#include \"./wrapped_re2.h\"\n\nNAN_METHOD(WrappedRE2::Search)\n{\n\n\t// unpack arguments\n\n\tauto re2 = Nan::ObjectWrap::Unwrap<"
},
{
"path": "lib/set.cc",
"chars": 18226,
"preview": "#include \"./wrapped_re2_set.h\"\n#include \"./pattern.h\"\n#include \"./util.h\"\n#include \"./wrapped_re2.h\"\n\n#include <algorith"
},
{
"path": "lib/split.cc",
"chars": 2377,
"preview": "#include \"./wrapped_re2.h\"\n\n#include <algorithm>\n#include <limits>\n#include <vector>\n\nNAN_METHOD(WrappedRE2::Split)\n{\n\n\t"
},
{
"path": "lib/test.cc",
"chars": 1026,
"preview": "#include \"./wrapped_re2.h\"\n\n#include <vector>\n\nNAN_METHOD(WrappedRE2::Test)\n{\n\n\t// unpack arguments\n\n\tauto re2 = Nan::Ob"
},
{
"path": "lib/to_string.cc",
"chars": 660,
"preview": "#include \"./wrapped_re2.h\"\n\n#include <string>\n\nNAN_METHOD(WrappedRE2::ToString)\n{\n\n\t// unpack arguments\n\n\tauto re2 = Nan"
},
{
"path": "lib/util.cc",
"chars": 1957,
"preview": "#include \"./util.h\"\n\nvoid consoleCall(const v8::Local<v8::String> &methodName, v8::Local<v8::Value> text)\n{\n\tauto contex"
},
{
"path": "lib/util.h",
"chars": 449,
"preview": "#pragma once\n\n#include \"./wrapped_re2.h\"\n\ntemplate <typename R, typename P, typename L>\ninline v8::MaybeLocal<R> bind(v8"
},
{
"path": "lib/wrapped_re2.h",
"chars": 4722,
"preview": "#pragma once\n\n#include <atomic>\n#include <string>\n#include <nan.h>\n#include <re2/re2.h>\n\n#include \"./isolate_data.h\"\n\nst"
},
{
"path": "lib/wrapped_re2_set.h",
"chars": 1064,
"preview": "#pragma once\n\n#include <nan.h>\n#include <re2/re2.h>\n#include <re2/set.h>\n\n#include \"./isolate_data.h\"\n\n#include <string>"
},
{
"path": "llms-full.txt",
"chars": 12516,
"preview": "# node-re2\n\n> Node.js bindings for RE2: a fast, safe alternative to backtracking regular expression engines. Drop-in Reg"
},
{
"path": "llms.txt",
"chars": 3932,
"preview": "# node-re2\n\n> Node.js bindings for RE2: a fast, safe alternative to backtracking regular expression engines. Drop-in Reg"
},
{
"path": "package.json",
"chars": 2179,
"preview": "{\n \"name\": \"re2\",\n \"version\": \"1.24.0\",\n \"description\": \"Bindings for RE2: fast, safe alternative to backtracking reg"
},
{
"path": "re2.d.ts",
"chars": 2404,
"preview": "/// <reference types=\"node\" />\n\ndeclare module 're2' {\n interface RE2BufferExecArray {\n index: number;\n input: Bu"
},
{
"path": "re2.js",
"chars": 881,
"preview": "'use strict';\n\nconst RE2 = require('./build/Release/re2.node');\n// const RE2 = require('./build/Debug/re2.node');\n\nconst"
},
{
"path": "scripts/verify-build.js",
"chars": 357,
"preview": "'use strict';\n\n// This is a light-weight script to make sure that the package works.\n\nconst assert = require('assert').s"
},
{
"path": "tests/manual/matchall-bench.js",
"chars": 267,
"preview": "'use strict';\n\nconst RE2 = require('../../re2');\n\nconst N = 1_000_000;\n\nconst s = 'a'.repeat(N),\n re = new RE2('a', 'g'"
},
{
"path": "tests/manual/memory-check.js",
"chars": 680,
"preview": "'use strict';\n\nconst RE2 = require('../../re2.js');\n\nconst L = 20 * 1024 * 1024,\n N = 100;\n\nif (typeof globalThis.gc !="
},
{
"path": "tests/manual/memory-monitor.js",
"chars": 2379,
"preview": "'use strict';\n\nconst RE2 = require('../../re2');\n\nconst N = 5_000_000;\n\nconsole.log('Never-ending loop: exit with Ctrl+C"
},
{
"path": "tests/manual/test-unicode-warning.mjs",
"chars": 1249,
"preview": "import test from 'tape-six';\nimport {RE2} from '../../re2.js';\n\n// tests\n// these tests modify the global state of RE2 a"
},
{
"path": "tests/manual/worker.js",
"chars": 805,
"preview": "'use strict';\n\nconst {Worker, isMainThread} = require('worker_threads');\n\nconst RE2 = require('../../re2');\n\nif (isMainT"
},
{
"path": "tests/test-cjs.cjs",
"chars": 2420,
"preview": "const {test} = require('tape-six');\nconst RE2 = require('../re2.js');\n\ntest('CJS require', t => {\n t.ok(RE2, 'RE2 is lo"
},
{
"path": "tests/test-exec.mjs",
"chars": 9956,
"preview": "import test from 'tape-six';\nimport {RE2} from '../re2.js';\n\n// tests\n\n// These tests are copied from MDN:\n// https://de"
},
{
"path": "tests/test-general.mjs",
"chars": 6272,
"preview": "import test from 'tape-six';\nimport {default as RE2} from '../re2.js';\n\n// utilities\n\nconst compare = (re1, re2, t) => {"
},
{
"path": "tests/test-groups.mjs",
"chars": 2349,
"preview": "import test from 'tape-six';\nimport {RE2} from '../re2.js';\n\n// tests\n\ntest('groups normal', t => {\n t.equal(RE2('(?<a>"
},
{
"path": "tests/test-invalid.mjs",
"chars": 763,
"preview": "import test from 'tape-six';\nimport {RE2} from '../re2.js';\n\n// tests\n\ntest('invalid', t => {\n let threw;\n\n // Backref"
},
{
"path": "tests/test-match.mjs",
"chars": 3876,
"preview": "import test from 'tape-six';\nimport {RE2} from '../re2.js';\n\n// tests\n\n// These tests are copied from MDN:\n// https://de"
},
{
"path": "tests/test-matchAll.mjs",
"chars": 2006,
"preview": "import test from 'tape-six';\nimport {RE2} from '../re2.js';\n\n// tests\n\n// These tests are copied from MDN:\n// https://de"
},
{
"path": "tests/test-prototype.mjs",
"chars": 456,
"preview": "import test from 'tape-six';\nimport {RE2} from '../re2.js';\n\n// tests\n\ntest('test prototype', t => {\n t.equal(RE2.proto"
},
{
"path": "tests/test-replace.mjs",
"chars": 9649,
"preview": "import test from 'tape-six';\nimport {RE2} from '../re2.js';\n\n// tests\n\n// These tests are copied from MDN:\n// https://de"
},
{
"path": "tests/test-search.mjs",
"chars": 1893,
"preview": "import test from 'tape-six';\nimport {RE2} from '../re2.js';\n\n// tests\n\ntest('test search', t => {\n const str = 'Total i"
},
{
"path": "tests/test-set.mjs",
"chars": 3189,
"preview": "import test from 'tape-six';\nimport {RE2} from '../re2.js';\n\ntest('test set basics', t => {\n const set = new RE2.Set(['"
},
{
"path": "tests/test-source.mjs",
"chars": 1704,
"preview": "import test from 'tape-six';\nimport {RE2} from '../re2.js';\n\n// tests\n\ntest('test source identity', t => {\n let re = ne"
},
{
"path": "tests/test-split.mjs",
"chars": 5628,
"preview": "import test from 'tape-six';\nimport {RE2} from '../re2.js';\n\n// utilities\n\nconst verifyBuffer = (bufArray, t) =>\n bufAr"
},
{
"path": "tests/test-symbols.mjs",
"chars": 2763,
"preview": "import test from 'tape-six';\nimport {RE2} from '../re2.js';\n\n// tests\n\ntest('test match symbol', t => {\n if (typeof Sym"
},
{
"path": "tests/test-test.mjs",
"chars": 4411,
"preview": "import test from 'tape-six';\nimport {RE2} from '../re2.js';\n\n// tests\n\n// These tests are copied from MDN:\n// https://de"
},
{
"path": "tests/test-toString.mjs",
"chars": 1052,
"preview": "import test from 'tape-six';\nimport {RE2} from '../re2.js';\n\n// tests\n\ntest('test toString', t => {\n t.equal(RE2('').to"
},
{
"path": "tests/test-unicode-classes.mjs",
"chars": 634,
"preview": "import test from 'tape-six';\nimport {RE2} from '../re2.js';\n\n// tests\n\ntest('test_unicodeClasses', t => {\n 'use strict'"
},
{
"path": "ts-tests/test-types.ts",
"chars": 4780,
"preview": "import RE2 from 're2';\n\nfunction assertType<T>(_val: T) {}\n\nfunction test_constructors() {\n const re1 = new RE2('abc');"
},
{
"path": "tsconfig.json",
"chars": 668,
"preview": "{\n \"compilerOptions\": {\n \"noEmit\": true,\n \"lib\": [\"ES2022\"],\n \"types\": [\"node\"],\n \"declaration\": true,\n "
}
]
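Each entry in the array above has the same three fields: `path`, `chars`, and `preview`. As a rough illustration of how that structure could be consumed, the sketch below sums character counts per top-level directory and picks the largest files. The helper names `charsByTopLevel` and `largest` are hypothetical, and the sample data is three entries copied from the listing with their previews shortened.

```javascript
// Three entries copied from the condensed preview (previews shortened).
const entries = [
  {path: 'lib/set.cc', chars: 18226, preview: '#include "./wrapped_re2_set.h"'},
  {path: 'lib/search.cc', chars: 678, preview: '#include "./wrapped_re2.h"'},
  {path: 'README.md', chars: 19758, preview: '# node-re2'}
];

// Hypothetical helper: sums character counts per top-level directory.
// Files without a directory component are grouped under '(root)'.
function charsByTopLevel(items) {
  const totals = new Map();
  for (const {path, chars} of items) {
    const slash = path.indexOf('/');
    const key = slash === -1 ? '(root)' : path.slice(0, slash);
    totals.set(key, (totals.get(key) ?? 0) + chars);
  }
  return totals;
}

// Hypothetical helper: returns the `count` largest entries by character count
// without mutating the input array.
function largest(items, count) {
  return [...items].sort((a, b) => b.chars - a.chars).slice(0, count);
}

const totals = charsByTopLevel(entries);
console.log(Object.fromEntries(totals));
console.log(largest(entries, 2).map(e => e.path));
```

With the full array saved to a file, the same helpers could be applied to the result of `JSON.parse` on that file's contents. The file name would depend on how the export was saved.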
About this extraction
This page contains the full source code of the uhop/node-re2 GitHub repository, extracted and formatted as plain text. The extraction includes 103 files (248.6 KB), approximately 78.2k tokens, and a symbol index with 122 extracted functions, classes, methods, constants, and types.
Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.