Repository: tidyverse/stringr Branch: main Commit: ae054b1d28f6 Files: 163 Total size: 377.5 KB Directory structure: gitextract_1kgwvzj7/ ├── .Rbuildignore ├── .covrignore ├── .github/ │ ├── .gitignore │ ├── CODE_OF_CONDUCT.md │ └── workflows/ │ ├── R-CMD-check.yaml │ ├── pkgdown.yaml │ ├── pr-commands.yaml │ └── test-coverage.yaml ├── .gitignore ├── .vscode/ │ ├── extensions.json │ └── settings.json ├── DESCRIPTION ├── LICENSE ├── LICENSE.md ├── NAMESPACE ├── NEWS.md ├── R/ │ ├── c.R │ ├── case.R │ ├── compat-obj-type.R │ ├── compat-purrr.R │ ├── compat-types-check.R │ ├── conv.R │ ├── count.R │ ├── data.R │ ├── detect.R │ ├── dup.R │ ├── equal.R │ ├── escape.R │ ├── extract.R │ ├── flatten.R │ ├── glue.R │ ├── interp.R │ ├── length.R │ ├── locate.R │ ├── match.R │ ├── modifiers.R │ ├── pad.R │ ├── remove.R │ ├── replace.R │ ├── sort.R │ ├── split.R │ ├── stringr-package.R │ ├── sub.R │ ├── subset.R │ ├── trim.R │ ├── trunc.R │ ├── unique.R │ ├── utils.R │ ├── view.R │ ├── word.R │ └── wrap.R ├── README.Rmd ├── README.md ├── _pkgdown.yml ├── air.toml ├── codecov.yml ├── cran-comments.md ├── data/ │ ├── fruit.rda │ ├── sentences.rda │ └── words.rda ├── data-raw/ │ ├── harvard-sentences.txt │ └── samples.R ├── inst/ │ └── htmlwidgets/ │ ├── lib/ │ │ └── str_view.css │ ├── str_view.js │ └── str_view.yaml ├── man/ │ ├── case.Rd │ ├── invert_match.Rd │ ├── modifiers.Rd │ ├── pipe.Rd │ ├── str_c.Rd │ ├── str_conv.Rd │ ├── str_count.Rd │ ├── str_detect.Rd │ ├── str_dup.Rd │ ├── str_equal.Rd │ ├── str_escape.Rd │ ├── str_extract.Rd │ ├── str_flatten.Rd │ ├── str_glue.Rd │ ├── str_interp.Rd │ ├── str_length.Rd │ ├── str_like.Rd │ ├── str_locate.Rd │ ├── str_match.Rd │ ├── str_order.Rd │ ├── str_pad.Rd │ ├── str_remove.Rd │ ├── str_replace.Rd │ ├── str_replace_na.Rd │ ├── str_split.Rd │ ├── str_starts.Rd │ ├── str_sub.Rd │ ├── str_subset.Rd │ ├── str_to_camel.Rd │ ├── str_trim.Rd │ ├── str_trunc.Rd │ ├── str_unique.Rd │ ├── str_view.Rd │ ├── str_which.Rd │ ├── str_wrap.Rd │ 
├── stringr-data.Rd │ ├── stringr-package.Rd │ └── word.Rd ├── po/ │ ├── R-es.po │ └── R-stringr.pot ├── revdep/ │ ├── .gitignore │ ├── README.md │ ├── cran.md │ ├── email.yml │ ├── failures.md │ └── problems.md ├── stringr.Rproj ├── tests/ │ ├── testthat/ │ │ ├── _snaps/ │ │ │ ├── c.md │ │ │ ├── conv.md │ │ │ ├── detect.md │ │ │ ├── dup.md │ │ │ ├── equal.md │ │ │ ├── flatten.md │ │ │ ├── interp.md │ │ │ ├── match.md │ │ │ ├── modifiers.md │ │ │ ├── replace.md │ │ │ ├── split.md │ │ │ ├── sub.md │ │ │ ├── subset.md │ │ │ ├── trunc.md │ │ │ └── view.md │ │ ├── test-c.R │ │ ├── test-case.R │ │ ├── test-conv.R │ │ ├── test-count.R │ │ ├── test-detect.R │ │ ├── test-dup.R │ │ ├── test-equal.R │ │ ├── test-escape.R │ │ ├── test-extract.R │ │ ├── test-flatten.R │ │ ├── test-glue.R │ │ ├── test-interp.R │ │ ├── test-length.R │ │ ├── test-locate.R │ │ ├── test-match.R │ │ ├── test-modifiers.R │ │ ├── test-pad.R │ │ ├── test-remove.R │ │ ├── test-replace.R │ │ ├── test-sort.R │ │ ├── test-split.R │ │ ├── test-sub.R │ │ ├── test-subset.R │ │ ├── test-trim.R │ │ ├── test-trunc.R │ │ ├── test-unique.R │ │ ├── test-utils.R │ │ ├── test-view.R │ │ ├── test-word.R │ │ └── test-wrap.R │ └── testthat.R └── vignettes/ ├── .gitignore ├── from-base.Rmd ├── locale-sensitive.Rmd ├── regular-expressions.Rmd └── stringr.Rmd ================================================ FILE CONTENTS ================================================ ================================================ FILE: .Rbuildignore ================================================ ^pkgdown$ ^\.covrignore$ ^.*\.Rproj$ ^\.Rproj\.user$ ^packrat/ ^\.Rprofile$ ^\.travis\.yml$ ^revdep$ ^cran-comments\.md$ ^data-raw$ ^codecov\.yml$ ^\.httr-oauth$ ^_pkgdown\.yml$ ^doc$ ^docs$ ^Meta$ ^README\.Rmd$ ^README-.*\.png$ ^appveyor\.yml$ ^CRAN-RELEASE$ ^LICENSE\.md$ ^\.github$ ^CRAN-SUBMISSION$ ^[.]?air[.]toml$ ^\.vscode$ ================================================ FILE: .covrignore ================================================ 
R/deprec-*.R R/compat-*.R ================================================ FILE: .github/.gitignore ================================================ *.html ================================================ FILE: .github/CODE_OF_CONDUCT.md ================================================ # Contributor Covenant Code of Conduct ## Our Pledge We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation. We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community. ## Our Standards Examples of behavior that contributes to a positive environment for our community include: * Demonstrating empathy and kindness toward other people * Being respectful of differing opinions, viewpoints, and experiences * Giving and gracefully accepting constructive feedback * Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience * Focusing on what is best not just for us as individuals, but for the overall community Examples of unacceptable behavior include: * The use of sexualized language or imagery, and sexual attention or advances of any kind * Trolling, insulting or derogatory comments, and personal or political attacks * Public or private harassment * Publishing others' private information, such as a physical or email address, without their explicit permission * Other conduct which could reasonably be considered inappropriate in a professional setting ## Enforcement Responsibilities Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action 
in response to any behavior that they deem inappropriate, threatening, offensive, or harmful. Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate. ## Scope This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. ## Enforcement Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at codeofconduct@posit.co. All complaints will be reviewed and investigated promptly and fairly. All community leaders are obligated to respect the privacy and security of the reporter of any incident. ## Enforcement Guidelines Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct: ### 1. Correction **Community Impact**: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community. **Consequence**: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested. ### 2. Warning **Community Impact**: A violation through a single incident or series of actions. **Consequence**: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. 
This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban. ### 3. Temporary Ban **Community Impact**: A serious violation of community standards, including sustained inappropriate behavior. **Consequence**: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban. ### 4. Permanent Ban **Community Impact**: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals. **Consequence**: A permanent ban from any sort of public interaction within the community. ## Attribution This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 2.1, available at https://www.contributor-covenant.org/version/2/1/code_of_conduct.html. Community Impact Guidelines were inspired by [Mozilla's code of conduct enforcement ladder](https://github.com/mozilla/inclusion). For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations. [homepage]: https://www.contributor-covenant.org ================================================ FILE: .github/workflows/R-CMD-check.yaml ================================================ # Workflow derived from https://github.com/r-lib/actions/tree/v2/examples # Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help # # NOTE: This workflow is overkill for most R packages and # check-standard.yaml is likely a better choice. # usethis::use_github_action("check-standard") will install it.
on: push: branches: [main, master] pull_request: branches: [main, master] name: R-CMD-check.yaml permissions: read-all jobs: R-CMD-check: runs-on: ${{ matrix.config.os }} name: ${{ matrix.config.os }} (${{ matrix.config.r }}) strategy: fail-fast: false matrix: config: - {os: macos-latest, r: 'release'} - {os: windows-latest, r: 'release'} # use 4.0 or 4.1 to check with rtools40's older compiler - {os: windows-latest, r: 'oldrel-4'} - {os: ubuntu-latest, r: 'devel', http-user-agent: 'release'} - {os: ubuntu-latest, r: 'release'} - {os: ubuntu-latest, r: 'oldrel-1'} - {os: ubuntu-latest, r: 'oldrel-2'} - {os: ubuntu-latest, r: 'oldrel-3'} - {os: ubuntu-latest, r: 'oldrel-4'} env: GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} R_KEEP_PKG_SOURCE: yes steps: - uses: actions/checkout@v4 - uses: r-lib/actions/setup-pandoc@v2 - uses: r-lib/actions/setup-r@v2 with: r-version: ${{ matrix.config.r }} http-user-agent: ${{ matrix.config.http-user-agent }} use-public-rspm: true - uses: r-lib/actions/setup-r-dependencies@v2 with: extra-packages: any::rcmdcheck needs: check - uses: r-lib/actions/check-r-package@v2 with: upload-snapshots: true build_args: 'c("--no-manual","--compact-vignettes=gs+qpdf")' ================================================ FILE: .github/workflows/pkgdown.yaml ================================================ # Workflow derived from https://github.com/r-lib/actions/tree/v2/examples # Need help debugging build failures? 
Start at https://github.com/r-lib/actions#where-to-find-help on: push: branches: [main, master] pull_request: branches: [main, master] release: types: [published] workflow_dispatch: name: pkgdown.yaml permissions: read-all jobs: pkgdown: runs-on: ubuntu-latest # Only restrict concurrency for non-PR jobs concurrency: group: pkgdown-${{ github.event_name != 'pull_request' || github.run_id }} env: GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} permissions: contents: write steps: - uses: actions/checkout@v4 - uses: r-lib/actions/setup-pandoc@v2 - uses: r-lib/actions/setup-r@v2 with: use-public-rspm: true - uses: r-lib/actions/setup-r-dependencies@v2 with: extra-packages: any::pkgdown, local::. needs: website - name: Build site run: pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE) shell: Rscript {0} - name: Deploy to GitHub pages 🚀 if: github.event_name != 'pull_request' uses: JamesIves/github-pages-deploy-action@v4.5.0 with: clean: false branch: gh-pages folder: docs ================================================ FILE: .github/workflows/pr-commands.yaml ================================================ # Workflow derived from https://github.com/r-lib/actions/tree/v2/examples # Need help debugging build failures? 
Start at https://github.com/r-lib/actions#where-to-find-help on: issue_comment: types: [created] name: pr-commands.yaml permissions: read-all jobs: document: if: ${{ github.event.issue.pull_request && (github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'OWNER') && startsWith(github.event.comment.body, '/document') }} name: document runs-on: ubuntu-latest env: GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} permissions: contents: write steps: - uses: actions/checkout@v4 - uses: r-lib/actions/pr-fetch@v2 with: repo-token: ${{ secrets.GITHUB_TOKEN }} - uses: r-lib/actions/setup-r@v2 with: use-public-rspm: true - uses: r-lib/actions/setup-r-dependencies@v2 with: extra-packages: any::roxygen2 needs: pr-document - name: Document run: roxygen2::roxygenise() shell: Rscript {0} - name: commit run: | git config --local user.name "$GITHUB_ACTOR" git config --local user.email "$GITHUB_ACTOR@users.noreply.github.com" git add man/\* NAMESPACE git commit -m 'Document' - uses: r-lib/actions/pr-push@v2 with: repo-token: ${{ secrets.GITHUB_TOKEN }} style: if: ${{ github.event.issue.pull_request && (github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'OWNER') && startsWith(github.event.comment.body, '/style') }} name: style runs-on: ubuntu-latest env: GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} permissions: contents: write steps: - uses: actions/checkout@v4 - uses: r-lib/actions/pr-fetch@v2 with: repo-token: ${{ secrets.GITHUB_TOKEN }} - uses: r-lib/actions/setup-r@v2 - name: Install dependencies run: install.packages("styler") shell: Rscript {0} - name: Style run: styler::style_pkg() shell: Rscript {0} - name: commit run: | git config --local user.name "$GITHUB_ACTOR" git config --local user.email "$GITHUB_ACTOR@users.noreply.github.com" git add \*.R git commit -m 'Style' - uses: r-lib/actions/pr-push@v2 with: repo-token: ${{ secrets.GITHUB_TOKEN }} ================================================ FILE: 
.github/workflows/test-coverage.yaml ================================================ # Workflow derived from https://github.com/r-lib/actions/tree/v2/examples # Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help on: push: branches: [main, master] pull_request: branches: [main, master] name: test-coverage.yaml permissions: read-all jobs: test-coverage: runs-on: ubuntu-latest env: GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} steps: - uses: actions/checkout@v4 - uses: r-lib/actions/setup-r@v2 with: use-public-rspm: true - uses: r-lib/actions/setup-r-dependencies@v2 with: extra-packages: any::covr, any::xml2 needs: coverage - name: Test coverage run: | cov <- covr::package_coverage( quiet = FALSE, clean = FALSE, install_path = file.path(normalizePath(Sys.getenv("RUNNER_TEMP"), winslash = "/"), "package") ) covr::to_cobertura(cov) shell: Rscript {0} - uses: codecov/codecov-action@v4 with: fail_ci_if_error: ${{ github.event_name != 'pull_request' && true || false }} file: ./cobertura.xml plugin: noop disable_search: true token: ${{ secrets.CODECOV_TOKEN }} - name: Show testthat output if: always() run: | ## -------------------------------------------------------------------- find '${{ runner.temp }}/package' -name 'testthat.Rout*' -exec cat '{}' \; || true shell: bash - name: Upload test results if: failure() uses: actions/upload-artifact@v4 with: name: coverage-test-failures path: ${{ runner.temp }}/package ================================================ FILE: .gitignore ================================================ docs .Rproj.user .Rhistory .RData packrat/lib*/ packrat/src inst/doc .httr-oauth revdep/checks revdep/library revdep/checks.noindex revdep/library.noindex revdep/data.sqlite /doc/ /Meta/ ================================================ FILE: .vscode/extensions.json ================================================ { "recommendations": [ "Posit.air-vscode" ] } ================================================ FILE: 
.vscode/settings.json ================================================ { "[r]": { "editor.formatOnSave": true, "editor.defaultFormatter": "Posit.air-vscode" } } ================================================ FILE: DESCRIPTION ================================================ Package: stringr Title: Simple, Consistent Wrappers for Common String Operations Version: 1.6.0.9000 Authors@R: c( person("Hadley", "Wickham", , "hadley@posit.co", role = c("aut", "cre", "cph")), person("Posit Software, PBC", role = c("cph", "fnd")) ) Description: A consistent, simple and easy to use set of wrappers around the fantastic 'stringi' package. All function and argument names (and positions) are consistent, all functions deal with "NA"'s and zero length vectors in the same way, and the output from one function is easy to feed into the input of another. License: MIT + file LICENSE URL: https://stringr.tidyverse.org, https://github.com/tidyverse/stringr BugReports: https://github.com/tidyverse/stringr/issues Depends: R (>= 3.6) Imports: cli, glue (>= 1.6.1), lifecycle (>= 1.0.3), magrittr, rlang (>= 1.0.0), stringi (>= 1.5.3), vctrs (>= 0.4.0) Suggests: covr, dplyr, gt, htmltools, htmlwidgets, knitr, rmarkdown, testthat (>= 3.0.0), tibble VignetteBuilder: knitr Config/Needs/website: tidyverse/tidytemplate Config/potools/style: explicit Config/testthat/edition: 3 Encoding: UTF-8 LazyData: true Roxygen: list(markdown = TRUE) RoxygenNote: 7.3.3 ================================================ FILE: LICENSE ================================================ YEAR: 2023 COPYRIGHT HOLDER: stringr authors ================================================ FILE: LICENSE.md ================================================ # MIT License Copyright (c) 2023 stringr authors Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights 
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ================================================ FILE: NAMESPACE ================================================ # Generated by roxygen2: do not edit by hand S3method("[",stringr_pattern) S3method("[",stringr_view) S3method("[[",stringr_pattern) S3method(print,stringr_view) S3method(type,character) S3method(type,default) S3method(type,stringr_boundary) S3method(type,stringr_coll) S3method(type,stringr_fixed) S3method(type,stringr_regex) export("%>%") export("str_sub<-") export(boundary) export(coll) export(fixed) export(invert_match) export(regex) export(str_c) export(str_conv) export(str_count) export(str_detect) export(str_dup) export(str_ends) export(str_equal) export(str_escape) export(str_extract) export(str_extract_all) export(str_flatten) export(str_flatten_comma) export(str_glue) export(str_glue_data) export(str_ilike) export(str_interp) export(str_length) export(str_like) export(str_locate) export(str_locate_all) export(str_match) export(str_match_all) export(str_order) export(str_pad) export(str_rank) export(str_remove) export(str_remove_all) export(str_replace) export(str_replace_all) export(str_replace_na) export(str_sort) export(str_split) export(str_split_1) 
export(str_split_fixed) export(str_split_i) export(str_squish) export(str_starts) export(str_sub) export(str_sub_all) export(str_subset) export(str_to_camel) export(str_to_kebab) export(str_to_lower) export(str_to_sentence) export(str_to_snake) export(str_to_title) export(str_to_upper) export(str_trim) export(str_trunc) export(str_unique) export(str_view) export(str_view_all) export(str_which) export(str_width) export(str_wrap) export(word) import(rlang) import(stringi) importFrom(glue,glue) importFrom(lifecycle,deprecated) importFrom(magrittr,"%>%") ================================================ FILE: NEWS.md ================================================ # stringr (development version) # stringr 1.6.0 ## Breaking changes * All relevant stringr functions now preserve names (@jonovik, #575). * `str_like(ignore_case)` is deprecated, with `str_like()` now always case-sensitive to better follow the conventions of the SQL LIKE operator (@edward-burn, #543). * In `str_replace_all()`, a `replacement` function now receives all values in a single vector. This radically improves performance at the cost of breaking some existing uses (#462). ## New features * New `vignette("locale-sensitive")` about locale-sensitive functions (@kylieainslie, #404). * New `str_ilike()` that follows the conventions of the SQL ILIKE operator (@edward-burn, #543). * New `str_to_camel()`, `str_to_snake()`, and `str_to_kebab()` for changing "programming" case (@librill, #573 + @arnaudgallou, #593). ## Minor bug fixes and improvements * `str_*` now errors if `pattern` includes any `NA`s (@nash-delcamp-slp, #546). * `str_dup()` gains a `sep` argument so you can add a separator between every repeated value (@edward-burn, #564). * `str_sub<-` now gives a more informative error if `value` is not the correct length. * `str_view()` displays a message when called with a zero-length character vector (@LouisMPenrod, #497).
* New `[[.stringr_pattern` method to match existing `[.stringr_pattern` (@edward-burn, #569). # stringr 1.5.2 * `R CMD check` fixes # stringr 1.5.1 * Some minor documentation improvements. * `str_trunc()` now correctly truncates strings when `side` is `"left"` or `"center"` (@UchidaMizuki, #512). # stringr 1.5.0 ## Breaking changes * stringr functions now consistently implement the tidyverse recycling rules (#372). There are two main changes: * Only vectors of length 1 are recycled. Previously, (e.g.) `str_detect(letters, c("x", "y"))` worked, but it now errors. * `str_c()` ignores `NULLs`, rather than treating them as length 0 vectors. Additionally, many more arguments now throw errors, rather than warnings, if supplied the wrong type of input. * `regex()` and friends now generate class names with `stringr_` prefix (#384). * `str_detect()`, `str_starts()`, `str_ends()` and `str_subset()` now error when used with either an empty string (`""`) or a `boundary()`. These operations didn't really make sense (`str_detect(x, "")` returned `TRUE` for all non-empty strings) and made it easy to make mistakes when programming. ## New features * Many tweaks to the documentation to make it more useful and consistent. * New `vignette("from-base")` by @sastoudt provides a comprehensive comparison between base R functions and their stringr equivalents. It's designed to help you move to stringr if you're already familiar with base R string functions (#266). * New `str_escape()` escapes regular expression metacharacters, providing an alternative to `fixed()` if you want to compose a pattern from user supplied strings (#408). * New `str_equal()` compares two character vectors using unicode rules, optionally ignoring case (#381). * `str_extract()` can now optionally extract a capturing group instead of the complete match (#420). 
* New `str_flatten_comma()` is a special case of `str_flatten()` designed for comma-separated flattening and can correctly apply the Oxford comma when there are only two elements (#444). * New `str_split_1()` is tailored for the special case of splitting up a single string (#409). * New `str_split_i()` extracts a single piece from a string (#278, @bfgray3). * New `str_like()` allows the use of SQL wildcards (#280, @rjpat). * New `str_rank()` to complete the set of order/rank/sort functions (#353). * New `str_sub_all()` to extract multiple substrings from each string. * New `str_unique()` is a wrapper around `stri_unique()` and returns unique string values in a character vector (#249, @seasmith). * `str_view()` uses ANSI colouring rather than an HTML widget (#370). This works in more places and requires fewer dependencies. It includes a number of other small improvements: * It no longer requires a pattern so you can use it to display strings with special characters. * It highlights unusual whitespace characters. * It's vectorised over both `string` and `pattern` (#407). * It defaults to displaying all matches, making `str_view_all()` redundant (and hence deprecated) (#455). * New `str_width()` returns the display width of a string (#380). * stringr is now licensed as MIT (#351). ## Minor improvements and bug fixes * Better error message if you supply a non-string pattern (#378). * A new data source for `sentences` has fixed many small errors. * `str_extract()` and `str_extract_all()` now work correctly when `pattern` is a `boundary()`. * `str_flatten()` gains a `last` argument that optionally overrides the final separator (#377). It gains a `na.rm` argument to remove missing values (since it's a summary function) (#439). * `str_pad()` gains a `use_width` argument to control whether to use the total code point width or the number of code points as the "width" of a string (#190).
* `str_replace()` and `str_replace_all()` can use standard tidyverse formula shorthand for the `replacement` function (#331). * `str_starts()` and `str_ends()` now correctly respect regex operator precedence (@carlganz). * `str_wrap()` breaks only at whitespace by default; set `whitespace_only = FALSE` to return to the previous behaviour (#335, @rjpat). * `word()` now returns the whole sentence when using a negative `start` parameter whose magnitude is greater than or equal to the number of words (@pdelboca, #245). # stringr 1.4.1 Hot patch release to resolve R CMD check failures. # stringr 1.4.0 * `str_interp()` now renders lists consistently, independently of the presence of additional placeholders (@amhrasmussen). * New `str_starts()` and `str_ends()` functions to detect patterns at the beginning or end of strings (@jonthegeek, #258). * `str_subset()`, `str_detect()`, and `str_which()` gain a `negate` argument, which is useful when you want the elements that do NOT match (#259, @yutannihilation). * New `str_to_sentence()` function to capitalize with sentence case (@jonthegeek, #202). # stringr 1.3.1 * `str_replace_all()` with a named vector now respects modifier functions (#207). * `str_trunc()` is once again vectorised correctly (#203, @austin3dickey). * `str_view()` handles `NA` values more gracefully (#217). I've also tweaked the sizing policy so hopefully it should work better in notebooks, while preserving the existing behaviour in knit documents (#232). # stringr 1.3.0 ## API changes * During package build, you may see `Error : object ‘ignore.case’ is not exported by 'namespace:stringr'`. This is because the long-deprecated `str_join()`, `ignore.case()` and `perl()` have now been removed. ## New features * `str_glue()` and `str_glue_data()` provide convenient wrappers around `glue()` and `glue_data()` from the [glue](https://glue.tidyverse.org/) package (#157).
* `str_flatten()` is a wrapper around `stri_flatten()` and clearly conveys flattening a character vector into a single string (#186). * New `str_remove()` and `str_remove_all()` functions. These wrap `str_replace()` and `str_replace_all()` to remove patterns from strings (@Shians, #178). * `str_squish()` removes spaces from both the left and right side of strings, and also converts multiple spaces (or space-like characters) to a single space within strings (@stephlocke, #197). * `str_sub()` gains an `omit_na` argument for ignoring `NA`. Accordingly, `str_replace()` now ignores `NA`s and keeps the original strings (@yutannihilation, #164). ## Bug fixes and minor improvements * `str_trunc()` now preserves `NA`s (@ClaytonJY, #162). * `str_trunc()` now throws an error when `width` is shorter than `ellipsis` (@ClaytonJY, #163). * Long-deprecated `str_join()`, `ignore.case()` and `perl()` have now been removed. # stringr 1.2.0 ## API changes * `str_match_all()` now returns `NA` if an optional group doesn't match (previously it returned ""). This is more consistent with `str_match()` and other match failures (#134). ## New features * In `str_replace()`, `replacement` can now be a function that is called once for each match and whose return value is used to replace the match. * New `str_which()` mimics `grep()` (#129). * A new vignette (`vignette("regular-expressions")`) describes the details of the regular expressions supported by stringr. The main vignette (`vignette("stringr")`) has been updated to give a high-level overview of the package. ## Minor improvements and bug fixes * `str_order()` and `str_sort()` gain an explicit `numeric` argument for sorting mixed numbers and strings. * `str_replace_all()` now throws an error if `replacement` is not a character vector. If `replacement` is `NA_character_` it replaces the complete string with `NA` (#124). * All functions that take a locale (e.g.
`str_to_lower()` and `str_sort()`) default to "en" (English) to ensure that the default is consistent across platforms. # stringr 1.1.0 * Add sample datasets: `fruit`, `words` and `sentences`. * `fixed()`, `regex()`, and `coll()` now throw an error if you use them with anything other than a plain string (#60). I've clarified that the replacement for `perl()` is `regex()` not `regexp()` (#61). `boundary()` has improved defaults when splitting on non-word boundaries (#58, @lmullen). * `str_detect()` can now detect boundaries (by checking for a `str_count()` > 0) (#120). `str_subset()` works similarly. * `str_extract()` and `str_extract_all()` now work with `boundary()`. This is particularly useful if you want to extract logical constructs like words or sentences. `str_extract_all()` respects the `simplify` argument when used with `fixed()` matches. * `str_subset()` now respects custom options for `fixed()` patterns (#79, @gagolews). * `str_replace()` and `str_replace_all()` now behave correctly when a replacement string contains `$`s, `\\\\1`, etc. (#83, #99). * `str_split()` gains a `simplify` argument to match `str_extract_all()` etc. * `str_view()` and `str_view_all()` create HTML widgets that display regular expression matches (#96). * `word()` returns `NA` for indexes greater than the number of words (#112). # stringr 1.0.0 * stringr is now powered by [stringi](https://github.com/gagolews/stringi) instead of base R regular expressions. This improves unicode and locale support, and makes most operations considerably faster. If you find stringr inadequate for your string processing needs, I highly recommend looking at stringi in more detail. * stringr gains a vignette, currently a straightforward update of the article that appeared in the R Journal. * `str_c()` now returns a zero-length vector if any of its inputs are zero-length vectors. This is consistent with all other functions, and standard R recycling rules. Similarly, using `str_c("x", NA)` now yields `NA`.
If you want `"xNA"`, use `str_replace_na()` on the inputs. * `str_replace_all()` gains a convenient syntax for applying multiple pairs of pattern and replacement to the same vector:

```R
input <- c("abc", "def")
str_replace_all(input, c("[ad]" = "!", "[cf]" = "?"))
```

* `str_match()` now returns `NA` if an optional group doesn't match (previously it returned ""). This is more consistent with `str_extract()` and other match failures. * New `str_subset()` keeps values that match a pattern. It's a convenient wrapper for `x[str_detect(x, pattern)]` (#21, @jiho). * New `str_order()` and `str_sort()` allow you to sort and order strings in a specified locale. * New `str_conv()` to convert strings from a specified encoding to UTF-8. * New modifier `boundary()` allows you to count, locate and split by character, word, line and sentence boundaries. * The documentation got a lot of love, and very similar functions (e.g. first and all variants) are now documented together. This should hopefully make it easier to locate the function you need. * `ignore.case(x)` has been deprecated in favour of `fixed|regex|coll(x, ignore_case = TRUE)`; `perl(x)` has been deprecated in favour of `regex(x)`. * `str_join()` is deprecated, please use `str_c()` instead. # stringr 0.6.2 * fixed path in `str_wrap` example so it works for more R installations.
* remove dependency on plyr # stringr 0.6.1 * Zero input to `str_split_fixed` returns 0 row matrix with `n` columns * Export `str_join` # stringr 0.6 * new modifier `perl` that switches to Perl regular expressions * `str_match` now uses new base function `regmatches` to extract matches - this should hopefully be faster than my previous pure R algorithm # stringr 0.5 * new `str_wrap` function which gives `strwrap` output in a more convenient format * new `word` function extracts words from a string given a user-defined separator (thanks to suggestion by David Cooper) * `str_locate` now returns consistent type when matching empty string (thanks to Stavros Macrakis) * new `str_count` counts number of matches in a string. * `str_pad` and `str_trim` receive performance tweaks - for large vectors this should give at least a two-order-of-magnitude speed up * str_length returns NA for invalid multibyte strings * fix small bug in internal `recyclable` function # stringr 0.4 * all functions now vectorised with respect to string, pattern (and where appropriate) replacement parameters * fixed() function now tells stringr functions to use fixed matching, rather than escaping the regular expression. Should improve performance for large vectors. * new ignore.case() modifier tells stringr functions to ignore case of pattern. * str_replace renamed to str_replace_all and new str_replace function added. This makes str_replace consistent with all functions. * new str_sub<- function (analogous to substring<-) for substring replacement * str_sub now understands negative positions as a position from the end of the string. -1 replaces Inf as indicator for string end.
* str_pad side argument can be left, right, or both (instead of center) * str_trim gains side argument to better match str_pad * stringr now has a namespace and imports plyr (rather than requiring it) # stringr 0.3 * fixed() now also escapes | * str_join() renamed to str_c() * all functions more carefully check input and return informative error messages if not as expected. * add invert_match() function to convert a matrix of locations of matches to locations of non-matches * add fixed() function to allow matching of fixed strings. # stringr 0.2 * str_length now returns correct results when used with factors * str_sub now correctly replaces Inf in end argument with length of string * new function str_split_fixed returns fixed number of splits in a character matrix * str_split no longer uses strsplit to preserve trailing breaks ================================================ FILE: R/c.R ================================================ #' Join multiple strings into one string #' #' @description #' `str_c()` combines multiple character vectors into a single character #' vector. It's very similar to [paste0()] but uses tidyverse recycling and #' `NA` rules. #' #' One way to understand how `str_c()` works is to picture a 2d matrix of strings, #' where each argument forms a column. `sep` is inserted between each column, #' and then each row is combined together into a single string. If `collapse` #' is set, it's inserted between each row, and then the result is again #' combined, this time into a single string. #' #' @param ... One or more character vectors. #' #' `NULL`s are removed; scalar inputs (vectors of length 1) are recycled to #' the common length of vector inputs. #' #' Like most other R functions, missing values are "infectious": whenever #' a missing value is combined with another string the result will always #' be missing. Use [dplyr::coalesce()] or [str_replace_na()] to convert to #' the desired value. #' @param sep String to insert between input vectors.
#' @param collapse Optional string used to combine output into a single #' string. Generally better to use [str_flatten()] if you need this #' behaviour. #' @return If `collapse = NULL` (the default) a character vector with #' length equal to the longest input. If `collapse` is a string, a character #' vector of length 1. #' @export #' @examples #' str_c("Letter: ", letters) #' str_c("Letter", letters, sep = ": ") #' str_c(letters, " is for", "...") #' str_c(letters[-26], " comes before ", letters[-1]) #' #' str_c(letters, collapse = "") #' str_c(letters, collapse = ", ") #' #' # Differences from paste() ---------------------- #' # Missing inputs give missing outputs #' str_c(c("a", NA, "b"), "-d") #' paste0(c("a", NA, "b"), "-d") #' # Use str_replace_na() to display literal NAs: #' str_c(str_replace_na(c("a", NA, "b")), "-d") #' #' # Uses tidyverse recycling rules #' \dontrun{str_c(1:2, 1:3)} # errors #' paste0(1:2, 1:3) #' #' str_c("x", character()) #' paste0("x", character()) str_c <- function(..., sep = "", collapse = NULL) { check_string(sep) check_string(collapse, allow_null = TRUE) dots <- list(...) dots <- dots[!map_lgl(dots, is.null)] vctrs::vec_size_common(!!!dots) inject(stri_c(!!!dots, sep = sep, collapse = collapse)) } ================================================ FILE: R/case.R ================================================ #' Convert string to upper case, lower case, title case, or sentence case #' #' * `str_to_upper()` converts to upper case. #' * `str_to_lower()` converts to lower case. #' * `str_to_title()` converts to title case, where only the first letter of #' each word is capitalized. #' * `str_to_sentence()` converts to sentence case, where only the first letter #' of the sentence is capitalized. #' #' @inheritParams str_detect #' @inheritParams coll #' @return A character vector the same length as `string`.
#' @examples #' dog <- "The quick brown dog" #' str_to_upper(dog) #' str_to_lower(dog) #' str_to_title(dog) #' str_to_sentence("the quick brown dog") #' #' # Locale matters! #' str_to_upper("i") # English #' str_to_upper("i", "tr") # Turkish #' @name case NULL #' @export #' @rdname case str_to_upper <- function(string, locale = "en") { check_string(locale) copy_names(string, stri_trans_toupper(string, locale = locale)) } #' @export #' @rdname case str_to_lower <- function(string, locale = "en") { check_string(locale) copy_names(string, stri_trans_tolower(string, locale = locale)) } #' @export #' @rdname case str_to_title <- function(string, locale = "en") { check_string(locale) out <- stri_trans_totitle( string, opts_brkiter = stri_opts_brkiter(locale = locale) ) copy_names(string, out) } #' @export #' @rdname case str_to_sentence <- function(string, locale = "en") { check_string(locale) out <- stri_trans_totitle( string, opts_brkiter = stri_opts_brkiter(type = "sentence", locale = locale) ) copy_names(string, out) } #' Convert between different types of programming case #' #' @description #' * `str_to_camel()` converts to camel case, where the first letter of #' each word is capitalized, with no separation between words. By default #' the first letter of the first word is not capitalized. #' #' * `str_to_kebab()` converts to kebab case, where words are converted to #' lower case and separated by dashes (`-`). #' #' * `str_to_snake()` converts to snake case, where words are converted to #' lower case and separated by underscores (`_`). #' @inheritParams str_to_lower #' @export #' @param first_upper Logical. Should the first letter be capitalized? 
#' @examples #' str_to_camel("my-variable") #' str_to_camel("my-variable", first_upper = TRUE) #' #' str_to_snake("MyVariable") #' str_to_kebab("MyVariable") str_to_camel <- function(string, first_upper = FALSE) { check_character(string) check_bool(first_upper) string <- string |> to_words() |> str_to_title() |> str_remove_all(pattern = fixed(" ")) if (!first_upper) { str_sub(string, 1, 1) <- str_to_lower(str_sub(string, 1, 1)) } string } #' @export #' @rdname str_to_camel str_to_snake <- function(string) { check_character(string) to_separated_case(string, sep = "_") } #' @export #' @rdname str_to_camel str_to_kebab <- function(string) { check_character(string) to_separated_case(string, sep = "-") } to_separated_case <- function(string, sep) { out <- to_words(string) str_replace_all(out, fixed(" "), sep) } to_words <- function(string) { breakpoints <- paste( # non-word characters "[^\\p{L}\\p{N}]+", # lowercase followed by uppercase "(?<=\\p{Ll})(?=\\p{Lu})", # letter followed by number "(?<=\\p{L})(?=\\p{N})", # number followed by letter "(?<=\\p{N})(?=\\p{L})", # uppercase followed by uppercase then lowercase (i.e. end of acronym) "(?<=\\p{Lu})(?=\\p{Lu}\\p{Ll})", sep = "|" ) out <- str_replace_all(string, breakpoints, " ") out <- str_to_lower(out) str_trim(out) } ================================================ FILE: R/compat-obj-type.R ================================================ # nocov start --- r-lib/rlang compat-obj-type # # Changelog # ========= # # 2022-10-04: # - `obj_type_friendly(value = TRUE)` now shows numeric scalars # literally. # - `stop_friendly_type()` now takes `show_value`, passed to # `obj_type_friendly()` as the `value` argument. # # 2022-10-03: # - Added `allow_na` and `allow_null` arguments. # - `NULL` is now backticked. # - Better friendly type for infinities and `NaN`. # # 2022-09-16: # - Unprefixed usage of rlang functions with `rlang::` to # avoid onLoad issues when called from rlang (#1482).
# # 2022-08-11: # - Prefixed usage of rlang functions with `rlang::`. # # 2022-06-22: # - `friendly_type_of()` is now `obj_type_friendly()`. # - Added `obj_type_oo()`. # # 2021-12-20: # - Added support for scalar values and empty vectors. # - Added `stop_input_type()` # # 2021-06-30: # - Added support for missing arguments. # # 2021-04-19: # - Added support for matrices and arrays (#141). # - Added documentation. # - Added changelog. #' Return English-friendly type #' @param x Any R object. #' @param value Whether to describe the value of `x`. Special values #' like `NA` or `""` are always described. #' @param length Whether to mention the length of vectors and lists. #' @return A string describing the type. Starts with an indefinite #' article, e.g. "an integer vector". #' @noRd obj_type_friendly <- function(x, value = TRUE) { if (is_missing(x)) { return("absent") } if (is.object(x)) { if (inherits(x, "quosure")) { type <- "quosure" } else { type <- paste(class(x), collapse = "/") } return(sprintf("a <%s> object", type)) } if (!is_vector(x)) { return(.rlang_as_friendly_type(typeof(x))) } n_dim <- length(dim(x)) if (!n_dim) { if (!is_list(x) && length(x) == 1) { if (is_na(x)) { return(switch( typeof(x), logical = "`NA`", integer = "an integer `NA`", double = if (is.nan(x)) { "`NaN`" } else { "a numeric `NA`" }, complex = "a complex `NA`", character = "a character `NA`", .rlang_stop_unexpected_typeof(x) )) } show_infinites <- function(x) { if (x > 0) { "`Inf`" } else { "`-Inf`" } } str_encode <- function(x, width = 30, ...) { if (nchar(x) > width) { x <- substr(x, 1, width - 3) x <- paste0(x, "...") } encodeString(x, ...) 
} if (value) { if (is.numeric(x) && is.infinite(x)) { return(show_infinites(x)) } if (is.numeric(x) || is.complex(x)) { number <- as.character(round(x, 2)) what <- if (is.complex(x)) "the complex number" else "the number" return(paste(what, number)) } return(switch( typeof(x), logical = if (x) "`TRUE`" else "`FALSE`", character = { what <- if (nzchar(x)) "the string" else "the empty string" paste(what, str_encode(x, quote = "\"")) }, raw = paste("the raw value", as.character(x)), .rlang_stop_unexpected_typeof(x) )) } return(switch( typeof(x), logical = "a logical value", integer = "an integer", double = if (is.infinite(x)) show_infinites(x) else "a number", complex = "a complex number", character = if (nzchar(x)) "a string" else "\"\"", raw = "a raw value", .rlang_stop_unexpected_typeof(x) )) } if (length(x) == 0) { return(switch( typeof(x), logical = "an empty logical vector", integer = "an empty integer vector", double = "an empty numeric vector", complex = "an empty complex vector", character = "an empty character vector", raw = "an empty raw vector", list = "an empty list", .rlang_stop_unexpected_typeof(x) )) } } vec_type_friendly(x) } vec_type_friendly <- function(x, length = FALSE) { if (!is_vector(x)) { abort("`x` must be a vector.") } type <- typeof(x) n_dim <- length(dim(x)) add_length <- function(type) { if (length && !n_dim) { paste0(type, sprintf(" of length %s", length(x))) } else { type } } if (type == "list") { if (n_dim < 2) { return(add_length("a list")) } else if (is.data.frame(x)) { return("a data frame") } else if (n_dim == 2) { return("a list matrix") } else { return("a list array") } } type <- switch( type, logical = "a logical %s", integer = "an integer %s", numeric = , double = "a double %s", complex = "a complex %s", character = "a character %s", raw = "a raw %s", type = paste0("a ", type, " %s") ) if (n_dim < 2) { kind <- "vector" } else if (n_dim == 2) { kind <- "matrix" } else { kind <- "array" } out <- sprintf(type, kind) if (n_dim >= 
2) { out } else { add_length(out) } } .rlang_as_friendly_type <- function(type) { switch( type, list = "a list", NULL = "`NULL`", environment = "an environment", externalptr = "a pointer", weakref = "a weak reference", S4 = "an S4 object", name = , symbol = "a symbol", language = "a call", pairlist = "a pairlist node", expression = "an expression vector", char = "an internal string", promise = "an internal promise", ... = "an internal dots object", any = "an internal `any` object", bytecode = "an internal bytecode object", primitive = , builtin = , special = "a primitive function", closure = "a function", type ) } .rlang_stop_unexpected_typeof <- function(x, call = caller_env()) { abort( sprintf("Unexpected type <%s>.", typeof(x)), call = call ) } #' Return OO type #' @param x Any R object. #' @return One of `"bare"` (for non-OO objects), `"S3"`, `"S4"`, #' `"R6"`, or `"R7"`. #' @noRd obj_type_oo <- function(x) { if (!is.object(x)) { return("bare") } class <- inherits(x, c("R6", "R7_object"), which = TRUE) if (class[[1]]) { "R6" } else if (class[[2]]) { "R7" } else if (isS4(x)) { "S4" } else { "S3" } } #' @param x The object type which does not conform to `what`. Its #' `obj_type_friendly()` is taken and mentioned in the error message. #' @param what The friendly expected type as a string. Can be a #' character vector of expected types, in which case the error #' message mentions all of them in an "or" enumeration. #' @param show_value Passed to `value` argument of `obj_type_friendly()`. #' @param ... Arguments passed to [abort()]. 
#' @inheritParams args_error_context #' @noRd stop_input_type <- function( x, what, ..., allow_na = FALSE, allow_null = FALSE, show_value = TRUE, arg = caller_arg(x), call = caller_env() ) { # From compat-cli.R cli <- env_get_list( nms = c("format_arg", "format_code"), last = topenv(), default = function(x) sprintf("`%s`", x), inherit = TRUE ) if (allow_na) { what <- c(what, cli$format_code("NA")) } if (allow_null) { what <- c(what, cli$format_code("NULL")) } if (length(what)) { what <- oxford_comma(what) } message <- sprintf( "%s must be %s, not %s.", cli$format_arg(arg), what, obj_type_friendly(x, value = show_value) ) abort(message, ..., call = call, arg = arg) } oxford_comma <- function(chr, sep = ", ", final = "or") { n <- length(chr) if (n < 2) { return(chr) } head <- chr[seq_len(n - 1)] last <- chr[n] head <- paste(head, collapse = sep) # Write a or b. But a, b, or c. if (n > 2) { paste0(head, sep, final, " ", last) } else { paste0(head, " ", final, " ", last) } } # nocov end ================================================ FILE: R/compat-purrr.R ================================================ # nocov start - compat-purrr (last updated: rlang 0.3.2.9000) # This file serves as a reference for compatibility functions for # purrr. They are not drop-in replacements but allow a similar style # of programming. This is useful in cases where purrr is too heavy a # package to depend on. Please find the most recent version in rlang's # repository. map <- function(.x, .f, ...) { lapply(.x, .f, ...) } map_mold <- function(.x, .f, .mold, ...) { out <- vapply(.x, .f, .mold, ..., USE.NAMES = FALSE) names(out) <- names(.x) out } map_lgl <- function(.x, .f, ...) { map_mold(.x, .f, logical(1), ...) } map_int <- function(.x, .f, ...) { map_mold(.x, .f, integer(1), ...) } map_dbl <- function(.x, .f, ...) { map_mold(.x, .f, double(1), ...) } map_chr <- function(.x, .f, ...) { map_mold(.x, .f, character(1), ...) } map_cpl <- function(.x, .f, ...) 
{ map_mold(.x, .f, complex(1), ...) } walk <- function(.x, .f, ...) { map(.x, .f, ...) invisible(.x) } pluck <- function(.x, .f) { map(.x, `[[`, .f) } pluck_lgl <- function(.x, .f) { map_lgl(.x, `[[`, .f) } pluck_int <- function(.x, .f) { map_int(.x, `[[`, .f) } pluck_dbl <- function(.x, .f) { map_dbl(.x, `[[`, .f) } pluck_chr <- function(.x, .f) { map_chr(.x, `[[`, .f) } pluck_cpl <- function(.x, .f) { map_cpl(.x, `[[`, .f) } map2 <- function(.x, .y, .f, ...) { out <- mapply(.f, .x, .y, MoreArgs = list(...), SIMPLIFY = FALSE) if (length(out) == length(.x)) { set_names(out, names(.x)) } else { set_names(out, NULL) } } map2_lgl <- function(.x, .y, .f, ...) { as.vector(map2(.x, .y, .f, ...), "logical") } map2_int <- function(.x, .y, .f, ...) { as.vector(map2(.x, .y, .f, ...), "integer") } map2_dbl <- function(.x, .y, .f, ...) { as.vector(map2(.x, .y, .f, ...), "double") } map2_chr <- function(.x, .y, .f, ...) { as.vector(map2(.x, .y, .f, ...), "character") } map2_cpl <- function(.x, .y, .f, ...) { as.vector(map2(.x, .y, .f, ...), "complex") } args_recycle <- function(args) { lengths <- map_int(args, length) n <- max(lengths) stopifnot(all(lengths == 1L | lengths == n)) to_recycle <- lengths == 1L args[to_recycle] <- map(args[to_recycle], function(x) rep.int(x, n)) args } pmap <- function(.l, .f, ...) { args <- args_recycle(.l) do.call( "mapply", c( FUN = list(quote(.f)), args, MoreArgs = quote(list(...)), SIMPLIFY = FALSE, USE.NAMES = FALSE ) ) } probe <- function(.x, .p, ...) { if (is_logical(.p)) { stopifnot(length(.p) == length(.x)) .p } else { map_lgl(.x, .p, ...) } } keep <- function(.x, .f, ...) { .x[probe(.x, .f, ...)] } discard <- function(.x, .p, ...) { sel <- probe(.x, .p, ...) .x[is.na(sel) | !sel] } map_if <- function(.x, .p, .f, ...) { matches <- probe(.x, .p) .x[matches] <- map(.x[matches], .f, ...) 
.x } compact <- function(.x) { Filter(length, .x) } transpose <- function(.l) { inner_names <- names(.l[[1]]) if (is.null(inner_names)) { fields <- seq_along(.l[[1]]) } else { fields <- set_names(inner_names) } map(fields, function(i) { map(.l, .subset2, i) }) } every <- function(.x, .p, ...) { for (i in seq_along(.x)) { if (!rlang::is_true(.p(.x[[i]], ...))) return(FALSE) } TRUE } some <- function(.x, .p, ...) { for (i in seq_along(.x)) { if (rlang::is_true(.p(.x[[i]], ...))) return(TRUE) } FALSE } negate <- function(.p) { function(...) !.p(...) } reduce <- function(.x, .f, ..., .init) { f <- function(x, y) .f(x, y, ...) Reduce(f, .x, init = .init) } reduce_right <- function(.x, .f, ..., .init) { f <- function(x, y) .f(y, x, ...) Reduce(f, .x, init = .init, right = TRUE) } accumulate <- function(.x, .f, ..., .init) { f <- function(x, y) .f(x, y, ...) Reduce(f, .x, init = .init, accumulate = TRUE) } accumulate_right <- function(.x, .f, ..., .init) { f <- function(x, y) .f(y, x, ...) Reduce(f, .x, init = .init, right = TRUE, accumulate = TRUE) } detect <- function(.x, .f, ..., .right = FALSE, .p = is_true) { for (i in index(.x, .right)) { if (.p(.f(.x[[i]], ...))) { return(.x[[i]]) } } NULL } detect_index <- function(.x, .f, ..., .right = FALSE, .p = is_true) { for (i in index(.x, .right)) { if (.p(.f(.x[[i]], ...))) { return(i) } } 0L } index <- function(x, right = FALSE) { idx <- seq_along(x) if (right) { idx <- rev(idx) } idx } imap <- function(.x, .f, ...) { map2(.x, vec_index(.x), .f, ...) } vec_index <- function(x) { names(x) %||% seq_along(x) } # nocov end ================================================ FILE: R/compat-types-check.R ================================================ # nocov start --- r-lib/rlang compat-types-check # # Dependencies # ============ # # - compat-obj-type.R # # Changelog # ========= # # 2022-10-04: # - Added `check_name()` that forbids the empty string. # `check_string()` allows the empty string by default. 
# # 2022-09-28: # - Removed `what` arguments. # - Added `allow_na` and `allow_null` arguments. # - Added `allow_decimal` and `allow_infinite` arguments. # - Improved errors with absent arguments. # # # 2022-09-16: # - Unprefixed usage of rlang functions with `rlang::` to # avoid onLoad issues when called from rlang (#1482). # # 2022-08-11: # - Added changelog. # Scalars ----------------------------------------------------------------- check_bool <- function( x, ..., allow_na = FALSE, allow_null = FALSE, arg = caller_arg(x), call = caller_env() ) { if (!missing(x)) { if (is_bool(x)) { return(invisible(NULL)) } if (allow_null && is_null(x)) { return(invisible(NULL)) } if (allow_na && identical(x, NA)) { return(invisible(NULL)) } } stop_input_type( x, c("`TRUE`", "`FALSE`"), ..., allow_na = allow_na, allow_null = allow_null, arg = arg, call = call ) } check_string <- function( x, ..., allow_empty = TRUE, allow_na = FALSE, allow_null = FALSE, arg = caller_arg(x), call = caller_env() ) { if (!missing(x)) { is_string <- .rlang_check_is_string( x, allow_empty = allow_empty, allow_na = allow_na, allow_null = allow_null ) if (is_string) { return(invisible(NULL)) } } stop_input_type( x, "a single string", ..., allow_na = allow_na, allow_null = allow_null, arg = arg, call = call ) } .rlang_check_is_string <- function(x, allow_empty, allow_na, allow_null) { if (is_string(x)) { if (allow_empty || !is_string(x, "")) { return(TRUE) } } if (allow_null && is_null(x)) { return(TRUE) } if (allow_na && (identical(x, NA) || identical(x, na_chr))) { return(TRUE) } FALSE } check_name <- function( x, ..., allow_null = FALSE, arg = caller_arg(x), call = caller_env() ) { if (!missing(x)) { is_string <- .rlang_check_is_string( x, allow_empty = FALSE, allow_na = FALSE, allow_null = allow_null ) if (is_string) { return(invisible(NULL)) } } stop_input_type( x, "a valid name", ..., allow_na = FALSE, allow_null = allow_null, arg = arg, call = call ) } check_number_decimal <- function( x, ..., min 
= -Inf, max = Inf, allow_infinite = TRUE, allow_na = FALSE, allow_null = FALSE, arg = caller_arg(x), call = caller_env() ) { .rlang_types_check_number( x, ..., min = min, max = max, allow_decimal = TRUE, allow_infinite = allow_infinite, allow_na = allow_na, allow_null = allow_null, arg = arg, call = call ) } check_number_whole <- function( x, ..., min = -Inf, max = Inf, allow_na = FALSE, allow_null = FALSE, arg = caller_arg(x), call = caller_env() ) { .rlang_types_check_number( x, ..., min = min, max = max, allow_decimal = FALSE, allow_infinite = FALSE, allow_na = allow_na, allow_null = allow_null, arg = arg, call = call ) } .rlang_types_check_number <- function( x, ..., min = -Inf, max = Inf, allow_decimal = FALSE, allow_infinite = FALSE, allow_na = FALSE, allow_null = FALSE, arg = caller_arg(x), call = caller_env() ) { if (allow_decimal) { what <- "a number" } else { what <- "a whole number" } .stop <- function(x, what, ...) { stop_input_type( x, what, ..., allow_na = allow_na, allow_null = allow_null, arg = arg, call = call ) } if (!missing(x)) { is_number <- is_number( x, allow_decimal = allow_decimal, allow_infinite = allow_infinite ) if (is_number) { if (min > -Inf && max < Inf) { what <- sprintf("a number between %s and %s", min, max) } else { what <- NULL } if (x < min) { what <- what %||% sprintf("a number larger than %s", min) .stop(x, what, ...) } if (x > max) { what <- what %||% sprintf("a number smaller than %s", max) .stop(x, what, ...) } return(invisible(NULL)) } if (allow_null && is_null(x)) { return(invisible(NULL)) } if ( allow_na && (identical(x, NA) || identical(x, na_dbl) || identical(x, na_int)) ) { return(invisible(NULL)) } } .stop(x, what, ...) 
} is_number <- function(x, allow_decimal = FALSE, allow_infinite = FALSE) { if (!typeof(x) %in% c("integer", "double")) { return(FALSE) } if (length(x) != 1) { return(FALSE) } if (is.na(x)) { return(FALSE) } if (!allow_decimal && !is_integerish(x)) { return(FALSE) } if (!allow_infinite && is.infinite(x)) { return(FALSE) } TRUE } check_symbol <- function( x, ..., allow_null = FALSE, arg = caller_arg(x), call = caller_env() ) { if (!missing(x)) { if (is_symbol(x)) { return(invisible(NULL)) } if (allow_null && is_null(x)) { return(invisible(NULL)) } } stop_input_type( x, "a symbol", ..., allow_null = allow_null, arg = arg, call = call ) } check_arg <- function( x, ..., allow_null = FALSE, arg = caller_arg(x), call = caller_env() ) { if (!missing(x)) { if (is_symbol(x)) { return(invisible(NULL)) } if (allow_null && is_null(x)) { return(invisible(NULL)) } } stop_input_type( x, "an argument name", ..., allow_null = allow_null, arg = arg, call = call ) } check_call <- function( x, ..., allow_null = FALSE, arg = caller_arg(x), call = caller_env() ) { if (!missing(x)) { if (is_call(x)) { return(invisible(NULL)) } if (allow_null && is_null(x)) { return(invisible(NULL)) } } stop_input_type( x, "a defused call", ..., allow_null = allow_null, arg = arg, call = call ) } check_environment <- function( x, ..., allow_null = FALSE, arg = caller_arg(x), call = caller_env() ) { if (!missing(x)) { if (is_environment(x)) { return(invisible(NULL)) } if (allow_null && is_null(x)) { return(invisible(NULL)) } } stop_input_type( x, "an environment", ..., allow_null = allow_null, arg = arg, call = call ) } check_function <- function( x, ..., allow_null = FALSE, arg = caller_arg(x), call = caller_env() ) { if (!missing(x)) { if (is_function(x)) { return(invisible(NULL)) } if (allow_null && is_null(x)) { return(invisible(NULL)) } } stop_input_type( x, "a function", ..., allow_null = allow_null, arg = arg, call = call ) } check_closure <- function( x, ..., allow_null = FALSE, arg = 
caller_arg(x), call = caller_env() ) { if (!missing(x)) { if (is_closure(x)) { return(invisible(NULL)) } if (allow_null && is_null(x)) { return(invisible(NULL)) } } stop_input_type( x, "an R function", ..., allow_null = allow_null, arg = arg, call = call ) } check_formula <- function( x, ..., allow_null = FALSE, arg = caller_arg(x), call = caller_env() ) { if (!missing(x)) { if (is_formula(x)) { return(invisible(NULL)) } if (allow_null && is_null(x)) { return(invisible(NULL)) } } stop_input_type( x, "a formula", ..., allow_null = allow_null, arg = arg, call = call ) } # Vectors ----------------------------------------------------------------- check_character <- function( x, ..., allow_null = FALSE, arg = caller_arg(x), call = caller_env() ) { if (!missing(x)) { if (is_character(x)) { return(invisible(NULL)) } if (allow_null && is_null(x)) { return(invisible(NULL)) } } stop_input_type( x, "a character vector", ..., allow_null = allow_null, arg = arg, call = call ) } # nocov end ================================================ FILE: R/conv.R ================================================ #' Specify the encoding of a string #' #' This is a convenient way to override the current encoding of a string. #' #' @inheritParams str_detect #' @param encoding Name of encoding. See [stringi::stri_enc_list()] #' for a complete list. #' @export #' @examples #' # Example from encoding?stringi::stringi #' x <- rawToChar(as.raw(177)) #' x #' str_conv(x, "ISO-8859-2") # Polish "a with ogonek" #' str_conv(x, "ISO-8859-1") # Plus-minus str_conv <- function(string, encoding) { check_string(encoding) copy_names(string, stri_conv(string, encoding, "UTF-8")) } ================================================ FILE: R/count.R ================================================ #' Count number of matches #' #' Counts the number of times `pattern` is found within each element #' of `string`. #' #' @inheritParams str_detect #' @param pattern Pattern to look for.
#' #' The default interpretation is a regular expression, as described in #' `vignette("regular-expressions")`. Use [regex()] for finer control of the #' matching behaviour. #' #' Match a fixed string (i.e. by comparing only bytes), using #' [fixed()]. This is fast, but approximate. Generally, #' for matching human text, you'll want [coll()] which #' respects character matching rules for the specified locale. #' #' Match character, word, line and sentence boundaries with #' [boundary()]. The empty string, `""`, is equivalent to #' `boundary("character")`. #' @return An integer vector the same length as `string`/`pattern`. #' @seealso [stringi::stri_count()] which this function wraps. #' #' [str_locate()]/[str_locate_all()] to locate position #' of matches #' #' @export #' @examples #' fruit <- c("apple", "banana", "pear", "pineapple") #' str_count(fruit, "a") #' str_count(fruit, "p") #' str_count(fruit, "e") #' str_count(fruit, c("a", "b", "p", "p")) #' #' str_count(c("a.", "...", ".a.a"), ".") #' str_count(c("a.", "...", ".a.a"), fixed(".")) str_count <- function(string, pattern = "") { check_lengths(string, pattern) out <- switch( type(pattern), empty = , bound = stri_count_boundaries(string, opts_brkiter = opts(pattern)), fixed = stri_count_fixed(string, pattern, opts_fixed = opts(pattern)), coll = stri_count_coll(string, pattern, opts_collator = opts(pattern)), regex = stri_count_regex(string, pattern, opts_regex = opts(pattern)) ) preserve_names_if_possible(string, pattern, out) } ================================================ FILE: R/data.R ================================================ #' Sample character vectors for practicing string manipulations #' #' `fruit` and `words` come from the `rcorpora` package #' written by Gabor Csardi; the data was collected by Darius Kazemi #' and made available at \url{https://github.com/dariusk/corpora}. #' `sentences` is a collection of "Harvard sentences" used for #' standardised testing of voice.
#' #' @format Character vectors. #' @name stringr-data #' @examples #' length(sentences) #' sentences[1:5] #' #' length(fruit) #' fruit[1:5] #' #' length(words) #' words[1:5] NULL #' @rdname stringr-data #' @format NULL "sentences" #' @rdname stringr-data #' @format NULL "fruit" #' @rdname stringr-data #' @format NULL "words" ================================================ FILE: R/detect.R ================================================ #' Detect the presence/absence of a match #' #' `str_detect()` returns a logical vector with `TRUE` for each element of #' `string` that matches `pattern` and `FALSE` otherwise. It's equivalent to #' `grepl(pattern, string)`. #' #' @param string Input vector. Either a character vector, or something #' coercible to one. #' @param pattern Pattern to look for. #' #' The default interpretation is a regular expression, as described in #' `vignette("regular-expressions")`. Use [regex()] for finer control of the #' matching behaviour. #' #' Match a fixed string (i.e. by comparing only bytes), using #' [fixed()]. This is fast, but approximate. Generally, #' for matching human text, you'll want [coll()] which #' respects character matching rules for the specified locale. #' #' You cannot match boundaries, including `""`, with this function. #' #' @param negate If `TRUE`, inverts the resulting boolean vector. #' @return A logical vector the same length as `string`/`pattern`.
#' @seealso [stringi::stri_detect()] which this function wraps, #' [str_subset()] for a convenient wrapper around #' `x[str_detect(x, pattern)]` #' @export #' @examples #' fruit <- c("apple", "banana", "pear", "pineapple") #' str_detect(fruit, "a") #' str_detect(fruit, "^a") #' str_detect(fruit, "a$") #' str_detect(fruit, "b") #' str_detect(fruit, "[aeiou]") #' #' # Also vectorised over pattern #' str_detect("aecfg", letters) #' #' # Returns TRUE if the pattern does NOT match #' str_detect(fruit, "^p", negate = TRUE) str_detect <- function(string, pattern, negate = FALSE) { check_lengths(string, pattern) check_bool(negate) out <- switch( type(pattern), empty = no_empty(), bound = no_boundary(), fixed = stri_detect_fixed( string, pattern, negate = negate, opts_fixed = opts(pattern) ), coll = stri_detect_coll( string, pattern, negate = negate, opts_collator = opts(pattern) ), regex = stri_detect_regex( string, pattern, negate = negate, opts_regex = opts(pattern) ) ) preserve_names_if_possible(string, pattern, out) } #' Detect the presence/absence of a match at the start/end #' #' `str_starts()` and `str_ends()` are special cases of [str_detect()] that #' only match at the beginning or end of a string, respectively. #' #' @inheritParams str_detect #' @param pattern Pattern with which the string starts or ends. #' #' The default interpretation is a regular expression, as described in #' [stringi::about_search_regex]. Control options with [regex()]. #' #' Match a fixed string (i.e. by comparing only bytes), using [fixed()]. This #' is fast, but approximate. Generally, for matching human text, you'll want #' [coll()] which respects character matching rules for the specified locale. #' #' @return A logical vector.
#' @export #' @examples #' fruit <- c("apple", "banana", "pear", "pineapple") #' str_starts(fruit, "p") #' str_starts(fruit, "p", negate = TRUE) #' str_ends(fruit, "e") #' str_ends(fruit, "e", negate = TRUE) str_starts <- function(string, pattern, negate = FALSE) { check_lengths(string, pattern) check_bool(negate) out <- switch( type(pattern), empty = no_empty(), bound = no_boundary(), fixed = stri_startswith_fixed( string, pattern, negate = negate, opts_fixed = opts(pattern) ), coll = stri_startswith_coll( string, pattern, negate = negate, opts_collator = opts(pattern) ), regex = { pattern2 <- paste0("^(", pattern, ")") stri_detect_regex( string, pattern2, negate = negate, opts_regex = opts(pattern) ) } ) preserve_names_if_possible(string, pattern, out) } #' @rdname str_starts #' @export str_ends <- function(string, pattern, negate = FALSE) { check_lengths(string, pattern) check_bool(negate) out <- switch( type(pattern), empty = no_empty(), bound = no_boundary(), fixed = stri_endswith_fixed( string, pattern, negate = negate, opts_fixed = opts(pattern) ), coll = stri_endswith_coll( string, pattern, negate = negate, opts_collator = opts(pattern) ), regex = { pattern2 <- paste0("(", pattern, ")$") stri_detect_regex( string, pattern2, negate = negate, opts_regex = opts(pattern) ) } ) preserve_names_if_possible(string, pattern, out) } #' Detect a pattern in the same way as `SQL`'s `LIKE` and `ILIKE` operators #' #' @description #' `str_like()` and `str_ilike()` follow the conventions of the SQL `LIKE` #' and `ILIKE` operators, namely: #' #' * Must match the entire string. #' * `_` matches a single character (like `.`). #' * `%` matches any number of characters (like `.*`). #' * `\%` and `\_` match literal `%` and `_`. #' #' The difference between the two functions is their case-sensitivity: #' `str_like()` is case sensitive and `str_ilike()` is not. #' #' @note #' Prior to stringr 1.6.0, `str_like()` was incorrectly case-insensitive.
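The LIKE-to-regex translation described above (the extracted body of the internal `like_to_regex()` helper is incomplete in this extract) can be sketched as follows. `like_to_regex_sketch()` is a hypothetical stand-in built on base R's `gsub()`, not the package's internal helper:

```r
# Hypothetical sketch (not the package internal) of the SQL LIKE ->
# regex translation: `%` becomes `.*`, `_` becomes `.`, and the result
# is anchored with ^...$ so the pattern must match the entire string.
# Backslash-escaped `\%`/`\_` pass through and still match literally.
like_to_regex_sketch <- function(pattern) {
  out <- gsub("(?<!\\\\)%", ".*", pattern, perl = TRUE)  # % -> .*
  out <- gsub("(?<!\\\\)_", ".", out, perl = TRUE)       # _ -> .
  paste0("^", out, "$")                                  # whole-string match
}

like_to_regex_sketch("app%")    # "^app.*$"
like_to_regex_sketch("ba_ana")  # "^ba.ana$"
grepl(like_to_regex_sketch("%apple"), c("apple", "pineapple"))  # TRUE TRUE
```

The negative lookbehind `(?<!\\)` keeps escaped wildcards intact, which is why the resulting regex still matches a literal `%` or `_` for `\%`/`\_`.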
#' #' @inheritParams str_detect #' @param pattern A character vector containing a SQL "like" pattern. #' See above for details. #' @param ignore_case `r lifecycle::badge("deprecated")` #' @return A logical vector the same length as `string`. #' @export #' @examples #' fruit <- c("apple", "banana", "pear", "pineapple") #' str_like(fruit, "app") #' str_like(fruit, "app%") #' str_like(fruit, "APP%") #' str_like(fruit, "ba_ana") #' str_like(fruit, "%apple") #' #' str_ilike(fruit, "app") #' str_ilike(fruit, "app%") #' str_ilike(fruit, "APP%") #' str_ilike(fruit, "ba_ana") #' str_ilike(fruit, "%apple") str_like <- function(string, pattern, ignore_case = deprecated()) { check_lengths(string, pattern) check_character(pattern) if (inherits(pattern, "stringr_pattern")) { cli::cli_abort( "{.arg pattern} must be a plain string, not a stringr modifier." ) } if (lifecycle::is_present(ignore_case)) { lifecycle::deprecate_warn( when = "1.6.0", what = "str_like(ignore_case)", details = c( "`str_like()` is always case sensitive.", "Use `str_ilike()` for case insensitive string matching." ) ) check_bool(ignore_case) if (ignore_case) { return(str_ilike(string, pattern)) } } pattern <- regex(like_to_regex(pattern), ignore_case = FALSE) out <- stri_detect_regex(string, pattern, opts_regex = opts(pattern)) preserve_names_if_possible(string, pattern, out) } #' @export #' @rdname str_like str_ilike <- function(string, pattern) { check_lengths(string, pattern) check_character(pattern) if (inherits(pattern, "stringr_pattern")) { cli::cli_abort(tr_( "{.arg pattern} must be a plain string, not a stringr modifier." 
)) } pattern <- regex(like_to_regex(pattern), ignore_case = TRUE) out <- stri_detect_regex(string, pattern, opts_regex = opts(pattern)) preserve_names_if_possible(string, pattern, out) } like_to_regex <- function(pattern) { converted <- stri_replace_all_regex( pattern, "(?= 2) { string <- c( string[seq2(1, n - 2)], stringi::stri_c(string[[n - 1]], last, string[[n]]) ) } stri_flatten(string, collapse = collapse) } #' @export #' @rdname str_flatten str_flatten_comma <- function(string, last = NULL, na.rm = FALSE) { check_string(last, allow_null = TRUE) check_bool(na.rm) # Remove comma if exactly two elements, and last uses Oxford comma if (length(string) == 2 && !is.null(last) && str_detect(last, "^,")) { last <- str_replace(last, "^,", "") } str_flatten(string, ", ", last = last, na.rm = na.rm) } ================================================ FILE: R/glue.R ================================================ #' Interpolation with glue #' #' @description #' These functions are wrappers around [glue::glue()] and [glue::glue_data()], #' which provide a powerful and elegant syntax for interpolating strings #' with `{}`. #' #' These wrappers provide a small set of the full options. Use `glue()` and #' `glue_data()` directly from glue for more control. #' #' @inheritParams glue::glue #' @return A character vector with same length as the longest input. #' @export #' @examples #' name <- "Fred" #' age <- 50 #' anniversary <- as.Date("1991-10-12") #' str_glue( #' "My name is {name}, ", #' "my age next year is {age + 1}, ", #' "and my anniversary is {format(anniversary, '%A, %B %d, %Y')}." 
#' ) #' #' # single braces can be inserted by doubling them #' str_glue("My name is {name}, not {{name}}.") #' #' # You can also use named arguments #' str_glue( #' "My name is {name}, ", #' "and my age next year is {age + 1}.", #' name = "Joe", #' age = 40 #' ) #' #' # `str_glue_data()` is useful in data pipelines #' mtcars %>% str_glue_data("{rownames(.)} has {hp} hp") str_glue <- function(..., .sep = "", .envir = parent.frame(), .trim = TRUE) { glue::glue(..., .sep = .sep, .envir = .envir, .trim = .trim) } #' @export #' @rdname str_glue str_glue_data <- function( .x, ..., .sep = "", .envir = parent.frame(), .na = "NA" ) { glue::glue_data( .x, ..., .sep = .sep, .envir = .envir, .na = .na ) } ================================================ FILE: R/interp.R ================================================ #' String interpolation #' #' @description #' `r lifecycle::badge("superseded")` #' #' `str_interp()` is superseded in favour of [str_glue()]. #' #' String interpolation is a useful way of specifying a character string which #' depends on values in a certain environment. It allows for string creation #' which is easier to read and write when compared to using e.g. #' [paste()] or [sprintf()]. The (template) string can #' include expression placeholders of the form `${expression}` or #' `$[format]{expression}`, where expressions are valid R expressions that #' can be evaluated in the given environment, and `format` is a format #' specification valid for use with [sprintf()]. #' #' @param string A template character string. This function is not vectorised: #' a character vector will be collapsed into a single string. #' @param env The environment in which to evaluate the expressions. #' @seealso [str_glue()] and [str_glue_data()] for alternative approaches to #' the same problem. #' @keywords internal #' @return An interpolated character string.
#' @author Stefan Milton Bache #' @export #' @examples #' #' # Using values from the environment, and some formats #' user_name <- "smbache" #' amount <- 6.656 #' account <- 1337 #' str_interp("User ${user_name} (account $[08d]{account}) has $$[.2f]{amount}.") #' #' # Nested brace pairs work inside expressions too, and any braces can be #' # placed outside the expressions. #' str_interp("Works with } nested { braces too: $[.2f]{{{2 + 2}*{amount}}}") #' #' # Values can also come from a list #' str_interp( #' "One value, ${value1}, and then another, ${value2*2}.", #' list(value1 = 10, value2 = 20) #' ) #' #' # Or a data frame #' str_interp( #' "Values are $[.2f]{max(Sepal.Width)} and $[.2f]{min(Sepal.Width)}.", #' iris #' ) #' #' # Use a vector when the string is long: #' max_char <- 80 #' str_interp(c( #' "This particular line is so long that it is hard to write ", #' "without breaking the ${max_char}-char barrier!" #' )) str_interp <- function(string, env = parent.frame()) { check_character(string) string <- str_c(string, collapse = "") # Find expression placeholders matches <- interp_placeholders(string) # Determine if any placeholders were found. if (matches$indices[1] <= 0) { string } else { # Evaluate them to get the replacement strings. replacements <- eval_interp_matches(matches$matches, env) # Replace the expressions by their values and return. `regmatches<-`(string, list(matches$indices), FALSE, list(replacements)) } } #' Match String Interpolation Placeholders #' #' Given a character string a set of expression placeholders are matched. They #' are of the form \code{${...}} or optionally \code{$[f]{...}} where `f` #' is a valid format for [sprintf()]. #' #' @param string character: The string to be interpolated. #' #' @return list containing `indices` (regex match data) and `matches`, #' the string representations of matched expressions. 
#' #' @noRd #' @author Stefan Milton Bache interp_placeholders <- function(string, error_call = caller_env()) { # Find starting position of ${} or $[]{} placeholders. starts <- gregexpr("\\$(\\[.*?\\])?\\{", string)[[1]] # Return immediately if no matches are found. if (starts[1] <= 0) { return(list(indices = starts)) } # Break up the string in parts parts <- substr( rep(string, length(starts)), start = starts, stop = c(starts[-1L] - 1L, nchar(string)) ) # If there are nested placeholders, each part will not contain a full # placeholder in which case we report invalid string interpolation template. if (any(!grepl("\\$(\\[.*?\\])?\\{.+\\}", parts))) { cli::cli_abort( tr_("Invalid template string for interpolation."), call = error_call ) } # For each part, find the opening and closing braces. opens <- lapply(strsplit(parts, ""), function(v) which(v == "{")) closes <- lapply(strsplit(parts, ""), function(v) which(v == "}")) # Identify the positions within the parts of the matching closing braces. # These are the lengths of the placeholder matches. lengths <- mapply(match_brace, opens, closes) # Update the `starts` match data with the placeholder lengths. attr(starts, "match.length") <- lengths # Return both the indices (regex match data) and the actual placeholder # matches (as strings). list( indices = starts, matches = mapply(substr, starts, starts + lengths - 1, x = string) ) } #' Evaluate String Interpolation Matches #' #' The expression part of string interpolation matches are evaluated in a #' specified environment and formatted for replacement in the original string. #' Used internally by [str_interp()]. #' #' @param matches Match data #' #' @param env The environment in which to evaluate the expressions. #' #' @return A character vector of replacement strings.
#' #' @noRd #' @author Stefan Milton Bache eval_interp_matches <- function(matches, env, error_call = caller_env()) { # Extract expressions from the matches expressions <- extract_expressions(matches, error_call = error_call) # Evaluate them in the given environment values <- lapply( expressions, eval, envir = env, enclos = if (is.environment(env)) env else environment(env) ) # Find the formats to be used formats <- extract_formats(matches) # Format the values and return. mapply(sprintf, formats, values, SIMPLIFY = FALSE) } #' Extract Expression Objects from String Interpolation Matches #' #' An interpolation match object will contain both its wrapping \code{${ }} part #' and possibly a format. This extracts the expression parts and parses them to #' prepare them for evaluation. #' #' @param matches Match data #' #' @return list of R expressions #' #' @noRd #' @author Stefan Milton Bache extract_expressions <- function(matches, error_call = caller_env()) { # Parse function for text argument as first argument. parse_text <- function(text) { withCallingHandlers( parse(text = text), error = function(e) { cli::cli_abort( tr_("Failed to parse input {.str {text}}"), parent = e, call = error_call ) } ) } # string representation of the expressions (without the possible formats). strings <- gsub("\\$(\\[.+?\\])?\\{", "", matches) # Remove the trailing closing brace and parse. lapply(substr(strings, 1L, nchar(strings) - 1), parse_text) } #' Extract String Interpolation Formats from Matched Placeholders #' #' An expression placeholder for string interpolation may optionally contain a #' format valid for [sprintf()]. This function will extract such a format, #' defaulting to "s", the format for strings. #' #' @param matches Match data #' #' @return A character vector of format specifiers. #' #' @noRd #' @author Stefan Milton Bache extract_formats <- function(matches) { # Extract the optional format parts.
formats <- gsub("\\$(\\[(.+?)\\])?.*", "\\2", matches) # Use the string format "s" as default when not specified. paste0("%", ifelse(formats == "", "s", formats)) } #' Utility Function for Matching a Closing Brace #' #' Given positions of opening and closing braces `match_brace` identifies #' the closing brace matching the first opening brace. #' #' @param opening integer: Vector with positions of opening braces. #' #' @param closing integer: Vector with positions of closing braces. #' #' @return Integer with the position of the matching brace. #' #' @noRd #' @author Stefan Milton Bache match_brace <- function(opening, closing) { # maximum index for the matching closing brace max_close <- max(closing) # "path" for mapping opening and closing braces path <- numeric(max_close) # Set openings to 1, and closings to -1 path[opening[opening < max_close]] <- 1 path[closing] <- -1 # Cumulate the path ... cumpath <- cumsum(path) # ... and the first 0 after the first opening identifies the match. min(which(1:max_close > min(which(cumpath == 1)) & cumpath == 0)) } ================================================ FILE: R/length.R ================================================ #' Compute the length/width #' #' @description #' `str_length()` returns the number of codepoints in a string. These are #' the individual elements (which are often, but not always, letters) that #' can be extracted with [str_sub()]. #' #' `str_width()` returns how much space the string will occupy when printed #' in a fixed width font (i.e. when printed in the console). #' #' @inheritParams str_detect #' @return A numeric vector the same length as `string`. #' @seealso [stringi::stri_length()] which this function wraps.
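The cumulative-sum scan in `match_brace()` above can be exercised standalone. This is a re-derivation of the same idea for illustration (it does not call the internal helper):

```r
# Re-derivation of the match_brace() idea: walk the string, adding +1
# at each "{" and -1 at each "}". The first position where the running
# sum returns to 0 at a "}" closes the first opening brace, even with
# nesting in between.
chars <- strsplit("{a{b}c}d", "")[[1]]
path <- ifelse(chars == "{", 1L, ifelse(chars == "}", -1L, 0L))
cumpath <- cumsum(path)
min(which(path == -1L & cumpath == 0L))  # 7: the "}" matching the first "{"
```

For `"{a{b}c}d"` the running sums are 1, 1, 2, 2, 1, 1, 0, 0, so the scan correctly skips the inner `{b}` pair and reports position 7.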
#' @export #' @examples #' str_length(letters) #' str_length(NA) #' str_length(factor("abc")) #' str_length(c("i", "like", "programming", NA)) #' #' # Some characters, like emoji and Chinese characters (hanzi), are square #' # which means they take up the width of two Latin characters #' x <- c("\u6c49\u5b57", "\U0001f60a") #' str_view(x) #' str_width(x) #' str_length(x) #' #' # There are two ways of representing a u with an umlaut #' u <- c("\u00fc", "u\u0308") #' # They have the same width #' str_width(u) #' # But a different length #' str_length(u) #' # Because the second element is made up of a u + an accent #' str_sub(u, 1, 1) str_length <- function(string) { copy_names(string, stri_length(string)) } #' @export #' @rdname str_length str_width <- function(string) { copy_names(string, stri_width(string)) } ================================================ FILE: R/locate.R ================================================ #' Find location of match #' #' @description #' `str_locate()` returns the `start` and `end` position of the first match; #' `str_locate_all()` returns the `start` and `end` position of each match. #' #' Because the `start` and `end` values are inclusive, zero-length matches #' (e.g. `$`, `^`, `\\b`) will have an `end` that is smaller than `start`. #' #' @inheritParams str_count #' @returns #' * `str_locate()` returns an integer matrix with two columns and #' one row for each element of `string`. The first column, `start`, #' gives the position at the start of the match, and the second column, `end`, #' gives the position of the end. #' #' * `str_locate_all()` returns a list of integer matrices with the same #' length as `string`/`pattern`. The matrices have columns `start` and `end` #' as above, and one row for each match. #' @seealso #' [str_extract()] for a convenient way of extracting matches, #' [stringi::stri_locate()] for the underlying implementation.
#' @export #' @examples #' fruit <- c("apple", "banana", "pear", "pineapple") #' str_locate(fruit, "$") #' str_locate(fruit, "a") #' str_locate(fruit, "e") #' str_locate(fruit, c("a", "b", "p", "p")) #' #' str_locate_all(fruit, "a") #' str_locate_all(fruit, "e") #' str_locate_all(fruit, c("a", "b", "p", "p")) #' #' # Find location of every character #' str_locate_all(fruit, "") str_locate <- function(string, pattern) { check_lengths(string, pattern) out <- switch( type(pattern), empty = , bound = stri_locate_first_boundaries(string, opts_brkiter = opts(pattern)), fixed = stri_locate_first_fixed( string, pattern, opts_fixed = opts(pattern) ), coll = stri_locate_first_coll( string, pattern, opts_collator = opts(pattern) ), regex = stri_locate_first_regex(string, pattern, opts_regex = opts(pattern)) ) preserve_names_if_possible(string, pattern, out) } #' @rdname str_locate #' @export str_locate_all <- function(string, pattern) { check_lengths(string, pattern) opts <- opts(pattern) out <- switch( type(pattern), empty = , bound = stri_locate_all_boundaries( string, omit_no_match = TRUE, opts_brkiter = opts ), fixed = stri_locate_all_fixed( string, pattern, omit_no_match = TRUE, opts_fixed = opts ), regex = stri_locate_all_regex( string, pattern, omit_no_match = TRUE, opts_regex = opts ), coll = stri_locate_all_coll( string, pattern, omit_no_match = TRUE, opts_collator = opts ) ) preserve_names_if_possible(string, pattern, out) } #' Switch location of matches to location of non-matches #' #' Invert a matrix of match locations to match the opposite of what was #' previously matched. 
#' #' @param loc matrix of match locations, as from [str_locate_all()] #' @return A numeric matrix giving locations of non-matches #' @export #' @examples #' numbers <- "1 and 2 and 4 and 456" #' num_loc <- str_locate_all(numbers, "[0-9]+")[[1]] #' str_sub(numbers, num_loc[, "start"], num_loc[, "end"]) #' #' text_loc <- invert_match(num_loc) #' str_sub(numbers, text_loc[, "start"], text_loc[, "end"]) invert_match <- function(loc) { cbind( start = c(0L, loc[, "end"] + 1L), end = c(loc[, "start"] - 1L, -1L) ) } ================================================ FILE: R/match.R ================================================ #' Extract components (capturing groups) from a match #' #' @description #' Extract any number of matches defined by unnamed, `(pattern)`, and #' named, `(?<name>pattern)` capture groups. #' #' Use a non-capturing group, `(?:pattern)`, if you need to override default #' operator precedence but don't want to capture the result. #' #' @inheritParams str_detect #' @param pattern Unlike other stringr functions, `str_match()` only supports #' regular expressions, as described in `vignette("regular-expressions")`. #' The pattern should contain at least one capturing group. #' @return #' * `str_match()`: a character matrix with the same number of rows as the #' length of `string`/`pattern`. The first column is the complete match, #' followed by one column for each capture group. The columns will be named #' if you use named capture groups, i.e. `(?<name>pattern)`. #' #' * `str_match_all()`: a list of the same length as `string`/`pattern` #' containing character matrices. Each matrix has columns as described above #' and one row for each match. #' #' @seealso [str_extract()] to extract the complete match, #' [stringi::stri_match()] for the underlying implementation.
#' @export #' @examples #' strings <- c(" 219 733 8965", "329-293-8753 ", "banana", "595 794 7569", #' "387 287 6718", "apple", "233.398.9187 ", "482 952 3315", #' "239 923 8115 and 842 566 4692", "Work: 579-499-7527", "$1000", #' "Home: 543.355.3679") #' phone <- "([2-9][0-9]{2})[- .]([0-9]{3})[- .]([0-9]{4})" #' #' str_extract(strings, phone) #' str_match(strings, phone) #' #' # Extract/match all #' str_extract_all(strings, phone) #' str_match_all(strings, phone) #' #' # You can also name the groups to make further manipulation easier #' phone <- "(?<area>[2-9][0-9]{2})[- .](?<phone>[0-9]{3}[- .][0-9]{4})" #' str_match(strings, phone) #' #' x <- c("<a> <b>", "<a> <>", "<a>", "", NA) #' str_match(x, "<(.*?)> <(.*?)>") #' str_match_all(x, "<(.*?)>") #' #' str_extract(x, "<.*?>") #' str_extract_all(x, "<.*?>") str_match <- function(string, pattern) { check_lengths(string, pattern) if (type(pattern) != "regex") { cli::cli_abort(tr_("{.arg pattern} must be a regular expression.")) } out <- stri_match_first_regex(string, pattern, opts_regex = opts(pattern)) preserve_names_if_possible(string, pattern, out) } #' @rdname str_match #' @export str_match_all <- function(string, pattern) { check_lengths(string, pattern) if (type(pattern) != "regex") { cli::cli_abort(tr_("{.arg pattern} must be a regular expression.")) } out <- stri_match_all_regex( string, pattern, omit_no_match = TRUE, opts_regex = opts(pattern) ) preserve_names_if_possible(string, pattern, out) } ================================================ FILE: R/modifiers.R ================================================ #' Control matching behaviour with modifier functions #' #' @description #' Modifier functions control the meaning of the `pattern` argument to #' stringr functions: #' #' * `boundary()`: Match boundaries between things. #' * `coll()`: Compare strings using standard Unicode collation rules. #' * `fixed()`: Compare literal bytes. #' * `regex()` (the default): Uses ICU regular expressions.
#' #' @param pattern Pattern to modify behaviour. #' @param ignore_case Should case differences be ignored in the match? #' For `fixed()`, this uses a simple algorithm which assumes a #' one-to-one mapping between upper and lower case letters. #' @return A stringr modifier object, i.e. a character vector with #' parent S3 class `stringr_pattern`. #' @name modifiers #' @examples #' pattern <- "a.b" #' strings <- c("abb", "a.b") #' str_detect(strings, pattern) #' str_detect(strings, fixed(pattern)) #' str_detect(strings, coll(pattern)) #' #' # coll() is useful for locale-aware case-insensitive matching #' i <- c("I", "\u0130", "i") #' i #' str_detect(i, fixed("i", TRUE)) #' str_detect(i, coll("i", TRUE)) #' str_detect(i, coll("i", TRUE, locale = "tr")) #' #' # Word boundaries #' words <- c("These are some words.") #' str_count(words, boundary("word")) #' str_split(words, " ")[[1]] #' str_split(words, boundary("word"))[[1]] #' #' # Regular expression variations #' str_extract_all("The Cat in the Hat", "[a-z]+") #' str_extract_all("The Cat in the Hat", regex("[a-z]+", TRUE)) #' #' str_extract_all("a\nb\nc", "^.") #' str_extract_all("a\nb\nc", regex("^.", multiline = TRUE)) #' #' str_extract_all("a\nb\nc", "a.") #' str_extract_all("a\nb\nc", regex("a.", dotall = TRUE)) NULL #' @export #' @rdname modifiers fixed <- function(pattern, ignore_case = FALSE) { pattern <- as_bare_character(pattern) check_bool(ignore_case) options <- stri_opts_fixed(case_insensitive = ignore_case) structure( pattern, options = options, class = c("stringr_fixed", "stringr_pattern", "character") ) } #' @export #' @rdname modifiers #' @param locale Locale to use for comparisons. See #' [stringi::stri_locale_list()] for all possible options. #' Defaults to "en" (English) to ensure that default behaviour is #' consistent across platforms. #' @param ... 
Other less frequently used arguments passed on to #' [stringi::stri_opts_collator()], #' [stringi::stri_opts_regex()], or #' [stringi::stri_opts_brkiter()] coll <- function(pattern, ignore_case = FALSE, locale = "en", ...) { pattern <- as_bare_character(pattern) check_bool(ignore_case) check_string(locale) options <- str_opts_collator( ignore_case = ignore_case, locale = locale, ... ) structure( pattern, options = options, class = c("stringr_coll", "stringr_pattern", "character") ) } str_opts_collator <- function( locale = "en", ignore_case = FALSE, strength = NULL, ... ) { strength <- strength %||% if (ignore_case) 2L else 3L stri_opts_collator( strength = strength, locale = locale, ... ) } # used for testing turkish_I <- function() { coll("I", ignore_case = TRUE, locale = "tr") } #' @export #' @rdname modifiers #' @param multiline If `TRUE`, `$` and `^` match #' the beginning and end of each line. If `FALSE`, the #' default, only match the start and end of the input. #' @param comments If `TRUE`, white space and comments beginning with #' `#` are ignored. Escape literal spaces with `\\ `. #' @param dotall If `TRUE`, `.` will also match line terminators. regex <- function( pattern, ignore_case = FALSE, multiline = FALSE, comments = FALSE, dotall = FALSE, ... ) { pattern <- as_bare_character(pattern) check_bool(ignore_case) check_bool(multiline) check_bool(comments) check_bool(dotall) options <- stri_opts_regex( case_insensitive = ignore_case, multiline = multiline, comments = comments, dotall = dotall, ... ) structure( pattern, options = options, class = c("stringr_regex", "stringr_pattern", "character") ) } #' @param type Boundary type to detect. 
#' \describe{ #' \item{`character`}{Every character is a boundary.} #' \item{`line_break`}{Boundaries are places where it is acceptable to have #' a line break in the current locale.} #' \item{`sentence`}{The beginnings and ends of sentences are boundaries, #' using intelligent rules to avoid counting abbreviations #' ([details](https://www.unicode.org/reports/tr29/#Sentence_Boundaries)).} #' \item{`word`}{The beginnings and ends of words are boundaries.} #' } #' @param skip_word_none Ignore "words" that don't contain any characters #' or numbers - i.e. punctuation. Default `NA` will skip such "words" #' only when splitting on `word` boundaries. #' @export #' @rdname modifiers boundary <- function( type = c("character", "line_break", "sentence", "word"), skip_word_none = NA, ... ) { type <- arg_match(type) check_bool(skip_word_none, allow_na = TRUE) if (identical(skip_word_none, NA)) { skip_word_none <- type == "word" } options <- stri_opts_brkiter( type = type, skip_word_none = skip_word_none, ... 
) structure( NA_character_, options = options, class = c("stringr_boundary", "stringr_pattern", "character") ) } opts <- function(x) { if (identical(x, "")) { stri_opts_brkiter(type = "character") } else { attr(x, "options") } } type <- function(x, error_call = caller_env()) { UseMethod("type") } #' @export type.stringr_boundary <- function(x, error_call = caller_env()) { "bound" } #' @export type.stringr_regex <- function(x, error_call = caller_env()) { "regex" } #' @export type.stringr_coll <- function(x, error_call = caller_env()) { "coll" } #' @export type.stringr_fixed <- function(x, error_call = caller_env()) { "fixed" } #' @export type.character <- function(x, error_call = caller_env()) { if (any(is.na(x))) { cli::cli_abort( tr_("{.arg pattern} can not contain NAs."), call = error_call ) } if (identical(x, "")) "empty" else "regex" } #' @export type.default <- function(x, error_call = caller_env()) { if (inherits(x, "regex")) { # Fallback for rex return("regex") } cli::cli_abort( tr_( "{.arg pattern} must be a character vector, not {.obj_type_friendly {x}}." ), call = error_call ) } #' @export `[.stringr_pattern` <- function(x, i) { structure( NextMethod(), options = attr(x, "options"), class = class(x) ) } #' @export `[[.stringr_pattern` <- function(x, i) { structure( NextMethod(), options = attr(x, "options"), class = class(x) ) } as_bare_character <- function(x, call = caller_env()) { if (is.character(x) && !is.object(x)) { # All OK! return(x) } warn("Coercing `pattern` to a plain character vector.", call = call) as.character(x) } ================================================ FILE: R/pad.R ================================================ #' Pad a string to minimum width #' #' Pad a string to a fixed width, so that #' `str_length(str_pad(x, n))` is always greater than or equal to `n`. #' #' @inheritParams str_detect #' @param width Minimum width of padded strings. #' @param side Side on which padding character is added (left, right or both). 
#' @param pad Single padding character (default is a space). #' @param use_width If `FALSE`, use the length of the string instead of the #' width; see [str_width()]/[str_length()] for the difference. #' @return A character vector the same length as `string`/`width`/`pad`. #' @seealso [str_trim()] to remove whitespace; #' [str_trunc()] to decrease the maximum width of a string. #' @export #' @examples #' rbind( #' str_pad("hadley", 30, "left"), #' str_pad("hadley", 30, "right"), #' str_pad("hadley", 30, "both") #' ) #' #' # All arguments are vectorised except side #' str_pad(c("a", "abc", "abcdef"), 10) #' str_pad("a", c(5, 10, 20)) #' str_pad("a", 10, pad = c("-", "_", " ")) #' #' # Longer strings are returned unchanged #' str_pad("hadley", 3) str_pad <- function( string, width, side = c("left", "right", "both"), pad = " ", use_width = TRUE ) { vctrs::vec_size_common(string = string, width = width, pad = pad) side <- arg_match(side) check_bool(use_width) out <- switch( side, left = stri_pad_left(string, width, pad = pad, use_length = !use_width), right = stri_pad_right(string, width, pad = pad, use_length = !use_width), both = stri_pad_both(string, width, pad = pad, use_length = !use_width) ) # Preserve names unless `string` is recycled if (length(out) == length(string)) copy_names(string, out) else out } ================================================ FILE: R/remove.R ================================================ #' Remove matched patterns #' #' Remove matches, i.e. replace them with `""`. #' #' @inheritParams str_detect #' @return A character vector the same length as `string`/`pattern`. #' @seealso [str_replace()] for the underlying implementation.
#' @export #' @examples #' fruits <- c("one apple", "two pears", "three bananas") #' str_remove(fruits, "[aeiou]") #' str_remove_all(fruits, "[aeiou]") str_remove <- function(string, pattern) { str_replace(string, pattern, "") } #' @export #' @rdname str_remove str_remove_all <- function(string, pattern) { str_replace_all(string, pattern, "") } ================================================ FILE: R/replace.R ================================================ #' Replace matches with new text #' #' `str_replace()` replaces the first match; `str_replace_all()` replaces #' all matches. #' #' @inheritParams str_detect #' @param pattern Pattern to look for. #' #' The default interpretation is a regular expression, as described #' in [stringi::about_search_regex]. Control options with #' [regex()]. #' #' For `str_replace_all()` this can also be a named vector #' (`c(pattern1 = replacement1)`), in order to perform multiple replacements #' in each element of `string`. #' #' Match a fixed string (i.e. by comparing only bytes), using #' [fixed()]. This is fast, but approximate. Generally, #' for matching human text, you'll want [coll()] which #' respects character matching rules for the specified locale. #' #' You cannot match boundaries, including `""`, with this function. #' @param replacement The replacement value, usually a single string, #' but it can be a vector the same length as `string` or `pattern`. #' References of the form `\1`, `\2`, etc. will be replaced with #' the contents of the respective matched group (created by `()`). #' #' Alternatively, supply a function (or formula): it will be passed a single #' character vector and should return a character vector of the same length. #' #' To replace the complete string with `NA`, use #' `replacement = NA_character_`. #' @return A character vector the same length as #' `string`/`pattern`/`replacement`.
#' @seealso [str_replace_na()] to turn missing values into "NA"; #' [stringi::stri_replace()] for the underlying implementation. #' @export #' @examples #' fruits <- c("one apple", "two pears", "three bananas") #' str_replace(fruits, "[aeiou]", "-") #' str_replace_all(fruits, "[aeiou]", "-") #' str_replace_all(fruits, "[aeiou]", toupper) #' str_replace_all(fruits, "b", NA_character_) #' #' str_replace(fruits, "([aeiou])", "") #' str_replace(fruits, "([aeiou])", "\\1\\1") #' #' # Note that str_replace() is vectorised along text, pattern, and replacement #' str_replace(fruits, "[aeiou]", c("1", "2", "3")) #' str_replace(fruits, c("a", "e", "i"), "-") #' #' # If you want to apply multiple patterns and replacements to the same #' # string, pass a named vector to pattern. #' fruits %>% #' str_c(collapse = "---") %>% #' str_replace_all(c("one" = "1", "two" = "2", "three" = "3")) #' #' # Use a function for more sophisticated replacement. This example #' # replaces colour names with their hex values. 
#' colours <- str_c("\\b", colors(), "\\b", collapse="|") #' col2hex <- function(col) { #' rgb <- col2rgb(col) #' rgb(rgb["red", ], rgb["green", ], rgb["blue", ], maxColorValue = 255) #' } #' #' x <- c( #' "Roses are red, violets are blue", #' "My favourite colour is green" #' ) #' str_replace_all(x, colours, col2hex) str_replace <- function(string, pattern, replacement) { if (!missing(replacement) && is_replacement_fun(replacement)) { replacement <- as_function(replacement) return(str_transform(string, pattern, replacement)) } check_lengths(string, pattern, replacement) out <- switch( type(pattern), empty = no_empty(), bound = no_boundary(), fixed = stri_replace_first_fixed( string, pattern, replacement, opts_fixed = opts(pattern) ), coll = stri_replace_first_coll( string, pattern, replacement, opts_collator = opts(pattern) ), regex = stri_replace_first_regex( string, pattern, fix_replacement(replacement), opts_regex = opts(pattern) ) ) preserve_names_if_possible(string, pattern, out) } #' @export #' @rdname str_replace str_replace_all <- function(string, pattern, replacement) { if (!missing(replacement) && is_replacement_fun(replacement)) { replacement <- as_function(replacement) return(str_transform_all(string, pattern, replacement)) } if (!is.null(names(pattern))) { vec <- FALSE replacement <- unname(pattern) pattern[] <- names(pattern) } else { check_lengths(string, pattern, replacement) vec <- TRUE } out <- switch( type(pattern), empty = no_empty(), bound = no_boundary(), fixed = stri_replace_all_fixed( string, pattern, replacement, vectorize_all = vec, opts_fixed = opts(pattern) ), coll = stri_replace_all_coll( string, pattern, replacement, vectorize_all = vec, opts_collator = opts(pattern) ), regex = stri_replace_all_regex( string, pattern, fix_replacement(replacement), vectorize_all = vec, opts_regex = opts(pattern) ) ) preserve_names_if_possible(string, pattern, out) } is_replacement_fun <- function(x) { is.function(x) || is_formula(x) } fix_replacement 
<- function(x, error_call = caller_env()) { check_character(x, arg = "replacement", call = error_call) vapply(x, fix_replacement_one, character(1), USE.NAMES = FALSE) } fix_replacement_one <- function(x) { if (is.na(x)) { return(x) } chars <- str_split(x, "")[[1]] out <- character(length(chars)) escaped <- logical(length(chars)) in_escape <- FALSE for (i in seq_along(chars)) { escaped[[i]] <- in_escape char <- chars[[i]] if (in_escape) { # Escape character not printed previously so must include here if (char == "$") { out[[i]] <- "\\\\$" } else if (char >= "0" && char <= "9") { out[[i]] <- paste0("$", char) } else { out[[i]] <- paste0("\\", char) } in_escape <- FALSE } else { if (char == "$") { out[[i]] <- "\\$" } else if (char == "\\") { in_escape <- TRUE } else { out[[i]] <- char } } } # tibble::tibble(chars, out, escaped) paste0(out, collapse = "") } #' Turn NA into "NA" #' #' @inheritParams str_replace #' @param replacement A single string. #' @export #' @examples #' str_replace_na(c(NA, "abc", "def")) str_replace_na <- function(string, replacement = "NA") { check_string(replacement) copy_names(string, stri_replace_na(string, replacement)) } str_transform <- function(string, pattern, replacement) { loc <- str_locate(string, pattern) new <- replacement(str_sub(string, loc)) str_sub(string, loc, omit_na = TRUE) <- new string } str_transform_all <- function( string, pattern, replacement, error_call = caller_env() ) { locs <- str_locate_all(string, pattern) old <- str_sub_all(string, locs) # unchop list into a vector, apply replacement(), and then rechop back into # a list old_flat <- vctrs::list_unchop(old) if (length(old_flat) == 0) { # minor optimisation to avoid problems with the many replacement # functions that use paste new_flat <- character() } else { withCallingHandlers( new_flat <- replacement(old_flat), error = function(cnd) { cli::cli_abort( c( tr_("Failed to apply {.arg replacement} function."), i = tr_("It must accept a character vector of any 
length.") ), parent = cnd, call = error_call ) } ) } if (!is.character(new_flat)) { cli::cli_abort( tr_( "{.arg replacement} function must return a character vector, not {.obj_type_friendly {new_flat}}." ), call = error_call ) } if (length(new_flat) != length(old_flat)) { cli::cli_abort( tr_( "{.arg replacement} function must return a vector the same length as the input ({length(old_flat)}), not length {length(new_flat)}." ), call = error_call ) } idx <- chop_index(old) new <- vctrs::vec_chop(new_flat, idx) stringi::stri_sub_all(string, locs) <- new string } chop_index <- function(x) { ls <- lengths(x) start <- cumsum(c(1L, ls[-length(ls)])) end <- start + ls - 1L lapply(seq_along(ls), function(i) seq2(start[[i]], end[[i]])) } ================================================ FILE: R/sort.R ================================================ #' Order, rank, or sort a character vector #' #' * `str_sort()` returns the sorted vector. #' * `str_order()` returns an integer vector that returns the desired #' order when used for subsetting, i.e. `x[str_order(x)]` is the same #' as `str_sort()` #' * `str_rank()` returns the ranks of the values, i.e. #' `arrange(df, str_rank(x))` is the same as `str_sort(df$x)`. #' #' @param x A character vector to sort. #' @param decreasing A boolean. If `FALSE`, the default, sorts from #' lowest to highest; if `TRUE` sorts from highest to lowest. #' @param na_last Where should `NA` go? `TRUE` at the end, #' `FALSE` at the beginning, `NA` dropped. #' @param numeric If `TRUE`, will sort digits numerically, instead #' of as strings. #' @param ... Other options used to control collation. Passed on to #' [stringi::stri_opts_collator()]. #' @inheritParams coll #' @return A character vector the same length as `string`. #' @seealso [stringi::stri_order()] for the underlying implementation. 
#' @export #' @examples #' x <- c("apple", "car", "happy", "char") #' str_sort(x) #' #' str_order(x) #' x[str_order(x)] #' #' str_rank(x) #' #' # In Czech, ch is a digraph that sorts after h #' str_sort(x, locale = "cs") #' #' # Use numeric = TRUE to sort numbers in strings #' x <- c("100a10", "100a5", "2b", "2a") #' str_sort(x) #' str_sort(x, numeric = TRUE) str_order <- function( x, decreasing = FALSE, na_last = TRUE, locale = "en", numeric = FALSE, ... ) { check_bool(decreasing) check_bool(na_last, allow_na = TRUE) check_string(locale) check_bool(numeric) opts <- stri_opts_collator(locale, numeric = numeric, ...) stri_order( x, decreasing = decreasing, na_last = na_last, opts_collator = opts ) } #' @export #' @rdname str_order str_rank <- function(x, locale = "en", numeric = FALSE, ...) { check_string(locale) check_bool(numeric) opts <- stri_opts_collator(locale, numeric = numeric, ...) stri_rank(x, opts_collator = opts) } #' @export #' @rdname str_order str_sort <- function( x, decreasing = FALSE, na_last = TRUE, locale = "en", numeric = FALSE, ... ) { check_bool(decreasing) check_bool(na_last, allow_na = TRUE) check_string(locale) check_bool(numeric) opts <- stri_opts_collator(locale, numeric = numeric, ...) idx <- stri_order( x, decreasing = decreasing, na_last = na_last, opts_collator = opts ) x[idx] } ================================================ FILE: R/split.R ================================================ #' Split up a string into pieces #' #' @description #' This family of functions provides various ways of splitting a string up #' into pieces. These two functions return a character vector: #' #' * `str_split_1()` takes a single string and splits it into pieces, #' returning a single character vector. #' * `str_split_i()` splits each string in a character vector into pieces and #' extracts the `i`th value, returning a character vector. 
#' #' These two functions return a more complex object: #' #' * `str_split()` splits each string in a character vector into a varying #' number of pieces, returning a list of character vectors. #' * `str_split_fixed()` splits each string in a character vector into a #' fixed number of pieces, returning a character matrix. #' #' @inheritParams str_extract #' @param n Maximum number of pieces to return. Default (Inf) uses all #' possible split positions. #' #' For `str_split()`, this determines the maximum length of each element #' of the output. For `str_split_fixed()`, this determines the number of #' columns in the output; if an input is too short, the result will be padded #' with `""`. #' @return #' * `str_split_1()`: a character vector. #' * `str_split()`: a list the same length as `string`/`pattern` containing #' character vectors. #' * `str_split_fixed()`: a character matrix with `n` columns and the same #' number of rows as the length of `string`/`pattern`. #' * `str_split_i()`: a character vector the same length as `string`/`pattern`. #' @seealso [stringi::stri_split()] for the underlying implementation. 
#' @export #' @examples #' fruits <- c( #' "apples and oranges and pears and bananas", #' "pineapples and mangos and guavas" #' ) #' #' str_split(fruits, " and ") #' str_split(fruits, " and ", simplify = TRUE) #' #' # If you want to split a single string, use `str_split_1` #' str_split_1(fruits[[1]], " and ") #' #' # Specify n to restrict the number of possible matches #' str_split(fruits, " and ", n = 3) #' str_split(fruits, " and ", n = 2) #' # If n greater than number of pieces, no padding occurs #' str_split(fruits, " and ", n = 5) #' #' # Use fixed to return a character matrix #' str_split_fixed(fruits, " and ", 3) #' str_split_fixed(fruits, " and ", 4) #' #' # str_split_i extracts only a single piece from a string #' str_split_i(fruits, " and ", 1) #' str_split_i(fruits, " and ", 4) #' # use a negative number to select from the end #' str_split_i(fruits, " and ", -1) str_split <- function(string, pattern, n = Inf, simplify = FALSE) { check_lengths(string, pattern) check_positive_integer(n) check_bool(simplify, allow_na = TRUE) if (identical(n, Inf)) { n <- -1L } out <- switch( type(pattern), empty = stri_split_boundaries( string, n = n, simplify = simplify, opts_brkiter = opts(pattern) ), bound = stri_split_boundaries( string, n = n, simplify = simplify, opts_brkiter = opts(pattern) ), fixed = stri_split_fixed( string, pattern, n = n, simplify = simplify, opts_fixed = opts(pattern) ), regex = stri_split_regex( string, pattern, n = n, simplify = simplify, opts_regex = opts(pattern) ), coll = stri_split_coll( string, pattern, n = n, simplify = simplify, opts_collator = opts(pattern) ) ) preserve_names_if_possible(string, pattern, out) } #' @export #' @rdname str_split str_split_1 <- function(string, pattern) { check_string(string) str_split(string, pattern)[[1]] } #' @export #' @rdname str_split str_split_fixed <- function(string, pattern, n) { check_lengths(string, pattern) check_positive_integer(n) str_split(string, pattern, n = n, simplify = TRUE) } #' 
@export #' @rdname str_split #' @param i Element to return. Use a negative value to count from the #' right hand side. str_split_i <- function(string, pattern, i) { check_number_whole(i) if (i > 0) { out <- str_split(string, pattern, simplify = NA, n = i + 1) col <- out[, i] if (keep_names(string, pattern)) copy_names(string, col) else col } else if (i < 0) { i <- abs(i) pieces <- str_split(string, pattern) last <- function(x) { n <- length(x) if (i > n) { NA_character_ } else { x[[n + 1 - i]] } } out <- map_chr(pieces, last) preserve_names_if_possible(string, pattern, out) } else { cli::cli_abort(tr_("{.arg i} must not be 0.")) } } check_positive_integer <- function( x, arg = caller_arg(x), call = caller_env() ) { if (!identical(x, Inf)) { check_number_whole(x, min = 1, arg = arg, call = call) } } ================================================ FILE: R/stringr-package.R ================================================ #' @keywords internal "_PACKAGE" ## usethis namespace: start #' @import stringi #' @import rlang #' @importFrom glue glue #' @importFrom lifecycle deprecated ## usethis namespace: end NULL ================================================ FILE: R/sub.R ================================================ #' Get and set substrings using their positions #' #' `str_sub()` extracts or replaces the elements at a single position in each #' string. `str_sub_all()` allows you to extract strings at multiple elements #' in every string. #' #' @inheritParams str_detect #' @param start,end A pair of integer vectors defining the range of characters #' to extract (inclusive). Positive values count from the left of the string, #' and negative values count from the right. In other words, if `string` is #' `"abcdef"` then 1 refers to `"a"` and -1 refers to `"f"`. #' #' Alternatively, instead of a pair of vectors, you can pass a matrix to #' `start`. The matrix should have two columns, either labelled `start` #' and `end`, or `start` and `length`. 
This makes `str_sub()` work directly #' with the output from [str_locate()] and friends. #' #' @param omit_na Single logical value. If `TRUE`, missing values in any of the #' arguments provided will result in an unchanged input. #' @param value Replacement string. #' @return #' * `str_sub()`: A character vector the same length as `string`/`start`/`end`. #' * `str_sub_all()`: A list the same length as `string`. Each element is #' a character vector the same length as `start`/`end`. #' #' If `end` comes before `start` or `start` is outside the range of `string` #' then the corresponding output will be the empty string. #' @seealso The underlying implementation in [stringi::stri_sub()] #' @export #' @examples #' hw <- "Hadley Wickham" #' #' str_sub(hw, 1, 6) #' str_sub(hw, end = 6) #' str_sub(hw, 8, 14) #' str_sub(hw, 8) #' #' # Negative values index from end of string #' str_sub(hw, -1) #' str_sub(hw, -7) #' str_sub(hw, end = -7) #' #' # str_sub() is vectorised by both string and position #' str_sub(hw, c(1, 8), c(6, 14)) #' #' # if you want to extract multiple positions from multiple strings, #' # use str_sub_all() #' x <- c("abcde", "ghifgh") #' str_sub(x, c(1, 2), c(2, 4)) #' str_sub_all(x, start = c(1, 2), end = c(2, 4)) #' #' # Alternatively, you can pass in a two column matrix, as in the #' # output from str_locate_all #' pos <- str_locate_all(hw, "[aeio]")[[1]] #' pos #' str_sub(hw, pos) #' #' # You can also use `str_sub()` to modify strings: #' x <- "BBCDEF" #' str_sub(x, 1, 1) <- "A"; x #' str_sub(x, -1, -1) <- "K"; x #' str_sub(x, -2, -2) <- "GHIJ"; x #' str_sub(x, 2, -2) <- ""; x str_sub <- function(string, start = 1L, end = -1L) { vctrs::vec_size_common(string = string, start = start, end = end) out <- if (is.matrix(start)) { stri_sub(string, from = start) } else { stri_sub(string, from = start, to = end) } # Preserve names unless `string` is recycled if (length(out) == length(string)) copy_names(string, out) else out } #' @export #' @rdname str_sub 
"str_sub<-" <- function(string, start = 1L, end = -1L, omit_na = FALSE, value) { vctrs::vec_size_common( string = string, start = start, end = end, value = value ) if (is.matrix(start)) { stri_sub(string, from = start, omit_na = omit_na) <- value } else { stri_sub(string, from = start, to = end, omit_na = omit_na) <- value } string } #' @export #' @rdname str_sub str_sub_all <- function(string, start = 1L, end = -1L) { out <- if (is.matrix(start)) { stri_sub_all(string, from = start) } else { stri_sub_all(string, from = start, to = end) } copy_names(string, out) } ================================================ FILE: R/subset.R ================================================ #' Find matching elements #' #' @description #' `str_subset()` returns all elements of `string` where there's at least #' one match to `pattern`. It's a wrapper around `x[str_detect(x, pattern)]`, #' and is equivalent to `grep(pattern, x, value = TRUE)`. #' #' Use [str_extract()] to find the text of the match _within_ each string. #' #' @inheritParams str_detect #' @return A character vector, usually smaller than `string`. #' @seealso [grep()] with argument `value = TRUE`, #' [stringi::stri_subset()] for the underlying implementation.
#' @export #' @examples #' fruit <- c("apple", "banana", "pear", "pineapple") #' str_subset(fruit, "a") #' #' str_subset(fruit, "^a") #' str_subset(fruit, "a$") #' str_subset(fruit, "b") #' str_subset(fruit, "[aeiou]") #' #' # Elements that don't match #' str_subset(fruit, "^p", negate = TRUE) #' #' # Missings never match #' str_subset(c("a", NA, "b"), ".") str_subset <- function(string, pattern, negate = FALSE) { check_lengths(string, pattern) check_bool(negate) idx <- switch( type(pattern), empty = no_empty(), bound = no_boundary(), fixed = str_detect(string, pattern, negate = negate), coll = str_detect(string, pattern, negate = negate), regex = str_detect(string, pattern, negate = negate) ) idx[is.na(idx)] <- FALSE string[idx] } #' Find matching indices #' #' `str_which()` returns the indices of `string` where there's at least #' one match to `pattern`. It's a wrapper around #' `which(str_detect(x, pattern))`, and is equivalent to `grep(pattern, x)`. #' #' @inheritParams str_detect #' @return An integer vector, usually smaller than `string`. #' @export #' @examples #' fruit <- c("apple", "banana", "pear", "pineapple") #' str_which(fruit, "a") #' #' # Elements that don't match #' str_which(fruit, "^p", negate = TRUE) #' #' # Missings never match #' str_which(c("a", NA, "b"), ".") str_which <- function(string, pattern, negate = FALSE) { which(str_detect(string, pattern, negate = negate)) } ================================================ FILE: R/trim.R ================================================ #' Remove whitespace #' #' `str_trim()` removes whitespace from start and end of string; `str_squish()` #' removes whitespace at the start and end, and replaces all internal whitespace #' with a single space. #' #' @inheritParams str_detect #' @param side Side on which to remove whitespace: "left", "right", or #' "both", the default. #' @return A character vector the same length as `string`. 
#' @export #' @seealso [str_pad()] to add whitespace #' @examples #' str_trim("  String with trailing and leading white space\t") #' str_trim("\n\nString with trailing and leading white space\n\n") #' #' str_squish("  String with trailing, middle, and leading white space\t") #' str_squish("\n\nString with excess, trailing and leading white space\n\n") str_trim <- function(string, side = c("both", "left", "right")) { side <- arg_match(side) out <- switch( side, left = stri_trim_left(string), right = stri_trim_right(string), both = stri_trim_both(string) ) copy_names(string, out) } #' @export #' @rdname str_trim str_squish <- function(string) { copy_names(string, stri_trim_both(str_replace_all(string, "\\s+", " "))) } ================================================ FILE: R/trunc.R ================================================ #' Truncate a string to maximum width #' #' Truncate a string to a fixed number of characters, so that #' `str_length(str_trunc(x, n))` is always less than or equal to `n`. #' #' @inheritParams str_detect #' @param width Maximum width of string. #' @param side,ellipsis Location and content of ellipsis that indicates #' content has been removed. #' @return A character vector the same length as `string`. #' @seealso [str_pad()] to increase the minimum width of a string. #' @export #' @examples #' x <- "This string is moderately long" #' rbind( #' str_trunc(x, 20, "right"), #' str_trunc(x, 20, "left"), #' str_trunc(x, 20, "center") #' ) str_trunc <- function( string, width, side = c("right", "left", "center"), ellipsis = "..." ) { check_number_whole(width) side <- arg_match(side) check_string(ellipsis) len <- str_length(string) too_long <- !is.na(string) & len > width width... <- width - str_length(ellipsis) if (width... < 0) { cli::cli_abort( tr_( "`width` ({width}) is shorter than `ellipsis` ({str_length(ellipsis)}).
) ) } string[too_long] <- switch( side, right = str_c(str_sub(string[too_long], 1, width...), ellipsis), left = str_c( ellipsis, str_sub(string[too_long], len[too_long] - width... + 1, -1) ), center = str_c( str_sub(string[too_long], 1, ceiling(width... / 2)), ellipsis, str_sub(string[too_long], len[too_long] - floor(width... / 2) + 1, -1) ) ) string } ================================================ FILE: R/unique.R ================================================ #' Remove duplicated strings #' #' `str_unique()` removes duplicated values, with optional control over #' how duplication is measured. #' #' @inheritParams str_detect #' @inheritParams str_equal #' @return A character vector, usually shorter than `string`. #' @seealso [unique()], [stringi::stri_unique()] which this function wraps. #' @examples #' str_unique(c("a", "b", "c", "b", "a")) #' #' str_unique(c("a", "b", "c", "B", "A")) #' str_unique(c("a", "b", "c", "B", "A"), ignore_case = TRUE) #' #' # Use ... to pass additional arguments to stri_unique() #' str_unique(c("motley", "mötley", "pinguino", "pingüino")) #' str_unique(c("motley", "mötley", "pinguino", "pingüino"), strength = 1) #' @export str_unique <- function(string, locale = "en", ignore_case = FALSE, ...) { check_string(locale) check_bool(ignore_case) opts <- str_opts_collator( locale = locale, ignore_case = ignore_case, ... 
) keep <- !stringi::stri_duplicated(string, opts_collator = opts) string[keep] } ================================================ FILE: R/utils.R ================================================ #' Pipe operator #' #' @name %>% #' @rdname pipe #' @keywords internal #' @export #' @importFrom magrittr %>% #' @usage lhs \%>\% rhs NULL check_lengths <- function( string, pattern, replacement = NULL, error_call = caller_env() ) { # stringi already correctly recycles vectors of length 0 and 1 # we just want more stringent vctrs checks for other lengths vctrs::vec_size_common( string = string, pattern = pattern, replacement = replacement, .call = error_call ) } no_boundary <- function(call = caller_env()) { cli::cli_abort(tr_("{.arg pattern} can't be a boundary."), call = call) } no_empty <- function(call = caller_env()) { cli::cli_abort( tr_("{.arg pattern} can't be the empty string ({.code \"\"})."), call = call ) } tr_ <- function(...) { enc2utf8(gettext(paste0(...), domain = "R-stringr")) } # copy names from `string` to output, regardless of output type copy_names <- function(from, to) { nm <- names(from) if (is.null(nm)) { return(to) } if (is.matrix(to)) { rownames(to) <- nm to } else { set_names(to, nm) } } # keep names if pattern is scalar (i.e. vectorised) or same length as string. keep_names <- function(string, pattern) { length(pattern) == 1L || length(pattern) == length(string) } preserve_names_if_possible <- function(string, pattern, out) { if (keep_names(string, pattern)) { copy_names(string, out) } else { out } } ================================================ FILE: R/view.R ================================================ #' View strings and matches #' #' @description #' `str_view()` is used to print the underlying representation of a string and #' to see how a `pattern` matches. #' #' Matches are surrounded by `<>` and unusual whitespace (i.e. all whitespace #' apart from `" "` and `"\n"`) are surrounded by `{}` and escaped. 
Where #' possible, matches and unusual whitespace are coloured blue and `NA`s red. #' #' @inheritParams str_detect #' @param match If `pattern` is supplied, which elements should be shown? #' #' * `TRUE`, the default, shows only elements that match the pattern. #' * `NA` shows all elements. #' * `FALSE` shows only elements that don't match the pattern. #' #' If `pattern` is not supplied, all elements are always shown. #' @param html Use HTML output? If `TRUE` will create an HTML widget; if `FALSE` #' will style using ANSI escapes. #' @param use_escapes If `TRUE`, all non-ASCII characters will be rendered #' with unicode escapes. This is useful to see exactly what underlying #' values are stored in the string. #' @export #' @examples #' # Show special characters #' str_view(c("\"\\", "\\\\\\", "fgh", NA, "NA")) #' #' # A non-breaking space looks like a regular space: #' nbsp <- "Hi\u00A0you" #' nbsp #' # But it doesn't behave like one: #' str_detect(nbsp, " ") #' # So str_view() brings it to your attention with a blue background #' str_view(nbsp) #' #' # You can also use escapes to see all non-ASCII characters #' str_view(nbsp, use_escapes = TRUE) #' #' # Supply a pattern to see where it matches #' str_view(c("abc", "def", "fghi"), "[aeiou]") #' str_view(c("abc", "def", "fghi"), "^") #' str_view(c("abc", "def", "fghi"), "..") #' #' # By default, only matching strings will be shown #' str_view(c("abc", "def", "fghi"), "e") #' # but you can show all: #' str_view(c("abc", "def", "fghi"), "e", match = NA) #' # or just those that don't match: #' str_view(c("abc", "def", "fghi"), "e", match = FALSE) str_view <- function( string, pattern = NULL, match = TRUE, html = FALSE, use_escapes = FALSE ) { rec <- vctrs::vec_recycle_common(string = string, pattern = pattern) string <- rec$string pattern <- rec$pattern check_bool(match, allow_na = TRUE) check_bool(html) check_bool(use_escapes) filter <- str_view_filter(string, pattern, match) out <- string[filter] pattern <- 
pattern[filter] if (!is.null(pattern)) { out <- str_replace_all(out, pattern, str_view_highlighter(html)) } if (use_escapes) { out <- stri_escape_unicode(out) out <- str_replace_all(out, fixed("\\u001b"), "\u001b") } else { out <- str_view_special(out, html = html) } str_view_print(out, filter, html = html) } #' @rdname str_view #' @usage NULL #' @export str_view_all <- function( string, pattern = NULL, match = NA, html = FALSE, use_escapes = FALSE ) { lifecycle::deprecate_warn("1.5.0", "str_view_all()", "str_view()") str_view( string = string, pattern = pattern, match = match, html = html, use_escapes = use_escapes ) } str_view_filter <- function(x, pattern, match) { if (is.null(pattern) || inherits(pattern, "stringr_boundary")) { rep(TRUE, length(x)) } else { if (identical(match, TRUE)) { str_detect(x, pattern) & !is.na(x) } else if (identical(match, FALSE)) { !str_detect(x, pattern) | is.na(x) } else { rep(TRUE, length(x)) } } } # Helpers ----------------------------------------------------------------- str_view_highlighter <- function(html = TRUE) { if (html) { function(x) str_c("<span class='match'>", x, "</span>") } else { function(x) { out <- cli::col_cyan("<", x, ">") # Ensure styling starts and ends within each line out <- cli::ansi_strsplit(out, "\n", fixed = TRUE) out <- map_chr(out, str_flatten, "\n") out } } } str_view_special <- function(x, html = TRUE) { if (html) { replace <- function(x) str_c("<span class='special'>", x, "</span>") } else { replace <- function(x) { if (length(x) == 0) { return(character()) } cli::col_cyan("{", stri_escape_unicode(x), "}") } } # Highlight any non-standard whitespace characters str_replace_all(x, "[\\p{Whitespace}-- \n]+", replace) } str_view_print <- function(x, filter, html = TRUE) { if (html) { str_view_widget(x) } else { structure(x, id = which(filter), class = "stringr_view") } } str_view_widget <- function(lines) { check_installed(c("htmltools", "htmlwidgets")) lines <- str_replace_na(lines) bullets <- str_c( "
<ul>\n", str_c("
  <li>", lines, "
</li>", collapse = "\n"), "\n</ul>
" ) html <- htmltools::HTML(bullets) size <- htmlwidgets::sizingPolicy( knitr.figure = FALSE, defaultHeight = pmin(10 * length(lines), 300), knitr.defaultHeight = "100%" ) htmlwidgets::createWidget( "str_view", list(html = html), sizingPolicy = size, package = "stringr" ) } #' @export print.stringr_view <- function(x, ..., n = getOption("stringr.view_n", 20)) { n_extra <- length(x) - n if (n_extra > 0) { x <- x[seq_len(n)] } if (length(x) == 0) { cli::cli_inform(c(x = "Empty `string` provided.\n")) return(invisible(x)) } bar <- if (cli::is_utf8_output()) "\u2502" else "|" id <- format(paste0("[", attr(x, "id"), "] "), justify = "right") indent <- paste0(cli::col_grey(id, bar), " ") exdent <- paste0(strrep(" ", nchar(id[[1]])), cli::col_grey(bar), " ") x[is.na(x)] <- cli::col_red("NA") x <- paste0(indent, x) x <- str_replace_all(x, "\n", paste0("\n", exdent)) cat(x, sep = "\n") if (n_extra > 0) { cat("... and ", n_extra, " more\n", sep = "") } invisible(x) } #' @export `[.stringr_view` <- function(x, i, ...) { structure(NextMethod(), id = attr(x, "id")[i], class = "stringr_view") } ================================================ FILE: R/word.R ================================================ #' Extract words from a sentence #' #' @inheritParams str_detect #' @param start,end Pair of integer vectors giving range of words (inclusive) #' to extract. If negative, counts backwards from the last word. #' #' The default values select the first word. #' @param sep Separator between words. Defaults to a single space. #' @return A character vector with the same length as `string`/`start`/`end`.
#' @export #' @examples #' sentences <- c("Jane saw a cat", "Jane sat down") #' word(sentences, 1) #' word(sentences, 2) #' word(sentences, -1) #' word(sentences, 2, -1) #' #' # Also vectorised over start and end #' word(sentences[1], 1:3, -1) #' word(sentences[1], 1, 1:4) #' #' # Can define words by other separators #' str <- 'abc.def..123.4568.999' #' word(str, 1, sep = fixed('..')) #' word(str, 2, sep = fixed('..')) word <- function(string, start = 1L, end = start, sep = fixed(" ")) { args <- vctrs::vec_recycle_common(string = string, start = start, end = end) string <- args$string start <- args$start end <- args$end breaks <- str_locate_all(string, sep) words <- lapply(breaks, invert_match) # Convert negative values into actual positions len <- vapply(words, nrow, integer(1)) neg_start <- !is.na(start) & start < 0L start[neg_start] <- start[neg_start] + len[neg_start] + 1L neg_end <- !is.na(end) & end < 0L end[neg_end] <- end[neg_end] + len[neg_end] + 1L # Replace indexes past end with NA start[start > len] <- NA end[end > len] <- NA # To return all words when trying to extract more words than available start[start < 1L] <- 1 # Extract locations starts <- mapply(function(word, loc) word[loc, "start"], words, start) ends <- mapply(function(word, loc) word[loc, "end"], words, end) copy_names(string, str_sub(string, starts, ends)) } ================================================ FILE: R/wrap.R ================================================ #' Wrap words into nicely formatted paragraphs #' #' Wrap words into paragraphs, minimizing the "raggedness" of the lines #' (i.e. the variation in line length) using the Knuth-Plass algorithm. #' #' @inheritParams str_detect #' @param width Positive integer giving target line width (in number of #' characters). A width less than or equal to 1 will put each word on its #' own line. #' @param indent,exdent A non-negative integer giving the indent for the #' first line (`indent`) and all subsequent lines (`exdent`).
#' @param whitespace_only A boolean. #' * If `TRUE` (the default) wrapping will only occur at whitespace. #' * If `FALSE`, can break on any non-word character (e.g. `/`, `-`). #' @return A character vector the same length as `string`. #' @seealso [stringi::stri_wrap()] for the underlying implementation. #' @export #' @examples #' thanks_path <- file.path(R.home("doc"), "THANKS") #' thanks <- str_c(readLines(thanks_path), collapse = "\n") #' thanks <- word(thanks, 1, 3, fixed("\n\n")) #' cat(str_wrap(thanks), "\n") #' cat(str_wrap(thanks, width = 40), "\n") #' cat(str_wrap(thanks, width = 60, indent = 2), "\n") #' cat(str_wrap(thanks, width = 60, exdent = 2), "\n") #' cat(str_wrap(thanks, width = 0, exdent = 2), "\n") str_wrap <- function( string, width = 80, indent = 0, exdent = 0, whitespace_only = TRUE ) { check_number_decimal(width) if (width <= 0) { width <- 1 } check_number_whole(indent) check_number_whole(exdent) check_bool(whitespace_only) out <- stri_wrap( string, width = width, indent = indent, exdent = exdent, whitespace_only = whitespace_only, simplify = FALSE ) out <- vapply(out, str_c, collapse = "\n", character(1)) copy_names(string, out) } ================================================ FILE: README.Rmd ================================================ --- output: github_document --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "README-" ) library(stringr) ``` # stringr
[![CRAN status](https://www.r-pkg.org/badges/version/stringr)](https://cran.r-project.org/package=stringr) [![R-CMD-check](https://github.com/tidyverse/stringr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/tidyverse/stringr/actions/workflows/R-CMD-check.yaml) [![Codecov test coverage](https://codecov.io/gh/tidyverse/stringr/branch/main/graph/badge.svg)](https://app.codecov.io/gh/tidyverse/stringr?branch=main) [![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable) ## Overview Strings are not glamorous, high-profile components of R, but they do play a big role in many data cleaning and preparation tasks. The stringr package provides a cohesive set of functions designed to make working with strings as easy as possible. If you're not familiar with strings, the best place to start is the [chapter on strings](https://r4ds.hadley.nz/strings) in R for Data Science. stringr is built on top of [stringi](https://github.com/gagolews/stringi), which uses the [ICU](https://icu.unicode.org) C library to provide fast, correct implementations of common string manipulations. stringr focusses on the most important and commonly used string manipulation functions whereas stringi provides a comprehensive set covering almost anything you can imagine. If you find that stringr is missing a function that you need, try looking in stringi. Both packages share similar conventions, so once you've mastered stringr, you should find stringi similarly easy to use. 
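As a concrete example of that overlap, here is a stringr verb next to a roughly equivalent stringi call (an illustrative chunk, not taken from the package documentation; both functions are real exports of their packages):

```{r}
library(stringr)

# a stringr verb
str_detect("banana", "an")

# a roughly equivalent call using stringi directly
stringi::stri_detect_regex("banana", "an")
```

Both calls return `TRUE`; stringr mostly wraps stringi functions like this behind shorter, consistent names.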
## Installation ```r # The easiest way to get stringr is to install the whole tidyverse: install.packages("tidyverse") # Alternatively, install just stringr: install.packages("stringr") ``` ## Cheatsheet ## Usage All functions in stringr start with `str_` and take a vector of strings as the first argument: ```{r} x <- c("why", "video", "cross", "extra", "deal", "authority") str_length(x) str_c(x, collapse = ", ") str_sub(x, 1, 2) ``` Most string functions work with regular expressions, a concise language for describing patterns of text. For example, the regular expression `"[aeiou]"` matches any single character that is a vowel: ```{r} str_subset(x, "[aeiou]") str_count(x, "[aeiou]") ``` There are eight main verbs that work with patterns: * `str_detect(x, pattern)` tells you if there's any match to the pattern: ```{r} str_detect(x, "[aeiou]") ``` * `str_count(x, pattern)` counts the number of matches: ```{r} str_count(x, "[aeiou]") ``` * `str_subset(x, pattern)` extracts the matching components: ```{r} str_subset(x, "[aeiou]") ``` * `str_locate(x, pattern)` gives the position of the match: ```{r} str_locate(x, "[aeiou]") ``` * `str_extract(x, pattern)` extracts the text of the match: ```{r} str_extract(x, "[aeiou]") ``` * `str_match(x, pattern)` extracts parts of the match defined by parentheses: ```{r} # extract the characters on either side of the vowel str_match(x, "(.)[aeiou](.)") ``` * `str_replace(x, pattern, replacement)` replaces the matches with new text: ```{r} str_replace(x, "[aeiou]", "?") ``` * `str_split(x, pattern)` splits up a string into multiple pieces: ```{r} str_split(c("a,b", "c,d,e"), ",") ``` As well as regular expressions (the default), there are three other pattern matching engines: * `fixed()`: match exact bytes * `coll()`: match human letters * `boundary()`: match boundaries ## RStudio Addin The [RegExplain RStudio addin](https://www.garrickadenbuie.com/project/regexplain/) provides a friendly interface for working with regular
expressions and functions from stringr. This addin allows you to interactively build your regexp, check the output of common string matching functions, consult the interactive help pages, or use the included resources to learn regular expressions. This addin can easily be installed with devtools: ```r # install.packages("devtools") devtools::install_github("gadenbuie/regexplain") ``` ## Compared to base R R provides a solid set of string operations, but because they have grown organically over time, they can be inconsistent and a little hard to learn. Additionally, they lag behind the string operations in other programming languages, so that some things that are easy to do in languages like Ruby or Python are rather hard to do in R. * Uses consistent function and argument names. The first argument is always the vector of strings to modify, which makes stringr work particularly well in conjunction with the pipe: ```{r} letters %>% .[1:10] %>% str_pad(3, "right") %>% str_c(letters[2:11]) ``` * Simplifies string operations by eliminating options that you don't need 95% of the time. * Produces outputs that can easily be used as inputs. This includes ensuring that missing inputs result in missing outputs, and zero length inputs result in zero length outputs.
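These guarantees are easy to check interactively (an illustrative chunk, not part of the original README):

```{r}
library(stringr)

# missing inputs produce missing outputs
str_to_upper(c("a", NA))

# zero-length inputs produce zero-length outputs
str_length(character())
```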
Learn more in `vignette("from-base")` ================================================ FILE: README.md ================================================ # stringr [![CRAN status](https://www.r-pkg.org/badges/version/stringr)](https://cran.r-project.org/package=stringr) [![R-CMD-check](https://github.com/tidyverse/stringr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/tidyverse/stringr/actions/workflows/R-CMD-check.yaml) [![Codecov test coverage](https://codecov.io/gh/tidyverse/stringr/branch/main/graph/badge.svg)](https://app.codecov.io/gh/tidyverse/stringr?branch=main) [![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable) ## Overview Strings are not glamorous, high-profile components of R, but they do play a big role in many data cleaning and preparation tasks. The stringr package provides a cohesive set of functions designed to make working with strings as easy as possible. If you’re not familiar with strings, the best place to start is the [chapter on strings](https://r4ds.hadley.nz/strings) in R for Data Science. stringr is built on top of [stringi](https://github.com/gagolews/stringi), which uses the [ICU](https://icu.unicode.org) C library to provide fast, correct implementations of common string manipulations. stringr focusses on the most important and commonly used string manipulation functions whereas stringi provides a comprehensive set covering almost anything you can imagine. If you find that stringr is missing a function that you need, try looking in stringi. Both packages share similar conventions, so once you’ve mastered stringr, you should find stringi similarly easy to use. 
## Installation ``` r # The easiest way to get stringr is to install the whole tidyverse: install.packages("tidyverse") # Alternatively, install just stringr: install.packages("stringr") ``` ## Cheatsheet ## Usage All functions in stringr start with `str_` and take a vector of strings as the first argument: ``` r x <- c("why", "video", "cross", "extra", "deal", "authority") str_length(x) #> [1] 3 5 5 5 4 9 str_c(x, collapse = ", ") #> [1] "why, video, cross, extra, deal, authority" str_sub(x, 1, 2) #> [1] "wh" "vi" "cr" "ex" "de" "au" ``` Most string functions work with regular expressions, a concise language for describing patterns of text. For example, the regular expression `"[aeiou]"` matches any single character that is a vowel: ``` r str_subset(x, "[aeiou]") #> [1] "video" "cross" "extra" "deal" "authority" str_count(x, "[aeiou]") #> [1] 0 3 1 2 2 4 ``` There are eight main verbs that work with patterns: - `str_detect(x, pattern)` tells you if there’s any match to the pattern: ``` r str_detect(x, "[aeiou]") #> [1] FALSE TRUE TRUE TRUE TRUE TRUE ``` - `str_count(x, pattern)` counts the number of matches: ``` r str_count(x, "[aeiou]") #> [1] 0 3 1 2 2 4 ``` - `str_subset(x, pattern)` extracts the matching components: ``` r str_subset(x, "[aeiou]") #> [1] "video" "cross" "extra" "deal" "authority" ``` - `str_locate(x, pattern)` gives the position of the match: ``` r str_locate(x, "[aeiou]") #> start end #> [1,] NA NA #> [2,] 2 2 #> [3,] 3 3 #> [4,] 1 1 #> [5,] 2 2 #> [6,] 1 1 ``` - `str_extract(x, pattern)` extracts the text of the match: ``` r str_extract(x, "[aeiou]") #> [1] NA "i" "o" "e" "e" "a" ``` - `str_match(x, pattern)` extracts parts of the match defined by parentheses: ``` r # extract the characters on either side of the vowel str_match(x, "(.)[aeiou](.)") #> [,1] [,2] [,3] #> [1,] NA NA NA #> [2,] "vid" "v" "d" #> [3,] "ros" "r" "s" #> [4,] NA NA NA #> [5,] "dea" "d" "a" #> [6,] "aut" "a" "t" ``` - `str_replace(x, pattern, replacement)` replaces the
matches with new text: ``` r str_replace(x, "[aeiou]", "?") #> [1] "why" "v?deo" "cr?ss" "?xtra" "d?al" "?uthority" ``` - `str_split(x, pattern)` splits up a string into multiple pieces: ``` r str_split(c("a,b", "c,d,e"), ",") #> [[1]] #> [1] "a" "b" #> #> [[2]] #> [1] "c" "d" "e" ``` As well as regular expressions (the default), there are three other pattern matching engines: - `fixed()`: match exact bytes - `coll()`: match human letters - `boundary()`: match boundaries ## RStudio Addin The [RegExplain RStudio addin](https://www.garrickadenbuie.com/project/regexplain/) provides a friendly interface for working with regular expressions and functions from stringr. This addin allows you to interactively build your regexp, check the output of common string matching functions, consult the interactive help pages, or use the included resources to learn regular expressions. This addin can easily be installed with devtools: ``` r # install.packages("devtools") devtools::install_github("gadenbuie/regexplain") ``` ## Compared to base R R provides a solid set of string operations, but because they have grown organically over time, they can be inconsistent and a little hard to learn. Additionally, they lag behind the string operations in other programming languages, so that some things that are easy to do in languages like Ruby or Python are rather hard to do in R. - Uses consistent function and argument names. The first argument is always the vector of strings to modify, which makes stringr work particularly well in conjunction with the pipe: ``` r letters %>% .[1:10] %>% str_pad(3, "right") %>% str_c(letters[2:11]) #> [1] "a b" "b c" "c d" "d e" "e f" "f g" "g h" "h i" "i j" "j k" ``` - Simplifies string operations by eliminating options that you don’t need 95% of the time. - Produces outputs that can easily be used as inputs. This includes ensuring that missing inputs result in missing outputs, and zero length inputs result in zero length outputs.
Learn more in `vignette("from-base")` ================================================ FILE: _pkgdown.yml ================================================ url: https://stringr.tidyverse.org development: mode: auto template: package: tidytemplate bootstrap: 5 includes: in_header: | home: links: - text: Learn more at R4DS href: http://r4ds.hadley.nz/strings.html reference: - title: Pattern matching - subtitle: String contents: - str_count - str_detect - str_escape - str_extract - str_locate - str_match - str_replace - str_remove - str_split - str_starts - modifiers - subtitle: Vector desc: > Unlike other pattern matching functions, these functions operate on the original character vector, not the individual matches. contents: - str_subset - str_which - title: Combining strings contents: - str_c - str_flatten - str_glue - title: Character based contents: - str_dup - str_length - str_pad - str_sub - str_trim - str_trunc - str_wrap - title: Locale aware contents: - str_order - str_equal - case - str_unique - title: Other helpers contents: - invert_match - str_conv - str_like - str_replace_na - str_to_camel - str_view - word - title: Bundled data contents: - "`stringr-data`" news: releases: - text: "Version 1.6.0" href: https://tidyverse.org/blog/2025/11/stringr-1-6-0/ - text: "Version 1.5.0" href: https://www.tidyverse.org/blog/2022/12/stringr-1-5-0/ - text: "Version 1.4.0" href: https://www.tidyverse.org/articles/2019/02/stringr-1-4-0/ - text: "Version 1.3.0" href: https://www.tidyverse.org/articles/2018/02/stringr-1-3-0/ - text: "Version 1.2.0" href: https://blog.rstudio.com/2017/04/12/tidyverse-updates/ - text: "Version 1.1.0" href: https://blog.rstudio.com/2016/08/24/stringr-1-1-0/ - text: "Version 1.0.0" href: https://blog.rstudio.com/2015/05/05/stringr-1-0-0/ ================================================ FILE: air.toml ================================================ ================================================ FILE: codecov.yml 
================================================ comment: false coverage: status: project: default: target: auto threshold: 1% informational: true patch: default: target: auto threshold: 1% informational: true ================================================ FILE: cran-comments.md ================================================ ## R CMD check results 0 errors | 0 warnings | 0 notes ## revdepcheck results We checked 2390 reverse dependencies, comparing R CMD check results across CRAN and dev versions of this package. * We saw 9 new problems * We failed to check 2 packages We've been working with maintainers for over a month to get fixes to CRAN in a timely manner. You can track our efforts at . ================================================ FILE: data-raw/harvard-sentences.txt ================================================ The birch canoe slid on the smooth planks. Glue the sheet to the dark blue background. It's easy to tell the depth of a well. These days a chicken leg is a rare dish. Rice is often served in round bowls. The juice of lemons makes fine punch. The box was thrown beside the parked truck. The hogs were fed chopped corn and garbage. Four hours of steady work faced us. A large size in stockings is hard to sell. The boy was there when the sun rose. A rod is used to catch pink salmon. The source of the huge river is the clear spring. Kick the ball straight and follow through. Help the woman get back to her feet. A pot of tea helps to pass the evening. Smoky fires lack flame and heat. The soft cushion broke the man's fall. The salt breeze came across from the sea. The girl at the booth sold fifty bonds. The small pup gnawed a hole in the sock. The fish twisted and turned on the bent hook. Press the pants and sew a button on the vest. The swan dive was far short of perfect. The beauty of the view stunned the young boy. Two blue fish swam in the tank. Her purse was full of useless trash. The colt reared and threw the tall rider.
It snowed, rained, and hailed the same morning. Read verse out loud for pleasure. Hoist the load to your left shoulder. Take the winding path to reach the lake. Note closely the size of the gas tank. Wipe the grease off his dirty face. Mend the coat before you go out. The wrist was badly strained and hung limp. The stray cat gave birth to kittens. The young girl gave no clear response. The meal was cooked before the bell rang. What joy there is in living. A king ruled the state in the early days. The ship was torn apart on the sharp reef. Sickness kept him home the third week. The wide road shimmered in the hot sun. The lazy cow lay in the cool grass. Lift the square stone over the fence. The rope will bind the seven books at once. Hop over the fence and plunge in. The friendly gang left the drug store. Mesh wire keeps chicks inside. The frosty air passed through the coat. The crooked maze failed to fool the mouse. Adding fast leads to wrong sums. The show was a flop from the very start. A saw is a tool used for making boards. The wagon moved on well oiled wheels. March the soldiers past the next hill. A cup of sugar makes sweet fudge. Place a rosebush near the porch steps. Both lost their lives in the raging storm. We talked of the side show in the circus. Use a pencil to write the first draft. He ran half way to the hardware store. The clock struck to mark the third period. A small creek cut across the field. Cars and busses stalled in snow drifts. The set of china hit the floor with a crash. This is a grand season for hikes on the road. The dune rose from the edge of the water. Those words were the cue for the actor to leave. A yacht slid around the point into the bay. The two met while playing on the sand. The ink stain dried on the finished page. The walled town was seized without a fight. The lease ran out in sixteen weeks. A tame squirrel makes a nice pet. The horn of the car woke the sleeping cop. The heart beat strongly and with firm strokes. 
The pearl was worn in a thin silver ring. The fruit peel was cut in thick slices. The Navy attacked the big task force. See the cat glaring at the scared mouse. There are more than two factors here. The hat brim was wide and too droopy. The lawyer tried to lose his case. The grass curled around the fence post. Cut the pie into large parts. Men strive but seldom get rich. Always close the barn door tight. He lay prone and hardly moved a limb. The slush lay deep along the street. A wisp of cloud hung in the blue air. A pound of sugar costs more than eggs. The fin was sharp and cut the clear water. The play seems dull and quite stupid. Bail the boat to stop it from sinking. The term ended in late june that year. A Tusk is used to make costly gifts. Ten pins were set in order. The bill was paid every third week. Oak is strong and also gives shade. Cats and Dogs each hate the other. The pipe began to rust while new. Open the crate but don't break the glass. Add the sum to the product of these three. Thieves who rob friends deserve jail. The ripe taste of cheese improves with age. Act on these orders with great speed. The hog crawled under the high fence. Move the vat over the hot fire. The bark of the pine tree was shiny and dark. Leaves turn brown and yellow in the fall. The pennant waved when the wind blew. Split the log with a quick, sharp blow. Burn peat after the logs give out. He ordered peach pie with ice cream. Weave the carpet on the right hand side. Hemp is a weed found in parts of the tropics. A lame back kept his score low. We find joy in the simplest things. Type out three lists of orders. The harder he tried the less he got done. The boss ran the show with a watchful eye. The cup cracked and spilled its contents. Paste can cleanse the most dirty brass. The slang word for raw whiskey is booze. It caught its hind paw in a rusty trap. The wharf could be seen at the farther shore. Feel the heat of the weak dying flame. The tiny girl took off her hat. 
A cramp is no small danger on a swim. He said the same phrase thirty times. Pluck the bright rose without leaves. Two plus seven is less than ten. The glow deepened in the eyes of the sweet girl. Bring your problems to the wise chief. Write a fond note to the friend you cherish. Clothes and lodging are free to new men. We frown when events take a bad turn. Port is a strong wine with a smoky taste. The young kid jumped the rusty gate. Guess the result from the first scores. A salt pickle tastes fine with ham. The just claim got the right verdict. Those thistles bend in a high wind. Pure bred poodles have curls. The tree top waved in a graceful way. The spot on the blotter was made by green ink. Mud was spattered on the front of his white shirt. The cigar burned a hole in the desk top. The empty flask stood on the tin tray. A speedy man can beat this track mark. He broke a new shoelace that day. The coffee stand is too high for the couch. The urge to write short stories is rare. The pencils have all been used. The pirates seized the crew of the lost ship. We tried to replace the coin but failed. She sewed the torn coat quite neatly. The sofa cushion is red and of light weight. The jacket hung on the back of the wide chair. At that high level the air is pure. Drop the two when you add the figures. A filing case is now hard to buy. An abrupt start does not win the prize. Wood is best for making toys and blocks. The office paint was a dull, sad tan. He knew the skill of the great young actress. A rag will soak up spilled water. A shower of dirt fell from the hot pipes. Steam hissed from the broken valve. The child almost hurt the small dog. There was a sound of dry leaves outside. The sky that morning was clear and bright blue. Torn scraps littered the stone floor. Sunday is the best part of the week. The doctor cured him with these pills. The new girl was fired today at noon. They felt gay when the ship arrived in port. Add the store's account to the last cent. 
Acid burns holes in wool cloth. Fairy tales should be fun to write. Eight miles of woodland burned to waste. The third act was dull and tired the players. A young child should not suffer fright. Add the column and put the sum here. We admire and love a good cook. There the flood mark is ten inches. He carved a head from the round block of marble. She has a smart way of wearing clothes. The fruit of a fig tree is apple shaped. Corn cobs can be used to kindle a fire. Where were they when the noise started. The paper box is full of thumb tacks. Sell your gift to a buyer at a good gain. The tongs lay beside the ice pail. The petals fall with the next puff of wind. Bring your best compass to the third class. They could laugh although they were sad. Farmers came in to thresh the oat crop. The brown house was on fire to the attic. The lure is used to catch trout and flounder. Float the soap on top of the bath water. A blue crane is a tall wading bird. A fresh start will work such wonders. The club rented the rink for the fifth night. After the dance, they went straight home. The hostess taught the new maid to serve. He wrote his last novel there at the inn. Even the worst will beat his low score. The cement had dried when he moved it. The loss of the second ship was hard to take. The fly made its way along the wall. Do that with a wooden stick. Live wires should be kept covered. The large house had hot water taps. It is hard to erase blue or red ink. Write at once or you may forget it. The doorknob was made of bright clean brass. The wreck occurred by the bank on Main Street. A pencil with black lead writes best. Coax a young calf to drink from a bucket. Schools for ladies teach charm and grace. The lamp shone with a steady green flame. They took the axe and the saw to the forest. The ancient coin was quite dull and worn. The shaky barn fell with a loud crash. Jazz and swing fans like fast music. Rake the rubbish up and then burn it. 
Slash the gold cloth into fine ribbons. Try to have the court decide the case. They are pushed back each time they attack. He broke his ties with groups of former friends. They floated on the raft to sun their white backs. The map had an X that meant nothing. Whitings are small fish caught in nets. Some ads serve to cheat buyers. Jerk the rope and the bell rings weakly. A waxed floor makes us lose balance. Madam, this is the best brand of corn. On the islands the sea breeze is soft and mild. The play began as soon as we sat down. This will lead the world to more sound and fury. Add salt before you fry the egg. The rush for funds reached its peak Tuesday. The birch looked stark white and lonesome. The box is held by a bright red snapper. To make pure ice, you freeze water. The first worm gets snapped early. Jump the fence and hurry up the bank. Yell and clap as the curtain slides back. They are men who walk the middle of the road. Both brothers wear the same size. In some form or other we need fun. The prince ordered his head chopped off. The houses are built of red clay bricks. Ducks fly north but lack a compass. Fruit flavors are used in fizz drinks. These pills do less good than others. Canned pears lack full flavor. The dark pot hung in the front closet. Carry the pail to the wall and spill it there. The train brought our hero to the big town. We are sure that one war is enough. Gray paint stretched for miles around. The rude laugh filled the empty room. High seats are best for football fans. Tea served from the brown jug is tasty. A dash of pepper spoils beef stew. A zestful food is the hot-cross bun. The horse trotted around the field at a brisk pace. Find the twin who stole the pearl necklace. Cut the cord that binds the box tightly. The red tape bound the smuggled food. Look in the corner to find the tan shirt. The cold drizzle will halt the bond drive. Nine men were hired to dig the ruins. The junk yard had a mouldy smell. 
The flint sputtered and lit a pine torch. Soak the cloth and drown the sharp odor. The shelves were bare of both jam or crackers. A joy to every child is the swan boat. All sat frozen and watched the screen. A cloud of dust stung his tender eyes. To reach the end he needs much courage. Shape the clay gently into block form. A ridge on a smooth surface is a bump or flaw. Hedge apples may stain your hands green. Quench your thirst, then eat the crackers. Tight curls get limp on rainy days. The mute muffled the high tones of the horn. The gold ring fits only a pierced ear. The old pan was covered with hard fudge. Watch the log float in the wide river. The node on the stalk of wheat grew daily. The heap of fallen leaves was set on fire. Write fast if you want to finish early. His shirt was clean but one button was gone. The barrel of beer was a brew of malt and hops. Tin cans are absent from store shelves. Slide the box into that empty space. The plant grew large and green in the window. The beam dropped down on the workman's head. Pink clouds floated with the breeze. She danced like a swan, tall and graceful. The tube was blown and the tire flat and useless. It is late morning on the old wall clock. Let's all join as we sing the last chorus. The last switch cannot be turned off. The fight will end in just six minutes. The store walls were lined with colored frocks. The peace league met to discuss their plans. The rise to fame of a person takes luck. Paper is scarce, so write with much care. The quick fox jumped on the sleeping cat. The nozzle of the fire hose was bright brass. Screw the round cap on as tight as needed. Time brings us many changes. The purple tie was ten years old. Men think and plan and sometimes act. Fill the ink jar with sticky glue. He smoke a big pipe with strong contents. We need grain to keep our mules healthy. Pack the records in a neat thin case. The crunch of feet in the snow was the only sound. The copper bowl shone in the sun's rays. 
Boards will warp unless kept dry. The plush chair leaned against the wall. Glass will clink when struck by metal. Bathe and relax in the cool green grass. Nine rows of soldiers stood in a line. The beach is dry and shallow at low tide. The idea is to sew both edges straight. The kitten chased the dog down the street. Pages bound in cloth make a book. Try to trace the fine lines of the painting. Women form less than half of the group. The zones merge in the central part of town. A gem in the rough needs work to polish. Code is used when secrets are sent. Most of the news is easy for us to hear. He used the lathe to make brass objects. The vane on top of the pole revolved in the wind. Mince pie is a dish served to children. The clan gathered on each dull night. Let it burn, it gives us warmth and comfort. A castle built from sand fails to endure. A child's wit saved the day for us. Tack the strip of carpet to the worn floor. Next Tuesday we must vote. Pour the stew from the pot into the plate. Each penny shone like new. The man went to the woods to gather sticks. The dirt piles were lines along the road. The logs fell and tumbled into the clear stream. Just hoist it up and take it away. A ripe plum is fit for a king's palate. Our plans right now are hazy. Brass rings are sold by these natives. It takes a good trap to capture a bear. Feed the white mouse some flower seeds. The thaw came early and freed the stream. He took the lead and kept it the whole distance. The key you designed will fit the lock. Plead to the council to free the poor thief. Better hash is made of rare beef. This plank was made for walking on . The lake sparkled in the red hot sun. He crawled with care along the ledge. Tend the sheep while the dog wanders. It takes a lot of help to finish these. Mark the spot with a sign painted red. Take two shares as a fair profit. The fur of cats goes by many names. North winds bring colds and fevers. He asks no person to vouch for him. 
Go now and come here later. A sash of gold silk will trim her dress. Soap can wash most dirt away. That move means the game is over. He wrote down a long list of items. A siege will crack the strong defense. Grape juice and water mix well. Roads are paved with sticky tar. Fake stones shine but cost little. The drip of the rain made a pleasant sound. Smoke poured out of every crack. Serve the hot rum to the tired heroes. Much of the story makes good sense. The sun came up to light the eastern sky. Heave the line over the port side. A lathe cuts and trims any wood. It's a dense crowd in two distinct ways. His hip struck the knee of the next player. The stale smell of old beer lingers. The desk was firm on the shaky floor. It takes heat to bring out the odor. Beef is scarcer than some lamb. Raise the sail and steer the ship northward. A cone costs five cents on Mondays. A pod is what peas always grow in. Jerk that dart from the cork target. No cement will hold hard wood. We now have a new base for shipping. A list of names is carved around the base. The sheep were led home by a dog. Three for a dime, the young peddler cried. The sense of smell is better than that of touch. No hardship seemed to make him sad. Grace makes up for lack of beauty. Nudge gently but wake her now. The news struck doubt into restless minds. Once we stood beside the shore. A chink in the wall allowed a draft to blow. Fasten two pins on each side. A cold dip restores health and zest. He takes the oath of office each March. The sand drifts over the sills of the old house. The point of the steel pen was bent and twisted. There is a lag between thought and act. Seed is needed to plant the spring corn. Draw the chart with heavy black lines. The boy owed his pal thirty cents. The chap slipped into the crowd and was lost. Hats are worn to tea and not to dinner. The ramp led up to the wide highway. Beat the dust from the rug onto the lawn. Say it slowly but make it ring clear. 
The straw nest housed five robins. Screen the porch with woven straw mats. This horse will nose his way to the finish. The dry wax protects the deep scratch. He picked up the dice for a second roll. These coins will be needed to pay his debt. The nag pulled the frail cart along. Twist the valve and release hot steam. The vamp of the shoe had a gold buckle. The smell of burned rags itches my nose. New pants lack cuffs and pockets. The marsh will freeze when cold enough. They slice the sausage thin with a knife. The bloom of the rose lasts a few days. A gray mare walked before the colt. Breakfast buns are fine with a hot drink. Bottles hold four kinds of rum. The man wore a feather in his felt hat. He wheeled the bike past the winding road. Drop the ashes on the worn old rug. The desk and both chairs were painted tan. Throw out the used paper cup and plate. A clean neck means a neat collar. The couch cover and hall drapes were blue. The stems of the tall glasses cracked and broke. The wall phone rang loud and often. The clothes dried on a thin wooden rack. Turn out the lantern which gives us light. The cleat sank deeply into the soft turf. The bills were mailed promptly on the tenth of the month. To have is better than to wait and hope. The price is fair for a good antique clock. The music played on while they talked. Dispense with a vest on a day like this. The bunch of grapes was pressed into wine. He sent the figs, but kept the ripe cherries. The hinge on the door creaked with old age. The screen before the fire kept in the sparks. Fly by night and you waste little time. Thick glasses helped him read the print. Birth and death marks the limits of life. The chair looked strong but had no bottom. The kite flew wildly in the high wind. A fur muff is stylish once more. The tin box held priceless stones. We need an end of all such matter. The case was puzzling to the old and wise. The bright lanterns were gay on the dark lawn. We don't get much money but we have fun. 
The youth drove with zest, but little skill. Five years he lived with a shaggy dog. A fence cuts through the corner lot. The way to save money is not to spend much. Shut the hatch before the waves push it in. The odor of spring makes young hearts jump. Crack the walnut with your sharp side teeth. He offered proof in the form of a large chart. Send the stuff in a thick paper bag. A quart of milk is water for the most part. They told wild tales to frighten him. The three story house was built of stone. In the rear of the ground floor was a large passage. A man in a blue sweater sat at the desk. Oats are a food eaten by horse and man. Their eyelids droop for want of sleep. A sip of tea revives his tired friend. There are many ways to do these things. Tuck the sheet under the edge of the mat. A force equal to that would move the earth. We like to see clear weather. The work of the tailor is seen on each side. Take a chance and win a china doll. Shake the dust from your shoes, stranger. She was kind to sick old people. The square wooden crate was packed to be shipped. The dusty bench stood by the stone wall. We dress to suit the weather of most days. Smile when you say nasty words. A bowl of rice is free with chicken stew. The water in this well is a source of good health. Take shelter in this tent, but keep still. That guy is the writer of a few banned books. The little tales they tell are false. The door was barred, locked, and bolted as well. Ripe pears are fit for a queen's table. A big wet stain was on the round carpet. The kite dipped and swayed, but stayed aloft. The pleasant hours fly by much too soon. The room was crowded with a wild mob. This strong arm shall shield your honor. She blushed when he gave her a white orchid. The beetle droned in the hot June sun. Press the pedal with your left foot. Neat plans fail without luck. The black trunk fell from the landing. The bank pressed for payment of the debt. The theft of the pearl pin was kept secret. 
Shake hands with this friendly child. The vast space stretched into the far distance. A rich farm is rare in this sandy waste. His wide grin earned many friends. Flax makes a fine brand of paper. Hurdle the pit with the aid of a long pole. A strong bid may scare your partner stiff. Even a just cause needs power to win. Peep under the tent and see the clowns. The leaf drifts along with a slow spin. Cheap clothes are flashy but don't last. A thing of small note can cause despair. Flood the mails with requests for this book. A thick coat of black paint covered all. The pencil was cut to be sharp at both ends. Those last words were a strong statement. He wrote his name boldly at the top of the sheet. Dill pickles are sour but taste fine. Down that road is the way to the grain farmer. Either mud or dust are found at all times. The best method is to fix it in place with clips. If you mumble your speech will be lost. At night the alarm roused him from a deep sleep. Read just what the meter says. Fill your pack with bright trinkets for the poor. The small red neon lamp went out. Clams are small, round, soft, and tasty. The fan whirled its round blades softly. The line where the edges join was clean. Breathe deep and smell the piny air. It matters not if he reads these words or those. A brown leather bag hung from its strap. A toad and a frog are hard to tell apart. A white silk jacket goes with any shoes. A break in the dam almost caused a flood. Paint the sockets in the wall dull green. The child crawled into the dense grass. Bribes fail where honest men work. Trample the spark, else the flames will spread. The hilt of the sword was carved with fine designs. A round hole was drilled through the thin board. Footprints showed the path he took up the beach. She was waiting at my front lawn. A vent near the edge brought in fresh air. Prod the old mule with a crooked stick. It is a band of steel three inches wide. The pipe ran almost the length of the ditch. 
It was hidden from sight by a mass of leaves and shrubs. The weight of the package was seen on the high scale. Wake and rise, and step into the green outdoors. The green light in the brown box flickered. The brass tube circled the high wall. The lobes of her ears were pierced to hold rings. Hold the hammer near the end to drive the nail. Next Sunday is the twelfth of the month. Every word and phrase he speaks is true. He put his last cartridge into the gun and fired. They took their kids from the public school. Drive the screw straight into the wood. Keep the hatch tight and the watch constant. Sever the twine with a quick snip of the knife. Paper will dry out when wet. Slide the catch back and open the desk. Help the weak to preserve their strength. A sullen smile gets few friends. Stop whistling and watch the boys march. Jerk the cord, and out tumbles the gold. Slide the tray across the glass top. The cloud moved in a stately way and was gone. Light maple makes for a swell room. Set the piece here and say nothing. Dull stories make her laugh. A stiff cord will do to fasten your shoe. Get the trust fund to the bank early. Choose between the high road and the low. A plea for funds seems to come again. He lent his coat to the tall gaunt stranger. There is a strong chance it will happen once more. The duke left the park in a silver coach. Greet the new guests and leave quickly. When the frost has come it is time for turkey. Sweet words work better than fierce. A thin stripe runs down the middle. A six comes up more often than a ten. Lush ferns grow on the lofty rocks. The ram scared the school children off. The team with the best timing looks good. The farmer swapped his horse for a brown ox. Sit on the perch and tell the others what to do. A steep trail is painful for our feet. The early phase of life moves fast. Green moss grows on the northern side. Tea in thin china has a sweet taste. Pitch the straw through the door of the stable. 
The latch on the back gate needed a nail. The goose was brought straight from the old market. The sink is the thing in which we pile dishes. A whiff of it will cure the most stubborn cold. The facts don't always show who is right. She flaps her cape as she parades the street. The loss of the cruiser was a blow to the fleet. Loop the braid to the left and then over. Plead with the lawyer to drop the lost cause. Calves thrive on tender spring grass. Post no bills on this office wall. Tear a thin sheet from the yellow pad. A cruise in warm waters in a sleek yacht is fun. A streak of color ran down the left edge. It was done before the boy could see it. Crouch before you jump or miss the mark. Pack the kits and don't forget the salt. The square peg will settle in the round hole. Fine soap saves tender skin. Poached eggs and tea must suffice. Bad nerves are jangled by a door slam. Ship maps are different from those for planes. Dimes showered down from all sides. They sang the same tunes at each party. The sky in the west is tinged with orange red. The pods of peas ferment in bare fields. The horse balked and threw the tall rider. The hitch between the horse and cart broke. Pile the coal high in the shed corner. A gold vase is both rare and costly. The knife was hung inside its bright sheath. The rarest spice comes from the far East. The roof should be tilted at a sharp slant. A smatter of French is worse than none. The mule trod the treadmill day and night. The aim of the contest is to raise a great fund. To send it now in large amounts is bad. There is a fine hard tang in salty air. Cod is the main business of the north shore. The slab was hewn from heavy blocks of slate. Dunk the stale biscuits into strong drink. Hang tinsel from both branches. Cap the jar with a tight brass cover. The poor boy missed the boat again. Be sure to set that lamp firmly in the hole. Pick a card and slip it under the pack. A round mat will cover the dull spot. 
The first part of the plan needs changing. A good book informs of what we ought to know. The mail comes in three batches per day. You cannot brew tea in a cold pot. Dots of light betrayed the black cat. Put the chart on the mantel and tack it down. The night shift men rate extra pay. The red paper brightened the dim stage. See the player scoot to third base. Slide the bill between the two leaves. Many hands help get the job done. We don't like to admit our small faults. No doubt about the way the wind blows. Dig deep in the earth for pirate's gold. The steady drip is worse than a drenching rain. A flat pack takes less luggage space. Green ice frosted the punch bowl. A stuffed chair slipped from the moving van. The stitch will serve but needs to be shortened. A thin book fits in the side pocket. The gloss on top made it unfit to read. The hail pattered on the burnt brown grass. Seven seals were stamped on great sheets. Our troops are set to strike heavy blows. The store was jammed before the sale could start. It was a bad error on the part of the new judge. One step more and the board will collapse. Take the match and strike it against your shoe. The pot boiled but the contents failed to jell. The baby puts his right foot in his mouth. The bombs left most of the town in ruins. Stop and stare at the hard working man. The streets are narrow and full of sharp turns. The pup jerked the leash as he saw a feline shape. Open your book to the first page. Fish evade the net and swim off. Dip the pail once and let it settle. Will you please answer that phone. The big red apple fell to the ground. The curtain rose and the show was on. The young prince became heir to the throne. He sent the boy on a short errand. Leave now and you will arrive on time. The corner store was robbed last night. A gold ring will please most any girl. The long journey home took a year. She saw a cat in the neighbor's house. A pink shell was found on the sandy beach. Small children came to see him. 
The grass and bushes were wet with dew. The blind man counted his old coins. A severe storm tore down the barn. She called his name many times. When you hear the bell, come quickly.
================================================
FILE: data-raw/samples.R
================================================
library(rvest) # provides read_html(), html_elements(), html_text() and re-exports the pipe

words <- rcorpora::corpora("words/common")$commonWords
fruit <- rcorpora::corpora("foods/fruits")$fruits

html <- read_html("https://harvardsentences.com")
html %>%
  html_elements("li") %>%
  html_text() %>%
  iconv(to = "ASCII//translit") %>%
  writeLines("data-raw/harvard-sentences.txt")
sentences <- readr::read_lines("data-raw/harvard-sentences.txt")

usethis::use_data(words, overwrite = TRUE)
usethis::use_data(fruit, overwrite = TRUE)
usethis::use_data(sentences, overwrite = TRUE)
================================================
FILE: inst/htmlwidgets/lib/str_view.css
================================================
.str_view ul {
  font-size: 16px;
}

.str_view ul,
.str_view li {
  list-style: none;
  padding: 0;
  margin: 0.5em 0;
}

.str_view .match {
  border: 1px solid #ccc;
  background-color: #eee;
  border-color: #ccc;
  border-radius: 3px;
}

.str_view .special {
  background-color: red;
}
================================================
FILE: inst/htmlwidgets/str_view.js
================================================
HTMLWidgets.widget({
  name: 'str_view',
  type: 'output',

  initialize: function(el, width, height) {
  },

  renderValue: function(el, x, instance) {
    el.innerHTML = x.html;
  },

  resize: function(el, width, height, instance) {
  }
});
================================================
FILE: inst/htmlwidgets/str_view.yaml
================================================
dependencies:
  - name: str_view
    version: 0.1.0
    src: htmlwidgets/lib/
    stylesheet: str_view.css
================================================
FILE: man/case.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/case.R
\name{case}
\alias{case}
\alias{str_to_upper}
\alias{str_to_lower}
\alias{str_to_title}
\alias{str_to_sentence}
\title{Convert string to upper case, lower case, title case, or sentence case}
\usage{
str_to_upper(string, locale = "en")

str_to_lower(string, locale = "en")

str_to_title(string, locale = "en")

str_to_sentence(string, locale = "en")
}
\arguments{
\item{string}{Input vector. Either a character vector, or something
coercible to one.}

\item{locale}{Locale to use for comparisons. See
\code{\link[stringi:stri_locale_list]{stringi::stri_locale_list()}} for all possible options.
Defaults to "en" (English) to ensure that default behaviour is consistent
across platforms.}
}
\value{
A character vector the same length as \code{string}.
}
\description{
\itemize{
\item \code{str_to_upper()} converts to upper case.
\item \code{str_to_lower()} converts to lower case.
\item \code{str_to_title()} converts to title case, where only the first letter of
each word is capitalized.
\item \code{str_to_sentence()} converts to sentence case, where only the first
letter of the sentence is capitalized.
}
}
\examples{
dog <- "The quick brown dog"
str_to_upper(dog)
str_to_lower(dog)
str_to_title(dog)
str_to_sentence("the quick brown dog")

# Locale matters!
str_to_upper("i") # English
str_to_upper("i", "tr") # Turkish
}
================================================
FILE: man/invert_match.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/locate.R
\name{invert_match}
\alias{invert_match}
\title{Switch location of matches to location of non-matches}
\usage{
invert_match(loc)
}
\arguments{
\item{loc}{matrix of match locations, as from \code{\link[=str_locate_all]{str_locate_all()}}}
}
\value{
A numeric matrix giving the locations of the non-matches
}
\description{
Invert a matrix of match locations to match the opposite of what was
previously matched.
}
\examples{
numbers <- "1 and 2 and 4 and 456"
num_loc <- str_locate_all(numbers, "[0-9]+")[[1]]
str_sub(numbers, num_loc[, "start"], num_loc[, "end"])

text_loc <- invert_match(num_loc)
str_sub(numbers, text_loc[, "start"], text_loc[, "end"])
}
================================================
FILE: man/modifiers.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/modifiers.R
\name{modifiers}
\alias{modifiers}
\alias{fixed}
\alias{coll}
\alias{regex}
\alias{boundary}
\title{Control matching behaviour with modifier functions}
\usage{
fixed(pattern, ignore_case = FALSE)

coll(pattern, ignore_case = FALSE, locale = "en", ...)

regex(
  pattern,
  ignore_case = FALSE,
  multiline = FALSE,
  comments = FALSE,
  dotall = FALSE,
  ...
)

boundary(
  type = c("character", "line_break", "sentence", "word"),
  skip_word_none = NA,
  ...
)
}
\arguments{
\item{pattern}{Pattern to modify behaviour.}

\item{ignore_case}{Should case differences be ignored in the match?
For \code{fixed()}, this uses a simple algorithm which assumes a
one-to-one mapping between upper and lower case letters.}

\item{locale}{Locale to use for comparisons. See
\code{\link[stringi:stri_locale_list]{stringi::stri_locale_list()}} for all possible options.
Defaults to "en" (English) to ensure that default behaviour is consistent
across platforms.}

\item{...}{Other less frequently used arguments passed on to
\code{\link[stringi:stri_opts_collator]{stringi::stri_opts_collator()}},
\code{\link[stringi:stri_opts_regex]{stringi::stri_opts_regex()}}, or
\code{\link[stringi:stri_opts_brkiter]{stringi::stri_opts_brkiter()}}}

\item{multiline}{If \code{TRUE}, \code{$} and \code{^} match the beginning and end of
each line. If \code{FALSE}, the default, only match the start and end of the
input.}

\item{comments}{If \code{TRUE}, white space and comments beginning with \verb{#} are
ignored.
Escape literal spaces with \verb{\\\\ }.}

\item{dotall}{If \code{TRUE}, \code{.} will also match line terminators.}

\item{type}{Boundary type to detect.
\describe{
\item{\code{character}}{Every character is a boundary.}
\item{\code{line_break}}{Boundaries are places where it is acceptable to have
a line break in the current locale.}
\item{\code{sentence}}{The beginnings and ends of sentences are boundaries,
using intelligent rules to avoid counting abbreviations
(\href{https://www.unicode.org/reports/tr29/#Sentence_Boundaries}{details}).}
\item{\code{word}}{The beginnings and ends of words are boundaries.}
}}

\item{skip_word_none}{Ignore "words" that don't contain any characters or
numbers - i.e. punctuation. Default \code{NA} will skip such "words" only when
splitting on \code{word} boundaries.}
}
\value{
A stringr modifier object, i.e. a character vector with parent S3 class
\code{stringr_pattern}.
}
\description{
Modifier functions control the meaning of the \code{pattern} argument to
stringr functions:
\itemize{
\item \code{boundary()}: Match boundaries between things.
\item \code{coll()}: Compare strings using standard Unicode collation rules.
\item \code{fixed()}: Compare literal bytes.
\item \code{regex()} (the default): Uses ICU regular expressions.
}
}
\examples{
pattern <- "a.b"
strings <- c("abb", "a.b")
str_detect(strings, pattern)
str_detect(strings, fixed(pattern))
str_detect(strings, coll(pattern))

# coll() is useful for locale-aware case-insensitive matching
i <- c("I", "\u0130", "i")
i
str_detect(i, fixed("i", TRUE))
str_detect(i, coll("i", TRUE))
str_detect(i, coll("i", TRUE, locale = "tr"))

# Word boundaries
words <- c("These are some words.")
str_count(words, boundary("word"))
str_split(words, " ")[[1]]
str_split(words, boundary("word"))[[1]]

# Regular expression variations
str_extract_all("The Cat in the Hat", "[a-z]+")
str_extract_all("The Cat in the Hat", regex("[a-z]+", TRUE))
str_extract_all("a\nb\nc", "^.")
str_extract_all("a\nb\nc", regex("^.", multiline = TRUE))
str_extract_all("a\nb\nc", "a.")
str_extract_all("a\nb\nc", regex("a.", dotall = TRUE))
}
================================================
FILE: man/pipe.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/utils.R
\name{\%>\%}
\alias{\%>\%}
\title{Pipe operator}
\usage{
lhs \%>\% rhs
}
\description{
Pipe operator
}
\keyword{internal}
================================================
FILE: man/str_c.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/c.R
\name{str_c}
\alias{str_c}
\title{Join multiple strings into one string}
\usage{
str_c(..., sep = "", collapse = NULL)
}
\arguments{
\item{...}{One or more character vectors.
\code{NULL}s are removed; scalar inputs (vectors of length 1) are recycled to
the common length of vector inputs.

Like most other R functions, missing values are "infectious": whenever a
missing value is combined with another string the result will always be
missing.
Use \code{\link[dplyr:coalesce]{dplyr::coalesce()}} or
\code{\link[=str_replace_na]{str_replace_na()}} to convert to the desired
value.}

\item{sep}{String to insert between input vectors.}

\item{collapse}{Optional string used to combine the output into a single
string. Generally better to use \code{\link[=str_flatten]{str_flatten()}} if
you need this behaviour.}
}
\value{
If \code{collapse = NULL} (the default) a character vector with length equal
to the longest input. If \code{collapse} is a string, a character vector of
length 1.
}
\description{
\code{str_c()} combines multiple character vectors into a single character
vector. It's very similar to \code{\link[=paste0]{paste0()}} but uses
tidyverse recycling and \code{NA} rules.

One way to understand how \code{str_c()} works is to picture a 2d matrix of
strings, where each argument forms a column. \code{sep} is inserted between
each column, and then each row is combined together into a single string.
If \code{collapse} is set, it's inserted between each row, and then the
result is again combined, this time into a single string.
}
\examples{
str_c("Letter: ", letters)
str_c("Letter", letters, sep = ": ")
str_c(letters, " is for", "...")
str_c(letters[-26], " comes before ", letters[-1])

str_c(letters, collapse = "")
str_c(letters, collapse = ", ")

# Differences from paste() ----------------------
# Missing inputs give missing outputs
str_c(c("a", NA, "b"), "-d")
paste0(c("a", NA, "b"), "-d")
# Use str_replace_na() to display literal NAs:
str_c(str_replace_na(c("a", NA, "b")), "-d")

# Uses tidyverse recycling rules
\dontrun{str_c(1:2, 1:3)} # errors
paste0(1:2, 1:3)

str_c("x", character())
paste0("x", character())
}
================================================
FILE: man/str_conv.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/conv.R
\name{str_conv}
\alias{str_conv}
\title{Specify the encoding of a string}
\usage{
str_conv(string, encoding)
}
\arguments{
\item{string}{Input vector. Either a character vector, or something
coercible to one.}

\item{encoding}{Name of encoding. See \code{\link[stringi:stri_enc_list]{stringi::stri_enc_list()}}
for a complete list.}
}
\description{
This is a convenient way to override the current encoding of a string.
}
\examples{
# Example from encoding?stringi::stringi
x <- rawToChar(as.raw(177))
x
str_conv(x, "ISO-8859-2") # Polish "a with ogonek"
str_conv(x, "ISO-8859-1") # Plus-minus
}
================================================
FILE: man/str_count.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/count.R
\name{str_count}
\alias{str_count}
\title{Count number of matches}
\usage{
str_count(string, pattern = "")
}
\arguments{
\item{string}{Input vector. Either a character vector, or something
coercible to one.}

\item{pattern}{Pattern to look for.

The default interpretation is a regular expression, as described in
\code{vignette("regular-expressions")}.
Use \code{\link[=regex]{regex()}} for finer control of the matching
behaviour.

Match a fixed string (i.e. by comparing only bytes), using
\code{\link[=fixed]{fixed()}}. This is fast, but approximate. Generally,
for matching human text, you'll want \code{\link[=coll]{coll()}} which
respects character matching rules for the specified locale.

Match character, word, line and sentence boundaries with
\code{\link[=boundary]{boundary()}}. The empty string, \verb{""}, is
equivalent to \code{boundary("character")}.}
}
\value{
An integer vector the same length as \code{string}/\code{pattern}.
}
\description{
Counts the number of times \code{pattern} is found within each element of
\code{string}.
}
\examples{
fruit <- c("apple", "banana", "pear", "pineapple")
str_count(fruit, "a")
str_count(fruit, "p")
str_count(fruit, "e")
str_count(fruit, c("a", "b", "p", "p"))

str_count(c("a.", "...", ".a.a"), ".")
str_count(c("a.", "...", ".a.a"), fixed("."))
}
\seealso{
\code{\link[stringi:stri_count]{stringi::stri_count()}} which this function wraps.

\code{\link[=str_locate]{str_locate()}}/\code{\link[=str_locate_all]{str_locate_all()}}
to locate the position of matches
}
================================================
FILE: man/str_detect.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/detect.R
\name{str_detect}
\alias{str_detect}
\title{Detect the presence/absence of a match}
\usage{
str_detect(string, pattern, negate = FALSE)
}
\arguments{
\item{string}{Input vector. Either a character vector, or something
coercible to one.}

\item{pattern}{Pattern to look for.

The default interpretation is a regular expression, as described in
\code{vignette("regular-expressions")}. Use \code{\link[=regex]{regex()}}
for finer control of the matching behaviour.

Match a fixed string (i.e. by comparing only bytes), using
\code{\link[=fixed]{fixed()}}. This is fast, but approximate.
Generally, for matching human text, you'll want
\code{\link[=coll]{coll()}} which respects character matching rules for
the specified locale.

You cannot match boundaries, including \code{""}, with this function.}

\item{negate}{If \code{TRUE}, inverts the resulting boolean vector.}
}
\value{
A logical vector the same length as \code{string}/\code{pattern}.
}
\description{
\code{str_detect()} returns a logical vector with \code{TRUE} for each element of
\code{string} that matches \code{pattern} and \code{FALSE} otherwise. It's equivalent
to \code{grepl(pattern, string)}.
}
\examples{
fruit <- c("apple", "banana", "pear", "pineapple")
str_detect(fruit, "a")
str_detect(fruit, "^a")
str_detect(fruit, "a$")
str_detect(fruit, "b")
str_detect(fruit, "[aeiou]")

# Also vectorised over pattern
str_detect("aecfg", letters)

# Returns TRUE if the pattern does NOT match
str_detect(fruit, "^p", negate = TRUE)
}
\seealso{
\code{\link[stringi:stri_detect]{stringi::stri_detect()}} which this function wraps,
\code{\link[=str_subset]{str_subset()}} for a convenient wrapper around
\code{x[str_detect(x, pattern)]}
}
================================================
FILE: man/str_dup.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/dup.R
\name{str_dup}
\alias{str_dup}
\title{Duplicate a string}
\usage{
str_dup(string, times, sep = NULL)
}
\arguments{
\item{string}{Input vector. Either a character vector, or something
coercible to one.}

\item{times}{Number of times to duplicate each string.}

\item{sep}{String to insert between each duplicate.}
}
\value{
A character vector the same length as \code{string}/\code{times}.
}
\description{
\code{str_dup()} duplicates the characters within a string, e.g.
\code{str_dup("xy", 3)} returns \code{"xyxyxy"}.
}
\examples{
fruit <- c("apple", "pear", "banana")
str_dup(fruit, 2)
str_dup(fruit, 2, sep = " ")
str_dup(fruit, 1:3)
str_c("ba", str_dup("na", 0:5))
}
================================================
FILE: man/str_equal.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/equal.R
\name{str_equal}
\alias{str_equal}
\title{Determine if two strings are equivalent}
\usage{
str_equal(x, y, locale = "en", ignore_case = FALSE, ...)
}
\arguments{
\item{x, y}{A pair of character vectors.}

\item{locale}{Locale to use for comparisons. See
\code{\link[stringi:stri_locale_list]{stringi::stri_locale_list()}} for all possible options.
Defaults to "en" (English) to ensure that default behaviour is consistent
across platforms.}

\item{ignore_case}{Ignore case when comparing strings?}

\item{...}{Other options used to control collation. Passed on to
\code{\link[stringi:stri_opts_collator]{stringi::stri_opts_collator()}}.}
}
\value{
A logical vector the same length as \code{x}/\code{y}.
}
\description{
This uses Unicode canonicalisation rules, and optionally ignores case.
}
\examples{
# These two strings encode "a" with an accent in two different ways
a1 <- "\u00e1"
a2 <- "a\u0301"
c(a1, a2)
a1 == a2
str_equal(a1, a2)

# ohm and omega use different code points but should always be treated
# as equal
ohm <- "\u2126"
omega <- "\u03A9"
c(ohm, omega)
ohm == omega
str_equal(ohm, omega)
}
\seealso{
\code{\link[stringi:stri_compare]{stringi::stri_cmp_equiv()}} for the underlying
implementation.
}
================================================
FILE: man/str_escape.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/escape.R
\name{str_escape}
\alias{str_escape}
\title{Escape regular expression metacharacters}
\usage{
str_escape(string)
}
\arguments{
\item{string}{Input vector.
Either a character vector, or something coercible to one.}
}
\value{
A character vector the same length as \code{string}.
}
\description{
This function escapes metacharacters, the characters that have special
meaning to the regular expression engine. In most cases you are better off
using \code{\link[=fixed]{fixed()}} since it is faster, but \code{str_escape()} is
useful if you are composing user-provided strings into a pattern.
}
\examples{
str_detect(c("a", "."), ".")
str_detect(c("a", "."), str_escape("."))
}
================================================
FILE: man/str_extract.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/extract.R
\name{str_extract}
\alias{str_extract}
\alias{str_extract_all}
\title{Extract the complete match}
\usage{
str_extract(string, pattern, group = NULL)

str_extract_all(string, pattern, simplify = FALSE)
}
\arguments{
\item{string}{Input vector. Either a character vector, or something
coercible to one.}

\item{pattern}{Pattern to look for.

The default interpretation is a regular expression, as described in
\code{vignette("regular-expressions")}. Use \code{\link[=regex]{regex()}}
for finer control of the matching behaviour.

Match a fixed string (i.e. by comparing only bytes), using
\code{\link[=fixed]{fixed()}}. This is fast, but approximate. Generally,
for matching human text, you'll want \code{\link[=coll]{coll()}} which
respects character matching rules for the specified locale.

Match character, word, line and sentence boundaries with
\code{\link[=boundary]{boundary()}}. The empty string, \verb{""}, is
equivalent to \code{boundary("character")}.}

\item{group}{If supplied, instead of returning the complete match, will
return the matched text from the specified capturing group.}

\item{simplify}{A boolean.
\itemize{
\item \code{FALSE} (the default): returns a list of character vectors.
\item \code{TRUE}: returns a character matrix.
}}
}
\value{
\itemize{
\item \code{str_extract()}: a character vector the same length as
\code{string}/\code{pattern}.
\item \code{str_extract_all()}: a list of character vectors the same length as
\code{string}/\code{pattern}.
}
}
\description{
\code{str_extract()} extracts the first complete match from each string,
\code{str_extract_all()} extracts all matches from each string.
}
\examples{
shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2")
str_extract(shopping_list, "\\\\d")
str_extract(shopping_list, "[a-z]+")
str_extract(shopping_list, "[a-z]{1,4}")
str_extract(shopping_list, "\\\\b[a-z]{1,4}\\\\b")

str_extract(shopping_list, "([a-z]+) of ([a-z]+)")
str_extract(shopping_list, "([a-z]+) of ([a-z]+)", group = 1)
str_extract(shopping_list, "([a-z]+) of ([a-z]+)", group = 2)

# Extract all matches
str_extract_all(shopping_list, "[a-z]+")
str_extract_all(shopping_list, "\\\\b[a-z]+\\\\b")
str_extract_all(shopping_list, "\\\\d")

# Simplify results into character matrix
str_extract_all(shopping_list, "\\\\b[a-z]+\\\\b", simplify = TRUE)
str_extract_all(shopping_list, "\\\\d", simplify = TRUE)

# Extract all words
str_extract_all("This is, surprisingly, a sentence.", boundary("word"))
}
\seealso{
\code{\link[=str_match]{str_match()}} to extract matched groups;
\code{\link[stringi:stri_extract]{stringi::stri_extract()}} for the underlying
implementation.
}
================================================
FILE: man/str_flatten.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/flatten.R
\name{str_flatten}
\alias{str_flatten}
\alias{str_flatten_comma}
\title{Flatten a string}
\usage{
str_flatten(string, collapse = "", last = NULL, na.rm = FALSE)

str_flatten_comma(string, last = NULL, na.rm = FALSE)
}
\arguments{
\item{string}{Input vector. Either a character vector, or something
coercible to one.}

\item{collapse}{String to insert between each piece.
Defaults to \code{""}.}

\item{last}{Optional string to use in place of the final separator.}

\item{na.rm}{Remove missing values? If \code{FALSE} (the default), the result
will be \code{NA} if any element of \code{string} is \code{NA}.}
}
\value{
A string, i.e. a character vector of length 1.
}
\description{
\code{str_flatten()} reduces a character vector to a single string. This is a
summary function because regardless of the length of the input \code{string},
it always returns a single string.

\code{str_flatten_comma()} is a variation designed specifically for flattening
with commas. It automatically recognises if \code{last} uses the Oxford comma
and handles the special case of 2 elements.
}
\examples{
str_flatten(letters)
str_flatten(letters, "-")

str_flatten(letters[1:3], ", ")

# Use last to customise the last component
str_flatten(letters[1:3], ", ", " and ")

# this almost works if you want an Oxford (aka serial) comma
str_flatten(letters[1:3], ", ", ", and ")
# but it will always add a comma, even when not necessary
str_flatten(letters[1:2], ", ", ", and ")

# str_flatten_comma knows how to handle the Oxford comma
str_flatten_comma(letters[1:3], ", and ")
str_flatten_comma(letters[1:2], ", and ")
}
================================================
FILE: man/str_glue.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/glue.R
\name{str_glue}
\alias{str_glue}
\alias{str_glue_data}
\title{Interpolation with glue}
\usage{
str_glue(..., .sep = "", .envir = parent.frame(), .trim = TRUE)

str_glue_data(.x, ..., .sep = "", .envir = parent.frame(), .na = "NA")
}
\arguments{
\item{...}{[\code{expressions}]\cr Unnamed arguments are taken to be expression
string(s) to format. Multiple inputs are concatenated together before
formatting. Named arguments are taken to be temporary variables available
for substitution.
For \code{glue_data()}, elements in \code{...} override the values in \code{.x}.} \item{.sep}{[\code{character(1)}: \sQuote{""}]\cr Separator used to separate elements.} \item{.envir}{[\code{environment}: \code{parent.frame()}]\cr Environment to evaluate each expression in. Expressions are evaluated from left to right. If \code{.x} is an environment, the expressions are evaluated in that environment and \code{.envir} is ignored. If \code{NULL} is passed, it is equivalent to \code{\link[=emptyenv]{emptyenv()}}.} \item{.trim}{[\code{logical(1)}: \sQuote{TRUE}]\cr Whether to trim the input template with \code{\link[glue:trim]{trim()}} or not.} \item{.x}{[\code{listish}]\cr An environment, list, or data frame used to lookup values.} \item{.na}{[\code{character(1)}: \sQuote{NA}]\cr Value to replace \code{NA} values with. If \code{NULL} missing values are propagated, that is an \code{NA} result will cause \code{NA} output. Otherwise the value is replaced by the value of \code{.na}.} } \value{ A character vector with same length as the longest input. } \description{ These functions are wrappers around \code{\link[glue:glue]{glue::glue()}} and \code{\link[glue:glue]{glue::glue_data()}}, which provide a powerful and elegant syntax for interpolating strings with \code{{}}. These wrappers provide a small set of the full options. Use \code{glue()} and \code{glue_data()} directly from glue for more control. } \examples{ name <- "Fred" age <- 50 anniversary <- as.Date("1991-10-12") str_glue( "My name is {name}, ", "my age next year is {age + 1}, ", "and my anniversary is {format(anniversary, '\%A, \%B \%d, \%Y')}." 
) # single braces can be inserted by doubling them str_glue("My name is {name}, not {{name}}.") # You can also use named arguments str_glue( "My name is {name}, ", "and my age next year is {age + 1}.", name = "Joe", age = 40 ) # `str_glue_data()` is useful in data pipelines mtcars \%>\% str_glue_data("{rownames(.)} has {hp} hp") } ================================================ FILE: man/str_interp.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/interp.R \name{str_interp} \alias{str_interp} \title{String interpolation} \usage{ str_interp(string, env = parent.frame()) } \arguments{ \item{string}{A template character string. This function is not vectorised: a character vector will be collapsed into a single string.} \item{env}{The environment in which to evaluate the expressions.} } \value{ An interpolated character string. } \description{ \ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#superseded}{\figure{lifecycle-superseded.svg}{options: alt='[Superseded]'}}}{\strong{[Superseded]}} \code{str_interp()} is superseded in favour of \code{\link[=str_glue]{str_glue()}}. String interpolation is a useful way of specifying a character string which depends on values in a certain environment. It allows for string creation which is easier to read and write when compared to using e.g. \code{\link[=paste]{paste()}} or \code{\link[=sprintf]{sprintf()}}. The (template) string can include expression placeholders of the form \verb{$\{expression\}} or \verb{$[format]\{expression\}}, where expressions are valid R expressions that can be evaluated in the given environment, and \code{format} is a format specification valid for use with \code{\link[=sprintf]{sprintf()}}.
} \examples{ # Using values from the environment, and some formats user_name <- "smbache" amount <- 6.656 account <- 1337 str_interp("User ${user_name} (account $[08d]{account}) has $$[.2f]{amount}.") # Nested brace pairs work inside expressions too, and any braces can be # placed outside the expressions. str_interp("Works with } nested { braces too: $[.2f]{{{2 + 2}*{amount}}}") # Values can also come from a list str_interp( "One value, ${value1}, and then another, ${value2*2}.", list(value1 = 10, value2 = 20) ) # Or a data frame str_interp( "Values are $[.2f]{max(Sepal.Width)} and $[.2f]{min(Sepal.Width)}.", iris ) # Use a vector when the string is long: max_char <- 80 str_interp(c( "This particular line is so long that it is hard to write ", "without breaking the ${max_char}-char barrier!" )) } \seealso{ \code{\link[=str_glue]{str_glue()}} and \code{\link[=str_glue_data]{str_glue_data()}} for alternative approaches to the same problem. } \author{ Stefan Milton Bache } \keyword{internal} ================================================ FILE: man/str_length.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/length.R \name{str_length} \alias{str_length} \alias{str_width} \title{Compute the length/width} \usage{ str_length(string) str_width(string) } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} } \value{ A numeric vector the same length as \code{string}. } \description{ \code{str_length()} returns the number of codepoints in a string. These are the individual elements (which are often, but not always, letters) that can be extracted with \code{\link[=str_sub]{str_sub()}}. \code{str_width()} returns how much space the string will occupy when printed in a fixed width font (i.e. when printed in the console).
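Base R's \code{nchar()} draws the same codepoint-versus-column distinction, which makes the difference easy to check without stringr: \code{type = "chars"} counts codepoints (like \code{str_length()}) and \code{type = "width"} counts display columns (like \code{str_width()}).

```r
# Precomposed u-umlaut vs. u followed by a combining accent: the two
# render the same but contain a different number of codepoints.
u <- c("\u00fc", "u\u0308")
nchar(u, type = "chars")  # 1 2  (codepoints, like str_length())
nchar(u, type = "width")  #      (display columns, like str_width())
```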
} \examples{ str_length(letters) str_length(NA) str_length(factor("abc")) str_length(c("i", "like", "programming", NA)) # Some characters, like emoji and Chinese characters (hanzi), are square # which means they take up the width of two Latin characters x <- c("\u6c49\u5b57", "\U0001f60a") str_view(x) str_width(x) str_length(x) # There are two ways of representing a u with an umlaut u <- c("\u00fc", "u\u0308") # They have the same width str_width(u) # But a different length str_length(u) # Because the second element is made up of a u + an accent str_sub(u, 1, 1) } \seealso{ \code{\link[stringi:stri_length]{stringi::stri_length()}} which this function wraps. } ================================================ FILE: man/str_like.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/detect.R \name{str_like} \alias{str_like} \alias{str_ilike} \title{Detect a pattern in the same way as \code{SQL}'s \code{LIKE} and \code{ILIKE} operators} \usage{ str_like(string, pattern, ignore_case = deprecated()) str_ilike(string, pattern) } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{pattern}{A character vector containing a SQL "like" pattern. See above for details.} \item{ignore_case}{\ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#deprecated}{\figure{lifecycle-deprecated.svg}{options: alt='[Deprecated]'}}}{\strong{[Deprecated]}}} } \value{ A logical vector the same length as \code{string}. } \description{ \code{str_like()} and \code{str_ilike()} follow the conventions of the SQL \code{LIKE} and \code{ILIKE} operators, namely: \itemize{ \item Must match the entire string. \item \verb{_} matches a single character (like \code{.}). \item \verb{\%} matches any number of characters (like \verb{.*}). \item \verb{\\\%} and \verb{\\_} match literal \verb{\%} and \verb{_}.
} The difference between the two functions is their case-sensitivity: \code{str_like()} is case sensitive and \code{str_ilike()} is not. } \note{ Prior to stringr 1.6.0, \code{str_like()} was incorrectly case-insensitive. } \examples{ fruit <- c("apple", "banana", "pear", "pineapple") str_like(fruit, "app") str_like(fruit, "app\%") str_like(fruit, "APP\%") str_like(fruit, "ba_ana") str_like(fruit, "\%apple") str_ilike(fruit, "app") str_ilike(fruit, "app\%") str_ilike(fruit, "APP\%") str_ilike(fruit, "ba_ana") str_ilike(fruit, "\%apple") } ================================================ FILE: man/str_locate.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/locate.R \name{str_locate} \alias{str_locate} \alias{str_locate_all} \title{Find location of match} \usage{ str_locate(string, pattern) str_locate_all(string, pattern) } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{pattern}{Pattern to look for. The default interpretation is a regular expression, as described in \code{vignette("regular-expressions")}. Use \code{\link[=regex]{regex()}} for finer control of the matching behaviour. Match a fixed string (i.e. by comparing only bytes), using \code{\link[=fixed]{fixed()}}. This is fast, but approximate. Generally, for matching human text, you'll want \code{\link[=coll]{coll()}} which respects character matching rules for the specified locale. Match character, word, line and sentence boundaries with \code{\link[=boundary]{boundary()}}. The empty string, \code{""}, is equivalent to \code{boundary("character")}.} } \value{ \itemize{ \item \code{str_locate()} returns an integer matrix with two columns and one row for each element of \code{string}. The first column, \code{start}, gives the position at the start of the match, and the second column, \code{end}, gives the position of the end.
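The LIKE rules described for \code{str_like()} can be sketched in base R by translating the pattern to an anchored regular expression. This is an illustrative helper, not part of stringr, and it deliberately ignores the escaped-wildcard forms: escape regex metacharacters, then map the percent and underscore wildcards to their regex equivalents.

```r
# Hypothetical LIKE -> regex translator: escape metacharacters, map the
# wildcards, and anchor so the pattern must match the entire string.
like_to_regex <- function(pattern) {
  rx <- gsub("([][{}()+*^$|\\\\?.])", "\\\\\\1", pattern)  # escape metacharacters
  rx <- gsub("%", ".*", rx, fixed = TRUE)  # percent: any number of characters
  rx <- gsub("_", ".", rx, fixed = TRUE)   # underscore: a single character
  paste0("^", rx, "$")                     # must match the whole string
}

fruit <- c("apple", "banana", "pear", "pineapple")
grepl(like_to_regex("app%"), fruit)    # TRUE FALSE FALSE FALSE
grepl(like_to_regex("ba_ana"), fruit)  # FALSE TRUE FALSE FALSE
```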
\item \code{str_locate_all()} returns a list of integer matrices with the same length as \code{string}/\code{pattern}. The matrices have columns \code{start} and \code{end} as above, and one row for each match. } } \description{ \code{str_locate()} returns the \code{start} and \code{end} position of the first match; \code{str_locate_all()} returns the \code{start} and \code{end} position of each match. Because the \code{start} and \code{end} values are inclusive, zero-length matches (e.g. \code{$}, \code{^}, \verb{\\\\b}) will have an \code{end} that is smaller than \code{start}. } \examples{ fruit <- c("apple", "banana", "pear", "pineapple") str_locate(fruit, "$") str_locate(fruit, "a") str_locate(fruit, "e") str_locate(fruit, c("a", "b", "p", "p")) str_locate_all(fruit, "a") str_locate_all(fruit, "e") str_locate_all(fruit, c("a", "b", "p", "p")) # Find location of every character str_locate_all(fruit, "") } \seealso{ \code{\link[=str_extract]{str_extract()}} for a convenient way of extracting matches, \code{\link[stringi:stri_locate]{stringi::stri_locate()}} for the underlying implementation. } ================================================ FILE: man/str_match.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/match.R \name{str_match} \alias{str_match} \alias{str_match_all} \title{Extract components (capturing groups) from a match} \usage{ str_match(string, pattern) str_match_all(string, pattern) } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{pattern}{Unlike other stringr functions, \code{str_match()} only supports regular expressions, as described in \code{vignette("regular-expressions")}. The pattern should contain at least one capturing group.} } \value{ \itemize{ \item \code{str_match()}: a character matrix with the same number of rows as the length of \code{string}/\code{pattern}.
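The inclusive start/end convention described above means \code{substr(x, start, end)} recovers the match. Base R's \code{regexpr()} reports a start position plus a match length instead, so converting between the two conventions is a one-liner:

```r
# regexpr() gives start + length; str_locate()-style inclusive positions
# are start and start + length - 1.
x <- "banana"
m <- regexpr("an", x)
start <- as.integer(m)
end <- start + attr(m, "match.length") - 1L  # inclusive end
substr(x, start, end)  # "an"
```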
The first column is the complete match, followed by one column for each capture group. The columns will be named if you used "named capture groups", i.e. \verb{(?<name>pattern)}. \item \code{str_match_all()}: a list of the same length as \code{string}/\code{pattern} containing character matrices. Each matrix has columns as described above and one row for each match. } } \description{ Extract any number of matches defined by unnamed, \code{(pattern)}, and named, \verb{(?<name>pattern)}, capture groups. Use a non-capturing group, \verb{(?:pattern)}, if you need to override default operator precedence but don't want to capture the result. } \examples{ strings <- c(" 219 733 8965", "329-293-8753 ", "banana", "595 794 7569", "387 287 6718", "apple", "233.398.9187 ", "482 952 3315", "239 923 8115 and 842 566 4692", "Work: 579-499-7527", "$1000", "Home: 543.355.3679") phone <- "([2-9][0-9]{2})[- .]([0-9]{3})[- .]([0-9]{4})" str_extract(strings, phone) str_match(strings, phone) # Extract/match all str_extract_all(strings, phone) str_match_all(strings, phone) # You can also name the groups to make further manipulation easier phone <- "(?<area>[2-9][0-9]{2})[- .](?<rest>[0-9]{3}[- .][0-9]{4})" str_match(strings, phone) x <- c("<a> <b>", "<a> <>", "<a>", "", NA) str_match(x, "<(.*?)> <(.*?)>") str_match_all(x, "<(.*?)>") str_extract(x, "<.*?>") str_extract_all(x, "<.*?>") } \seealso{ \code{\link[=str_extract]{str_extract()}} to extract the complete match, \code{\link[stringi:stri_match]{stringi::stri_match()}} for the underlying implementation. } ================================================ FILE: man/str_order.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/sort.R \name{str_order} \alias{str_order} \alias{str_rank} \alias{str_sort} \title{Order, rank, or sort a character vector} \usage{ str_order( x, decreasing = FALSE, na_last = TRUE, locale = "en", numeric = FALSE, ... ) str_rank(x, locale = "en", numeric = FALSE, ...)
str_sort( x, decreasing = FALSE, na_last = TRUE, locale = "en", numeric = FALSE, ... ) } \arguments{ \item{x}{A character vector to sort.} \item{decreasing}{A boolean. If \code{FALSE}, the default, sorts from lowest to highest; if \code{TRUE} sorts from highest to lowest.} \item{na_last}{Where should \code{NA} go? \code{TRUE} at the end, \code{FALSE} at the beginning, \code{NA} dropped.} \item{locale}{Locale to use for comparisons. See \code{\link[stringi:stri_locale_list]{stringi::stri_locale_list()}} for all possible options. Defaults to "en" (English) to ensure that default behaviour is consistent across platforms.} \item{numeric}{If \code{TRUE}, will sort digits numerically, instead of as strings.} \item{...}{Other options used to control collation. Passed on to \code{\link[stringi:stri_opts_collator]{stringi::stri_opts_collator()}}.} } \value{ A character vector the same length as \code{x}. } \description{ \itemize{ \item \code{str_sort()} returns the sorted vector. \item \code{str_order()} returns an integer vector that returns the desired order when used for subsetting, i.e. \code{x[str_order(x)]} is the same as \code{str_sort(x)}. \item \code{str_rank()} returns the ranks of the values, i.e. \code{arrange(df, str_rank(x))} is the same as \code{str_sort(df$x)}. } } \examples{ x <- c("apple", "car", "happy", "char") str_sort(x) str_order(x) x[str_order(x)] str_rank(x) # In Czech, ch is a digraph that sorts after h str_sort(x, locale = "cs") # Use numeric = TRUE to sort numbers in strings x <- c("100a10", "100a5", "2b", "2a") str_sort(x) str_sort(x, numeric = TRUE) } \seealso{ \code{\link[stringi:stri_order]{stringi::stri_order()}} for the underlying implementation.
} ================================================ FILE: man/str_pad.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/pad.R \name{str_pad} \alias{str_pad} \title{Pad a string to minimum width} \usage{ str_pad( string, width, side = c("left", "right", "both"), pad = " ", use_width = TRUE ) } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{width}{Minimum width of padded strings.} \item{side}{Side on which padding character is added (left, right or both).} \item{pad}{Single padding character (default is a space).} \item{use_width}{If \code{FALSE}, use the length of the string instead of the width; see \code{\link[=str_width]{str_width()}}/\code{\link[=str_length]{str_length()}} for the difference.} } \value{ A character vector the same length as \code{string}/\code{width}/\code{pad}. } \description{ Pad a string to a minimum width, so that \code{str_length(str_pad(x, n))} is always greater than or equal to \code{n}. } \examples{ rbind( str_pad("hadley", 30, "left"), str_pad("hadley", 30, "right"), str_pad("hadley", 30, "both") ) # All arguments are vectorised except side str_pad(c("a", "abc", "abcdef"), 10) str_pad("a", c(5, 10, 20)) str_pad("a", 10, pad = c("-", "_", " ")) # Longer strings are returned unchanged str_pad("hadley", 3) } \seealso{ \code{\link[=str_trim]{str_trim()}} to remove whitespace; \code{\link[=str_trunc]{str_trunc()}} to decrease the maximum width of a string. } ================================================ FILE: man/str_remove.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/remove.R \name{str_remove} \alias{str_remove} \alias{str_remove_all} \title{Remove matched patterns} \usage{ str_remove(string, pattern) str_remove_all(string, pattern) } \arguments{ \item{string}{Input vector.
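The common left/right cases of padding to a minimum width can be approximated in base R with \code{formatC()}. This sketch counts characters rather than display width and only pads with spaces, so it is a rough analogue, not a substitute for \code{str_pad()}:

```r
# formatC() right-justifies by default (pad on the left); flag = "-"
# left-justifies (pad on the right). width is a minimum, as in str_pad().
formatC("hadley", width = 10)               # "    hadley"
formatC("hadley", width = 10, flag = "-")   # "hadley    "
```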
Either a character vector, or something coercible to one.} \item{pattern}{Pattern to look for. The default interpretation is a regular expression, as described in \code{vignette("regular-expressions")}. Use \code{\link[=regex]{regex()}} for finer control of the matching behaviour. Match a fixed string (i.e. by comparing only bytes), using \code{\link[=fixed]{fixed()}}. This is fast, but approximate. Generally, for matching human text, you'll want \code{\link[=coll]{coll()}} which respects character matching rules for the specified locale. You can not match boundaries, including \code{""}, with this function.} } \value{ A character vector the same length as \code{string}/\code{pattern}. } \description{ Remove matches, i.e. replace them with \code{""}. } \examples{ fruits <- c("one apple", "two pears", "three bananas") str_remove(fruits, "[aeiou]") str_remove_all(fruits, "[aeiou]") } \seealso{ \code{\link[=str_replace]{str_replace()}} for the underlying implementation. } ================================================ FILE: man/str_replace.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/replace.R \name{str_replace} \alias{str_replace} \alias{str_replace_all} \title{Replace matches with new text} \usage{ str_replace(string, pattern, replacement) str_replace_all(string, pattern, replacement) } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{pattern}{Pattern to look for. The default interpretation is a regular expression, as described in \link[stringi:about_search_regex]{stringi::about_search_regex}. Control options with \code{\link[=regex]{regex()}}. For \code{str_replace_all()} this can also be a named vector (\code{c(pattern1 = replacement1)}), in order to perform multiple replacements in each element of \code{string}. Match a fixed string (i.e. by comparing only bytes), using \code{\link[=fixed]{fixed()}}. 
This is fast, but approximate. Generally, for matching human text, you'll want \code{\link[=coll]{coll()}} which respects character matching rules for the specified locale. You can not match boundaries, including \code{""}, with this function.} \item{replacement}{The replacement value, usually a single string, but it can be a vector the same length as \code{string} or \code{pattern}. References of the form \verb{\\1}, \verb{\\2}, etc will be replaced with the contents of the respective matched group (created by \verb{()}). Alternatively, supply a function (or formula): it will be passed a single character vector and should return a character vector of the same length. To replace the complete string with \code{NA}, use \code{replacement = NA_character_}.} } \value{ A character vector the same length as \code{string}/\code{pattern}/\code{replacement}. } \description{ \code{str_replace()} replaces the first match; \code{str_replace_all()} replaces all matches. } \examples{ fruits <- c("one apple", "two pears", "three bananas") str_replace(fruits, "[aeiou]", "-") str_replace_all(fruits, "[aeiou]", "-") str_replace_all(fruits, "[aeiou]", toupper) str_replace_all(fruits, "b", NA_character_) str_replace(fruits, "([aeiou])", "") str_replace(fruits, "([aeiou])", "\\\\1\\\\1") # Note that str_replace() is vectorised along string, pattern, and replacement str_replace(fruits, "[aeiou]", c("1", "2", "3")) str_replace(fruits, c("a", "e", "i"), "-") # If you want to apply multiple patterns and replacements to the same # string, pass a named vector to pattern. fruits \%>\% str_c(collapse = "---") \%>\% str_replace_all(c("one" = "1", "two" = "2", "three" = "3")) # Use a function for more sophisticated replacement. This example # replaces colour names with their hex values.
colours <- str_c("\\\\b", colors(), "\\\\b", collapse="|") col2hex <- function(col) { rgb <- col2rgb(col) rgb(rgb["red", ], rgb["green", ], rgb["blue", ], maxColorValue = 255) } x <- c( "Roses are red, violets are blue", "My favourite colour is green" ) str_replace_all(x, colours, col2hex) } \seealso{ \code{\link[=str_replace_na]{str_replace_na()}} to turn missing values into "NA"; \code{\link[stringi:stri_replace]{stringi::stri_replace()}} for the underlying implementation. } ================================================ FILE: man/str_replace_na.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/replace.R \name{str_replace_na} \alias{str_replace_na} \title{Turn NA into "NA"} \usage{ str_replace_na(string, replacement = "NA") } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{replacement}{A single string.} } \description{ Turn NA into "NA" } \examples{ str_replace_na(c(NA, "abc", "def")) } ================================================ FILE: man/str_split.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/split.R \name{str_split} \alias{str_split} \alias{str_split_1} \alias{str_split_fixed} \alias{str_split_i} \title{Split up a string into pieces} \usage{ str_split(string, pattern, n = Inf, simplify = FALSE) str_split_1(string, pattern) str_split_fixed(string, pattern, n) str_split_i(string, pattern, i) } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{pattern}{Pattern to look for. The default interpretation is a regular expression, as described in \code{vignette("regular-expressions")}. Use \code{\link[=regex]{regex()}} for finer control of the matching behaviour. Match a fixed string (i.e. by comparing only bytes), using \code{\link[=fixed]{fixed()}}. This is fast, but approximate. 
Generally, for matching human text, you'll want \code{\link[=coll]{coll()}} which respects character matching rules for the specified locale. Match character, word, line and sentence boundaries with \code{\link[=boundary]{boundary()}}. The empty string, \code{""}, is equivalent to \code{boundary("character")}.} \item{n}{Maximum number of pieces to return. Default (Inf) uses all possible split positions. For \code{str_split()}, this determines the maximum length of each element of the output. For \code{str_split_fixed()}, this determines the number of columns in the output; if an input is too short, the result will be padded with \code{""}.} \item{simplify}{A boolean. \itemize{ \item \code{FALSE} (the default): returns a list of character vectors. \item \code{TRUE}: returns a character matrix. }} \item{i}{Element to return. Use a negative value to count from the right hand side.} } \value{ \itemize{ \item \code{str_split_1()}: a character vector. \item \code{str_split()}: a list the same length as \code{string}/\code{pattern} containing character vectors. \item \code{str_split_fixed()}: a character matrix with \code{n} columns and the same number of rows as the length of \code{string}/\code{pattern}. \item \code{str_split_i()}: a character vector the same length as \code{string}/\code{pattern}. } } \description{ This family of functions provides various ways of splitting a string up into pieces. These two functions return a character vector: \itemize{ \item \code{str_split_1()} takes a single string and splits it into pieces, returning a single character vector. \item \code{str_split_i()} splits each string in a character vector into pieces and extracts the \code{i}th value, returning a character vector. } These two functions return a more complex object: \itemize{ \item \code{str_split()} splits each string in a character vector into a varying number of pieces, returning a list of character vectors.
\item \code{str_split_fixed()} splits each string in a character vector into a fixed number of pieces, returning a character matrix. } } \examples{ fruits <- c( "apples and oranges and pears and bananas", "pineapples and mangos and guavas" ) str_split(fruits, " and ") str_split(fruits, " and ", simplify = TRUE) # If you want to split a single string, use `str_split_1` str_split_1(fruits[[1]], " and ") # Specify n to restrict the number of possible matches str_split(fruits, " and ", n = 3) str_split(fruits, " and ", n = 2) # If n is greater than the number of pieces, no padding occurs str_split(fruits, " and ", n = 5) # Use fixed to return a character matrix str_split_fixed(fruits, " and ", 3) str_split_fixed(fruits, " and ", 4) # str_split_i extracts only a single piece from a string str_split_i(fruits, " and ", 1) str_split_i(fruits, " and ", 4) # use a negative number to select from the end str_split_i(fruits, " and ", -1) } \seealso{ \code{\link[stringi:stri_split]{stringi::stri_split()}} for the underlying implementation. } ================================================ FILE: man/str_starts.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/detect.R \name{str_starts} \alias{str_starts} \alias{str_ends} \title{Detect the presence/absence of a match at the start/end} \usage{ str_starts(string, pattern, negate = FALSE) str_ends(string, pattern, negate = FALSE) } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{pattern}{Pattern with which the string starts or ends. The default interpretation is a regular expression, as described in \link[stringi:about_search_regex]{stringi::about_search_regex}. Control options with \code{\link[=regex]{regex()}}. Match a fixed string (i.e. by comparing only bytes), using \code{\link[=fixed]{fixed()}}. This is fast, but approximate.
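The list-of-character-vectors shape that \code{str_split()} returns has a direct base-R analogue in \code{strsplit()}, which also produces one character vector per input element (\code{fixed = TRUE} requests literal rather than regex matching):

```r
# strsplit() mirrors str_split()'s return shape: a list with one
# character vector per element of the input.
fruits <- c(
  "apples and oranges and pears and bananas",
  "pineapples and mangos and guavas"
)
strsplit(fruits, " and ", fixed = TRUE)
```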
Generally, for matching human text, you'll want \code{\link[=coll]{coll()}} which respects character matching rules for the specified locale.} \item{negate}{If \code{TRUE}, inverts the resulting boolean vector.} } \value{ A logical vector. } \description{ \code{str_starts()} and \code{str_ends()} are special cases of \code{\link[=str_detect]{str_detect()}} that only match at the beginning or end of a string, respectively. } \examples{ fruit <- c("apple", "banana", "pear", "pineapple") str_starts(fruit, "p") str_starts(fruit, "p", negate = TRUE) str_ends(fruit, "e") str_ends(fruit, "e", negate = TRUE) } ================================================ FILE: man/str_sub.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/sub.R \name{str_sub} \alias{str_sub} \alias{str_sub<-} \alias{str_sub_all} \title{Get and set substrings using their positions} \usage{ str_sub(string, start = 1L, end = -1L) str_sub(string, start = 1L, end = -1L, omit_na = FALSE) <- value str_sub_all(string, start = 1L, end = -1L) } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{start, end}{A pair of integer vectors defining the range of characters to extract (inclusive). Positive values count from the left of the string, and negative values count from the right. In other words, if \code{string} is \code{"abcdef"} then 1 refers to \code{"a"} and -1 refers to \code{"f"}. Alternatively, instead of a pair of vectors, you can pass a matrix to \code{start}. The matrix should have two columns, either labelled \code{start} and \code{end}, or \code{start} and \code{length}. This makes \code{str_sub()} work directly with the output from \code{\link[=str_locate]{str_locate()}} and friends.} \item{omit_na}{Single logical value. 
If \code{TRUE}, missing values in any of the arguments provided will result in an unchanged input.} \item{value}{Replacement string.} } \value{ \itemize{ \item \code{str_sub()}: A character vector the same length as \code{string}/\code{start}/\code{end}. \item \code{str_sub_all()}: A list the same length as \code{string}. Each element is a character vector the same length as \code{start}/\code{end}. } If \code{end} comes before \code{start} or \code{start} is outside the range of \code{string} then the corresponding output will be the empty string. } \description{ \code{str_sub()} extracts or replaces the elements at a single position in each string. \code{str_sub_all()} allows you to extract strings at multiple elements in every string. } \examples{ hw <- "Hadley Wickham" str_sub(hw, 1, 6) str_sub(hw, end = 6) str_sub(hw, 8, 14) str_sub(hw, 8) # Negative values index from end of string str_sub(hw, -1) str_sub(hw, -7) str_sub(hw, end = -7) # str_sub() is vectorised by both string and position str_sub(hw, c(1, 8), c(6, 14)) # if you want to extract multiple positions from multiple strings, # use str_sub_all() x <- c("abcde", "ghifgh") str_sub(x, c(1, 2), c(2, 4)) str_sub_all(x, start = c(1, 2), end = c(2, 4)) # Alternatively, you can pass in a two column matrix, as in the # output from str_locate_all pos <- str_locate_all(hw, "[aeio]")[[1]] pos str_sub(hw, pos) # You can also use `str_sub()` to modify strings: x <- "BBCDEF" str_sub(x, 1, 1) <- "A"; x str_sub(x, -1, -1) <- "K"; x str_sub(x, -2, -2) <- "GHIJ"; x str_sub(x, 2, -2) <- ""; x } \seealso{ The underlying implementation in \code{\link[stringi:stri_sub]{stringi::stri_sub()}} } ================================================ FILE: man/str_subset.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/subset.R \name{str_subset} \alias{str_subset} \title{Find matching elements} \usage{ str_subset(string, pattern, negate = FALSE) } 
\arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{pattern}{Pattern to look for. The default interpretation is a regular expression, as described in \code{vignette("regular-expressions")}. Use \code{\link[=regex]{regex()}} for finer control of the matching behaviour. Match a fixed string (i.e. by comparing only bytes), using \code{\link[=fixed]{fixed()}}. This is fast, but approximate. Generally, for matching human text, you'll want \code{\link[=coll]{coll()}} which respects character matching rules for the specified locale. You can not match boundaries, including \code{""}, with this function.} \item{negate}{If \code{TRUE}, inverts the resulting boolean vector.} } \value{ A character vector, usually smaller than \code{string}. } \description{ \code{str_subset()} returns all elements of \code{string} where there's at least one match to \code{pattern}. It's a wrapper around \code{x[str_detect(x, pattern)]}, and is equivalent to \code{grep(pattern, x, value = TRUE)}. Use \code{\link[=str_extract]{str_extract()}} to extract the matched part of each string. } \examples{ fruit <- c("apple", "banana", "pear", "pineapple") str_subset(fruit, "a") str_subset(fruit, "^a") str_subset(fruit, "a$") str_subset(fruit, "b") str_subset(fruit, "[aeiou]") # Elements that don't match str_subset(fruit, "^p", negate = TRUE) # Missings never match str_subset(c("a", NA, "b"), ".") } \seealso{ \code{\link[=grep]{grep()}} with argument \code{value = TRUE}, \code{\link[stringi:stri_subset]{stringi::stri_subset()}} for the underlying implementation.
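The equivalence stated for \code{str_subset()} can be verified directly in base R: keeping the matching elements is the same whether you use \code{grep(pattern, x, value = TRUE)} or logical subsetting with \code{grepl()}:

```r
# Two base-R spellings of the operation str_subset() wraps.
fruit <- c("apple", "banana", "pear", "pineapple")
grep("^p", fruit, value = TRUE)  # "pear" "pineapple"
fruit[grepl("^p", fruit)]        # same result
```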
} ================================================ FILE: man/str_to_camel.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/case.R \name{str_to_camel} \alias{str_to_camel} \alias{str_to_snake} \alias{str_to_kebab} \title{Convert between different types of programming case} \usage{ str_to_camel(string, first_upper = FALSE) str_to_snake(string) str_to_kebab(string) } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{first_upper}{Logical. Should the first letter be capitalized?} } \description{ \itemize{ \item \code{str_to_camel()} converts to camel case, where the first letter of each word is capitalized, with no separation between words. By default the first letter of the first word is not capitalized. \item \code{str_to_kebab()} converts to kebab case, where words are converted to lower case and separated by dashes (\code{-}). \item \code{str_to_snake()} converts to snake case, where words are converted to lower case and separated by underscores (\verb{_}). } } \examples{ str_to_camel("my-variable") str_to_camel("my-variable", first_upper = TRUE) str_to_snake("MyVariable") str_to_kebab("MyVariable") } ================================================ FILE: man/str_trim.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/trim.R \name{str_trim} \alias{str_trim} \alias{str_squish} \title{Remove whitespace} \usage{ str_trim(string, side = c("both", "left", "right")) str_squish(string) } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{side}{Side on which to remove whitespace: "left", "right", or "both", the default.} } \value{ A character vector the same length as \code{string}. 
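The CamelCase-to-snake_case rule described for \code{str_to_snake()} can be sketched in base R. This is a hypothetical helper for illustration, not stringr's implementation: insert an underscore before each interior capital, then lowercase everything.

```r
# Sketch of snake-case conversion: a lookbehind ensures only interior
# capitals (not the first character) gain an underscore.
to_snake <- function(x) {
  tolower(gsub("(?<=.)([A-Z])", "_\\1", x, perl = TRUE))
}

to_snake("MyVariable")  # "my_variable"
```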
} \description{ \code{str_trim()} removes whitespace from start and end of string; \code{str_squish()} removes whitespace at the start and end, and replaces all internal whitespace with a single space. } \examples{ str_trim(" String with trailing and leading white space\t") str_trim("\n\nString with trailing and leading white space\n\n") str_squish(" String with trailing, middle, and leading white space\t") str_squish("\n\nString with excess, trailing and leading white space\n\n") } \seealso{ \code{\link[=str_pad]{str_pad()}} to add whitespace } ================================================ FILE: man/str_trunc.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/trunc.R \name{str_trunc} \alias{str_trunc} \title{Truncate a string to maximum width} \usage{ str_trunc(string, width, side = c("right", "left", "center"), ellipsis = "...") } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{width}{Maximum width of string.} \item{side, ellipsis}{Location and content of ellipsis that indicates content has been removed.} } \value{ A character vector the same length as \code{string}. } \description{ Truncate a string to a fixed number of characters, so that \code{str_length(str_trunc(x, n))} is always less than or equal to \code{n}. } \examples{ x <- "This string is moderately long" rbind( str_trunc(x, 20, "right"), str_trunc(x, 20, "left"), str_trunc(x, 20, "center") ) } \seealso{ \code{\link[=str_pad]{str_pad()}} to increase the minimum width of a string. } ================================================ FILE: man/str_unique.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/unique.R \name{str_unique} \alias{str_unique} \title{Remove duplicated strings} \usage{ str_unique(string, locale = "en", ignore_case = FALSE, ...) } \arguments{ \item{string}{Input vector. 
Either a character vector, or something coercible to one.} \item{locale}{Locale to use for comparisons. See \code{\link[stringi:stri_locale_list]{stringi::stri_locale_list()}} for all possible options. Defaults to "en" (English) to ensure that default behaviour is consistent across platforms.} \item{ignore_case}{Ignore case when comparing strings?} \item{...}{Other options used to control collation. Passed on to \code{\link[stringi:stri_opts_collator]{stringi::stri_opts_collator()}}.} } \value{ A character vector, usually shorter than \code{string}. } \description{ \code{str_unique()} removes duplicated values, with optional control over how duplication is measured. } \examples{ str_unique(c("a", "b", "c", "b", "a")) str_unique(c("a", "b", "c", "B", "A")) str_unique(c("a", "b", "c", "B", "A"), ignore_case = TRUE) # Use ... to pass additional arguments to stri_unique() str_unique(c("motley", "mötley", "pinguino", "pingüino")) str_unique(c("motley", "mötley", "pinguino", "pingüino"), strength = 1) } \seealso{ \code{\link[=unique]{unique()}}, \code{\link[stringi:stri_unique]{stringi::stri_unique()}} which this function wraps. } ================================================ FILE: man/str_view.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/view.R \name{str_view} \alias{str_view} \alias{str_view_all} \title{View strings and matches} \usage{ str_view( string, pattern = NULL, match = TRUE, html = FALSE, use_escapes = FALSE ) } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{pattern}{Pattern to look for. The default interpretation is a regular expression, as described in \code{vignette("regular-expressions")}. Use \code{\link[=regex]{regex()}} for finer control of the matching behaviour. Match a fixed string (i.e. by comparing only bytes), using \code{\link[=fixed]{fixed()}}. This is fast, but approximate. 
Generally, for matching human text, you'll want \code{\link[=coll]{coll()}} which respects character matching rules for the specified locale. You can not match boundaries, including \code{""}, with this function.} \item{match}{If \code{pattern} is supplied, which elements should be shown? \itemize{ \item \code{TRUE}, the default, shows only elements that match the pattern. \item \code{NA} shows all elements. \item \code{FALSE} shows only elements that don't match the pattern. } If \code{pattern} is not supplied, all elements are always shown.} \item{html}{Use HTML output? If \code{TRUE} will create an HTML widget; if \code{FALSE} will style using ANSI escapes.} \item{use_escapes}{If \code{TRUE}, all non-ASCII characters will be rendered with Unicode escapes. This is useful to see exactly what underlying values are stored in the string.} } \description{ \code{str_view()} is used to print the underlying representation of a string and to see how a \code{pattern} matches. Matches are surrounded by \verb{<>} and unusual whitespace (i.e. all whitespace apart from \code{" "} and \code{"\\n"}) is surrounded by \code{{}} and escaped. Where possible, matches and unusual whitespace are coloured blue and \code{NA}s red. 
} \examples{ # Show special characters str_view(c("\"\\\\", "\\\\\\\\\\\\", "fgh", NA, "NA")) # A non-breaking space looks like a regular space: nbsp <- "Hi\u00A0you" nbsp # But it doesn't behave like one: str_detect(nbsp, " ") # So str_view() brings it to your attention with a blue background str_view(nbsp) # You can also use escapes to see all non-ASCII characters str_view(nbsp, use_escapes = TRUE) # Supply a pattern to see where it matches str_view(c("abc", "def", "fghi"), "[aeiou]") str_view(c("abc", "def", "fghi"), "^") str_view(c("abc", "def", "fghi"), "..") # By default, only matching strings will be shown str_view(c("abc", "def", "fghi"), "e") # but you can show all: str_view(c("abc", "def", "fghi"), "e", match = NA) # or just those that don't match: str_view(c("abc", "def", "fghi"), "e", match = FALSE) } ================================================ FILE: man/str_which.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/subset.R \name{str_which} \alias{str_which} \title{Find matching indices} \usage{ str_which(string, pattern, negate = FALSE) } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{pattern}{Pattern to look for. The default interpretation is a regular expression, as described in \code{vignette("regular-expressions")}. Use \code{\link[=regex]{regex()}} for finer control of the matching behaviour. Match a fixed string (i.e. by comparing only bytes), using \code{\link[=fixed]{fixed()}}. This is fast, but approximate. Generally, for matching human text, you'll want \code{\link[=coll]{coll()}} which respects character matching rules for the specified locale. You can not match boundaries, including \code{""}, with this function.} \item{negate}{If \code{TRUE}, inverts the resulting boolean vector.} } \value{ An integer vector, usually smaller than \code{string}. 
} \description{ \code{str_which()} returns the indices of \code{string} where there's at least one match to \code{pattern}. It's a wrapper around \code{which(str_detect(x, pattern))}, and is equivalent to \code{grep(pattern, x)}. } \examples{ fruit <- c("apple", "banana", "pear", "pineapple") str_which(fruit, "a") # Elements that don't match str_which(fruit, "^p", negate = TRUE) # Missings never match str_which(c("a", NA, "b"), ".") } ================================================ FILE: man/str_wrap.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/wrap.R \name{str_wrap} \alias{str_wrap} \title{Wrap words into nicely formatted paragraphs} \usage{ str_wrap(string, width = 80, indent = 0, exdent = 0, whitespace_only = TRUE) } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{width}{Positive integer giving target line width (in number of characters). A width less than or equal to 1 will put each word on its own line.} \item{indent, exdent}{A non-negative integer giving the indent for the first line (\code{indent}) and all subsequent lines (\code{exdent}).} \item{whitespace_only}{A boolean. \itemize{ \item If \code{TRUE} (the default) wrapping will only occur at whitespace. \item If \code{FALSE}, can break on any non-word character (e.g. \code{/}, \code{-}). }} } \value{ A character vector the same length as \code{string}. } \description{ Wrap words into paragraphs, minimizing the "raggedness" of the lines (i.e. the variation in line length) using the Knuth-Plass algorithm. 
} \examples{ thanks_path <- file.path(R.home("doc"), "THANKS") thanks <- str_c(readLines(thanks_path), collapse = "\n") thanks <- word(thanks, 1, 3, fixed("\n\n")) cat(str_wrap(thanks), "\n") cat(str_wrap(thanks, width = 40), "\n") cat(str_wrap(thanks, width = 60, indent = 2), "\n") cat(str_wrap(thanks, width = 60, exdent = 2), "\n") cat(str_wrap(thanks, width = 0, exdent = 2), "\n") } \seealso{ \code{\link[stringi:stri_wrap]{stringi::stri_wrap()}} for the underlying implementation. } ================================================ FILE: man/stringr-data.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/data.R \docType{data} \name{stringr-data} \alias{stringr-data} \alias{sentences} \alias{fruit} \alias{words} \title{Sample character vectors for practicing string manipulations} \format{ Character vectors. } \usage{ sentences fruit words } \description{ \code{fruit} and \code{words} come from the \code{rcorpora} package written by Gabor Csardi; the data was collected by Darius Kazemi and made available at \url{https://github.com/dariusk/corpora}. \code{sentences} is a collection of "Harvard sentences" used for standardised testing of voice. } \examples{ length(sentences) sentences[1:5] length(fruit) fruit[1:5] length(words) words[1:5] } \keyword{datasets} ================================================ FILE: man/stringr-package.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/stringr-package.R \docType{package} \name{stringr-package} \alias{stringr} \alias{stringr-package} \title{stringr: Simple, Consistent Wrappers for Common String Operations} \description{ \if{html}{\figure{logo.png}{options: style='float: right' alt='logo' width='120'}} A consistent, simple and easy to use set of wrappers around the fantastic 'stringi' package. 
All function and argument names (and positions) are consistent, all functions deal with "NA"'s and zero length vectors in the same way, and the output from one function is easy to feed into the input of another. } \seealso{ Useful links: \itemize{ \item \url{https://stringr.tidyverse.org} \item \url{https://github.com/tidyverse/stringr} \item Report bugs at \url{https://github.com/tidyverse/stringr/issues} } } \author{ \strong{Maintainer}: Hadley Wickham \email{hadley@posit.co} [copyright holder] Other contributors: \itemize{ \item Posit Software, PBC [copyright holder, funder] } } \keyword{internal} ================================================ FILE: man/word.Rd ================================================ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/word.R \name{word} \alias{word} \title{Extract words from a sentence} \usage{ word(string, start = 1L, end = start, sep = fixed(" ")) } \arguments{ \item{string}{Input vector. Either a character vector, or something coercible to one.} \item{start, end}{Pair of integer vectors giving range of words (inclusive) to extract. If negative, counts backwards from the last word. The default values select the first word.} \item{sep}{Separator between words. Defaults to a single space.} } \value{ A character vector with the same length as \code{string}/\code{start}/\code{end}. 
} \description{ Extract words from a sentence } \examples{ sentences <- c("Jane saw a cat", "Jane sat down") word(sentences, 1) word(sentences, 2) word(sentences, -1) word(sentences, 2, -1) # Also vectorised over start and end word(sentences[1], 1:3, -1) word(sentences[1], 1, 1:4) # Can define words by other separators str <- 'abc.def..123.4568.999' word(str, 1, sep = fixed('..')) word(str, 2, sep = fixed('..')) } ================================================ FILE: po/R-es.po ================================================ msgid "" msgstr "" "Project-Id-Version: stringr 1.5.1.9000\n" "POT-Creation-Date: 2024-07-17 11:07-0500\n" "PO-Revision-Date: 2024-07-17 11:07-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: none\n" "Language: es\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "Plural-Forms: nplurals=2; plural=(n != 1);\n" #: detect.R:141 msgid "{.arg pattern} must be a plain string, not a stringr modifier." msgstr "{.arg pattern} debe ser una cadena de caracteres, no un modificador de stringr" #: interp.R:105 msgid "Invalid template string for interpolation." msgstr "Plantilla de cadenas invalida para interpolación." #: interp.R:176 msgid "Failed to parse input {.str {text}}" msgstr "Fallo en segmentar el input {.str {text}}" #: match.R:54 match.R:68 msgid "{.arg pattern} must be a regular expression." msgstr "{.arg pattern} debe ser una expresión regular." #: modifiers.R:216 msgid "{.arg pattern} must be a string, not {.obj_type_friendly {x}}." msgstr "{.arg pattern} debe ser una cadena de caracteres, no {.obj_type_friendly {x}}." #: replace.R:208 msgid "Failed to apply {.arg replacement} function." msgstr "Fallo en aplicar la función {.arg replacement}." #: replace.R:209 msgid "It must accept a character vector of any length." msgstr "Debe aceptar un vector de caracteres de cualquier longitud." 
#: replace.R:220 msgid "" "{.arg replacement} function must return a character vector, not {." "obj_type_friendly {new_flat}}." msgstr "" "La función {.arg replacement} debe devolver un vector de caracteres, no {." "obj_type_friendly {new_flat}}." #: replace.R:226 msgid "" "{.arg replacement} function must return a vector the same length as the input " "({length(old_flat)}), not length {length(new_flat)}." msgstr "" "La función {.arg replacement} debe devolver un vector del mismo largo que el input " "({length(old_flat)}), no de {length(new_flat)} de largo." #: split.R:122 msgid "{.arg i} must not be 0." msgstr "{.arg i} no debe ser igual a 0." #: trunc.R:32 msgid "`width` ({width}) is shorter than `ellipsis` ({str_length(ellipsis)})." msgstr "`width` ({width}) es más corto que `ellipsis` ({str_length(ellipsis)})." #: utils.R:23 msgid "{.arg pattern} can't be a boundary." msgstr "{.arg pattern} no puede ser un límite." #: utils.R:26 msgid "{.arg pattern} can't be the empty string ({.code \"\"})." msgstr "{.arg pattern} no puede ser una cadena de caracteres vacía ({.code \"\"})." ================================================ FILE: po/R-stringr.pot ================================================ msgid "" msgstr "" "Project-Id-Version: stringr 1.5.1.9000\n" "POT-Creation-Date: 2024-08-15 10:19-0700\n" "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n" "Last-Translator: FULL NAME \n" "Language-Team: LANGUAGE \n" "Language: \n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #: detect.R:141 msgid "{.arg pattern} must be a plain string, not a stringr modifier." msgstr "" #: interp.R:105 msgid "Invalid template string for interpolation." msgstr "" #: interp.R:176 msgid "Failed to parse input {.str {text}}" msgstr "" #: match.R:54 match.R:68 msgid "{.arg pattern} must be a regular expression." msgstr "" #: modifiers.R:216 msgid "{.arg pattern} must be a string, not {.obj_type_friendly {x}}." 
msgstr "" #: replace.R:208 msgid "Failed to apply {.arg replacement} function." msgstr "" #: replace.R:209 msgid "It must accept a character vector of any length." msgstr "" #: replace.R:220 msgid "" "{.arg replacement} function must return a character vector, not {." "obj_type_friendly {new_flat}}." msgstr "" #: replace.R:226 msgid "" "{.arg replacement} function must return a vector the same length as the " "input ({length(old_flat)}), not length {length(new_flat)}." msgstr "" #: split.R:122 msgid "{.arg i} must not be 0." msgstr "" #: trunc.R:32 msgid "`width` ({width}) is shorter than `ellipsis` ({str_length(ellipsis)})." msgstr "" #: utils.R:23 msgid "{.arg pattern} can't be a boundary." msgstr "" #: utils.R:26 msgid "{.arg pattern} can't be the empty string ({.code \"\"})." msgstr "" ================================================ FILE: revdep/.gitignore ================================================ checks library checks.noindex library.noindex data.sqlite *.html cloud.noindex ================================================ FILE: revdep/README.md ================================================ # Revdeps ## Failed to check (2) |package |version |error |warning |note | |:-------------------|:-------|:-----|:-------|:----| |DSMolgenisArmadillo |? 
| | | | |multinma |0.8.1 |1 | | | ## New problems (9) |package |version |error |warning |note | |:---------|:-------|:------|:-------|:----| |[huxtable](problems.md#huxtable)|5.7.0 |__+2__ | |1 | |[latex2exp](problems.md#latex2exp)|0.9.6 |__+2__ | | | |[NMsim](problems.md#nmsim)|0.2.5 |__+1__ | | | |[nrlR](problems.md#nrlr)|0.1.1 |__+1__ | | | |[phenofit](problems.md#phenofit)|0.3.10 |__+2__ | | | |[psycModel](problems.md#psycmodel)|0.5.0 |__+1__ | |1 | |[salty](problems.md#salty)|0.1.1 |__+2__ | | | |[sdbuildR](problems.md#sdbuildr)|1.0.7 |__+1__ | | | |[zipangu](problems.md#zipangu)|0.3.3 |__+1__ | |1 | ================================================ FILE: revdep/cran.md ================================================ ## revdepcheck results We checked 2390 reverse dependencies, comparing R CMD check results across CRAN and dev versions of this package. * We saw 9 new problems * We failed to check 2 packages Issues with CRAN packages are summarised below. ### New problems (This reports the first line of each new failure) * huxtable checking examples ... ERROR checking tests ... ERROR * latex2exp checking tests ... ERROR checking re-building of vignette outputs ... ERROR * NMsim checking tests ... ERROR * nrlR checking examples ... ERROR * phenofit checking examples ... ERROR checking tests ... ERROR * psycModel checking tests ... ERROR * salty checking examples ... ERROR checking tests ... ERROR * sdbuildR checking tests ... ERROR * zipangu checking tests ... ERROR ### Failed to check * DSMolgenisArmadillo (NA) * multinma (NA) ================================================ FILE: revdep/email.yml ================================================ release_date: ??? rel_release_date: ??? my_news_url: ??? release_version: ??? release_details: ??? 
================================================ FILE: revdep/failures.md ================================================ # DSMolgenisArmadillo (3.0.1) * GitHub: * Email: * GitHub mirror: Run `revdepcheck::cloud_details(, "DSMolgenisArmadillo")` for more info ## Error before installation ### Devel ``` * using log directory ‘/tmp/workdir/DSMolgenisArmadillo/new/DSMolgenisArmadillo.Rcheck’ * using R version 4.5.1 (2025-06-13) * using platform: x86_64-pc-linux-gnu * R was compiled by gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 GNU Fortran (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 * running under: Ubuntu 24.04.3 LTS * using session charset: UTF-8 * using option ‘--no-manual’ * checking for file ‘DSMolgenisArmadillo/DESCRIPTION’ ... OK ... * checking files in ‘vignettes’ ... OK * checking examples ... OK * checking for unstated dependencies in ‘tests’ ... OK * checking tests ... OK Running ‘testthat.R’ * checking for unstated dependencies in vignettes ... OK * checking package vignettes ... OK * checking re-building of vignette outputs ... OK * DONE Status: OK ``` ### CRAN ``` * using log directory ‘/tmp/workdir/DSMolgenisArmadillo/old/DSMolgenisArmadillo.Rcheck’ * using R version 4.5.1 (2025-06-13) * using platform: x86_64-pc-linux-gnu * R was compiled by gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 GNU Fortran (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 * running under: Ubuntu 24.04.3 LTS * using session charset: UTF-8 * using option ‘--no-manual’ * checking for file ‘DSMolgenisArmadillo/DESCRIPTION’ ... OK ... * checking files in ‘vignettes’ ... OK * checking examples ... OK * checking for unstated dependencies in ‘tests’ ... OK * checking tests ... OK Running ‘testthat.R’ * checking for unstated dependencies in vignettes ... OK * checking package vignettes ... OK * checking re-building of vignette outputs ... 
OK * DONE Status: OK ``` # multinma (0.8.1) * GitHub: * Email: * GitHub mirror: Run `revdepcheck::cloud_details(, "multinma")` for more info ## In both * checking whether package ‘multinma’ can be installed ... ERROR ``` Installation failed. See ‘/tmp/workdir/multinma/new/multinma.Rcheck/00install.out’ for details. ``` ## Installation ### Devel ``` * installing *source* package ‘multinma’ ... ** this is package ‘multinma’ version ‘0.8.1’ ** package ‘multinma’ successfully unpacked and MD5 sums checked ** using staged installation ** libs using C++ compiler: ‘g++ (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0’ using C++17 g++ -std=gnu++17 -I"/opt/R/4.5.1/lib/R/include" -DNDEBUG -I"../inst/include" -I"/usr/local/lib/R/site-library/StanHeaders/include/src" -DBOOST_DISABLE_ASSERTS -DEIGEN_NO_DEBUG -DBOOST_MATH_OVERFLOW_ERROR_POLICY=errno_on_error -DUSE_STANC3 -D_HAS_AUTO_PTR_ETC=0 -I'/usr/local/lib/R/site-library/BH/include' -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppEigen/include' -I'/usr/local/lib/R/site-library/RcppParallel/include' -I'/usr/local/lib/R/site-library/rstan/include' -I'/usr/local/lib/R/site-library/StanHeaders/include' -I/usr/local/include -I'/usr/local/lib/R/site-library/RcppParallel/include' -D_REENTRANT -DSTAN_THREADS -fpic -g -O2 -c RcppExports.cpp -o RcppExports.o ... 
/usr/local/lib/R/site-library/StanHeaders/include/src/stan/mcmc/hmc/hamiltonians/dense_e_metric.hpp:22:0: required from ‘double stan::mcmc::dense_e_metric::T(stan::mcmc::dense_e_point&) [with Model = model_survival_param_namespace::model_survival_param; BaseRNG = boost::random::additive_combine_engine, boost::random::linear_congruential_engine >]’ /usr/local/lib/R/site-library/StanHeaders/include/src/stan/mcmc/hmc/hamiltonians/dense_e_metric.hpp:21:0: required from here /usr/local/lib/R/site-library/RcppEigen/include/Eigen/src/Core/DenseCoeffsBase.h:654:74: warning: ignoring attributes on template argument ‘Eigen::internal::packet_traits::type’ {aka ‘__m128d’} [-Wignored-attributes] 654 | return internal::first_aligned::alignment),Derived>(m); | ^~~~~~~~~ g++: fatal error: Killed signal terminated program cc1plus compilation terminated. make: *** [/opt/R/4.5.1/lib/R/etc/Makeconf:209: stanExports_survival_param.o] Error 1 ERROR: compilation failed for package ‘multinma’ * removing ‘/tmp/workdir/multinma/new/multinma.Rcheck/multinma’ ``` ### CRAN ``` * installing *source* package ‘multinma’ ... 
** this is package ‘multinma’ version ‘0.8.1’ ** package ‘multinma’ successfully unpacked and MD5 sums checked ** using staged installation ** libs using C++ compiler: ‘g++ (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0’ using C++17 g++ -std=gnu++17 -I"/opt/R/4.5.1/lib/R/include" -DNDEBUG -I"../inst/include" -I"/usr/local/lib/R/site-library/StanHeaders/include/src" -DBOOST_DISABLE_ASSERTS -DEIGEN_NO_DEBUG -DBOOST_MATH_OVERFLOW_ERROR_POLICY=errno_on_error -DUSE_STANC3 -D_HAS_AUTO_PTR_ETC=0 -I'/usr/local/lib/R/site-library/BH/include' -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppEigen/include' -I'/usr/local/lib/R/site-library/RcppParallel/include' -I'/usr/local/lib/R/site-library/rstan/include' -I'/usr/local/lib/R/site-library/StanHeaders/include' -I/usr/local/include -I'/usr/local/lib/R/site-library/RcppParallel/include' -D_REENTRANT -DSTAN_THREADS -fpic -g -O2 -c RcppExports.cpp -o RcppExports.o ... /usr/local/lib/R/site-library/StanHeaders/include/src/stan/mcmc/hmc/hamiltonians/dense_e_metric.hpp:22:0: required from ‘double stan::mcmc::dense_e_metric::T(stan::mcmc::dense_e_point&) [with Model = model_survival_param_namespace::model_survival_param; BaseRNG = boost::random::additive_combine_engine, boost::random::linear_congruential_engine >]’ /usr/local/lib/R/site-library/StanHeaders/include/src/stan/mcmc/hmc/hamiltonians/dense_e_metric.hpp:21:0: required from here /usr/local/lib/R/site-library/RcppEigen/include/Eigen/src/Core/DenseCoeffsBase.h:654:74: warning: ignoring attributes on template argument ‘Eigen::internal::packet_traits::type’ {aka ‘__m128d’} [-Wignored-attributes] 654 | return internal::first_aligned::alignment),Derived>(m); | ^~~~~~~~~ g++: fatal error: Killed signal terminated program cc1plus compilation terminated. 
make: *** [/opt/R/4.5.1/lib/R/etc/Makeconf:209: stanExports_survival_param.o] Error 1 ERROR: compilation failed for package ‘multinma’ * removing ‘/tmp/workdir/multinma/old/multinma.Rcheck/multinma’ ``` ================================================ FILE: revdep/problems.md ================================================ # huxtable (5.7.0) * GitHub: * Email: * GitHub mirror: Run `revdepcheck::cloud_details(, "huxtable")` for more info ## Newly broken * checking examples ... ERROR ``` ... 2. ├─huxtable:::print.huxtable(x) 3. │ └─huxtable (local) meth(x, ...) 4. │ ├─base::cat(to_screen(ht, ...)) 5. │ └─huxtable::to_screen(ht, ...) 6. │ └─huxtable:::generate_table_display(...) 7. │ └─huxtable:::create_character_matrix(...) 8. │ └─huxtable:::character_matrix(...) 9. │ └─huxtable:::prepare_cell_display_data(ht, markdown) 10. │ └─huxtable:::clean_contents(ht, output_type = if (markdown) "markdown" else "screen") 11. │ └─huxtable:::format_numbers_matrix(contents, ht) 12. │ └─base::vapply(...) 13. │ └─huxtable (local) FUN(X[[i]], ...) 14. │ └─base::vapply(...) 15. │ └─huxtable (local) FUN(X[[i]], ...) 16. │ └─huxtable:::format_numbers(cell, nf[[row, col]]) 17. │ └─stringr::str_replace_all(string, number_regex(), format_numeral) 18. │ └─stringr:::str_transform_all(string, pattern, replacement) 19. │ ├─base::withCallingHandlers(...) 20. │ └─huxtable (local) replacement(old_flat) 21. │ └─numeral_formatter(num_fmt)(num) 22. └─base::.handleSimpleError(...) 23. └─stringr (local) h(simpleError(msg, call)) 24. └─cli::cli_abort(...) 25. └─rlang::abort(...) Execution halted ``` * checking tests ... ERROR ``` ... 
• x86_64-w64-mingw32/x64/validate-outputs/dimensions.rtf • x86_64-w64-mingw32/x64/validate-outputs/dimensions.tex • x86_64-w64-mingw32/x64/validate-outputs/dimensions.txt • x86_64-w64-mingw32/x64/validate-outputs/table_caption_tests.html • x86_64-w64-mingw32/x64/validate-outputs/table_caption_tests.rtf • x86_64-w64-mingw32/x64/validate-outputs/table_caption_tests.tex • x86_64-w64-mingw32/x64/validate-outputs/table_caption_tests.txt • x86_64-w64-mingw32/x64/validate-outputs/table_width_tests.html • x86_64-w64-mingw32/x64/validate-outputs/table_width_tests.rtf • x86_64-w64-mingw32/x64/validate-outputs/table_width_tests.tex • x86_64-w64-mingw32/x64/validate-outputs/table_width_tests.txt • x86_64-w64-mingw32/x64/validate-outputs/text_alignment.html • x86_64-w64-mingw32/x64/validate-outputs/text_alignment.rtf • x86_64-w64-mingw32/x64/validate-outputs/text_alignment.tex • x86_64-w64-mingw32/x64/validate-outputs/text_alignment.txt • x86_64-w64-mingw32/x64/validate-outputs/text_effects.html • x86_64-w64-mingw32/x64/validate-outputs/text_effects.rtf • x86_64-w64-mingw32/x64/validate-outputs/text_effects.tex • x86_64-w64-mingw32/x64/validate-outputs/text_effects.txt • x86_64-w64-mingw32/x64/validate-outputs/text_properties.html • x86_64-w64-mingw32/x64/validate-outputs/text_properties.rtf • x86_64-w64-mingw32/x64/validate-outputs/text_properties.tex • x86_64-w64-mingw32/x64/validate-outputs/text_properties.txt Error: Test failures Execution halted ``` ## In both * checking dependencies in R code ... NOTE ``` Namespaces in Imports field not imported from: ‘R6’ ‘xml2’ All declared Imports should be used. ``` # latex2exp (0.9.6) * GitHub: * Email: * GitHub mirror: Run `revdepcheck::cloud_details(, "latex2exp")` for more info ## Newly broken * checking tests ... ERROR ``` ... (2), not length 1. Backtrace: ▆ 1. ├─latex2exp:::expect_renders_same(...) at test_simple.R:166:3 2. 
│ └─latex2exp:::.expect_renders(object, expected_expression, negate = FALSE) at tests/testthat/setup.R:30:3 3. │ └─latex2exp::TeX(act$val) at tests/testthat/setup.R:65:5 4. │ └─latex2exp:::parse_latex(input) 5. │ └─... %>% ... 6. ├─stringr::str_replace_all(., "([^\\\\]?)\\\\\\s", "\\1\\\\@SPACE2{}") 7. │ └─stringr:::check_lengths(string, pattern, replacement) 8. │ └─vctrs::vec_size_common(...) 9. ├─stringr::str_replace_all(., "([^\\\\]?)\\\\;", "\\1\\\\@SPACE2{}") 10. │ └─stringr:::check_lengths(string, pattern, replacement) 11. │ └─vctrs::vec_size_common(...) 12. ├─stringr::str_replace_all(., "([^\\\\]?)\\\\,", "\\1\\\\@SPACE1{}") 13. │ └─stringr:::check_lengths(string, pattern, replacement) 14. │ └─vctrs::vec_size_common(...) 15. └─stringr::str_replace_all(...) 16. └─stringr:::str_transform_all(string, pattern, replacement) 17. └─cli::cli_abort(...) 18. └─rlang::abort(...) [ FAIL 1 | WARN 1 | SKIP 0 | PASS 100 ] Error: Test failures Execution halted ``` * checking re-building of vignette outputs ... ERROR ``` Error(s) in re-building vignettes: --- re-building ‘supported-commands.Rmd’ using rmarkdown --- finished re-building ‘supported-commands.Rmd’ --- re-building ‘using-latex2exp.Rmd’ using rmarkdown ``` # NMsim (0.2.5) * GitHub: * Email: * GitHub mirror: Run `revdepcheck::cloud_details(, "NMsim")` for more info ## Newly broken * checking tests ... ERROR ``` ... Running the tests in ‘tests/testthat.R’ failed. Complete output: > library(testthat) > library(NMsim) NMsim 0.2.5. Browse NMsim documentation at https://NMautoverse.github.io/NMsim/ > > test_check("NMsim") [ FAIL 1 | WARN 0 | SKIP 0 | PASS 168 ] ══ Failed tests ════════════════════════════════════════════════════════════════ ── Error ('test_NMsim_VarCov.R:62:5'): Basic ─────────────────────────────────── Error in `stringr::str_replace_all(mod$THETA, "\\d+\\.\\d+", function(x) round(as.numeric(x), digits = 3))`: `replacement` function must return a character vector, not a double vector. Backtrace: ▆ 1. 
└─stringr::str_replace_all(...) at test_NMsim_VarCov.R:62:5 2. └─stringr:::str_transform_all(string, pattern, replacement) 3. └─cli::cli_abort(...) 4. └─rlang::abort(...) [ FAIL 1 | WARN 0 | SKIP 0 | PASS 168 ] Error: Test failures Execution halted ``` # nrlR (0.1.1) * Email: * GitHub mirror: Run `revdepcheck::cloud_details(, "nrlR")` for more info ## Newly broken * checking examples ... ERROR ``` ... > ### Name: fetch_lineups > ### Title: Fetch NRL Team Lineups > ### Aliases: fetch_lineups > > ### ** Examples > > fetch_lineups(url = "https://www.nrl.com/news/2024/05/07/nrl-team-lists-round-10/") Fetching team lineups from https://www.nrl.com/news/2024/05/07/nrl-team-lists-round-10/ Error in `stringr::str_replace()`: ! `pattern` can not contain NAs. Backtrace: ▆ 1. └─nrlR::fetch_lineups(url = "https://www.nrl.com/news/2024/05/07/nrl-team-lists-round-10/") 2. ├─stringr::str_squish(...) 3. │ └─stringr:::copy_names(...) 4. ├─stringr::str_replace(...) 5. │ └─stringr:::check_lengths(string, pattern, replacement) 6. │ └─vctrs::vec_size_common(...) 7. └─stringr::str_replace(rvest::html_text2(home_node), home_role_full, "") 8. ├─stringr:::type(pattern) 9. └─stringr:::type.character(pattern) 10. └─cli::cli_abort(tr_("{.arg pattern} can not contain NAs."), call = error_call) 11. └─rlang::abort(...) Execution halted ``` # phenofit (0.3.10) * GitHub: * Email: * GitHub mirror: Run `revdepcheck::cloud_details(, "phenofit")` for more info ## Newly broken * checking examples ... ERROR ``` ... 3. │ └─... %>% set_names(dt$flag) 4. ├─dplyr::group_map(...) 5. ├─dplyr:::group_map.data.frame(...) 6. │ └─dplyr:::map2(chunks, group_keys, .f, ...) 7. │ └─base::mapply(.f, .x, .y, MoreArgs = list(...), SIMPLIFY = FALSE) 8. │ └─phenofit (local) ``(dots[[1L]][[1L]], dots[[2L]][[1L]]) 9. │ └─phenofit:::PhenoDeriv.default(values, t, der1, IsPlot = FALSE) 10. │ └─phenofit::findpeaks(...) 11. │ └─xc %<>% str_replace_midzero() 12. ├─phenofit:::str_replace_midzero(.) 13. 
│ └─str_replace_all(x, "\\++0\\++", . %>% replace("+")) %>% ... 14. ├─stringr::str_replace_all(., "-+0-+", . %>% replace("-")) 15. │ └─stringr:::str_transform_all(string, pattern, replacement) 16. │ ├─base::withCallingHandlers(...) 17. │ └─magrittr (local) replacement(old_flat) 18. │ └─magrittr::freduce(value, `_function_list`) 19. │ ├─base::withVisible(function_list[[k]](value)) 20. │ └─function_list[[k]](value) 21. │ └─phenofit (local) replace(., "-") 22. │ └─base::paste(rep(replacement, nchar(x)), collapse = "") 23. └─base::.handleSimpleError(...) 24. └─stringr (local) h(simpleError(msg, call)) 25. └─cli::cli_abort(...) 26. └─rlang::abort(...) Execution halted ``` * checking tests ... ERROR ``` ... 8. │ └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo)) 9. ├─base::do.call(season, param) 10. ├─phenofit (local) ``(...) 11. │ └─phenofit:::findpeaks_season(...) 12. │ └─phenofit::findpeaks(...) 13. │ └─xc %<>% str_replace_midzero() 14. ├─phenofit:::str_replace_midzero(.) 15. │ └─str_replace_all(x, "\\++0\\++", . %>% replace("+")) %>% ... 16. ├─stringr::str_replace_all(., "-+0-+", . %>% replace("-")) 17. │ └─stringr:::str_transform_all(string, pattern, replacement) 18. │ ├─base::withCallingHandlers(...) 19. │ └─magrittr (local) replacement(old_flat) 20. │ └─magrittr::freduce(value, `_function_list`) 21. │ ├─base::withVisible(function_list[[k]](value)) 22. │ └─function_list[[k]](value) 23. │ └─phenofit (local) replace(., "-") 24. │ └─base::paste(rep(replacement, nchar(x)), collapse = "") 25. └─base::.handleSimpleError(...) 26. └─stringr (local) h(simpleError(msg, call)) 27. └─cli::cli_abort(...) 28. └─rlang::abort(...) [ FAIL 2 | WARN 2 | SKIP 0 | PASS 66 ] Error: Test failures Execution halted ``` # psycModel (0.5.0) * GitHub: * Email: * GitHub mirror: Run `revdepcheck::cloud_details(, "psycModel")` for more info ## Newly broken * checking tests ... ERROR ``` ... 
[ FAIL 2 | WARN 0 | SKIP 0 | PASS 68 ] ══ Failed tests ════════════════════════════════════════════════════════════════ ── Failure ('test-model-table.R:15:3'): model_table: linear regression ───────── `lm_1_check` (`actual`) not equal to model_summary[[2]] (`expected`). `names(actual)`: "(Intercept)" "Sepal.Length" `names(expected)`: "" "" ── Failure ('test-model-table.R:16:3'): model_table: linear regression ───────── `lm_2_check` (`actual`) not equal to model_summary[[3]] (`expected`). `names(actual)`: "(Intercept)" "Petal.Length" `names(expected)`: "" "" [ FAIL 2 | WARN 0 | SKIP 0 | PASS 68 ] Error: Test failures Execution halted ``` ## In both * checking dependencies in R code ... NOTE ``` Namespaces in Imports field not imported from: ‘lifecycle’ ‘patchwork’ All declared Imports should be used. ``` # salty (0.1.1) * GitHub: * Email: * GitHub mirror: Run `revdepcheck::cloud_details(, "salty")` for more info ## Newly broken * checking examples ... ERROR ``` ... > x <- c("Lorem ipsum dolor sit amet, consectetur adipiscing elit.", + "Nunc finibus tortor a elit eleifend interdum.", + "Maecenas aliquam augue sit amet ultricies placerat.") > > salt_replace(x, replacement_shaker$capitalization, p = 0.5, rep_p = 0.2) Error in `purrr::map2_chr()`: ℹ In index: 1. Caused by error in `stringr::str_replace_all()`: ! `replacement` function must return a vector the same length as the input (47), not length 1. Backtrace: ▆ 1. └─salty::salt_replace(...) 2. └─purrr::map2_chr(...) 3. └─purrr:::map2_("character", .x, .y, .f, ..., .progress = .progress) 4. ├─purrr:::with_indexed_errors(...) 5. │ └─base::withCallingHandlers(...) 6. ├─purrr:::call_with_cleanup(...) 7. └─salty (local) .f(.x[[i]], .y[[i]], ...) 8. └─salty:::selective_replacement(xc, replacements(i = si), rep_p) 9. └─stringr::str_replace_all(x, pattern = patterns, replacement = repfun) 10. └─stringr:::str_transform_all(string, pattern, replacement) 11. └─cli::cli_abort(...) 12. └─rlang::abort(...) 
Execution halted ``` * checking tests ... ERROR ``` ... 9. │ └─purrr::map2_chr(...) 10. │ └─purrr:::map2_("character", .x, .y, .f, ..., .progress = .progress) 11. │ ├─purrr:::with_indexed_errors(...) 12. │ │ └─base::withCallingHandlers(...) 13. │ ├─purrr:::call_with_cleanup(...) 14. │ └─salty (local) .f(.x[[i]], .y[[i]], ...) 15. │ └─salty:::selective_replacement(xc, replacements(i = si), rep_p) 16. │ └─stringr::str_replace_all(x, pattern = patterns, replacement = repfun) 17. │ └─stringr:::str_transform_all(string, pattern, replacement) 18. │ └─cli::cli_abort(...) 19. │ └─rlang::abort(...) 20. │ └─rlang:::signal_abort(cnd, .file) 21. │ └─base::signalCondition(cnd) 22. ├─purrr (local) ``(``) 23. │ └─cli::cli_abort(...) 24. │ └─rlang::abort(...) 25. │ └─rlang:::signal_abort(cnd, .file) 26. │ └─base::signalCondition(cnd) 27. └─purrr (local) ``(``) 28. └─cli::cli_abort(...) 29. └─rlang::abort(...) [ FAIL 5 | WARN 0 | SKIP 0 | PASS 755 ] Error: Test failures Execution halted ``` # sdbuildR (1.0.7) * GitHub: * Email: * GitHub mirror: Run `revdepcheck::cloud_details(, "sdbuildR")` for more info ## Newly broken * checking tests ... ERROR ``` ... 12. └─cli::cli_abort(...) 13. └─rlang::abort(...) ── Error ('test-conv_julia.R:723:3'): adding scientific notation ─────────────── Error in `stringr::str_replace_all(eqn, pattern = pattern, replacement = reformat_scientific)`: Failed to apply `replacement` function. i It must accept a character vector of any length. Caused by error in `if (nchar(format(num, scientific = FALSE)) > digits_max) ...`: ! the condition has length > 1 Backtrace: ▆ 1. ├─testthat::expect_equal(...) at test-conv_julia.R:723:3 2. │ └─testthat::quasi_label(enquo(object), label, arg = "object") 3. │ └─rlang::eval_bare(expr, quo_get_env(quo)) 4. ├─sdbuildR:::scientific_notation("hiding 1e+23", task = "add") 5. │ └─stringr::str_replace_all(eqn, pattern = pattern, replacement = reformat_scientific) 6. 
│ └─stringr:::str_transform_all(string, pattern, replacement) 7. │ ├─base::withCallingHandlers(...) 8. │ └─sdbuildR (local) replacement(old_flat) 9. └─base::.handleSimpleError(...) 10. └─stringr (local) h(simpleError(msg, call)) 11. └─cli::cli_abort(...) 12. └─rlang::abort(...) [ FAIL 4 | WARN 0 | SKIP 30 | PASS 915 ] Error: Test failures Execution halted ``` # zipangu (0.3.3) * GitHub: * Email: * GitHub mirror: Run `revdepcheck::cloud_details(, "zipangu")` for more info ## Newly broken * checking tests ... ERROR ``` ... res <- res %>% purrr::list_merge(city = split_pref[2] %>% dplyr::if_else(is_address_block(.), stringr::str_remove(., "((土地区画|街区).+)") %>% stringr::str_remove("土地区画|街区"), .) %>% stringr::str_replace("(.市)(.+町.+)", "\\1") %>% stringr::str_replace(city_name_regex, replacement = "\\1")) } else { res <- res %>% purrr::list_merge(city = split_pref[2] %>% dplyr::if_else(is_address_block(.), stringr::str_remove(., "((土地区画|街区).+)") %>% stringr::str_remove("土地区画|街区"), .) %>% stringr::str_replace(paste0(city_name_regex, "(.+)"), replacement = "\\1")) } res <- res %>% purrr::list_merge(street = split_pref[2] %>% stringr::str_remove(res %>% purrr::pluck("city"))) res %>% purrr::map(~dplyr::if_else(.x == "", NA_character_, .x)) })`: ℹ In index: 1. Caused by error in `str_replace()`: ! `pattern` can not contain NAs. [ FAIL 1 | WARN 0 | SKIP 2 | PASS 143 ] Error: Test failures Execution halted ``` ## In both * checking DESCRIPTION meta-information ... NOTE ``` Missing dependency on R >= 4.1.0 because package code uses the pipe |> or function shorthand \(...) syntax added in R 4.1.0. 
File(s) using such syntax: ‘convert-jyear-legacy.R’ ``` ================================================ FILE: stringr.Rproj ================================================ Version: 1.0 RestoreWorkspace: Default SaveWorkspace: Default AlwaysSaveHistory: Default EnableCodeIndexing: Yes UseSpacesForTab: Yes NumSpacesForTab: 2 Encoding: UTF-8 RnwWeave: Sweave LaTeX: pdfLaTeX AutoAppendNewline: Yes StripTrailingWhitespace: Yes BuildType: Package PackageUseDevtools: Yes PackageInstallArgs: --no-multiarch --with-keep.source PackageRoxygenize: rd,collate,namespace ================================================ FILE: tests/testthat/_snaps/c.md ================================================ # obeys tidyverse recycling rules Code str_c(c("x", "y"), character()) Condition Error in `str_c()`: ! Can't recycle `..1` (size 2) to match `..2` (size 0). # vectorised arguments error Code str_c(letters, sep = c("a", "b")) Condition Error in `str_c()`: ! `sep` must be a single string, not a character vector. Code str_c(letters, collapse = c("a", "b")) Condition Error in `str_c()`: ! `collapse` must be a single string or `NULL`, not a character vector. ================================================ FILE: tests/testthat/_snaps/conv.md ================================================ # check encoding argument Code str_conv("A", c("ISO-8859-1", "ISO-8859-2")) Condition Error in `str_conv()`: ! `encoding` must be a single string, not a character vector. ================================================ FILE: tests/testthat/_snaps/detect.md ================================================ # can't empty/boundary Code str_detect("x", "") Condition Error in `str_detect()`: ! `pattern` can't be the empty string (`""`). Code str_starts("x", "") Condition Error in `str_starts()`: ! `pattern` can't be the empty string (`""`). Code str_ends("x", "") Condition Error in `str_ends()`: ! `pattern` can't be the empty string (`""`). 
# functions use tidyverse recycling rules Code str_detect(1:2, 1:3) Condition Error in `str_detect()`: ! Can't recycle `string` (size 2) to match `pattern` (size 3). Code str_starts(1:2, 1:3) Condition Error in `str_starts()`: ! Can't recycle `string` (size 2) to match `pattern` (size 3). Code str_ends(1:2, 1:3) Condition Error in `str_ends()`: ! Can't recycle `string` (size 2) to match `pattern` (size 3). Code str_like(1:2, c("a", "b", "c")) Condition Error in `str_like()`: ! Can't recycle `string` (size 2) to match `pattern` (size 3). # str_like is case sensitive Code str_like("abc", regex("x")) Condition Error in `str_like()`: ! `pattern` must be a plain string, not a stringr modifier. # ignore_case is deprecated but still respected Code out <- str_like("abc", "AB%", ignore_case = TRUE) Condition Warning: The `ignore_case` argument of `str_like()` is deprecated as of stringr 1.6.0. i `str_like()` is always case sensitive. i Use `str_ilike()` for case insensitive string matching. # str_ilike works Code str_ilike("abc", regex("x")) Condition Error in `str_ilike()`: ! `pattern` must be a plain string, not a stringr modifier. ================================================ FILE: tests/testthat/_snaps/dup.md ================================================ # separator must be a single string Code str_dup("a", 3, sep = 1) Condition Error in `str_dup()`: ! `sep` must be a single string or `NULL`, not the number 1. Code str_dup("a", 3, sep = c("-", ";")) Condition Error in `str_dup()`: ! `sep` must be a single string or `NULL`, not a character vector. ================================================ FILE: tests/testthat/_snaps/equal.md ================================================ # vectorised using TRR Code str_equal(letters[1:3], c("a", "b")) Condition Error in `str_equal()`: ! Can't recycle `x` (size 3) to match `y` (size 2). 
================================================ FILE: tests/testthat/_snaps/flatten.md ================================================ # collapse must be single string Code str_flatten("A", c("a", "b")) Condition Error in `str_flatten()`: ! `collapse` must be a single string, not a character vector. ================================================ FILE: tests/testthat/_snaps/interp.md ================================================ # str_interp fails when encountering nested placeholders Code str_interp("${${msg}}") Condition Error in `str_interp()`: ! Invalid template string for interpolation. Code str_interp("$[.2f]{${msg}}") Condition Error in `str_interp()`: ! Invalid template string for interpolation. # str_interp fails when input is not a character string Code str_interp(3L) Condition Error in `str_interp()`: ! `string` must be a character vector, not the number 3. # str_interp wraps parsing errors Code str_interp("This is a ${1 +}") Condition Error in `str_interp()`: ! Failed to parse input "1 +" Caused by error in `parse()`: ! :2:0: unexpected end of input 1: 1 + ^ ================================================ FILE: tests/testthat/_snaps/match.md ================================================ # match and match_all fail when pattern is not a regex Code str_match(phones, fixed("3")) Condition Error in `str_match()`: ! `pattern` must be a regular expression. Code str_match_all(phones, coll("9")) Condition Error in `str_match_all()`: ! `pattern` must be a regular expression. # match can't use other modifiers Code str_match("x", coll("y")) Condition Error in `str_match()`: ! `pattern` must be a regular expression. Code str_match_all("x", coll("y")) Condition Error in `str_match_all()`: ! `pattern` must be a regular expression. ================================================ FILE: tests/testthat/_snaps/modifiers.md ================================================ # patterns coerced to character Code . 
<- regex(x) Condition Warning in `regex()`: Coercing `pattern` to a plain character vector. Code . <- coll(x) Condition Warning in `coll()`: Coercing `pattern` to a plain character vector. Code . <- fixed(x) Condition Warning in `fixed()`: Coercing `pattern` to a plain character vector. # useful error message for bad type Code type(1:3) Condition Error: ! `pattern` must be a character vector, not an integer vector. # useful errors for NAs Code type(NA) Condition Error: ! `pattern` must be a character vector, not `NA`. Code type(c("a", "b", NA_character_, "c")) Condition Error: ! `pattern` can not contain NAs. ================================================ FILE: tests/testthat/_snaps/replace.md ================================================ # replacement must be a string Code str_replace("x", "x", 1) Condition Error in `str_replace()`: ! `replacement` must be a character vector, not the number 1. # can't replace empty/boundary Code str_replace("x", "", "") Condition Error in `str_replace()`: ! `pattern` can't be the empty string (`""`). Code str_replace("x", boundary("word"), "") Condition Error in `str_replace()`: ! `pattern` can't be a boundary. Code str_replace_all("x", "", "") Condition Error in `str_replace_all()`: ! `pattern` can't be the empty string (`""`). Code str_replace_all("x", boundary("word"), "") Condition Error in `str_replace_all()`: ! `pattern` can't be a boundary. # useful error if not vectorised correctly Code str_replace_all(x, "a|c", ~ if (length(x) > 1) stop("Bad")) Condition Error in `str_replace_all()`: ! Failed to apply `replacement` function. i It must accept a character vector of any length. Caused by error in `replacement()`: ! Bad # replacement function must return correct type/length Code str_replace_all("x", "x", ~1) Condition Error in `str_replace_all()`: ! `replacement` function must return a character vector, not a number. Code str_replace_all("x", "x", ~ c("a", "b")) Condition Error in `str_replace_all()`: ! 
`replacement` function must return a vector the same length as the input (1), not length 2. # backrefs are correctly translated Code str_replace_all("abcde", "(b)(c)(d)", "\\4") Condition Error in `stri_replace_all_regex()`: ! Trying to access the index that is out of bounds. (U_INDEX_OUTOFBOUNDS_ERROR) ================================================ FILE: tests/testthat/_snaps/split.md ================================================ # str_split() checks its inputs Code str_split(letters[1:3], letters[1:2]) Condition Error in `str_split()`: ! Can't recycle `string` (size 3) to match `pattern` (size 2). Code str_split("x", 1) Condition Error in `str_split()`: ! `pattern` must be a character vector, not a number. Code str_split("x", "x", n = 0) Condition Error in `str_split()`: ! `n` must be a number larger than 1, not the number 0. # str_split_1 takes string and returns character vector `string` must be a single string, not a character vector. # str_split_fixed check its inputs Code str_split_fixed("x", "x", 0) Condition Error in `str_split_fixed()`: ! `n` must be a number larger than 1, not the number 0. # str_split_i check its inputs Code str_split_i("x", "x", 0) Condition Error in `str_split_i()`: ! `i` must not be 0. Code str_split_i("x", "x", 0.5) Condition Error in `str_split_i()`: ! `i` must be a whole number, not the number 0.5. ================================================ FILE: tests/testthat/_snaps/sub.md ================================================ # bad vectorisation gives informative error Code str_sub(x, 1:2, 1:3) Condition Error in `str_sub()`: ! Can't recycle `string` (size 2) to match `end` (size 3). Code str_sub(x, 1:2, 1:2) <- 1:3 Condition Error in `str_sub<-`: ! Can't recycle `string` (size 2) to match `value` (size 3). 
================================================ FILE: tests/testthat/_snaps/subset.md ================================================ # can't use boundaries Code str_subset(c("a", "b c"), "") Condition Error in `str_subset()`: ! `pattern` can't be the empty string (`""`). Code str_subset(c("a", "b c"), boundary()) Condition Error in `str_subset()`: ! `pattern` can't be a boundary. ================================================ FILE: tests/testthat/_snaps/trunc.md ================================================ # does not truncate to a length shorter than elipsis Code str_trunc("foobar", 2) Condition Error in `str_trunc()`: ! `width` (2) is shorter than `ellipsis` (3). Code str_trunc("foobar", 3, ellipsis = "....") Condition Error in `str_trunc()`: ! `width` (3) is shorter than `ellipsis` (4). ================================================ FILE: tests/testthat/_snaps/view.md ================================================ # results are truncated Code str_view(words) Output [1] | a [2] | able [3] | about [4] | absolute [5] | accept [6] | account [7] | achieve [8] | across [9] | act [10] | active [11] | actual [12] | add [13] | address [14] | admit [15] | advertise [16] | affect [17] | afford [18] | after [19] | afternoon [20] | again ... and 960 more --- Code str_view(words) Output [1] | a [2] | able [3] | about [4] | absolute [5] | accept ... and 975 more # indices come from original vector Code str_view(letters, "a|z", match = TRUE) Output [1] | [26] | # view highlights all matches Code str_view(x, "[aeiou]") Output [1] | bc [2] | df Code str_view(x, "d|e") Output [2] | f # view highlights whitespace (except a space/nl) Code str_view(x) Output [1] | [2] | {\u00a0} [3] | | [4] | {\t} Code # or can instead use escapes str_view(x, use_escapes = TRUE) Output [1] | [2] | \u00a0 [3] | \n [4] | \t # view displays message for empty vectors Code str_view(character()) Message x Empty `string` provided. 
# can match across lines Code str_view("a\nb\nbbb\nc", "(b|\n)+") Output [1] | a< | b | bbb | >c # str_view_all() is deprecated Code str_view_all("abc", "a|b") Condition Warning: `str_view_all()` was deprecated in stringr 1.5.0. i Please use `str_view()` instead. Output [1] | c # html mode continues to work Code str_view(x, "[aeiou]", html = TRUE)$x$html Output
  • abc
  • def
Code str_view(x, "d|e", html = TRUE)$x$html Output
  • def
--- Code str_view(x, html = TRUE, use_escapes = TRUE)$x$html Output
  •  
  • \u00a0
  • \n
================================================ FILE: tests/testthat/test-c.R ================================================ test_that("basic case works", { test <- c("a", "b", "c") expect_equal(str_c(test), test) expect_equal(str_c(test, sep = " "), test) expect_equal(str_c(test, collapse = ""), "abc") }) test_that("obeys tidyverse recycling rules", { expect_equal(str_c(), character()) expect_equal(str_c("x", character()), character()) expect_equal(str_c("x", NULL), "x") expect_snapshot(str_c(c("x", "y"), character()), error = TRUE) expect_equal(str_c(c("x", "y"), NULL), c("x", "y")) }) test_that("vectorised arguments error", { expect_snapshot(error = TRUE, { str_c(letters, sep = c("a", "b")) str_c(letters, collapse = c("a", "b")) }) }) ================================================ FILE: tests/testthat/test-case.R ================================================ test_that("to_upper and to_lower have equivalent base versions", { x <- "This is a sentence." expect_identical(str_to_upper(x), toupper(x)) expect_identical(str_to_lower(x), tolower(x)) }) test_that("to_title creates one capital letter per word", { x <- "This is a sentence." 
expect_equal(str_count(x, "\\W+"), str_count(str_to_title(x), "[[:upper:]]")) }) test_that("to_sentence capitalizes just the first letter", { expect_identical(str_to_sentence("a Test"), "A test") }) test_that("case conversions preserve names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_to_lower(x)), names(x)) expect_equal(names(str_to_upper(x)), names(x)) expect_equal(names(str_to_title(x)), names(x)) }) # programming cases ----------------------------------------------------------- test_that("to_camel can control case of first argument", { expect_equal(str_to_camel("my_variable"), "myVariable") expect_equal(str_to_camel("my$variable"), "myVariable") expect_equal(str_to_camel(" my variable "), "myVariable") expect_equal(str_to_camel("my_variable", first_upper = TRUE), "MyVariable") }) test_that("to_kebab converts to kebab case", { expect_equal(str_to_kebab("myVariable"), "my-variable") expect_equal(str_to_kebab("MyVariable"), "my-variable") expect_equal(str_to_kebab("1MyVariable1"), "1-my-variable-1") expect_equal(str_to_kebab("My$Variable"), "my-variable") expect_equal(str_to_kebab(" My Variable "), "my-variable") expect_equal(str_to_kebab("testABCTest"), "test-abc-test") expect_equal(str_to_kebab("IlÉtaitUneFois"), "il-était-une-fois") }) test_that("to_snake converts to snake case", { expect_equal(str_to_snake("myVariable"), "my_variable") expect_equal(str_to_snake("MyVariable"), "my_variable") expect_equal(str_to_snake("1MyVariable1"), "1_my_variable_1") expect_equal(str_to_snake("My$Variable"), "my_variable") expect_equal(str_to_snake(" My Variable "), "my_variable") expect_equal(str_to_snake("testABCTest"), "test_abc_test") expect_equal(str_to_snake("IlÉtaitUneFois"), "il_était_une_fois") }) test_that("to_words handles common compound cases", { expect_equal(to_words("a_b"), "a b") expect_equal(to_words("a-b"), "a b") expect_equal(to_words("aB"), "a b") expect_equal(to_words("a123b"), "a 123 b") expect_equal(to_words("HTML"), "html") }) 
================================================ FILE: tests/testthat/test-conv.R ================================================ test_that("encoding conversion works", { skip_on_os("windows") x <- rawToChar(as.raw(177)) expect_equal(str_conv(x, "latin1"), "±") }) test_that("check encoding argument", { expect_snapshot(str_conv("A", c("ISO-8859-1", "ISO-8859-2")), error = TRUE) }) test_that("str_conv() preserves names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_conv(x, "UTF-8")), names(x)) }) ================================================ FILE: tests/testthat/test-count.R ================================================ test_that("counts are as expected", { fruit <- c("apple", "banana", "pear", "pineapple") expect_equal(str_count(fruit, "a"), c(1, 3, 1, 1)) expect_equal(str_count(fruit, "p"), c(2, 0, 1, 3)) expect_equal(str_count(fruit, "e"), c(1, 0, 1, 2)) expect_equal(str_count(fruit, c("a", "b", "p", "n")), c(1, 1, 1, 1)) }) test_that("uses tidyverse recycling rules", { expect_error(str_count(1:2, 1:3), class = "vctrs_error_incompatible_size") }) test_that("can use fixed() and coll()", { expect_equal(str_count("x.", fixed(".")), 1) expect_equal(str_count("\u0131", turkish_I()), 1) }) test_that("can count boundaries", { # str_count(x, boundary()) == lengths(str_split(x, boundary())) expect_equal(str_count("a b c", ""), 5) expect_equal(str_count("a b c", boundary("word")), 3) }) test_that("str_count() preserves names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_count(x, ".")), names(x)) }) test_that("str_count() drops names when pattern is vector and string is scalar", { x1 <- c(A = "ab") p2 <- c("a", "b") expect_null(names(str_count(x1, p2))) }) test_that("str_count() preserves names when pattern and string have same length", { x2 <- c(A = "ab", B = "cd") p2 <- c("a", "c") expect_equal(names(str_count(x2, p2)), names(x2)) }) ================================================ FILE: tests/testthat/test-detect.R 
================================================ test_that("special cases are correct", { expect_equal(str_detect(NA, "x"), NA) expect_equal(str_detect(character(), "x"), logical()) }) test_that("vectorised patterns work", { expect_equal(str_detect("ab", c("a", "b", "c")), c(T, T, F)) expect_equal(str_detect(c("ca", "ab"), c("a", "c")), c(T, F)) # negation works expect_equal(str_detect("ab", c("a", "b", "c"), negate = TRUE), c(F, F, T)) }) test_that("str_starts() and str_ends() match expected strings", { expect_equal(str_starts(c("ab", "ba"), "a"), c(TRUE, FALSE)) expect_equal(str_ends(c("ab", "ba"), "a"), c(FALSE, TRUE)) # negation expect_equal(str_starts(c("ab", "ba"), "a", negate = TRUE), c(FALSE, TRUE)) expect_equal(str_ends(c("ab", "ba"), "a", negate = TRUE), c(TRUE, FALSE)) # correct precedence expect_equal(str_starts(c("ab", "ba", "cb"), "a|b"), c(TRUE, TRUE, FALSE)) expect_equal(str_ends(c("ab", "ba", "bc"), "a|b"), c(TRUE, TRUE, FALSE)) }) test_that("can use fixed() and coll()", { expect_equal(str_detect("X", fixed(".")), FALSE) expect_equal(str_starts("X", fixed(".")), FALSE) expect_equal(str_ends("X", fixed(".")), FALSE) expect_equal(str_detect("\u0131", turkish_I()), TRUE) expect_equal(str_starts("\u0131", turkish_I()), TRUE) expect_equal(str_ends("\u0131", turkish_I()), TRUE) }) test_that("can't empty/boundary", { expect_snapshot(error = TRUE, { str_detect("x", "") str_starts("x", "") str_ends("x", "") }) }) test_that("functions use tidyverse recycling rules", { expect_snapshot(error = TRUE, { str_detect(1:2, 1:3) str_starts(1:2, 1:3) str_ends(1:2, 1:3) str_like(1:2, c("a", "b", "c")) }) }) test_that("detection functions preserve names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_detect(x, "[123]")), names(x)) expect_equal(names(str_starts(x, "1")), names(x)) expect_equal(names(str_ends(x, "1")), names(x)) expect_equal(names(str_like(x, "%")), names(x)) expect_equal(names(str_ilike(x, "%")), names(x)) }) test_that("detection drops names 
when pattern is vector and string is scalar", { x1 <- c(A = "ab") p2 <- c("a", "b") expect_null(names(str_detect(x1, p2))) expect_null(names(str_starts(x1, p2))) expect_null(names(str_ends(x1, p2))) expect_null(names(str_like(x1, p2))) expect_null(names(str_ilike(x1, p2))) }) test_that("detection preserves names when pattern and string have same length", { x2 <- c(A = "ab", B = "cd") p2 <- c("a", "c") expect_equal(names(str_detect(x2, p2)), names(x2)) expect_equal(names(str_starts(x2, p2)), names(x2)) expect_equal(names(str_ends(x2, p2)), names(x2)) expect_equal(names(str_like(x2, p2)), names(x2)) expect_equal(names(str_ilike(x2, p2)), names(x2)) }) # str_like ---------------------------------------------------------------- test_that("str_like is case sensitive", { expect_true(str_like("abc", "ab%")) expect_false(str_like("abc", "AB%")) expect_snapshot(str_like("abc", regex("x")), error = TRUE) }) test_that("ignore_case is deprecated but still respected", { expect_snapshot(out <- str_like("abc", "AB%", ignore_case = TRUE)) expect_equal(out, TRUE) expect_warning(out <- str_like("abc", "AB%", ignore_case = FALSE)) expect_equal(out, FALSE) }) test_that("str_ilike works", { expect_true(str_ilike("abc", "ab%")) expect_true(str_ilike("abc", "AB%")) expect_snapshot(str_ilike("abc", regex("x")), error = TRUE) }) test_that("like_to_regex generates expected regexps", { expect_equal(like_to_regex("ab%"), "^ab.*$") expect_equal(like_to_regex("ab_"), "^ab.$") # escaping expect_equal(like_to_regex("ab\\%"), "^ab\\%$") expect_equal(like_to_regex("ab[%]"), "^ab[%]$") }) ================================================ FILE: tests/testthat/test-dup.R ================================================ test_that("basic duplication works", { expect_equal(str_dup("a", 3), "aaa") expect_equal(str_dup("abc", 2), "abcabc") expect_equal(str_dup(c("a", "b"), 2), c("aa", "bb")) expect_equal(str_dup(c("a", "b"), c(2, 3)), c("aa", "bbb")) }) test_that("0 duplicates equals empty string", { 
expect_equal(str_dup("a", 0), "") expect_equal(str_dup(c("a", "b"), 0), rep("", 2)) }) test_that("uses tidyverse recycling rules", { expect_error(str_dup(1:2, 1:3), class = "vctrs_error_incompatible_size") }) test_that("uses sep argument", { expect_equal(str_dup("abc", 1, sep = "-"), "abc") expect_equal(str_dup("abc", 2, sep = "-"), "abc-abc") expect_equal(str_dup(c("a", "b"), 2, sep = "-"), c("a-a", "b-b")) expect_equal(str_dup(c("a", "b"), c(1, 2), sep = "-"), c("a", "b-b")) expect_equal(str_dup(character(), 1, sep = "-"), character()) expect_equal(str_dup(character(), 2, sep = "-"), character()) }) test_that("separator must be a single string", { expect_snapshot(error = TRUE, { str_dup("a", 3, sep = 1) str_dup("a", 3, sep = c("-", ";")) }) }) test_that("str_dup() preserves names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_dup(x, 2)), names(x)) }) ================================================ FILE: tests/testthat/test-equal.R ================================================ test_that("vectorised using TRR", { expect_equal(str_equal("a", character()), logical()) expect_equal(str_equal("a", "b"), FALSE) expect_equal(str_equal("a", c("a", "b")), c(TRUE, FALSE)) expect_snapshot(str_equal(letters[1:3], c("a", "b")), error = TRUE) }) test_that("can ignore case", { expect_equal(str_equal("a", "A"), FALSE) expect_equal(str_equal("a", "A", ignore_case = TRUE), TRUE) }) ================================================ FILE: tests/testthat/test-escape.R ================================================ test_that("str_escape() escapes all regex metacharacters", { expect_equal( str_escape(".^$|*+?{}[]()"), "\\.\\^\\$\\|\\*\\+\\?\\{\\}\\[\\]\\(\\)" ) expect_equal(str_escape("\\"), "\\\\") }) test_that("str_escape() preserves names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_escape(x)), names(x)) }) ================================================ FILE: tests/testthat/test-extract.R ================================================ test_that("single pattern extracted
correctly", { test <- c("one two three", "a b c") expect_equal( str_extract_all(test, "[a-z]+"), list(c("one", "two", "three"), c("a", "b", "c")) ) expect_equal( str_extract_all(test, "[a-z]{3,}"), list(c("one", "two", "three"), character()) ) }) test_that("uses tidyverse recycling rules", { expect_error( str_extract(c("a", "b"), c("a", "b", "c")), class = "vctrs_error_incompatible_size" ) expect_error( str_extract_all(c("a", "b"), c("a", "b", "c")), class = "vctrs_error_incompatible_size" ) }) test_that("no match yields empty vector", { expect_equal(str_extract_all("a", "b")[[1]], character()) }) test_that("str_extract extracts first match if found, NA otherwise", { shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2") word_1_to_4 <- str_extract(shopping_list, "\\b[a-z]{1,4}\\b") expect_length(word_1_to_4, length(shopping_list)) expect_equal(word_1_to_4[1], NA_character_) }) test_that("can extract a group", { expect_equal(str_extract("abc", "(.).(.)", group = 1), "a") expect_equal(str_extract("abc", "(.).(.)", group = 2), "c") }) test_that("can use fixed() and coll()", { expect_equal(str_extract("x.x", fixed(".")), ".") expect_equal(str_extract_all("x.x.", fixed(".")), list(c(".", "."))) expect_equal(str_extract("\u0131", turkish_I()), "\u0131") expect_equal(str_extract_all("\u0131I", turkish_I()), list(c("\u0131", "I"))) }) test_that("can extract boundaries", { expect_equal(str_extract("a b c", ""), "a") expect_equal( str_extract_all("a b c", ""), list(c("a", " ", "b", " ", "c")) ) expect_equal(str_extract("a b c", boundary("word")), "a") expect_equal( str_extract_all("a b c", boundary("word")), list(c("a", "b", "c")) ) }) test_that("str_extract() preserves names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_extract(x, "[0-9]")), names(x)) }) test_that("str_extract_all() preserves names on outer structure", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_extract_all(x, "[0-9]")), names(x)) }) test_that("str_extract 
and extract_all handle vectorised patterns and names", { x1 <- c(A = "ab") p2 <- c("a", "b") expect_null(names(str_extract(x1, p2))) expect_null(names(str_extract_all(x1, p2))) x2 <- c(A = "ab", B = "cd") expect_equal(names(str_extract(x2, p2)), names(x2)) expect_equal(names(str_extract_all(x2, p2)), names(x2)) }) ================================================ FILE: tests/testthat/test-flatten.R ================================================ test_that("equivalent to paste with collapse", { expect_equal(str_flatten(letters), paste0(letters, collapse = "")) }) test_that("collapse must be single string", { expect_snapshot(str_flatten("A", c("a", "b")), error = TRUE) }) test_that("last optionally used instead of final separator", { expect_equal(str_flatten(letters[1:3], ", ", ", and "), "a, b, and c") expect_equal(str_flatten(letters[1:2], ", ", ", and "), "a, and b") expect_equal(str_flatten(letters[1], ", ", ", and "), "a") }) test_that("can remove missing values", { expect_equal(str_flatten(c("a", NA)), NA_character_) expect_equal(str_flatten(c("a", NA), na.rm = TRUE), "a") }) test_that("str_flatten_comma removes comma when unnecessary", { expect_equal(str_flatten_comma(letters[1:2], ", or "), "a or b") expect_equal(str_flatten_comma(letters[1:3], ", or "), "a, b, or c") expect_equal(str_flatten_comma(letters[1:3], " or "), "a, b or c") expect_equal(str_flatten_comma(letters[1:3]), "a, b, c") }) ================================================ FILE: tests/testthat/test-glue.R ================================================ test_that("verify wrapper is functional", { expect_equal(as.character(str_glue("a {b}", b = "b")), "a b") df <- data.frame(b = "b") expect_equal(as.character(str_glue_data(df, "a {b}", b = "b")), "a b") }) test_that("verify trim is functional", { expect_equal(as.character(str_glue("L1\t \n \tL2")), "L1\t \nL2") expect_equal( as.character(str_glue("L1\t \n \tL2", .trim = FALSE)), "L1\t \n \tL2" ) }) 
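A minimal interactive sketch of the `str_glue()` behaviour exercised in the tests above (assuming stringr is attached; `as.character()` drops the glue class, exactly as the tests do):

```r
library(stringr)

# str_glue() interpolates {expressions}; the result carries the "glue"
# class, so comparisons go through as.character()
out <- str_glue("a {b}", b = "b")
stopifnot(as.character(out) == "a b")

# .trim = TRUE (the default) strips leading line indentation;
# .trim = FALSE keeps the template verbatim
stopifnot(as.character(str_glue("L1\t \n \tL2")) == "L1\t \nL2")
stopifnot(as.character(str_glue("L1\t \n \tL2", .trim = FALSE)) == "L1\t \n \tL2")
```

All expected values mirror the expectations in test-glue.R.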
================================================ FILE: tests/testthat/test-interp.R ================================================ test_that("str_interp works with default env", { subject <- "statistics" number <- 7 floating <- 6.656 expect_equal( str_interp("A ${subject}. B $[d]{number}. C $[.2f]{floating}."), "A statistics. B 7. C 6.66." ) expect_equal( str_interp("Pi is approximately $[.5f]{pi}"), "Pi is approximately 3.14159" ) }) test_that("str_interp works with lists and data frames", { expect_equal( str_interp( "One value, ${value1}, and then another, ${value2*2}.", list(value1 = 10, value2 = 20) ), "One value, 10, and then another, 40." ) expect_equal( str_interp( "Values are $[.2f]{max(Sepal.Width)} and $[.2f]{min(Sepal.Width)}.", iris ), "Values are 4.40 and 2.00." ) }) test_that("str_interp works with nested expressions", { amount <- 1337 expect_equal( str_interp("Works with } nested { braces too: $[.2f]{{{2 + 2}*{amount}}}"), "Works with } nested { braces too: 5348.00" ) }) test_that("str_interp works in the absence of placeholders", { expect_equal( str_interp("A quite static string here."), "A quite static string here."
) }) test_that("str_interp fails when encountering nested placeholders", { msg <- "This will never see the light of day" num <- 1.2345 expect_snapshot(error = TRUE, { str_interp("${${msg}}") str_interp("$[.2f]{${msg}}") }) }) test_that("str_interp fails when input is not a character string", { expect_snapshot(str_interp(3L), error = TRUE) }) test_that("str_interp wraps parsing errors", { expect_snapshot(str_interp("This is a ${1 +}"), error = TRUE) }) test_that("str_interp formats list independently of other placeholders", { a_list <- c("item1", "item2", "item3") other <- "1" extract <- function(text) regmatches(text, regexpr("xx[^x]+xx", text)) from_list <- extract(str_interp("list: xx${a_list}xx")) from_both <- extract(str_interp("list: xx${a_list}xx, and another ${other}")) expect_equal(from_list, from_both) }) ================================================ FILE: tests/testthat/test-length.R ================================================ test_that("str_length is number of characters", { expect_equal(str_length("a"), 1) expect_equal(str_length("ab"), 2) expect_equal(str_length("abc"), 3) }) test_that("str_length of missing string is missing", { expect_equal(str_length(NA), NA_integer_) expect_equal(str_length(c(NA, 1)), c(NA, 1)) expect_equal(str_length("NA"), 2) }) test_that("str_length of factor is length of level", { expect_equal(str_length(factor("a")), 1) expect_equal(str_length(factor("ab")), 2) expect_equal(str_length(factor("abc")), 3) }) test_that("str_width returns display width", { x <- c("\u0308", "x", "\U0001f60a") expect_equal(str_width(x), c(0, 1, 2)) }) test_that("length/width preserve names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_length(x)), names(x)) expect_equal(names(str_width(x)), names(x)) }) ================================================ FILE: tests/testthat/test-locate.R ================================================ test_that("basic location matching works", { expect_equal(str_locate("abc", "a")[1, ], c(start = 
1, end = 1)) expect_equal(str_locate("abc", "b")[1, ], c(start = 2, end = 2)) expect_equal(str_locate("abc", "c")[1, ], c(start = 3, end = 3)) expect_equal(str_locate("abc", ".+")[1, ], c(start = 1, end = 3)) }) test_that("uses tidyverse recycling rules", { expect_error(str_locate(1:2, 1:3), class = "vctrs_error_incompatible_size") expect_error( str_locate_all(1:2, 1:3), class = "vctrs_error_incompatible_size" ) }) test_that("locations are integers", { strings <- c("a b c", "d e f") expect_true(is.integer(str_locate(strings, "[a-z]"))) res <- str_locate_all(strings, "[a-z]")[[1]] expect_true(is.integer(res)) expect_true(is.integer(invert_match(res))) }) test_that("both string and patterns are vectorised", { strings <- c("abc", "def") locs <- str_locate(strings, "a") expect_equal(locs[, "start"], c(1, NA)) locs <- str_locate(strings, c("a", "d")) expect_equal(locs[, "start"], c(1, 1)) expect_equal(locs[, "end"], c(1, 1)) locs <- str_locate_all(c("abab"), c("a", "b")) expect_equal(locs[[1]][, "start"], c(1, 3)) expect_equal(locs[[2]][, "start"], c(2, 4)) }) test_that("can use fixed() and coll()", { expect_equal(str_locate("x.x", fixed(".")), cbind(start = 2, end = 2)) expect_equal( str_locate_all("x.x.", fixed(".")), list(cbind(start = c(2, 4), end = c(2, 4))) ) expect_equal(str_locate("\u0131", turkish_I()), cbind(start = 1, end = 1)) expect_equal( str_locate_all("\u0131I", turkish_I()), list(cbind(start = 1:2, end = 1:2)) ) }) test_that("can use boundaries", { expect_equal( str_locate(" x y", ""), cbind(start = 1, end = 1) ) expect_equal( str_locate_all("abc", ""), list(cbind(start = 1:3, end = 1:3)) ) expect_equal( str_locate(" xy", boundary("word")), cbind(start = 2, end = 3) ) expect_equal( str_locate_all(" ab cd", boundary("word")), list(cbind(start = c(2, 6), end = c(3, 7))) ) }) test_that("str_locate() preserves row names when 1:1 with input", { x <- c(C = "3", B = "2", A = "1") expect_equal(rownames(str_locate(x, "[0-9]")), names(x)) }) 
test_that("str_locate_all() preserves names on outer structure", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_locate_all(x, "[0-9]")), names(x)) }) test_that("locate handles vectorised patterns and names", { x1 <- c(A = "ab") p2 <- c("a", "b") expect_null(rownames(str_locate(x1, p2))) expect_null(names(str_locate_all(x1, p2))) x2 <- c(A = "ab", B = "cd") expect_equal(rownames(str_locate(x2, p2)), names(x2)) expect_equal(names(str_locate_all(x2, p2)), names(x2)) }) ================================================ FILE: tests/testthat/test-match.R ================================================ set.seed(1410) num <- matrix(sample(9, 10 * 10, replace = T), ncol = 10) num_flat <- apply(num, 1, str_c, collapse = "") phones <- str_c( "(", num[, 1], num[, 2], num[, 3], ") ", num[, 4], num[, 5], num[, 6], " ", num[, 7], num[, 8], num[, 9], num[, 10] ) test_that("empty strings return correct matrix of correct size", { skip_if_not_installed("stringi", "1.2.2") expect_equal(str_match(NA, "(a)"), matrix(NA_character_, 1, 2)) expect_equal(str_match(character(), "(a)"), matrix(character(), 0, 2)) }) test_that("no matching cases returns 1 column matrix", { res <- str_match(c("a", "b"), ".") expect_equal(nrow(res), 2) expect_equal(ncol(res), 1) expect_equal(res[, 1], c("a", "b")) }) test_that("single match works when all match", { matches <- str_match(phones, "\\(([0-9]{3})\\) ([0-9]{3}) ([0-9]{4})") expect_equal(nrow(matches), length(phones)) expect_equal(ncol(matches), 4) expect_equal(matches[, 1], phones) matches_flat <- apply(matches[, -1], 1, str_c, collapse = "") expect_equal(matches_flat, num_flat) }) test_that("match returns NA when some inputs don't match", { matches <- str_match( c(phones, "blah", NA), "\\(([0-9]{3})\\) ([0-9]{3}) ([0-9]{4})" ) expect_equal(nrow(matches), length(phones) + 2) expect_equal(ncol(matches), 4) expect_equal(matches[11, ], rep(NA_character_, 4)) expect_equal(matches[12, ], rep(NA_character_, 4)) }) test_that("match returns NA 
when optional group doesn't match", { expect_equal(str_match(c("ab", "a"), "(a)(b)?")[, 3], c("b", NA)) }) test_that("match_all returns NA when optional group doesn't match", { expect_equal(str_match_all("a", "(a)(b)?")[[1]][1, ], c("a", "a", NA)) }) test_that("multiple match works", { phones_one <- str_c(phones, collapse = " ") multi_match <- str_match_all( phones_one, "\\(([0-9]{3})\\) ([0-9]{3}) ([0-9]{4})" ) single_matches <- str_match(phones, "\\(([0-9]{3})\\) ([0-9]{3}) ([0-9]{4})") expect_equal(multi_match[[1]], single_matches) }) test_that("match and match_all fail when pattern is not a regex", { expect_snapshot(error = TRUE, { str_match(phones, fixed("3")) str_match_all(phones, coll("9")) }) }) test_that("uses tidyverse recycling rules", { expect_error( str_match(c("a", "b"), c("a", "b", "c")), class = "vctrs_error_incompatible_size" ) expect_error( str_match_all(c("a", "b"), c("a", "b", "c")), class = "vctrs_error_incompatible_size" ) }) test_that("match can't use other modifiers", { expect_snapshot(error = TRUE, { str_match("x", coll("y")) str_match_all("x", coll("y")) }) }) test_that("str_match() preserves row names when 1:1 with input", { x <- c(C = "3", B = "2", A = "1") expect_equal(rownames(str_match(x, "([0-9])")), names(x)) }) test_that("str_match_all() preserves names on outer structure", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_match_all(x, "([0-9])")), names(x)) }) test_that("match handles vectorised patterns and names", { x1 <- c(A = "ab") p2 <- c("a", "b") expect_null(rownames(str_match(x1, p2))) expect_null(names(str_match_all(x1, p2))) x2 <- c(A = "ab", B = "cd") expect_equal(rownames(str_match(x2, p2)), names(x2)) expect_equal(names(str_match_all(x2, p2)), names(x2)) }) ================================================ FILE: tests/testthat/test-modifiers.R ================================================ test_that("patterns coerced to character", { x <- factor("a") expect_snapshot({ . <- regex(x) . <- coll(x) . 
<- fixed(x) }) }) test_that("useful error message for bad type", { expect_snapshot(error = TRUE, { type(1:3) }) }) test_that("fallback for regex (#433)", { expect_equal(type(structure("x", class = "regex")), "regex") }) test_that("ignore_case sets strength, but can override manually", { x1 <- coll("x", strength = 1) x2 <- coll("x", ignore_case = TRUE) x3 <- coll("x") expect_equal(attr(x1, "options")$strength, 1) expect_equal(attr(x2, "options")$strength, 2) expect_equal(attr(x3, "options")$strength, 3) }) test_that("boundary has length 1", { expect_length(boundary(), 1) }) test_that("subsetting preserves class and options", { x <- regex("a", multiline = TRUE) expect_equal(x[], x) }) test_that("useful errors for NAs", { expect_snapshot(error = TRUE, { type(NA) type(c("a", "b", NA_character_, "c")) }) }) test_that("stringr_pattern methods", { ex <- coll(c("foo", "bar")) expect_true(inherits(ex[1], "stringr_pattern")) expect_true(inherits(ex[[1]], "stringr_pattern")) }) ================================================ FILE: tests/testthat/test-pad.R ================================================ test_that("long strings are unchanged", { lengths <- sample(40:100, 10) strings <- vapply( lengths, function(x) { str_c(letters[sample(26, x, replace = TRUE)], collapse = "") }, character(1) ) padded <- str_pad(strings, width = 30) expect_equal(str_length(padded), str_length(strings)) }) test_that("directions work for simple case", { pad <- function(direction) str_pad("had", direction, width = 10) expect_equal(pad("right"), "had       ") expect_equal(pad("left"), "       had") expect_equal(pad("both"), "   had    ") }) test_that("padding based on width works", { # \u4e2d is a 2-character-wide Chinese character pad <- function(...) 
str_pad("\u4e2d", ..., side = "both") expect_equal(pad(width = 6), "  \u4e2d  ") expect_equal(pad(width = 5, use_width = FALSE), "  \u4e2d  ") }) test_that("uses tidyverse recycling rules", { expect_error( str_pad(c("a", "b"), 1:3), class = "vctrs_error_incompatible_size" ) expect_error( str_pad(c("a", "b"), 10, pad = c("a", "b", "c")), class = "vctrs_error_incompatible_size" ) }) test_that("str_pad() preserves names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_pad(x, 2, side = "left")), names(x)) }) ================================================ FILE: tests/testthat/test-remove.R ================================================ test_that("successfully wraps str_replace_all", { expect_equal(str_remove_all("abababa", "ba"), "a") expect_equal(str_remove("abababa", "ba"), "ababa") }) test_that("str_remove() preserves names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_remove(x, "[0-9]")), names(x)) }) ================================================ FILE: tests/testthat/test-replace.R ================================================ test_that("basic replacement works", { expect_equal(str_replace_all("abababa", "ba", "BA"), "aBABABA") expect_equal(str_replace("abababa", "ba", "BA"), "aBAbaba") }) test_that("can replace multiple matches", { x <- c("a1", "b2") y <- str_replace_all(x, c("a" = "1", "b" = "2")) expect_equal(y, c("11", "22")) }) test_that("even when lengths differ", { x <- c("a1", "b2", "c3") y <- str_replace_all(x, c("a" = "1", "b" = "2")) expect_equal(y, c("11", "22", "c3")) }) test_that("multiple matches respects class", { x <- c("x", "y") y <- str_replace_all(x, regex(c("X" = "a"), ignore_case = TRUE)) expect_equal(y, c("a", "y")) }) test_that("replacement must be a string", { expect_snapshot(str_replace("x", "x", 1), error = TRUE) }) test_that("replacement can be NA", { expect_equal(str_replace("xyz", "x", NA_character_), NA_character_) }) test_that("can replace all types of NA values", { 
expect_equal(str_replace_na(NA), "NA") expect_equal(str_replace_na(NA_character_), "NA") expect_equal(str_replace_na(NA_complex_), "NA") expect_equal(str_replace_na(NA_integer_), "NA") expect_equal(str_replace_na(NA_real_), "NA") }) test_that("can use fixed() and coll()", { expect_equal(str_replace("x.x", fixed("."), "Y"), "xYx") expect_equal(str_replace_all("x.x.", fixed("."), "Y"), "xYxY") expect_equal(str_replace("\u0131", turkish_I(), "Y"), "Y") expect_equal(str_replace_all("\u0131I", turkish_I(), "Y"), "YY") }) test_that("can't replace empty/boundary", { expect_snapshot(error = TRUE, { str_replace("x", "", "") str_replace("x", boundary("word"), "") str_replace_all("x", "", "") str_replace_all("x", boundary("word"), "") }) }) # functions --------------------------------------------------------------- test_that("can replace multiple values", { expect_equal(str_replace("abc", "a|c", toupper), "Abc") expect_equal(str_replace_all("abc", "a|c", toupper), "AbC") }) test_that("can use formula", { expect_equal(str_replace("abc", "b", ~"x"), "axc") expect_equal(str_replace_all("abc", "b", ~"x"), "axc") }) test_that("replacement can be different length", { double <- function(x) str_dup(x, 2) expect_equal(str_replace_all("abc", "a|c", double), "aabcc") }) test_that("replacement is vectorised", { x <- c("", "a", "b", "ab", "abc", "cba") expect_equal( str_replace_all(x, "a|c", ~ toupper(str_dup(.x, 2))), c("", "AA", "b", "AAb", "AAbCC", "CCbAA") ) }) test_that("is forgiving of 0 matches with paste", { x <- c("a", "b", "c") expect_equal(str_replace_all(x, "d", ~ paste("x", .x)), x) }) test_that("useful error if not vectorised correctly", { x <- c("a", "b", "c") expect_snapshot( str_replace_all(x, "a|c", ~ if (length(x) > 1) stop("Bad")), error = TRUE ) }) test_that("works with no match", { expect_equal(str_replace("abc", "z", toupper), "abc") }) test_that("works with zero length match", { expect_equal(str_replace("abc", "$", toupper), "abc") 
expect_equal(str_replace_all("abc", "$|^", ~ rep("X", length(.x))), "XabcX") }) test_that("replacement function must return correct type/length", { expect_snapshot(error = TRUE, { str_replace_all("x", "x", ~1) str_replace_all("x", "x", ~ c("a", "b")) }) }) # fix_replacement --------------------------------------------------------- test_that("backrefs are correctly translated", { expect_equal(str_replace_all("abcde", "(b)(c)(d)", "\\1"), "abe") expect_equal(str_replace_all("abcde", "(b)(c)(d)", "\\2"), "ace") expect_equal(str_replace_all("abcde", "(b)(c)(d)", "\\3"), "ade") # gsub("(b)(c)(d)", "\\0", "abcde", perl=TRUE) gives a0e, # in ICU regex $0 refers to the whole pattern match expect_equal(str_replace_all("abcde", "(b)(c)(d)", "\\0"), "abcde") # gsub("(b)(c)(d)", "\\4", "abcde", perl=TRUE) is legal, # in ICU regex this gives an U_INDEX_OUTOFBOUNDS_ERROR expect_snapshot(str_replace_all("abcde", "(b)(c)(d)", "\\4"), error = TRUE) expect_equal(str_replace_all("abcde", "bcd", "\\\\1"), "a\\1e") expect_equal(str_replace_all("a!1!2!b", "!", "$"), "a$1$2$b") expect_equal(str_replace("aba", "b", "$"), "a$a") expect_equal(str_replace("aba", "b", "$$$"), "a$$$a") expect_equal(str_replace("aba", "(b)", "\\1$\\1$\\1"), "ab$b$ba") expect_equal(str_replace("aba", "(b)", "\\1$\\\\1$\\1"), "ab$\\1$ba") expect_equal(str_replace("aba", "(b)", "\\\\1$\\1$\\\\1"), "a\\1$b$\\1a") }) test_that("$ are escaped", { expect_equal(fix_replacement("$"), "\\$") expect_equal(fix_replacement("\\$"), "\\\\$") }) test_that("\1 converted to $1 etc", { expect_equal(fix_replacement("\\1"), "$1") expect_equal(fix_replacement("\\9"), "$9") }) test_that("\\1 left as is", { expect_equal(fix_replacement("\\\\1"), "\\\\1") }) test_that("replace functions preserve names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_replace(x, "[0-9]", "x")), names(x)) expect_equal(names(str_replace_all(x, "[0-9]", "x")), names(x)) }) test_that("replace functions handle vectorised patterns and names", { x1 
<- c(A = "ab") p2 <- c("a", "b") expect_null(names(str_replace(x1, p2, "x"))) expect_null(names(str_replace_all(x1, p2, "x"))) x2 <- c(A = "ab", B = "cd") expect_equal(names(str_replace(x2, p2, "x")), names(x2)) expect_equal(names(str_replace_all(x2, p2, "x")), names(x2)) }) test_that("str_replace_na() preserves names", { y <- c(A = NA, B = "x") expect_equal(names(str_replace_na(y)), names(y)) }) ================================================ FILE: tests/testthat/test-sort.R ================================================ test_that("digits can be sorted/ordered as strings or numbers", { x <- c("2", "1", "10") expect_equal(str_sort(x, numeric = FALSE), c("1", "10", "2")) expect_equal(str_sort(x, numeric = TRUE), c("1", "2", "10")) expect_equal(str_order(x, numeric = FALSE), c(2, 3, 1)) expect_equal(str_order(x, numeric = TRUE), c(2, 1, 3)) expect_equal(str_rank(x, numeric = FALSE), c(3, 1, 2)) expect_equal(str_rank(x, numeric = TRUE), c(2, 1, 3)) }) test_that("NA can be at beginning or end", { x <- c("2", "1", NA, "10") na_end <- str_sort(x, numeric = TRUE, na_last = TRUE) expect_equal(tail(na_end, 1), NA_character_) na_start <- str_sort(x, numeric = TRUE, na_last = FALSE) expect_equal(head(na_start, 1), NA_character_) }) test_that("str_sort() preserves names", { x <- c(C = "3", B = "2", A = "1") out <- str_sort(x) expect_equal(names(out), c("A", "B", "C")) }) ================================================ FILE: tests/testthat/test-split.R ================================================ test_that("special cases are correct", { expect_equal(str_split(NA, "")[[1]], NA_character_) expect_equal(str_split(character(), ""), list()) }) test_that("str_split functions as expected", { expect_equal( str_split(c("bab", "cac", "dadad"), "a"), list(c("b", "b"), c("c", "c"), c("d", "d", "d")) ) }) test_that("str_split() can split by special patterns", { expect_equal(str_split("ab", ""), list(c("a", "b"))) expect_equal( str_split("this that.", boundary("word")), 
list(c("this", "that")) ) expect_equal(str_split("a-b", fixed("-")), list(c("a", "b"))) expect_equal( str_split("aXb", coll("X", ignore_case = TRUE)), list(c("a", "b")) ) }) test_that("boundary() can be recycled", { expect_equal(str_split(c("x", "y"), boundary()), list("x", "y")) }) test_that("str_split() can control maximum number of splits", { expect_equal( str_split(c("a", "a-b"), n = 1, "-"), list("a", "a-b") ) expect_equal( str_split(c("a", "a-b"), n = 3, "-"), list("a", c("a", "b")) ) }) test_that("str_split() checks its inputs", { expect_snapshot(error = TRUE, { str_split(letters[1:3], letters[1:2]) str_split("x", 1) str_split("x", "x", n = 0) }) }) test_that("str_split_1 takes string and returns character vector", { expect_equal(str_split_1("abc", ""), c("a", "b", "c")) expect_snapshot_error(str_split_1(letters, "")) }) test_that("str_split_fixed pads with empty string", { expect_equal( str_split_fixed(c("a", "a-b"), "-", 1), cbind(c("a", "a-b")) ) expect_equal( str_split_fixed(c("a", "a-b"), "-", 2), cbind(c("a", "a"), c("", "b")) ) expect_equal( str_split_fixed(c("a", "a-b"), "-", 3), cbind(c("a", "a"), c("", "b"), c("", "")) ) }) test_that("str_split_fixed checks its inputs", { expect_snapshot(str_split_fixed("x", "x", 0), error = TRUE) }) # str_split_i ------------------------------------------------------------- test_that("str_split_i can extract from LHS or RHS", { expect_equal(str_split_i(c("1-2-3", "4-5"), "-", 1), c("1", "4")) expect_equal(str_split_i(c("1-2-3", "4-5"), "-", -1), c("3", "5")) }) test_that("str_split_i returns NA for absent components", { expect_equal(str_split_i(c("a", "b-c"), "-", 2), c(NA, "c")) expect_equal(str_split_i(c("a", "b-c"), "-", 3), c(NA_character_, NA)) expect_equal(str_split_i(c("1-2-3", "4-5"), "-", -3), c("1", NA)) expect_equal(str_split_i(c("1-2-3", "4-5"), "-", -4), c(NA_character_, NA)) }) test_that("str_split_i checks its inputs", { expect_snapshot(error = TRUE, { str_split_i("x", "x", 0) str_split_i("x", "x", 
0.5) }) }) test_that("split functions preserve names on outer structures", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_split(x, "")), names(x)) expect_equal(rownames(str_split(x, "", simplify = TRUE)), names(x)) expect_equal(rownames(str_split_fixed(x, "", 1)), names(x)) }) test_that("str_split_i() preserves names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_split_i(x, " ", 1)), names(x)) }) test_that("split handles vectorised patterns and names", { x1 <- c(A = "ab") p2 <- c("a", "b") expect_null(names(str_split(x1, p2))) expect_null(rownames(str_split(x1, p2, simplify = TRUE))) expect_null(rownames(str_split_fixed(x1, p2, 1))) x2 <- c(A = "ab", B = "cd") expect_equal(names(str_split(x2, p2)), names(x2)) expect_equal(rownames(str_split(x2, p2, simplify = TRUE)), names(x2)) expect_equal(rownames(str_split_fixed(x2, p2, 1)), names(x2)) }) ================================================ FILE: tests/testthat/test-sub.R ================================================ test_that("correct substring extracted", { alphabet <- str_c(letters, collapse = "") expect_equal(str_sub(alphabet, 1, 3), "abc") expect_equal(str_sub(alphabet, 24, 26), "xyz") }) test_that("can extract multiple substrings", { expect_equal( str_sub_all(c("abc", "def"), list(c(1, 2), 1), list(c(1, 2), 2)), list(c("a", "b"), "de") ) }) test_that("arguments expanded to longest", { alphabet <- str_c(letters, collapse = "") expect_equal( str_sub(alphabet, c(1, 24), c(3, 26)), c("abc", "xyz") ) expect_equal( str_sub(c("abc", "xyz"), 2, 2), c("b", "y") ) }) test_that("can supply start and end/length as a matrix", { x <- c("abc", "def") expect_equal(str_sub(x, cbind(1, end = 1)), c("a", "d")) expect_equal(str_sub(x, cbind(1, length = 2)), c("ab", "de")) expect_equal( str_sub_all(x, cbind(c(1, 2), end = c(2, 3))), list(c("ab", "bc"), c("de", "ef")) ) str_sub(x, cbind(1, end = 1)) <- c("A", "D") expect_equal(x, c("Abc", "Def")) }) test_that("specifying only end subsets from start", { 
alphabet <- str_c(letters, collapse = "") expect_equal(str_sub(alphabet, end = 3), "abc") }) test_that("specifying only start subsets to end", { alphabet <- str_c(letters, collapse = "") expect_equal(str_sub(alphabet, 24), "xyz") }) test_that("specifying -1 as end selects entire string", { expect_equal( str_sub("ABCDEF", c(4, 5), c(5, -1)), c("DE", "EF") ) expect_equal( str_sub("ABCDEF", c(4, 5), c(-1, -1)), c("DEF", "EF") ) }) test_that("negative values select from end", { expect_equal(str_sub("ABCDEF", 1, -4), "ABC") expect_equal(str_sub("ABCDEF", -3), "DEF") }) test_that("missing arguments give missing results", { expect_equal(str_sub(NA), NA_character_) expect_equal(str_sub(NA, 1, 3), NA_character_) expect_equal(str_sub(c(NA, "NA"), 1, 3), c(NA, "NA")) expect_equal(str_sub("test", NA, NA), NA_character_) expect_equal(str_sub(c(NA, "test"), NA, NA), rep(NA_character_, 2)) }) test_that("negative length or out of range gives empty string", { expect_equal(str_sub("abc", 2, 1), "") expect_equal(str_sub("abc", 4, 5), "") }) test_that("replacement works", { x <- "BBCDEF" str_sub(x, 1, 1) <- "A" expect_equal(x, "ABCDEF") str_sub(x, -1, -1) <- "K" expect_equal(x, "ABCDEK") str_sub(x, -2, -1) <- "EFGH" expect_equal(x, "ABCDEFGH") str_sub(x, 2, -2) <- "" expect_equal(x, "AH") }) test_that("replacement with NA works", { x <- "BBCDEF" str_sub(x, NA) <- "A" expect_equal(x, NA_character_) x <- "BBCDEF" str_sub(x, NA, omit_na = TRUE) <- "A" str_sub(x, 1, 1, omit_na = TRUE) <- NA expect_equal(x, "BBCDEF") }) test_that("bad vectorisation gives informative error", { x <- "a" expect_snapshot(error = TRUE, { str_sub(x, 1:2, 1:3) str_sub(x, 1:2, 1:2) <- 1:3 }) }) test_that("str_sub() preserves names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_sub(x, 1, 1)), names(x)) }) test_that("str_sub_all() preserves names on outer structure", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_sub_all(x, 1, 1)), names(x)) }) 
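The `str_sub()` indexing conventions tested above can be summarised in a short sketch (assuming stringr is attached); every value mirrors an expectation from test-sub.R:

```r
library(stringr)

x <- "ABCDEF"
# negative positions count back from the end of the string
stopifnot(str_sub(x, 1, -4) == "ABC")   # drop the last three characters
stopifnot(str_sub(x, -3) == "DEF")      # keep the last three characters

# inverted or out-of-range positions give "", not an error
stopifnot(str_sub("abc", 2, 1) == "")
stopifnot(str_sub("abc", 4, 5) == "")

# str_sub<- replaces the selected range in place
y <- "BBCDEF"
str_sub(y, 1, 1) <- "A"
stopifnot(y == "ABCDEF")
```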
================================================ FILE: tests/testthat/test-subset.R ================================================ test_that("can subset with regexps", { x <- c("a", "b", "c") expect_equal(str_subset(x, "a|c"), c("a", "c")) expect_equal(str_subset(x, "a|c", negate = TRUE), "b") }) test_that("can subset with fixed patterns", { expect_equal(str_subset(c("i", "I"), fixed("i")), "i") expect_equal( str_subset(c("i", "I"), fixed("i", ignore_case = TRUE)), c("i", "I") ) # negation works expect_equal(str_subset(c("i", "I"), fixed("i"), negate = TRUE), "I") }) test_that("str_which is equivalent to grep", { expect_equal( str_which(head(letters), "[aeiou]"), grep("[aeiou]", head(letters)) ) # negation works expect_equal( str_which(head(letters), "[aeiou]", negate = TRUE), grep("[aeiou]", head(letters), invert = TRUE) ) }) test_that("can use fixed() and coll()", { expect_equal(str_subset(c("x", "."), fixed(".")), ".") expect_equal(str_subset(c("i", "\u0131"), turkish_I()), "\u0131") }) test_that("can't use boundaries", { expect_snapshot(error = TRUE, { str_subset(c("a", "b c"), "") str_subset(c("a", "b c"), boundary()) }) }) test_that("keep names", { fruit <- c(A = "apple", B = "banana", C = "pear", D = "pineapple") expect_identical(names(str_subset(fruit, "b")), "B") expect_identical(names(str_subset(fruit, "p")), c("A", "C", "D")) expect_identical(names(str_subset(fruit, "x")), as.character()) }) test_that("str_subset() preserves names of retained elements", { x <- c(C = "3", B = "2", A = "1") out <- str_subset(x, "[12]") expect_equal(names(out), c("B", "A")) }) test_that("str_subset() never matches missing values", { expect_equal(str_subset(c("a", NA, "b"), "."), c("a", "b")) expect_identical(str_subset(NA_character_, "."), character(0)) }) ================================================ FILE: tests/testthat/test-trim.R ================================================ test_that("trimming removes spaces", { expect_equal(str_trim("abc "), "abc") 
expect_equal(str_trim(" abc"), "abc") expect_equal(str_trim(" abc "), "abc") }) test_that("trimming removes tabs", { expect_equal(str_trim("abc\t"), "abc") expect_equal(str_trim("\tabc"), "abc") expect_equal(str_trim("\tabc\t"), "abc") }) test_that("side argument restricts trimming", { expect_equal(str_trim(" abc ", "left"), "abc ") expect_equal(str_trim(" abc ", "right"), " abc") }) test_that("str_squish removes excess spaces from all parts of string", { expect_equal(str_squish("ab\t\tc\t"), "ab c") expect_equal(str_squish("\ta bc"), "a bc") expect_equal(str_squish("\ta\t bc\t"), "a bc") }) test_that("trimming functions preserve names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_trim(x)), names(x)) }) ================================================ FILE: tests/testthat/test-trunc.R ================================================ test_that("NA values in input pass through unchanged", { expect_equal( str_trunc(NA_character_, width = 5), NA_character_ ) expect_equal( str_trunc(c("foobar", NA), 5), c("fo...", NA) ) }) test_that("truncations work for all elements of a vector", { expect_equal( str_trunc(c("abcd", "abcde", "abcdef"), width = 5), c("abcd", "abcde", "ab...") ) }) test_that("truncations work for all sides", { trunc <- function(direction, width) { str_trunc( "This string is moderately long", direction, width = width ) } expect_equal(trunc("right", 20), "This string is mo...") expect_equal(trunc("left", 20), "...s moderately long") expect_equal(trunc("center", 20), "This stri...ely long") expect_equal(trunc("right", 3), "...") expect_equal(trunc("left", 3), "...") expect_equal(trunc("center", 3), "...") expect_equal(trunc("right", 4), "T...") expect_equal(trunc("left", 4), "...g") expect_equal(trunc("center", 4), "T...") expect_equal(trunc("right", 5), "Th...") expect_equal(trunc("left", 5), "...ng") expect_equal(trunc("center", 5), "T...g") }) test_that("does not truncate to a length shorter than ellipsis", { expect_snapshot(error = TRUE, { 
str_trunc("foobar", 2) str_trunc("foobar", 3, ellipsis = "....") }) }) test_that("str_trunc correctly snips rhs-of-ellipsis for truncated strings", { trunc <- function(width, side) { str_trunc( c("", "a", "aa", "aaa", "aaaa", "aaaaaaa"), width, side, ellipsis = ".." ) } expect_equal(trunc(4, "right"), c("", "a", "aa", "aaa", "aaaa", "aa..")) expect_equal(trunc(4, "left"), c("", "a", "aa", "aaa", "aaaa", "..aa")) expect_equal(trunc(4, "center"), c("", "a", "aa", "aaa", "aaaa", "a..a")) expect_equal(trunc(3, "right"), c("", "a", "aa", "aaa", "a..", "a..")) expect_equal(trunc(3, "left"), c("", "a", "aa", "aaa", "..a", "..a")) expect_equal(trunc(3, "center"), c("", "a", "aa", "aaa", "a..", "a..")) expect_equal(trunc(2, "right"), c("", "a", "aa", "..", "..", "..")) expect_equal(trunc(2, "left"), c("", "a", "aa", "..", "..", "..")) expect_equal(trunc(2, "center"), c("", "a", "aa", "..", "..", "..")) }) test_that("str_trunc() preserves names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_trunc(x, 3)), names(x)) }) ================================================ FILE: tests/testthat/test-unique.R ================================================ test_that("unique values returned for strings with duplicate values", { expect_equal(str_unique(c("a", "a", "a")), "a") expect_equal(str_unique(c(NA_character_, NA_character_)), NA_character_) }) test_that("can ignore case", { expect_equal(str_unique(c("a", "A"), ignore_case = TRUE), "a") }) test_that("str_unique() preserves names of first occurrences", { y <- c(A = "a", A2 = "a", B = "b") out <- str_unique(y) expect_equal(names(out), c("A", "B")) }) ================================================ FILE: tests/testthat/test-utils.R ================================================ test_that("keep_names() returns logical flag based on inputs", { expect_true(keep_names("a", "x")) expect_false(keep_names("a", c("x", "y"))) expect_true(keep_names(c("a", "b"), "x")) expect_true(keep_names(c("a", "b"), c("x", "y"))) }) 
test_that("copy_names() applies names to vectors if present", { expect_equal( copy_names(c(A = "a", B = "b"), c("x", "y")), c(A = "x", B = "y") ) expect_equal( copy_names(c("a", "b"), c("x", "y")), c("x", "y") ) }) test_that("copy_names() applies rownames to matrices if present", { from <- c(A = "a", B = "b") to <- matrix(c("x", "y"), nrow = 2) expected <- to rownames(expected) <- names(from) expect_equal(copy_names(from, to), expected) expect_equal(copy_names(c("a", "b"), to), to) }) ================================================ FILE: tests/testthat/test-view.R ================================================ test_that("results are truncated", { expect_snapshot(str_view(words)) # and can control with option local_options(stringr.view_n = 5) expect_snapshot(str_view(words)) }) test_that("indices come from original vector", { expect_snapshot(str_view(letters, "a|z", match = TRUE)) }) test_that("view highlights all matches", { x <- c("abc", "def", "fgh") expect_snapshot({ str_view(x, "[aeiou]") str_view(x, "d|e") }) }) test_that("view highlights whitespace (except a space/nl)", { x <- c(" ", "\u00A0", "\n", "\t") expect_snapshot({ str_view(x) "or can instead use escapes" str_view(x, use_escapes = TRUE) }) }) test_that("view displays message for empty vectors", { expect_snapshot(str_view(character())) }) test_that("match argument controls what is shown", { x <- c("abc", "def", "fgh", NA) out <- str_view(x, "d|e", match = NA) expect_length(out, 4) out <- str_view(x, "d|e", match = TRUE) expect_length(out, 1) out <- str_view(x, "d|e", match = FALSE) expect_length(out, 3) }) test_that("can match across lines", { local_reproducible_output(crayon = TRUE) expect_snapshot(str_view("a\nb\nbbb\nc", "(b|\n)+")) }) test_that("vectorised over pattern", { x <- str_view("a", c("a", "b"), match = NA) expect_equal(length(x), 2) }) test_that("[ preserves class", { x <- str_view(letters) expect_s3_class(x[], "stringr_view") }) test_that("str_view_all() is deprecated", { 
expect_snapshot(str_view_all("abc", "a|b")) }) test_that("html mode continues to work", { skip_if_not_installed("htmltools") skip_if_not_installed("htmlwidgets") x <- c("abc", "def", "fgh") expect_snapshot({ str_view(x, "[aeiou]", html = TRUE)$x$html str_view(x, "d|e", html = TRUE)$x$html }) # can use escapes x <- c(" ", "\u00A0", "\n") expect_snapshot({ str_view(x, html = TRUE, use_escapes = TRUE)$x$html }) }) ================================================ FILE: tests/testthat/test-word.R ================================================ test_that("word extraction", { expect_equal("walk", word("walk the moon")) expect_equal("walk", word("walk the moon", 1)) expect_equal("moon", word("walk the moon", 3)) expect_equal("the moon", word("walk the moon", 2, 3)) }) test_that("words past end return NA", { expect_equal(word("a b c", 4), NA_character_) }) test_that("negative parameters", { expect_equal("moon", word("walk the moon", -1, -1)) expect_equal("walk the moon", word("walk the moon", -3, -1)) expect_equal("walk the moon", word("walk the moon", -5, -1)) }) test_that("word() preserves names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(word(x, 1)), names(x)) }) ================================================ FILE: tests/testthat/test-wrap.R ================================================ test_that("wrapping removes spaces", { expect_equal(str_wrap(""), "") expect_equal(str_wrap(" "), "") expect_equal(str_wrap(" a "), "a") }) test_that("wrapping with width of 0 puts each word on own line", { n_returns <- letters %>% str_c(collapse = " ") %>% str_wrap(0) %>% str_count("\n") expect_equal(n_returns, length(letters) - 1) }) test_that("wrapping at whitespace break works", { expect_equal(str_wrap("a/b", width = 0, whitespace_only = TRUE), "a/b") expect_equal(str_wrap("a/b", width = 0, whitespace_only = FALSE), "a/\nb") }) test_that("str_wrap() preserves names", { x <- c(C = "3", B = "2", A = "1") expect_equal(names(str_wrap(x)), names(x)) }) 
================================================ FILE: tests/testthat.R ================================================ library(testthat) library(stringr) test_check("stringr") ================================================ FILE: vignettes/.gitignore ================================================ /.quarto/ ================================================ FILE: vignettes/from-base.Rmd ================================================ --- title: "From base R" author: "Sara Stoudt" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{From base R} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r} #| label: setup #| include: false knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(stringr) library(magrittr) ``` This vignette compares stringr functions to their base R equivalents to help users transitioning from using base R to stringr. # Overall differences We'll begin with a lookup table between the most important stringr functions and their base R equivalents. 
```{r} #| label: stringr-base-r-diff #| echo: false data_stringr_base_diff <- tibble::tribble( ~stringr, ~base_r, "str_detect(string, pattern)", "grepl(pattern, x)", "str_dup(string, times)", "strrep(x, times)", "str_extract(string, pattern)", "regmatches(x, m = regexpr(pattern, text))", "str_extract_all(string, pattern)", "regmatches(x, m = gregexpr(pattern, text))", "str_length(string)", "nchar(x)", "str_locate(string, pattern)", "regexpr(pattern, text)", "str_locate_all(string, pattern)", "gregexpr(pattern, text)", "str_match(string, pattern)", "regmatches(x, m = regexec(pattern, text))", "str_order(string)", "order(...)", "str_replace(string, pattern, replacement)", "sub(pattern, replacement, x)", "str_replace_all(string, pattern, replacement)", "gsub(pattern, replacement, x)", "str_sort(string)", "sort(x)", "str_split(string, pattern)", "strsplit(x, split)", "str_sub(string, start, end)", "substr(x, start, stop)", "str_subset(string, pattern)", "grep(pattern, x, value = TRUE)", "str_to_lower(string)", "tolower(x)", "str_to_title(string)", "tools::toTitleCase(text)", "str_to_upper(string)", "toupper(x)", "str_trim(string)", "trimws(x)", "str_which(string, pattern)", "grep(pattern, x)", "str_wrap(string)", "strwrap(x)" ) # create MD table, arranged alphabetically by stringr fn name data_stringr_base_diff %>% dplyr::mutate(dplyr::across( .cols = everything(), .fns = ~ paste0("`", .x, "`")) ) %>% dplyr::arrange(stringr) %>% dplyr::rename(`base R` = base_r) %>% gt::gt() %>% gt::fmt_markdown(columns = everything()) %>% gt::tab_options(column_labels.font.weight = "bold") ``` Overall the main differences between base R and stringr are: 1. stringr functions start with the `str_` prefix; base R string functions have no consistent naming scheme. 1. The order of inputs is usually different between base R and stringr. In base R, the `pattern` to match usually comes first; in stringr, the `string` to manipulate always comes first.
This makes stringr easier to use in pipes, and with `lapply()` or `purrr::map()`. 1. Functions in stringr tend to do less, whereas many of the string processing functions in base R have multiple purposes. 1. The inputs and outputs of stringr functions have been carefully designed. For example, the output of `str_locate()` can be fed directly into `str_sub()`; the same is not true of `regexpr()` and `substr()`. 1. Base functions use arguments (like `perl`, `fixed`, and `ignore.case`) to control how the pattern is interpreted. To avoid dependence between arguments, stringr instead uses helper functions (like `fixed()`, `regex()`, and `coll()`). Next we'll walk through each of the functions, noting the similarities and important differences. These examples are adapted from the stringr documentation, and here they are contrasted with the analogous base R operations. # Detect matches ## `str_detect()`: Detect the presence or absence of a pattern in a string Suppose you want to know whether each word in a vector of fruit names contains an "a". ```{r} fruit <- c("apple", "banana", "pear", "pineapple") # base grepl(pattern = "a", x = fruit) # stringr str_detect(fruit, pattern = "a") ``` In base you would use `grepl()` (see the "l" and think logical) while in stringr you use `str_detect()` (see the verb "detect" and think of a yes/no action). ## `str_which()`: Find positions matching a pattern Now you want to identify the positions of the words in a vector of fruit names that contain an "a". ```{r} # base grep(pattern = "a", x = fruit) # stringr str_which(fruit, pattern = "a") ``` In base you would use `grep()` while in stringr you use `str_which()` (by analogy to `which()`). ## `str_count()`: Count the number of matches in a string How many "a"s are in each fruit?
```{r} # base loc <- gregexpr(pattern = "a", text = fruit, fixed = TRUE) sapply(loc, function(x) length(attr(x, "match.length"))) # stringr str_count(fruit, pattern = "a") ``` This information can be gleaned from `gregexpr()` in base, but you need to look at the `match.length` attribute, since no match is signalled by a length-1 vector containing `-1`. ## `str_locate()`: Locate the position of patterns in a string Within each fruit, where does the first "p" occur? Where are all of the "p"s? ```{r} fruit3 <- c("papaya", "lime", "apple") # base str(gregexpr(pattern = "p", text = fruit3)) # stringr str_locate(fruit3, pattern = "p") str_locate_all(fruit3, pattern = "p") ``` # Subset strings ## `str_sub()`: Extract and replace substrings from a character vector What if we want to grab part of a string? ```{r} hw <- "Hadley Wickham" # base substr(hw, start = 1, stop = 6) substring(hw, first = 1) # stringr str_sub(hw, start = 1, end = 6) str_sub(hw, start = 1) str_sub(hw, end = 6) ``` In base you could use `substr()` or `substring()`. The former requires both a start and stop of the substring while the latter assumes the stop will be the end of the string. The stringr version, `str_sub()`, has the same functionality, but also gives a default start value (the beginning of the string). Both the base and stringr functions have the same order of expected inputs. In stringr you can use negative numbers to index from the right-hand side of the string: -1 is the last letter, -2 is the second to last, and so on. ```{r} str_sub(hw, start = 1, end = -1) str_sub(hw, start = -5, end = -2) ``` Both the base R and stringr functions are vectorized over their parameters. This means you can either choose the same subset across multiple strings or specify different subsets for different strings.
```{r} al <- "Ada Lovelace" # base substr(c(hw,al), start = 1, stop = 6) substr(c(hw,al), start = c(1,1), stop = c(6,7)) # stringr str_sub(c(hw,al), start = 1, end = -1) str_sub(c(hw,al), start = c(1,1), end = c(-1,-2)) ``` stringr will automatically recycle the first argument to the same length as `start` and `stop`: ```{r} str_sub(hw, start = 1:5) ``` Whereas the base equivalent silently uses just the first value: ```{r} substr(hw, start = 1:5, stop = 15) ``` ## `str_sub() <- `: Subset assignment `substr()` behaves in a surprising way when you replace a substring with a different number of characters: ```{r} # base x <- "ABCDEF" substr(x, 1, 3) <- "x" x ``` `str_sub()` does what you would expect: ```{r} # stringr x <- "ABCDEF" str_sub(x, 1, 3) <- "x" x ``` ## `str_subset()`: Keep strings matching a pattern, or find positions We may want to retrieve strings that contain a pattern of interest: ```{r} # base grep(pattern = "g", x = fruit, value = TRUE) # stringr str_subset(fruit, pattern = "g") ``` ## `str_extract()`: Extract matching patterns from a string We may want to pick out certain patterns from a string, for example, the digits in a shopping list: ```{r} shopping_list <- c("apples x4", "bag of flour", "10", "milk x2") # base matches <- regexpr(pattern = "\\d+", text = shopping_list) # digits regmatches(shopping_list, m = matches) matches <- gregexpr(pattern = "[a-z]+", text = shopping_list) # words regmatches(shopping_list, m = matches) # stringr str_extract(shopping_list, pattern = "\\d+") str_extract_all(shopping_list, "[a-z]+") ``` Base R requires the combination of `regexpr()` with `regmatches()`; but note that the strings without matches are dropped from the output. stringr provides `str_extract()` and `str_extract_all()`, and the output is always the same length as the input. ## `str_match()`: Extract matched groups from a string We may also want to extract groups from a string. 
Here I'm going to use the scenario from Section 14.4.3 in [R for Data Science](https://r4ds.had.co.nz/strings.html). ```{r} head(sentences) noun <- "([Aa]|[Tt]he) ([^ ]+)" # base matches <- regexec(pattern = noun, text = head(sentences)) do.call("rbind", regmatches(x = head(sentences), m = matches)) # stringr str_match(head(sentences), pattern = noun) ``` As with extracting the full match, base R requires the combination of two functions, and inputs with no matches are dropped from the output. # Manage lengths ## `str_length()`: The length of a string To determine the length of a string, base R uses `nchar()` (not to be confused with `length()`, which gives the length of vectors, etc.) while stringr uses `str_length()`. ```{r} # base nchar(letters) # stringr str_length(letters) ``` There are some subtle differences between base and stringr here. `nchar()` requires a character vector, so it will return an error if used on a factor. `str_length()` can handle a factor input. ```{r} #| error: true # base nchar(factor("abc")) ``` ```{r} # stringr str_length(factor("abc")) ``` Note that "characters" is a poorly defined concept, and technically both `nchar()` and `str_length()` return the number of code points. This is usually the same as what you'd consider to be a character, but not always: ```{r} x <- c("\u00fc", "u\u0308") x nchar(x) str_length(x) ``` ## `str_pad()`: Pad a string To pad a string to a certain width, use stringr's `str_pad()`. In base R you could use `sprintf()`, but unlike `str_pad()`, `sprintf()` has many other functionalities. ```{r} # base sprintf("%30s", "hadley") sprintf("%-30s", "hadley") # "both" is not as straightforward # stringr rbind( str_pad("hadley", 30, "left"), str_pad("hadley", 30, "right"), str_pad("hadley", 30, "both") ) ``` ## `str_trunc()`: Truncate a character string The stringr package provides an easy way to truncate a character string: `str_trunc()`. Base R has no function to do this directly.
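Although base R has no direct equivalent, a rough approximation can be sketched with `nchar()`, `substr()`, and `paste0()`. This is a minimal illustration only; the helper name `trunc_right` is made up here, and it handles just the `side = "right"` case:

```{r}
# A minimal base R sketch of right-truncation,
# assuming width >= nchar(ellipsis)
trunc_right <- function(x, width, ellipsis = "...") {
  too_long <- !is.na(x) & nchar(x) > width
  x[too_long] <- paste0(
    substr(x[too_long], 1, width - nchar(ellipsis)),
    ellipsis
  )
  x
}
trunc_right(c("This string is moderately long", "short"), 20)
```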
```{r} x <- "This string is moderately long" # stringr rbind( str_trunc(x, 20, "right"), str_trunc(x, 20, "left"), str_trunc(x, 20, "center") ) ``` ## `str_trim()`: Trim whitespace from a string Similarly, stringr provides `str_trim()` to trim whitespace from a string. This is analogous to base R's `trimws()` added in R 3.3.0. ```{r} # base trimws(" String with trailing and leading white space\t") trimws("\n\nString with trailing and leading white space\n\n") # stringr str_trim(" String with trailing and leading white space\t") str_trim("\n\nString with trailing and leading white space\n\n") ``` The stringr function `str_squish()` allows for extra whitespace within a string to be trimmed (in contrast to `str_trim()` which removes whitespace at the beginning and/or end of string). In base R, one might take advantage of `gsub()` to accomplish the same effect. ```{r} # stringr str_squish(" String with trailing, middle, and leading white space\t") str_squish("\n\nString with excess, trailing and leading white space\n\n") ``` ## `str_wrap()`: Wrap strings into nicely formatted paragraphs `strwrap()` and `str_wrap()` use different algorithms. `str_wrap()` uses the famous [Knuth-Plass algorithm](http://litherum.blogspot.com/2015/07/knuth-plass-line-breaking-algorithm.html). ```{r} gettysburg <- "Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal." # base cat(strwrap(gettysburg, width = 60), sep = "\n") # stringr cat(str_wrap(gettysburg, width = 60), "\n") ``` Note that `strwrap()` returns a character vector with one element for each line; `str_wrap()` returns a single string containing line breaks. # Mutate strings ## `str_replace()`: Replace matched patterns in a string To replace certain patterns within a string, stringr provides the functions `str_replace()` and `str_replace_all()`. The base R equivalents are `sub()` and `gsub()`. 
Note the difference in default input order again. ```{r} fruits <- c("apple", "banana", "pear", "pineapple") # base sub("[aeiou]", "-", fruits) gsub("[aeiou]", "-", fruits) # stringr str_replace(fruits, "[aeiou]", "-") str_replace_all(fruits, "[aeiou]", "-") ``` ## case: Convert case of a string Both stringr and base R have functions to convert to upper and lower case. Title case is also provided in stringr. ```{r} dog <- "The quick brown dog" # base toupper(dog) tolower(dog) tools::toTitleCase(dog) # stringr str_to_upper(dog) str_to_lower(dog) str_to_title(dog) ``` In stringr we can control the locale, while in base R locale distinctions are controlled with global variables. Therefore, the output of your base R code may vary across different computers with different global settings. ```{r} # stringr str_to_upper("i") # English str_to_upper("i", locale = "tr") # Turkish ``` # Join and split ## `str_flatten()`: Flatten a string If we want to take elements of a string vector and collapse them to a single string we can use the `collapse` argument in `paste()` or use stringr's `str_flatten()`. ```{r} # base paste0(letters, collapse = "-") # stringr str_flatten(letters, collapse = "-") ``` The advantage of `str_flatten()` is that it always returns a single string (a character vector of length 1); to predict the return length of `paste()` you must carefully read all its arguments. ## `str_dup()`: Duplicate strings within a character vector To duplicate strings within a character vector use `strrep()` (in R 3.3.0 or greater) or `str_dup()`: ```{r} #| eval: !expr getRversion() >= "3.3.0" fruit <- c("apple", "pear", "banana") # base strrep(fruit, 2) strrep(fruit, 1:3) # stringr str_dup(fruit, 2) str_dup(fruit, 1:3) ``` ## `str_split()`: Split up a string into pieces To split a string into pieces with breaks based on a particular pattern match, stringr uses `str_split()` and base R uses `strsplit()`. Unlike most other base functions, `strsplit()` starts with the character vector to modify.
```{r} fruits <- c( "apples and oranges and pears and bananas", "pineapples and mangos and guavas" ) # base strsplit(fruits, " and ") # stringr str_split(fruits, " and ") ``` The stringr package's `str_split()` allows for more control over the split, including restricting the number of possible matches. ```{r} # stringr str_split(fruits, " and ", n = 3) str_split(fruits, " and ", n = 2) ``` ## `str_glue()`: Interpolate strings It's often useful to interpolate varying values into a fixed string. In base R, you can use `sprintf()` for this purpose; stringr provides a wrapper for the more general purpose [glue](https://glue.tidyverse.org) package. ```{r} name <- "Fred" age <- 50 anniversary <- as.Date("1991-10-12") # base sprintf( "My name is %s my age next year is %s and my anniversary is %s.", name, age + 1, format(anniversary, "%A, %B %d, %Y") ) # stringr str_glue( "My name is {name}, ", "my age next year is {age + 1}, ", "and my anniversary is {format(anniversary, '%A, %B %d, %Y')}." ) ``` # Order strings ## `str_order()`: Order or sort a character vector Both base R and stringr have separate functions to order and sort strings. ```{r} # base order(letters) sort(letters) # stringr str_order(letters) str_sort(letters) ``` Some options in `str_order()` and `str_sort()` don't have analogous base R options. For example, the stringr functions have a `locale` argument to control how to order or sort. In base R the locale is a global setting, so the outputs of `sort()` and `order()` may differ across different computers. For example, in the Norwegian alphabet, å comes after z: ```{r} x <- c("å", "a", "z") str_sort(x) str_sort(x, locale = "no") ``` The stringr functions also have a `numeric` argument to sort digits numerically instead of treating them as strings. 
```{r} # stringr x <- c("100a10", "100a5", "2b", "2a") str_sort(x) str_sort(x, numeric = TRUE) ``` ================================================ FILE: vignettes/locale-sensitive.Rmd ================================================ --- title: "Locale sensitive functions" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Locale sensitive functions} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r} #| include: FALSE knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(stringr) ``` A locale is a set of parameters that define a user's language, region, and cultural preferences. It determines language-specific rules for text processing, including how to: - Convert between uppercase and lowercase letters - Sort text alphabetically - Format dates, numbers, and currency - Handle character encoding and display In stringr, you can control the locale using the `locale` argument, which takes language codes like "en" (English), "tr" (Turkish), or "es_MX" (Mexican Spanish). In general, a locale is a lower-case language abbreviation, optionally followed by an underscore (_) and an upper-case region identifier. You can see which locales are supported in stringr by running `stringi::stri_locale_list()`. This vignette describes locale-sensitive stringr functions, i.e. functions with a `locale` argument. These functions fall into two broad categories: 1. Case conversion 2. Sorting and ordering ## Case conversion `str_to_lower()`, `str_to_upper()`, `str_to_title()`, and `str_to_sentence()` all change the case of their inputs. But while most languages that use the Latin alphabet (like English) have upper and lower case, the rules for converting between the two aren't always the same. For example, Turkish has two forms of the letter "I": as well as "i" and "I", Turkish also has "ı", the dotless lowercase i, and "İ" is the dotted uppercase I. 
This means the rules for converting i to upper case and I to lower case are different from English: ```{r} # English str_to_upper("i") str_to_lower("I") # Turkish str_to_upper("i", locale = "tr") str_to_lower("I", locale = "tr") ``` Another example is Dutch, where "ij" is a digraph treated as a single letter. This means that `str_to_sentence()` will incorrectly capitalize "ij" at the start of a sentence unless you use a Dutch locale: ```{r} #| warning: false dutch_sentence <- "ijsland is een prachtig land in Noord-Europa." # Incorrect str_to_sentence(dutch_sentence) # Correct str_to_sentence(dutch_sentence, locale = "nl") ``` Case conversion also comes up in another situation: case-insensitive comparison. This is relevant in two contexts. First, `str_equal()` and `str_unique()` can optionally ignore case, so it's important to also supply a locale when working with non-English text. For example, imagine we're searching for a Turkish name, ignoring case: ```{r} turkish_names <- c("İpek", "Işık", "İbrahim") search_name <- "ipek" # incorrect str_equal(turkish_names, search_name, ignore_case = TRUE) # correct str_equal(turkish_names, search_name, ignore_case = TRUE, locale = "tr") ``` Second, case conversion also comes up in pattern matching functions like `str_detect()`. You might be accustomed to using `ignore_case = TRUE` with `regex()` or `fixed()`, but if you want to use locale-sensitive comparison you instead need to use `coll()`: ```{r} # incorrect str_detect(turkish_names, fixed(search_name, ignore_case = TRUE)) # correct str_detect(turkish_names, coll(search_name, ignore_case = TRUE, locale = "tr")) ``` ## Sorting and ordering `str_sort()`, `str_order()`, and `str_rank()` all rely on the alphabetical ordering of letters. But not every language uses the same ordering as English. For example, Lithuanian places 'y' between 'i' and 'k', and Czech treats "ch" as a single compound letter that sorts after all other words beginning with 'h'.
This means that to correctly sort words in these languages, you must provide the appropriate locale: ```{r} czech_words <- c("had", "chata", "hrad", "chůze") lithuanian_words <- c("ąžuolas", "ėglė", "šuo", "yra", "žuvis") # incorrect str_sort(czech_words) str_sort(lithuanian_words) # correct str_sort(czech_words, locale = "cs") str_sort(lithuanian_words, locale = "lt") ``` ================================================ FILE: vignettes/regular-expressions.Rmd ================================================ --- title: "Regular expressions" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Regular expressions} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r} #| label = "setup", #| include = FALSE knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(stringr) ``` Regular expressions are a concise and flexible tool for describing patterns in strings. This vignette describes the key features of stringr's regular expressions, as implemented by [stringi](https://github.com/gagolews/stringi). It is not a tutorial, so if you're unfamiliar with regular expressions, I'd recommend starting at . If you want to master the details, I'd recommend reading the classic [_Mastering Regular Expressions_](https://www.amazon.com/Mastering-Regular-Expressions-Jeffrey-Friedl/dp/0596528124) by Jeffrey E. F. Friedl. Regular expressions are the default pattern engine in stringr. That means when you use a pattern matching function with a bare string, it's equivalent to wrapping it in a call to `regex()`: ```{r} #| eval = FALSE # The regular call: str_extract(fruit, "nana") # Is shorthand for str_extract(fruit, regex("nana")) ``` You will need to use `regex()` explicitly if you want to override the default options, as you'll see in examples below.
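For instance, `regex()` also takes a `comments` argument; with `comments = TRUE`, whitespace and `#` comments inside the pattern are ignored, which lets you document a complex regular expression inline. A small sketch (the phone-number pattern is illustrative, not from this vignette):

```{r}
# With comments = TRUE, whitespace and # comments in the pattern are ignored
phone <- regex("
  \\(?     # optional opening parens
  (\\d{3}) # area code
  [)\\- ]? # optional closing parens, dash, or space
  (\\d{3}) # another three digits
  [ -]?    # optional space or dash
  (\\d{4}) # four more digits
", comments = TRUE)

str_match("514-791-8141", phone)
```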
## Basic matches The simplest patterns match exact strings: ```{r} x <- c("apple", "banana", "pear") str_extract(x, "an") ``` You can perform a case-insensitive match using `ignore_case = TRUE`: ```{r} bananas <- c("banana", "Banana", "BANANA") str_detect(bananas, "banana") str_detect(bananas, regex("banana", ignore_case = TRUE)) ``` The next step up in complexity is `.`, which matches any character except a newline: ```{r} str_extract(x, ".a.") ``` You can allow `.` to match everything, including `\n`, by setting `dotall = TRUE`: ```{r} str_detect("\nX\n", ".X.") str_detect("\nX\n", regex(".X.", dotall = TRUE)) ``` ## Escaping If "`.`" matches any character, how do you match a literal "`.`"? You need to use an "escape" to tell the regular expression you want to match it exactly, not use its special behaviour. Like strings, regexps use the backslash, `\`, to escape special behaviour. So to match an `.`, you need the regexp `\.`. Unfortunately this creates a problem. We use strings to represent regular expressions, and `\` is also used as an escape symbol in strings. So to create the regular expression `\.` we need the string `"\\."`. ```{r} # To create the regular expression, we need \\ dot <- "\\." # But the expression itself only contains one: writeLines(dot) # And this tells R to look for an explicit . str_extract(c("abc", "a.c", "bef"), "a\\.c") ``` If `\` is used as an escape character in regular expressions, how do you match a literal `\`? Well you need to escape it, creating the regular expression `\\`. To create that regular expression, you need to use a string, which also needs to escape `\`. That means to match a literal `\` you need to write `"\\\\"` --- you need four backslashes to match one! ```{r} x <- "a\\b" writeLines(x) str_extract(x, "\\\\") ``` In this vignette, I use `\.` to denote the regular expression, and `"\\."` to denote the string that represents the regular expression. 
An alternative quoting mechanism is `\Q...\E`: all the characters in `...` are treated as exact matches. This is useful if you want to exactly match user input as part of a regular expression. ```{r} x <- c("a.b.c.d", "aeb") starts_with <- "a.b" str_detect(x, paste0("^", starts_with)) str_detect(x, paste0("^\\Q", starts_with, "\\E")) ``` ## Special characters Escapes also allow you to specify individual characters that are otherwise hard to type. You can specify individual unicode characters in five ways, either as a variable number of hex digits (four is most common), or by name: * `\xhh`: 2 hex digits. * `\x{hhhh}`: 1-6 hex digits. * `\uhhhh`: 4 hex digits. * `\Uhhhhhhhh`: 8 hex digits. * `\N{name}`, e.g. `\N{grinning face}` matches the basic smiling emoji. Similarly, you can specify many common control characters: * `\a`: bell. * `\cX`: match a control-X character. * `\e`: escape (`\u001B`). * `\f`: form feed (`\u000C`). * `\n`: line feed (`\u000A`). * `\r`: carriage return (`\u000D`). * `\t`: horizontal tabulation (`\u0009`). * `\0ooo` match an octal character. 'ooo' is from one to three octal digits, from 000 to 0377. The leading zero is required. (Many of these are only of historical interest and are only included here for the sake of completeness.) ## Matching multiple characters There are a number of patterns that match more than one character. You've already seen `.`, which matches any character (except a newline). A closely related operator is `\X`, which matches a __grapheme cluster__, a set of individual elements that form a single symbol. For example, one way of representing "á" is as the letter "a" plus an accent: `.` will match the component "a", while `\X` will match the complete symbol: ```{r} x <- "a\u0301" str_extract(x, ".") str_extract(x, "\\X") ``` There are five other escaped pairs that match narrower classes of characters: * `\d`: matches any digit. The complement, `\D`, matches any character that is not a decimal digit. 
```{r} str_extract_all("1 + 2 = 3", "\\d+")[[1]] ``` Technically, `\d` includes any character in the Unicode Category of Nd ("Number, Decimal Digit"), which also includes numeric symbols from other languages: ```{r} # Some Khmer numerals str_detect("១២៣", "\\d") ``` * `\s`: matches any whitespace. This includes tabs, newlines, form feeds, and any character in the Unicode Z Category (which includes a variety of space characters and other separators). The complement, `\S`, matches any non-whitespace character. ```{r} (text <- "Some \t badly\n\t\tspaced \f text") str_replace_all(text, "\\s+", " ") ``` * `\p{property name}` matches any character with a specific Unicode property, like `\p{Uppercase}` or `\p{Diacritic}`. The complement, `\P{property name}`, matches all characters without the property. A complete list of unicode properties can be found at . ```{r} (text <- c('"Double quotes"', "«Guillemet»", "“Fancy quotes”")) str_replace_all(text, "\\p{quotation mark}", "'") ``` * `\w` matches any "word" character, which includes alphabetic characters, marks and decimal numbers. The complement, `\W`, matches any non-word character. ```{r} str_extract_all("Don't eat that!", "\\w+")[[1]] str_split("Don't eat that!", "\\W")[[1]] ``` Technically, `\w` also matches connector punctuation, `\u200c` (zero width connector), and `\u200d` (zero width joiner), but these are rarely seen in the wild. * `\b` matches word boundaries, the transition between word and non-word characters. `\B` matches the opposite: a position where both sides are word characters or both are non-word characters. ```{r} str_replace_all("The quick brown fox", "\\b", "_") str_replace_all("The quick brown fox", "\\B", "_") ``` You can also create your own __character classes__ using `[]`: * `[abc]`: matches a, b, or c. * `[a-z]`: matches every character between a and z (in Unicode code point order). * `[^abc]`: matches anything except a, b, or c. * `[\^\-]`: matches `^` or `-`.
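A quick sketch of these custom character classes in action:

```{r}
x <- c("abc", "a-c", "a^c")
str_extract(x, "[abc]+")   # one or more of a, b, or c
str_extract(x, "[^abc]")   # first character that is not a, b, or c
str_extract(x, "[\\^\\-]") # a literal ^ or -
```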
There are a number of pre-built classes that you can use inside `[]`: * `[:punct:]`: punctuation. * `[:alpha:]`: letters. * `[:lower:]`: lowercase letters. * `[:upper:]`: uppercase letters. * `[:digit:]`: digits. * `[:xdigit:]`: hex digits. * `[:alnum:]`: letters and numbers. * `[:cntrl:]`: control characters. * `[:graph:]`: letters, numbers, and punctuation. * `[:print:]`: letters, numbers, punctuation, and whitespace. * `[:space:]`: space characters (basically equivalent to `\s`). * `[:blank:]`: space and tab. These all go inside the `[]` for character classes, i.e. `[[:digit:]AX]` matches all digits, A, and X. You can also use Unicode properties, like `[\p{Letter}]`, and various set operations, like `[\p{Letter}--\p{script=latin}]`. See `?"stringi-search-charclass"` for details. ## Alternation `|` is the __alternation__ operator, which will pick between one or more possible matches. For example, `abc|def` will match `abc` or `def`: ```{r} str_detect(c("abc", "def", "ghi"), "abc|def") ``` Note that the precedence for `|` is low: `abc|def` is equivalent to `(abc)|(def)`, not `ab(c|d)ef`. ## Grouping You can use parentheses to override the default precedence rules: ```{r} str_extract(c("grey", "gray"), "gre|ay") str_extract(c("grey", "gray"), "gr(e|a)y") ``` Parentheses also define "groups" that you can refer to with __backreferences__, like `\1`, `\2`, etc., and can be extracted with `str_match()`. For example, the following regular expression finds all fruits that have a repeated pair of letters: ```{r} pattern <- "(..)\\1" fruit %>% str_subset(pattern) fruit %>% str_subset(pattern) %>% str_match(pattern) ``` You can use `(?:...)`, the non-grouping parentheses, to control precedence but not capture the match in a group. This is slightly more efficient than capturing parentheses.
```{r}
str_match(c("grey", "gray"), "gr(e|a)y")
str_match(c("grey", "gray"), "gr(?:e|a)y")
```

This is most useful for more complex cases where you need to capture
matches and control precedence independently.

You can use `(?<name>...)`, the named capture group, to provide a reference
to the matched text. This is more readable and maintainable, especially
with complex regular expressions, because you can reference the matched
text by name instead of a potentially confusing numerical index.
*Note: `name` should not include an underscore, because underscores are not
supported in group names.*

```{r}
date_string <- "Today's date is 2025-09-19."
pattern <- "(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})"
str_match(date_string, pattern)
```

You can then use `\k<name>` to backreference a previously captured named
group. It is an alternative to the standard numbered backreferences like
`\1` or `\2`.

```{r}
text <- "This is is a test test with duplicates duplicates"
pattern <- "(?<word>\\b\\w+\\b)\\s+\\k<word>"
str_subset(text, pattern)
str_match_all(text, pattern)
```

## Anchors

By default, regular expressions will match any part of a string. It's often
useful to __anchor__ the regular expression so that it matches from the
start or end of the string:

* `^` matches the start of the string.
* `$` matches the end of the string.

```{r}
x <- c("apple", "banana", "pear")
str_extract(x, "^a")
str_extract(x, "a$")
```

To match a literal "$" or "^", you need to escape them: `\$` and `\^`.

For multiline strings, you can use `regex(multiline = TRUE)`. This changes
the behaviour of `^` and `$`, and introduces three new operators:

* `^` now matches the start of each line.
* `$` now matches the end of each line.
* `\A` matches the start of the input.
* `\z` matches the end of the input.
* `\Z` matches the end of the input, but before the final line terminator,
  if it exists.
```{r}
x <- "Line 1\nLine 2\nLine 3\n"
str_extract_all(x, "^Line..")[[1]]
str_extract_all(x, regex("^Line..", multiline = TRUE))[[1]]
str_extract_all(x, regex("\\ALine..", multiline = TRUE))[[1]]
```

## Repetition

You can control how many times a pattern matches with the repetition
operators:

* `?`: 0 or 1.
* `+`: 1 or more.
* `*`: 0 or more.

```{r}
x <- "1888 is the longest year in Roman numerals: MDCCCLXXXVIII"
str_extract(x, "CC?")
str_extract(x, "CC+")
str_extract(x, "C[LX]+")
```

Note that the precedence of these operators is high, so you can write
`colou?r` to match either American or British spellings. That means most
uses will need parentheses, like `bana(na)+`.

You can also specify the number of matches precisely:

* `{n}`: exactly n.
* `{n,}`: n or more.
* `{n,m}`: between n and m.

```{r}
str_extract(x, "C{2}")
str_extract(x, "C{2,}")
str_extract(x, "C{2,3}")
```

By default these matches are "greedy": they will match the longest string
possible. You can make them "lazy", matching the shortest string possible,
by putting a `?` after them:

* `??`: 0 or 1, prefer 0.
* `+?`: 1 or more, match as few times as possible.
* `*?`: 0 or more, match as few times as possible.
* `{n,}?`: n or more, match as few times as possible.
* `{n,m}?`: between n and m, match as few times as possible, but at least n.

```{r}
str_extract(x, c("C{2,3}", "C{2,3}?"))
str_extract(x, c("C[LX]+", "C[LX]+?"))
```

You can also make the matches possessive by putting a `+` after them, which
means that if later parts of the match fail, the repetition will not be
re-tried with a smaller number of characters. This is an advanced feature
used to improve performance in worst-case scenarios (called "catastrophic
backtracking").

* `?+`: 0 or 1, possessive.
* `++`: 1 or more, possessive.
* `*+`: 0 or more, possessive.
* `{n}+`: exactly n, possessive.
* `{n,}+`: n or more, possessive.
* `{n,m}+`: between n and m, possessive.

A related concept is the __atomic-match__ parenthesis, `(?>...)`.
If a later match fails and the engine needs to back-track, an atomic match
is kept as is: it succeeds or fails as a whole. Compare the following two
regular expressions:

```{r}
str_detect("ABC", "(?>A|.B)C")
str_detect("ABC", "(?:A|.B)C")
```

The atomic match fails because the group matches A, but the next character
is a B rather than the required C, and the atomic group is never re-tried.
The regular match succeeds because after the C fails to match, the engine
back-tracks into the group and tries the `.B` alternative instead.

## Look arounds

These assertions look ahead or behind the current match without "consuming"
any characters (i.e. changing the input position).

* `(?=...)`: positive look-ahead assertion. Matches if `...` matches at the
  current input.
* `(?!...)`: negative look-ahead assertion. Matches if `...` __does not__
  match at the current input.
* `(?<=...)`: positive look-behind assertion. Matches if `...` matches text
  preceding the current position, with the last character of the match
  being the character just before the current position. Length must be
  bounded (i.e. no `*` or `+`).
* `(?<!...)`: negative look-behind assertion. Matches if `...` __does not__
  match text preceding the current position. Length must be bounded (i.e.
  no `*` or `+`).

---
title: "Introduction to stringr"
vignette: >
  %\VignetteIndexEntry{Introduction to stringr}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r}
#| include = FALSE
library(stringr)
knitr::opts_chunk$set(
  comment = "#>",
  collapse = TRUE
)
```

There are four main families of functions in stringr:

1.  Character manipulation: these functions allow you to manipulate
    individual characters within the strings in character vectors.

1.  Whitespace tools to add, remove, and manipulate whitespace.

1.  Locale sensitive operations whose results vary from locale to locale.

1.  Pattern matching functions. These recognise four engines of pattern
    description. The most common is regular expressions, but there are
    three other tools.

## Getting and setting individual characters

You can get the length of the string with `str_length()`:

```{r}
str_length("abc")
```

This is now equivalent to the base R function `nchar()`.
Previously, it was needed to work around issues with `nchar()`, such as the
fact that it returned 2 for `nchar(NA)`. This has been fixed as of R 3.3.0,
so it is no longer so important.

You can access individual characters using `str_sub()`. It takes three
arguments: a character vector, a `start` position, and an `end` position.
Either position can be a positive integer, which counts from the left, or a
negative integer, which counts from the right. The positions are inclusive,
and if longer than the string, will be silently truncated.

```{r}
x <- c("abcdef", "ghifjk")

# The 3rd letter
str_sub(x, 3, 3)

# The 2nd to 2nd-to-last character
str_sub(x, 2, -2)
```

You can also use `str_sub()` to modify strings:

```{r}
str_sub(x, 3, 3) <- "X"
x
```

To duplicate individual strings, you can use `str_dup()`:

```{r}
str_dup(x, c(2, 3))
```

## Whitespace

Three functions add, remove, or modify whitespace:

1.  `str_pad()` pads a string to a fixed length by adding extra whitespace
    on the left, right, or both sides.

    ```{r}
    x <- c("abc", "defghi")
    str_pad(x, 10) # default pads on left
    str_pad(x, 10, "both")
    ```

    (You can pad with other characters by using the `pad` argument.)

    `str_pad()` will never make a string shorter:

    ```{r}
    str_pad(x, 4)
    ```

    So if you want to ensure that all strings are the same length (often
    useful for print methods), combine `str_pad()` and `str_trunc()`:

    ```{r}
    x <- c("Short", "This is a long string")

    x %>%
      str_trunc(10) %>%
      str_pad(10, "right")
    ```

1.  The opposite of `str_pad()` is `str_trim()`, which removes leading and
    trailing whitespace:

    ```{r}
    x <- c(" a ", "b ", " c")
    str_trim(x)
    str_trim(x, "left")
    ```

1.  You can use `str_wrap()` to modify existing whitespace in order to wrap
    a paragraph of text, such that the length of each line is as similar as
    possible.

    ```{r}
    jabberwocky <- str_c(
      "`Twas brillig, and the slithy toves ",
      "did gyre and gimble in the wabe: ",
      "All mimsy were the borogoves, ",
      "and the mome raths outgrabe. "
    )
    cat(str_wrap(jabberwocky, width = 40))
    ```

## Locale sensitive

A handful of stringr functions are locale-sensitive: they will perform
differently in different regions of the world. These functions are case
transformation functions:

```{r}
x <- "I like horses."
str_to_upper(x)
str_to_title(x)

str_to_lower(x)
# Turkish has two sorts of i: with and without the dot
str_to_lower(x, "tr")
```

String ordering and sorting:

```{r}
x <- c("y", "i", "k")
str_order(x)
str_sort(x)

# In Lithuanian, y comes between i and k
str_sort(x, locale = "lt")
```

The locale always defaults to English to ensure that the default behaviour
is identical across systems. Locales always include a two letter ISO-639-1
language code (like "en" for English or "zh" for Chinese), and optionally
an ISO-3166 country code (like "en_UK" vs "en_US"). You can see a complete
list of available locales by running `stringi::stri_locale_list()`.

## Pattern matching

The vast majority of stringr functions work with patterns. These are
parameterised by the task they perform and the types of patterns they
match.

### Tasks

Each pattern matching function has the same first two arguments: a
character vector of `string`s to process and a single `pattern` to match.
stringr provides pattern matching functions to **detect**, **locate**,
**extract**, **match**, **replace**, and **split** strings. I'll illustrate
how they work with some strings and a regular expression designed to match
(US) phone numbers:

```{r}
strings <- c(
  "apple",
  "219 733 8965",
  "329-293-8753",
  "Work: 579-499-7527; Home: 543.355.3679"
)
phone <- "([2-9][0-9]{2})[- .]([0-9]{3})[- .]([0-9]{4})"
```

-   `str_detect()` detects the presence or absence of a pattern and
    returns a logical vector (similar to `grepl()`). `str_subset()`
    returns the elements of a character vector that match a regular
    expression (similar to `grep()` with `value = TRUE`).

    ```{r}
    # Which strings contain phone numbers?
    str_detect(strings, phone)
    str_subset(strings, phone)
    ```

-   `str_count()` counts the number of matches:

    ```{r}
    # How many phone numbers in each string?
    str_count(strings, phone)
    ```

-   `str_locate()` locates the **first** position of a pattern and returns
    a numeric matrix with columns start and end. `str_locate_all()`
    locates all matches, returning a list of numeric matrices. Similar to
    `regexpr()` and `gregexpr()`.

    ```{r}
    # Where in the string is the phone number located?
    (loc <- str_locate(strings, phone))
    str_locate_all(strings, phone)
    ```

-   `str_extract()` extracts text corresponding to the **first** match,
    returning a character vector. `str_extract_all()` extracts all matches
    and returns a list of character vectors.

    ```{r}
    # What are the phone numbers?
    str_extract(strings, phone)
    str_extract_all(strings, phone)
    str_extract_all(strings, phone, simplify = TRUE)
    ```

-   `str_match()` extracts capture groups formed by `()` from the
    **first** match. It returns a character matrix with one column for the
    complete match and one column for each group. `str_match_all()`
    extracts capture groups from all matches and returns a list of
    character matrices. Similar to `regmatches()`.

    ```{r}
    # Pull out the three components of the match
    str_match(strings, phone)
    str_match_all(strings, phone)
    ```

-   `str_replace()` replaces the **first** matched pattern and returns a
    character vector. `str_replace_all()` replaces all matches. Similar to
    `sub()` and `gsub()`.

    ```{r}
    str_replace(strings, phone, "XXX-XXX-XXXX")
    str_replace_all(strings, phone, "XXX-XXX-XXXX")
    ```

-   `str_split_fixed()` splits a string into a **fixed** number of pieces
    based on a pattern and returns a character matrix. `str_split()`
    splits a string into a **variable** number of pieces and returns a
    list of character vectors.
```{r}
str_split("a-b-c", "-")
str_split_fixed("a-b-c", "-", n = 2)
```

### Engines

There are four main engines that stringr can use to describe patterns:

* Regular expressions, the default, as shown above, and described in
  `vignette("regular-expressions")`.
* Fixed bytewise matching, with `fixed()`.
* Locale-sensitive character matching, with `coll()`.
* Text boundary analysis, with `boundary()`.

#### Fixed matches

`fixed(x)` only matches the exact sequence of bytes specified by `x`. This
is a very limited "pattern", but the restriction can make matching much
faster. Beware using `fixed()` with non-English data. It is problematic
because there are often multiple ways of representing the same character.
For example, there are two ways to define "á": either as a single
character or as an "a" plus an accent:

```{r}
a1 <- "\u00e1"
a2 <- "a\u0301"
c(a1, a2)
a1 == a2
```

They render identically, but because they're defined differently, `fixed()`
doesn't find a match. Instead, you can use `coll()`, explained below, to
respect human character comparison rules:

```{r}
str_detect(a1, fixed(a2))
str_detect(a1, coll(a2))
```

#### Collation search

`coll(x)` looks for a match to `x` using human-language **coll**ation
rules, and is particularly important if you want to do case insensitive
matching. Collation rules differ around the world, so you'll also need to
supply a `locale` parameter.

```{r}
i <- c("I", "İ", "i", "ı")
i

str_subset(i, coll("i", ignore_case = TRUE))
str_subset(i, coll("i", ignore_case = TRUE, locale = "tr"))
```

The downside of `coll()` is speed. Because the rules for recognising which
characters are the same are complicated, `coll()` is relatively slow
compared to `regex()` and `fixed()`. Note that while both `fixed()` and
`regex()` have `ignore_case` arguments, they perform a much simpler
comparison than `coll()`.

#### Boundary

`boundary()` matches boundaries between characters, lines, sentences or
words.
It's most useful with `str_split()`, but can be used with all pattern
matching functions:

```{r}
x <- "This is a sentence."
str_split(x, boundary("word"))
str_count(x, boundary("word"))
str_extract_all(x, boundary("word"))
```

By convention, `""` is treated as `boundary("character")`:

```{r}
str_split(x, "")
str_count(x, "")
```
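The same engine works at other granularities too; for example, a quick sketch (with a made-up string) of splitting and counting at sentence boundaries:

```{r}
# boundary("sentence") uses locale-aware sentence-break rules
y <- "First sentence. Second sentence! A third?"
str_split(y, boundary("sentence"))
str_count(y, boundary("sentence"))
```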