Repository: tidyverse/stringr
Branch: main
Commit: ae054b1d28f6
Files: 163
Total size: 377.5 KB
Directory structure:
gitextract_1kgwvzj7/
├── .Rbuildignore
├── .covrignore
├── .github/
│ ├── .gitignore
│ ├── CODE_OF_CONDUCT.md
│ └── workflows/
│ ├── R-CMD-check.yaml
│ ├── pkgdown.yaml
│ ├── pr-commands.yaml
│ └── test-coverage.yaml
├── .gitignore
├── .vscode/
│ ├── extensions.json
│ └── settings.json
├── DESCRIPTION
├── LICENSE
├── LICENSE.md
├── NAMESPACE
├── NEWS.md
├── R/
│ ├── c.R
│ ├── case.R
│ ├── compat-obj-type.R
│ ├── compat-purrr.R
│ ├── compat-types-check.R
│ ├── conv.R
│ ├── count.R
│ ├── data.R
│ ├── detect.R
│ ├── dup.R
│ ├── equal.R
│ ├── escape.R
│ ├── extract.R
│ ├── flatten.R
│ ├── glue.R
│ ├── interp.R
│ ├── length.R
│ ├── locate.R
│ ├── match.R
│ ├── modifiers.R
│ ├── pad.R
│ ├── remove.R
│ ├── replace.R
│ ├── sort.R
│ ├── split.R
│ ├── stringr-package.R
│ ├── sub.R
│ ├── subset.R
│ ├── trim.R
│ ├── trunc.R
│ ├── unique.R
│ ├── utils.R
│ ├── view.R
│ ├── word.R
│ └── wrap.R
├── README.Rmd
├── README.md
├── _pkgdown.yml
├── air.toml
├── codecov.yml
├── cran-comments.md
├── data/
│ ├── fruit.rda
│ ├── sentences.rda
│ └── words.rda
├── data-raw/
│ ├── harvard-sentences.txt
│ └── samples.R
├── inst/
│ └── htmlwidgets/
│ ├── lib/
│ │ └── str_view.css
│ ├── str_view.js
│ └── str_view.yaml
├── man/
│ ├── case.Rd
│ ├── invert_match.Rd
│ ├── modifiers.Rd
│ ├── pipe.Rd
│ ├── str_c.Rd
│ ├── str_conv.Rd
│ ├── str_count.Rd
│ ├── str_detect.Rd
│ ├── str_dup.Rd
│ ├── str_equal.Rd
│ ├── str_escape.Rd
│ ├── str_extract.Rd
│ ├── str_flatten.Rd
│ ├── str_glue.Rd
│ ├── str_interp.Rd
│ ├── str_length.Rd
│ ├── str_like.Rd
│ ├── str_locate.Rd
│ ├── str_match.Rd
│ ├── str_order.Rd
│ ├── str_pad.Rd
│ ├── str_remove.Rd
│ ├── str_replace.Rd
│ ├── str_replace_na.Rd
│ ├── str_split.Rd
│ ├── str_starts.Rd
│ ├── str_sub.Rd
│ ├── str_subset.Rd
│ ├── str_to_camel.Rd
│ ├── str_trim.Rd
│ ├── str_trunc.Rd
│ ├── str_unique.Rd
│ ├── str_view.Rd
│ ├── str_which.Rd
│ ├── str_wrap.Rd
│ ├── stringr-data.Rd
│ ├── stringr-package.Rd
│ └── word.Rd
├── po/
│ ├── R-es.po
│ └── R-stringr.pot
├── revdep/
│ ├── .gitignore
│ ├── README.md
│ ├── cran.md
│ ├── email.yml
│ ├── failures.md
│ └── problems.md
├── stringr.Rproj
├── tests/
│ ├── testthat/
│ │ ├── _snaps/
│ │ │ ├── c.md
│ │ │ ├── conv.md
│ │ │ ├── detect.md
│ │ │ ├── dup.md
│ │ │ ├── equal.md
│ │ │ ├── flatten.md
│ │ │ ├── interp.md
│ │ │ ├── match.md
│ │ │ ├── modifiers.md
│ │ │ ├── replace.md
│ │ │ ├── split.md
│ │ │ ├── sub.md
│ │ │ ├── subset.md
│ │ │ ├── trunc.md
│ │ │ └── view.md
│ │ ├── test-c.R
│ │ ├── test-case.R
│ │ ├── test-conv.R
│ │ ├── test-count.R
│ │ ├── test-detect.R
│ │ ├── test-dup.R
│ │ ├── test-equal.R
│ │ ├── test-escape.R
│ │ ├── test-extract.R
│ │ ├── test-flatten.R
│ │ ├── test-glue.R
│ │ ├── test-interp.R
│ │ ├── test-length.R
│ │ ├── test-locate.R
│ │ ├── test-match.R
│ │ ├── test-modifiers.R
│ │ ├── test-pad.R
│ │ ├── test-remove.R
│ │ ├── test-replace.R
│ │ ├── test-sort.R
│ │ ├── test-split.R
│ │ ├── test-sub.R
│ │ ├── test-subset.R
│ │ ├── test-trim.R
│ │ ├── test-trunc.R
│ │ ├── test-unique.R
│ │ ├── test-utils.R
│ │ ├── test-view.R
│ │ ├── test-word.R
│ │ └── test-wrap.R
│ └── testthat.R
└── vignettes/
├── .gitignore
├── from-base.Rmd
├── locale-sensitive.Rmd
├── regular-expressions.Rmd
└── stringr.Rmd
================================================
FILE CONTENTS
================================================
================================================
FILE: .Rbuildignore
================================================
^pkgdown$
^\.covrignore$
^.*\.Rproj$
^\.Rproj\.user$
^packrat/
^\.Rprofile$
^\.travis\.yml$
^revdep$
^cran-comments\.md$
^data-raw$
^codecov\.yml$
^\.httr-oauth$
^_pkgdown\.yml$
^doc$
^docs$
^Meta$
^README\.Rmd$
^README-.*\.png$
^appveyor\.yml$
^CRAN-RELEASE$
^LICENSE\.md$
^\.github$
^CRAN-SUBMISSION$
^[.]?air[.]toml$
^\.vscode$
================================================
FILE: .covrignore
================================================
R/deprec-*.R
R/compat-*.R
================================================
FILE: .github/.gitignore
================================================
*.html
================================================
FILE: .github/CODE_OF_CONDUCT.md
================================================
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the overall
community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or advances of
any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email address,
without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at codeofconduct@posit.co.
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series of
actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or permanent
ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within the
community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.1, available at
<https://www.contributor-covenant.org/version/2/1/code_of_conduct.html>.
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder](https://github.com/mozilla/inclusion).
For answers to common questions about this code of conduct, see the FAQ at
<https://www.contributor-covenant.org/faq>. Translations are available at <https://www.contributor-covenant.org/translations>.
[homepage]: https://www.contributor-covenant.org
================================================
FILE: .github/workflows/R-CMD-check.yaml
================================================
# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples
# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help
#
# NOTE: This workflow is overkill for most R packages and
# check-standard.yaml is likely a better choice.
# usethis::use_github_action("check-standard") will install it.
on:
push:
branches: [main, master]
pull_request:
branches: [main, master]
name: R-CMD-check.yaml
permissions: read-all
jobs:
R-CMD-check:
runs-on: ${{ matrix.config.os }}
name: ${{ matrix.config.os }} (${{ matrix.config.r }})
strategy:
fail-fast: false
matrix:
config:
- {os: macos-latest, r: 'release'}
- {os: windows-latest, r: 'release'}
# use 4.0 or 4.1 to check with rtools40's older compiler
- {os: windows-latest, r: 'oldrel-4'}
- {os: ubuntu-latest, r: 'devel', http-user-agent: 'release'}
- {os: ubuntu-latest, r: 'release'}
- {os: ubuntu-latest, r: 'oldrel-1'}
- {os: ubuntu-latest, r: 'oldrel-2'}
- {os: ubuntu-latest, r: 'oldrel-3'}
- {os: ubuntu-latest, r: 'oldrel-4'}
env:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
R_KEEP_PKG_SOURCE: yes
steps:
- uses: actions/checkout@v4
- uses: r-lib/actions/setup-pandoc@v2
- uses: r-lib/actions/setup-r@v2
with:
r-version: ${{ matrix.config.r }}
http-user-agent: ${{ matrix.config.http-user-agent }}
use-public-rspm: true
- uses: r-lib/actions/setup-r-dependencies@v2
with:
extra-packages: any::rcmdcheck
needs: check
- uses: r-lib/actions/check-r-package@v2
with:
upload-snapshots: true
build_args: 'c("--no-manual","--compact-vignettes=gs+qpdf")'
================================================
FILE: .github/workflows/pkgdown.yaml
================================================
# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples
# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help
on:
push:
branches: [main, master]
pull_request:
branches: [main, master]
release:
types: [published]
workflow_dispatch:
name: pkgdown.yaml
permissions: read-all
jobs:
pkgdown:
runs-on: ubuntu-latest
# Only restrict concurrency for non-PR jobs
concurrency:
group: pkgdown-${{ github.event_name != 'pull_request' || github.run_id }}
env:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
permissions:
contents: write
steps:
- uses: actions/checkout@v4
- uses: r-lib/actions/setup-pandoc@v2
- uses: r-lib/actions/setup-r@v2
with:
use-public-rspm: true
- uses: r-lib/actions/setup-r-dependencies@v2
with:
extra-packages: any::pkgdown, local::.
needs: website
- name: Build site
run: pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE)
shell: Rscript {0}
- name: Deploy to GitHub pages 🚀
if: github.event_name != 'pull_request'
uses: JamesIves/github-pages-deploy-action@v4.5.0
with:
clean: false
branch: gh-pages
folder: docs
================================================
FILE: .github/workflows/pr-commands.yaml
================================================
# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples
# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help
on:
issue_comment:
types: [created]
name: pr-commands.yaml
permissions: read-all
jobs:
document:
if: ${{ github.event.issue.pull_request && (github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'OWNER') && startsWith(github.event.comment.body, '/document') }}
name: document
runs-on: ubuntu-latest
env:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
permissions:
contents: write
steps:
- uses: actions/checkout@v4
- uses: r-lib/actions/pr-fetch@v2
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: r-lib/actions/setup-r@v2
with:
use-public-rspm: true
- uses: r-lib/actions/setup-r-dependencies@v2
with:
extra-packages: any::roxygen2
needs: pr-document
- name: Document
run: roxygen2::roxygenise()
shell: Rscript {0}
- name: commit
run: |
git config --local user.name "$GITHUB_ACTOR"
git config --local user.email "$GITHUB_ACTOR@users.noreply.github.com"
git add man/\* NAMESPACE
git commit -m 'Document'
- uses: r-lib/actions/pr-push@v2
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
style:
if: ${{ github.event.issue.pull_request && (github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'OWNER') && startsWith(github.event.comment.body, '/style') }}
name: style
runs-on: ubuntu-latest
env:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
permissions:
contents: write
steps:
- uses: actions/checkout@v4
- uses: r-lib/actions/pr-fetch@v2
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: r-lib/actions/setup-r@v2
- name: Install dependencies
run: install.packages("styler")
shell: Rscript {0}
- name: Style
run: styler::style_pkg()
shell: Rscript {0}
- name: commit
run: |
git config --local user.name "$GITHUB_ACTOR"
git config --local user.email "$GITHUB_ACTOR@users.noreply.github.com"
git add \*.R
git commit -m 'Style'
- uses: r-lib/actions/pr-push@v2
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
================================================
FILE: .github/workflows/test-coverage.yaml
================================================
# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples
# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help
on:
push:
branches: [main, master]
pull_request:
branches: [main, master]
name: test-coverage.yaml
permissions: read-all
jobs:
test-coverage:
runs-on: ubuntu-latest
env:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
steps:
- uses: actions/checkout@v4
- uses: r-lib/actions/setup-r@v2
with:
use-public-rspm: true
- uses: r-lib/actions/setup-r-dependencies@v2
with:
extra-packages: any::covr, any::xml2
needs: coverage
- name: Test coverage
run: |
cov <- covr::package_coverage(
quiet = FALSE,
clean = FALSE,
install_path = file.path(normalizePath(Sys.getenv("RUNNER_TEMP"), winslash = "/"), "package")
)
covr::to_cobertura(cov)
shell: Rscript {0}
- uses: codecov/codecov-action@v4
with:
fail_ci_if_error: ${{ github.event_name != 'pull_request' && true || false }}
file: ./cobertura.xml
plugin: noop
disable_search: true
token: ${{ secrets.CODECOV_TOKEN }}
- name: Show testthat output
if: always()
run: |
## --------------------------------------------------------------------
find '${{ runner.temp }}/package' -name 'testthat.Rout*' -exec cat '{}' \; || true
shell: bash
- name: Upload test results
if: failure()
uses: actions/upload-artifact@v4
with:
name: coverage-test-failures
path: ${{ runner.temp }}/package
================================================
FILE: .gitignore
================================================
docs
.Rproj.user
.Rhistory
.RData
packrat/lib*/
packrat/src
inst/doc
.httr-oauth
revdep/checks
revdep/library
revdep/checks.noindex
revdep/library.noindex
revdep/data.sqlite
/doc/
/Meta/
================================================
FILE: .vscode/extensions.json
================================================
{
"recommendations": [
"Posit.air-vscode"
]
}
================================================
FILE: .vscode/settings.json
================================================
{
"[r]": {
"editor.formatOnSave": true,
"editor.defaultFormatter": "Posit.air-vscode"
}
}
================================================
FILE: DESCRIPTION
================================================
Package: stringr
Title: Simple, Consistent Wrappers for Common String Operations
Version: 1.6.0.9000
Authors@R: c(
person("Hadley", "Wickham", , "hadley@posit.co", role = c("aut", "cre", "cph")),
person("Posit Software, PBC", role = c("cph", "fnd"))
)
Description: A consistent, simple and easy to use set of wrappers around
the fantastic 'stringi' package. All function and argument names (and
positions) are consistent, all functions deal with "NA"'s and zero
length vectors in the same way, and the output from one function is
easy to feed into the input of another.
License: MIT + file LICENSE
URL: https://stringr.tidyverse.org, https://github.com/tidyverse/stringr
BugReports: https://github.com/tidyverse/stringr/issues
Depends:
R (>= 3.6)
Imports:
cli,
glue (>= 1.6.1),
lifecycle (>= 1.0.3),
magrittr,
rlang (>= 1.0.0),
stringi (>= 1.5.3),
vctrs (>= 0.4.0)
Suggests:
covr,
dplyr,
gt,
htmltools,
htmlwidgets,
knitr,
rmarkdown,
testthat (>= 3.0.0),
tibble
VignetteBuilder:
knitr
Config/Needs/website: tidyverse/tidytemplate
Config/potools/style: explicit
Config/testthat/edition: 3
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.3
================================================
FILE: LICENSE
================================================
YEAR: 2023
COPYRIGHT HOLDER: stringr authors
================================================
FILE: LICENSE.md
================================================
# MIT License
Copyright (c) 2023 stringr authors
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
================================================
FILE: NAMESPACE
================================================
# Generated by roxygen2: do not edit by hand
S3method("[",stringr_pattern)
S3method("[",stringr_view)
S3method("[[",stringr_pattern)
S3method(print,stringr_view)
S3method(type,character)
S3method(type,default)
S3method(type,stringr_boundary)
S3method(type,stringr_coll)
S3method(type,stringr_fixed)
S3method(type,stringr_regex)
export("%>%")
export("str_sub<-")
export(boundary)
export(coll)
export(fixed)
export(invert_match)
export(regex)
export(str_c)
export(str_conv)
export(str_count)
export(str_detect)
export(str_dup)
export(str_ends)
export(str_equal)
export(str_escape)
export(str_extract)
export(str_extract_all)
export(str_flatten)
export(str_flatten_comma)
export(str_glue)
export(str_glue_data)
export(str_ilike)
export(str_interp)
export(str_length)
export(str_like)
export(str_locate)
export(str_locate_all)
export(str_match)
export(str_match_all)
export(str_order)
export(str_pad)
export(str_rank)
export(str_remove)
export(str_remove_all)
export(str_replace)
export(str_replace_all)
export(str_replace_na)
export(str_sort)
export(str_split)
export(str_split_1)
export(str_split_fixed)
export(str_split_i)
export(str_squish)
export(str_starts)
export(str_sub)
export(str_sub_all)
export(str_subset)
export(str_to_camel)
export(str_to_kebab)
export(str_to_lower)
export(str_to_sentence)
export(str_to_snake)
export(str_to_title)
export(str_to_upper)
export(str_trim)
export(str_trunc)
export(str_unique)
export(str_view)
export(str_view_all)
export(str_which)
export(str_width)
export(str_wrap)
export(word)
import(rlang)
import(stringi)
importFrom(glue,glue)
importFrom(lifecycle,deprecated)
importFrom(magrittr,"%>%")
================================================
FILE: NEWS.md
================================================
# stringr (development version)
# stringr 1.6.0
## Breaking changes
* All relevant stringr functions now preserve names (@jonovik, #575).
* `str_like(ignore_case)` is deprecated, with `str_like()` now always case sensitive to better follow the conventions of the SQL LIKE operator (@edward-burn, #543).
* In `str_replace_all()`, a `replacement` function now receives all values in a single vector. This radically improves performance at the cost of breaking some existing uses (#462).
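As a rough illustration of the `str_replace_all()` change above: a replacement function that is vectorised (it accepts a character vector of matches and returns a vector of the same length) should give the same answer whether it is called once per string or once with every match. The `double_digits` helper below is hypothetical and only for illustration; it is not part of stringr.

```R
library(stringr)

# Hypothetical helper (not part of stringr): takes a character vector of
# matches and returns a character vector of the same length, so it keeps
# working when it receives all matches in a single call.
double_digits <- function(x) as.character(as.numeric(x) * 2)

doubled <- str_replace_all(c("a1", "b22"), "\\d+", double_digits)
doubled
```

Replacement functions that assume they only ever see one match at a time (for example, ones that use `if ()` on their input) are the likely candidates for breakage.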
## New features
* New `vignette("locale-sensitive")` about locale sensitive functions (@kylieainslie, #404)
* New `str_ilike()` that follows the conventions of the SQL ILIKE operator (@edward-burn, #543).
* New `str_to_camel()`, `str_to_snake()`, and `str_to_kebab()` for changing "programming" case (@librill, #573 + @arnaudgallou, #593).
## Minor bug fixes and improvements
* `str_*` now errors if `pattern` includes any `NA`s (@nash-delcamp-slp, #546).
* `str_dup()` gains a `sep` argument so you can add a separator between every repeated value (@edward-burn, #564).
* `str_sub<-` now gives a more informative error if `value` is not the correct length.
* `str_view()` displays a message when called with a zero-length character vector (@LouisMPenrod, #497).
* New `[[.stringr_pattern` method to match existing `[.stringr_pattern` (@edward-burn, #569).
# stringr 1.5.2
* `R CMD check` fixes
# stringr 1.5.1
* Some minor documentation improvements.
* `str_trunc()` now correctly truncates strings when `side` is `"left"` or
`"center"` (@UchidaMizuki, #512).
# stringr 1.5.0
## Breaking changes
* stringr functions now consistently implement the tidyverse recycling rules
(#372). There are two main changes:
* Only vectors of length 1 are recycled. Previously, (e.g.)
`str_detect(letters, c("x", "y"))` worked, but it now errors.
* `str_c()` ignores `NULL`s, rather than treating them as length 0
vectors.
Additionally, many more arguments now throw errors, rather than warnings,
if supplied the wrong type of input.
* `regex()` and friends now generate class names with `stringr_` prefix (#384).
* `str_detect()`, `str_starts()`, `str_ends()` and `str_subset()` now error
when used with either an empty string (`""`) or a `boundary()`. These
operations didn't really make sense (`str_detect(x, "")` returned `TRUE`
for all non-empty strings) and made it easy to make mistakes when programming.
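A small sketch of what these breaking changes look like in practice, assuming stringr 1.5.0 or later is installed. The errors are caught with `tryCatch()` purely so the example runs end to end; the variable names are illustrative.

```R
library(stringr)

# A length-1 pattern is recycled against every element of `string`.
hits <- str_detect(c("apple", "banana"), "an")

# Lengths 26 and 2 are not compatible under the tidyverse recycling
# rules, so this call is expected to error.
mismatch <- tryCatch(
  str_detect(letters, c("x", "y")),
  error = function(e) "error"
)

# NULL inputs are ignored instead of collapsing the result to length 0.
joined <- str_c("a", NULL, "b")

# An empty pattern is rejected by str_detect().
empty_pattern <- tryCatch(
  str_detect("abc", ""),
  error = function(e) "error"
)
```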
## New features
* Many tweaks to the documentation to make it more useful and consistent.
* New `vignette("from-base")` by @sastoudt provides a comprehensive comparison
between base R functions and their stringr equivalents. It's designed to
help you move to stringr if you're already familiar with base R string
functions (#266).
* New `str_escape()` escapes regular expression metacharacters, providing
an alternative to `fixed()` if you want to compose a pattern from user
supplied strings (#408).
* New `str_equal()` compares two character vectors using unicode rules,
optionally ignoring case (#381).
* `str_extract()` can now optionally extract a capturing group instead of
the complete match (#420).
* New `str_flatten_comma()` is a special case of `str_flatten()` designed for
comma separated flattening and can correctly apply the Oxford comma
when there are only two elements (#444).
* New `str_split_1()` is tailored for the special case of splitting up a single
string (#409).
* New `str_split_i()` extracts a single piece from a string (#278, @bfgray3).
* New `str_like()` allows the use of SQL wildcards (#280, @rjpat).
* New `str_rank()` to complete the set of order/rank/sort functions (#353).
* New `str_sub_all()` to extract multiple substrings from each string.
* New `str_unique()` is a wrapper around `stri_unique()` and returns unique
string values in a character vector (#249, @seasmith).
* `str_view()` uses ANSI colouring rather than an HTML widget (#370). This
works in more places and requires fewer dependencies. It includes a number
of other small improvements:
* It no longer requires a pattern so you can use it to display strings with
special characters.
* It highlights unusual whitespace characters.
* It's vectorised over both `string` and `pattern` (#407).
* It defaults to displaying all matches, making `str_view_all()` redundant
(and hence deprecated) (#455).
* New `str_width()` returns the display width of a string (#380).
* stringr is now licensed as MIT (#351).
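A few of the functions introduced above, shown together as a minimal sketch (assuming stringr 1.5.0 or later). The inputs are made up for illustration.

```R
library(stringr)

# Split a single string into a character vector.
pieces <- str_split_1("a-b-c", "-")

# Take the first piece of each string.
first <- str_split_i(c("a-b-c", "d-e"), "-", 1)

# Extract the second capturing group instead of the complete match.
value <- str_extract("key=value", "(\\w+)=(\\w+)", group = 2)

# The Oxford comma is used for three elements and dropped for two.
three <- str_flatten_comma(c("a", "b", "c"), last = ", and ")
two <- str_flatten_comma(c("a", "b"), last = ", and ")

# Escape metacharacters so "." only matches a literal dot.
literal <- str_detect(c("a.b", "axb"), str_escape("a.b"))

# Compare strings, optionally ignoring case.
same <- str_equal("a", "A", ignore_case = TRUE)
```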
## Minor improvements and bug fixes
* Better error message if you supply a non-string pattern (#378).
* A new data source for `sentences` has fixed many small errors.
* `str_extract()` and `str_extract_all()` now work correctly when `pattern`
is a `boundary()`.
* `str_flatten()` gains a `last` argument that optionally overrides the
final separator (#377). It gains a `na.rm` argument to remove missing
values (since it's a summary function) (#439).
* `str_pad()` gains `use_width` argument to control whether to use the total
code point width or the number of code points as "width" of a string (#190).
* `str_replace()` and `str_replace_all()` can use standard tidyverse formula
shorthand for `replacement` function (#331).
* `str_starts()` and `str_ends()` now correctly respect regex operator
precedence (@carlganz).
* `str_wrap()` breaks only at whitespace by default; set
`whitespace_only = FALSE` to return to the previous behaviour (#335, @rjpat).
* `word()` now returns the whole sentence when using a negative `start` parameter
whose magnitude is greater than or equal to the number of words. (@pdelboca, #245)
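Two of the improvements above, sketched under the assumption that stringr 1.5.0 or later is installed: the `last` and `na.rm` arguments of `str_flatten()`, and the formula shorthand for a `replacement` function.

```R
library(stringr)

# `na.rm = TRUE` drops the missing value first; `last` then replaces
# the final separator.
listing <- str_flatten(
  c("a", NA, "b", "c"),
  collapse = ", ",
  last = " and ",
  na.rm = TRUE
)

# Formula shorthand for a replacement function; `.x` holds the matched text.
shout <- str_replace_all("abc", "[ac]", ~ toupper(.x))
```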
# stringr 1.4.1
Hot patch release to resolve R CMD check failures.
# stringr 1.4.0
* `str_interp()` now renders lists consistently independent of the presence of
additional placeholders (@amhrasmussen).
* New `str_starts()` and `str_ends()` functions to detect patterns at the
beginning or end of strings (@jonthegeek, #258).
* `str_subset()`, `str_detect()`, and `str_which()` get `negate` argument,
which is useful when you want the elements that do NOT match (#259,
@yutannihilation).
* New `str_to_sentence()` function to capitalize with sentence case
(@jonthegeek, #202).
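The additions in this release can be sketched as follows; the `snacks` vector is made up for illustration.

```R
library(stringr)

snacks <- c("apple", "banana", "cherry")

# Detect a pattern at the beginning or end of each string.
starts <- str_starts(snacks, "a")
ends <- str_ends(snacks, "y")

# `negate = TRUE` keeps the elements that do NOT match.
no_an <- str_subset(snacks, "an", negate = TRUE)

# Sentence case: first letter upper case, the rest lower case.
sentence <- str_to_sentence("hello WORLD")
```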
# stringr 1.3.1
* `str_replace_all()` with a named vector now respects modifier functions (#207)
* `str_trunc()` is once again vectorised correctly (#203, @austin3dickey).
* `str_view()` handles `NA` values more gracefully (#217). I've also
tweaked the sizing policy so hopefully it should work better in notebooks,
while preserving the existing behaviour in knit documents (#232).
# stringr 1.3.0
## API changes
* During package build, you may see
`Error : object ‘ignore.case’ is not exported by 'namespace:stringr'`.
This is because the long deprecated `str_join()`, `ignore.case()` and
`perl()` have now been removed.
## New features
* `str_glue()` and `str_glue_data()` provide convenient wrappers around
`glue()` and `glue_data()` from the [glue](https://glue.tidyverse.org/) package
(#157).
* `str_flatten()` is a wrapper around `stri_flatten()` and clearly
conveys flattening a character vector into a single string (#186).
* `str_remove()` and `str_remove_all()` functions. These wrap
`str_replace()` and `str_replace_all()` to remove patterns from strings.
(@Shians, #178)
* `str_squish()` removes spaces from both the left and right side of strings,
and also converts multiple space (or space-like characters) to a single
space within strings (@stephlocke, #197).
* `str_sub()` gains `omit_na` argument for ignoring `NA`. Accordingly,
`str_replace()` now ignores `NA`s and keeps the original strings.
(@yutannihilation, #164)
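A short sketch of the new helpers listed above. The input strings and the `pkg` variable are illustrative only.

```R
library(stringr)

# Trim both ends and collapse internal runs of whitespace.
squished <- str_squish("  too   many    spaces ")

# Remove the first match, or every match.
removed_first <- str_remove("abcabc", "b")
removed_all <- str_remove_all("abcabc", "b")

# Collapse a character vector into a single string.
flat <- str_flatten(c("a", "b", "c"), "-")

# Interpolate R expressions inside braces; `pkg` is looked up in the
# calling environment.
pkg <- "stringr"
greeting <- as.character(str_glue("Hello from {pkg}!"))
```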
## Bug fixes and minor improvements
* `str_trunc()` now preserves NAs (@ClaytonJY, #162)
* `str_trunc()` now throws an error when `width` is shorter than `ellipsis`
(@ClaytonJY, #163).
* Long deprecated `str_join()`, `ignore.case()` and `perl()` have now been
removed.
# stringr 1.2.0
## API changes
* `str_match_all()` now returns NA if an optional group doesn't match
(previously it returned ""). This is more consistent with `str_match()`
and other match failures (#134).
## New features
* In `str_replace()`, `replacement` can now be a function that is called once
for each match and whose return value is used to replace the match.
* New `str_which()` mimics `grep()` (#129).
* A new vignette (`vignette("regular-expressions")`) describes the
details of the regular expressions supported by stringr.
The main vignette (`vignette("stringr")`) has been updated to
give a high-level overview of the package.
## Minor improvements and bug fixes
* `str_order()` and `str_sort()` gain explicit `numeric` argument for sorting
mixed numbers and strings.
* `str_replace_all()` now throws an error if `replacement` is not a character
vector. If `replacement` is `NA_character_` it replaces the complete string
with `NA` (#124).
* All functions that take a locale (e.g. `str_to_lower()` and `str_sort()`)
default to "en" (English) to ensure that the default is consistent across
platforms.
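For example, `str_which()` and the `numeric` argument might be used like this. This is a minimal sketch with made-up inputs, relying on the default `"en"` locale described above.

```R
library(stringr)

# Like grep(): integer positions of the matching elements.
idx <- str_which(c("apple", "kiwi", "grape"), "ap")

# Default sorting compares character by character, so "x10" sorts
# before "x2".
plain <- str_sort(c("x10", "x2", "x1"))

# `numeric = TRUE` compares runs of digits as numbers.
natural <- str_sort(c("x10", "x2", "x1"), numeric = TRUE)
```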
# stringr 1.1.0
* Add sample datasets: `fruit`, `words` and `sentences`.
* `fixed()`, `regex()`, and `coll()` now throw an error if you use them with
anything other than a plain string (#60). I've clarified that the replacement
for `perl()` is `regex()` not `regexp()` (#61). `boundary()` has improved
defaults when splitting on non-word boundaries (#58, @lmullen).
* `str_detect()` now can detect boundaries (by checking for a `str_count()` > 0)
(#120). `str_subset()` works similarly.
* `str_extract()` and `str_extract_all()` now work with `boundary()`. This is
particularly useful if you want to extract logical constructs like words
or sentences. `str_extract_all()` respects the `simplify` argument
when used with `fixed()` matches.
* `str_subset()` now respects custom options for `fixed()` patterns
(#79, @gagolews).
* `str_replace()` and `str_replace_all()` now behave correctly when a
replacement string contains `$`s, `\\\\1`, etc. (#83, #99).
* `str_split()` gains a `simplify` argument to match `str_extract_all()`
etc.
* `str_view()` and `str_view_all()` create HTML widgets that display regular
expression matches (#96).
* `word()` returns `NA` for indexes greater than number of words (#112).
# stringr 1.0.0
* stringr is now powered by [stringi](https://github.com/gagolews/stringi)
instead of base R regular expressions. This improves unicode support, and
makes most operations considerably faster. If you find stringr inadequate for
your string processing needs, I highly recommend looking at stringi in more
detail.
* stringr gains a vignette, currently a straightforward update of the article
that appeared in the R Journal.
* `str_c()` now returns a zero length vector if any of its inputs are
zero length vectors. This is consistent with all other functions, and
standard R recycling rules. Similarly, using `str_c("x", NA)` now
yields `NA`. If you want `"xNA"`, use `str_replace_na()` on the inputs.
* `str_replace_all()` gains a convenient syntax for applying multiple pairs of
pattern and replacement to the same vector:
```R
input <- c("abc", "def")
str_replace_all(input, c("[ad]" = "!", "[cf]" = "?"))
```
* `str_match()` now returns NA if an optional group doesn't match
(previously it returned ""). This is more consistent with `str_extract()`
and other match failures.
* New `str_subset()` keeps values that match a pattern. It's a convenient
wrapper for `x[str_detect(x, pattern)]` (#21, @jiho).
* New `str_order()` and `str_sort()` allow you to sort and order strings
in a specified locale.
* New `str_conv()` to convert strings from specified encoding to UTF-8.
* New modifier `boundary()` allows you to count, locate and split by
character, word, line and sentence boundaries.
* The documentation got a lot of love, and very similar functions (e.g.
first and all variants) are now documented together. This should hopefully
make it easier to locate the function you need.
* `ignore.case(x)` has been deprecated in favour of
`fixed|regex|coll(x, ignore.case = TRUE)`, `perl(x)` has been deprecated in
favour of `regex(x)`.
* `str_join()` is deprecated, please use `str_c()` instead.
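
As a rough illustration of the `str_c()`, `str_match()` and `str_subset()` entries above (a sketch only, assuming a current stringr release; the inputs are invented for the example):

```R
library(stringr)

# Zero-length input gives zero-length output
empty <- str_c("x", character())

# NA is infectious unless it is converted to a literal string first
joined_na <- str_c("x", NA)
joined_literal <- str_c("x", str_replace_na(NA))

# An optional group that does not take part in the match is NA, not ""
groups <- str_match("a", "(a)(b)?")

# str_subset() is shorthand for x[str_detect(x, pattern)]
fruits <- c("apple", "kiwi", "grape")
with_p <- str_subset(fruits, "p")
```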
# stringr 0.6.2
* fixed path in `str_wrap` example so it works for more R installations.
* remove dependency on plyr
# stringr 0.6.1
* Zero input to `str_split_fixed` returns 0 row matrix with `n` columns
* Export `str_join`
# stringr 0.6
* new modifier `perl` that switches to Perl regular expressions
* `str_match` now uses new base function `regmatches` to extract matches -
this should hopefully be faster than my previous pure R algorithm
# stringr 0.5
* new `str_wrap` function which gives `strwrap` output in a more convenient
format
* new `word` function extracts words from a string given user defined
separator (thanks to suggestion by David Cooper)
* `str_locate` now returns consistent type when matching empty string (thanks
to Stavros Macrakis)
* new `str_count` counts number of matches in a string.
* `str_pad` and `str_trim` receive performance tweaks - for large vectors this
should give a speed-up of at least two orders of magnitude
* str_length returns NA for invalid multibyte strings
* fix small bug in internal `recyclable` function
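
A short sketch of the functions introduced in 0.5, written against the current stringr API rather than the 0.5 release itself; the inputs and variable names are illustrative:

```R
library(stringr)

# Count matches in each element
n_a <- str_count(c("apple", "banana"), "a")

# Extract a word using a user-defined separator
second_field <- word("alpha-beta-gamma", 2, sep = fixed("-"))

# str_wrap() returns a single string with embedded newlines
wrapped <- str_wrap("The quick brown fox jumps over the lazy dog", width = 20)
```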
# stringr 0.4
* all functions now vectorised with respect to string, pattern and
(where appropriate) replacement parameters
* fixed() function now tells stringr functions to use fixed matching, rather
than escaping the regular expression. Should improve performance for
large vectors.
* new ignore.case() modifier tells stringr functions to ignore case of
pattern.
* str_replace renamed to str_replace_all and new str_replace function added.
This makes str_replace consistent with all functions.
* new str_sub<- function (analogous to substring<-) for substring replacement
* str_sub now understands negative positions as a position from the end of
the string. -1 replaces Inf as indicator for string end.
* str_pad side argument can be left, right, or both (instead of center)
* str_trim gains side argument to better match str_pad
* stringr now has a namespace and imports plyr (rather than requiring it)
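
The substring, padding and trimming changes above can be sketched as follows. This uses the current stringr API, so it is an approximation of how 0.4 behaved, and the variable names are illustrative:

```R
library(stringr)

# Negative positions count from the end of the string
tail3 <- str_sub("hello", -3, -1)

# str_sub<- replaces a substring in place
greeting <- "hello"
str_sub(greeting, 1, 1) <- "J"

# side can be "left", "right", or "both"
padded <- str_pad("a", 5, side = "both")
trimmed <- str_trim("  a  ", side = "left")

# fixed() matches the pattern literally rather than as a regular expression
dots <- str_count("a.b.c", fixed("."))
```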
# stringr 0.3
* fixed() now also escapes |
* str_join() renamed to str_c()
* all functions more carefully check input and return informative error
messages if not as expected.
* add invert_match() function to convert a matrix of location of matches to
locations of non-matches
* add fixed() function to allow matching of fixed strings.
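
A sketch of how `invert_match()` can be used, assuming the current stringr API (where `str_locate_all()` returns a list of matrices with `start` and `end` columns):

```R
library(stringr)

numbers <- "1 and 2 and 4 and 456"

# Locations of the numbers
num_loc <- str_locate_all(numbers, "[0-9]+")[[1]]

# Locations of everything between (and around) the numbers
text_loc <- invert_match(num_loc)
between <- str_sub(numbers, text_loc[, "start"], text_loc[, "end"])
```

Here `between` holds the text on either side of each number, including the (possibly empty) pieces before the first and after the last match.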
# stringr 0.2
* str_length now returns correct results when used with factors
* str_sub now correctly replaces Inf in end argument with length of string
* new function str_split_fixed returns fixed number of splits in a character
matrix
* str_split no longer uses strsplit to preserve trailing breaks
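
For illustration, a sketch of the 0.2 behaviour using the current stringr API; the inputs are made up for the example:

```R
library(stringr)

# str_split_fixed() always returns a character matrix with n columns
pieces <- str_split_fixed("a-b-c", "-", 2)

# Trailing breaks are preserved as empty strings
trailing <- str_split("a-b-", "-")[[1]]

# Factors are measured by their labels, not their integer codes
n_chars <- str_length(factor("abc"))
```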
================================================
FILE: R/c.R
================================================
#' Join multiple strings into one string
#'
#' @description
#' `str_c()` combines multiple character vectors into a single character
#' vector. It's very similar to [paste0()] but uses tidyverse recycling and
#' `NA` rules.
#'
#' One way to understand how `str_c()` works is to picture a 2d matrix of strings,
#' where each argument forms a column. `sep` is inserted between each column,
#' and then each row is combined together into a single string. If `collapse`
#' is set, it's inserted between each row, and then the result is again
#' combined, this time into a single string.
#'
#' @param ... One or more character vectors.
#'
#' `NULL`s are removed; scalar inputs (vectors of length 1) are recycled to
#' the common length of vector inputs.
#'
#' Like most other R functions, missing values are "infectious": whenever
#' a missing value is combined with another string the result will always
#' be missing. Use [dplyr::coalesce()] or [str_replace_na()] to convert to
#' the desired value.
#' @param sep String to insert between input vectors.
#' @param collapse Optional string used to combine output into single
#' string. Generally better to use [str_flatten()] if you need this
#' behaviour.
#' @return If `collapse = NULL` (the default) a character vector with
#' length equal to the longest input. If `collapse` is a string, a character
#' vector of length 1.
#' @export
#' @examples
#' str_c("Letter: ", letters)
#' str_c("Letter", letters, sep = ": ")
#' str_c(letters, " is for", "...")
#' str_c(letters[-26], " comes before ", letters[-1])
#'
#' str_c(letters, collapse = "")
#' str_c(letters, collapse = ", ")
#'
#' # Differences from paste() ----------------------
#' # Missing inputs give missing outputs
#' str_c(c("a", NA, "b"), "-d")
#' paste0(c("a", NA, "b"), "-d")
#' # Use str_replace_na() to display literal NAs:
#' str_c(str_replace_na(c("a", NA, "b")), "-d")
#'
#' # Uses tidyverse recycling rules
#' \dontrun{str_c(1:2, 1:3)} # errors
#' paste0(1:2, 1:3)
#'
#' str_c("x", character())
#' paste0("x", character())
str_c <- function(..., sep = "", collapse = NULL) {
check_string(sep)
check_string(collapse, allow_null = TRUE)
dots <- list(...)
dots <- dots[!map_lgl(dots, is.null)]
vctrs::vec_size_common(!!!dots)
inject(stri_c(!!!dots, sep = sep, collapse = collapse))
}
================================================
FILE: R/case.R
================================================
#' Convert string to upper case, lower case, title case, or sentence case
#'
#' * `str_to_upper()` converts to upper case.
#' * `str_to_lower()` converts to lower case.
#' * `str_to_title()` converts to title case, where only the first letter of
#' each word is capitalized.
#' * `str_to_sentence()` converts to sentence case, where only the first letter
#' of the sentence is capitalized.
#'
#' @inheritParams str_detect
#' @inheritParams coll
#' @return A character vector the same length as `string`.
#' @examples
#' dog <- "The quick brown dog"
#' str_to_upper(dog)
#' str_to_lower(dog)
#' str_to_title(dog)
#' str_to_sentence("the quick brown dog")
#'
#' # Locale matters!
#' str_to_upper("i") # English
#' str_to_upper("i", "tr") # Turkish
#' @name case
NULL
#' @export
#' @rdname case
str_to_upper <- function(string, locale = "en") {
check_string(locale)
copy_names(string, stri_trans_toupper(string, locale = locale))
}
#' @export
#' @rdname case
str_to_lower <- function(string, locale = "en") {
check_string(locale)
copy_names(string, stri_trans_tolower(string, locale = locale))
}
#' @export
#' @rdname case
str_to_title <- function(string, locale = "en") {
check_string(locale)
out <- stri_trans_totitle(
string,
opts_brkiter = stri_opts_brkiter(locale = locale)
)
copy_names(string, out)
}
#' @export
#' @rdname case
str_to_sentence <- function(string, locale = "en") {
check_string(locale)
out <- stri_trans_totitle(
string,
opts_brkiter = stri_opts_brkiter(type = "sentence", locale = locale)
)
copy_names(string, out)
}
#' Convert between different types of programming case
#'
#' @description
#' * `str_to_camel()` converts to camel case, where the first letter of
#' each word is capitalized, with no separation between words. By default
#' the first letter of the first word is not capitalized.
#'
#' * `str_to_kebab()` converts to kebab case, where words are converted to
#' lower case and separated by dashes (`-`).
#'
#' * `str_to_snake()` converts to snake case, where words are converted to
#' lower case and separated by underscores (`_`).
#' @inheritParams str_to_lower
#' @export
#' @param first_upper Logical. Should the first letter be capitalized?
#' @examples
#' str_to_camel("my-variable")
#' str_to_camel("my-variable", first_upper = TRUE)
#'
#' str_to_snake("MyVariable")
#' str_to_kebab("MyVariable")
str_to_camel <- function(string, first_upper = FALSE) {
check_character(string)
check_bool(first_upper)
string <- string |>
to_words() |>
str_to_title() |>
str_remove_all(pattern = fixed(" "))
if (!first_upper) {
str_sub(string, 1, 1) <- str_to_lower(str_sub(string, 1, 1))
}
string
}
#' @export
#' @rdname str_to_camel
str_to_snake <- function(string) {
check_character(string)
to_separated_case(string, sep = "_")
}
#' @export
#' @rdname str_to_camel
str_to_kebab <- function(string) {
check_character(string)
to_separated_case(string, sep = "-")
}
to_separated_case <- function(string, sep) {
out <- to_words(string)
str_replace_all(out, fixed(" "), sep)
}
to_words <- function(string) {
breakpoints <- paste(
# non-word characters
"[^\\p{L}\\p{N}]+",
# lowercase followed by uppercase
"(?<=\\p{Ll})(?=\\p{Lu})",
# letter followed by number
"(?<=\\p{L})(?=\\p{N})",
# number followed by letter
"(?<=\\p{N})(?=\\p{L})",
# uppercase followed by uppercase then lowercase (i.e. end of acronym)
"(?<=\\p{Lu})(?=\\p{Lu}\\p{Ll})",
sep = "|"
)
out <- str_replace_all(string, breakpoints, " ")
out <- str_to_lower(out)
str_trim(out)
}
================================================
FILE: R/compat-obj-type.R
================================================
# nocov start --- r-lib/rlang compat-obj-type
#
# Changelog
# =========
#
# 2022-10-04:
# - `obj_type_friendly(value = TRUE)` now shows numeric scalars
# literally.
# - `stop_friendly_type()` now takes `show_value`, passed to
# `obj_type_friendly()` as the `value` argument.
#
# 2022-10-03:
# - Added `allow_na` and `allow_null` arguments.
# - `NULL` is now backticked.
# - Better friendly type for infinities and `NaN`.
#
# 2022-09-16:
# - Unprefixed usage of rlang functions with `rlang::` to
# avoid onLoad issues when called from rlang (#1482).
#
# 2022-08-11:
# - Prefixed usage of rlang functions with `rlang::`.
#
# 2022-06-22:
# - `friendly_type_of()` is now `obj_type_friendly()`.
# - Added `obj_type_oo()`.
#
# 2021-12-20:
# - Added support for scalar values and empty vectors.
# - Added `stop_input_type()`
#
# 2021-06-30:
# - Added support for missing arguments.
#
# 2021-04-19:
# - Added support for matrices and arrays (#141).
# - Added documentation.
# - Added changelog.
#' Return English-friendly type
#' @param x Any R object.
#' @param value Whether to describe the value of `x`. Special values
#' like `NA` or `""` are always described.
#' @param length Whether to mention the length of vectors and lists.
#' @return A string describing the type. Starts with an indefinite
#' article, e.g. "an integer vector".
#' @noRd
obj_type_friendly <- function(x, value = TRUE) {
if (is_missing(x)) {
return("absent")
}
if (is.object(x)) {
if (inherits(x, "quosure")) {
type <- "quosure"
} else {
type <- paste(class(x), collapse = "/")
}
return(sprintf("a <%s> object", type))
}
if (!is_vector(x)) {
return(.rlang_as_friendly_type(typeof(x)))
}
n_dim <- length(dim(x))
if (!n_dim) {
if (!is_list(x) && length(x) == 1) {
if (is_na(x)) {
return(switch(
typeof(x),
logical = "`NA`",
integer = "an integer `NA`",
double = if (is.nan(x)) {
"`NaN`"
} else {
"a numeric `NA`"
},
complex = "a complex `NA`",
character = "a character `NA`",
.rlang_stop_unexpected_typeof(x)
))
}
show_infinites <- function(x) {
if (x > 0) {
"`Inf`"
} else {
"`-Inf`"
}
}
str_encode <- function(x, width = 30, ...) {
if (nchar(x) > width) {
x <- substr(x, 1, width - 3)
x <- paste0(x, "...")
}
encodeString(x, ...)
}
if (value) {
if (is.numeric(x) && is.infinite(x)) {
return(show_infinites(x))
}
if (is.numeric(x) || is.complex(x)) {
number <- as.character(round(x, 2))
what <- if (is.complex(x)) "the complex number" else "the number"
return(paste(what, number))
}
return(switch(
typeof(x),
logical = if (x) "`TRUE`" else "`FALSE`",
character = {
what <- if (nzchar(x)) "the string" else "the empty string"
paste(what, str_encode(x, quote = "\""))
},
raw = paste("the raw value", as.character(x)),
.rlang_stop_unexpected_typeof(x)
))
}
return(switch(
typeof(x),
logical = "a logical value",
integer = "an integer",
double = if (is.infinite(x)) show_infinites(x) else "a number",
complex = "a complex number",
character = if (nzchar(x)) "a string" else "\"\"",
raw = "a raw value",
.rlang_stop_unexpected_typeof(x)
))
}
if (length(x) == 0) {
return(switch(
typeof(x),
logical = "an empty logical vector",
integer = "an empty integer vector",
double = "an empty numeric vector",
complex = "an empty complex vector",
character = "an empty character vector",
raw = "an empty raw vector",
list = "an empty list",
.rlang_stop_unexpected_typeof(x)
))
}
}
vec_type_friendly(x)
}
vec_type_friendly <- function(x, length = FALSE) {
if (!is_vector(x)) {
abort("`x` must be a vector.")
}
type <- typeof(x)
n_dim <- length(dim(x))
add_length <- function(type) {
if (length && !n_dim) {
paste0(type, sprintf(" of length %s", length(x)))
} else {
type
}
}
if (type == "list") {
if (n_dim < 2) {
return(add_length("a list"))
} else if (is.data.frame(x)) {
return("a data frame")
} else if (n_dim == 2) {
return("a list matrix")
} else {
return("a list array")
}
}
type <- switch(
type,
logical = "a logical %s",
integer = "an integer %s",
numeric = ,
double = "a double %s",
complex = "a complex %s",
character = "a character %s",
raw = "a raw %s",
type = paste0("a ", type, " %s")
)
if (n_dim < 2) {
kind <- "vector"
} else if (n_dim == 2) {
kind <- "matrix"
} else {
kind <- "array"
}
out <- sprintf(type, kind)
if (n_dim >= 2) {
out
} else {
add_length(out)
}
}
.rlang_as_friendly_type <- function(type) {
switch(
type,
list = "a list",
NULL = "`NULL`",
environment = "an environment",
externalptr = "a pointer",
weakref = "a weak reference",
S4 = "an S4 object",
name = ,
symbol = "a symbol",
language = "a call",
pairlist = "a pairlist node",
expression = "an expression vector",
char = "an internal string",
promise = "an internal promise",
... = "an internal dots object",
any = "an internal `any` object",
bytecode = "an internal bytecode object",
primitive = ,
builtin = ,
special = "a primitive function",
closure = "a function",
type
)
}
.rlang_stop_unexpected_typeof <- function(x, call = caller_env()) {
abort(
sprintf("Unexpected type <%s>.", typeof(x)),
call = call
)
}
#' Return OO type
#' @param x Any R object.
#' @return One of `"bare"` (for non-OO objects), `"S3"`, `"S4"`,
#' `"R6"`, or `"R7"`.
#' @noRd
obj_type_oo <- function(x) {
if (!is.object(x)) {
return("bare")
}
class <- inherits(x, c("R6", "R7_object"), which = TRUE)
if (class[[1]]) {
"R6"
} else if (class[[2]]) {
"R7"
} else if (isS4(x)) {
"S4"
} else {
"S3"
}
}
#' @param x The object type which does not conform to `what`. Its
#' `obj_type_friendly()` is taken and mentioned in the error message.
#' @param what The friendly expected type as a string. Can be a
#' character vector of expected types, in which case the error
#' message mentions all of them in an "or" enumeration.
#' @param show_value Passed to `value` argument of `obj_type_friendly()`.
#' @param ... Arguments passed to [abort()].
#' @inheritParams args_error_context
#' @noRd
stop_input_type <- function(
x,
what,
...,
allow_na = FALSE,
allow_null = FALSE,
show_value = TRUE,
arg = caller_arg(x),
call = caller_env()
) {
# From compat-cli.R
cli <- env_get_list(
nms = c("format_arg", "format_code"),
last = topenv(),
default = function(x) sprintf("`%s`", x),
inherit = TRUE
)
if (allow_na) {
what <- c(what, cli$format_code("NA"))
}
if (allow_null) {
what <- c(what, cli$format_code("NULL"))
}
if (length(what)) {
what <- oxford_comma(what)
}
message <- sprintf(
"%s must be %s, not %s.",
cli$format_arg(arg),
what,
obj_type_friendly(x, value = show_value)
)
abort(message, ..., call = call, arg = arg)
}
oxford_comma <- function(chr, sep = ", ", final = "or") {
n <- length(chr)
if (n < 2) {
return(chr)
}
head <- chr[seq_len(n - 1)]
last <- chr[n]
head <- paste(head, collapse = sep)
# Write a or b. But a, b, or c.
if (n > 2) {
paste0(head, sep, final, " ", last)
} else {
paste0(head, " ", final, " ", last)
}
}
# nocov end
================================================
FILE: R/compat-purrr.R
================================================
# nocov start - compat-purrr (last updated: rlang 0.3.2.9000)
# This file serves as a reference for compatibility functions for
# purrr. They are not drop-in replacements but allow a similar style
# of programming. This is useful in cases where purrr is too heavy a
# package to depend on. Please find the most recent version in rlang's
# repository.
map <- function(.x, .f, ...) {
lapply(.x, .f, ...)
}
map_mold <- function(.x, .f, .mold, ...) {
out <- vapply(.x, .f, .mold, ..., USE.NAMES = FALSE)
names(out) <- names(.x)
out
}
map_lgl <- function(.x, .f, ...) {
map_mold(.x, .f, logical(1), ...)
}
map_int <- function(.x, .f, ...) {
map_mold(.x, .f, integer(1), ...)
}
map_dbl <- function(.x, .f, ...) {
map_mold(.x, .f, double(1), ...)
}
map_chr <- function(.x, .f, ...) {
map_mold(.x, .f, character(1), ...)
}
map_cpl <- function(.x, .f, ...) {
map_mold(.x, .f, complex(1), ...)
}
walk <- function(.x, .f, ...) {
map(.x, .f, ...)
invisible(.x)
}
pluck <- function(.x, .f) {
map(.x, `[[`, .f)
}
pluck_lgl <- function(.x, .f) {
map_lgl(.x, `[[`, .f)
}
pluck_int <- function(.x, .f) {
map_int(.x, `[[`, .f)
}
pluck_dbl <- function(.x, .f) {
map_dbl(.x, `[[`, .f)
}
pluck_chr <- function(.x, .f) {
map_chr(.x, `[[`, .f)
}
pluck_cpl <- function(.x, .f) {
map_cpl(.x, `[[`, .f)
}
map2 <- function(.x, .y, .f, ...) {
out <- mapply(.f, .x, .y, MoreArgs = list(...), SIMPLIFY = FALSE)
if (length(out) == length(.x)) {
set_names(out, names(.x))
} else {
set_names(out, NULL)
}
}
map2_lgl <- function(.x, .y, .f, ...) {
as.vector(map2(.x, .y, .f, ...), "logical")
}
map2_int <- function(.x, .y, .f, ...) {
as.vector(map2(.x, .y, .f, ...), "integer")
}
map2_dbl <- function(.x, .y, .f, ...) {
as.vector(map2(.x, .y, .f, ...), "double")
}
map2_chr <- function(.x, .y, .f, ...) {
as.vector(map2(.x, .y, .f, ...), "character")
}
map2_cpl <- function(.x, .y, .f, ...) {
as.vector(map2(.x, .y, .f, ...), "complex")
}
args_recycle <- function(args) {
lengths <- map_int(args, length)
n <- max(lengths)
stopifnot(all(lengths == 1L | lengths == n))
to_recycle <- lengths == 1L
args[to_recycle] <- map(args[to_recycle], function(x) rep.int(x, n))
args
}
pmap <- function(.l, .f, ...) {
args <- args_recycle(.l)
do.call(
"mapply",
c(
FUN = list(quote(.f)),
args,
MoreArgs = quote(list(...)),
SIMPLIFY = FALSE,
USE.NAMES = FALSE
)
)
}
probe <- function(.x, .p, ...) {
if (is_logical(.p)) {
stopifnot(length(.p) == length(.x))
.p
} else {
map_lgl(.x, .p, ...)
}
}
keep <- function(.x, .f, ...) {
.x[probe(.x, .f, ...)]
}
discard <- function(.x, .p, ...) {
sel <- probe(.x, .p, ...)
.x[is.na(sel) | !sel]
}
map_if <- function(.x, .p, .f, ...) {
matches <- probe(.x, .p)
.x[matches] <- map(.x[matches], .f, ...)
.x
}
compact <- function(.x) {
Filter(length, .x)
}
transpose <- function(.l) {
inner_names <- names(.l[[1]])
if (is.null(inner_names)) {
fields <- seq_along(.l[[1]])
} else {
fields <- set_names(inner_names)
}
map(fields, function(i) {
map(.l, .subset2, i)
})
}
every <- function(.x, .p, ...) {
for (i in seq_along(.x)) {
if (!rlang::is_true(.p(.x[[i]], ...))) return(FALSE)
}
TRUE
}
some <- function(.x, .p, ...) {
for (i in seq_along(.x)) {
if (rlang::is_true(.p(.x[[i]], ...))) return(TRUE)
}
FALSE
}
negate <- function(.p) {
function(...) !.p(...)
}
reduce <- function(.x, .f, ..., .init) {
f <- function(x, y) .f(x, y, ...)
Reduce(f, .x, init = .init)
}
reduce_right <- function(.x, .f, ..., .init) {
f <- function(x, y) .f(y, x, ...)
Reduce(f, .x, init = .init, right = TRUE)
}
accumulate <- function(.x, .f, ..., .init) {
f <- function(x, y) .f(x, y, ...)
Reduce(f, .x, init = .init, accumulate = TRUE)
}
accumulate_right <- function(.x, .f, ..., .init) {
f <- function(x, y) .f(y, x, ...)
Reduce(f, .x, init = .init, right = TRUE, accumulate = TRUE)
}
detect <- function(.x, .f, ..., .right = FALSE, .p = is_true) {
for (i in index(.x, .right)) {
if (.p(.f(.x[[i]], ...))) {
return(.x[[i]])
}
}
NULL
}
detect_index <- function(.x, .f, ..., .right = FALSE, .p = is_true) {
for (i in index(.x, .right)) {
if (.p(.f(.x[[i]], ...))) {
return(i)
}
}
0L
}
index <- function(x, right = FALSE) {
idx <- seq_along(x)
if (right) {
idx <- rev(idx)
}
idx
}
imap <- function(.x, .f, ...) {
map2(.x, vec_index(.x), .f, ...)
}
vec_index <- function(x) {
names(x) %||% seq_along(x)
}
# nocov end
================================================
FILE: R/compat-types-check.R
================================================
# nocov start --- r-lib/rlang compat-types-check
#
# Dependencies
# ============
#
# - compat-obj-type.R
#
# Changelog
# =========
#
# 2022-10-04:
# - Added `check_name()` that forbids the empty string.
# `check_string()` allows the empty string by default.
#
# 2022-09-28:
# - Removed `what` arguments.
# - Added `allow_na` and `allow_null` arguments.
# - Added `allow_decimal` and `allow_infinite` arguments.
# - Improved errors with absent arguments.
#
#
# 2022-09-16:
# - Unprefixed usage of rlang functions with `rlang::` to
# avoid onLoad issues when called from rlang (#1482).
#
# 2022-08-11:
# - Added changelog.
# Scalars -----------------------------------------------------------------
check_bool <- function(
x,
...,
allow_na = FALSE,
allow_null = FALSE,
arg = caller_arg(x),
call = caller_env()
) {
if (!missing(x)) {
if (is_bool(x)) {
return(invisible(NULL))
}
if (allow_null && is_null(x)) {
return(invisible(NULL))
}
if (allow_na && identical(x, NA)) {
return(invisible(NULL))
}
}
stop_input_type(
x,
c("`TRUE`", "`FALSE`"),
...,
allow_na = allow_na,
allow_null = allow_null,
arg = arg,
call = call
)
}
check_string <- function(
x,
...,
allow_empty = TRUE,
allow_na = FALSE,
allow_null = FALSE,
arg = caller_arg(x),
call = caller_env()
) {
if (!missing(x)) {
is_string <- .rlang_check_is_string(
x,
allow_empty = allow_empty,
allow_na = allow_na,
allow_null = allow_null
)
if (is_string) {
return(invisible(NULL))
}
}
stop_input_type(
x,
"a single string",
...,
allow_na = allow_na,
allow_null = allow_null,
arg = arg,
call = call
)
}
.rlang_check_is_string <- function(x, allow_empty, allow_na, allow_null) {
if (is_string(x)) {
if (allow_empty || !is_string(x, "")) {
return(TRUE)
}
}
if (allow_null && is_null(x)) {
return(TRUE)
}
if (allow_na && (identical(x, NA) || identical(x, na_chr))) {
return(TRUE)
}
FALSE
}
check_name <- function(
x,
...,
allow_null = FALSE,
arg = caller_arg(x),
call = caller_env()
) {
if (!missing(x)) {
is_string <- .rlang_check_is_string(
x,
allow_empty = FALSE,
allow_na = FALSE,
allow_null = allow_null
)
if (is_string) {
return(invisible(NULL))
}
}
stop_input_type(
x,
"a valid name",
...,
allow_na = FALSE,
allow_null = allow_null,
arg = arg,
call = call
)
}
check_number_decimal <- function(
x,
...,
min = -Inf,
max = Inf,
allow_infinite = TRUE,
allow_na = FALSE,
allow_null = FALSE,
arg = caller_arg(x),
call = caller_env()
) {
.rlang_types_check_number(
x,
...,
min = min,
max = max,
allow_decimal = TRUE,
allow_infinite = allow_infinite,
allow_na = allow_na,
allow_null = allow_null,
arg = arg,
call = call
)
}
check_number_whole <- function(
x,
...,
min = -Inf,
max = Inf,
allow_na = FALSE,
allow_null = FALSE,
arg = caller_arg(x),
call = caller_env()
) {
.rlang_types_check_number(
x,
...,
min = min,
max = max,
allow_decimal = FALSE,
allow_infinite = FALSE,
allow_na = allow_na,
allow_null = allow_null,
arg = arg,
call = call
)
}
.rlang_types_check_number <- function(
x,
...,
min = -Inf,
max = Inf,
allow_decimal = FALSE,
allow_infinite = FALSE,
allow_na = FALSE,
allow_null = FALSE,
arg = caller_arg(x),
call = caller_env()
) {
if (allow_decimal) {
what <- "a number"
} else {
what <- "a whole number"
}
.stop <- function(x, what, ...) {
stop_input_type(
x,
what,
...,
allow_na = allow_na,
allow_null = allow_null,
arg = arg,
call = call
)
}
if (!missing(x)) {
is_number <- is_number(
x,
allow_decimal = allow_decimal,
allow_infinite = allow_infinite
)
if (is_number) {
if (min > -Inf && max < Inf) {
what <- sprintf("a number between %s and %s", min, max)
} else {
what <- NULL
}
if (x < min) {
what <- what %||% sprintf("a number larger than %s", min)
.stop(x, what, ...)
}
if (x > max) {
what <- what %||% sprintf("a number smaller than %s", max)
.stop(x, what, ...)
}
return(invisible(NULL))
}
if (allow_null && is_null(x)) {
return(invisible(NULL))
}
if (
allow_na &&
(identical(x, NA) ||
identical(x, na_dbl) ||
identical(x, na_int))
) {
return(invisible(NULL))
}
}
.stop(x, what, ...)
}
is_number <- function(x, allow_decimal = FALSE, allow_infinite = FALSE) {
if (!typeof(x) %in% c("integer", "double")) {
return(FALSE)
}
if (length(x) != 1) {
return(FALSE)
}
if (is.na(x)) {
return(FALSE)
}
if (!allow_decimal && !is_integerish(x)) {
return(FALSE)
}
if (!allow_infinite && is.infinite(x)) {
return(FALSE)
}
TRUE
}
check_symbol <- function(
x,
...,
allow_null = FALSE,
arg = caller_arg(x),
call = caller_env()
) {
if (!missing(x)) {
if (is_symbol(x)) {
return(invisible(NULL))
}
if (allow_null && is_null(x)) {
return(invisible(NULL))
}
}
stop_input_type(
x,
"a symbol",
...,
allow_null = allow_null,
arg = arg,
call = call
)
}
check_arg <- function(
x,
...,
allow_null = FALSE,
arg = caller_arg(x),
call = caller_env()
) {
if (!missing(x)) {
if (is_symbol(x)) {
return(invisible(NULL))
}
if (allow_null && is_null(x)) {
return(invisible(NULL))
}
}
stop_input_type(
x,
"an argument name",
...,
allow_null = allow_null,
arg = arg,
call = call
)
}
check_call <- function(
x,
...,
allow_null = FALSE,
arg = caller_arg(x),
call = caller_env()
) {
if (!missing(x)) {
if (is_call(x)) {
return(invisible(NULL))
}
if (allow_null && is_null(x)) {
return(invisible(NULL))
}
}
stop_input_type(
x,
"a defused call",
...,
allow_null = allow_null,
arg = arg,
call = call
)
}
check_environment <- function(
x,
...,
allow_null = FALSE,
arg = caller_arg(x),
call = caller_env()
) {
if (!missing(x)) {
if (is_environment(x)) {
return(invisible(NULL))
}
if (allow_null && is_null(x)) {
return(invisible(NULL))
}
}
stop_input_type(
x,
"an environment",
...,
allow_null = allow_null,
arg = arg,
call = call
)
}
check_function <- function(
x,
...,
allow_null = FALSE,
arg = caller_arg(x),
call = caller_env()
) {
if (!missing(x)) {
if (is_function(x)) {
return(invisible(NULL))
}
if (allow_null && is_null(x)) {
return(invisible(NULL))
}
}
stop_input_type(
x,
"a function",
...,
allow_null = allow_null,
arg = arg,
call = call
)
}
check_closure <- function(
x,
...,
allow_null = FALSE,
arg = caller_arg(x),
call = caller_env()
) {
if (!missing(x)) {
if (is_closure(x)) {
return(invisible(NULL))
}
if (allow_null && is_null(x)) {
return(invisible(NULL))
}
}
stop_input_type(
x,
"an R function",
...,
allow_null = allow_null,
arg = arg,
call = call
)
}
check_formula <- function(
x,
...,
allow_null = FALSE,
arg = caller_arg(x),
call = caller_env()
) {
if (!missing(x)) {
if (is_formula(x)) {
return(invisible(NULL))
}
if (allow_null && is_null(x)) {
return(invisible(NULL))
}
}
stop_input_type(
x,
"a formula",
...,
allow_null = allow_null,
arg = arg,
call = call
)
}
# Vectors -----------------------------------------------------------------
check_character <- function(
x,
...,
allow_null = FALSE,
arg = caller_arg(x),
call = caller_env()
) {
if (!missing(x)) {
if (is_character(x)) {
return(invisible(NULL))
}
if (allow_null && is_null(x)) {
return(invisible(NULL))
}
}
stop_input_type(
x,
"a character vector",
...,
allow_null = allow_null,
arg = arg,
call = call
)
}
# nocov end
================================================
FILE: R/conv.R
================================================
#' Specify the encoding of a string
#'
#' This is a convenient way to override the current encoding of a string.
#'
#' @inheritParams str_detect
#' @param encoding Name of encoding. See [stringi::stri_enc_list()]
#' for a complete list.
#' @export
#' @examples
#' # Example from encoding?stringi::stringi
#' x <- rawToChar(as.raw(177))
#' x
#' str_conv(x, "ISO-8859-2") # Polish "a with ogonek"
#' str_conv(x, "ISO-8859-1") # Plus-minus
str_conv <- function(string, encoding) {
check_string(encoding)
copy_names(string, stri_conv(string, encoding, "UTF-8"))
}
================================================
FILE: R/count.R
================================================
#' Count number of matches
#'
#' Counts the number of times `pattern` is found within each element
#' of `string`.
#'
#' @inheritParams str_detect
#' @param pattern Pattern to look for.
#'
#' The default interpretation is a regular expression, as described in
#' `vignette("regular-expressions")`. Use [regex()] for finer control of the
#' matching behaviour.
#'
#' Match a fixed string (i.e. by comparing only bytes), using
#' [fixed()]. This is fast, but approximate. Generally,
#' for matching human text, you'll want [coll()] which
#' respects character matching rules for the specified locale.
#'
#' Match character, word, line and sentence boundaries with
#' [boundary()]. The empty string, `""`, is equivalent to
#' `boundary("character")`.
#' @return An integer vector the same length as `string`/`pattern`.
#' @seealso [stringi::stri_count()] which this function wraps.
#'
#' [str_locate()]/[str_locate_all()] to locate position
#' of matches
#'
#' @export
#' @examples
#' fruit <- c("apple", "banana", "pear", "pineapple")
#' str_count(fruit, "a")
#' str_count(fruit, "p")
#' str_count(fruit, "e")
#' str_count(fruit, c("a", "b", "p", "p"))
#'
#' str_count(c("a.", "...", ".a.a"), ".")
#' str_count(c("a.", "...", ".a.a"), fixed("."))
str_count <- function(string, pattern = "") {
check_lengths(string, pattern)
out <- switch(
type(pattern),
empty = ,
bound = stri_count_boundaries(string, opts_brkiter = opts(pattern)),
fixed = stri_count_fixed(string, pattern, opts_fixed = opts(pattern)),
coll = stri_count_coll(string, pattern, opts_collator = opts(pattern)),
regex = stri_count_regex(string, pattern, opts_regex = opts(pattern))
)
preserve_names_if_possible(string, pattern, out)
}
================================================
FILE: R/data.R
================================================
#' Sample character vectors for practicing string manipulations
#'
#' `fruit` and `words` come from the `rcorpora` package
#' written by Gabor Csardi; the data was collected by Darius Kazemi
#' and made available at \url{https://github.com/dariusk/corpora}.
#' `sentences` is a collection of "Harvard sentences" used for
#' standardised testing of voice.
#'
#' @format Character vectors.
#' @name stringr-data
#' @examples
#' length(sentences)
#' sentences[1:5]
#'
#' length(fruit)
#' fruit[1:5]
#'
#' length(words)
#' words[1:5]
NULL
#' @rdname stringr-data
#' @format NULL
"sentences"
#' @rdname stringr-data
#' @format NULL
"fruit"
#' @rdname stringr-data
#' @format NULL
"words"
================================================
FILE: R/detect.R
================================================
#' Detect the presence/absence of a match
#'
#' `str_detect()` returns a logical vector with `TRUE` for each element of
#' `string` that matches `pattern` and `FALSE` otherwise. It's equivalent to
#' `grepl(pattern, string)`.
#'
#' @param string Input vector. Either a character vector, or something
#' coercible to one.
#' @param pattern Pattern to look for.
#'
#' The default interpretation is a regular expression, as described in
#' `vignette("regular-expressions")`. Use [regex()] for finer control of the
#' matching behaviour.
#'
#' Match a fixed string (i.e. by comparing only bytes), using
#' [fixed()]. This is fast, but approximate. Generally,
#' for matching human text, you'll want [coll()] which
#' respects character matching rules for the specified locale.
#'
#' You cannot match boundaries, including `""`, with this function.
#'
#' @param negate If `TRUE`, inverts the resulting boolean vector.
#' @return A logical vector the same length as `string`/`pattern`.
#' @seealso [stringi::stri_detect()] which this function wraps,
#' [str_subset()] for a convenient wrapper around
#' `x[str_detect(x, pattern)]`
#' @export
#' @examples
#' fruit <- c("apple", "banana", "pear", "pineapple")
#' str_detect(fruit, "a")
#' str_detect(fruit, "^a")
#' str_detect(fruit, "a$")
#' str_detect(fruit, "b")
#' str_detect(fruit, "[aeiou]")
#'
#' # Also vectorised over pattern
#' str_detect("aecfg", letters)
#'
#' # Returns TRUE if the pattern does NOT match
#' str_detect(fruit, "^p", negate = TRUE)
str_detect <- function(string, pattern, negate = FALSE) {
check_lengths(string, pattern)
check_bool(negate)
out <- switch(
type(pattern),
empty = no_empty(),
bound = no_boundary(),
fixed = stri_detect_fixed(
string,
pattern,
negate = negate,
opts_fixed = opts(pattern)
),
coll = stri_detect_coll(
string,
pattern,
negate = negate,
opts_collator = opts(pattern)
),
regex = stri_detect_regex(
string,
pattern,
negate = negate,
opts_regex = opts(pattern)
)
)
preserve_names_if_possible(string, pattern, out)
}
#' Detect the presence/absence of a match at the start/end
#'
#' `str_starts()` and `str_ends()` are special cases of [str_detect()] that
#' only match at the beginning or end of a string, respectively.
#'
#' @inheritParams str_detect
#' @param pattern Pattern with which the string starts or ends.
#'
#' The default interpretation is a regular expression, as described in
#' [stringi::about_search_regex]. Control options with [regex()].
#'
#' Match a fixed string (i.e. by comparing only bytes), using [fixed()]. This
#' is fast, but approximate. Generally, for matching human text, you'll want
#' [coll()] which respects character matching rules for the specified locale.
#'
#' @return A logical vector.
#' @export
#' @examples
#' fruit <- c("apple", "banana", "pear", "pineapple")
#' str_starts(fruit, "p")
#' str_starts(fruit, "p", negate = TRUE)
#' str_ends(fruit, "e")
#' str_ends(fruit, "e", negate = TRUE)
str_starts <- function(string, pattern, negate = FALSE) {
check_lengths(string, pattern)
check_bool(negate)
out <- switch(
type(pattern),
empty = no_empty(),
bound = no_boundary(),
fixed = stri_startswith_fixed(
string,
pattern,
negate = negate,
opts_fixed = opts(pattern)
),
coll = stri_startswith_coll(
string,
pattern,
negate = negate,
opts_collator = opts(pattern)
),
regex = {
pattern2 <- paste0("^(", pattern, ")")
stri_detect_regex(
string,
pattern2,
negate = negate,
opts_regex = opts(pattern)
)
}
)
preserve_names_if_possible(string, pattern, out)
}
#' @rdname str_starts
#' @export
str_ends <- function(string, pattern, negate = FALSE) {
check_lengths(string, pattern)
check_bool(negate)
out <- switch(
type(pattern),
empty = no_empty(),
bound = no_boundary(),
fixed = stri_endswith_fixed(
string,
pattern,
negate = negate,
opts_fixed = opts(pattern)
),
coll = stri_endswith_coll(
string,
pattern,
negate = negate,
opts_collator = opts(pattern)
),
regex = {
pattern2 <- paste0("(", pattern, ")$")
stri_detect_regex(
string,
pattern2,
negate = negate,
opts_regex = opts(pattern)
)
}
)
preserve_names_if_possible(string, pattern, out)
}
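# For regular expressions, str_starts() and str_ends() wrap the pattern in a
# group before anchoring it, i.e. "^(pattern)" and "(pattern)$". The sketch
# below, which assumes stringr is attached, shows why the group matters when
# the pattern contains alternation. The inputs are made up for illustration.

```r
library(stringr)

x <- c("apple", "cab")

# str_starts() anchors the whole alternation, as if the pattern were "^(a|b)".
starts_grouped <- str_starts(x, "a|b")

# Without the group, "^a|b" means "starts with a" OR "contains b anywhere".
detect_ungrouped <- str_detect(x, "^a|b")

# str_ends() does the same at the other end, as if the pattern were "(a|b)$".
ends_grouped <- str_ends(x, "a|b")

starts_grouped    # TRUE FALSE
detect_ungrouped  # TRUE TRUE
ends_grouped      # FALSE TRUE
```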
#' Detect a pattern in the same way as `SQL`'s `LIKE` and `ILIKE` operators
#'
#' @description
#' `str_like()` and `str_ilike()` follow the conventions of the SQL `LIKE`
#' and `ILIKE` operators, namely:
#'
#' * Must match the entire string.
#' * `_` matches a single character (like `.`).
#' * `%` matches any number of characters (like `.*`).
#' * `\%` and `\_` match literal `%` and `_`.
#'
#' The difference between the two functions is their case-sensitivity:
#' `str_like()` is case sensitive and `str_ilike()` is not.
#'
#' @note
#' Prior to stringr 1.6.0, `str_like()` was incorrectly case-insensitive.
#'
#' @inheritParams str_detect
#' @param pattern A character vector containing a SQL "like" pattern.
#' See above for details.
#' @param ignore_case `r lifecycle::badge("deprecated")`
#' @return A logical vector the same length as `string`.
#' @export
#' @examples
#' fruit <- c("apple", "banana", "pear", "pineapple")
#' str_like(fruit, "app")
#' str_like(fruit, "app%")
#' str_like(fruit, "APP%")
#' str_like(fruit, "ba_ana")
#' str_like(fruit, "%apple")
#'
#' str_ilike(fruit, "app")
#' str_ilike(fruit, "app%")
#' str_ilike(fruit, "APP%")
#' str_ilike(fruit, "ba_ana")
#' str_ilike(fruit, "%apple")
str_like <- function(string, pattern, ignore_case = deprecated()) {
check_lengths(string, pattern)
check_character(pattern)
if (inherits(pattern, "stringr_pattern")) {
cli::cli_abort(
"{.arg pattern} must be a plain string, not a stringr modifier."
)
}
if (lifecycle::is_present(ignore_case)) {
lifecycle::deprecate_warn(
when = "1.6.0",
what = "str_like(ignore_case)",
details = c(
"`str_like()` is always case sensitive.",
"Use `str_ilike()` for case insensitive string matching."
)
)
check_bool(ignore_case)
if (ignore_case) {
return(str_ilike(string, pattern))
}
}
pattern <- regex(like_to_regex(pattern), ignore_case = FALSE)
out <- stri_detect_regex(string, pattern, opts_regex = opts(pattern))
preserve_names_if_possible(string, pattern, out)
}
#' @export
#' @rdname str_like
str_ilike <- function(string, pattern) {
check_lengths(string, pattern)
check_character(pattern)
if (inherits(pattern, "stringr_pattern")) {
cli::cli_abort(tr_(
"{.arg pattern} must be a plain string, not a stringr modifier."
))
}
pattern <- regex(like_to_regex(pattern), ignore_case = TRUE)
out <- stri_detect_regex(string, pattern, opts_regex = opts(pattern))
preserve_names_if_possible(string, pattern, out)
}
like_to_regex <- function(pattern) {
converted <- stri_replace_all_regex(
pattern,
"(?<!\\\\|\\[)%(?!\\])",
"\\.\\*"
)
converted <- stri_replace_all_regex(converted, "(?<!\\\\|\\[)_(?!\\])", "\\.")
paste0("^", converted, "$")
}
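# A rough base R sketch of the LIKE-to-regex translation performed by
# like_to_regex() above. It swaps stringi's ICU engine for gsub(perl = TRUE),
# so it is an approximation rather than the package's implementation;
# `like_to_regex_sketch` is a hypothetical name used only here.

```r
like_to_regex_sketch <- function(pattern) {
  # An unescaped % (not preceded by a backslash or "[") becomes ".*".
  converted <- gsub("(?<!\\\\|\\[)%(?!\\])", ".*", pattern, perl = TRUE)
  # An unescaped _ becomes ".".
  converted <- gsub("(?<!\\\\|\\[)_(?!\\])", ".", converted, perl = TRUE)
  # LIKE must match the entire string, so anchor both ends.
  paste0("^", converted, "$")
}

fruit <- c("apple", "banana", "pear", "pineapple")

prefix_rx <- like_to_regex_sketch("app%")    # "^app.*$"
single_rx <- like_to_regex_sketch("ba_ana")  # "^ba.ana$"
suffix_rx <- like_to_regex_sketch("%apple")  # "^.*apple$"

prefix_hits <- grepl(prefix_rx, fruit, perl = TRUE)  # TRUE FALSE FALSE FALSE
single_hits <- grepl(single_rx, fruit, perl = TRUE)  # FALSE TRUE FALSE FALSE
suffix_hits <- grepl(suffix_rx, fruit, perl = TRUE)  # TRUE FALSE FALSE TRUE
```

# Case sensitivity is not part of the translation: str_like() and str_ilike()
# differ only in the `ignore_case` value they pass to regex().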
================================================
FILE: R/dup.R
================================================
#' Duplicate a string
#'
#' `str_dup()` duplicates the characters within a string, e.g.
#' `str_dup("xy", 3)` returns `"xyxyxy"`.
#'
#' @inheritParams str_detect
#' @param times Number of times to duplicate each string.
#' @param sep String to insert between each duplicate.
#' @return A character vector the same length as `string`/`times`.
#' @export
#' @examples
#' fruit <- c("apple", "pear", "banana")
#' str_dup(fruit, 2)
#' str_dup(fruit, 2, sep = " ")
#' str_dup(fruit, 1:3)
#' str_c("ba", str_dup("na", 0:5))
str_dup <- function(string, times, sep = NULL) {
input <- vctrs::vec_recycle_common(string = string, times = times)
check_string(sep, allow_null = TRUE)
if (is.null(sep)) {
out <- stri_dup(input$string, input$times)
} else {
out <- map_chr(seq_along(input$string), function(i) {
paste(rep(input$string[[i]], input$times[[i]]), collapse = sep)
})
}
names(out) <- names(input$string)
out
}
================================================
FILE: R/equal.R
================================================
#' Determine if two strings are equivalent
#'
#' This uses Unicode canonicalisation rules, and optionally ignores case.
#'
#' @param x,y A pair of character vectors.
#' @inheritParams str_order
#' @param ignore_case Ignore case when comparing strings?
#' @return A logical vector the same length as `x`/`y`.
#' @seealso [stringi::stri_cmp_equiv()] for the underlying implementation.
#' @export
#' @examples
#' # These two strings encode "a" with an accent in two different ways
#' a1 <- "\u00e1"
#' a2 <- "a\u0301"
#' c(a1, a2)
#'
#' a1 == a2
#' str_equal(a1, a2)
#'
#' # ohm and omega use different code points but should always be treated
#' # as equal
#' ohm <- "\u2126"
#' omega <- "\u03A9"
#' c(ohm, omega)
#'
#' ohm == omega
#' str_equal(ohm, omega)
str_equal <- function(x, y, locale = "en", ignore_case = FALSE, ...) {
vctrs::vec_size_common(x = x, y = y)
check_string(locale)
check_bool(ignore_case)
opts <- str_opts_collator(
locale = locale,
ignore_case = ignore_case,
...
)
stri_cmp_equiv(x, y, opts_collator = opts)
}
================================================
FILE: R/escape.R
================================================
#' Escape regular expression metacharacters
#'
#' This function escapes metacharacters, the characters that have special
#' meaning to the regular expression engine. In most cases you are better
#' off using [fixed()] since it is faster, but `str_escape()` is useful
#' if you are composing user provided strings into a pattern.
#'
#' @inheritParams str_detect
#' @return A character vector the same length as `string`.
#' @export
#' @examples
#' str_detect(c("a", "."), ".")
#' str_detect(c("a", "."), str_escape("."))
str_escape <- function(string) {
out <- str_replace_all(string, "([.^$\\\\|*+?{}\\[\\]()])", "\\\\\\1")
copy_names(string, out)
}
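# A small sketch of the "composing user provided strings into a pattern" use
# case, assuming stringr is attached. `user_input` is a hypothetical value
# standing in for text supplied at run time.

```r
library(stringr)

user_input <- "a.b"  # hypothetical user-supplied text

# Escape the input, then embed it in a larger anchored regular expression.
pattern <- str_c("^", str_escape(user_input), "$")

# The escaped "." now only matches a literal full stop.
matches <- str_detect(c("a.b", "axb"), pattern)

pattern  # "^a\\.b$"
matches  # TRUE FALSE
```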
================================================
FILE: R/extract.R
================================================
#' Extract the complete match
#'
#' `str_extract()` extracts the first complete match from each string,
#' `str_extract_all()` extracts all matches from each string.
#'
#' @inheritParams str_count
#' @param group If supplied, instead of returning the complete match, will
#' return the matched text from the specified capturing group.
#' @seealso [str_match()] to extract matched groups;
#' [stringi::stri_extract()] for the underlying implementation.
#' @param simplify A boolean.
#' * `FALSE` (the default): returns a list of character vectors.
#' * `TRUE`: returns a character matrix.
#' @return
#' * `str_extract()`: a character vector the same length as `string`/`pattern`.
#' * `str_extract_all()`: a list of character vectors the same length as
#' `string`/`pattern`.
#' @export
#' @examples
#' shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2")
#' str_extract(shopping_list, "\\d")
#' str_extract(shopping_list, "[a-z]+")
#' str_extract(shopping_list, "[a-z]{1,4}")
#' str_extract(shopping_list, "\\b[a-z]{1,4}\\b")
#'
#' str_extract(shopping_list, "([a-z]+) of ([a-z]+)")
#' str_extract(shopping_list, "([a-z]+) of ([a-z]+)", group = 1)
#' str_extract(shopping_list, "([a-z]+) of ([a-z]+)", group = 2)
#'
#' # Extract all matches
#' str_extract_all(shopping_list, "[a-z]+")
#' str_extract_all(shopping_list, "\\b[a-z]+\\b")
#' str_extract_all(shopping_list, "\\d")
#'
#' # Simplify results into character matrix
#' str_extract_all(shopping_list, "\\b[a-z]+\\b", simplify = TRUE)
#' str_extract_all(shopping_list, "\\d", simplify = TRUE)
#'
#' # Extract all words
#' str_extract_all("This is, surprisingly, a sentence.", boundary("word"))
str_extract <- function(string, pattern, group = NULL) {
if (!is.null(group)) {
out <- str_match(string, pattern)[, group + 1]
return(preserve_names_if_possible(string, pattern, out))
}
check_lengths(string, pattern)
opt <- opts(pattern)
out <- switch(
type(pattern),
empty = stri_extract_first_boundaries(string, opts_brkiter = opt),
bound = stri_extract_first_boundaries(string, opts_brkiter = opt),
fixed = stri_extract_first_fixed(string, pattern, opts_fixed = opt),
coll = stri_extract_first_coll(string, pattern, opts_collator = opt),
regex = stri_extract_first_regex(string, pattern, opts_regex = opt)
)
preserve_names_if_possible(string, pattern, out)
}
#' @rdname str_extract
#' @export
str_extract_all <- function(string, pattern, simplify = FALSE) {
check_lengths(string, pattern)
check_bool(simplify)
opt <- opts(pattern)
out <- switch(
type(pattern),
empty = stri_extract_all_boundaries(
string,
simplify = simplify,
omit_no_match = TRUE,
opts_brkiter = opt
),
bound = stri_extract_all_boundaries(
string,
simplify = simplify,
omit_no_match = TRUE,
opts_brkiter = opt
),
fixed = stri_extract_all_fixed(
string,
pattern,
simplify = simplify,
omit_no_match = TRUE,
opts_fixed = opt
),
coll = stri_extract_all_coll(
string,
pattern,
simplify = simplify,
omit_no_match = TRUE,
opts_collator = opt
),
regex = stri_extract_all_regex(
string,
pattern,
simplify = simplify,
omit_no_match = TRUE,
opts_regex = opt
)
)
preserve_names_if_possible(string, pattern, out)
}
================================================
FILE: R/flatten.R
================================================
#' Flatten a string
#'
#' @description
#' `str_flatten()` reduces a character vector to a single string. This is a
#' summary function because regardless of the length of the input `x`, it
#' always returns a single string.
#'
#' `str_flatten_comma()` is a variation designed specifically for flattening
#' with commas. It automatically recognises if `last` uses the Oxford comma
#' and handles the special case of 2 elements.
#'
#' @inheritParams str_detect
#' @param collapse String to insert between each piece. Defaults to `""`.
#' @param last Optional string to use in place of the final separator.
#' @param na.rm Remove missing values? If `FALSE` (the default), the result
#' will be `NA` if any element of `string` is `NA`.
#' @return A string, i.e. a character vector of length 1.
#' @export
#' @examples
#' str_flatten(letters)
#' str_flatten(letters, "-")
#'
#' str_flatten(letters[1:3], ", ")
#'
#' # Use last to customise the last component
#' str_flatten(letters[1:3], ", ", " and ")
#'
#' # this almost works if you want an Oxford (aka serial) comma
#' str_flatten(letters[1:3], ", ", ", and ")
#'
#' # but it will always add a comma, even when not necessary
#' str_flatten(letters[1:2], ", ", ", and ")
#'
#' # str_flatten_comma knows how to handle the Oxford comma
#' str_flatten_comma(letters[1:3], ", and ")
#' str_flatten_comma(letters[1:2], ", and ")
str_flatten <- function(string, collapse = "", last = NULL, na.rm = FALSE) {
check_string(collapse)
check_string(last, allow_null = TRUE)
check_bool(na.rm)
if (na.rm) {
string <- string[!is.na(string)]
}
n <- length(string)
if (!is.null(last) && n >= 2) {
string <- c(
string[seq2(1, n - 2)],
stringi::stri_c(string[[n - 1]], last, string[[n]])
)
}
stri_flatten(string, collapse = collapse)
}
#' @export
#' @rdname str_flatten
str_flatten_comma <- function(string, last = NULL, na.rm = FALSE) {
check_string(last, allow_null = TRUE)
check_bool(na.rm)
# Remove comma if exactly two elements, and last uses Oxford comma
if (length(string) == 2 && !is.null(last) && str_detect(last, "^,")) {
last <- str_replace(last, "^,", "")
}
str_flatten(string, ", ", last = last, na.rm = na.rm)
}
================================================
FILE: R/glue.R
================================================
#' Interpolation with glue
#'
#' @description
#' These functions are wrappers around [glue::glue()] and [glue::glue_data()],
#' which provide a powerful and elegant syntax for interpolating strings
#' with `{}`.
#'
#' These wrappers provide a small set of the full options. Use `glue()` and
#' `glue_data()` directly from glue for more control.
#'
#' @inheritParams glue::glue
#' @return A character vector with same length as the longest input.
#' @export
#' @examples
#' name <- "Fred"
#' age <- 50
#' anniversary <- as.Date("1991-10-12")
#' str_glue(
#' "My name is {name}, ",
#' "my age next year is {age + 1}, ",
#' "and my anniversary is {format(anniversary, '%A, %B %d, %Y')}."
#' )
#'
#' # single braces can be inserted by doubling them
#' str_glue("My name is {name}, not {{name}}.")
#'
#' # You can also use named arguments
#' str_glue(
#' "My name is {name}, ",
#' "and my age next year is {age + 1}.",
#' name = "Joe",
#' age = 40
#' )
#'
#' # `str_glue_data()` is useful in data pipelines
#' mtcars %>% str_glue_data("{rownames(.)} has {hp} hp")
str_glue <- function(..., .sep = "", .envir = parent.frame(), .trim = TRUE) {
glue::glue(..., .sep = .sep, .envir = .envir, .trim = .trim)
}
#' @export
#' @rdname str_glue
str_glue_data <- function(
.x,
...,
.sep = "",
.envir = parent.frame(),
.na = "NA"
) {
glue::glue_data(
.x,
...,
.sep = .sep,
.envir = .envir,
.na = .na
)
}
================================================
FILE: R/interp.R
================================================
#' String interpolation
#'
#' @description
#' `r lifecycle::badge("superseded")`
#'
#' `str_interp()` is superseded in favour of [str_glue()].
#'
#' String interpolation is a useful way of specifying a character string which
#' depends on values in a certain environment. It allows for string creation
#' which is easier to read and write when compared to using e.g.
#' [paste()] or [sprintf()]. The (template) string can
#' include expression placeholders of the form `${expression}` or
#' `$[format]{expression}`, where expressions are valid R expressions that
#' can be evaluated in the given environment, and `format` is a format
#' specification valid for use with [sprintf()].
#'
#' @param string A template character string. This function is not vectorised:
#' a character vector will be collapsed into a single string.
#' @param env The environment in which to evaluate the expressions.
#' @seealso [str_glue()] and [str_glue_data()] for alternative approaches to
#' the same problem.
#' @keywords internal
#' @return An interpolated character string.
#' @author Stefan Milton Bache
#' @export
#' @examples
#'
#' # Using values from the environment, and some formats
#' user_name <- "smbache"
#' amount <- 6.656
#' account <- 1337
#' str_interp("User ${user_name} (account $[08d]{account}) has $$[.2f]{amount}.")
#'
#' # Nested brace pairs work inside expressions too, and any braces can be
#' # placed outside the expressions.
#' str_interp("Works with } nested { braces too: $[.2f]{{{2 + 2}*{amount}}}")
#'
#' # Values can also come from a list
#' str_interp(
#' "One value, ${value1}, and then another, ${value2*2}.",
#' list(value1 = 10, value2 = 20)
#' )
#'
#' # Or a data frame
#' str_interp(
#' "Values are $[.2f]{max(Sepal.Width)} and $[.2f]{min(Sepal.Width)}.",
#' iris
#' )
#'
#' # Use a vector when the string is long:
#' max_char <- 80
#' str_interp(c(
#' "This particular line is so long that it is hard to write ",
#' "without breaking the ${max_char}-char barrier!"
#' ))
str_interp <- function(string, env = parent.frame()) {
check_character(string)
string <- str_c(string, collapse = "")
# Find expression placeholders
matches <- interp_placeholders(string)
# Determine if any placeholders were found.
if (matches$indices[1] <= 0) {
string
} else {
# Evaluate them to get the replacement strings.
replacements <- eval_interp_matches(matches$matches, env)
# Replace the expressions by their values and return.
`regmatches<-`(string, list(matches$indices), FALSE, list(replacements))
}
}
#' Match String Interpolation Placeholders
#'
#' Given a character string, a set of expression placeholders is matched. They
#' are of the form \code{${...}} or optionally \code{$[f]{...}} where `f`
#' is a valid format for [sprintf()].
#'
#' @param string character: The string to be interpolated.
#'
#' @return list containing `indices` (regex match data) and `matches`,
#' the string representations of matched expressions.
#'
#' @noRd
#' @author Stefan Milton Bache
interp_placeholders <- function(string, error_call = caller_env()) {
# Find starting position of ${} or $[]{} placeholders.
starts <- gregexpr("\\$(\\[.*?\\])?\\{", string)[[1]]
# Return immediately if no matches are found.
if (starts[1] <= 0) {
return(list(indices = starts))
}
# Break up the string in parts
parts <- substr(
rep(string, length(starts)),
start = starts,
stop = c(starts[-1L] - 1L, nchar(string))
)
# If there are nested placeholders, each part will not contain a full
# placeholder in which case we report invalid string interpolation template.
if (any(!grepl("\\$(\\[.*?\\])?\\{.+\\}", parts))) {
cli::cli_abort(
tr_("Invalid template string for interpolation."),
call = error_call
)
}
# For each part, find the opening and closing braces.
opens <- lapply(strsplit(parts, ""), function(v) which(v == "{"))
closes <- lapply(strsplit(parts, ""), function(v) which(v == "}"))
# Identify the positions within the parts of the matching closing braces.
# These are the lengths of the placeholder matches.
lengths <- mapply(match_brace, opens, closes)
# Update the `starts` match data with the placeholder lengths.
attr(starts, "match.length") <- lengths
# Return both the indices (regex match data) and the actual placeholder
# matches (as strings.)
list(
indices = starts,
matches = mapply(substr, starts, starts + lengths - 1, x = string)
)
}
#' Evaluate String Interpolation Matches
#'
#' The expression part of string interpolation matches are evaluated in a
#' specified environment and formatted for replacement in the original string.
#' Used internally by [str_interp()].
#'
#' @param matches Match data
#'
#' @param env The environment in which to evaluate the expressions.
#'
#' @return A character vector of replacement strings.
#'
#' @noRd
#' @author Stefan Milton Bache
eval_interp_matches <- function(matches, env, error_call = caller_env()) {
# Extract expressions from the matches
expressions <- extract_expressions(matches, error_call = error_call)
# Evaluate them in the given environment
values <- lapply(
expressions,
eval,
envir = env,
enclos = if (is.environment(env)) env else environment(env)
)
# Find the formats to be used
formats <- extract_formats(matches)
# Format the values and return.
mapply(sprintf, formats, values, SIMPLIFY = FALSE)
}
#' Extract Expression Objects from String Interpolation Matches
#'
#' An interpolation match object will contain both its wrapping \code{${ }} part
#' and possibly a format. This extracts the expression parts and parses them to
#' prepare them for evaluation.
#'
#' @param matches Match data
#'
#' @return list of R expressions
#'
#' @noRd
#' @author Stefan Milton Bache
extract_expressions <- function(matches, error_call = caller_env()) {
# Parse function for text argument as first argument.
parse_text <- function(text) {
withCallingHandlers(
parse(text = text),
error = function(e) {
cli::cli_abort(
tr_("Failed to parse input {.str {text}}"),
parent = e,
call = error_call
)
}
)
}
# string representation of the expressions (without the possible formats).
strings <- gsub("\\$(\\[.+?\\])?\\{", "", matches)
# Remove the trailing closing brace and parse.
lapply(substr(strings, 1L, nchar(strings) - 1), parse_text)
}
#' Extract String Interpolation Formats from Matched Placeholders
#'
#' An expression placeholder for string interpolation may optionally contain a
#' format valid for [sprintf()]. This function will extract such or
#' default to "s", the format for strings.
#'
#' @param matches Match data
#'
#' @return A character vector of format specifiers.
#'
#' @noRd
#' @author Stefan Milton Bache
extract_formats <- function(matches) {
# Extract the optional format parts.
formats <- gsub("\\$(\\[(.+?)\\])?.*", "\\2", matches)
# Use string options "s" as default when not specified.
paste0("%", ifelse(formats == "", "s", formats))
}
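# A short base R walk-through of what extract_formats() computes, using the
# same gsub() call as the function above. The placeholder strings are made up
# for illustration.

```r
placeholders <- c("${x}", "$[.2f]{y}", "$[08d]{z}")

# Keep only the text between "[" and "]"; placeholders without a format
# collapse to the empty string.
formats <- gsub("\\$(\\[(.+?)\\])?.*", "\\2", placeholders)

# Prefix with "%" and fall back to the string format "s".
specs <- paste0("%", ifelse(formats == "", "s", formats))

specs  # "%s" "%.2f" "%08d"

# The resulting specifications are ordinary sprintf() formats.
amount_text <- sprintf(specs[2], 6.656)   # "6.66"
account_text <- sprintf(specs[3], 1337L)  # "00001337"
```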
#' Utility Function for Matching a Closing Brace
#'
#' Given positions of opening and closing braces `match_brace` identifies
#' the closing brace matching the first opening brace.
#'
#' @param opening integer: Vector with positions of opening braces.
#'
#' @param closing integer: Vector with positions of closing braces.
#'
#' @return Integer with the position of the matching brace.
#'
#' @noRd
#' @author Stefan Milton Bache
match_brace <- function(opening, closing) {
# maximum index for the matching closing brace
max_close <- max(closing)
# "path" for mapping opening and closing braces
path <- numeric(max_close)
# Set openings to 1, and closings to -1
path[opening[opening < max_close]] <- 1
path[closing] <- -1
# Cumulate the path ...
cumpath <- cumsum(path)
# ... and the first 0 after the first opening identifies the match.
min(which(1:max_close > min(which(cumpath == 1)) & cumpath == 0))
}
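# A worked example of the cumulative-sum brace matching used by match_brace().
# The function body is reproduced under the hypothetical name
# `match_brace_sketch` so the block runs on its own; the input string is made
# up for illustration.

```r
match_brace_sketch <- function(opening, closing) {
  max_close <- max(closing)
  path <- numeric(max_close)
  # +1 at every opening brace, -1 at every closing brace.
  path[opening[opening < max_close]] <- 1
  path[closing] <- -1
  # The running total is the current nesting depth.
  cumpath <- cumsum(path)
  # The first return to depth 0 after the first opening is the matching brace.
  min(which(1:max_close > min(which(cumpath == 1)) & cumpath == 0))
}

part <- "${f({x})} tail"
chars <- strsplit(part, "")[[1]]
opening <- which(chars == "{")  # 2 5
closing <- which(chars == "}")  # 7 9

# Nesting depth by position: 0 1 1 1 2 2 1 1 0, so the match is at position 9.
len <- match_brace_sketch(opening, closing)
placeholder <- substr(part, 1, len)

len          # 9
placeholder  # "${f({x})}"
```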
================================================
FILE: R/length.R
================================================
#' Compute the length/width
#'
#' @description
#' `str_length()` returns the number of codepoints in a string. These are
#' the individual elements (which are often, but not always letters) that
#' can be extracted with [str_sub()].
#'
#' `str_width()` returns how much space the string will occupy when printed
#' in a fixed width font (i.e. when printed in the console).
#'
#' @inheritParams str_detect
#' @return A numeric vector the same length as `string`.
#' @seealso [stringi::stri_length()] which this function wraps.
#' @export
#' @examples
#' str_length(letters)
#' str_length(NA)
#' str_length(factor("abc"))
#' str_length(c("i", "like", "programming", NA))
#'
#' # Some characters, like emoji and Chinese characters (hanzi), are square
#' # which means they take up the width of two Latin characters
#' x <- c("\u6c49\u5b57", "\U0001f60a")
#' str_view(x)
#' str_width(x)
#' str_length(x)
#'
#' # There are two ways of representing a u with an umlaut
#' u <- c("\u00fc", "u\u0308")
#' # They have the same width
#' str_width(u)
#' # But a different length
#' str_length(u)
#' # Because the second element is made up of a u + an accent
#' str_sub(u, 1, 1)
str_length <- function(string) {
copy_names(string, stri_length(string))
}
#' @export
#' @rdname str_length
str_width <- function(string) {
copy_names(string, stri_width(string))
}
================================================
FILE: R/locate.R
================================================
#' Find location of match
#'
#' @description
#' `str_locate()` returns the `start` and `end` position of the first match;
#' `str_locate_all()` returns the `start` and `end` position of each match.
#'
#' Because the `start` and `end` values are inclusive, zero-length matches
#' (e.g. `$`, `^`, `\\b`) will have an `end` that is smaller than `start`.
#'
#' @inheritParams str_count
#' @returns
#' * `str_locate()` returns an integer matrix with two columns and
#' one row for each element of `string`. The first column, `start`,
#' gives the position at the start of the match, and the second column, `end`,
#' gives the position of the end.
#'
#' * `str_locate_all()` returns a list of integer matrices with the same
#' length as `string`/`pattern`. The matrices have columns `start` and `end`
#' as above, and one row for each match.
#' @seealso
#' [str_extract()] for a convenient way of extracting matches,
#' [stringi::stri_locate()] for the underlying implementation.
#' @export
#' @examples
#' fruit <- c("apple", "banana", "pear", "pineapple")
#' str_locate(fruit, "$")
#' str_locate(fruit, "a")
#' str_locate(fruit, "e")
#' str_locate(fruit, c("a", "b", "p", "p"))
#'
#' str_locate_all(fruit, "a")
#' str_locate_all(fruit, "e")
#' str_locate_all(fruit, c("a", "b", "p", "p"))
#'
#' # Find location of every character
#' str_locate_all(fruit, "")
str_locate <- function(string, pattern) {
check_lengths(string, pattern)
out <- switch(
type(pattern),
empty = ,
bound = stri_locate_first_boundaries(string, opts_brkiter = opts(pattern)),
fixed = stri_locate_first_fixed(
string,
pattern,
opts_fixed = opts(pattern)
),
coll = stri_locate_first_coll(
string,
pattern,
opts_collator = opts(pattern)
),
regex = stri_locate_first_regex(string, pattern, opts_regex = opts(pattern))
)
preserve_names_if_possible(string, pattern, out)
}
#' @rdname str_locate
#' @export
str_locate_all <- function(string, pattern) {
check_lengths(string, pattern)
opts <- opts(pattern)
out <- switch(
type(pattern),
empty = ,
bound = stri_locate_all_boundaries(
string,
omit_no_match = TRUE,
opts_brkiter = opts
),
fixed = stri_locate_all_fixed(
string,
pattern,
omit_no_match = TRUE,
opts_fixed = opts
),
regex = stri_locate_all_regex(
string,
pattern,
omit_no_match = TRUE,
opts_regex = opts
),
coll = stri_locate_all_coll(
string,
pattern,
omit_no_match = TRUE,
opts_collator = opts
)
)
preserve_names_if_possible(string, pattern, out)
}
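# A brief illustration of the zero-length match convention described in the
# documentation above, assuming stringr is attached. Because `start` and `end`
# are inclusive, an empty match is reported with `end` one less than `start`.

```r
library(stringr)

# "$" matches the empty position after the last character of "apple".
end_loc <- str_locate("apple", "$")

# "^" matches the empty position before the first character.
start_loc <- str_locate("apple", "^")

end_loc    # start = 6, end = 5
start_loc  # start = 1, end = 0
```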
#' Switch location of matches to location of non-matches
#'
#' Invert a matrix of match locations to match the opposite of what was
#' previously matched.
#'
#' @param loc matrix of match locations, as from [str_locate_all()]
#' @return numeric matrix giving locations of non-matches
#' @export
#' @examples
#' numbers <- "1 and 2 and 4 and 456"
#' num_loc <- str_locate_all(numbers, "[0-9]+")[[1]]
#' str_sub(numbers, num_loc[, "start"], num_loc[, "end"])
#'
#' text_loc <- invert_match(num_loc)
#' str_sub(numbers, text_loc[, "start"], text_loc[, "end"])
invert_match <- function(loc) {
cbind(
start = c(0L, loc[, "end"] + 1L),
end = c(loc[, "start"] - 1L, -1L)
)
}
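# A small sketch of what invert_match() returns, assuming stringr is attached.
# The gaps are bounded by a leading 0 and a trailing -1, so the first and last
# rows describe the text before the first match and after the last match,
# which may be empty. The input string is made up for illustration.

```r
library(stringr)

numbers <- "1 and 2"

# Matches at positions 1 and 7.
num_loc <- str_locate_all(numbers, "[0-9]+")[[1]]

# Non-matches: before "1", between the digits, and after "2".
text_loc <- invert_match(num_loc)

pieces <- str_sub(numbers, text_loc[, "start"], text_loc[, "end"])

text_loc[, "start"]  # 0 2 8
text_loc[, "end"]    # 0 6 -1
pieces               # "" " and " ""
```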
================================================
FILE: R/match.R
================================================
#' Extract components (capturing groups) from a match
#'
#' @description
#' Extract any number of matches defined by unnamed, `(pattern)`, and
#' named, `(?<name>pattern)` capture groups.
#'
#' Use a non-capturing group, `(?:pattern)`, if you need to override default
#' operator precedence but don't want to capture the result.
#'
#' @inheritParams str_detect
#' @param pattern Unlike other stringr functions, `str_match()` only supports
#' regular expressions, as described in `vignette("regular-expressions")`.
#' The pattern should contain at least one capturing group.
#' @return
#' * `str_match()`: a character matrix with the same number of rows as the
#' length of `string`/`pattern`. The first column is the complete match,
#' followed by one column for each capture group. The columns will be named
#' if you used "named capture groups", i.e. `(?<name>pattern)`.
#'
#' * `str_match_all()`: a list of the same length as `string`/`pattern`
#' containing character matrices. Each matrix has columns as described above
#' and one row for each match.
#'
#' @seealso [str_extract()] to extract the complete match,
#' [stringi::stri_match()] for the underlying implementation.
#' @export
#' @examples
#' strings <- c(" 219 733 8965", "329-293-8753 ", "banana", "595 794 7569",
#' "387 287 6718", "apple", "233.398.9187 ", "482 952 3315",
#' "239 923 8115 and 842 566 4692", "Work: 579-499-7527", "$1000",
#' "Home: 543.355.3679")
#' phone <- "([2-9][0-9]{2})[- .]([0-9]{3})[- .]([0-9]{4})"
#'
#' str_extract(strings, phone)
#' str_match(strings, phone)
#'
#' # Extract/match all
#' str_extract_all(strings, phone)
#' str_match_all(strings, phone)
#'
#' # You can also name the groups to make further manipulation easier
#' phone <- "(?<area>[2-9][0-9]{2})[- .](?<phone>[0-9]{3}[- .][0-9]{4})"
#' str_match(strings, phone)
#'
#' x <- c("<a> <b>", "<a> <>", "<a>", "", NA)
#' str_match(x, "<(.*?)> <(.*?)>")
#' str_match_all(x, "<(.*?)>")
#'
#' str_extract(x, "<.*?>")
#' str_extract_all(x, "<.*?>")
str_match <- function(string, pattern) {
check_lengths(string, pattern)
if (type(pattern) != "regex") {
cli::cli_abort(tr_("{.arg pattern} must be a regular expression."))
}
out <- stri_match_first_regex(string, pattern, opts_regex = opts(pattern))
preserve_names_if_possible(string, pattern, out)
}
#' @rdname str_match
#' @export
str_match_all <- function(string, pattern) {
check_lengths(string, pattern)
if (type(pattern) != "regex") {
cli::cli_abort(tr_("{.arg pattern} must be a regular expression."))
}
out <- stri_match_all_regex(
string,
pattern,
omit_no_match = TRUE,
opts_regex = opts(pattern)
)
preserve_names_if_possible(string, pattern, out)
}
================================================
FILE: R/modifiers.R
================================================
#' Control matching behaviour with modifier functions
#'
#' @description
#' Modifier functions control the meaning of the `pattern` argument to
#' stringr functions:
#'
#' * `boundary()`: Match boundaries between things.
#' * `coll()`: Compare strings using standard Unicode collation rules.
#' * `fixed()`: Compare literal bytes.
#' * `regex()` (the default): Uses ICU regular expressions.
#'
#' @param pattern Pattern to modify behaviour.
#' @param ignore_case Should case differences be ignored in the match?
#' For `fixed()`, this uses a simple algorithm which assumes a
#' one-to-one mapping between upper and lower case letters.
#' @return A stringr modifier object, i.e. a character vector with
#' parent S3 class `stringr_pattern`.
#' @name modifiers
#' @examples
#' pattern <- "a.b"
#' strings <- c("abb", "a.b")
#' str_detect(strings, pattern)
#' str_detect(strings, fixed(pattern))
#' str_detect(strings, coll(pattern))
#'
#' # coll() is useful for locale-aware case-insensitive matching
#' i <- c("I", "\u0130", "i")
#' i
#' str_detect(i, fixed("i", TRUE))
#' str_detect(i, coll("i", TRUE))
#' str_detect(i, coll("i", TRUE, locale = "tr"))
#'
#' # Word boundaries
#' words <- c("These are some words.")
#' str_count(words, boundary("word"))
#' str_split(words, " ")[[1]]
#' str_split(words, boundary("word"))[[1]]
#'
#' # Regular expression variations
#' str_extract_all("The Cat in the Hat", "[a-z]+")
#' str_extract_all("The Cat in the Hat", regex("[a-z]+", TRUE))
#'
#' str_extract_all("a\nb\nc", "^.")
#' str_extract_all("a\nb\nc", regex("^.", multiline = TRUE))
#'
#' str_extract_all("a\nb\nc", "a.")
#' str_extract_all("a\nb\nc", regex("a.", dotall = TRUE))
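#'
#' # A sketch of `comments = TRUE` (the input string is made up for
#' # illustration): whitespace and `#` comments in the pattern are ignored,
#' # so a pattern can be spread over several annotated lines
#' str_extract("tel: 555-1234", regex("
#'   [0-9]{3}  # first three digits
#'   -         # separating dash
#'   [0-9]{4}  # last four digits
#' ", comments = TRUE))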
NULL
#' @export
#' @rdname modifiers
fixed <- function(pattern, ignore_case = FALSE) {
pattern <- as_bare_character(pattern)
check_bool(ignore_case)
options <- stri_opts_fixed(case_insensitive = ignore_case)
structure(
pattern,
options = options,
class = c("stringr_fixed", "stringr_pattern", "character")
)
}
#' @export
#' @rdname modifiers
#' @param locale Locale to use for comparisons. See
#' [stringi::stri_locale_list()] for all possible options.
#' Defaults to "en" (English) to ensure that default behaviour is
#' consistent across platforms.
#' @param ... Other less frequently used arguments passed on to
#' [stringi::stri_opts_collator()],
#' [stringi::stri_opts_regex()], or
#' [stringi::stri_opts_brkiter()]
coll <- function(pattern, ignore_case = FALSE, locale = "en", ...) {
pattern <- as_bare_character(pattern)
check_bool(ignore_case)
check_string(locale)
options <- str_opts_collator(
ignore_case = ignore_case,
locale = locale,
...
)
structure(
pattern,
options = options,
class = c("stringr_coll", "stringr_pattern", "character")
)
}
str_opts_collator <- function(
locale = "en",
ignore_case = FALSE,
strength = NULL,
...
) {
strength <- strength %||% if (ignore_case) 2L else 3L
stri_opts_collator(
strength = strength,
locale = locale,
...
)
}
# used for testing
turkish_I <- function() {
coll("I", ignore_case = TRUE, locale = "tr")
}
#' @export
#' @rdname modifiers
#' @param multiline If `TRUE`, `^` and `$` match
#' the beginning and end of each line. If `FALSE`, the
#' default, they only match the start and end of the input.
#' @param comments If `TRUE`, white space and comments beginning with
#' `#` are ignored. Escape literal spaces with `\\ `.
#' @param dotall If `TRUE`, `.` will also match line terminators.
regex <- function(
pattern,
ignore_case = FALSE,
multiline = FALSE,
comments = FALSE,
dotall = FALSE,
...
) {
pattern <- as_bare_character(pattern)
check_bool(ignore_case)
check_bool(multiline)
check_bool(comments)
check_bool(dotall)
options <- stri_opts_regex(
case_insensitive = ignore_case,
multiline = multiline,
comments = comments,
dotall = dotall,
...
)
structure(
pattern,
options = options,
class = c("stringr_regex", "stringr_pattern", "character")
)
}
#' @param type Boundary type to detect.
#' \describe{
#' \item{`character`}{Every character is a boundary.}
#' \item{`line_break`}{Boundaries are places where it is acceptable to have
#' a line break in the current locale.}
#' \item{`sentence`}{The beginnings and ends of sentences are boundaries,
#' using intelligent rules to avoid counting abbreviations
#' ([details](https://www.unicode.org/reports/tr29/#Sentence_Boundaries)).}
#' \item{`word`}{The beginnings and ends of words are boundaries.}
#' }
#' @param skip_word_none Ignore "words" that don't contain any letters
#' or numbers, i.e. punctuation. The default, `NA`, skips such "words"
#' only when splitting on `word` boundaries.
#' @export
#' @rdname modifiers
boundary <- function(
type = c("character", "line_break", "sentence", "word"),
skip_word_none = NA,
...
) {
type <- arg_match(type)
check_bool(skip_word_none, allow_na = TRUE)
if (identical(skip_word_none, NA)) {
skip_word_none <- type == "word"
}
options <- stri_opts_brkiter(
type = type,
skip_word_none = skip_word_none,
...
)
structure(
NA_character_,
options = options,
class = c("stringr_boundary", "stringr_pattern", "character")
)
}
opts <- function(x) {
if (identical(x, "")) {
stri_opts_brkiter(type = "character")
} else {
attr(x, "options")
}
}
type <- function(x, error_call = caller_env()) {
UseMethod("type")
}
#' @export
type.stringr_boundary <- function(x, error_call = caller_env()) {
"bound"
}
#' @export
type.stringr_regex <- function(x, error_call = caller_env()) {
"regex"
}
#' @export
type.stringr_coll <- function(x, error_call = caller_env()) {
"coll"
}
#' @export
type.stringr_fixed <- function(x, error_call = caller_env()) {
"fixed"
}
#' @export
type.character <- function(x, error_call = caller_env()) {
if (any(is.na(x))) {
cli::cli_abort(
tr_("{.arg pattern} can not contain NAs."),
call = error_call
)
}
if (identical(x, "")) "empty" else "regex"
}
#' @export
type.default <- function(x, error_call = caller_env()) {
if (inherits(x, "regex")) {
# Fallback for rex
return("regex")
}
cli::cli_abort(
tr_(
"{.arg pattern} must be a character vector, not {.obj_type_friendly {x}}."
),
call = error_call
)
}
#' @export
`[.stringr_pattern` <- function(x, i) {
structure(
NextMethod(),
options = attr(x, "options"),
class = class(x)
)
}
#' @export
`[[.stringr_pattern` <- function(x, i) {
structure(
NextMethod(),
options = attr(x, "options"),
class = class(x)
)
}
as_bare_character <- function(x, call = caller_env()) {
if (is.character(x) && !is.object(x)) {
# All OK!
return(x)
}
warn("Coercing `pattern` to a plain character vector.", call = call)
as.character(x)
}
================================================
FILE: R/pad.R
================================================
#' Pad a string to minimum width
#'
#' Pad a string to a minimum width, so that
#' `str_length(str_pad(x, n))` is always greater than or equal to `n`.
#'
#' @inheritParams str_detect
#' @param width Minimum width of padded strings.
#' @param side Side on which padding character is added (left, right or both).
#' @param pad Single padding character (default is a space).
#' @param use_width If `FALSE`, use the length of the string instead of the
#' width; see [str_width()]/[str_length()] for the difference.
#' @return A character vector the same length as `string`/`width`/`pad`.
#' @seealso [str_trim()] to remove whitespace;
#' [str_trunc()] to decrease the maximum width of a string.
#' @export
#' @examples
#' rbind(
#' str_pad("hadley", 30, "left"),
#' str_pad("hadley", 30, "right"),
#' str_pad("hadley", 30, "both")
#' )
#'
#' # All arguments are vectorised except side
#' str_pad(c("a", "abc", "abcdef"), 10)
#' str_pad("a", c(5, 10, 20))
#' str_pad("a", 10, pad = c("-", "_", " "))
#'
#' # Longer strings are returned unchanged
#' str_pad("hadley", 3)
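#'
#' # A sketch of `use_width = FALSE`, assuming a wide character such as
#' # "\u4e2d" occupies two display columns: padding then counts characters
#' # (str_length()) rather than display width (str_width())
#' wide <- "\u4e2d"
#' str_length(str_pad(wide, 4))
#' str_length(str_pad(wide, 4, use_width = FALSE))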
str_pad <- function(
string,
width,
side = c("left", "right", "both"),
pad = " ",
use_width = TRUE
) {
vctrs::vec_size_common(string = string, width = width, pad = pad)
side <- arg_match(side)
check_bool(use_width)
out <- switch(
side,
left = stri_pad_left(string, width, pad = pad, use_length = !use_width),
right = stri_pad_right(string, width, pad = pad, use_length = !use_width),
both = stri_pad_both(string, width, pad = pad, use_length = !use_width)
)
# Preserve names unless `string` is recycled
if (length(out) == length(string)) copy_names(string, out) else out
}
================================================
FILE: R/remove.R
================================================
#' Remove matched patterns
#'
#' Remove matches, i.e. replace them with `""`.
#'
#' @inheritParams str_detect
#' @return A character vector the same length as `string`/`pattern`.
#' @seealso [str_replace()] for the underlying implementation.
#' @export
#' @examples
#' fruits <- c("one apple", "two pears", "three bananas")
#' str_remove(fruits, "[aeiou]")
#' str_remove_all(fruits, "[aeiou]")
str_remove <- function(string, pattern) {
str_replace(string, pattern, "")
}
#' @export
#' @rdname str_remove
str_remove_all <- function(string, pattern) {
str_replace_all(string, pattern, "")
}
================================================
FILE: R/replace.R
================================================
#' Replace matches with new text
#'
#' `str_replace()` replaces the first match; `str_replace_all()` replaces
#' all matches.
#'
#' @inheritParams str_detect
#' @param pattern Pattern to look for.
#'
#' The default interpretation is a regular expression, as described
#' in [stringi::about_search_regex]. Control options with
#' [regex()].
#'
#' For `str_replace_all()` this can also be a named vector
#' (`c(pattern1 = replacement1)`), in order to perform multiple replacements
#' in each element of `string`.
#'
#' Match a fixed string (i.e. by comparing only bytes), using
#' [fixed()]. This is fast, but approximate. Generally,
#' for matching human text, you'll want [coll()] which
#' respects character matching rules for the specified locale.
#'
#' You cannot match boundaries, including `""`, with this function.
#' @param replacement The replacement value, usually a single string,
#' but it can be a vector the same length as `string` or `pattern`.
#' References of the form `\1`, `\2`, etc will be replaced with
#' the contents of the respective matched group (created by `()`).
#'
#' Alternatively, supply a function (or formula): it will be passed a single
#' character vector and should return a character vector of the same length.
#'
#' To replace the complete string with `NA`, use
#' `replacement = NA_character_`.
#' @return A character vector the same length as
#' `string`/`pattern`/`replacement`.
#' @seealso [str_replace_na()] to turn missing values into "NA";
#' [stringi::stri_replace()] for the underlying implementation.
#' @export
#' @examples
#' fruits <- c("one apple", "two pears", "three bananas")
#' str_replace(fruits, "[aeiou]", "-")
#' str_replace_all(fruits, "[aeiou]", "-")
#' str_replace_all(fruits, "[aeiou]", toupper)
#' str_replace_all(fruits, "b", NA_character_)
#'
#' str_replace(fruits, "([aeiou])", "")
#' str_replace(fruits, "([aeiou])", "\\1\\1")
#'
#' # Note that str_replace() is vectorised along string, pattern, and replacement
#' str_replace(fruits, "[aeiou]", c("1", "2", "3"))
#' str_replace(fruits, c("a", "e", "i"), "-")
#'
#' # If you want to apply multiple patterns and replacements to the same
#' # string, pass a named vector to pattern.
#' fruits %>%
#' str_c(collapse = "---") %>%
#' str_replace_all(c("one" = "1", "two" = "2", "three" = "3"))
#'
#' # Use a function for more sophisticated replacement. This example
#' # replaces colour names with their hex values.
#' colours <- str_c("\\b", colors(), "\\b", collapse="|")
#' col2hex <- function(col) {
#' rgb <- col2rgb(col)
#' rgb(rgb["red", ], rgb["green", ], rgb["blue", ], maxColorValue = 255)
#' }
#'
#' x <- c(
#' "Roses are red, violets are blue",
#' "My favourite colour is green"
#' )
#' str_replace_all(x, colours, col2hex)
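#'
#' # A formula can stand in for a function; as a sketch, this should give
#' # the same result as passing `toupper` directly
#' str_replace_all(fruits, "[aeiou]", ~ toupper(.x))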
str_replace <- function(string, pattern, replacement) {
if (!missing(replacement) && is_replacement_fun(replacement)) {
replacement <- as_function(replacement)
return(str_transform(string, pattern, replacement))
}
check_lengths(string, pattern, replacement)
out <- switch(
type(pattern),
empty = no_empty(),
bound = no_boundary(),
fixed = stri_replace_first_fixed(
string,
pattern,
replacement,
opts_fixed = opts(pattern)
),
coll = stri_replace_first_coll(
string,
pattern,
replacement,
opts_collator = opts(pattern)
),
regex = stri_replace_first_regex(
string,
pattern,
fix_replacement(replacement),
opts_regex = opts(pattern)
)
)
preserve_names_if_possible(string, pattern, out)
}
#' @export
#' @rdname str_replace
str_replace_all <- function(string, pattern, replacement) {
if (!missing(replacement) && is_replacement_fun(replacement)) {
replacement <- as_function(replacement)
return(str_transform_all(string, pattern, replacement))
}
if (!is.null(names(pattern))) {
vec <- FALSE
replacement <- unname(pattern)
pattern[] <- names(pattern)
} else {
check_lengths(string, pattern, replacement)
vec <- TRUE
}
out <- switch(
type(pattern),
empty = no_empty(),
bound = no_boundary(),
fixed = stri_replace_all_fixed(
string,
pattern,
replacement,
vectorize_all = vec,
opts_fixed = opts(pattern)
),
coll = stri_replace_all_coll(
string,
pattern,
replacement,
vectorize_all = vec,
opts_collator = opts(pattern)
),
regex = stri_replace_all_regex(
string,
pattern,
fix_replacement(replacement),
vectorize_all = vec,
opts_regex = opts(pattern)
)
)
preserve_names_if_possible(string, pattern, out)
}
is_replacement_fun <- function(x) {
is.function(x) || is_formula(x)
}
fix_replacement <- function(x, error_call = caller_env()) {
check_character(x, arg = "replacement", call = error_call)
vapply(x, fix_replacement_one, character(1), USE.NAMES = FALSE)
}
fix_replacement_one <- function(x) {
if (is.na(x)) {
return(x)
}
chars <- str_split(x, "")[[1]]
out <- character(length(chars))
escaped <- logical(length(chars))
in_escape <- FALSE
for (i in seq_along(chars)) {
escaped[[i]] <- in_escape
char <- chars[[i]]
if (in_escape) {
# Escape character not printed previously so must include here
if (char == "$") {
out[[i]] <- "\\\\$"
} else if (char >= "0" && char <= "9") {
out[[i]] <- paste0("$", char)
} else {
out[[i]] <- paste0("\\", char)
}
in_escape <- FALSE
} else {
if (char == "$") {
out[[i]] <- "\\$"
} else if (char == "\\") {
in_escape <- TRUE
} else {
out[[i]] <- char
}
}
}
# tibble::tibble(chars, out, escaped)
paste0(out, collapse = "")
}
#' Turn NA into "NA"
#'
#' @inheritParams str_replace
#' @param replacement A single string.
#' @export
#' @examples
#' str_replace_na(c(NA, "abc", "def"))
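#'
#' # The replacement text can be changed; "(missing)" is only an illustration
#' str_replace_na(c(NA, "abc", "def"), "(missing)")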
str_replace_na <- function(string, replacement = "NA") {
check_string(replacement)
copy_names(string, stri_replace_na(string, replacement))
}
str_transform <- function(string, pattern, replacement) {
loc <- str_locate(string, pattern)
new <- replacement(str_sub(string, loc))
str_sub(string, loc, omit_na = TRUE) <- new
string
}
str_transform_all <- function(
string,
pattern,
replacement,
error_call = caller_env()
) {
locs <- str_locate_all(string, pattern)
old <- str_sub_all(string, locs)
# unchop list into a vector, apply replacement(), and then rechop back into
# a list
old_flat <- vctrs::list_unchop(old)
if (length(old_flat) == 0) {
# minor optimisation to avoid problems with the many replacement
# functions that use paste
new_flat <- character()
} else {
withCallingHandlers(
new_flat <- replacement(old_flat),
error = function(cnd) {
cli::cli_abort(
c(
tr_("Failed to apply {.arg replacement} function."),
i = tr_("It must accept a character vector of any length.")
),
parent = cnd,
call = error_call
)
}
)
}
if (!is.character(new_flat)) {
cli::cli_abort(
tr_(
"{.arg replacement} function must return a character vector, not {.obj_type_friendly {new_flat}}."
),
call = error_call
)
}
if (length(new_flat) != length(old_flat)) {
cli::cli_abort(
tr_(
"{.arg replacement} function must return a vector the same length as the input ({length(old_flat)}), not length {length(new_flat)}."
),
call = error_call
)
}
idx <- chop_index(old)
new <- vctrs::vec_chop(new_flat, idx)
stringi::stri_sub_all(string, locs) <- new
string
}
chop_index <- function(x) {
ls <- lengths(x)
start <- cumsum(c(1L, ls[-length(ls)]))
end <- start + ls - 1L
lapply(seq_along(ls), function(i) seq2(start[[i]], end[[i]]))
}
================================================
FILE: R/sort.R
================================================
#' Order, rank, or sort a character vector
#'
#' * `str_sort()` returns the sorted vector.
#' * `str_order()` returns an integer vector that gives the desired
#' order when used for subsetting, i.e. `x[str_order(x)]` is the same
#' as `str_sort(x)`.
#' * `str_rank()` returns the ranks of the values, i.e.
#' `arrange(df, str_rank(x))` is the same as `str_sort(df$x)`.
#'
#' @param x A character vector to sort.
#' @param decreasing A boolean. If `FALSE`, the default, sorts from
#' lowest to highest; if `TRUE` sorts from highest to lowest.
#' @param na_last Where should `NA` go? `TRUE` at the end,
#' `FALSE` at the beginning, `NA` dropped.
#' @param numeric If `TRUE`, will sort digits numerically, instead
#' of as strings.
#' @param ... Other options used to control collation. Passed on to
#' [stringi::stri_opts_collator()].
#' @inheritParams coll
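#' @return A character vector (`str_sort()`) or an integer vector
#' (`str_order()`, `str_rank()`), the same length as `x` unless
#' `na_last = NA` drops missing values.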
#' @seealso [stringi::stri_order()] for the underlying implementation.
#' @export
#' @examples
#' x <- c("apple", "car", "happy", "char")
#' str_sort(x)
#'
#' str_order(x)
#' x[str_order(x)]
#'
#' str_rank(x)
#'
#' # In Czech, ch is a digraph that sorts after h
#' str_sort(x, locale = "cs")
#'
#' # Use numeric = TRUE to sort numbers in strings
#' x <- c("100a10", "100a5", "2b", "2a")
#' str_sort(x)
#' str_sort(x, numeric = TRUE)
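#'
#' # A sketch of the `decreasing` and `na_last` arguments on a small
#' # made-up vector
#' y <- c("b", NA, "a")
#' str_sort(y, decreasing = TRUE)
#' str_sort(y, na_last = FALSE)
#' str_sort(y, na_last = NA)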
str_order <- function(
x,
decreasing = FALSE,
na_last = TRUE,
locale = "en",
numeric = FALSE,
...
) {
check_bool(decreasing)
check_bool(na_last, allow_na = TRUE)
check_string(locale)
check_bool(numeric)
opts <- stri_opts_collator(locale, numeric = numeric, ...)
stri_order(
x,
decreasing = decreasing,
na_last = na_last,
opts_collator = opts
)
}
#' @export
#' @rdname str_order
str_rank <- function(x, locale = "en", numeric = FALSE, ...) {
check_string(locale)
check_bool(numeric)
opts <- stri_opts_collator(locale, numeric = numeric, ...)
stri_rank(x, opts_collator = opts)
}
#' @export
#' @rdname str_order
str_sort <- function(
x,
decreasing = FALSE,
na_last = TRUE,
locale = "en",
numeric = FALSE,
...
) {
check_bool(decreasing)
check_bool(na_last, allow_na = TRUE)
check_string(locale)
check_bool(numeric)
opts <- stri_opts_collator(locale, numeric = numeric, ...)
idx <- stri_order(
x,
decreasing = decreasing,
na_last = na_last,
opts_collator = opts
)
x[idx]
}
================================================
FILE: R/split.R
================================================
#' Split up a string into pieces
#'
#' @description
#' This family of functions provides various ways of splitting a string up
#' into pieces. These two functions return a character vector:
#'
#' * `str_split_1()` takes a single string and splits it into pieces,
#' returning a single character vector.
#' * `str_split_i()` splits each string in a character vector into pieces and
#' extracts the `i`th value, returning a character vector.
#'
#' These two functions return a more complex object:
#'
#' * `str_split()` splits each string in a character vector into a varying
#' number of pieces, returning a list of character vectors.
#' * `str_split_fixed()` splits each string in a character vector into a
#' fixed number of pieces, returning a character matrix.
#'
#' @inheritParams str_extract
#' @param n Maximum number of pieces to return. Default (Inf) uses all
#' possible split positions.
#'
#' For `str_split()`, this determines the maximum length of each element
#' of the output. For `str_split_fixed()`, this determines the number of
#' columns in the output; if an input is too short, the result will be padded
#' with `""`.
#' @return
#' * `str_split_1()`: a character vector.
#' * `str_split()`: a list the same length as `string`/`pattern` containing
#' character vectors.
#' * `str_split_fixed()`: a character matrix with `n` columns and the same
#' number of rows as the length of `string`/`pattern`.
#' * `str_split_i()`: a character vector the same length as `string`/`pattern`.
#' @seealso [stringi::stri_split()] for the underlying implementation.
#' @export
#' @examples
#' fruits <- c(
#' "apples and oranges and pears and bananas",
#' "pineapples and mangos and guavas"
#' )
#'
#' str_split(fruits, " and ")
#' str_split(fruits, " and ", simplify = TRUE)
#'
#' # If you want to split a single string, use `str_split_1`
#' str_split_1(fruits[[1]], " and ")
#'
#' # Specify n to restrict the number of possible matches
#' str_split(fruits, " and ", n = 3)
#' str_split(fruits, " and ", n = 2)
#' # If n is greater than the number of pieces, no padding occurs
#' str_split(fruits, " and ", n = 5)
#'
#' # Use fixed to return a character matrix
#' str_split_fixed(fruits, " and ", 3)
#' str_split_fixed(fruits, " and ", 4)
#'
#' # str_split_i extracts only a single piece from a string
#' str_split_i(fruits, " and ", 1)
#' str_split_i(fruits, " and ", 4)
#' # use a negative number to select from the end
#' str_split_i(fruits, " and ", -1)
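#'
#' # The pattern is a regular expression by default, so, as a sketch,
#' # separators of varying width can be handled in one call
#' str_split_1("a, b,c ,d", "\\s*,\\s*")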
str_split <- function(string, pattern, n = Inf, simplify = FALSE) {
check_lengths(string, pattern)
check_positive_integer(n)
check_bool(simplify, allow_na = TRUE)
if (identical(n, Inf)) {
n <- -1L
}
out <- switch(
type(pattern),
empty = stri_split_boundaries(
string,
n = n,
simplify = simplify,
opts_brkiter = opts(pattern)
),
bound = stri_split_boundaries(
string,
n = n,
simplify = simplify,
opts_brkiter = opts(pattern)
),
fixed = stri_split_fixed(
string,
pattern,
n = n,
simplify = simplify,
opts_fixed = opts(pattern)
),
regex = stri_split_regex(
string,
pattern,
n = n,
simplify = simplify,
opts_regex = opts(pattern)
),
coll = stri_split_coll(
string,
pattern,
n = n,
simplify = simplify,
opts_collator = opts(pattern)
)
)
preserve_names_if_possible(string, pattern, out)
}
#' @export
#' @rdname str_split
str_split_1 <- function(string, pattern) {
check_string(string)
str_split(string, pattern)[[1]]
}
#' @export
#' @rdname str_split
str_split_fixed <- function(string, pattern, n) {
check_lengths(string, pattern)
check_positive_integer(n)
str_split(string, pattern, n = n, simplify = TRUE)
}
#' @export
#' @rdname str_split
#' @param i Element to return. Use a negative value to count from the
#' right hand side.
str_split_i <- function(string, pattern, i) {
check_number_whole(i)
if (i > 0) {
out <- str_split(string, pattern, simplify = NA, n = i + 1)
col <- out[, i]
if (keep_names(string, pattern)) copy_names(string, col) else col
} else if (i < 0) {
i <- abs(i)
pieces <- str_split(string, pattern)
last <- function(x) {
n <- length(x)
if (i > n) {
NA_character_
} else {
x[[n + 1 - i]]
}
}
out <- map_chr(pieces, last)
preserve_names_if_possible(string, pattern, out)
} else {
cli::cli_abort(tr_("{.arg i} must not be 0."))
}
}
check_positive_integer <- function(
x,
arg = caller_arg(x),
call = caller_env()
) {
if (!identical(x, Inf)) {
check_number_whole(x, min = 1, arg = arg, call = call)
}
}
================================================
FILE: R/stringr-package.R
================================================
#' @keywords internal
"_PACKAGE"
## usethis namespace: start
#' @import stringi
#' @import rlang
#' @importFrom glue glue
#' @importFrom lifecycle deprecated
## usethis namespace: end
NULL
================================================
FILE: R/sub.R
================================================
#' Get and set substrings using their positions
#'
#' `str_sub()` extracts or replaces the substring at a single position in each
#' string. `str_sub_all()` allows you to extract substrings at multiple
#' positions in every string.
#'
#' @inheritParams str_detect
#' @param start,end A pair of integer vectors defining the range of characters
#' to extract (inclusive). Positive values count from the left of the string,
#' and negative values count from the right. In other words, if `string` is
#' `"abcdef"` then 1 refers to `"a"` and -1 refers to `"f"`.
#'
#' Alternatively, instead of a pair of vectors, you can pass a matrix to
#' `start`. The matrix should have two columns, either labelled `start`
#' and `end`, or `start` and `length`. This makes `str_sub()` work directly
#' with the output from [str_locate()] and friends.
#'
#' @param omit_na Single logical value. If `TRUE`, missing values in any of the
#' arguments provided will result in an unchanged input.
#' @param value Replacement string.
#' @return
#' * `str_sub()`: A character vector the same length as `string`/`start`/`end`.
#' * `str_sub_all()`: A list the same length as `string`. Each element is
#' a character vector the same length as `start`/`end`.
#'
#' If `end` comes before `start` or `start` is outside the range of `string`
#' then the corresponding output will be the empty string.
#' @seealso The underlying implementation in [stringi::stri_sub()]
#' @export
#' @examples
#' hw <- "Hadley Wickham"
#'
#' str_sub(hw, 1, 6)
#' str_sub(hw, end = 6)
#' str_sub(hw, 8, 14)
#' str_sub(hw, 8)
#'
#' # Negative values index from end of string
#' str_sub(hw, -1)
#' str_sub(hw, -7)
#' str_sub(hw, end = -7)
#'
#' # str_sub() is vectorised by both string and position
#' str_sub(hw, c(1, 8), c(6, 14))
#'
#' # if you want to extract multiple positions from multiple strings,
#' # use str_sub_all()
#' x <- c("abcde", "ghifgh")
#' str_sub(x, c(1, 2), c(2, 4))
#' str_sub_all(x, start = c(1, 2), end = c(2, 4))
#'
#' # Alternatively, you can pass in a two column matrix, as in the
#' # output from str_locate_all
#' pos <- str_locate_all(hw, "[aeio]")[[1]]
#' pos
#' str_sub(hw, pos)
#'
#' # You can also use `str_sub()` to modify strings:
#' x <- "BBCDEF"
#' str_sub(x, 1, 1) <- "A"; x
#' str_sub(x, -1, -1) <- "K"; x
#' str_sub(x, -2, -2) <- "GHIJ"; x
#' str_sub(x, 2, -2) <- ""; x
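#'
#' # A sketch of `omit_na = TRUE`: a missing replacement leaves the
#' # corresponding string unchanged instead of turning it into NA
#' y <- c("abc", "def")
#' str_sub(y, 1, 1, omit_na = TRUE) <- c("X", NA); y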
str_sub <- function(string, start = 1L, end = -1L) {
vctrs::vec_size_common(string = string, start = start, end = end)
out <- if (is.matrix(start)) {
stri_sub(string, from = start)
} else {
stri_sub(string, from = start, to = end)
}
# Preserve names unless `string` is recycled
if (length(out) == length(string)) copy_names(string, out) else out
}
#' @export
#' @rdname str_sub
"str_sub<-" <- function(string, start = 1L, end = -1L, omit_na = FALSE, value) {
vctrs::vec_size_common(
string = string,
start = start,
end = end,
value = value
)
if (is.matrix(start)) {
stri_sub(string, from = start, omit_na = omit_na) <- value
} else {
stri_sub(string, from = start, to = end, omit_na = omit_na) <- value
}
string
}
#' @export
#' @rdname str_sub
str_sub_all <- function(string, start = 1L, end = -1L) {
out <- if (is.matrix(start)) {
stri_sub_all(string, from = start)
} else {
stri_sub_all(string, from = start, to = end)
}
copy_names(string, out)
}
================================================
FILE: R/subset.R
================================================
#' Find matching elements
#'
#' @description
#' `str_subset()` returns all elements of `string` where there's at least
#' one match to `pattern`. It's a wrapper around `x[str_detect(x, pattern)]`,
#' and is equivalent to `grep(pattern, x, value = TRUE)`.
#'
#' Use [str_locate()] to find the location of the match _within_ each string.
#'
#' @inheritParams str_detect
#' @return A character vector, usually smaller than `string`.
#' @seealso [grep()] with argument `value = TRUE`,
#' [stringi::stri_subset()] for the underlying implementation.
#' @export
#' @examples
#' fruit <- c("apple", "banana", "pear", "pineapple")
#' str_subset(fruit, "a")
#'
#' str_subset(fruit, "^a")
#' str_subset(fruit, "a$")
#' str_subset(fruit, "b")
#' str_subset(fruit, "[aeiou]")
#'
#' # Elements that don't match
#' str_subset(fruit, "^p", negate = TRUE)
#'
#' # Missings never match
#' str_subset(c("a", NA, "b"), ".")
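#'
#' # Modifiers work too; as a sketch, fixed() treats "." as a literal dot
#' # rather than as "any character"
#' str_subset(c("a.b", "abb"), ".")
#' str_subset(c("a.b", "abb"), fixed("."))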
str_subset <- function(string, pattern, negate = FALSE) {
check_lengths(string, pattern)
check_bool(negate)
idx <- switch(
type(pattern),
empty = no_empty(),
bound = no_boundary(),
fixed = str_detect(string, pattern, negate = negate),
coll = str_detect(string, pattern, negate = negate),
regex = str_detect(string, pattern, negate = negate)
)
idx[is.na(idx)] <- FALSE
string[idx]
}
#' Find matching indices
#'
#' `str_which()` returns the indices of `string` where there's at least
#' one match to `pattern`. It's a wrapper around
#' `which(str_detect(x, pattern))`, and is equivalent to `grep(pattern, x)`.
#'
#' @inheritParams str_detect
#' @return An integer vector, usually smaller than `string`.
#' @export
#' @examples
#' fruit <- c("apple", "banana", "pear", "pineapple")
#' str_which(fruit, "a")
#'
#' # Elements that don't match
#' str_which(fruit, "^p", negate = TRUE)
#'
#' # Missings never match
#' str_which(c("a", NA, "b"), ".")
str_which <- function(string, pattern, negate = FALSE) {
which(str_detect(string, pattern, negate = negate))
}
================================================
FILE: R/trim.R
================================================
#' Remove whitespace
#'
#' `str_trim()` removes whitespace from start and end of string; `str_squish()`
#' removes whitespace at the start and end, and replaces all internal whitespace
#' with a single space.
#'
#' @inheritParams str_detect
#' @param side Side on which to remove whitespace: "left", "right", or
#' "both", the default.
#' @return A character vector the same length as `string`.
#' @export
#' @seealso [str_pad()] to add whitespace
#' @examples
#' str_trim(" String with trailing and leading white space\t")
#' str_trim("\n\nString with trailing and leading white space\n\n")
#'
#' str_squish(" String with trailing, middle, and leading white space\t")
#' str_squish("\n\nString with excess, trailing and leading white space\n\n")
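#'
#' # Use `side` to trim only one end (the input is made up for illustration)
#' str_trim("  padded on both ends  ", side = "left")
#' str_trim("  padded on both ends  ", side = "right")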
str_trim <- function(string, side = c("both", "left", "right")) {
side <- arg_match(side)
out <- switch(
side,
left = stri_trim_left(string),
right = stri_trim_right(string),
both = stri_trim_both(string)
)
copy_names(string, out)
}
#' @export
#' @rdname str_trim
str_squish <- function(string) {
copy_names(string, stri_trim_both(str_replace_all(string, "\\s+", " ")))
}
================================================
FILE: R/trunc.R
================================================
#' Truncate a string to maximum width
#'
#' Truncate a string to a maximum number of characters, so that
#' `str_length(str_trunc(x, n))` is always less than or equal to `n`.
#'
#' @inheritParams str_detect
#' @param width Maximum width of string.
#' @param side,ellipsis Location and content of ellipsis that indicates
#' content has been removed.
#' @return A character vector the same length as `string`.
#' @seealso [str_pad()] to increase the minimum width of a string.
#' @export
#' @examples
#' x <- "This string is moderately long"
#' rbind(
#' str_trunc(x, 20, "right"),
#' str_trunc(x, 20, "left"),
#' str_trunc(x, 20, "center")
#' )
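#'
#' # Strings no longer than `width` are returned unchanged, and `ellipsis`
#' # can be any single string (the values here are only for illustration)
#' str_trunc(c("short", x), 10)
#' str_trunc(x, 20, ellipsis = "~")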
str_trunc <- function(
string,
width,
side = c("right", "left", "center"),
ellipsis = "..."
) {
check_number_whole(width)
side <- arg_match(side)
check_string(ellipsis)
len <- str_length(string)
too_long <- !is.na(string) & len > width
width... <- width - str_length(ellipsis)
if (width... < 0) {
cli::cli_abort(
tr_(
"`width` ({width}) is shorter than `ellipsis` ({str_length(ellipsis)})."
)
)
}
string[too_long] <- switch(
side,
right = str_c(str_sub(string[too_long], 1, width...), ellipsis),
left = str_c(
ellipsis,
str_sub(string[too_long], len[too_long] - width... + 1, -1)
),
center = str_c(
str_sub(string[too_long], 1, ceiling(width... / 2)),
ellipsis,
str_sub(string[too_long], len[too_long] - floor(width... / 2) + 1, -1)
)
)
string
}
================================================
FILE: R/unique.R
================================================
#' Remove duplicated strings
#'
#' `str_unique()` removes duplicated values, with optional control over
#' how duplication is measured.
#'
#' @inheritParams str_detect
#' @inheritParams str_equal
#' @return A character vector, usually shorter than `string`.
#' @seealso [unique()], [stringi::stri_unique()] which this function wraps.
#' @examples
#' str_unique(c("a", "b", "c", "b", "a"))
#'
#' str_unique(c("a", "b", "c", "B", "A"))
#' str_unique(c("a", "b", "c", "B", "A"), ignore_case = TRUE)
#'
#' # Use ... to pass additional arguments to stri_unique()
#' str_unique(c("motley", "mötley", "pinguino", "pingüino"))
#' str_unique(c("motley", "mötley", "pinguino", "pingüino"), strength = 1)
#' @export
str_unique <- function(string, locale = "en", ignore_case = FALSE, ...) {
check_string(locale)
check_bool(ignore_case)
opts <- str_opts_collator(
locale = locale,
ignore_case = ignore_case,
...
)
keep <- !stringi::stri_duplicated(string, opts_collator = opts)
string[keep]
}
================================================
FILE: R/utils.R
================================================
#' Pipe operator
#'
#' @name %>%
#' @rdname pipe
#' @keywords internal
#' @export
#' @importFrom magrittr %>%
#' @usage lhs \%>\% rhs
NULL
check_lengths <- function(
string,
pattern,
replacement = NULL,
error_call = caller_env()
) {
# stringi already correctly recycles vectors of length 0 and 1
# we just want more stringent vctrs checks for other lengths
vctrs::vec_size_common(
string = string,
pattern = pattern,
replacement = replacement,
.call = error_call
)
}
no_boundary <- function(call = caller_env()) {
cli::cli_abort(tr_("{.arg pattern} can't be a boundary."), call = call)
}
no_empty <- function(call = caller_env()) {
cli::cli_abort(
tr_("{.arg pattern} can't be the empty string ({.code \"\"})."),
call = call
)
}
tr_ <- function(...) {
enc2utf8(gettext(paste0(...), domain = "R-stringr"))
}
# copy names from `string` to output, regardless of output type
copy_names <- function(from, to) {
nm <- names(from)
if (is.null(nm)) {
return(to)
}
if (is.matrix(to)) {
rownames(to) <- nm
to
} else {
set_names(to, nm)
}
}
# keep names if pattern is scalar (i.e. vectorised) or same length as string.
keep_names <- function(string, pattern) {
length(pattern) == 1L || length(pattern) == length(string)
}
preserve_names_if_possible <- function(string, pattern, out) {
if (keep_names(string, pattern)) {
copy_names(string, out)
} else {
out
}
}
================================================
FILE: R/view.R
================================================
#' View strings and matches
#'
#' @description
#' `str_view()` is used to print the underlying representation of a string and
#' to see how a `pattern` matches.
#'
#' Matches are surrounded by `<>` and unusual whitespace (i.e. all whitespace
#' apart from `" "` and `"\n"`) is surrounded by `{}` and escaped. Where
#' possible, matches and unusual whitespace are coloured blue and `NA`s red.
#'
#' @inheritParams str_detect
#' @param match If `pattern` is supplied, which elements should be shown?
#'
#' * `TRUE`, the default, shows only elements that match the pattern.
#' * `NA` shows all elements.
#' * `FALSE` shows only elements that don't match the pattern.
#'
#' If `pattern` is not supplied, all elements are always shown.
#' @param html Use HTML output? If `TRUE` will create an HTML widget; if `FALSE`
#' will style using ANSI escapes.
#' @param use_escapes If `TRUE`, all non-ASCII characters will be rendered
#' with unicode escapes. This is useful to see exactly what underlying
#' values are stored in the string.
#' @export
#' @examples
#' # Show special characters
#' str_view(c("\"\\", "\\\\\\", "fgh", NA, "NA"))
#'
#' # A non-breaking space looks like a regular space:
#' nbsp <- "Hi\u00A0you"
#' nbsp
#' # But it doesn't behave like one:
#' str_detect(nbsp, " ")
#' # So str_view() brings it to your attention with a blue background
#' str_view(nbsp)
#'
#' # You can also use escapes to see all non-ASCII characters
#' str_view(nbsp, use_escapes = TRUE)
#'
#' # Supply a pattern to see where it matches
#' str_view(c("abc", "def", "fghi"), "[aeiou]")
#' str_view(c("abc", "def", "fghi"), "^")
#' str_view(c("abc", "def", "fghi"), "..")
#'
#' # By default, only matching strings will be shown
#' str_view(c("abc", "def", "fghi"), "e")
#' # but you can show all:
#' str_view(c("abc", "def", "fghi"), "e", match = NA)
#' # or just those that don't match:
#' str_view(c("abc", "def", "fghi"), "e", match = FALSE)
str_view <- function(
string,
pattern = NULL,
match = TRUE,
html = FALSE,
use_escapes = FALSE
) {
rec <- vctrs::vec_recycle_common(string = string, pattern = pattern)
string <- rec$string
pattern <- rec$pattern
check_bool(match, allow_na = TRUE)
check_bool(html)
check_bool(use_escapes)
filter <- str_view_filter(string, pattern, match)
out <- string[filter]
pattern <- pattern[filter]
if (!is.null(pattern)) {
out <- str_replace_all(out, pattern, str_view_highlighter(html))
}
if (use_escapes) {
out <- stri_escape_unicode(out)
out <- str_replace_all(out, fixed("\\u001b"), "\u001b")
} else {
out <- str_view_special(out, html = html)
}
str_view_print(out, filter, html = html)
}
#' @rdname str_view
#' @usage NULL
#' @export
str_view_all <- function(
string,
pattern = NULL,
match = NA,
html = FALSE,
use_escapes = FALSE
) {
lifecycle::deprecate_warn("1.5.0", "str_view_all()", "str_view()")
str_view(
string = string,
pattern = pattern,
match = match,
html = html,
use_escapes = use_escapes
)
}
str_view_filter <- function(x, pattern, match) {
if (is.null(pattern) || inherits(pattern, "stringr_boundary")) {
rep(TRUE, length(x))
} else {
if (identical(match, TRUE)) {
str_detect(x, pattern) & !is.na(x)
} else if (identical(match, FALSE)) {
!str_detect(x, pattern) | is.na(x)
} else {
rep(TRUE, length(x))
}
}
}
# Helpers -----------------------------------------------------------------
str_view_highlighter <- function(html = TRUE) {
if (html) {
function(x) str_c("<span class='match'>", x, "</span>")
} else {
function(x) {
out <- cli::col_cyan("<", x, ">")
# Ensure styling starts and ends within each line
out <- cli::ansi_strsplit(out, "\n", fixed = TRUE)
out <- map_chr(out, str_flatten, "\n")
out
}
}
}
str_view_special <- function(x, html = TRUE) {
if (html) {
replace <- function(x) str_c("<span class='special'>", x, "</span>")
} else {
replace <- function(x) {
if (length(x) == 0) {
return(character())
}
cli::col_cyan("{", stri_escape_unicode(x), "}")
}
}
# Highlight any non-standard whitespace characters
str_replace_all(x, "[\\p{Whitespace}-- \n]+", replace)
}
str_view_print <- function(x, filter, html = TRUE) {
if (html) {
str_view_widget(x)
} else {
structure(x, id = which(filter), class = "stringr_view")
}
}
str_view_widget <- function(lines) {
check_installed(c("htmltools", "htmlwidgets"))
lines <- str_replace_na(lines)
bullets <- str_c(
"<ul>\n",
str_c(" <li><pre>", lines, "</pre></li>", collapse = "\n"),
"\n</ul>"
)
html <- htmltools::HTML(bullets)
size <- htmlwidgets::sizingPolicy(
knitr.figure = FALSE,
defaultHeight = pmin(10 * length(lines), 300),
knitr.defaultHeight = "100%"
)
htmlwidgets::createWidget(
"str_view",
list(html = html),
sizingPolicy = size,
package = "stringr"
)
}
#' @export
print.stringr_view <- function(x, ..., n = getOption("stringr.view_n", 20)) {
n_extra <- length(x) - n
if (n_extra > 0) {
x <- x[seq_len(n)]
}
if (length(x) == 0) {
cli::cli_inform(c(x = "Empty `string` provided.\n"))
return(invisible(x))
}
bar <- if (cli::is_utf8_output()) "\u2502" else "|"
id <- format(paste0("[", attr(x, "id"), "] "), justify = "right")
indent <- paste0(cli::col_grey(id, bar), " ")
exdent <- paste0(strrep(" ", nchar(id[[1]])), cli::col_grey(bar), " ")
x[is.na(x)] <- cli::col_red("NA")
x <- paste0(indent, x)
x <- str_replace_all(x, "\n", paste0("\n", exdent))
cat(x, sep = "\n")
if (n_extra > 0) {
cat("... and ", n_extra, " more\n", sep = "")
}
invisible(x)
}
#' @export
`[.stringr_view` <- function(x, i, ...) {
structure(NextMethod(), id = attr(x, "id")[i], class = "stringr_view")
}
================================================
FILE: R/word.R
================================================
#' Extract words from a sentence
#'
#' @inheritParams str_detect
#' @param start,end Pair of integer vectors giving range of words (inclusive)
#' to extract. If negative, counts backwards from the last word.
#'
#' The default value selects the first word.
#' @param sep Separator between words. Defaults to single space.
#' @return A character vector with the same length as `string`/`start`/`end`.
#' @export
#' @examples
#' sentences <- c("Jane saw a cat", "Jane sat down")
#' word(sentences, 1)
#' word(sentences, 2)
#' word(sentences, -1)
#' word(sentences, 2, -1)
#'
#' # Also vectorised over start and end
#' word(sentences[1], 1:3, -1)
#' word(sentences[1], 1, 1:4)
#'
#' # Can define words by other separators
#' str <- 'abc.def..123.4568.999'
#' word(str, 1, sep = fixed('..'))
#' word(str, 2, sep = fixed('..'))
word <- function(string, start = 1L, end = start, sep = fixed(" ")) {
args <- vctrs::vec_recycle_common(string = string, start = start, end = end)
string <- args$string
start <- args$start
end <- args$end
breaks <- str_locate_all(string, sep)
words <- lapply(breaks, invert_match)
# Convert negative values into actual positions
len <- vapply(words, nrow, integer(1))
neg_start <- !is.na(start) & start < 0L
start[neg_start] <- start[neg_start] + len[neg_start] + 1L
neg_end <- !is.na(end) & end < 0L
end[neg_end] <- end[neg_end] + len[neg_end] + 1L
# Replace indexes past end with NA
start[start > len] <- NA
end[end > len] <- NA
# To return all words when trying to extract more words than available
start[start < 1L] <- 1
# Extract locations
starts <- mapply(function(word, loc) word[loc, "start"], words, start)
ends <- mapply(function(word, loc) word[loc, "end"], words, end)
copy_names(string, str_sub(string, starts, ends))
}
================================================
FILE: R/wrap.R
================================================
#' Wrap words into nicely formatted paragraphs
#'
#' Wrap words into paragraphs, minimizing the "raggedness" of the lines
#' (i.e. the variation in line length) using the Knuth-Plass algorithm.
#'
#' @inheritParams str_detect
#' @param width Positive integer giving target line width (in number of
#' characters). A width less than or equal to 1 will put each word on its
#' own line.
#' @param indent,exdent A non-negative integer giving the indent for the
#' first line (`indent`) and all subsequent lines (`exdent`).
#' @param whitespace_only A boolean.
#' * If `TRUE` (the default) wrapping will only occur at whitespace.
#' * If `FALSE`, can break on any non-word character (e.g. `/`, `-`).
#' @return A character vector the same length as `string`.
#' @seealso [stringi::stri_wrap()] for the underlying implementation.
#' @export
#' @examples
#' thanks_path <- file.path(R.home("doc"), "THANKS")
#' thanks <- str_c(readLines(thanks_path), collapse = "\n")
#' thanks <- word(thanks, 1, 3, fixed("\n\n"))
#' cat(str_wrap(thanks), "\n")
#' cat(str_wrap(thanks, width = 40), "\n")
#' cat(str_wrap(thanks, width = 60, indent = 2), "\n")
#' cat(str_wrap(thanks, width = 60, exdent = 2), "\n")
#' cat(str_wrap(thanks, width = 0, exdent = 2), "\n")
str_wrap <- function(
string,
width = 80,
indent = 0,
exdent = 0,
whitespace_only = TRUE
) {
check_number_decimal(width)
if (width <= 0) {
width <- 1
}
check_number_whole(indent)
check_number_whole(exdent)
check_bool(whitespace_only)
out <- stri_wrap(
string,
width = width,
indent = indent,
exdent = exdent,
whitespace_only = whitespace_only,
simplify = FALSE
)
out <- vapply(out, str_c, collapse = "\n", character(1))
copy_names(string, out)
}
================================================
FILE: README.Rmd
================================================
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-"
)
library(stringr)
```
# stringr <a href='https://stringr.tidyverse.org'><img src='man/figures/logo.png' align="right" height="139" /></a>
<!-- badges: start -->
[](https://cran.r-project.org/package=stringr)
[](https://github.com/tidyverse/stringr/actions/workflows/R-CMD-check.yaml)
[](https://app.codecov.io/gh/tidyverse/stringr?branch=main)
[](https://lifecycle.r-lib.org/articles/stages.html#stable)
<!-- badges: end -->
## Overview
Strings are not glamorous, high-profile components of R, but they do play a big role in many data cleaning and preparation tasks. The stringr package provides a cohesive set of functions designed to make working with strings as easy as possible. If you're not familiar with strings, the best place to start is the [chapter on strings](https://r4ds.hadley.nz/strings) in R for Data Science.
stringr is built on top of [stringi](https://github.com/gagolews/stringi), which uses the [ICU](https://icu.unicode.org) C library to provide fast, correct implementations of common string manipulations. stringr focusses on the most important and commonly used string manipulation functions whereas stringi provides a comprehensive set covering almost anything you can imagine. If you find that stringr is missing a function that you need, try looking in stringi. Both packages share similar conventions, so once you've mastered stringr, you should find stringi similarly easy to use.
## Installation
```r
# The easiest way to get stringr is to install the whole tidyverse:
install.packages("tidyverse")
# Alternatively, install just stringr:
install.packages("stringr")
```
## Cheatsheet
<a href="https://github.com/rstudio/cheatsheets/blob/main/strings.pdf"><img src="https://raw.githubusercontent.com/rstudio/cheatsheets/main/pngs/thumbnails/strings-cheatsheet-thumbs.png" width="630" height="242"/></a>
## Usage
All functions in stringr start with `str_` and take a vector of strings as the first argument:
```{r}
x <- c("why", "video", "cross", "extra", "deal", "authority")
str_length(x)
str_c(x, collapse = ", ")
str_sub(x, 1, 2)
```
Most string functions work with regular expressions, a concise language for describing patterns of text. For example, the regular expression `"[aeiou]"` matches any single character that is a vowel:
```{r}
str_subset(x, "[aeiou]")
str_count(x, "[aeiou]")
```
There are eight main verbs that work with patterns:
* `str_detect(x, pattern)` tells you if there's any match to the pattern:
```{r}
str_detect(x, "[aeiou]")
```
* `str_count(x, pattern)` counts the number of matches:
```{r}
str_count(x, "[aeiou]")
```
* `str_subset(x, pattern)` extracts the matching components:
```{r}
str_subset(x, "[aeiou]")
```
* `str_locate(x, pattern)` gives the position of the match:
```{r}
str_locate(x, "[aeiou]")
```
* `str_extract(x, pattern)` extracts the text of the match:
```{r}
str_extract(x, "[aeiou]")
```
* `str_match(x, pattern)` extracts parts of the match defined by parentheses:
```{r}
# extract the characters on either side of the vowel
str_match(x, "(.)[aeiou](.)")
```
* `str_replace(x, pattern, replacement)` replaces the matches with new text:
```{r}
str_replace(x, "[aeiou]", "?")
```
* `str_split(x, pattern)` splits up a string into multiple pieces:
```{r}
str_split(c("a,b", "c,d,e"), ",")
```
As well as regular expressions (the default), there are three other pattern matching engines:
* `fixed()`: match exact bytes
* `coll()`: match human letters
* `boundary()`: match boundaries
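The difference between the engines is easiest to see side by side. The following is a minimal sketch, assuming stringr is attached; the input strings are made up for illustration and are not part of the package:

```r
library(stringr)

# Hypothetical inputs, chosen only to illustrate the engines
dots <- c("a.b", "axb")

# Default regular expression: "." matches any character
as_regex <- str_detect(dots, ".")
as_regex
#> [1] TRUE TRUE

# fixed(): the pattern is compared literally, so "." is just a dot
as_fixed <- str_detect(dots, fixed("."))
as_fixed
#> [1]  TRUE FALSE

# coll(): comparison follows collation rules; here case is ignored
as_coll <- str_detect(c("STRINGR", "stringi"), coll("stringr", ignore_case = TRUE))
as_coll
#> [1]  TRUE FALSE

# boundary(): match boundaries, here the ones that delimit words
n_words <- str_count("Jane saw a cat", boundary("word"))
n_words
#> [1] 4
```

`fixed()` is usually the fastest choice when the pattern is plain text, while `coll()` is the safer one when the comparison should respect a locale.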
## RStudio Addin
The [RegExplain RStudio addin](https://www.garrickadenbuie.com/project/regexplain/) provides a friendly interface for working with regular expressions and functions from stringr. This addin allows you to interactively build your regexp, check the output of common string matching functions, consult the interactive help pages, or use the included resources to learn regular expressions.
This addin can easily be installed with devtools:
```r
# install.packages("devtools")
devtools::install_github("gadenbuie/regexplain")
```
## Compared to base R
R provides a solid set of string operations, but because they have grown organically over time, they can be inconsistent and a little hard to learn. Additionally, they lag behind the string operations in other programming languages, so that some things that are easy to do in languages like Ruby or Python are rather hard to do in R. By contrast, stringr:
* Uses consistent function and argument names. The first argument is always
the vector of strings to modify, which makes stringr work particularly well
in conjunction with the pipe:
```{r}
letters %>%
.[1:10] %>%
str_pad(3, "right") %>%
str_c(letters[2:11])
```
* Simplifies string operations by eliminating options that you don't need
95% of the time.
* Produces outputs that can easily be used as inputs. This includes ensuring
that missing inputs result in missing outputs, and zero length inputs
result in zero length outputs.
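The handling of missing and zero-length inputs can be checked directly. This is a minimal sketch, assuming stringr is attached; the inputs are made up for illustration:

```r
library(stringr)

# Missing inputs give missing outputs
na_len <- str_length(c("a", NA))
na_len
#> [1]  1 NA

# str_c() propagates NA, whereas base paste0("x", NA) returns "xNA"
na_joined <- str_c("x", NA)
na_joined
#> [1] NA

# Zero-length inputs give zero-length outputs
empty_len <- str_length(character())
empty_len
#> integer(0)

empty_upper <- str_to_upper(character())
empty_upper
#> character(0)
```

Because `NA` and empty vectors pass through unchanged, the result of one stringr call can usually be fed into the next without special-casing.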
Learn more in `vignette("from-base")`
================================================
FILE: README.md
================================================
<!-- README.md is generated from README.Rmd. Please edit that file -->
# stringr <a href='https://stringr.tidyverse.org'><img src='man/figures/logo.png' align="right" height="139" /></a>
<!-- badges: start -->
[](https://cran.r-project.org/package=stringr)
[](https://github.com/tidyverse/stringr/actions/workflows/R-CMD-check.yaml)
[](https://app.codecov.io/gh/tidyverse/stringr?branch=main)
[](https://lifecycle.r-lib.org/articles/stages.html#stable)
<!-- badges: end -->
## Overview
Strings are not glamorous, high-profile components of R, but they do
play a big role in many data cleaning and preparation tasks. The stringr
package provides a cohesive set of functions designed to make working
with strings as easy as possible. If you’re not familiar with strings,
the best place to start is the [chapter on
strings](https://r4ds.hadley.nz/strings) in R for Data Science.
stringr is built on top of
[stringi](https://github.com/gagolews/stringi), which uses the
[ICU](https://icu.unicode.org) C library to provide fast, correct
implementations of common string manipulations. stringr focusses on the
most important and commonly used string manipulation functions whereas
stringi provides a comprehensive set covering almost anything you can
imagine. If you find that stringr is missing a function that you need,
try looking in stringi. Both packages share similar conventions, so once
you’ve mastered stringr, you should find stringi similarly easy to use.
## Installation
``` r
# The easiest way to get stringr is to install the whole tidyverse:
install.packages("tidyverse")
# Alternatively, install just stringr:
install.packages("stringr")
```
## Cheatsheet
<a href="https://github.com/rstudio/cheatsheets/blob/main/strings.pdf"><img src="https://raw.githubusercontent.com/rstudio/cheatsheets/main/pngs/thumbnails/strings-cheatsheet-thumbs.png" width="630" height="242"/></a>
## Usage
All functions in stringr start with `str_` and take a vector of strings
as the first argument:
``` r
x <- c("why", "video", "cross", "extra", "deal", "authority")
str_length(x)
#> [1] 3 5 5 5 4 9
str_c(x, collapse = ", ")
#> [1] "why, video, cross, extra, deal, authority"
str_sub(x, 1, 2)
#> [1] "wh" "vi" "cr" "ex" "de" "au"
```
Most string functions work with regular expressions, a concise language
for describing patterns of text. For example, the regular expression
`"[aeiou]"` matches any single character that is a vowel:
``` r
str_subset(x, "[aeiou]")
#> [1] "video" "cross" "extra" "deal" "authority"
str_count(x, "[aeiou]")
#> [1] 0 3 1 2 2 4
```
There are eight main verbs that work with patterns:
- `str_detect(x, pattern)` tells you if there’s any match to the
pattern:
``` r
str_detect(x, "[aeiou]")
#> [1] FALSE TRUE TRUE TRUE TRUE TRUE
```
- `str_count(x, pattern)` counts the number of matches:
``` r
str_count(x, "[aeiou]")
#> [1] 0 3 1 2 2 4
```
- `str_subset(x, pattern)` extracts the matching components:
``` r
str_subset(x, "[aeiou]")
#> [1] "video" "cross" "extra" "deal" "authority"
```
- `str_locate(x, pattern)` gives the position of the match:
``` r
str_locate(x, "[aeiou]")
#> start end
#> [1,] NA NA
#> [2,] 2 2
#> [3,] 3 3
#> [4,] 1 1
#> [5,] 2 2
#> [6,] 1 1
```
- `str_extract(x, pattern)` extracts the text of the match:
``` r
str_extract(x, "[aeiou]")
#> [1] NA "i" "o" "e" "e" "a"
```
- `str_match(x, pattern)` extracts parts of the match defined by
parentheses:
``` r
# extract the characters on either side of the vowel
str_match(x, "(.)[aeiou](.)")
#> [,1] [,2] [,3]
#> [1,] NA NA NA
#> [2,] "vid" "v" "d"
#> [3,] "ros" "r" "s"
#> [4,] NA NA NA
#> [5,] "dea" "d" "a"
#> [6,] "aut" "a" "t"
```
- `str_replace(x, pattern, replacement)` replaces the matches with new
text:
``` r
str_replace(x, "[aeiou]", "?")
#> [1] "why" "v?deo" "cr?ss" "?xtra" "d?al" "?uthority"
```
- `str_split(x, pattern)` splits up a string into multiple pieces:
``` r
str_split(c("a,b", "c,d,e"), ",")
#> [[1]]
#> [1] "a" "b"
#>
#> [[2]]
#> [1] "c" "d" "e"
```
As well as regular expressions (the default), there are three other
pattern matching engines:
- `fixed()`: match exact bytes
- `coll()`: match human letters
- `boundary()`: match boundaries
## RStudio Addin
The [RegExplain RStudio
addin](https://www.garrickadenbuie.com/project/regexplain/) provides a
friendly interface for working with regular expressions and functions
from stringr. This addin allows you to interactively build your regexp,
check the output of common string matching functions, consult the
interactive help pages, or use the included resources to learn regular
expressions.
This addin can easily be installed with devtools:
``` r
# install.packages("devtools")
devtools::install_github("gadenbuie/regexplain")
```
## Compared to base R
R provides a solid set of string operations, but because they have grown
organically over time, they can be inconsistent and a little hard to
learn. Additionally, they lag behind the string operations in other
programming languages, so that some things that are easy to do in
languages like Ruby or Python are rather hard to do in R. By contrast,
stringr:
- Uses consistent function and argument names. The first argument is
always the vector of strings to modify, which makes stringr work
particularly well in conjunction with the pipe:
``` r
letters %>%
.[1:10] %>%
str_pad(3, "right") %>%
str_c(letters[2:11])
#> [1] "a b" "b c" "c d" "d e" "e f" "f g" "g h" "h i" "i j" "j k"
```
- Simplifies string operations by eliminating options that you don’t
need 95% of the time.
- Produces outputs that can easily be used as inputs. This includes
ensuring that missing inputs result in missing outputs, and zero
length inputs result in zero length outputs.
Learn more in `vignette("from-base")`
================================================
FILE: _pkgdown.yml
================================================
url: https://stringr.tidyverse.org
development:
mode: auto
template:
package: tidytemplate
bootstrap: 5
includes:
in_header: |
<script src="https://cdn.jsdelivr.net/gh/posit-dev/supported-by-posit/js/badge.min.js" data-max-height="43" data-light-bg="#666f76" data-light-fg="#f9f9f9"></script>
<script defer data-domain="stringr.tidyverse.org,all.tidyverse.org" src="https://plausible.io/js/plausible.js"></script>
home:
links:
- text: Learn more at R4DS
href: http://r4ds.hadley.nz/strings.html
reference:
- title: Pattern matching
- subtitle: String
contents:
- str_count
- str_detect
- str_escape
- str_extract
- str_locate
- str_match
- str_replace
- str_remove
- str_split
- str_starts
- modifiers
- subtitle: Vector
desc: >
Unlike other pattern matching functions, these functions operate on the
original character vector, not the individual matches.
contents:
- str_subset
- str_which
- title: Combining strings
contents:
- str_c
- str_flatten
- str_glue
- title: Character based
contents:
- str_dup
- str_length
- str_pad
- str_sub
- str_trim
- str_trunc
- str_wrap
- title: Locale aware
contents:
- str_order
- str_equal
- case
- str_unique
- title: Other helpers
contents:
- invert_match
- str_conv
- str_like
- str_replace_na
- str_to_camel
- str_view
- word
- title: Bundled data
contents:
- "`stringr-data`"
news:
releases:
- text: "Version 1.6.0"
href: https://tidyverse.org/blog/2025/11/stringr-1-6-0/
- text: "Version 1.5.0"
href: https://www.tidyverse.org/blog/2022/12/stringr-1-5-0/
- text: "Version 1.4.0"
href: https://www.tidyverse.org/articles/2019/02/stringr-1-4-0/
- text: "Version 1.3.0"
href: https://www.tidyverse.org/articles/2018/02/stringr-1-3-0/
- text: "Version 1.2.0"
href: https://blog.rstudio.com/2017/04/12/tidyverse-updates/
- text: "Version 1.1.0"
href: https://blog.rstudio.com/2016/08/24/stringr-1-1-0/
- text: "Version 1.0.0"
href: https://blog.rstudio.com/2015/05/05/stringr-1-0-0/
================================================
FILE: air.toml
================================================
================================================
FILE: codecov.yml
================================================
comment: false
coverage:
status:
project:
default:
target: auto
threshold: 1%
informational: true
patch:
default:
target: auto
threshold: 1%
informational: true
================================================
FILE: cran-comments.md
================================================
## R CMD check results
0 errors | 0 warnings | 0 notes
## revdepcheck results
We checked 2390 reverse dependencies, comparing R CMD check results across CRAN and dev versions of this package.
* We saw 9 new problems
* We failed to check 2 packages
We've been working with maintainers for over a month to get fixes to CRAN in a timely manner. You can track our efforts at <https://github.com/tidyverse/stringr/issues/590>.
================================================
FILE: data-raw/harvard-sentences.txt
================================================
The birch canoe slid on the smooth planks.
Glue the sheet to the dark blue background.
It's easy to tell the depth of a well.
These days a chicken leg is a rare dish.
Rice is often served in round bowls.
The juice of lemons makes fine punch.
The box was thrown beside the parked truck.
The hogs were fed chopped corn and garbage.
Four hours of steady work faced us.
A large size in stockings is hard to sell.
The boy was there when the sun rose.
A rod is used to catch pink salmon.
The source of the huge river is the clear spring.
Kick the ball straight and follow through.
Help the woman get back to her feet.
A pot of tea helps to pass the evening.
Smoky fires lack flame and heat.
The soft cushion broke the man's fall.
The salt breeze came across from the sea.
The girl at the booth sold fifty bonds.
The small pup gnawed a hole in the sock.
The fish twisted and turned on the bent hook.
Press the pants and sew a button on the vest.
The swan dive was far short of perfect.
The beauty of the view stunned the young boy.
Two blue fish swam in the tank.
Her purse was full of useless trash.
The colt reared and threw the tall rider.
It snowed, rained, and hailed the same morning.
Read verse out loud for pleasure.
Hoist the load to your left shoulder.
Take the winding path to reach the lake.
Note closely the size of the gas tank.
Wipe the grease off his dirty face.
Mend the coat before you go out.
The wrist was badly strained and hung limp.
The stray cat gave birth to kittens.
The young girl gave no clear response.
The meal was cooked before the bell rang.
What joy there is in living.
A king ruled the state in the early days.
The ship was torn apart on the sharp reef.
Sickness kept him home the third week.
The wide road shimmered in the hot sun.
The lazy cow lay in the cool grass.
Lift the square stone over the fence.
The rope will bind the seven books at once.
Hop over the fence and plunge in.
The friendly gang left the drug store.
Mesh wire keeps chicks inside.
The frosty air passed through the coat.
The crooked maze failed to fool the mouse.
Adding fast leads to wrong sums.
The show was a flop from the very start.
A saw is a tool used for making boards.
The wagon moved on well oiled wheels.
March the soldiers past the next hill.
A cup of sugar makes sweet fudge.
Place a rosebush near the porch steps.
Both lost their lives in the raging storm.
We talked of the side show in the circus.
Use a pencil to write the first draft.
He ran half way to the hardware store.
The clock struck to mark the third period.
A small creek cut across the field.
Cars and busses stalled in snow drifts.
The set of china hit the floor with a crash.
This is a grand season for hikes on the road.
The dune rose from the edge of the water.
Those words were the cue for the actor to leave.
A yacht slid around the point into the bay.
The two met while playing on the sand.
The ink stain dried on the finished page.
The walled town was seized without a fight.
The lease ran out in sixteen weeks.
A tame squirrel makes a nice pet.
The horn of the car woke the sleeping cop.
The heart beat strongly and with firm strokes.
The pearl was worn in a thin silver ring.
The fruit peel was cut in thick slices.
The Navy attacked the big task force.
See the cat glaring at the scared mouse.
There are more than two factors here.
The hat brim was wide and too droopy.
The lawyer tried to lose his case.
The grass curled around the fence post.
Cut the pie into large parts.
Men strive but seldom get rich.
Always close the barn door tight.
He lay prone and hardly moved a limb.
The slush lay deep along the street.
A wisp of cloud hung in the blue air.
A pound of sugar costs more than eggs.
The fin was sharp and cut the clear water.
The play seems dull and quite stupid.
Bail the boat to stop it from sinking.
The term ended in late june that year.
A Tusk is used to make costly gifts.
Ten pins were set in order.
The bill was paid every third week.
Oak is strong and also gives shade.
Cats and Dogs each hate the other.
The pipe began to rust while new.
Open the crate but don't break the glass.
Add the sum to the product of these three.
Thieves who rob friends deserve jail.
The ripe taste of cheese improves with age.
Act on these orders with great speed.
The hog crawled under the high fence.
Move the vat over the hot fire.
The bark of the pine tree was shiny and dark.
Leaves turn brown and yellow in the fall.
The pennant waved when the wind blew.
Split the log with a quick, sharp blow.
Burn peat after the logs give out.
He ordered peach pie with ice cream.
Weave the carpet on the right hand side.
Hemp is a weed found in parts of the tropics.
A lame back kept his score low.
We find joy in the simplest things.
Type out three lists of orders.
The harder he tried the less he got done.
The boss ran the show with a watchful eye.
The cup cracked and spilled its contents.
Paste can cleanse the most dirty brass.
The slang word for raw whiskey is booze.
It caught its hind paw in a rusty trap.
The wharf could be seen at the farther shore.
Feel the heat of the weak dying flame.
The tiny girl took off her hat.
A cramp is no small danger on a swim.
He said the same phrase thirty times.
Pluck the bright rose without leaves.
Two plus seven is less than ten.
The glow deepened in the eyes of the sweet girl.
Bring your problems to the wise chief.
Write a fond note to the friend you cherish.
Clothes and lodging are free to new men.
We frown when events take a bad turn.
Port is a strong wine with a smoky taste.
The young kid jumped the rusty gate.
Guess the result from the first scores.
A salt pickle tastes fine with ham.
The just claim got the right verdict.
Those thistles bend in a high wind.
Pure bred poodles have curls.
The tree top waved in a graceful way.
The spot on the blotter was made by green ink.
Mud was spattered on the front of his white shirt.
The cigar burned a hole in the desk top.
The empty flask stood on the tin tray.
A speedy man can beat this track mark.
He broke a new shoelace that day.
The coffee stand is too high for the couch.
The urge to write short stories is rare.
The pencils have all been used.
The pirates seized the crew of the lost ship.
We tried to replace the coin but failed.
She sewed the torn coat quite neatly.
The sofa cushion is red and of light weight.
The jacket hung on the back of the wide chair.
At that high level the air is pure.
Drop the two when you add the figures.
A filing case is now hard to buy.
An abrupt start does not win the prize.
Wood is best for making toys and blocks.
The office paint was a dull, sad tan.
He knew the skill of the great young actress.
A rag will soak up spilled water.
A shower of dirt fell from the hot pipes.
Steam hissed from the broken valve.
The child almost hurt the small dog.
There was a sound of dry leaves outside.
The sky that morning was clear and bright blue.
Torn scraps littered the stone floor.
Sunday is the best part of the week.
The doctor cured him with these pills.
The new girl was fired today at noon.
They felt gay when the ship arrived in port.
Add the store's account to the last cent.
Acid burns holes in wool cloth.
Fairy tales should be fun to write.
Eight miles of woodland burned to waste.
The third act was dull and tired the players.
A young child should not suffer fright.
Add the column and put the sum here.
We admire and love a good cook.
There the flood mark is ten inches.
He carved a head from the round block of marble.
She has a smart way of wearing clothes.
The fruit of a fig tree is apple shaped.
Corn cobs can be used to kindle a fire.
Where were they when the noise started.
The paper box is full of thumb tacks.
Sell your gift to a buyer at a good gain.
The tongs lay beside the ice pail.
The petals fall with the next puff of wind.
Bring your best compass to the third class.
They could laugh although they were sad.
Farmers came in to thresh the oat crop.
The brown house was on fire to the attic.
The lure is used to catch trout and flounder.
Float the soap on top of the bath water.
A blue crane is a tall wading bird.
A fresh start will work such wonders.
The club rented the rink for the fifth night.
After the dance, they went straight home.
The hostess taught the new maid to serve.
He wrote his last novel there at the inn.
Even the worst will beat his low score.
The cement had dried when he moved it.
The loss of the second ship was hard to take.
The fly made its way along the wall.
Do that with a wooden stick.
Live wires should be kept covered.
The large house had hot water taps.
It is hard to erase blue or red ink.
Write at once or you may forget it.
The doorknob was made of bright clean brass.
The wreck occurred by the bank on Main Street.
A pencil with black lead writes best.
Coax a young calf to drink from a bucket.
Schools for ladies teach charm and grace.
The lamp shone with a steady green flame.
They took the axe and the saw to the forest.
The ancient coin was quite dull and worn.
The shaky barn fell with a loud crash.
Jazz and swing fans like fast music.
Rake the rubbish up and then burn it.
Slash the gold cloth into fine ribbons.
Try to have the court decide the case.
They are pushed back each time they attack.
He broke his ties with groups of former friends.
They floated on the raft to sun their white backs.
The map had an X that meant nothing.
Whitings are small fish caught in nets.
Some ads serve to cheat buyers.
Jerk the rope and the bell rings weakly.
A waxed floor makes us lose balance.
Madam, this is the best brand of corn.
On the islands the sea breeze is soft and mild.
The play began as soon as we sat down.
This will lead the world to more sound and fury.
Add salt before you fry the egg.
The rush for funds reached its peak Tuesday.
The birch looked stark white and lonesome.
The box is held by a bright red snapper.
To make pure ice, you freeze water.
The first worm gets snapped early.
Jump the fence and hurry up the bank.
Yell and clap as the curtain slides back.
They are men who walk the middle of the road.
Both brothers wear the same size.
In some form or other we need fun.
The prince ordered his head chopped off.
The houses are built of red clay bricks.
Ducks fly north but lack a compass.
Fruit flavors are used in fizz drinks.
These pills do less good than others.
Canned pears lack full flavor.
The dark pot hung in the front closet.
Carry the pail to the wall and spill it there.
The train brought our hero to the big town.
We are sure that one war is enough.
Gray paint stretched for miles around.
The rude laugh filled the empty room.
High seats are best for football fans.
Tea served from the brown jug is tasty.
A dash of pepper spoils beef stew.
A zestful food is the hot-cross bun.
The horse trotted around the field at a brisk pace.
Find the twin who stole the pearl necklace.
Cut the cord that binds the box tightly.
The red tape bound the smuggled food.
Look in the corner to find the tan shirt.
The cold drizzle will halt the bond drive.
Nine men were hired to dig the ruins.
The junk yard had a mouldy smell.
The flint sputtered and lit a pine torch.
Soak the cloth and drown the sharp odor.
The shelves were bare of both jam or crackers.
A joy to every child is the swan boat.
All sat frozen and watched the screen.
A cloud of dust stung his tender eyes.
To reach the end he needs much courage.
Shape the clay gently into block form.
A ridge on a smooth surface is a bump or flaw.
Hedge apples may stain your hands green.
Quench your thirst, then eat the crackers.
Tight curls get limp on rainy days.
The mute muffled the high tones of the horn.
The gold ring fits only a pierced ear.
The old pan was covered with hard fudge.
Watch the log float in the wide river.
The node on the stalk of wheat grew daily.
The heap of fallen leaves was set on fire.
Write fast if you want to finish early.
His shirt was clean but one button was gone.
The barrel of beer was a brew of malt and hops.
Tin cans are absent from store shelves.
Slide the box into that empty space.
The plant grew large and green in the window.
The beam dropped down on the workman's head.
Pink clouds floated with the breeze.
She danced like a swan, tall and graceful.
The tube was blown and the tire flat and useless.
It is late morning on the old wall clock.
Let's all join as we sing the last chorus.
The last switch cannot be turned off.
The fight will end in just six minutes.
The store walls were lined with colored frocks.
The peace league met to discuss their plans.
The rise to fame of a person takes luck.
Paper is scarce, so write with much care.
The quick fox jumped on the sleeping cat.
The nozzle of the fire hose was bright brass.
Screw the round cap on as tight as needed.
Time brings us many changes.
The purple tie was ten years old.
Men think and plan and sometimes act.
Fill the ink jar with sticky glue.
He smoke a big pipe with strong contents.
We need grain to keep our mules healthy.
Pack the records in a neat thin case.
The crunch of feet in the snow was the only sound.
The copper bowl shone in the sun's rays.
Boards will warp unless kept dry.
The plush chair leaned against the wall.
Glass will clink when struck by metal.
Bathe and relax in the cool green grass.
Nine rows of soldiers stood in a line.
The beach is dry and shallow at low tide.
The idea is to sew both edges straight.
The kitten chased the dog down the street.
Pages bound in cloth make a book.
Try to trace the fine lines of the painting.
Women form less than half of the group.
The zones merge in the central part of town.
A gem in the rough needs work to polish.
Code is used when secrets are sent.
Most of the news is easy for us to hear.
He used the lathe to make brass objects.
The vane on top of the pole revolved in the wind.
Mince pie is a dish served to children.
The clan gathered on each dull night.
Let it burn, it gives us warmth and comfort.
A castle built from sand fails to endure.
A child's wit saved the day for us.
Tack the strip of carpet to the worn floor.
Next Tuesday we must vote.
Pour the stew from the pot into the plate.
Each penny shone like new.
The man went to the woods to gather sticks.
The dirt piles were lines along the road.
The logs fell and tumbled into the clear stream.
Just hoist it up and take it away.
A ripe plum is fit for a king's palate.
Our plans right now are hazy.
Brass rings are sold by these natives.
It takes a good trap to capture a bear.
Feed the white mouse some flower seeds.
The thaw came early and freed the stream.
He took the lead and kept it the whole distance.
The key you designed will fit the lock.
Plead to the council to free the poor thief.
Better hash is made of rare beef.
This plank was made for walking on .
The lake sparkled in the red hot sun.
He crawled with care along the ledge.
Tend the sheep while the dog wanders.
It takes a lot of help to finish these.
Mark the spot with a sign painted red.
Take two shares as a fair profit.
The fur of cats goes by many names.
North winds bring colds and fevers.
He asks no person to vouch for him.
Go now and come here later.
A sash of gold silk will trim her dress.
Soap can wash most dirt away.
That move means the game is over.
He wrote down a long list of items.
A siege will crack the strong defense.
Grape juice and water mix well.
Roads are paved with sticky tar.
Fake stones shine but cost little.
The drip of the rain made a pleasant sound.
Smoke poured out of every crack.
Serve the hot rum to the tired heroes.
Much of the story makes good sense.
The sun came up to light the eastern sky.
Heave the line over the port side.
A lathe cuts and trims any wood.
It's a dense crowd in two distinct ways.
His hip struck the knee of the next player.
The stale smell of old beer lingers.
The desk was firm on the shaky floor.
It takes heat to bring out the odor.
Beef is scarcer than some lamb.
Raise the sail and steer the ship northward.
A cone costs five cents on Mondays.
A pod is what peas always grow in.
Jerk that dart from the cork target.
No cement will hold hard wood.
We now have a new base for shipping.
A list of names is carved around the base.
The sheep were led home by a dog.
Three for a dime, the young peddler cried.
The sense of smell is better than that of touch.
No hardship seemed to make him sad.
Grace makes up for lack of beauty.
Nudge gently but wake her now.
The news struck doubt into restless minds.
Once we stood beside the shore.
A chink in the wall allowed a draft to blow.
Fasten two pins on each side.
A cold dip restores health and zest.
He takes the oath of office each March.
The sand drifts over the sills of the old house.
The point of the steel pen was bent and twisted.
There is a lag between thought and act.
Seed is needed to plant the spring corn.
Draw the chart with heavy black lines.
The boy owed his pal thirty cents.
The chap slipped into the crowd and was lost.
Hats are worn to tea and not to dinner.
The ramp led up to the wide highway.
Beat the dust from the rug onto the lawn.
Say it slowly but make it ring clear.
The straw nest housed five robins.
Screen the porch with woven straw mats.
This horse will nose his way to the finish.
The dry wax protects the deep scratch.
He picked up the dice for a second roll.
These coins will be needed to pay his debt.
The nag pulled the frail cart along.
Twist the valve and release hot steam.
The vamp of the shoe had a gold buckle.
The smell of burned rags itches my nose.
New pants lack cuffs and pockets.
The marsh will freeze when cold enough.
They slice the sausage thin with a knife.
The bloom of the rose lasts a few days.
A gray mare walked before the colt.
Breakfast buns are fine with a hot drink.
Bottles hold four kinds of rum.
The man wore a feather in his felt hat.
He wheeled the bike past the winding road.
Drop the ashes on the worn old rug.
The desk and both chairs were painted tan.
Throw out the used paper cup and plate.
A clean neck means a neat collar.
The couch cover and hall drapes were blue.
The stems of the tall glasses cracked and broke.
The wall phone rang loud and often.
The clothes dried on a thin wooden rack.
Turn out the lantern which gives us light.
The cleat sank deeply into the soft turf.
The bills were mailed promptly on the tenth of the month.
To have is better than to wait and hope.
The price is fair for a good antique clock.
The music played on while they talked.
Dispense with a vest on a day like this.
The bunch of grapes was pressed into wine.
He sent the figs, but kept the ripe cherries.
The hinge on the door creaked with old age.
The screen before the fire kept in the sparks.
Fly by night and you waste little time.
Thick glasses helped him read the print.
Birth and death marks the limits of life.
The chair looked strong but had no bottom.
The kite flew wildly in the high wind.
A fur muff is stylish once more.
The tin box held priceless stones.
We need an end of all such matter.
The case was puzzling to the old and wise.
The bright lanterns were gay on the dark lawn.
We don't get much money but we have fun.
The youth drove with zest, but little skill.
Five years he lived with a shaggy dog.
A fence cuts through the corner lot.
The way to save money is not to spend much.
Shut the hatch before the waves push it in.
The odor of spring makes young hearts jump.
Crack the walnut with your sharp side teeth.
He offered proof in the form of a large chart.
Send the stuff in a thick paper bag.
A quart of milk is water for the most part.
They told wild tales to frighten him.
The three story house was built of stone.
In the rear of the ground floor was a large passage.
A man in a blue sweater sat at the desk.
Oats are a food eaten by horse and man.
Their eyelids droop for want of sleep.
A sip of tea revives his tired friend.
There are many ways to do these things.
Tuck the sheet under the edge of the mat.
A force equal to that would move the earth.
We like to see clear weather.
The work of the tailor is seen on each side.
Take a chance and win a china doll.
Shake the dust from your shoes, stranger.
She was kind to sick old people.
The square wooden crate was packed to be shipped.
The dusty bench stood by the stone wall.
We dress to suit the weather of most days.
Smile when you say nasty words.
A bowl of rice is free with chicken stew.
The water in this well is a source of good health.
Take shelter in this tent, but keep still.
That guy is the writer of a few banned books.
The little tales they tell are false.
The door was barred, locked, and bolted as well.
Ripe pears are fit for a queen's table.
A big wet stain was on the round carpet.
The kite dipped and swayed, but stayed aloft.
The pleasant hours fly by much too soon.
The room was crowded with a wild mob.
This strong arm shall shield your honor.
She blushed when he gave her a white orchid.
The beetle droned in the hot June sun.
Press the pedal with your left foot.
Neat plans fail without luck.
The black trunk fell from the landing.
The bank pressed for payment of the debt.
The theft of the pearl pin was kept secret.
Shake hands with this friendly child.
The vast space stretched into the far distance.
A rich farm is rare in this sandy waste.
His wide grin earned many friends.
Flax makes a fine brand of paper.
Hurdle the pit with the aid of a long pole.
A strong bid may scare your partner stiff.
Even a just cause needs power to win.
Peep under the tent and see the clowns.
The leaf drifts along with a slow spin.
Cheap clothes are flashy but don't last.
A thing of small note can cause despair.
Flood the mails with requests for this book.
A thick coat of black paint covered all.
The pencil was cut to be sharp at both ends.
Those last words were a strong statement.
He wrote his name boldly at the top of the sheet.
Dill pickles are sour but taste fine.
Down that road is the way to the grain farmer.
Either mud or dust are found at all times.
The best method is to fix it in place with clips.
If you mumble your speech will be lost.
At night the alarm roused him from a deep sleep.
Read just what the meter says.
Fill your pack with bright trinkets for the poor.
The small red neon lamp went out.
Clams are small, round, soft, and tasty.
The fan whirled its round blades softly.
The line where the edges join was clean.
Breathe deep and smell the piny air.
It matters not if he reads these words or those.
A brown leather bag hung from its strap.
A toad and a frog are hard to tell apart.
A white silk jacket goes with any shoes.
A break in the dam almost caused a flood.
Paint the sockets in the wall dull green.
The child crawled into the dense grass.
Bribes fail where honest men work.
Trample the spark, else the flames will spread.
The hilt of the sword was carved with fine designs.
A round hole was drilled through the thin board.
Footprints showed the path he took up the beach.
She was waiting at my front lawn.
A vent near the edge brought in fresh air.
Prod the old mule with a crooked stick.
It is a band of steel three inches wide.
The pipe ran almost the length of the ditch.
It was hidden from sight by a mass of leaves and shrubs.
The weight of the package was seen on the high scale.
Wake and rise, and step into the green outdoors.
The green light in the brown box flickered.
The brass tube circled the high wall.
The lobes of her ears were pierced to hold rings.
Hold the hammer near the end to drive the nail.
Next Sunday is the twelfth of the month.
Every word and phrase he speaks is true.
He put his last cartridge into the gun and fired.
They took their kids from the public school.
Drive the screw straight into the wood.
Keep the hatch tight and the watch constant.
Sever the twine with a quick snip of the knife.
Paper will dry out when wet.
Slide the catch back and open the desk.
Help the weak to preserve their strength.
A sullen smile gets few friends.
Stop whistling and watch the boys march.
Jerk the cord, and out tumbles the gold.
Slide the tray across the glass top.
The cloud moved in a stately way and was gone.
Light maple makes for a swell room.
Set the piece here and say nothing.
Dull stories make her laugh.
A stiff cord will do to fasten your shoe.
Get the trust fund to the bank early.
Choose between the high road and the low.
A plea for funds seems to come again.
He lent his coat to the tall gaunt stranger.
There is a strong chance it will happen once more.
The duke left the park in a silver coach.
Greet the new guests and leave quickly.
When the frost has come it is time for turkey.
Sweet words work better than fierce.
A thin stripe runs down the middle.
A six comes up more often than a ten.
Lush ferns grow on the lofty rocks.
The ram scared the school children off.
The team with the best timing looks good.
The farmer swapped his horse for a brown ox.
Sit on the perch and tell the others what to do.
A steep trail is painful for our feet.
The early phase of life moves fast.
Green moss grows on the northern side.
Tea in thin china has a sweet taste.
Pitch the straw through the door of the stable.
The latch on the back gate needed a nail.
The goose was brought straight from the old market.
The sink is the thing in which we pile dishes.
A whiff of it will cure the most stubborn cold.
The facts don't always show who is right.
She flaps her cape as she parades the street.
The loss of the cruiser was a blow to the fleet.
Loop the braid to the left and then over.
Plead with the lawyer to drop the lost cause.
Calves thrive on tender spring grass.
Post no bills on this office wall.
Tear a thin sheet from the yellow pad.
A cruise in warm waters in a sleek yacht is fun.
A streak of color ran down the left edge.
It was done before the boy could see it.
Crouch before you jump or miss the mark.
Pack the kits and don't forget the salt.
The square peg will settle in the round hole.
Fine soap saves tender skin.
Poached eggs and tea must suffice.
Bad nerves are jangled by a door slam.
Ship maps are different from those for planes.
Dimes showered down from all sides.
They sang the same tunes at each party.
The sky in the west is tinged with orange red.
The pods of peas ferment in bare fields.
The horse balked and threw the tall rider.
The hitch between the horse and cart broke.
Pile the coal high in the shed corner.
A gold vase is both rare and costly.
The knife was hung inside its bright sheath.
The rarest spice comes from the far East.
The roof should be tilted at a sharp slant.
A smatter of French is worse than none.
The mule trod the treadmill day and night.
The aim of the contest is to raise a great fund.
To send it now in large amounts is bad.
There is a fine hard tang in salty air.
Cod is the main business of the north shore.
The slab was hewn from heavy blocks of slate.
Dunk the stale biscuits into strong drink.
Hang tinsel from both branches.
Cap the jar with a tight brass cover.
The poor boy missed the boat again.
Be sure to set that lamp firmly in the hole.
Pick a card and slip it under the pack.
A round mat will cover the dull spot.
The first part of the plan needs changing.
A good book informs of what we ought to know.
The mail comes in three batches per day.
You cannot brew tea in a cold pot.
Dots of light betrayed the black cat.
Put the chart on the mantel and tack it down.
The night shift men rate extra pay.
The red paper brightened the dim stage.
See the player scoot to third base.
Slide the bill between the two leaves.
Many hands help get the job done.
We don't like to admit our small faults.
No doubt about the way the wind blows.
Dig deep in the earth for pirate's gold.
The steady drip is worse than a drenching rain.
A flat pack takes less luggage space.
Green ice frosted the punch bowl.
A stuffed chair slipped from the moving van.
The stitch will serve but needs to be shortened.
A thin book fits in the side pocket.
The gloss on top made it unfit to read.
The hail pattered on the burnt brown grass.
Seven seals were stamped on great sheets.
Our troops are set to strike heavy blows.
The store was jammed before the sale could start.
It was a bad error on the part of the new judge.
One step more and the board will collapse.
Take the match and strike it against your shoe.
The pot boiled but the contents failed to jell.
The baby puts his right foot in his mouth.
The bombs left most of the town in ruins.
Stop and stare at the hard working man.
The streets are narrow and full of sharp turns.
The pup jerked the leash as he saw a feline shape.
Open your book to the first page.
Fish evade the net and swim off.
Dip the pail once and let it settle.
Will you please answer that phone.
The big red apple fell to the ground.
The curtain rose and the show was on.
The young prince became heir to the throne.
He sent the boy on a short errand.
Leave now and you will arrive on time.
The corner store was robbed last night.
A gold ring will please most any girl.
The long journey home took a year.
She saw a cat in the neighbor's house.
A pink shell was found on the sandy beach.
Small children came to see him.
The grass and bushes were wet with dew.
The blind man counted his old coins.
A severe storm tore down the barn.
She called his name many times.
When you hear the bell, come quickly.
================================================
FILE: data-raw/samples.R
================================================
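# rvest supplies read_html(), html_elements(), html_text(), and %>%
library(rvest)

# Word and fruit lists come from the rcorpora package
words <- rcorpora::corpora("words/common")$commonWords
fruit <- rcorpora::corpora("foods/fruits")$fruits

# Scrape the Harvard sentences and cache them as plain ASCII text
html <- read_html("https://harvardsentences.com")
html %>%
  html_elements("li") %>%
  html_text() %>%
  iconv(to = "ASCII//translit") %>%
  writeLines("data-raw/harvard-sentences.txt")

sentences <- readr::read_lines("data-raw/harvard-sentences.txt")

# Save the three datasets into data/
usethis::use_data(words, overwrite = TRUE)
usethis::use_data(fruit, overwrite = TRUE)
usethis::use_data(sentences, overwrite = TRUE)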
================================================
FILE: inst/htmlwidgets/lib/str_view.css
================================================
.str_view ul {
  font-size: 16px;
}
.str_view ul, .str_view li {
  list-style: none;
  padding: 0;
  margin: 0.5em 0;
}
.str_view .match {
  border: 1px solid #ccc;
  background-color: #eee;
  border-color: #ccc;
  border-radius: 3px;
}
.str_view .special {
  background-color: red;
}
================================================
FILE: inst/htmlwidgets/str_view.js
================================================
HTMLWidgets.widget({
  name: 'str_view',
  type: 'output',
  initialize: function(el, width, height) {
  },
  renderValue: function(el, x, instance) {
    el.innerHTML = x.html;
  },
  resize: function(el, width, height, instance) {
  }
});
================================================
FILE: inst/htmlwidgets/str_view.yaml
================================================
dependencies:
  - name: str_view
    version: 0.1.0
    src: htmlwidgets/lib/
    stylesheet: str_view.css
================================================
FILE: man/case.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/case.R
\name{case}
\alias{case}
\alias{str_to_upper}
\alias{str_to_lower}
\alias{str_to_title}
\alias{str_to_sentence}
\title{Convert string to upper case, lower case, title case, or sentence case}
\usage{
str_to_upper(string, locale = "en")
str_to_lower(string, locale = "en")
str_to_title(string, locale = "en")
str_to_sentence(string, locale = "en")
}
\arguments{
\item{string}{Input vector. Either a character vector, or something
coercible to one.}
\item{locale}{Locale to use for comparisons. See
\code{\link[stringi:stri_locale_list]{stringi::stri_locale_list()}} for all possible options.
Defaults to "en" (English) to ensure that default behaviour is
consistent across platforms.}
}
\value{
A character vector the same length as \code{string}.
}
\description{
\itemize{
\item \code{str_to_upper()} converts to upper case.
\item \code{str_to_lower()} converts to lower case.
\item \code{str_to_title()} converts to title case, where only the first letter of
each word is capitalized.
\item \code{str_to_sentence()} converts to sentence case, where only the first letter
of the sentence is capitalized.
}
}
\examples{
dog <- "The quick brown dog"
str_to_upper(dog)
str_to_lower(dog)
str_to_title(dog)
str_to_sentence("the quick brown dog")
# Locale matters!
str_to_upper("i") # English
str_to_upper("i", "tr") # Turkish
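# A further sketch, not part of the original examples: the case functions
# are vectorised, so each element is converted on its own. `desserts` is a
# made-up input used only for illustration.
desserts <- c("apple pie", "BANANA split")
str_to_title(desserts)
str_to_sentence(desserts)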
}
================================================
FILE: man/invert_match.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/locate.R
\name{invert_match}
\alias{invert_match}
\title{Switch location of matches to location of non-matches}
\usage{
invert_match(loc)
}
\arguments{
\item{loc}{matrix of match locations, as from \code{\link[=str_locate_all]{str_locate_all()}}}
}
\value{
numeric matrix giving locations of non-matches
}
\description{
Invert a matrix of match locations to match the opposite of what was
previously matched.
}
\examples{
numbers <- "1 and 2 and 4 and 456"
num_loc <- str_locate_all(numbers, "[0-9]+")[[1]]
str_sub(numbers, num_loc[, "start"], num_loc[, "end"])
text_loc <- invert_match(num_loc)
str_sub(numbers, text_loc[, "start"], text_loc[, "end"])
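# A small sketch, not from the original examples: inverting the matches also
# yields the (possibly empty) stretches before the first match and after the
# last one. `code` and `gaps` are illustrative names.
code <- "ab12cd"
letter_loc <- str_locate_all(code, "[a-z]+")[[1]]
gaps <- invert_match(letter_loc)
str_sub(code, gaps[, "start"], gaps[, "end"])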
}
================================================
FILE: man/modifiers.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/modifiers.R
\name{modifiers}
\alias{modifiers}
\alias{fixed}
\alias{coll}
\alias{regex}
\alias{boundary}
\title{Control matching behaviour with modifier functions}
\usage{
fixed(pattern, ignore_case = FALSE)
coll(pattern, ignore_case = FALSE, locale = "en", ...)
regex(
pattern,
ignore_case = FALSE,
multiline = FALSE,
comments = FALSE,
dotall = FALSE,
...
)
boundary(
type = c("character", "line_break", "sentence", "word"),
skip_word_none = NA,
...
)
}
\arguments{
\item{pattern}{Pattern to modify behaviour.}
\item{ignore_case}{Should case differences be ignored in the match?
For \code{fixed()}, this uses a simple algorithm which assumes a
one-to-one mapping between upper and lower case letters.}
\item{locale}{Locale to use for comparisons. See
\code{\link[stringi:stri_locale_list]{stringi::stri_locale_list()}} for all possible options.
Defaults to "en" (English) to ensure that default behaviour is
consistent across platforms.}
\item{...}{Other less frequently used arguments passed on to
\code{\link[stringi:stri_opts_collator]{stringi::stri_opts_collator()}},
\code{\link[stringi:stri_opts_regex]{stringi::stri_opts_regex()}}, or
\code{\link[stringi:stri_opts_brkiter]{stringi::stri_opts_brkiter()}}}
\item{multiline}{If \code{TRUE}, \code{^} and \code{$} match
the beginning and end of each line. If \code{FALSE}, the
default, they only match the start and end of the input.}
\item{comments}{If \code{TRUE}, white space and comments beginning with
\verb{#} are ignored. Escape literal spaces with \verb{\\\\ }.}
\item{dotall}{If \code{TRUE}, \code{.} will also match line terminators.}
\item{type}{Boundary type to detect.
\describe{
\item{\code{character}}{Every character is a boundary.}
\item{\code{line_break}}{Boundaries are places where it is acceptable to have
a line break in the current locale.}
\item{\code{sentence}}{The beginnings and ends of sentences are boundaries,
using intelligent rules to avoid counting abbreviations
(\href{https://www.unicode.org/reports/tr29/#Sentence_Boundaries}{details}).}
\item{\code{word}}{The beginnings and ends of words are boundaries.}
}}
\item{skip_word_none}{Ignore "words" that don't contain any letters
or numbers, i.e. punctuation. Default \code{NA} will skip such "words"
only when splitting on \code{word} boundaries.}
}
\value{
A stringr modifier object, i.e. a character vector with
parent S3 class \code{stringr_pattern}.
}
\description{
Modifier functions control the meaning of the \code{pattern} argument to
stringr functions:
\itemize{
\item \code{boundary()}: Match boundaries between things.
\item \code{coll()}: Compare strings using standard Unicode collation rules.
\item \code{fixed()}: Compare literal bytes.
\item \code{regex()} (the default): Uses ICU regular expressions.
}
}
\examples{
pattern <- "a.b"
strings <- c("abb", "a.b")
str_detect(strings, pattern)
str_detect(strings, fixed(pattern))
str_detect(strings, coll(pattern))
# coll() is useful for locale-aware case-insensitive matching
i <- c("I", "\u0130", "i")
i
str_detect(i, fixed("i", TRUE))
str_detect(i, coll("i", TRUE))
str_detect(i, coll("i", TRUE, locale = "tr"))
# Word boundaries
words <- c("These are some words.")
str_count(words, boundary("word"))
str_split(words, " ")[[1]]
str_split(words, boundary("word"))[[1]]
# Regular expression variations
str_extract_all("The Cat in the Hat", "[a-z]+")
str_extract_all("The Cat in the Hat", regex("[a-z]+", TRUE))
str_extract_all("a\nb\nc", "^.")
str_extract_all("a\nb\nc", regex("^.", multiline = TRUE))
str_extract_all("a\nb\nc", "a.")
str_extract_all("a\nb\nc", regex("a.", dotall = TRUE))
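# Annotated patterns: a sketch, not from the original examples. With
# comments = TRUE, white space and # comments inside the pattern are ignored,
# so a longer pattern can be laid out over several lines. `phone` is an
# illustrative name and the input string is made up.
phone <- regex("
  [0-9]{3}  # first three digits
  -         # separator
  [0-9]{4}  # last four digits
", comments = TRUE)
str_extract("call 555-1234 now", phone)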
}
================================================
FILE: man/pipe.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/utils.R
\name{\%>\%}
\alias{\%>\%}
\title{Pipe operator}
\usage{
lhs \%>\% rhs
}
\description{
Pipe operator
}
\keyword{internal}
================================================
FILE: man/str_c.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/c.R
\name{str_c}
\alias{str_c}
\title{Join multiple strings into one string}
\usage{
str_c(..., sep = "", collapse = NULL)
}
\arguments{
\item{...}{One or more character vectors.
\code{NULL}s are removed; scalar inputs (vectors of length 1) are recycled to
the common length of vector inputs.
Like most other R functions, missing values are "infectious": whenever
a missing value is combined with another string the result will always
be missing. Use \code{\link[dplyr:coalesce]{dplyr::coalesce()}} or \code{\link[=str_replace_na]{str_replace_na()}} to convert to
the desired value.}
\item{sep}{String to insert between input vectors.}
\item{collapse}{Optional string used to combine output into a single
string. Generally better to use \code{\link[=str_flatten]{str_flatten()}} if you need this
behaviour.}
}
\value{
If \code{collapse = NULL} (the default) a character vector with
length equal to the longest input. If \code{collapse} is a string, a character
vector of length 1.
}
\description{
\code{str_c()} combines multiple character vectors into a single character
vector. It's very similar to \code{\link[=paste0]{paste0()}} but uses tidyverse recycling and
\code{NA} rules.
One way to understand how \code{str_c()} works is to picture a 2d matrix of strings,
where each argument forms a column. \code{sep} is inserted between each column,
and then each row is combined together into a single string. If \code{collapse}
is set, it's inserted between each row, and then the result is again
combined, this time into a single string.
}
\examples{
str_c("Letter: ", letters)
str_c("Letter", letters, sep = ": ")
str_c(letters, " is for", "...")
str_c(letters[-26], " comes before ", letters[-1])
str_c(letters, collapse = "")
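
# A minimal sketch of the "2d matrix" picture from the description, using
# illustrative inputs (x and y are assumptions, not part of the package):
# each argument forms a column, sep joins the columns within a row, and
# collapse (if set) then joins the rows into one string.
x <- c("a", "b", "c")
y <- c("1", "2", "3")
str_c(x, y, sep = "-")
#> "a-1" "b-2" "c-3"
str_c(x, y, sep = "-", collapse = "; ")
#> "a-1; b-2; c-3"

# Missing values are infectious unless they are replaced first
str_c(c("a", NA), "-d")
#> "a-d" NA
str_c(str_replace_na(c("a", NA)), "-d")
#> "a-d" "NA-d"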