[
  {
    "path": ".Rbuildignore",
    "content": "^.*\\.Rproj$\n^\\.Rproj\\.user$\n^LICENSE\\.md$\n^data-raw$\n^README\\.Rmd$\n^.*\\.pdf$\n^.github$\n^_pkgdown\\.yml$\n^docs$\n^pkgdown$\n^vignettes/articles$\n^\\.github$\n^vignettes/nflfastR-models\\.Rmd$\n^vignettes$\n^\\.travis\\.yml$\n^man/figures/card\\.png$\n^man/figures/header_github\\.png$\n^man/figures/header_twitter\\.png$\n^man/figures/nflfastR_logo_fillsize\\.png$\n^cran-comments\\.md$\n^CRAN-RELEASE$\n^man/figures/readme-cp-model-1\\.png$\n^man/figures/readme-epa-model-1\\.png$\n^revdep$\n^CRAN-SUBMISSION$\n^[.]?air[.]toml$\n^\\.vscode$\n^\\.git-blame-ignore-revs$\n"
  },
  {
    "path": ".git-blame-ignore-revs",
    "content": "# This file lists revisions of large-scale formatting/style changes so that\n# they can be excluded from git blame results.\n#\n# To set this file as the default ignore file for git blame, run:\n#   $ git config blame.ignoreRevsFile .git-blame-ignore-revs\n\n# Format whole project with air format . (#47)\n66de9ebe6d53415a770de224c1f0f442ef22358c\n"
  },
  {
    "path": ".github/.gitignore",
    "content": "*.html\n"
  },
  {
    "path": ".github/workflows/R-CMD-check.yaml",
    "content": "# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples\n# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help\non:\n  push:\n    branches: [main, master]\n  pull_request:\n  workflow_dispatch:\n\nname: R-CMD-check.yaml\n\npermissions: read-all\n\njobs:\n  R-CMD-check:\n    runs-on: ${{ matrix.config.os }}\n\n    name: ${{ matrix.config.os }} (${{ matrix.config.r }})\n\n    strategy:\n      fail-fast: false\n      matrix:\n        config:\n          - {os: macos-latest,   r: 'release'}\n          - {os: windows-latest, r: 'release'}\n          - {os: ubuntu-latest,   r: 'devel', http-user-agent: 'release'}\n          - {os: ubuntu-latest,   r: 'release'}\n          - {os: ubuntu-latest,   r: 'oldrel-1'}\n\n    env:\n      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}\n      R_KEEP_PKG_SOURCE: yes\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - uses: r-lib/actions/setup-pandoc@v2\n\n      - uses: r-lib/actions/setup-r@v2\n        with:\n          r-version: ${{ matrix.config.r }}\n          http-user-agent: ${{ matrix.config.http-user-agent }}\n          use-public-rspm: true\n\n      - uses: r-lib/actions/setup-r-dependencies@v2\n        with:\n          extra-packages: |\n            any::rcmdcheck\n            nflverse/fastrmodels\n            nflverse/nflreadr\n            nflverse/nflseedR\n          needs: check\n\n      - uses: r-lib/actions/check-r-package@v2\n        with:\n          upload-snapshots: true\n          build_args: 'c(\"--no-manual\",\"--compact-vignettes=gs+qpdf\")'\n"
  },
  {
    "path": ".github/workflows/format-suggest.yaml",
    "content": "# Workflow derived from https://github.com/posit-dev/setup-air/tree/main/examples\n\non:\n  # Using `pull_request_target` over `pull_request` for elevated `GITHUB_TOKEN`\n  # privileges, otherwise we can't set `pull-requests: write` when the pull\n  # request comes from a fork, which is our main use case (external contributors).\n  #\n  # `pull_request_target` runs in the context of the target branch (`main`, usually),\n  # rather than in the context of the pull request like `pull_request` does. Due\n  # to this, we must explicitly checkout `ref: ${{ github.event.pull_request.head.sha }}`.\n  # This is typically frowned upon by GitHub, as it exposes you to potentially running\n  # untrusted code in a context where you have elevated privileges, but they explicitly\n  # call out the use case of reformatting and committing back / commenting on the PR\n  # as a situation that should be safe (because we aren't actually running the untrusted\n  # code, we are just treating it as passive data).\n  # https://securitylab.github.com/resources/github-actions-preventing-pwn-requests/\n  pull_request_target:\n\nname: format-suggest.yaml\n\njobs:\n  format-suggest:\n    name: format-suggest\n    runs-on: ubuntu-latest\n\n    permissions:\n      # Required to push suggestion comments to the PR\n      pull-requests: write\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          ref: ${{ github.event.pull_request.head.sha }}\n\n      - name: Install\n        uses: posit-dev/setup-air@v1\n\n      - name: Format\n        run: air format .\n\n      - name: Suggest\n        uses: reviewdog/action-suggester@v1\n        with:\n          level: error\n          fail_level: error\n          tool_name: air\n"
  },
  {
    "path": ".github/workflows/pkgdown.yaml",
    "content": "# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples\n# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help\non:\n  push:\n    branches: [main, master]\n  pull_request:\n    branches: [main, master]\n  release:\n    types: [published]\n  workflow_dispatch:\n\nname: pkgdown\n\njobs:\n  pkgdown:\n    runs-on: ubuntu-latest\n    # Only restrict concurrency for non-PR jobs\n    concurrency:\n      group: pkgdown-${{ github.event_name != 'pull_request' || github.run_id }}\n    env:\n      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}\n      NETLIFY_AUTH_TOKEN: ${{ secrets.NETLIFY_AUTH_TOKEN }}\n      NETLIFY_SITE_ID: ${{ secrets.NETLIFY_SITE_ID }}\n      isPush: ${{ github.event_name == 'push' || github.event_name == 'workflow_dispatch' }}\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - uses: r-lib/actions/setup-pandoc@v2\n\n      - uses: r-lib/actions/setup-r@v2\n        with:\n          use-public-rspm: true\n\n\n      - uses: r-lib/actions/setup-r-dependencies@v2\n        with:\n          extra-packages: |\n            r-lib/pkgdown\n            nflverse/fastrmodels\n            nflverse/nflplotR\n            nflverse/nflreadr\n            any::tidyverse\n            any::ggrepel\n            any::knitr\n            any::tictoc\n            any::ragg\n            any::DT\n            local::.\n          needs: website\n\n      - name: Build site\n        run: pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE)\n        shell: Rscript {0}\n\n      - name: Deploy to GitHub pages 🚀\n        if: github.event_name != 'pull_request'\n        uses: JamesIves/github-pages-deploy-action@v4.5.0\n        with:\n          clean: false\n          branch: gh-pages\n          folder: docs\n\n      - name: Deploy to Netlify\n        if: contains(env.isPush, 'false')\n        id: netlify-deploy\n        uses: nwtgck/actions-netlify@v1.1\n        with:\n          publish-dir: 
'./docs'\n          production-branch: master\n          github-token: ${{ secrets.GITHUB_TOKEN }}\n          overwrites-pull-request-comment: true\n          deploy-message:\n            'Deploy from GHA: ${{ github.event.pull_request.title || github.event.head_commit.message }} (${{ github.sha }})'\n        timeout-minutes: 1\n"
  },
  {
    "path": ".github/workflows/revdepcheck.yaml",
    "content": "# Workflow derived from https://github.com/r-devel/recheck?tab=readme-ov-file#how-to-use-with-github-actions\non:\n  workflow_dispatch:\n    inputs:\n      which:\n        type: choice\n        description: Which dependents to check\n        options:\n        - strong\n        - most\n\nname: Reverse dependency check\n\njobs:\n  revdep_check:\n    name: Reverse check ${{ inputs.which }} dependents\n    uses: r-devel/recheck/.github/workflows/recheck.yml@v1\n    with:\n      which: ${{ inputs.which }}\n"
  },
  {
    "path": ".github/workflows/rhub.yaml",
    "content": "# R-hub's generic GitHub Actions workflow file. It's canonical location is at\n# https://github.com/r-hub/actions/blob/v1/workflows/rhub.yaml\n# You can update this file to a newer version using the rhub2 package:\n#\n# rhub::rhub_setup()\n#\n# It is unlikely that you need to modify this file manually.\n\nname: R-hub\nrun-name: \"${{ github.event.inputs.id }}: ${{ github.event.inputs.name || format('Manually run by {0}', github.triggering_actor) }}\"\n\non:\n  workflow_dispatch:\n    inputs:\n      config:\n        description: 'A comma separated list of R-hub platforms to use.'\n        type: string\n        default: 'linux,windows,macos'\n      name:\n        description: 'Run name. You can leave this empty now.'\n        type: string\n      id:\n        description: 'Unique ID. You can leave this empty now.'\n        type: string\n\njobs:\n\n  setup:\n    runs-on: ubuntu-latest\n    outputs:\n      containers: ${{ steps.rhub-setup.outputs.containers }}\n      platforms: ${{ steps.rhub-setup.outputs.platforms }}\n\n    steps:\n    # NO NEED TO CHECKOUT HERE\n    - uses: r-hub/actions/setup@v1\n      with:\n        config: ${{ github.event.inputs.config }}\n      id: rhub-setup\n\n  linux-containers:\n    needs: setup\n    if: ${{ needs.setup.outputs.containers != '[]' }}\n    runs-on: ubuntu-latest\n    name: ${{ matrix.config.label }}\n    strategy:\n      fail-fast: false\n      matrix:\n        config: ${{ fromJson(needs.setup.outputs.containers) }}\n    container:\n      image: ${{ matrix.config.container }}\n\n    steps:\n      - uses: r-hub/actions/checkout@v1\n      - uses: r-hub/actions/platform-info@v1\n        with:\n          token: ${{ secrets.RHUB_TOKEN }}\n          job-config: ${{ matrix.config.job-config }}\n      - uses: r-hub/actions/setup-deps@v1\n        with:\n          token: ${{ secrets.RHUB_TOKEN }}\n          job-config: ${{ matrix.config.job-config }}\n      - uses: r-hub/actions/run-check@v1\n        with:\n          
token: ${{ secrets.RHUB_TOKEN }}\n          job-config: ${{ matrix.config.job-config }}\n\n  other-platforms:\n    needs: setup\n    if: ${{ needs.setup.outputs.platforms != '[]' }}\n    runs-on: ${{ matrix.config.os }}\n    name: ${{ matrix.config.label }}\n    strategy:\n      fail-fast: false\n      matrix:\n        config: ${{ fromJson(needs.setup.outputs.platforms) }}\n\n    steps:\n      - uses: r-hub/actions/checkout@v1\n      - uses: r-hub/actions/setup-r@v1\n        with:\n          job-config: ${{ matrix.config.job-config }}\n          token: ${{ secrets.RHUB_TOKEN }}\n      - uses: r-hub/actions/platform-info@v1\n        with:\n          token: ${{ secrets.RHUB_TOKEN }}\n          job-config: ${{ matrix.config.job-config }}\n      - uses: r-hub/actions/setup-deps@v1\n        with:\n          job-config: ${{ matrix.config.job-config }}\n          token: ${{ secrets.RHUB_TOKEN }}\n      - uses: r-hub/actions/run-check@v1\n        with:\n          job-config: ${{ matrix.config.job-config }}\n          token: ${{ secrets.RHUB_TOKEN }}\n"
  },
  {
    "path": ".gitignore",
    "content": "# History files\n.Rhistory\n.Rapp.history\n# Session Data files\n.RData\n# User-specific files\n.Ruserdata\n# Example code in package build process\n*-Ex.R\n# Output files from R CMD build\n/*.tar.gz\n# Output files from R CMD check\n/*.Rcheck/\n# RStudio files\n.Rproj.user/\n# produced vignettes\nvignettes/*.html\nvignettes/*.pdf\n# OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3\n.httr-oauth\n# knitr and R markdown default cache directories\n*_cache/\n/cache/\n# Temporary files created by R markdown\n*.utf8.md\n*.knit.md\n# R Environment Variables\n.Renviron\n.DS_Store\ndocs\ninst/doc\nrevdep\n"
  },
  {
    "path": ".vscode/extensions.json",
    "content": "{\n    \"recommendations\": [\n        \"Posit.air-vscode\"\n    ]\n}\n"
  },
  {
    "path": ".vscode/settings.json",
    "content": "{\n    \"[r]\": {\n        \"editor.formatOnSave\": true,\n        \"editor.defaultFormatter\": \"Posit.air-vscode\"\n    },\n    \"[quarto]\": {\n        \"editor.formatOnSave\": true,\n        \"editor.defaultFormatter\": \"quarto.quarto\"\n    }\n}\n"
  },
  {
    "path": "DESCRIPTION",
    "content": "Type: Package\nPackage: nflfastR\nTitle: Functions to Efficiently Access NFL Play by Play Data\nVersion: 5.2.0.9012\nAuthors@R: c(\n    person(\"Sebastian\", \"Carl\", , \"mrcaseb@gmail.com\", role = \"aut\"),\n    person(\"Ben\", \"Baldwin\", , \"bbaldwin206@gmail.com\", role = c(\"cre\", \"aut\")),\n    person(\"Lee\", \"Sharpe\", role = \"ctb\"),\n    person(\"Maksim\", \"Horowitz\", , \"maksim.horowitz@gmail.com\", role = \"ctb\"),\n    person(\"Ron\", \"Yurko\", , \"ryurko@stat.cmu.edu\", role = \"ctb\"),\n    person(\"Samuel\", \"Ventura\", , \"samventura22@gmail.com\", role = \"ctb\"),\n    person(\"Tan\", \"Ho\", role = \"ctb\"),\n    person(\"John\", \"Edwards\", , \"edwards1860@gmail.com\", role = \"ctb\")\n  )\nDescription: A set of functions to access National Football League\n    play-by-play data from <https://www.nfl.com/>.\nLicense: MIT + file LICENSE\nURL: https://nflfastr.com/, https://github.com/nflverse/nflfastR\nBugReports: https://github.com/nflverse/nflfastR/issues\nDepends: \n    R (>= 4.1.0)\nImports: \n    cli (>= 3.0.0),\n    curl,\n    data.table (>= 1.15.0),\n    dplyr (>= 1.0.0),\n    fastrmodels (>= 2.1.0),\n    furrr,\n    future,\n    glue,\n    janitor,\n    lifecycle,\n    mgcv,\n    nflreadr (>= 1.2.0),\n    progressr (>= 0.6.0),\n    rlang (>= 0.4.7),\n    stringr (>= 1.4.0),\n    tibble (>= 3.0),\n    tidyr (>= 1.0.0),\n    xgboost (>= 1.1)\nSuggests: \n    DBI,\n    duckdb,\n    gsisdecoder,\n    nflseedR (>= 2.0.0),\n    purrr (>= 0.3.0),\n    rmarkdown,\n    RSQLite,\n    testthat (>= 3.0.0)\nConfig/testthat/edition: 3\nEncoding: UTF-8\nLazyData: true\nRoxygen: list(markdown = TRUE)\nRoxygenNote: 7.3.3\n"
  },
  {
    "path": "LICENSE",
    "content": "YEAR: 2020\nCOPYRIGHT HOLDER: Sebastian Carl; Ben Baldwin\n"
  },
  {
    "path": "LICENSE.md",
    "content": "# MIT License\n\nCopyright (c) 2020 Sebastian Carl; Ben Baldwin\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "NAMESPACE",
    "content": "# Generated by roxygen2: do not edit by hand\n\nexport(add_qb_epa)\nexport(add_xpass)\nexport(add_xyac)\nexport(build_nflfastR_pbp)\nexport(calculate_expected_points)\nexport(calculate_player_stats)\nexport(calculate_player_stats_def)\nexport(calculate_player_stats_kicking)\nexport(calculate_series_conversion_rates)\nexport(calculate_standings)\nexport(calculate_stats)\nexport(calculate_win_probability)\nexport(clean_pbp)\nexport(decode_player_ids)\nexport(fast_scraper)\nexport(fast_scraper_roster)\nexport(fast_scraper_schedules)\nexport(load_pbp)\nexport(load_player_stats)\nexport(load_rosters)\nexport(load_schedules)\nexport(load_team_stats)\nexport(missing_raw_pbp)\nexport(most_recent_season)\nexport(nflverse_sitrep)\nexport(report)\nexport(save_raw_pbp)\nexport(update_db)\nexport(update_pbp_db)\nimport(dplyr)\nimport(fastrmodels)\nimportFrom(data.table,\"%between%\")\nimportFrom(data.table,\"%chin%\")\nimportFrom(nflreadr,load_pbp)\nimportFrom(nflreadr,load_player_stats)\nimportFrom(nflreadr,load_rosters)\nimportFrom(nflreadr,load_schedules)\nimportFrom(nflreadr,load_team_stats)\nimportFrom(nflreadr,most_recent_season)\nimportFrom(nflreadr,nflverse_sitrep)\nimportFrom(rlang,\"%||%\")\nimportFrom(rlang,\":=\")\nimportFrom(rlang,.data)\nimportFrom(rlang,.env)\nimportFrom(xgboost,getinfo)\n"
  },
  {
    "path": "NEWS.md",
    "content": "# nflfastR (development version)\n\n- Added new function `update_pbp_db()`, a fresh approach to the database helper. (#544)\n- Added `\"game_id\"` to the output `calculate_stats()` if `summary_level == \"week\"`. (#566)\n- Fixed a bug where `fixed_drive` did not increment after a muffed blocked field goal attempt. Yes this happened in `\"2025_10_NO_CAR\"`, play id 2504. (#567)\n- nflfastR stopped supporting the 1999 and 2000 seasons because of inconsistent data sources. Data is still available through `load_pbp()` but we will not fix any issues related to those old seasons anymore. It's possible to install nflfastR v5.2.0 (with `pak::pak(\"nflverse/nflfastR@v5.2.0\")`) to parse those seasons if necessary. (#568)\n- Implemented a fresh approach to compute `play_type` based on `play_type_nfl` for faster and more consistent output. (#568)\n- Fixed a bug where nflfastR overwrote the kickoff_attempt variable in the event of a penalty on a kickoff. (#569)\n- Added various definitions of 'explosive' plays to the output of `calculate_stats()`. It counts passes, runs, and receptions with 10+, 20+, 40+ yards gained as well as 12+ yard runs and 16+ yard passes. (#573)\n- Added several punting stats to the output of `calculate_stats()`. (#574)\n- Added overall fumble counters to the output of `calculate_stats()` because it was missing some edge case fumbles on offense. (#575)\n- The `play_type` variable now possibly shows `\"pass\"` or `\"run\"` on 2 point conversion plays with a post-snap penalty enforced between downs. This is different from `play_type_nfl` (which will show `\"PENALTY\"` in these cases). (#579)\n- Fixed bug where `calculate_stats()` counted fumble recoveries in `fumble_recovery_yards_own` and `fumble_recovery_yards_opp` instead of the corresponding yards. (#584)\n- Fixed bug where `calculate_stats()` counted some blocked punts as punt attempts that officially do not count as punt attempts. 
(#584)\n- Fixed bug where `calculate_stats()` overcounted first downs in some edge cases. (#587)\n- nflfastR now loads raw play-by-play data from season-based releases in the `nflverse/nflverse-pbp` GitHub repository. The legacy repository `nflverse/nflfastR-raw` is deprecated and won't update in future seasons. This means that previous nflfastR versions won't be able to download 2026+ seasons! (#589)\n\n# nflfastR 5.2.0\n\n- Bump required fastrmodels version to 2.0 for better compatibility with xgboost.\n- Fixed an issue with duplicated play IDs in some 2000 games. (#521)\n- Added the argument `pbp` to `calculate_stats()` to allow stats calculation based on subsets of nflverse play-by-play data. (#524)\n- Fixed a bug where `calculate_stats()` didn't count 60 yard field goal attempts in `\"fg_made_60_\"` and `\"fg_missed_60_\"`. (#531)\n- Fixed a bug where `clean_pbp()` did not provide a passer on plays where scrambles were manually adjusted based on data from Aaron Schatz. (#536)\n- nflfastR now directly reexports nflreadr's `load_pbp()`, `load_player_stats()`, and `load_team_stats()`. This means that the functions can be called normally via nflfastR, but are no longer available in the documentation (whether in the R Help or on the pkgdown website). Instead, only links to nflreadr are included. This ensures that the documentation is always up to date. (#538)\n- `fast_scraper_roster()` and `fast_scraper_schedules()` are officially deprecated and will be removed in a future update. Please use `load_rosters()` and `load_schedules()`. (#539)\n- `report()` is deprecated and will be removed in a future update. Please use `nflverse_sitrep()`. (#540)\n- Fixed incompatibility with xgboost v3 model outputs. (#553)\n- Added `\"Kickoff Out of Bounds\"` (introduced in the 2024 season) to the `penalty_type` variable in play-by-play. 
(#560)\n\nThank you to &#x0040;Doug-Analytics, &#x0040;isaactpetersen, &#x0040;jeleff1000, &#x0040;JoeMarino2021, &#x0040;kbannon77, &#x0040;lancejames35, &#x0040;LinkedInMindset, &#x0040;manbradcalf, &#x0040;mrcaseb, &#x0040;thedfszone, &#x0040;TheMathNinja, and &#x0040;zaynpatel for their questions, feedback, and contributions towards this release.\n\n# nflfastR 5.1.0\n\n- The function `calculate_standings()` has been deprecated. Please use `nflseedR::nfl_standings()` in nflseedR v2.0 instead. (#510)\n- nflfastR now requires R 4.1 to allow the package to use R's native pipe `|>` operator. This follows the [Tidyverse R version support rules](https://tidyverse.org/blog/2019/04/r-version-support/). (#511)\n- Fixed a bug where `calculate_stats()` incorrectly counted `receiving_air_yards`. (#500)\n- Fixed a bug where `vegas_wp` variables were broken when `spread_line` data was missing. (#503)\n- Fixed a bug where `calculate_stats()` incorrectly calculated `target_share` and `air_yards_share` when `summary_level = \"season\"`. (#505)\n- Fixed a bug where `calculate_stats()` incorrectly counted `fumbles`. (#514)\n- Compatibility improvements with xgboost. (#517)\n\nThank you to &#x0040;ak47twq, &#x0040;isaactpetersen, &#x0040;jacobakaye, &#x0040;johnpholden, &#x0040;marvin3FF, &#x0040;mrcaseb, and &#x0040;tanho63 for their questions, feedback, and contributions towards this release.\n\n# nflfastR 5.0.0\n\n## Major Changes\n\n- Added new function `calculate_stats()` that combines the output of all `calculate_player_stats*()` functions with a more robust and faster approach. The `calculate_player_stats*()` function will be deprecated in a future release. (#470)\n- Added new exported dataframe `nfl_stats_variables`. It lists and explains all variables returned by `calculate_stats()`. A searchable table is available at <https://nflfastr.com/articles/stats_variables.html>. 
(#470)\n\n## Bug Fixes and Minor Changes\n\n- Drop `{crayon}`, `{DT}`, `{httr}`, `{jsonlite}`, `{qs}` dependencies. (#453)\n- The function `calculate_player_stats_def()` now returns `season_type` if argument `weekly` is set to `TRUE` for consistency with the other player stats functions. (#455)\n- The function `missing_raw_pbp()` now allows filtering by season. (#457)\n- More robust handling of player IDs in `decode_player_ids()`. (#458)\n- Fixed rare cases where the value of the `yrdln` variable didn't equal `\"MID 50\"` at midfield. (#459)\n- Fixed rare cases where `drive_start_yard_line` missed the blank space between team name and yard line number. (#459)\n- Fixed play description in some 1999 and 2000 games where the string \"D.Holland\" replaced the kick distance. (#459)\n- Fixed a problem where the `goal_to_go` variable was `FALSE` in actual goal to go situations. (#460)\n- Fixed a bug in `fixed_drive` and `fixed_drive_result` where the second weather delay in `2023_13_ARI_PIT` wasn't identified correctly. (#461)\n- `punter_player_id` and `punter_player_name` are now filled for blocked punt attempts. (#463)\n- Fixed an issue affecting scores of 2022 games involving a return touchdown (#466)\n- Added identification of scrambles from 1999 through 2004 with thanks to Aaron Schatz (#468, #489)\n- Updated the dataframe `stat_ids` with some IDs that were previously missing. (#470)\n- nflfastR tried to fix bugs in the underlying pbp data of JAX home games prior to the 2016 season. An update of the raw pbp data resolved those bugs so nflfastR needs to remove the hard coded adjustments. This means that nflfastR <= v4.6.1 will return incorrect pbp data for all Jacksonville home games prior to the 2016 season! (#478)\n- Fixed a problem where `clean_pbp()` returned `pass = 1` in actual rush plays in very rare cases. 
(#479)\n- Removed extra lines for injury timeouts that were breaking `fixed_drive` (#482)\n- The variable `penalty_type` now correctly lists the penalty \"Kickoff Short of Landing Zone\" introduced in the 2024 season. (#486)\n- Fixed a bug where `ep` was incorrect on PAT attempts preceded by a timeout and then a penalty (extremely rare). This bug also caused the variables `total_home_epa` and `total_away_epa` to be incorrect for all subsequent plays in the same game. (#493)\n\nThank you to &#x0040;ahmed-cheema, &#x0040;andrewtek, &#x0040;guga31bb, &#x0040;isaactpetersen, &#x0040;JoeMarino2021, &#x0040;john-b-edwards, &#x0040;marcusSasser, &#x0040;mlounsberry, &#x0040;morganandrew, &#x0040;mrcaseb, &#x0040;mscoop16, &#x0040;parsnipz, &#x0040;rjthompson2, and &#x0040;Useight for their questions, feedback, and contributions towards this release.\n\n# nflfastR 4.6.1\n\n- The function `calculate_series_conversion_rates()` now correctly aggregates season level conversion rates. Performance has also been improved. (#440)\n- Adjusted test behavior at CRAN's request. \n\nThank you to\n&#x0040;andrewtek, &#x0040;gregalvi86, &#x0040;Ic4ru5Wing, &#x0040;JoeMarino2021, &#x0040;jreddy1990, &#x0040;marvin3FF, &#x0040;mrcaseb, &#x0040;RicShern, &#x0040;SPNE, and &#x0040;trivialfis for their questions, feedback, and contributions towards this release.\n\n# nflfastR 4.6.0\n\n## New Features\n\n- nflfastR now fully supports loading raw pbp data from local file system. The best way to use this feature is to set `options(\"nflfastR.raw_directory\" = {\"your/local/directory\"})`. Alternatively, both `build_nflfastR_pbp()` and `fast_scraper()` support the argument `dir` which defaults to the above option. (#423)\n- Added the new function `save_raw_pbp()` which efficiently downloads raw play-by-play data and saves it to the local file system. This serves as a helper to setup the system for faster play-by-play parsing via the above functionality. 
(#423)\n- Added the new function `missing_raw_pbp()` that computes a vector of game IDs missing in the local raw play-by-play directory. (#423)\n\n## Minor Improvements and Bugfixes\n\n- The internal function `get_pbp_nfl()` now uses `ifelse()` instead of `dplyr::if_else()` to handle some null-checking, which fixes a bug found in the `2022_21_CIN_KC` game.\n- The function `calculate_player_stats()` now summarises target share and air yards share correctly when called with argument `weekly = FALSE` (#413)\n- The function `calculate_player_stats()` now returns the opponent team when called with argument `weekly = TRUE` (#414)\n- The function `calculate_player_stats_def()` no longer errors when small subsets of pbp data are missing stats. (#415)\n- The function `calculate_series_conversion_rates()` no longer returns `NA` values if a small subset of pbp data is missing series on offense or defense. (#417)\n- `fixed_drive` now correctly increments on plays where posteam lost a fumble but remains posteam because defteam also lost a fumble during the same play. (#419)\n- nflfastR now fixes missing drive number counts in raw pbp data in order to provide accurate drive information. (#420)\n- nflfastR now returns correct `kick_distance` on all punts and kickoffs. (#422)\n- Decode player IDs in 2023 pbp. (#425)\n- Drop the pseudo plays TV Timeout and Two-Minute Warning. (#426)\n- Fix posteam on kickoffs and PATs following a defensive TD in 2023+ pbp. (#427)\n- `calculate_player_stats()` no longer counts lost fumbles on plays where a player fumbles, a teammate recovers and then loses a fumble to the defense. (#431)\n- The variables `passer`, `receiver`, and `rusher` no longer return `NA` on \"abnormal\" plays - like direct snaps, aborted snaps, laterals etc. - that resulted in a penalty. 
(#435) \n\nThank you to\n&#x0040;903124, &#x0040;ak47twq, &#x0040;andrewtek, &#x0040;darkhark, &#x0040;dennisbrookner, &#x0040;marvin3FF, &#x0040;mistakia, &#x0040;mrcaseb, &#x0040;nicholasmendoza22, &#x0040;rickstarblazer, &#x0040;RileyJohnson22, and &#x0040;tanho63 for their questions, feedback, and contributions towards this release.\n\n# nflfastR 4.5.1\n\n* New implementation of tests to be able to identify breaking changes in reverse dependencies (#396, #406)\n* `calculate_standings()` no longer freezes when computing standings from schedules where some games are missing results, i.e. upcoming games.\n* Fixed a bug that caused problems with upcoming dplyr and tidyselect updates that weren't backward compatible.\n* Significant performance improvements of internal functions. (#402)\n* Wrap examples in `try()` to avoid CRAN problems. (#404)\n* Fixed a bug where `calculate_standings()` wasn't able to handle nflverse pbp data. (#404)\n\n# nflfastR 4.5.0\n\n## New (experimental) functions\n* Added new function `calculate_player_stats_def()` that aggregates defensive player stats either at game level or overall. (#288)\n* Added the situation report `nflverse_sitrep()`, which is an alias of the already available `report()`.\n* Added new function `calculate_player_stats_kicking()` that aggregates player stats for field goals and extra points at game level or overall. (#381)\n* Added new function `calculate_series_conversion_rates()` that computes series conversion and series result rates at a game level or season level. 
(#393)\n\n## Bugfixes and Minor Improvements\n\n* Internal change to `calculate_player_stats()` that reflects new nflverse data infrastructure.\n* `calculate_player_stats()` now unifies player names and joins the following player information via `nflreadr::load_players()`:\n  - `player_display_name` - Full name of the player\n  - `position` - Position of the player\n  - `position_group` - Position group of the player\n  - `headshot_url` - URL to a player headshot image\n* Make data work in 2022 (hopefully)\n* Fix Amon-Ra St. Brown breaking the name parser\n* Add gsis_id patch to `clean_pbp()`.\n* `calculate_player_stats_def()` failed in situations where play-by-play data is missing certain stats. (#382)\n* Spot-fixing `calculate_player_stats()` for `NA` names.\n\n# nflfastR 4.4.0\n\n## New Functions, Options, Data\n\n* Added new function `calculate_standings()` that computes regular season division standings and playoff seeds from nflverse data.\n* The database function `update_db()` now supports the option \"nflfastR.dbdirectory\" which can be used to set the directory of the nflfastR pbp database globally and independent of any project structure or working directories.\n* The embedded data frame `?teams_colors_logos` has been updated to reflect the most recent team color themes and gained additional variables for conference and division as well as logo urls to the conference and league logos. (#290)\n* The embedded data frame `?teams_colors_logos` has been updated with the Washington Commanders. (#312)\n\n## Deprecation\n\n* The argument `qs` in the functions `load_pbp()` and `load_player_stats()` has been deprecated as of nflfastR 4.3.0. This release removes the argument entirely. \n\n## Bugfixes and Minor Improvements\n\n* Fixed bug where a player could be duplicated in `calculate_player_stats()` in very rare cases caused by plays with laterals. (#289)\n* Fixed a bug where the function `add_xpass()` failed when called with an empty data frame. 
(#296)\n* Fixed a bug where `play_type` showed `no_play` on plays with penalties that don't result in a replay of the down. (#277, #281)\n* Fixed a bug in the variable descriptions of `total_home_score` and `total_away_score`. (#300)\n* `fast_scraper_roster()` and `fast_scraper_schedules()` now call `nflreadr::load_rosters()` and `nflreadr::load_schedules()` under the hood (#304)\n* Fixed a bug causing missing EPA on game-ending turnovers in overtime\n* Bump minimum nflreadr version to 1.2.0 for data repository change\n* Fix a bug affecting yardline for a very small number of plays in the 2000 season (#323)\n* `update_db()` now uses a default play to predefine column types for all db drivers. (#324)\n* Fix a bug that resulted in incorrect `xyac_mean_yardage` on 4th downs (#327)\n* Fix a bug that resulted in missing `xyac` information for plays involving J.O'Shaughnessy (#329)\n* Fix a bug that resulted in missing `epa` on the last play of some games involving NE and BUF (#331)\n* `fast_scraper()` and `build_nflfastR_pbp()` now return data frames of class `nflverse_data` to be consistent with `nflreadr`.\n* Fix behavior of EP model in neutral site games (treats both teams as away teams)\n\n# nflfastR 4.3.0\n\n## Minor Changes\n\n* Add [nflreadr](https://nflreadr.nflverse.com/) to dependencies and drop the lubridate and magrittr dependencies\n* The functions `load_pbp()` and `load_player_stats()` now call `nflreadr::load_pbp()` and `nflreadr::load_player_stats()` respectively. Therefore the argument `qs` has been deprecated in both functions. It will be removed in a future release. 
Running `load_player_stats()` without any argument will now return player stats of the current season only (the default in `nflreadr`).\n* The deprecated arguments `source` and `pp` in the functions `fast_scraper_*()` and `build_nflfastR_pbp()` have been removed\n* Added the variables `racr` (\"Receiver Air Conversion Ratio\"), `target_share`, `air_yards_share`, `wopr` (\"Weighted Opportunity Rating\") and `pacr` (\"Passing Air Conversion Ratio\") to the output of `calculate_player_stats()`\n* Added the function `report()` which will be used by the maintainers to help users debug their problems (#274).\n\n## Bug Fixes\n\n* Fixed a minor bug in the console output of `update_db()`\n* Fix for a handful of missing `receiver` names (#270)\n* Fixed bug with missing `return_team` on interception return touchdowns (#275)\n* Fixed a rare bug where an internal object wasn't predefined (#272)\n\n# nflfastR 4.2.0\n\n* All `wpa` variables are `NA` on end game line\n* All `wp` variables are 0, 0.5, 1, or `NA` on end game line\n* Fix bug where win prob on PATs assumed a PAT placed at 15 yard line, even in older seasons\n* The function `decode_player_ids()` now really decodes the new variable `fantasy_id` (#229)\n* Fixed a bug that caused slightly differing `wp` values depending on the first game in the data set (#183)\n* Edited GitHub references to point to nflverse\n* Added the variables `sack_yards`, `sack_fumbles`, `rushing_fumbles` and `receiving_fumbles` to the output of the function `calculate_player_stats()`, thanks to Mike Filicicchia (@TheMathNinja). 
(#239)\n* Fixed a bug where `calculate_player_stats()` falsely counted lost fumbles on aborted snaps (#238)\n* Added the variable `season_type` to the output of `calculate_player_stats()` and `load_player_stats()` in preparation of the extended Regular Season starting in 2021 (#240)\n* Updated `season_type` definitions in preparation of the extended Regular Season starting in 2021 (#242)\n* Fix for `fixed_drive` where it wasn't incrementing when there was a muffed punt followed by timeout (#244)\n* Fix for `fixed_drive` where it wasn't incrementing following an interception with the intercepting player then losing a fumble (#247)\n* Fix for more issues with missing play info in 2018_01_ATL_PHI (#246)\n* Added the variables `safety_player_name` and `safety_player_id` to the play-by-play data (#252)\n* Dropped the dependency `usethis`\n\n# nflfastR 4.1.0\n\n## Breaking changes\n\n### Functions\n\n* Added the function `calculate_player_stats()` that aggregates official passing, rushing, and receiving stats either at game level or overall\n* Added the function `load_player_stats()` that loads weekly player stats from 1999 to the most recent season\n* The performance of the functions `add_xyac()` and `clean_pbp()` has been significantly improved\n\n### New Variables\n\n* Added the new columns `td_player_name` and `td_player_id` to clearly identify the player who scored a touchdown (this is especially helpful for plays with multiple fumbles or laterals resulting in a touchdown)\n* The function `calculate_player_stats()` now adds the variable `dakota`, the `epa` + `cpoe` composite, for players with minimum 5 pass attempts.\n* Added column `home_opening_kickoff` to `clean_pbp()`\n* Added the variables `sack_player_id`, `sack_player_name`, `half_sack_1_player_id`, `half_sack_1_player_name`, `half_sack_2_player_id` and `half_sack_2_player_name` who identify players that recorded sacks (or half sacks). 
Also updated the description of the variables `qb_hit_1_player_id`, `qb_hit_1_player_name`, `qb_hit_2_player_id` and `qb_hit_2_player_name` to make it clearer that they did not record a sack. (#180)\n\n## Minor improvements and fixes\n\n* The variable `qb_scramble` was incomplete for the 2005 season because of missing scramble indicators in the play description. This has been mostly fixed courtesy of charting data from Football Outsiders (with thanks to Aaron Schatz!). Some notes on this fix: Weeks 1-16 are based on charting. Weeks 17-21 are guesses (basically every QB run except those that were a) a loss, b) no gain, or c) on 3/4 down with 1-2 to go). Plays nullified by penalty are not included.\n* Change `name`, `id`, `rusher`, and `rusher_id` to be the player charged with the fumble on aborted snaps when the QB is unable to make a play (i.e. pass, sack, or scramble) (#162)\n* The function `clean_pbp()` now standardizes the team name columns `tackle_with_assist_*_team`\n* Fix bug in `drive` that was causing incorrect overtime win probabilities (#194)\n* Fixed a bug where `posteam` was not `NA` on end of quarter 2 (or end of quarter 4 in overtime games) causing wrong values for `fixed_drive`, `fixed_drive_result`, `series` and `series_result`\n* Fixed a bug where `fixed_drive` and `series` were falsely incrementing on kickoffs recovered by the kicking team or on defensive touchdowns followed by timeouts\n* Fixed a bug where `fixed_drive` and `series` were falsely incrementing on muffed punts recovered by the punting team for a touchdown\n* Fixed a bug where `add_xpass()` crashed when run with data already including xpass variables. 
\n* Fixed a bug in `epa` when a safety is scored by the team beginning the play in possession of the ball (#186)\n* Fix some bugs related to David and Duke Johnson on the Texans in 2020 (#163)\n* Fix yet another bug related to correctly identifying possession team on kickoffs nullified by penalty (#199)\n* Fixed a bug where `calculate_player_stats()` forgot to clean player names by using their IDs\n* Fixed a bug where special teams touchdowns were missing in the output of `calculate_player_stats()` (#203)\n* Fix for some old Jaguars games where the wrong team was awarded points for safeties and kickoff return TDs (#209)\n* The function `update_db()` no longer falsely closes a database connection provided by the argument `db_connection` (#210)\n* Fixed a bug where `yards_gained` was missing yardage on plays with laterals. (#216)\n* Fixed a bug where stats were wrongly given on a play with a penalty (#218)\n* `fixed_drive` now increments properly on onside kick recoveries (#215)\n* `fixed_drive` no longer counts a muffed kickoff as a one-play drive on its own (#217)\n* `fixed_drive` now properly increments after a safety (#219)\n* Improved parser for `penalty_type` and updated the description of the variable to make it clearer that it's the first penalty that happened on a play. (#223)\n\n# nflfastR 4.0.0\n\n## Breaking changes\n\n### Changed Functions\n\n* Deprecated the arguments `source` and `pp` all across the package. Using them will cause a \nwarning. Parallel processing has to be activated by choosing an appropriate `future::plan()` before\ncalling the relevant functions. For more information please see [the package documentation](https://nflfastr.com/reference/nflfastR-package.html).\n* The function `build_nflfastR_pbp()` will now run `decode_player_ids()` by default (can be deactivated with the argument `decode = FALSE`). 
\n* The function `build_nflfastR_pbp()` will now run `add_xpass()` by default and add the new variables `xpass` and `pass_oe`.\n* The functions `fast_scraper()` and `build_nflfastR_pbp()` now allow the output of `fast_scraper_schedules()` directly as input so it's not necessary anymore to pull the `game_id` first.\n\n### New Functions and Variables\n\n* Added the new function `load_pbp()` that loads complete seasons into memory for fast access of the play-by-play data.\n* Added the new variables `rushing_yards`, `lateral_rushing_yards`, `passing_yards`, `receiving_yards`, `lateral_receiving_yards` to fix an old bug where `yards_gained` gets overwritten on plays with laterals (#115).\n* Added columns `vegas_wpa` and `vegas_home_wpa` which contain Win Probability Added from the spread-adjusted WP model\n* Added column `out_of_bounds`\n* Added columns `fantasy`, `fantasy_id`, `fantasy_player_name`, and `fantasy_player_id` that indicate the rusher or receiver on the play\n* Added columns `tackle_with_assist`, `tackle_with_assist_1_player_id`, `tackle_with_assist_1_player_name`, `tackle_with_assist_1_team`, `tackle_with_assist_2_player_id`, `tackle_with_assist_2_player_name`, `tackle_with_assist_2_team`\n\n### Models and Miscellaneous\n\n* Tuned spread-adjusted win probability model one final (?) time. 
Expected points is now no longer \nrequired for `calculate_win_probability()`\n* Added field descriptions `vignette(\"field_descriptions\")` with a searchable list of all nflfastR variables\n* Switched data source for 2001-2010 to what is used for 2011 and on\n* All models have been moved to the [fastrmodels](https://cran.r-project.org/package=fastrmodels) package\n* Added the data frames `?field_descriptions` and `?stat_ids` to the package\n\n## Minor improvements and fixes\n\n* Fix bug where `fixed_drive` and `series` weren't updating after muffed punt (#144)\n* Fix bug induced by fixing the above (#149)\n* Fix bug where some special teams plays were incorrectly being labeled as pass plays (#125)\n* Fix bug where points for safeties were given to the `defteam` instead of the `posteam` (#152)\n* Fix bug where a muffed punt TD was given to the wrong team in a 2011 Jaguars game (#154)\n* Win probability is now calculated prior to PAT attempts rather than using WP on the ensuing kickoff\n* Improved performance of internal functions that speed up the rebuilding process in `update_db()`\n(added `qs` and `curl` to dependencies)\n* Fixed a bug where `calculate_expected_points()` and `calculate_win_probability()` duplicated some existing variables instead of replacing them (#170)\n* Fixed a bug where `penalty_type` wasn't `\"no_play\"` although it should have been (#172)\n* Fixed a bug where `penalty_team` could be incorrect in games of the Jaguars in the seasons 2011 - 2015 (#174)\n* Fixed a bug related to the calculation of `epa` on plays before a failed pass interference challenge in a few 2019 games (#175)\n* Fixed a bug related to lots of fields with `NA` on offsetting penalties (#44)\n* Fixed a bug in `epa` when possession team changes at end of 1st or 3rd quarter (#182)\n* Fixed a bug where various functions have left open connections\n* `vegas_wp` is now `NA` on final line since there is no possession team\n\n\n# nflfastR 3.2.0\n\n## Models\n\n* Performance update 
for win probability model with point spread (`vegas_wp`)\n* Added `yardline_100` as an input to both win probability models (not having it included was an oversight)\n\n## Minor improvements and fixes\n\n* Fixed a bug where `series` was increased on PATs\n* Fixed a bug affecting the week 10 Raiders-Broncos game\n* Added the column `team_wordmark` - which contains URLs to the team's wordmarks - to the included data frame `?teams_colors_logos`\n\n# nflfastR 3.1.1\n\n## New features\n\n### Database Function `update_db()`\n\n* The argument `force_rebuild` of the function `update_db()` is now of hybrid \ntype. It can rebuild the play by play data table either for the whole nflfastR \nera (with `force_rebuild = TRUE`) or just for specified seasons \n(e.g. `force_rebuild = 2019:2020`).\nThe latter is intended to be used for running seasons because the NFL fixes bugs\nin the play by play data during the week and we recommend to rebuild the current \nseason every Thursday.\n* Fixed a bug where `update_db()` disconnected the connection to a database provided \nby the argument `db_connection` (#102)\n* Fixed a bug where `update_db()` didn't build a fresh database without providing\nthe argument `force_rebuild`\n* `update_db()` no longer removes the complete data table when a numeric argument \n`force_rebuild` is passed but only removes the rows within the table (#109)\n\n### New Functions\n\n* Added the new function `build_nflfastR_pbp()`, a convenient wrapper around \nmultiple nflfastR functions for an easy creation of the nflfastR play-by-play data set\n* Added a function that applies our experimental expected pass model, `add_xpass()`,\nthat creates columns `xpass` and `pass_oe`\n\n## Minor improvements and fixes\n\n* More fixes for `fixed_drive` which was not incrementing properly on drives\nthat began following a timeout\n* Fixed more bugs in EPA and win probability on PATs and kickoffs with penalties\n* Fixed a bug where scoring probabilities weren't adding to 1 on 
field goal \nattempts near the end of a half\n* Messages to the user are now created with the new dependency `usethis`\n* Fixed bug where plays with \"backward pass\" in play description were counted as \npass plays (`pass` = 1)\n* Fixed missing kick distance on touchbacks and blocked punts (#53)\n* Added the option `fast` (either `TRUE` or `FALSE`) to the function \n`decode_player_ids()` to activate the highly efficient C++ decoder of the package \n[`gsisdecoder`](https://cran.r-project.org/package=gsisdecoder)\n\n# nflfastR 3.0.0\n\n## Breaking changes\n\n* `fast_scraper_roster()` is finally back! It loads the NFL roster of a given season.\n* Added the function `decode_player_ids()` to decode all player IDs to the \ncommonly known GSIS ID format (00-00xxxxx)\n\n## New features\n\n* Add option `source = \"old\"` to `fast_scraper()` to enable scraping of the old source.\nThis is mostly useless as it doesn't work for 2020 and provides less info\n* Added new option `db_connection` to `update_db()` to allow advanced users to\nuse other DBI drivers, such as `RMariaDB::MariaDB()`, `RPostgres::Postgres()` or \n`odbc::odbc()` (please see [dbplyr](https://dbplyr.tidyverse.org/articles/dbplyr.html)\nfor more information)\n\n## Minor improvements and fixes\n\n* `clean_pbp()` now fixes some bugs in jersey numbers\n* `clean_pbp()`, `add_qb_epa()` and `add_xyac()` can now handle empty data frames\n* Fix empty line causing `fast_scraper()` to fail (affects multiple games of the 2020 season)\n* Fix bug in `fixed_drive` that counted PAT after defensive TD as its own drive\n* Fixed a bug which caused too high a number of tackles in special cases\n* Fixed a bug where CPOE was NA when targeting players with an apostrophe in their last name\n\n# nflfastR 2.2.1\n\n* Fix `add_xyac()` breaking with some old packages\n* Fix `add_xyac()` and `add_qb_epa()` calculations being wrong for some failed 4th downs\n* Updated Readme with ep and cp model plots\n* Updated `vignette(\"examples\")` with the new 
`add_xyac()` function\n* Added xYAC model to `vignette(\"nflfastR-models\")`\n* Added variables `fixed_drive` and `fixed_drive_result` to the output of \n`fast_scraper()` because the NFL-provided drive info is extremely buggy\n* Added variable `series_result`\n* `clean_pbp()` now adds 4 new variables `passer_jersey_number`, \n`rusher_jersey_number`, `receiver_jersey_number` and `jersey_number`. These can \nbe used to join rosters. \n* Fixed incorrect `timeout_team`, `return_team`, `fumble_recovery_1_team` for JAX\ngames from 2011-2015\n* Re-trained EPA model with `fixed_drive` and corrections to `timeout_team`\n\n# nflfastR 2.2.0\n\n* New function `add_xyac()` which adds the following columns associated with expected yards after\nthe catch (xYAC): `xyac_epa`, `xyac_success`, `xyac_fd`, `xyac_mean_yardage`, `xyac_median_yardage`\n\n# nflfastR 2.1.3\n\n* Fixed a bug in `series_success` caused by bad `drive` information provided by NFL\n\n# nflfastR 2.1.2\n\n* Added the following columns that are available 2011 and later: `special_teams_play`, `st_play_type`, `time_of_day`, and `order_sequence`\n* Added `old_game_id` column (useful for merging to external data that still uses this ID: format is YYYYMMDDxx)\n* The `clean_pbp()` function now adds an `aborted_play` column\n* Fixed a bug where pass plays with a penalty at end of play were classified as `play_type` = `no_play` rather than `pass`\n* Fixed bug where EPA on defensive 2 point return was -0.95 instead of -2.95\n* Fixed some remaining failed challenge plays that incorrectly had 0 for EPA\n* Updated the included dataframe `teams_colors_logos` for the interim name of \nthe 'Washington Football Team' and the corresponding logo urls.\n* Some internal code improvements causing the required `tidyselect` version\nto be >= 1.1.0\n\n# nflfastR 2.1.1\n\n### Functions\n\n* `clean_pbp()` now standardizes player IDs across the old (1999-2010) and new \n(2011+) data sources. 
Player IDs once again uniquely identify players, and each \nunique player has one unique ID (as they did before the NFL data source change):\n    * For players whose careers finished before 2011, their IDs remain the same\n    * For players who played in both eras or only in the new era, their ID is \n    the new ID\n    * For example, Akili Smith (ID: 00-0015082) and Alex Smith \n    (ID: 32013030-2d30-3032-3334-3336b638d37d) are both abbreviated as \"A.Smith\" \n    but can be distinguished by their IDs, with Akili showing what the old \n    format ID looks like, and Smith the new one\n    * Standardization is realized by using an ID map\n    available in the data repo\n    \n* `clean_pbp()` now removes all variables it is about to create to make sure \nnothing unexpected can happen\n\n### Miscellaneous\n\n* Added minimum version requirements to some package dependencies because \ninstallation broke for some users with outdated packages\n\n* Made a minor bug fix to catch more out-of-order plays and fixed a bug where some\nplays were being incorrectly dropped in older seasons\n\n* Standardized team names (e.g. 
`SD` --> `LAC`) in some columns we had missed\n\n# nflfastR 2.1.0\n\n### Models\n\n* Removed `week` from Expected Points models along with an update of\n`vignette(\"nflfastR-models\")` and `vignette(\"examples\")`\n\n### Functions\n\n* Added function `update_db()` which adds all completed games to a SQLite database\n* Added function `calculate_win_probability()` \n* Added new examples to `vignette(\"examples\")` demonstrating the usage of the\nabove mentioned functions\n\n### Bugs\n\n* Fixed a problem with inconsistent data types of the variable\n`drive_real_start_time` pre and post 2011\n* Fixed a problem where some `game_id`s were overwritten during the play by play parsing\n* Fix some more WP bugs on kickoffs with penalties and rare play description\n\n### Miscellaneous\n\n* `fast_scraper()` now loads the raw game data from a separate raw data repo\n* Completely overhauled the entire code base to directly implement\n[tidy evaluation](https://dplyr.tidyverse.org/articles/programming.html) using \n`.data` from the [rlang](https://rlang.r-lib.org/) package (this is a major \ncode change that takes some getting used to but we need it in preparation of \na future release)\n\n# nflfastR 2.0.6\n\n* Fixed a problem where defensive two point conversions were not counted\n* Kneels on kickoffs are no longer counted as qb kneels\n* Variable `yards_gained` more precisely defined\n* Bugfixes for more games with out of order of plays\n* Fix bug related to EPA on plays with a failed pass interference challenge\n* Added new example to `vignette(\"examples\")` to demonstrate Expected Points \ncalculator `calculate_expected_points()`\n* Fix for WP on 2-pt conversion negated by penalty\n* Add more variables (containing team names) to team standardization in `clean_pbp()`\n* Fix WP for onside kicks\n\n# nflfastR 2.0.5\n\n* Fix yet another bug caused by NFL providing plays out of order\n* Fix bugs related to penalties on PATs and kickoffs\n* Fix bugs related to NFL providing wrong 
scoring team on defensive touchdowns \nin older games involving the Jaguars\n* Fix some minor issues related to wrong `first_down_rush` and `return_touchdown`\n* Improved error handling of `fast_scraper()` for not yet played games\n* Improved variable documentation and prepared for new website\n* Improved performance for dplyr v1.0.0\n* Rebuilt EP and WP models due to bugfixes in the underlying data in the versions\n2.0.3, 2.0.4 and 2.0.5\n\n# nflfastR 2.0.4\n\n* Fix another bug with out of order plays\n* Fix bug affecting cumulative totals for WPA, air_WPA and yac_WPA \n* Fix bug affecting cumulative totals for air_EPA and yac_EPA\n\n# nflfastR 2.0.3\n\n* Fix for NFL providing plays out of order\n* Fix for series not incrementing following defensive TD\n\n# nflfastR 2.0.2\n\n* Fixed a bug in the series and series success calculations caused by timeouts\nfollowing a possession change\n* Fixed win probability on PATs\n\n# nflfastR 2.0.1\n\n* Added minimum version requirement on `xgboost` (>= 1.1) as the recent `xgboost` update \ncaused a breaking change leading to failure in adding model results to data\n\n# nflfastR 2.0.0\n\n### Models\n* Added new models for Expected Points, Win Probability and Completion Probability \nand removed the `nflscrapR` dependency. This is a **major** change as we are stepping away \nfrom the well-established `nflscrapR` models. But we believe it is a good step forward.\nSee `data-raw/MODEL-README.md` for detailed model information.\n\n* Added internal functions for `EPA` and `WPA` to `helper_add_ep_wp.R`.\n\n* Added new function `calculate_expected_points()` usable by the end user.\n\n### Functions\n* Completely overhauled `fast_scraper()` to make it work with the NFL's new server \nbackend. The option `source` is still available but will be deprecated since there\nis only one source now. 
There are some changes in the output as well (please see below).\n\n* `fast_scraper()` now adds game data to the play by play data set courtesy of Lee Sharpe. \nGame data include:\naway_score, home_score, location, result, total, spread_line, total_line, div_game, \nroof, surface, temp, wind, home_coach, away_coach, stadium, stadium_id, gameday\n\n* `fast_scraper_schedules()` now incorporates Lee Sharpe's `games.rds`.\n\n* The functions `fast_scraper_clips()` and `fast_scraper_roster()` are deactivated \ndue to the missing data source. They might be reactivated or completely dropped \nin future versions.\n\n* The function `fix_fumbles()` has been renamed to `add_qb_epa()` as the new name\nmuch better describes what the function is actually doing.\n\n### Miscellaneous\n\n* Added progress information using the `progressr` package and removed the \n`furrr` progress bars.\n\n* `clean_pbp()` now adds the column `id` which is the id of the player in the column `name`. \nBecause we have to piece together different data to cover the full span of years,\n**player IDs are not consistent between the early (1999-2010) and recent (2011 onward)\nperiods**.\n\n* Added a `NEWS.md` file to track changes to the package.\n\n* Fixed several bugs inherited from `nflscrapR`, including one where EPA was missing \nwhen a play was followed by two timeouts (for example, a two-minute warning followed by a timeout),\nand another where `play_type` was incorrect on plays with declined penalties.\n\n* Fixed a bug where `receiver_player_name` and `receiver` didn't name the correct\nplayers on plays with lateral passes.\n\n### Play-by-Play Output\nThe output has changed a little bit. 
\n\n#### The following variables were dropped\n\n| Dropped Variables          | Description                                                                                                                                                                       |\n|----------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| game_key                   | RS feed game identifier.                                                                                                                                                          |\n| game_time_local            | Kickoff time in local time zone.                                                                                                                                                  |\n| iso_time                   | Kickoff time according ISO 8601.                                                                                                                                                  |\n| game_type                  | One of 'REG', 'WC', 'DIV', 'CON', 'SB' indicating if a game was a regular season game or one of the playoff rounds.                                                               |\n| site_id                    | RS feed id for game site.                                                                                                                                                         |\n| site_city                  | Game site city.                                                                                                                                                                   |\n| site_state                 | Game site state.                                                                                                                                                                  
|\n| drive_possession_team_abbr | Abbreviation of the possession team in a given drive.                                                                                                                             |\n| scoring_team_abbr          | Abbreviation of the scoring team if the play was a scoring play.                                                                                                                  |\n| scoring_type               | String indicating the scoring type. One of 'FG', 'TD', 'PAT', 'SFTY', 'PAT2'.                                                                                                     |\n| alert_play_type            | String describing the play type of a play the NFL has listed as alert play. For most of those plays there are highlight clips available through fast_scraper_clips. |\n| time_of_day                | Local time at the beginning of the play.                                                                                                                                          |\n| yards                      | Analogue yards_gained but with the kicking team being the possession team (which means that there are many yards gained through kickoffs and punts).                              |\n| end_yardline_number        | Yardline number within the above given side at the end of the given play.                                                                                                         |\n| end_yardline_side          | String indicating the side of the field at the end of the given play.                                                                                                             
|\n\n#### The following variables were renamed\n\n| Renamed Variables                             | Description                                                                                                                                               |\n|-----------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------|\n| game_time_eastern -> start_time               | Kickoff time in eastern time zone.                                                                                                                        |\n| site_fullname -> stadium                      | Game site name.                                                                                                                                           |\n| drive_how_started -> drive_start_transition   | String indicating how the offense got the ball.                                                                                                           |\n| drive_how_ended -> drive_end_transition       | String indicating how the offense lost the ball.                                                                                                          |\n| drive_start_time -> drive_game_clock_start    | Game time at the beginning of a given drive.                                                                                                              |\n| drive_end_time -> drive_game_clock_end        | Game time at the end of a given drive.                                                                                                                    |\n| drive_start_yardline -> drive_start_yard_line | String indicating where a given drive started consisting of team half and yard line number.                                                               
|\n| drive_end_yardline -> drive_end_yard_line     | String indicating where a given drive ended consisting of team half and yard line number.                                                                 |\n| roof_type -> roof                             | One of 'dome', 'outdoors', 'closed', 'open' indicating the roof status of the stadium the game was played in. (Source: Pro-Football-Reference)            |\n\n#### The following variables were added\n\n| Added Variables        | Description                                                                                                                                                                                                          |\n|------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| vegas_wp               | Estimated win probability for the posteam given the current situation at the start of the given play, incorporating pre-game Vegas line.                                                                             |\n| vegas_home_wp          | Estimated win probability for the home team incorporating pre-game Vegas line.                                                                                                                                       |\n| weather                | String describing the weather including temperature, humidity and wind (direction and speed). Doesn't change during the game!                                                                                        |\n| nfl_api_id             | UUID of the game in the new NFL API.                                                                                                                                                                                 |\n| play_clock             | Time on the playclock when the ball was snapped.     
                                                                                                                                                                |\n| play_deleted           | Binary indicator for deleted plays.                                                                                                                                                                                  |\n| end_clock_time         | Game time at the end of a given play.                                                                                                                                                                                |\n| end_yard_line          | String indicating the yardline at the end of the given play consisting of team half and yard line number.                                                                                                            |\n| drive_real_start_time  | Local day time when the drive started (currently not used by the NFL and therefore mostly 'NA').                                                                                                                     |\n| drive_ended_with_score | Binary indicator the drive ended with a score.                                                                                                                                                                       |\n| drive_quarter_start    | Numeric value indicating in which quarter the given drive has started.                                                                                                                                               |\n| drive_quarter_end      | Numeric value indicating in which quarter the given drive has ended.                                                                                                                                                 |\n| drive_play_id_started  | Play_id of the first play in the given drive.                                                                
                                                                                                        |\n| drive_play_id_ended    | Play_id of the last play in the given drive.                                                                                                                                                                         |\n| away_score             | Total points scored by the away team.                                                                                                                                                                                |\n| home_score             | Total points scored by the home team.                                                                                                                                                                                |\n| location               | Either 'Home' or 'Neutral' indicating if the home team played at home or at a neutral site.                                                                                                                          |\n| result                 | Equals home_score - away_score and means the game outcome from the perspective of the home team.                                                                                                                     |\n| total                  | Equals home_score + away_score and means the total points scored in the given game.                                                                                                                                  |\n| spread_line            | The closing spread line for the game. A positive number means the home team was favored by that many points, a negative number means the away team was favored by that many points. (Source: Pro-Football-Reference) |\n| total_line             | The closing total line for the game. 
(Source: Pro-Football-Reference)                                                                                                                                                |\n| div_game               | Binary indicator for if the given game was a division game.                                                                                                                                                          |\n| roof                   | One of 'dome', 'outdoors', 'closed', 'open' indicating the roof status of the stadium the game was played in. (Source: Pro-Football-Reference)                                                                       |\n| surface                | What type of ground the game was played on. (Source: Pro-Football-Reference)                                                                                                                                         |\n| temp                   | The temperature at the stadium only for 'roof' = 'outdoors' or 'open'. (Source: Pro-Football-Reference)                                                                                                              |\n| wind                   | The speed of the wind in miles/hour only for 'roof' = 'outdoors' or 'open'. (Source: Pro-Football-Reference)                                                                                                         |\n| home_coach             | First and last name of the home team coach. (Source: Pro-Football-Reference)                                                                                                                                         |\n| away_coach             | First and last name of the away team coach. (Source: Pro-Football-Reference)                                                                                                                                         |\n| stadium_id             | ID of the stadium the game was played in. 
(Source: Pro-Football-Reference)                                                                                                                                           |\n| game_stadium           | Name of the stadium the game was played in. (Source: Pro-Football-Reference)                                                                                                                                         |\n"
  },
  {
    "path": "R/aggregate_game_stats.R",
    "content": "################################################################################\n# Author: Ben Baldwin, Sebastian Carl\n# Styleguide: styler::tidyverse_style()\n################################################################################\n\n#' Get Official Game Stats\n#'\n#' @description\n#' `r lifecycle::badge(\"deprecated\")`\n#'\n#' This function was deprecated because we have a new, much better and\n#' harmonized approach in [`calculate_stats()`].\n#'\n#' @param pbp A data frame of NFL play-by-play data typically loaded with\n#' [load_pbp()] or [build_nflfastR_pbp()]. If the data doesn't include the variable\n#' `qb_epa`, the function `add_qb_epa()` will be called to add it.\n#' @param weekly If `TRUE`, returns week-by-week stats, otherwise, stats\n#' for the entire data frame.\n#' @description Build columns that aggregate official passing, rushing, and receiving stats\n#' either at the game level or at the level of the entire data frame passed.\n#' @return A data frame including the following columns (all ID columns are\n#' decoded to the gsis ID format):\n#' \\describe{\n#' \\item{player_id}{ID of the player. 
Use this to join to other sources.}\n#' \\item{player_name}{Name of the player}\n#' \\item{player_display_name}{Full name of the player}\n#' \\item{position}{Position of the player}\n#' \\item{position_group}{Position group of the player}\n#' \\item{headshot_url}{URL to a player headshot image}\n#' \\item{games}{The number of games where the player recorded passing, rushing or receiving stats.}\n#' \\item{recent_team}{Most recent team player appears in `pbp` with.}\n#' \\item{season}{Season if `weekly` is `TRUE`}\n#' \\item{week}{Week if `weekly` is `TRUE`}\n#' \\item{season_type}{`REG` or `POST` if `weekly` is `TRUE`}\n#' \\item{opponent_team}{The player's opponent team if `weekly` is `TRUE`}\n#' \\item{completions}{The number of completed passes.}\n#' \\item{attempts}{The number of pass attempts as defined by the NFL.}\n#' \\item{passing_yards}{Yards gained on pass plays.}\n#' \\item{passing_tds}{The number of passing touchdowns.}\n#' \\item{interceptions}{The number of interceptions thrown.}\n#' \\item{sacks}{The number of times sacked.}\n#' \\item{sack_yards}{Yards lost on sack plays.}\n#' \\item{sack_fumbles}{The number of sacks with a fumble.}\n#' \\item{sack_fumbles_lost}{The number of sacks with a lost fumble.}\n#' \\item{passing_air_yards}{Passing air yards (includes incomplete passes).}\n#' \\item{passing_yards_after_catch}{Yards after the catch gained on plays in\n#' which player was the passer (this is an unofficial stat and may differ slightly\n#' between different sources).}\n#' \\item{passing_first_downs}{First downs on pass attempts.}\n#' \\item{passing_epa}{Total expected points added on pass attempts and sacks.\n#' NOTE: this uses the variable `qb_epa`, which gives QB credit for EPA for up\n#' to the point where a receiver lost a fumble after a completed catch and makes\n#' EPA work more like passing yards on plays with fumbles.}\n#' \\item{passing_2pt_conversions}{Two-point conversion passes.}\n#' \\item{pacr}{Passing Air Conversion Ratio. 
PACR = `passing_yards` / `passing_air_yards`}\n#' \\item{dakota}{Adjusted EPA + CPOE composite based on coefficients which best predict adjusted EPA/play in the following year.}\n#' \\item{carries}{The number of official rush attempts (incl. scrambles and kneel downs).\n#' Rushes after a lateral reception don't count as a carry.}\n#' \\item{rushing_yards}{Yards gained when rushing with the ball (incl. scrambles and kneel downs).\n#' Also includes yards gained after obtaining a lateral on a play that started\n#' with a rushing attempt.}\n#' \\item{rushing_tds}{The number of rushing touchdowns (incl. scrambles).\n#' Also includes touchdowns after obtaining a lateral on a play that started\n#' with a rushing attempt.}\n#' \\item{rushing_fumbles}{The number of rushes with a fumble.}\n#' \\item{rushing_fumbles_lost}{The number of rushes with a lost fumble.}\n#' \\item{rushing_first_downs}{First downs on rush attempts (incl. scrambles).}\n#' \\item{rushing_epa}{Expected points added on rush attempts (incl. scrambles and kneel downs).}\n#' \\item{rushing_2pt_conversions}{Two-point conversion rushes}\n#' \\item{receptions}{The number of pass receptions. Lateral receptions officially\n#' don't count as a reception.}\n#' \\item{targets}{The number of pass plays where the player was the targeted receiver.}\n#' \\item{receiving_yards}{Yards gained after a pass reception. Includes yards\n#' gained after receiving a lateral on a play that started as a pass play.}\n#' \\item{receiving_tds}{The number of touchdowns following a pass reception.\n#' Also includes touchdowns after receiving a lateral on a play that started\n#' as a pass play.}\n#' \\item{receiving_air_yards}{Receiving air yards (incl. 
incomplete passes).}\n#' \\item{receiving_yards_after_catch}{Yards after the catch gained on plays in\n#' which player was receiver (this is an unofficial stat and may differ slightly\n#' between different sources).}\n#' \\item{receiving_fumbles}{The number of fumbles after a pass reception.}\n#' \\item{receiving_fumbles_lost}{The number of fumbles lost after a pass reception.}\n#' \\item{receiving_2pt_conversions}{Two-point conversion receptions}\n#' \\item{racr}{Receiver Air Conversion Ratio. RACR = `receiving_yards` / `receiving_air_yards`}\n#' \\item{target_share}{The share of targets of the player in all targets of his team}\n#' \\item{air_yards_share}{The share of receiving_air_yards of the player in all air_yards of his team}\n#' \\item{wopr}{Weighted Opportunity Rating. WOPR = 1.5 × `target_share` + 0.7 × `air_yards_share`}\n#' \\item{fantasy_points}{Standard fantasy points.}\n#' \\item{fantasy_points_ppr}{PPR fantasy points.}\n#' }\n#' @export\n#' @keywords internal\n#' @seealso The function [load_player_stats()] and the corresponding examples\n#' on [the nflfastR website](https://nflfastr.com/articles/nflfastR.html#example-11-replicating-official-stats)\n#' @examples\n#' \\donttest{\n#' try({# to avoid CRAN test problems\n#' # pbp <- nflfastR::load_pbp(2020)\n#'\n#' # weekly <- calculate_player_stats(pbp, weekly = TRUE)\n#' # dplyr::glimpse(weekly)\n#'\n#' # overall <- calculate_player_stats(pbp, weekly = FALSE)\n#' # dplyr::glimpse(overall)\n#' })\n#' }\ncalculate_player_stats <- function(pbp, weekly = FALSE) {\n  lifecycle::deprecate_warn(\n    \"5.0\",\n    \"calculate_player_stats()\",\n    \"calculate_stats()\"\n  )\n\n  # need newer version of nflreadr to use load_players\n  rlang::check_installed(\"nflreadr (>= 1.3.0)\", \"to join player information.\")\n\n  # Prepare data ------------------------------------------------------------\n\n  # load plays with multiple laterals\n  mult_lats <- nflreadr::rds_from_url(\n    
\"https://github.com/nflverse/nflverse-data/releases/download/misc/multiple_lateral_yards.rds\"\n  ) |>\n    dplyr::mutate(\n      season = substr(.data$game_id, 1, 4) |> as.integer(),\n      week = substr(.data$game_id, 6, 7) |> as.integer()\n    ) |>\n    dplyr::filter(.data$yards != 0) |>\n    # the list includes all plays with multiple laterals\n    # and all receivers. Since the last one already is in the\n    # pbp data, we have to drop him here so the entry isn't duplicated\n    dplyr::group_by(.data$game_id, .data$play_id) |>\n    dplyr::slice(seq_len(dplyr::n() - 1)) |>\n    dplyr::ungroup() |>\n    # there are some very rare cases where a player collects lateral yards\n    # multiple times in the same play. We need to aggregate here to make sure\n    # this doesn't mess up joins (#289)\n    dplyr::group_by(\n      .data$season,\n      .data$week,\n      .data$type,\n      .data$gsis_player_id\n    ) |>\n    dplyr::summarise(yards = sum(.data$yards)) |>\n    dplyr::ungroup()\n\n  # filter down to the 2 dfs we need\n  suppressMessages({\n    # 1. for \"normal\" plays: get plays that count in official stats\n    data <- pbp |>\n      dplyr::filter(\n        !is.na(.data$down),\n        .data$play_type %in% c(\"pass\", \"qb_kneel\", \"qb_spike\", \"run\")\n      ) |>\n      decode_player_ids()\n\n    if (!\"qb_epa\" %in% names(data)) {\n      data <- add_qb_epa(data)\n    }\n\n    # 2. 
for 2pt conversions only, get those plays\n    two_points <- pbp |>\n      dplyr::filter(.data$two_point_conv_result == \"success\") |>\n      dplyr::select(\n        \"week\",\n        \"season\",\n        \"posteam\",\n        \"defteam\",\n        \"pass_attempt\",\n        \"rush_attempt\",\n        \"passer_player_name\",\n        \"passer_player_id\",\n        \"rusher_player_name\",\n        \"rusher_player_id\",\n        \"lateral_rusher_player_name\",\n        \"lateral_rusher_player_id\",\n        \"receiver_player_name\",\n        \"receiver_player_id\",\n        \"lateral_receiver_player_name\",\n        \"lateral_receiver_player_id\"\n      ) |>\n      decode_player_ids()\n  })\n\n  if (!\"special\" %in% names(pbp)) {\n    # we need this column for the special teams tds\n    pbp <- pbp |>\n      dplyr::mutate(\n        special = dplyr::if_else(\n          .data$play_type %in%\n            c(\"extra_point\", \"field_goal\", \"kickoff\", \"punt\"),\n          1,\n          0\n        )\n      )\n  }\n\n  s_type <- pbp |>\n    dplyr::select(\"season\", \"season_type\", \"week\") |>\n    dplyr::distinct()\n\n  # we'll join some player information like position or full name later\n  # so we load it here to be able to use it for racr ids as well\n  player_info <- nflreadr::load_players() |>\n    dplyr::select(\n      \"player_id\" = \"gsis_id\",\n      \"player_display_name\" = \"display_name\",\n      \"player_name\" = \"short_name\",\n      \"position\",\n      \"position_group\",\n      \"headshot_url\" = \"headshot\"\n    )\n\n  # load gsis_ids of RBs, FBs and HBs for RACR\n  racr_ids <- player_info |>\n    dplyr::filter(.data$position %in% c(\"RB\", \"FB\", \"HB\")) |>\n    dplyr::select(\"gsis_id\" = \"player_id\")\n\n  # Passing stats -----------------------------------------------------------\n\n  # get passing stats\n  pass_df <- data |>\n    dplyr::filter(.data$play_type %in% c(\"pass\", \"qb_spike\")) |>\n    
dplyr::group_by(.data$passer_player_id, .data$week, .data$season) |>\n    dplyr::summarize(\n      passing_yards_after_catch = sum(\n        (.data$passing_yards - .data$air_yards) * .data$complete_pass,\n        na.rm = TRUE\n      ),\n      name_pass = dplyr::first(.data$passer_player_name),\n      team_pass = dplyr::first(.data$posteam),\n      opp_pass = dplyr::first(.data$defteam),\n      passing_yards = sum(.data$passing_yards, na.rm = TRUE),\n      passing_tds = sum(\n        .data$touchdown == 1 &\n          .data$td_team == .data$posteam &\n          .data$complete_pass == 1\n      ),\n      interceptions = sum(.data$interception),\n      attempts = sum(\n        .data$complete_pass == 1 |\n          .data$incomplete_pass == 1 |\n          .data$interception == 1\n      ),\n      completions = sum(.data$complete_pass == 1),\n      sack_fumbles = sum(\n        .data$fumble == 1 & .data$fumbled_1_player_id == .data$passer_player_id\n      ),\n      sack_fumbles_lost = sum(\n        .data$fumble_lost == 1 &\n          .data$fumbled_1_player_id == .data$passer_player_id &\n          .data$fumble_recovery_1_team != .data$posteam\n      ),\n      passing_air_yards = sum(.data$air_yards, na.rm = TRUE),\n      sacks = sum(.data$sack),\n      sack_yards = -1 * sum(.data$yards_gained * .data$sack),\n      passing_first_downs = sum(.data$first_down_pass),\n      passing_epa = sum(.data$qb_epa, na.rm = TRUE),\n      pacr = .data$passing_yards / .data$passing_air_yards,\n      pacr = dplyr::case_when(\n        is.nan(.data$pacr) ~ NA_real_,\n        .data$passing_air_yards <= 0 ~ 0,\n        TRUE ~ .data$pacr\n      ),\n    ) |>\n    dplyr::rename(\"player_id\" = \"passer_player_id\") |>\n    dplyr::ungroup()\n\n  if (isTRUE(weekly)) {\n    pass_df <- add_dakota(pass_df, pbp = pbp, weekly = weekly)\n  }\n\n  pass_two_points <- two_points |>\n    dplyr::filter(.data$pass_attempt == 1) |>\n    dplyr::group_by(.data$passer_player_id, .data$week, .data$season) |>\n    
dplyr::summarise(\n      # need name_pass and team_pass here for the full join in the next pipe\n      name_pass = custom_mode(.data$passer_player_name),\n      team_pass = custom_mode(.data$posteam),\n      opp_pass = custom_mode(.data$defteam),\n      passing_2pt_conversions = dplyr::n()\n    ) |>\n    dplyr::rename(\"player_id\" = \"passer_player_id\") |>\n    dplyr::ungroup()\n\n  pass_df <- pass_df |>\n    # need a full join because players without passing stats that recorded\n    # a passing two point (e.g. WRs) are dropped in any other join\n    dplyr::full_join(\n      pass_two_points,\n      by = c(\n        \"player_id\",\n        \"week\",\n        \"season\",\n        \"name_pass\",\n        \"team_pass\",\n        \"opp_pass\"\n      )\n    ) |>\n    dplyr::mutate(\n      passing_2pt_conversions = dplyr::if_else(\n        is.na(.data$passing_2pt_conversions),\n        0L,\n        .data$passing_2pt_conversions\n      )\n    ) |>\n    dplyr::filter(!is.na(.data$player_id))\n\n  pass_df_nas <- is.na(pass_df)\n  epa_index <- which(\n    dimnames(pass_df_nas)[[2]] %in% c(\"passing_epa\", \"dakota\", \"pacr\")\n  )\n  pass_df_nas[, epa_index] <- c(FALSE)\n\n  pass_df[pass_df_nas] <- 0\n\n  # Rushing stats -----------------------------------------------------------\n\n  # rush df 1: primary rusher\n  rushes <- data |>\n    dplyr::filter(.data$play_type %in% c(\"run\", \"qb_kneel\")) |>\n    dplyr::group_by(.data$rusher_player_id, .data$week, .data$season) |>\n    dplyr::summarize(\n      name_rush = dplyr::first(.data$rusher_player_name),\n      team_rush = dplyr::first(.data$posteam),\n      opp_rush = dplyr::first(.data$defteam),\n      yards = sum(.data$rushing_yards, na.rm = TRUE),\n      tds = sum(.data$td_player_id == .data$rusher_player_id, na.rm = TRUE),\n      carries = dplyr::n(),\n      rushing_fumbles = sum(\n        .data$fumble == 1 &\n          .data$fumbled_1_player_id == .data$rusher_player_id &\n          
is.na(.data$lateral_rusher_player_id)\n      ),\n      rushing_fumbles_lost = sum(\n        .data$fumble_lost == 1 &\n          .data$fumbled_1_player_id == .data$rusher_player_id &\n          is.na(.data$lateral_rusher_player_id) &\n          .data$fumble_recovery_1_team != .data$posteam\n      ),\n      rushing_first_downs = sum(\n        .data$first_down_rush & is.na(.data$lateral_rusher_player_id)\n      ),\n      rushing_epa = sum(.data$epa, na.rm = TRUE)\n    ) |>\n    dplyr::ungroup()\n\n  # rush df 2: lateral\n  laterals <- data |>\n    dplyr::filter(!is.na(.data$lateral_rusher_player_id)) |>\n    dplyr::group_by(.data$lateral_rusher_player_id, .data$week, .data$season) |>\n    dplyr::summarize(\n      lateral_yards = sum(.data$lateral_rushing_yards, na.rm = TRUE),\n      lateral_fds = sum(.data$first_down_rush, na.rm = TRUE),\n      lateral_tds = sum(\n        .data$td_player_id == .data$lateral_rusher_player_id,\n        na.rm = TRUE\n      ),\n      lateral_att = dplyr::n(),\n      lateral_fumbles = sum(.data$fumble, na.rm = TRUE),\n      lateral_fumbles_lost = sum(.data$fumble_lost, na.rm = TRUE)\n    ) |>\n    dplyr::ungroup() |>\n    dplyr::rename(\"rusher_player_id\" = \"lateral_rusher_player_id\") |>\n    dplyr::bind_rows(\n      mult_lats |>\n        dplyr::filter(\n          .data$type == \"lateral_rushing\" &\n            .data$season %in% data$season &\n            .data$week %in% data$week\n        ) |>\n        dplyr::select(\n          \"season\",\n          \"week\",\n          \"rusher_player_id\" = \"gsis_player_id\",\n          \"lateral_yards\" = \"yards\"\n        ) |>\n        dplyr::mutate(lateral_tds = 0L, lateral_att = 1L)\n    ) |>\n    # at this stage it is possible that a player is duplicated because he\n    # has lateral yards both in the regular pbp and in the multiple laterals file.\n    # This can happen when a player was the last lateral player in one play and\n    # not the last lateral player in another play in the same 
game (wow absurd)\n    # We summarise all columns to make sure there is only one row per player\n    # per game. See (#289)\n    dplyr::group_by(.data$rusher_player_id, .data$week, .data$season) |>\n    dplyr::summarise_all(.funs = sum, na.rm = TRUE) |>\n    dplyr::ungroup()\n\n  # rush df: join\n  rush_df <- rushes |>\n    dplyr::left_join(laterals, by = c(\"rusher_player_id\", \"week\", \"season\")) |>\n    dplyr::mutate(\n      lateral_yards = dplyr::if_else(\n        is.na(.data$lateral_yards),\n        0,\n        .data$lateral_yards\n      ),\n      lateral_tds = dplyr::if_else(\n        is.na(.data$lateral_tds),\n        0L,\n        .data$lateral_tds\n      ),\n      lateral_fumbles = dplyr::if_else(\n        is.na(.data$lateral_fumbles),\n        0,\n        .data$lateral_fumbles\n      ),\n      lateral_fumbles_lost = dplyr::if_else(\n        is.na(.data$lateral_fumbles_lost),\n        0,\n        .data$lateral_fumbles_lost\n      ),\n      lateral_fds = dplyr::if_else(\n        is.na(.data$lateral_fds),\n        0,\n        .data$lateral_fds\n      )\n    ) |>\n    dplyr::mutate(\n      rushing_yards = .data$yards + .data$lateral_yards,\n      rushing_tds = .data$tds + .data$lateral_tds,\n      rushing_first_downs = .data$rushing_first_downs + .data$lateral_fds,\n      rushing_fumbles = .data$rushing_fumbles + .data$lateral_fumbles,\n      rushing_fumbles_lost = .data$rushing_fumbles_lost +\n        .data$lateral_fumbles_lost\n    ) |>\n    dplyr::rename(\"player_id\" = \"rusher_player_id\") |>\n    dplyr::select(\n      \"player_id\",\n      \"week\",\n      \"season\",\n      \"name_rush\",\n      \"team_rush\",\n      \"opp_rush\",\n      \"rushing_yards\",\n      \"carries\",\n      \"rushing_tds\",\n      \"rushing_fumbles\",\n      \"rushing_fumbles_lost\",\n      \"rushing_first_downs\",\n      \"rushing_epa\"\n    ) |>\n    dplyr::ungroup()\n\n  rush_two_points <- two_points |>\n    dplyr::filter(.data$rush_attempt == 1) |>\n    
dplyr::group_by(.data$rusher_player_id, .data$week, .data$season) |>\n    dplyr::summarise(\n      # need name_rush and team_rush here for the full join in the next pipe\n      name_rush = custom_mode(.data$rusher_player_name),\n      team_rush = custom_mode(.data$posteam),\n      opp_rush = custom_mode(.data$defteam),\n      rushing_2pt_conversions = dplyr::n()\n    ) |>\n    dplyr::rename(\"player_id\" = \"rusher_player_id\") |>\n    dplyr::ungroup()\n\n  rush_df <- rush_df |>\n    # need a full join because players without rushing stats that recorded\n    # a rushing two point (mostly QBs) are dropped in any other join\n    dplyr::full_join(\n      rush_two_points,\n      by = c(\n        \"player_id\",\n        \"week\",\n        \"season\",\n        \"name_rush\",\n        \"team_rush\",\n        \"opp_rush\"\n      )\n    ) |>\n    dplyr::mutate(\n      rushing_2pt_conversions = dplyr::if_else(\n        is.na(.data$rushing_2pt_conversions),\n        0L,\n        .data$rushing_2pt_conversions\n      )\n    ) |>\n    dplyr::filter(!is.na(.data$player_id))\n\n  rush_df_nas <- is.na(rush_df)\n  epa_index <- which(dimnames(rush_df_nas)[[2]] == \"rushing_epa\")\n  rush_df_nas[, epa_index] <- c(FALSE)\n\n  rush_df[rush_df_nas] <- 0\n\n  # Receiving stats ---------------------------------------------------------\n\n  # receiver df 1: primary receiver\n  rec <- data |>\n    dplyr::filter(!is.na(.data$receiver_player_id)) |>\n    dplyr::group_by(.data$receiver_player_id, .data$week, .data$season) |>\n    dplyr::summarize(\n      name_receiver = dplyr::first(.data$receiver_player_name),\n      team_receiver = dplyr::first(.data$posteam),\n      opp_receiver = dplyr::first(.data$defteam),\n      yards = sum(.data$receiving_yards, na.rm = TRUE),\n      receptions = sum(.data$complete_pass == 1),\n      targets = dplyr::n(),\n      tds = sum(.data$td_player_id == .data$receiver_player_id, na.rm = TRUE),\n      receiving_fumbles = sum(\n        .data$fumble == 1 &\n         
 .data$fumbled_1_player_id == .data$receiver_player_id &\n          is.na(.data$lateral_receiver_player_id)\n      ),\n      receiving_fumbles_lost = sum(\n        .data$fumble_lost == 1 &\n          .data$fumbled_1_player_id == .data$receiver_player_id &\n          is.na(.data$lateral_receiver_player_id) &\n          .data$fumble_recovery_1_team != .data$posteam\n      ),\n      receiving_air_yards = sum(.data$air_yards, na.rm = TRUE),\n      receiving_yards_after_catch = sum(.data$yards_after_catch, na.rm = TRUE),\n      receiving_first_downs = sum(\n        .data$first_down_pass & is.na(.data$lateral_receiver_player_id)\n      ),\n      receiving_epa = sum(.data$epa, na.rm = TRUE)\n    ) |>\n    dplyr::ungroup()\n\n  # receiver df 2: lateral\n  laterals <- data |>\n    dplyr::filter(!is.na(.data$lateral_receiver_player_id)) |>\n    dplyr::group_by(\n      .data$lateral_receiver_player_id,\n      .data$week,\n      .data$season\n    ) |>\n    dplyr::summarize(\n      lateral_yards = sum(.data$lateral_receiving_yards, na.rm = TRUE),\n      lateral_tds = sum(\n        .data$td_player_id == .data$lateral_receiver_player_id,\n        na.rm = TRUE\n      ),\n      lateral_att = dplyr::n(),\n      lateral_fds = sum(.data$first_down_pass, na.rm = TRUE),\n      lateral_fumbles = sum(.data$fumble, na.rm = TRUE),\n      lateral_fumbles_lost = sum(.data$fumble_lost, na.rm = TRUE)\n    ) |>\n    dplyr::ungroup() |>\n    dplyr::rename(\"receiver_player_id\" = \"lateral_receiver_player_id\") |>\n    dplyr::bind_rows(\n      mult_lats |>\n        dplyr::filter(\n          .data$type == \"lateral_receiving\" &\n            .data$season %in% data$season &\n            .data$week %in% data$week\n        ) |>\n        dplyr::select(\n          \"season\",\n          \"week\",\n          \"receiver_player_id\" = \"gsis_player_id\",\n          \"lateral_yards\" = \"yards\"\n        ) |>\n        dplyr::mutate(lateral_tds = 0L, lateral_att = 1L)\n    ) |>\n    # at this stage it is possible 
that a player is duplicated because he\n    # has lateral yards both in the regular pbp and in the multiple laterals file.\n    # This can happen when a player was the last lateral player in one play and\n    # not the last lateral player in another play in the same game (wow absurd)\n    # We summarise all columns to make sure there is only one row per player\n    # per game. See (#289)\n    dplyr::group_by(.data$receiver_player_id, .data$week, .data$season) |>\n    dplyr::summarise_all(.funs = sum, na.rm = TRUE) |>\n    dplyr::ungroup()\n\n  # receiver df 3: team receiving for WOPR\n  rec_team <- data |>\n    dplyr::filter(!is.na(.data$receiver_player_id)) |>\n    dplyr::group_by(.data$posteam, .data$week, .data$season) |>\n    dplyr::summarize(\n      team_targets = dplyr::n(),\n      team_air_yards = sum(.data$air_yards, na.rm = TRUE),\n    ) |>\n    dplyr::ungroup()\n\n  # rec df: join\n  rec_df <- rec |>\n    dplyr::left_join(\n      laterals,\n      by = c(\"receiver_player_id\", \"week\", \"season\")\n    ) |>\n    dplyr::left_join(\n      rec_team,\n      by = c(\"team_receiver\" = \"posteam\", \"week\", \"season\")\n    ) |>\n    dplyr::mutate(\n      lateral_yards = dplyr::if_else(\n        is.na(.data$lateral_yards),\n        0,\n        .data$lateral_yards\n      ),\n      lateral_tds = dplyr::if_else(\n        is.na(.data$lateral_tds),\n        0L,\n        .data$lateral_tds\n      ),\n      lateral_fumbles = dplyr::if_else(\n        is.na(.data$lateral_fumbles),\n        0,\n        .data$lateral_fumbles\n      ),\n      lateral_fumbles_lost = dplyr::if_else(\n        is.na(.data$lateral_fumbles_lost),\n        0,\n        .data$lateral_fumbles_lost\n      ),\n      lateral_fds = dplyr::if_else(\n        is.na(.data$lateral_fds),\n        0,\n        .data$lateral_fds\n      )\n    ) |>\n    dplyr::mutate(\n      receiving_yards = .data$yards + .data$lateral_yards,\n      receiving_tds = .data$tds + .data$lateral_tds,\n      
receiving_yards_after_catch = .data$receiving_yards_after_catch +\n        .data$lateral_yards,\n      receiving_first_downs = .data$receiving_first_downs + .data$lateral_fds,\n      receiving_fumbles = .data$receiving_fumbles + .data$lateral_fumbles,\n      receiving_fumbles_lost = .data$receiving_fumbles_lost +\n        .data$lateral_fumbles_lost,\n      racr = .data$receiving_yards / .data$receiving_air_yards,\n      racr = dplyr::case_when(\n        is.nan(.data$racr) ~ NA_real_,\n        .data$receiving_air_yards == 0 ~ 0,\n        # following Josh Hermsmeyer's definition, RACR stays < 0 for RBs (and FBs) and is set to\n        # 0 for Receivers. The list \"racr_ids\" includes all known RB and FB gsis_ids\n        .data$receiving_air_yards < 0 &\n          !.data$receiver_player_id %in% racr_ids$gsis_id ~ 0,\n        TRUE ~ .data$racr\n      ),\n      target_share = .data$targets / .data$team_targets,\n      air_yards_share = .data$receiving_air_yards / .data$team_air_yards,\n      wopr = 1.5 * .data$target_share + 0.7 * .data$air_yards_share\n    ) |>\n    dplyr::rename(\"player_id\" = \"receiver_player_id\") |>\n    dplyr::select(\n      \"player_id\",\n      \"week\",\n      \"season\",\n      \"name_receiver\",\n      \"team_receiver\",\n      \"opp_receiver\",\n      \"receiving_yards\",\n      \"receiving_air_yards\",\n      \"receiving_yards_after_catch\",\n      \"receptions\",\n      \"targets\",\n      \"receiving_tds\",\n      \"receiving_fumbles\",\n      \"receiving_fumbles_lost\",\n      \"receiving_first_downs\",\n      \"receiving_epa\",\n      \"racr\",\n      \"target_share\",\n      \"air_yards_share\",\n      \"wopr\"\n    )\n\n  rec_two_points <- two_points |>\n    dplyr::filter(.data$pass_attempt == 1) |>\n    dplyr::group_by(.data$receiver_player_id, .data$week, .data$season) |>\n    dplyr::summarise(\n      # need name_receiver and team_receiver here for the full join in the next pipe\n      name_receiver = 
custom_mode(.data$receiver_player_name),\n      team_receiver = custom_mode(.data$posteam),\n      opp_receiver = custom_mode(.data$defteam),\n      receiving_2pt_conversions = dplyr::n()\n    ) |>\n    dplyr::rename(\"player_id\" = \"receiver_player_id\") |>\n    dplyr::ungroup()\n\n  rec_df <- rec_df |>\n    # need a full join because players without receiving stats that recorded\n    # a receiving two point are dropped in any other join\n    dplyr::full_join(\n      rec_two_points,\n      by = c(\n        \"player_id\",\n        \"week\",\n        \"season\",\n        \"name_receiver\",\n        \"team_receiver\",\n        \"opp_receiver\"\n      )\n    ) |>\n    dplyr::mutate(\n      receiving_2pt_conversions = dplyr::if_else(\n        is.na(.data$receiving_2pt_conversions),\n        0L,\n        .data$receiving_2pt_conversions\n      )\n    ) |>\n    dplyr::filter(!is.na(.data$player_id), !is.na(.data$name_receiver))\n\n  rec_df_nas <- is.na(rec_df)\n  epa_index <- which(\n    dimnames(rec_df_nas)[[2]] %in%\n      c(\"receiving_epa\", \"racr\", \"target_share\", \"air_yards_share\", \"wopr\")\n  )\n  rec_df_nas[, epa_index] <- c(FALSE)\n\n  rec_df[rec_df_nas] <- 0\n\n  # Special Teams -----------------------------------------------------------\n\n  st_tds <- pbp |>\n    dplyr::filter(.data$special == 1 & !is.na(.data$td_player_id)) |>\n    dplyr::group_by(.data$td_player_id, .data$week, .data$season) |>\n    dplyr::summarise(\n      name_st = custom_mode(.data$td_player_name),\n      team_st = custom_mode(.data$td_team),\n      opp_st = custom_mode(.data$defteam),\n      special_teams_tds = sum(.data$touchdown, na.rm = TRUE)\n    ) |>\n    dplyr::rename(\"player_id\" = \"td_player_id\")\n\n  # Combine all stats -------------------------------------------------------\n\n  # combine all the stats together\n  player_df <- pass_df |>\n    dplyr::full_join(rush_df, by = c(\"player_id\", \"week\", \"season\")) |>\n    dplyr::full_join(rec_df, by = c(\"player_id\", 
\"week\", \"season\")) |>\n    dplyr::full_join(st_tds, by = c(\"player_id\", \"week\", \"season\")) |>\n    dplyr::left_join(s_type, by = c(\"season\", \"week\")) |>\n    dplyr::mutate(\n      player_name = dplyr::case_when(\n        !is.na(.data$name_pass) ~ .data$name_pass,\n        !is.na(.data$name_rush) ~ .data$name_rush,\n        !is.na(.data$name_receiver) ~ .data$name_receiver,\n        TRUE ~ .data$name_st\n      ),\n      recent_team = dplyr::case_when(\n        !is.na(.data$team_pass) ~ .data$team_pass,\n        !is.na(.data$team_rush) ~ .data$team_rush,\n        !is.na(.data$team_receiver) ~ .data$team_receiver,\n        TRUE ~ .data$team_st\n      ),\n      opponent_team = dplyr::case_when(\n        !is.na(.data$opp_pass) ~ .data$opp_pass,\n        !is.na(.data$opp_rush) ~ .data$opp_rush,\n        !is.na(.data$opp_receiver) ~ .data$opp_receiver,\n        TRUE ~ .data$opp_st\n      )\n    ) |>\n    dplyr::select(dplyr::any_of(c(\n      # id information\n      \"player_id\",\n      \"player_name\",\n      \"recent_team\",\n      \"season\",\n      \"week\",\n      \"season_type\",\n      \"opponent_team\",\n\n      # passing stats\n      \"completions\",\n      \"attempts\",\n      \"passing_yards\",\n      \"passing_tds\",\n      \"interceptions\",\n      \"sacks\",\n      \"sack_yards\",\n      \"sack_fumbles\",\n      \"sack_fumbles_lost\",\n      \"passing_air_yards\",\n      \"passing_yards_after_catch\",\n      \"passing_first_downs\",\n      \"passing_epa\",\n      \"passing_2pt_conversions\",\n      \"pacr\",\n      \"dakota\",\n\n      # rushing stats\n      \"carries\",\n      \"rushing_yards\",\n      \"rushing_tds\",\n      \"rushing_fumbles\",\n      \"rushing_fumbles_lost\",\n      \"rushing_first_downs\",\n      \"rushing_epa\",\n      \"rushing_2pt_conversions\",\n\n      # receiving stats\n      \"receptions\",\n      \"targets\",\n      \"receiving_yards\",\n      \"receiving_tds\",\n      \"receiving_fumbles\",\n      
\"receiving_fumbles_lost\",\n      \"receiving_air_yards\",\n      \"receiving_yards_after_catch\",\n      \"receiving_first_downs\",\n      \"receiving_epa\",\n      \"receiving_2pt_conversions\",\n      \"racr\",\n      \"target_share\",\n      \"air_yards_share\",\n      \"wopr\",\n\n      # special teams\n      \"special_teams_tds\"\n    ))) |>\n    dplyr::filter(!is.na(.data$player_id), !is.na(.data$player_name))\n\n  player_df_nas <- is.na(player_df)\n  epa_index <- which(\n    dimnames(player_df_nas)[[2]] %in%\n      c(\n        \"passing_epa\",\n        \"rushing_epa\",\n        \"receiving_epa\",\n        \"dakota\",\n        \"racr\",\n        \"target_share\",\n        \"air_yards_share\",\n        \"wopr\",\n        \"pacr\"\n      )\n  )\n  player_df_nas[, epa_index] <- c(FALSE)\n\n  player_df[player_df_nas] <- 0\n\n  player_df <- player_df |>\n    dplyr::mutate(\n      fantasy_points = 1 /\n        25 *\n        .data$passing_yards +\n        4 * .data$passing_tds +\n        -2 * .data$interceptions +\n        1 / 10 * (.data$rushing_yards + .data$receiving_yards) +\n        6 *\n          (.data$rushing_tds + .data$receiving_tds + .data$special_teams_tds) +\n        2 *\n          (.data$passing_2pt_conversions +\n            .data$rushing_2pt_conversions +\n            .data$receiving_2pt_conversions) +\n        -2 *\n          (.data$sack_fumbles_lost +\n            .data$rushing_fumbles_lost +\n            .data$receiving_fumbles_lost),\n\n      fantasy_points_ppr = .data$fantasy_points + .data$receptions\n    ) |>\n    dplyr::arrange(.data$player_id, .data$season, .data$week)\n\n  # if user doesn't want week-by-week input, aggregate the whole df\n  if (isFALSE(weekly)) {\n    player_df <- player_df |>\n      # helper variables to summarise targetshare and air yard share\n      # because targets and air yards summarise first\n      dplyr::mutate(\n        tgts = .data$targets,\n        rec_air_yds = .data$receiving_air_yards\n      ) |>\n      
dplyr::group_by(.data$player_id) |>\n      dplyr::summarise(\n        player_name = custom_mode(.data$player_name),\n        games = dplyr::n(),\n        recent_team = dplyr::last(.data$recent_team),\n        # passing\n        completions = sum(.data$completions),\n        attempts = sum(.data$attempts),\n        passing_yards = sum(.data$passing_yards),\n        passing_tds = sum(.data$passing_tds),\n        interceptions = sum(.data$interceptions),\n        sacks = sum(.data$sacks),\n        sack_yards = sum(.data$sack_yards),\n        sack_fumbles = sum(.data$sack_fumbles),\n        sack_fumbles_lost = sum(.data$sack_fumbles_lost),\n        passing_air_yards = sum(.data$passing_air_yards),\n        passing_yards_after_catch = sum(.data$passing_yards_after_catch),\n        passing_first_downs = sum(.data$passing_first_downs),\n        passing_epa = dplyr::if_else(\n          all(is.na(.data$passing_epa)),\n          NA_real_,\n          sum(.data$passing_epa, na.rm = TRUE)\n        ),\n        passing_2pt_conversions = sum(.data$passing_2pt_conversions),\n        pacr = .data$passing_yards / .data$passing_air_yards,\n\n        # rushing\n        carries = sum(.data$carries),\n        rushing_yards = sum(.data$rushing_yards),\n        rushing_tds = sum(.data$rushing_tds),\n        rushing_fumbles = sum(.data$rushing_fumbles),\n        rushing_fumbles_lost = sum(.data$rushing_fumbles_lost),\n        rushing_first_downs = sum(.data$rushing_first_downs),\n        rushing_epa = dplyr::if_else(\n          all(is.na(.data$rushing_epa)),\n          NA_real_,\n          sum(.data$rushing_epa, na.rm = TRUE)\n        ),\n        rushing_2pt_conversions = sum(.data$rushing_2pt_conversions),\n\n        # receiving\n        receptions = sum(.data$receptions),\n        targets = sum(.data$targets),\n        receiving_yards = sum(.data$receiving_yards),\n        receiving_tds = sum(.data$receiving_tds),\n        receiving_fumbles = sum(.data$receiving_fumbles),\n        
receiving_fumbles_lost = sum(.data$receiving_fumbles_lost),\n        receiving_air_yards = sum(.data$receiving_air_yards),\n        receiving_yards_after_catch = sum(.data$receiving_yards_after_catch),\n        receiving_first_downs = sum(.data$receiving_first_downs),\n        receiving_epa = dplyr::if_else(\n          all(is.na(.data$receiving_epa)),\n          NA_real_,\n          sum(.data$receiving_epa, na.rm = TRUE)\n        ),\n        receiving_2pt_conversions = sum(.data$receiving_2pt_conversions),\n        racr = .data$receiving_yards / .data$receiving_air_yards,\n        target_share = dplyr::if_else(\n          all(is.na(.data$target_share)),\n          NA_real_,\n          sum(.data$tgts, na.rm = TRUE) /\n            sum(.data$tgts / .data$target_share, na.rm = TRUE)\n        ),\n        air_yards_share = dplyr::if_else(\n          all(is.na(.data$air_yards_share)),\n          NA_real_,\n          sum(.data$rec_air_yds, na.rm = TRUE) /\n            sum(.data$rec_air_yds / .data$air_yards_share, na.rm = TRUE)\n        ),\n        wopr = 1.5 * .data$target_share + 0.7 * .data$air_yards_share,\n\n        # special teams\n        special_teams_tds = sum(.data$special_teams_tds),\n\n        # fantasy\n        fantasy_points = sum(.data$fantasy_points),\n        fantasy_points_ppr = sum(.data$fantasy_points_ppr)\n      ) |>\n      dplyr::ungroup() |>\n      dplyr::mutate(\n        racr = dplyr::case_when(\n          is.nan(.data$racr) ~ NA_real_,\n          .data$receiving_air_yards == 0 ~ 0,\n          # following Josh Hermsmeyer's definition, RACR stays < 0 for RBs (and FBs) and is set to\n          # 0 for Receivers. 
The list \"racr_ids\" includes all known RB and FB gsis_ids\n          .data$receiving_air_yards < 0 &\n            !.data$player_id %in% racr_ids$gsis_id ~ 0,\n          TRUE ~ .data$racr\n        ),\n        pacr = dplyr::case_when(\n          is.nan(.data$pacr) ~ NA_real_,\n          .data$passing_air_yards <= 0 ~ 0,\n          TRUE ~ .data$pacr\n        )\n      ) |>\n      add_dakota(pbp = pbp, weekly = weekly) |>\n      dplyr::select(\n        \"player_id\":\"pacr\",\n        dplyr::any_of(\"dakota\"),\n        dplyr::everything()\n      )\n  }\n\n  # data is missing position and player name can be messed up in pbp\n  # so we join player information next\n  player_df <- player_df |>\n    dplyr::select(-\"player_name\") |>\n    dplyr::left_join(player_info, by = \"player_id\") |>\n    dplyr::select(\n      \"player_id\",\n      \"player_name\",\n      \"player_display_name\",\n      \"position\",\n      \"position_group\",\n      \"headshot_url\",\n      dplyr::everything()\n    )\n\n  return(player_df)\n}\n\nadd_dakota <- function(add_to_this, pbp, weekly) {\n  dakota_model <- NULL\n  con <- url(\n    \"https://github.com/nflverse/nflfastR-data/blob/master/models/dakota_model.Rdata?raw=true\"\n  )\n  try(load(con), silent = TRUE)\n  close(con)\n\n  if (is.null(dakota_model)) {\n    user_message(\n      \"This function needs to download the model data from GitHub. 
Please check your Internet connection and try again!\",\n      \"oops\"\n    )\n    return(add_to_this)\n  }\n\n  if (!\"id\" %in% names(pbp)) {\n    pbp <- clean_pbp(pbp)\n  }\n  if (!\"qb_epa\" %in% names(pbp)) {\n    pbp <- add_qb_epa(pbp)\n  }\n\n  suppressMessages({\n    df <- pbp |>\n      dplyr::filter(.data$pass == 1 | .data$rush == 1) |>\n      dplyr::filter(\n        !is.na(.data$posteam) &\n          !is.na(.data$qb_epa) &\n          !is.na(.data$id) &\n          !is.na(.data$down)\n      ) |>\n      dplyr::mutate(\n        epa = dplyr::if_else(.data$qb_epa < -4.5, -4.5, .data$qb_epa)\n      ) |>\n      decode_player_ids()\n  })\n\n  if (isTRUE(weekly)) {\n    relevant_players <- add_to_this |>\n      dplyr::filter(.data$attempts >= 5) |>\n      dplyr::mutate(\n        filter_id = paste(.data$player_id, .data$season, .data$week, sep = \"_\")\n      ) |>\n      dplyr::pull(.data$filter_id)\n\n    model_data <- df |>\n      dplyr::group_by(.data$id, .data$week, .data$season) |>\n      dplyr::summarize(\n        n_plays = n(),\n        epa_per_play = sum(.data$epa) / .data$n_plays,\n        cpoe = mean(.data$cpoe, na.rm = TRUE)\n      ) |>\n      dplyr::ungroup() |>\n      dplyr::mutate(cpoe = dplyr::if_else(is.na(.data$cpoe), 0, .data$cpoe)) |>\n      dplyr::rename(\"player_id\" = \"id\") |>\n      dplyr::mutate(\n        filter_id = paste(.data$player_id, .data$season, .data$week, sep = \"_\")\n      ) |>\n      dplyr::filter(.data$filter_id %in% relevant_players)\n\n    model_data$dakota <- mgcv::predict.gam(dakota_model, model_data) |>\n      as.vector()\n\n    out <- add_to_this |>\n      dplyr::left_join(\n        model_data |>\n          dplyr::select(\"player_id\", \"week\", \"season\", \"dakota\"),\n        by = c(\"player_id\", \"week\", \"season\")\n      )\n  } else if (isFALSE(weekly)) {\n    relevant_players <- add_to_this |>\n      dplyr::filter(.data$attempts >= 5) |>\n      dplyr::pull(.data$player_id)\n\n    model_data <- df |>\n      
dplyr::group_by(.data$id) |>\n      dplyr::summarize(\n        n_plays = n(),\n        epa_per_play = sum(.data$epa) / .data$n_plays,\n        cpoe = mean(.data$cpoe, na.rm = TRUE)\n      ) |>\n      dplyr::ungroup() |>\n      dplyr::mutate(cpoe = dplyr::if_else(is.na(.data$cpoe), 0, .data$cpoe)) |>\n      dplyr::rename(\"player_id\" = \"id\") |>\n      dplyr::filter(.data$player_id %in% relevant_players)\n\n    model_data$dakota <- mgcv::predict.gam(dakota_model, model_data) |>\n      as.vector()\n\n    out <- add_to_this |>\n      dplyr::left_join(\n        model_data |>\n          dplyr::select(\"player_id\", \"dakota\"),\n        by = \"player_id\"\n      )\n  }\n  return(out)\n}\n"
  },
  {
    "path": "R/aggregate_game_stats_def.R",
    "content": "################################################################################\n# Author: Christian Lohr, Sebastian Carl, Tan Ho\n# Styleguide: styler::tidyverse_style()\n################################################################################\n\n#' Get Official Game Stats on Defense\n#'\n#' @description\n#' `r lifecycle::badge(\"deprecated\")`\n#'\n#' This function was deprecated because we have a new, much better and\n#' harmonized approach in [`calculate_stats()`].\n#'\n#' @param pbp A Data frame of NFL play-by-play data typically loaded with\n#'   [load_pbp()] or [build_nflfastR_pbp()]. If the data doesn't include the variable\n#'   `qb_epa`, the function `add_qb_epa()` will be called to add it.\n#' @param weekly If `TRUE`, returns week-by-week stats, otherwise, stats\n#'   for the entire Data frame.\n#' @description Build columns that aggregate official defense stats\n#'   either at the game level or at the level of the entire data frame passed.\n#' @return A data frame of defensive player stats. 
See dictionary (# TODO)\n#' @export\n#' @keywords internal\n#' @seealso The function [load_player_stats()] and the corresponding examples\n#' on [the nflfastR website](https://nflfastr.com/articles/nflfastR.html#example-11-replicating-official-stats)\n#' @examples\n#' \\donttest{\n#' try({# to avoid CRAN test problems\n#'   # pbp <- nflfastR::load_pbp(2020)\n#'\n#'   # weekly <- calculate_player_stats_def(pbp, weekly = TRUE)\n#'   # dplyr::glimpse(weekly)\n#'\n#'   # overall <- calculate_player_stats_def(pbp, weekly = FALSE)\n#'   # dplyr::glimpse(overall)\n#' })\n#' }\n#'\n\n#++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++\n# what do we need:\n#\n# Solo Tackles --> done\n# Tackles With Assist --> done\n# Assisted Tackles --> done\n# Tackles for Loss --> done\n# TFL Yards --> done\n# Sacks --> done\n# Sack Yards --> done\n# QB Hits --> done\n# Passes Defensed --> done\n# Interceptions --> done\n# Interception Yards --> done\n# Interception Return TDs ///// --> only \"TD\" for defense\n# Forced Fumbles --> done\n# Opp Fumble Recoveries --> done\n# Opp Fumble Recovery Yards --> done\n# Opp Fumble Recovery TDs ///// --> only \"TD\" for defense\n# Safeties --> done\n# Penalties --> done\n# Penalty Yards --> done\n# Fumbles --> done\n# Own Fumble Recoveries --> done\n# Own Fumble Recovery Yards --> done\n# Own Fumble Recovery TDs ///// --> only \"TD\" for defense\n\ncalculate_player_stats_def <- function(pbp, weekly = FALSE) {\n  lifecycle::deprecate_warn(\n    \"5.0\",\n    \"calculate_player_stats_def()\",\n    \"calculate_stats()\"\n  )\n\n  # need newer version of nflreadr to use load_players\n  rlang::check_installed(\"nflreadr (>= 1.3.0)\")\n\n  # Prepare data ------------------------------------------------------------\n\n  suppressMessages({\n    # 1. 
for \"normal\" plays: get plays that count in official stats\n    # we exclude special teams and 2pts here for now\n    data <- pbp |>\n      dplyr::filter(\n        !is.na(.data$down),\n        .data$play_type %in% c(\"pass\", \"qb_kneel\", \"qb_spike\", \"run\")\n      ) |>\n      nflfastR::decode_player_ids()\n\n    # 2. filter penalty plays for penalty stats\n    penalty_data <- pbp |>\n      dplyr::filter(.data$penalty == 1) |>\n      nflfastR::decode_player_ids()\n  })\n\n  stype <- data |>\n    dplyr::select(\"season\", \"week\", \"season_type\") |>\n    dplyr::distinct()\n\n  # Tackling stats -----------------------------------------------------------\n\n  tackle_vars <- c(\n    \"solo_tackle_1_player_id\",\n    \"tackle_for_loss_1_player_id\",\n    \"assist_tackle_1_player_id\",\n    \"tackle_with_assist_1_player_id\",\n    \"solo_tackle_2_player_id\",\n    \"forced_fumble_player_1_player_id\",\n    \"assist_tackle_2_player_id\",\n    \"forced_fumble_player_2_player_id\"\n  )\n\n  # get tackling stats\n  tackle_df <- data |>\n    dplyr::select(\"season\", \"week\", \"defteam\", dplyr::any_of(tackle_vars)) |>\n    tidyr::pivot_longer(\n      cols = dplyr::any_of(tackle_vars),\n      names_to = \"desc\",\n      values_to = \"tackle_player_id\",\n      values_drop_na = TRUE\n    ) |>\n    dplyr::count(\n      .data$tackle_player_id,\n      .data$defteam,\n      .data$season,\n      .data$week,\n      .data$desc\n    ) |>\n    dplyr::mutate(\n      desc = stringr::str_remove_all(.data$desc, \"_player_id\") |>\n        stringr::str_remove_all(\"_[0-9]\")\n    ) |>\n    tidyr::pivot_wider(\n      names_from = .data$desc,\n      values_from = .data$n,\n      values_fill = 0L,\n      values_fn = sum\n    ) |>\n    add_column_if_missing(\n      \"solo_tackle\",\n      \"tackle_with_assist\",\n      \"tackle_for_loss\",\n      \"assist_tackle\",\n      \"forced_fumble_player\"\n    ) |>\n    dplyr::mutate(\n      tackles = .data$solo_tackle + 
.data$tackle_with_assist\n    ) |>\n    dplyr::select(\n      \"season\",\n      \"week\",\n      \"team\" = \"defteam\",\n      \"player_id\" = \"tackle_player_id\",\n      \"tackles\",\n      \"tackles_solo\" = \"solo_tackle\",\n      \"tackles_with_assist\" = \"tackle_with_assist\",\n      \"tackle_assists\" = \"assist_tackle\",\n      \"forced_fumbles\" = \"forced_fumble_player\",\n      \"tackles_for_loss\" = \"tackle_for_loss\"\n    ) |>\n    dplyr::group_by(.data$season, .data$week, .data$team, .data$player_id) |>\n    dplyr::summarise(\n      tackles = sum(.data$tackles, na.rm = TRUE),\n      tackles_solo = sum(.data$tackles_solo, na.rm = TRUE),\n      tackles_with_assist = sum(.data$tackles_with_assist, na.rm = TRUE),\n      tackle_assists = sum(.data$tackle_assists, na.rm = TRUE),\n      forced_fumbles = sum(.data$forced_fumbles, na.rm = TRUE),\n      tackles_for_loss = sum(.data$tackles_for_loss, na.rm = TRUE)\n    ) |>\n    dplyr::ungroup()\n\n  # get tackle for loss yards\n  tackle_yards_df <- data |>\n    dplyr::filter(\n      .data$tackled_for_loss == 1,\n      .data$fumble == 0,\n      .data$sack == 0\n    ) |>\n    dplyr::select(\n      \"season\",\n      \"week\",\n      \"team\" = \"defteam\",\n      \"tackle_for_loss_1_player_id\",\n      \"tackle_for_loss_2_player_id\",\n      \"yards_gained\"\n    ) |>\n    tidyr::pivot_longer(\n      cols = c(\"tackle_for_loss_1_player_id\", \"tackle_for_loss_2_player_id\"),\n      names_to = \"desc\",\n      values_to = \"player_id\",\n      values_drop_na = TRUE\n    ) |>\n    dplyr::group_by(.data$season, .data$week, .data$team, .data$player_id) |>\n    dplyr::summarise(\n      tfl_yards = sum(-.data$yards_gained, na.rm = TRUE)\n    ) |>\n    dplyr::ungroup()\n\n  # Sack and QB Hits stats -----------------------------------------------------------\n\n  # get sack and pressure stats\n  pressure_df <- data |>\n    dplyr::select(\n      \"season\",\n      \"week\",\n      \"team\" = \"defteam\",\n      
dplyr::contains(\"sack_\"),\n      \"yards_gained\",\n      dplyr::starts_with(\"qb_hit_\"),\n      -dplyr::contains(\"_name\")\n    ) |>\n    tidyr::pivot_longer(\n      cols = c(\n        dplyr::contains(\"sack_\"),\n        dplyr::starts_with(\"qb_hit_\")\n      ),\n      names_to = \"desc\",\n      names_prefix = \"sk_\",\n      values_to = \"player_id\",\n      values_drop_na = TRUE\n    ) |>\n    dplyr::mutate(\n      n = dplyr::case_when(\n        .data$desc %in%\n          c(\"half_sack_1_player_id\", \"half_sack_2_player_id\") ~ 0.5,\n        TRUE ~ 1\n      ),\n      desc = stringr::str_remove_all(.data$desc, \"_player_id\") |>\n        stringr::str_remove_all(\"_[0-9]\") |>\n        stringr::str_remove(\"half_\")\n    ) |>\n    dplyr::mutate(\n      sack_yards = .data$n * .data$yards_gained * -1\n    ) |>\n    tidyr::pivot_wider(\n      names_from = .data$desc,\n      values_from = c(.data$n, .data$sack_yards),\n      values_fn = sum,\n      values_fill = 0L\n    ) |>\n    add_column_if_missing(\"n_sack\", \"n_qb_hit\", \"sack_yards_sack\") |>\n    dplyr::select(\n      \"season\",\n      \"week\",\n      \"team\",\n      \"player_id\",\n      \"sacks\" = \"n_sack\",\n      \"qb_hit\" = \"n_qb_hit\",\n      \"sack_yards\" = \"sack_yards_sack\"\n    ) |>\n    dplyr::group_by(.data$season, .data$week, .data$team, .data$player_id) |>\n    dplyr::summarise(\n      sacks = sum(.data$sacks, na.rm = TRUE),\n      qb_hit = sum(.data$qb_hit, na.rm = TRUE),\n      sack_yards = sum(.data$sack_yards, na.rm = TRUE)\n    ) |>\n    dplyr::ungroup()\n\n  # Interception and Deflection stats ---------------------------------------------------------\n\n  # get int and def stats\n  int_df <- data |>\n    dplyr::select(\n      \"season\",\n      \"week\",\n      \"return_yards\",\n      \"team\" = \"defteam\",\n      dplyr::starts_with(\"interception_\"),\n      dplyr::starts_with(\"pass_defense_\"),\n      -dplyr::contains(\"_name\")\n    ) |>\n    tidyr::pivot_longer(\n    
  cols = c(\n        dplyr::starts_with(\"interception_\"),\n        dplyr::starts_with(\"pass_defense_\")\n      ),\n      names_to = \"desc\",\n      names_prefix = \"int_\",\n      values_to = \"db_player_id\",\n      values_drop_na = TRUE\n    ) |>\n    dplyr::mutate(\n      n = 1,\n      desc = stringr::str_remove_all(.data$desc, \"_player_id\") |>\n        stringr::str_remove_all(\"_[0-9]\")\n    ) |>\n    tidyr::pivot_wider(\n      names_from = \"desc\",\n      values_from = c(\"n\", \"return_yards\"),\n      values_fn = sum,\n      values_fill = 0L\n    ) |>\n    add_column_if_missing(\n      \"n_interception\",\n      \"n_pass_defense\",\n      \"return_yards_interception\"\n    ) |>\n    dplyr::select(\n      \"season\",\n      \"week\",\n      \"team\",\n      \"player_id\" = \"db_player_id\",\n      \"int\" = \"n_interception\",\n      \"pass_defended\" = \"n_pass_defense\",\n      \"int_yards\" = \"return_yards_interception\"\n    ) |>\n    dplyr::group_by(.data$season, .data$week, .data$team, .data$player_id) |>\n    dplyr::summarise(\n      int = sum(.data$int, na.rm = TRUE),\n      pass_defended = sum(.data$pass_defended, na.rm = TRUE),\n      int_yards = sum(.data$int_yards, na.rm = TRUE)\n    ) |>\n    dplyr::ungroup()\n\n  # Safety stats -----------------------------------------------------------\n\n  safety_df <- data |>\n    dplyr::filter(.data$safety == 1, !is.na(.data$safety_player_id)) |>\n    dplyr::select(\n      \"season\",\n      \"week\",\n      \"team\" = \"defteam\",\n      \"player_id\" = \"safety_player_id\"\n    ) |>\n    dplyr::count(\n      .data$season,\n      .data$week,\n      .data$team,\n      .data$player_id,\n      name = \"safety\"\n    ) |>\n    dplyr::group_by(.data$season, .data$week, .data$team, .data$player_id) |>\n    dplyr::summarise(\n      safety = sum(.data$safety, na.rm = TRUE)\n    ) |>\n    dplyr::ungroup()\n\n  # Fumble stats -----------------------------------------------------------\n\n  # get fumble stats 
for fumbles and own fumble recoveries\n  fumble_df_own <- data |>\n    dplyr::filter(.data$fumble == 1 | .data$fumble_lost == 1) |>\n    dplyr::filter(\n      .data$defteam == .data$fumbled_1_team |\n        .data$defteam == .data$fumbled_2_team\n    ) |>\n    dplyr::mutate(\n      fumbled_1_player_id = dplyr::if_else(\n        .data$defteam == .data$fumbled_1_team,\n        .data$fumbled_1_player_id,\n        NA_character_,\n        NA_character_\n      )\n    ) |>\n    dplyr::select(\n      \"season\",\n      \"week\",\n      dplyr::matches(\"^fumble.+team\"),\n      dplyr::matches(\"^fumble.+player_id\")\n    ) |>\n    tidyr::pivot_longer(\n      cols = dplyr::contains(\"fumble\"),\n      names_pattern = \"(.+)_(team|player_id)\",\n      names_to = c(\"desc\", \".value\")\n    ) |>\n    dplyr::mutate(\n      n = 1,\n      desc = stringr::str_remove_all(.data$desc, \"_[0-9]\")\n    ) |>\n    tidyr::pivot_wider(\n      names_from = .data$desc,\n      values_from = .data$n,\n      values_fn = sum,\n      values_fill = 0L\n    ) |>\n    # Renaming fails if the columns don't exist. So we row bind a dummy tibble\n    # including the relevant columns. 
The row will be filtered after renaming\n    dplyr::bind_rows(\n      tibble::tibble(\n        player_id = NA_character_,\n        fumbled = numeric(),\n        fumble_recovery = numeric()\n      )\n    ) |>\n    dplyr::rename(\n      \"fumble\" = \"fumbled\",\n      \"fumble_recovery_own\" = \"fumble_recovery\"\n    ) |>\n    dplyr::filter(!is.na(.data$player_id)) |>\n    dplyr::group_by(.data$season, .data$week, .data$team, .data$player_id) |>\n    dplyr::summarise(\n      fumble = sum(.data$fumble, na.rm = TRUE),\n      fumble_recovery_own = sum(.data$fumble_recovery_own, na.rm = TRUE)\n    ) |>\n    dplyr::ungroup()\n\n  # get fumble stats for opponent recoveries\n  fumble_df_opp <- data |>\n    dplyr::filter(.data$fumble == 1 | .data$fumble_lost == 1) |>\n    dplyr::filter(\n      .data$defteam == .data$fumble_recovery_1_team |\n        .data$defteam == .data$fumble_recovery_2_team\n    ) |>\n    dplyr::mutate(\n      # use data.table fifelse because base ifelse changed data type to logical\n      # if there are 0 rows\n      fumble_recovery_1_player_id = data.table::fifelse(\n        .data$defteam != .data$fumbled_1_team,\n        .data$fumble_recovery_1_player_id,\n        NA_character_\n      ),\n      fumble_recovery_2_player_id = data.table::fifelse(\n        .data$defteam != .data$fumbled_2_team,\n        .data$fumble_recovery_2_player_id,\n        NA_character_\n      )\n    ) |>\n    dplyr::select(\n      \"season\",\n      \"week\",\n      dplyr::matches(\"^fumble_recovery.+team\"),\n      dplyr::matches(\"^fumble_recovery.+player_id\")\n    ) |>\n    tidyr::pivot_longer(\n      cols = dplyr::contains(\"fumble\"),\n      names_pattern = \"(.+)_(team|player_id)\",\n      names_to = c(\"desc\", \".value\")\n    ) |>\n    dplyr::mutate(\n      n = 1,\n      desc = stringr::str_remove_all(.data$desc, \"_[0-9]\")\n    ) |>\n    tidyr::pivot_wider(\n      names_from = .data$desc,\n      values_from = .data$n,\n      values_fn = sum,\n      values_fill = 
0L\n    ) |>\n    dplyr::filter(!is.na(.data$player_id)) |>\n    add_column_if_missing(\"fumble_recovery\") |>\n    dplyr::rename(\"fumble_recovery_opp\" = \"fumble_recovery\") |>\n    dplyr::group_by(.data$season, .data$week, .data$team, .data$player_id) |>\n    dplyr::summarise(\n      fumble_recovery_opp = sum(.data$fumble_recovery_opp, na.rm = TRUE)\n    ) |>\n    dplyr::ungroup()\n\n  # get fumble yards for own recoveries\n  fumble_yds_own_data <- data |>\n    dplyr::filter(.data$fumble == 1 | .data$fumble_lost == 1) |>\n    dplyr::filter(\n      .data$defteam == .data$fumbled_1_team |\n        .data$defteam == .data$fumbled_2_team\n    )\n\n  fumble_yds_own_df <- fumble_yds_own_data |>\n    dplyr::group_by(\n      .data$season,\n      .data$week,\n      \"team\" = .data$fumble_recovery_1_team,\n      \"player_id\" = .data$fumble_recovery_1_player_id\n    ) |>\n    dplyr::summarise(recovery_yards = sum(.data$fumble_recovery_1_yards)) |>\n    dplyr::filter(!is.na(.data$player_id)) |> ### this happens when a fumble goes out of bounds. 
No one gets yards --> NA/NA\n    dplyr::bind_rows(\n      fumble_yds_own_data |>\n        dplyr::group_by(\n          .data$season,\n          .data$week,\n          \"team\" = .data$fumble_recovery_2_team,\n          \"player_id\" = .data$fumble_recovery_2_player_id\n        ) |>\n        dplyr::summarise(recovery_yards = sum(.data$fumble_recovery_2_yards)) |>\n        dplyr::filter(!is.na(.data$player_id))\n    ) |>\n    dplyr::group_by(.data$season, .data$week, .data$team, .data$player_id) |>\n    dplyr::summarise(fumble_recovery_yards_own = sum(.data$recovery_yards)) |>\n    dplyr::ungroup()\n\n  # get fumble yards for opp recoveries\n  fumble_yds_opp_data <- data |>\n    dplyr::filter(.data$fumble == 1 | .data$fumble_lost == 1) |>\n    dplyr::filter(\n      .data$defteam == .data$fumble_recovery_1_team,\n      .data$defteam != .data$fumbled_1_team\n    )\n\n  fumble_yds_opp_df <- fumble_yds_opp_data |>\n    dplyr::group_by(\n      .data$season,\n      .data$week,\n      \"team\" = .data$fumble_recovery_1_team,\n      \"player_id\" = .data$fumble_recovery_1_player_id\n    ) |>\n    dplyr::summarise(recovery_yards = sum(.data$fumble_recovery_1_yards)) |>\n    dplyr::filter(!is.na(.data$player_id)) |>\n    dplyr::bind_rows(\n      fumble_yds_opp_data |>\n        dplyr::group_by(\n          .data$season,\n          .data$week,\n          \"team\" = .data$fumble_recovery_2_team,\n          \"player_id\" = .data$fumble_recovery_2_player_id\n        ) |>\n        dplyr::summarise(recovery_yards = sum(.data$fumble_recovery_2_yards)) |>\n        dplyr::filter(!is.na(.data$player_id))\n    ) |>\n    dplyr::group_by(.data$season, .data$week, .data$team, .data$player_id) |>\n    dplyr::summarise(fumble_recovery_yards_opp = sum(.data$recovery_yards)) |>\n    dplyr::ungroup()\n\n  # Penalty stats -----------------------------------------------------------\n\n  # get penalty stats\n  penalty_df <- penalty_data |>\n    dplyr::filter(\n      !is.na(.data$penalty_player_id),\n   
   .data$defteam == .data$penalty_team\n    ) |>\n    dplyr::select(\n      \"season\",\n      \"week\",\n      \"penalty_yards\",\n      \"penalty_team\",\n      \"penalty_player_id\"\n    ) |>\n    tidyr::pivot_longer(\n      cols = dplyr::contains(\"penalty\"),\n      names_pattern = \"(.+)_(team|player_id|yards)\",\n      names_to = c(\"desc\", \".value\"),\n      values_drop_na = TRUE\n    ) |>\n    dplyr::mutate(n = 1) |>\n    tidyr::pivot_wider(\n      names_from = .data$desc,\n      values_from = c(.data$n, .data$yards),\n      values_fn = sum,\n      values_fill = 0L\n    ) |>\n    add_column_if_missing(\"n_penalty\", \"yards_penalty\") |>\n    dplyr::select(\n      \"season\",\n      \"week\",\n      \"team\",\n      \"player_id\",\n      \"penalty\" = \"n_penalty\",\n      \"penalty_yards\" = \"yards_penalty\"\n    ) |>\n    dplyr::group_by(.data$season, .data$week, .data$team, .data$player_id) |>\n    dplyr::summarise(\n      penalty = sum(.data$penalty, na.rm = TRUE),\n      penalty_yards = sum(.data$penalty_yards, na.rm = TRUE)\n    ) |>\n    dplyr::ungroup()\n\n  # Touchdown stats -----------------------------------------------------------\n\n  # get defensive touchdowns\n  touchdown_df <- data |>\n    dplyr::filter(.data$touchdown == 1) |>\n    dplyr::filter(.data$defteam == .data$td_team) |>\n    dplyr::group_by(\n      .data$season,\n      .data$week,\n      \"team\" = .data$td_team,\n      \"player_id\" = .data$td_player_id\n    ) |>\n    dplyr::summarise(td = sum(.data$touchdown)) |>\n    dplyr::ungroup()\n\n  # Combine all stats -------------------------------------------------------\n\n  # combine all the stats together\n\n  player_df <- tackle_df |>\n    dplyr::full_join(\n      tackle_yards_df,\n      by = c(\"season\", \"week\", \"player_id\", \"team\")\n    ) |>\n    dplyr::full_join(\n      pressure_df,\n      by = c(\"season\", \"week\", \"player_id\", \"team\")\n    ) |>\n    dplyr::full_join(int_df, by = c(\"season\", \"week\", 
\"player_id\", \"team\")) |>\n    dplyr::full_join(\n      safety_df,\n      by = c(\"season\", \"week\", \"player_id\", \"team\")\n    ) |>\n    dplyr::full_join(\n      fumble_df_own,\n      by = c(\"season\", \"week\", \"player_id\", \"team\")\n    ) |>\n    dplyr::full_join(\n      fumble_df_opp,\n      by = c(\"season\", \"week\", \"player_id\", \"team\")\n    ) |>\n    dplyr::full_join(\n      fumble_yds_own_df,\n      by = c(\"season\", \"week\", \"player_id\", \"team\")\n    ) |>\n    dplyr::full_join(\n      fumble_yds_opp_df,\n      by = c(\"season\", \"week\", \"player_id\", \"team\")\n    ) |>\n    dplyr::full_join(\n      penalty_df,\n      by = c(\"season\", \"week\", \"player_id\", \"team\")\n    ) |>\n    dplyr::full_join(\n      touchdown_df,\n      by = c(\"season\", \"week\", \"player_id\", \"team\")\n    ) |>\n    dplyr::mutate_if(is.numeric, tidyr::replace_na, 0) |>\n    dplyr::left_join(\n      nflreadr::load_players() |>\n        dplyr::select(\n          \"player_id\" = \"gsis_id\",\n          \"player_display_name\" = \"display_name\",\n          \"player_name\" = \"short_name\",\n          \"position\",\n          \"position_group\",\n          \"headshot_url\" = \"headshot\"\n        ),\n      by = \"player_id\"\n    ) |>\n    dplyr::left_join(stype, by = c(\"season\", \"week\")) |>\n    dplyr::select(dplyr::any_of(c(\n      # game information\n      \"season\",\n      \"week\",\n      \"season_type\",\n\n      # id information\n      \"player_id\",\n      \"player_name\",\n      \"player_display_name\",\n      \"position\",\n      \"position_group\",\n      \"headshot_url\",\n      \"team\",\n\n      # tackle stats\n      \"def_tackles\" = \"tackles\",\n      \"def_tackles_solo\" = \"tackles_solo\",\n      \"def_tackles_with_assist\" = \"tackles_with_assist\",\n      \"def_tackle_assists\" = \"tackle_assists\",\n      \"def_tackles_for_loss\" = \"tackles_for_loss\",\n      \"def_tackles_for_loss_yards\" = \"tfl_yards\",\n      
\"def_fumbles_forced\" = \"forced_fumbles\",\n\n      # pressure stats\n      \"def_sacks\" = \"sacks\",\n      \"def_sack_yards\" = \"sack_yards\",\n      \"def_qb_hits\" = \"qb_hit\",\n\n      # coverage stats\n      \"def_interceptions\" = \"int\",\n      \"def_interception_yards\" = \"int_yards\",\n      \"def_pass_defended\" = \"pass_defended\",\n\n      # misc stats\n      \"def_tds\" = \"td\",\n      \"def_fumbles\" = \"fumble\",\n      \"def_fumble_recovery_own\" = \"fumble_recovery_own\",\n      \"def_fumble_recovery_yards_own\" = \"fumble_recovery_yards_own\",\n      \"def_fumble_recovery_opp\" = \"fumble_recovery_opp\",\n      \"def_fumble_recovery_yards_opp\" = \"fumble_recovery_yards_opp\",\n      \"def_safety\" = \"safety\",\n      \"def_penalty\" = \"penalty\",\n      \"def_penalty_yards\" = \"penalty_yards\"\n    ))) |>\n    dplyr::filter(!is.na(.data$player_id)) |>\n    dplyr::arrange(.data$player_id, .data$season, .data$week)\n\n  # if user doesn't want week-by-week input, aggregate the whole df\n  if (isFALSE(weekly)) {\n    player_df <- player_df |>\n      dplyr::group_by(.data$player_id, .data$team) |>\n      dplyr::summarise(\n        player_name = custom_mode(.data$player_name),\n        player_display_name = custom_mode(.data$player_display_name),\n        games = dplyr::n(),\n        position = custom_mode(.data$position),\n        position_group = custom_mode(.data$position_group),\n        headshot_url = custom_mode(.data$headshot_url),\n        def_tackles = sum(.data$def_tackles),\n        def_tackles_solo = sum(.data$def_tackles_solo),\n        def_tackles_with_assist = sum(.data$def_tackles_with_assist),\n        def_tackle_assists = sum(.data$def_tackle_assists),\n        def_tackles_for_loss = sum(.data$def_tackles_for_loss),\n        def_tackles_for_loss_yards = sum(.data$def_tackles_for_loss_yards),\n        def_fumbles_forced = sum(.data$def_fumbles_forced),\n        def_sacks = sum(.data$def_sacks),\n        def_sack_yards = 
sum(.data$def_sack_yards),\n        def_qb_hits = sum(.data$def_qb_hits),\n        def_interceptions = sum(.data$def_interceptions),\n        def_interception_yards = sum(.data$def_interception_yards),\n        def_pass_defended = sum(.data$def_pass_defended),\n        def_tds = sum(.data$def_tds),\n        def_fumbles = sum(.data$def_fumbles),\n        def_fumble_recovery_own = sum(.data$def_fumble_recovery_own),\n        def_fumble_recovery_yards_own = sum(\n          .data$def_fumble_recovery_yards_own\n        ),\n        def_fumble_recovery_opp = sum(.data$def_fumble_recovery_opp),\n        def_fumble_recovery_yards_opp = sum(\n          .data$def_fumble_recovery_yards_opp\n        ),\n        def_safety = sum(.data$def_safety),\n        def_penalty = sum(.data$def_penalty),\n        def_penalty_yards = sum(.data$def_penalty_yards)\n      ) |>\n      dplyr::ungroup() |>\n      dplyr::select(\n        \"player_id\",\n        \"player_name\",\n        \"player_display_name\",\n        \"games\",\n        \"position\",\n        \"position_group\",\n        \"headshot_url\",\n        \"team\",\n        dplyr::everything()\n      )\n  }\n\n  player_df\n}\n\n# This function checks whether the variables in ... exist as column\n# names in the argument .data. If not, it adds those columns and assigns\n# them the value of the argument value\nadd_column_if_missing <- function(.data, ..., value = 0L) {\n  dots <- rlang::list2(...)\n  new_cols <- dots[!dots %in% names(.data)]\n  .data[, unlist(new_cols)] <- value\n  .data\n}\n"
  },
  {
    "path": "R/aggregate_game_stats_kicking.R",
    "content": "#' Summarize Kicking Stats\n#'\n#' @description\n#' `r lifecycle::badge(\"deprecated\")`\n#'\n#' This function was deprecated because we have a new, much better and\n#' harmonized approach in [`calculate_stats()`].\n#'\n#' Build columns that aggregate kicking stats at the game level.\n#'\n#' @param pbp A Data frame of NFL play-by-play data typically loaded with\n#' [load_pbp()] or [build_nflfastR_pbp()].\n#' @param weekly If `TRUE`, returns week-by-week stats, otherwise, stats for\n#' the entire data frame in argument `pbp`.\n#'\n#' @examples\n#' \\donttest{\n#' try({# to avoid CRAN test problems\n#'     # pbp <- nflreadr::load_pbp(2021)\n#'     # weekly <- calculate_player_stats_kicking(pbp, weekly = TRUE)\n#'     # dplyr::glimpse(weekly)\n#'\n#'     # overall <- calculate_player_stats_kicking(pbp, weekly = FALSE)\n#'     # dplyr::glimpse(overall)\n#' })\n#' }\n#'\n#' @return a dataframe of kicking stats\n#' @seealso <https://nflreadr.nflverse.com/reference/load_player_stats.html> for the nflreadr function to download this from repo (`stat_type = \"kicking\"`)\n#' @export\n#' @keywords internal\ncalculate_player_stats_kicking <- function(pbp, weekly = FALSE) {\n  lifecycle::deprecate_warn(\n    \"5.0\",\n    \"calculate_player_stats_kicking()\",\n    \"calculate_stats()\"\n  )\n\n  # need newer version of nflreadr to use load_players\n  rlang::check_installed(\"nflreadr (>= 1.3.0)\")\n\n  # First, creating a grouping variable object to toggle the weekly argument w/\n  grp_vars <- if (isTRUE(weekly)) {\n    list(\"season\", \"week\", \"season_type\", \"player_id\", \"team\")\n  } else if (isFALSE(weekly)) {\n    list(\"player_id\", \"team\")\n  }\n  grp_vars <- lapply(grp_vars, as.symbol)\n\n  # Filtering down / creating a base dataset\n  df_fg_or_pat <- pbp |>\n    dplyr::group_by(.data$game_id, .data$posteam) |>\n    dplyr::filter(\n      .data$field_goal_attempt == 1 |\n        .data$extra_point_attempt == 1 |\n        .data$fixed_drive == 
max(.data$fixed_drive, na.rm = TRUE)\n    ) |>\n    dplyr::ungroup() |>\n    dplyr::filter(!is.na(.data$kicker_player_id)) |>\n    dplyr::select(\n      \"game_id\",\n      \"season\",\n      \"week\",\n      \"season_type\",\n      \"team\" = \"posteam\",\n      \"player_name\" = \"kicker_player_name\",\n      \"player_id\" = \"kicker_player_id\",\n      \"dist\" = \"kick_distance\",\n      \"field_goal_attempt\",\n      \"fg_res\" = \"field_goal_result\",\n      \"extra_point_attempt\",\n      \"pat_res\" = \"extra_point_result\",\n      \"fixed_drive\",\n      \"score_differential\"\n    )\n\n  # Field-goal relevant columns\n  df_field_goals <- df_fg_or_pat |>\n    dplyr::filter(.data$field_goal_attempt == 1) |>\n    dplyr::group_by(!!!grp_vars) |>\n    dplyr::mutate(\n      temp_made_idx = .data$fg_res == \"made\",\n      temp_miss_idx = .data$fg_res == \"missed\",\n      temp_block_idx = .data$fg_res == \"blocked\"\n    ) |>\n    dplyr::summarise(\n      games_fg = list(unique(.data$game_id)),\n      fg_made = sum(.data$temp_made_idx, na.rm = TRUE),\n      fg_att = sum(.data$field_goal_attempt, na.rm = TRUE),\n      fg_missed = sum(.data$temp_miss_idx, na.rm = TRUE),\n      fg_blocked = sum(.data$temp_block_idx, na.rm = TRUE),\n      fg_long = if (any(.data$temp_made_idx, na.rm = TRUE)) {\n        max(.data$dist[.data$temp_made_idx], na.rm = TRUE)\n      } else {\n        NA_real_\n      },\n      fg_pct = round(.data$fg_made / .data$fg_att, 3L),\n      fg_made_0_19 = sum(\n        dplyr::between(.data$dist[.data$temp_made_idx], 0, 19),\n        na.rm = TRUE\n      ),\n      fg_made_20_29 = sum(\n        dplyr::between(.data$dist[.data$temp_made_idx], 20, 29),\n        na.rm = TRUE\n      ),\n      fg_made_30_39 = sum(\n        dplyr::between(.data$dist[.data$temp_made_idx], 30, 39),\n        na.rm = TRUE\n      ),\n      fg_made_40_49 = sum(\n        dplyr::between(.data$dist[.data$temp_made_idx], 40, 49),\n        na.rm = TRUE\n      ),\n      fg_made_50_59 
= sum(\n        dplyr::between(.data$dist[.data$temp_made_idx], 50, 59),\n        na.rm = TRUE\n      ),\n      fg_made_60_ = sum(.data$dist[.data$temp_made_idx] >= 60, na.rm = TRUE),\n      fg_missed_0_19 = sum(\n        dplyr::between(.data$dist[.data$temp_miss_idx], 0, 19),\n        na.rm = TRUE\n      ),\n      fg_missed_20_29 = sum(\n        dplyr::between(.data$dist[.data$temp_miss_idx], 20, 29),\n        na.rm = TRUE\n      ),\n      fg_missed_30_39 = sum(\n        dplyr::between(.data$dist[.data$temp_miss_idx], 30, 39),\n        na.rm = TRUE\n      ),\n      fg_missed_40_49 = sum(\n        dplyr::between(.data$dist[.data$temp_miss_idx], 40, 49),\n        na.rm = TRUE\n      ),\n      fg_missed_50_59 = sum(\n        dplyr::between(.data$dist[.data$temp_miss_idx], 50, 59),\n        na.rm = TRUE\n      ),\n      fg_missed_60_ = sum(.data$dist[.data$temp_miss_idx] >= 60, na.rm = TRUE),\n      fg_made_list = paste(\n        stats::na.omit(.data$dist[.data$temp_made_idx]),\n        collapse = \";\"\n      ),\n      fg_missed_list = paste(\n        stats::na.omit(.data$dist[.data$temp_miss_idx]),\n        collapse = \";\"\n      ),\n      fg_blocked_list = paste(\n        stats::na.omit(.data$dist[.data$temp_block_idx]),\n        collapse = \";\"\n      ),\n      fg_made_distance = sum(.data$dist[.data$temp_made_idx], na.rm = TRUE),\n      fg_missed_distance = sum(.data$dist[.data$temp_miss_idx], na.rm = TRUE),\n      fg_blocked_distance = sum(.data$dist[.data$temp_block_idx], na.rm = TRUE),\n      .groups = \"drop\"\n    )\n\n  # Extra points\n  df_pat <- df_fg_or_pat |>\n    dplyr::filter(.data$extra_point_attempt == 1) |>\n    dplyr::group_by(!!!grp_vars) |>\n    dplyr::summarise(\n      games_pat = list(unique(.data$game_id)),\n      pat_made = sum(.data$pat_res == \"good\", na.rm = TRUE),\n      pat_att = sum(.data$extra_point_attempt, na.rm = TRUE),\n      pat_missed = sum(.data$pat_res == \"failed\", na.rm = TRUE),\n      pat_blocked = sum(.data$pat_res == 
\"blocked\", na.rm = TRUE),\n      pat_pct = round(.data$pat_made / .data$pat_att, 3L),\n      .groups = \"drop\"\n    )\n\n  # The Game Winning kicks distance include up to one value at the weekly level\n  # but can include multiple across the season. This is one way to account for that.\n  # the downside is that the column names change depending on if it is weekly vs\n  # seasonal.\n  if (weekly) {\n    gw_dist_name <- \"gwfg_distance\"\n  } else {\n    gw_dist_name <- \"gwfg_distance_list\"\n  }\n\n  # See the above note. I wonder if this should also include field goals that tie\n  # the game but I kept the filter dplyr::between(score_differential, -2, 0) the way\n  # that is was previously. If you do include field goals that send the game into OT,\n  # then you'll probably need to include the gwfg_distance AND gwfg_distance_list columns\n  # in the weekly data\n  game_winners <- df_fg_or_pat |>\n    dplyr::group_by(.data$game_id, .data$team) |>\n    dplyr::filter(.data$fixed_drive == max(.data$fixed_drive, na.rm = TRUE)) |>\n    dplyr::ungroup() |>\n    dplyr::filter(\n      .data$field_goal_attempt == 1,\n      dplyr::between(.data$score_differential, -2, 0)\n    ) |>\n    dplyr::group_by(!!!grp_vars) |>\n    dplyr::summarise(\n      games_gwfg = list(unique(.data$game_id)),\n      gwfg_att = dplyr::n(),\n      !!gw_dist_name := if (weekly) {\n        .data$dist\n      } else {\n        paste(stats::na.omit(.data$dist), collapse = \";\")\n      },\n      gwfg_made = sum(.data$fg_res == \"made\", na.rm = TRUE),\n      gwfg_missed = sum(.data$fg_res == \"missed\", na.rm = TRUE),\n      gwfg_blocked = sum(.data$fg_res == \"blocked\", na.rm = TRUE),\n      .groups = \"drop\"\n    )\n\n  # Prepping data to merge-in player names\n  df_player_names <- nflreadr::load_players() |>\n    dplyr::select(\n      \"player_id\" = \"gsis_id\",\n      \"player_display_name\" = \"display_name\",\n      \"player_name\" = \"short_name\",\n      \"position\",\n      
\"position_group\",\n      \"headshot_url\" = \"headshot\"\n    )\n\n  # Joining all the data together and organizing the first few columns.\n  full_kicks <- df_field_goals |>\n    dplyr::full_join(df_pat, as.character(grp_vars)) |>\n    dplyr::full_join(game_winners, as.character(grp_vars)) |>\n    dplyr::left_join(df_player_names, \"player_id\") |>\n    dplyr::group_by(!!!grp_vars) |>\n    dplyr::mutate(\n      games = length(unique(unlist(c(\n        .data$games_fg,\n        .data$games_pat,\n        .data$games_gwfg\n      ))))\n    ) |>\n    dplyr::ungroup() |>\n    dplyr::select(\n      dplyr::any_of(c(\"season\", \"week\", \"season_type\")),\n      \"player_id\",\n      \"team\",\n      \"player_name\",\n      \"player_display_name\",\n      \"games\",\n      \"position\",\n      \"position_group\",\n      \"headshot_url\",\n      dplyr::everything(),\n      -c(\"games_fg\", \"games_pat\", \"games_gwfg\")\n    ) |>\n    # replace \"\" with NA\n    dplyr::mutate_all(~ replace(.x, nchar(.x) == 0 | is.nan(.x), NA)) |>\n    # replace NA in attempt columns with 0\n    dplyr::mutate_at(\n      c(\"fg_att\", \"pat_att\", \"gwfg_att\"),\n      ~ tidyr::replace_na(.x, 0)\n    )\n\n  if (weekly) {\n    full_kicks |>\n      dplyr::select(-\"games\") |>\n      dplyr::arrange(.data$player_id, .data$season, .data$week)\n  } else {\n    full_kicks |>\n      dplyr::arrange(.data$player_id)\n  }\n}\n"
  },
  {
    "path": "R/build_nflfastR_pbp.R",
    "content": "################################################################################\n# Author: Sebastian Carl\n# Purpose: Wrapper around multiple nflfastR functions\n# Code Style Guide: styler::tidyverse_style()\n################################################################################\n\n# The idea for this wrapper as well as some helper functions and the documentation\n# style are heavily borrowed from the r-lib package pkgdown (https://github.com/r-lib/pkgdown/blob/master/R/build.r)\n\n#' Build a Complete nflfastR Data Set\n#'\n#' @description\n#' `build_nflfastR_pbp` is a convenient wrapper around 6 nflfastR functions:\n#'\n#' \\itemize{\n#'  \\item{[fast_scraper()]}\n#'  \\item{[clean_pbp()]}\n#'  \\item{[add_qb_epa()]}\n#'  \\item{[add_xyac()]}\n#'  \\item{[add_xpass()]}\n#'  \\item{[decode_player_ids()]}\n#' }\n#'\n#' Please see either the documentation of each function or\n#' [the nflfastR Field Descriptions website](https://nflfastr.com/articles/field_descriptions.html)\n#' to learn about the output.\n#'\n#' @inheritParams fast_scraper\n#' @param decode If `TRUE`, the function [decode_player_ids()] will be executed.\n#' @param rules If `FALSE`, printing of the header and footer in the console output will be suppressed.\n#' @return An nflfastR play-by-play data frame like it can be loaded from <https://github.com/nflverse/nflverse-data>.\n#' @details To load valid game_ids please use the package function [fast_scraper_schedules()].\n#' @seealso For information on parallel processing and progress updates please\n#' see [nflfastR].\n#' @export\n#' @examples\n#' \\donttest{\n#' # Build nflfastR pbp for the 2018 and 2019 Super Bowls\n#' try({# to avoid CRAN test problems\n#' build_nflfastR_pbp(c(\"2018_21_NE_LA\", \"2019_21_SF_KC\"))\n#' })\n#'\n#' # It is also possible to directly use the\n#' # output of `load_schedules` as input\n#' try({# to avoid CRAN test problems\n#' nflreadr::load_schedules(2025) |>\n#'   dplyr::slice_tail(n = 3) 
|>\n#'   build_nflfastR_pbp()\n#' })\n#'\n#' \\dontshow{\n#' # Close open connections for R CMD Check\n#' future::plan(\"sequential\")\n#' }\n#' }\nbuild_nflfastR_pbp <- function(\n  game_ids,\n  dir = getOption(\"nflfastR.raw_directory\", default = NULL),\n  ...,\n  decode = TRUE,\n  rules = TRUE\n) {\n  if (!is.vector(game_ids) && is.data.frame(game_ids)) {\n    game_ids <- game_ids$game_id\n  }\n\n  if (!is.vector(game_ids)) {\n    cli::cli_abort(\"Param {.arg game_ids} is not a valid vector!\")\n  }\n\n  if (isTRUE(decode) && !is_installed(\"gsisdecoder\")) {\n    cli::cli_abort(\n      \"Package {.pkg gsisdecoder} required for decoding. Please install it with {.code install.packages(\\\"gsisdecoder\\\")}.\"\n    )\n  }\n\n  if (isTRUE(rules)) {\n    rule_header(\"Build nflfastR Play-by-Play Data\")\n  }\n\n  # nflfastR v6 stopped supporting the 1999 and 2000 seasons because of\n  # inconsistent data sources. Data is still available through load_pbp\n  # but we will not fix any issues.\n  # It's possible to install nflfastR v5.2.0 to parse those seasons.\n  # try pak::pak(\"nflverse/nflfastR@v5.2.0\")\n  game_ids <- check_for_dropped_seasons(game_ids)\n\n  game_count <- ifelse(is.vector(game_ids), length(game_ids), nrow(game_ids))\n  builder <- TRUE\n\n  cli::cli_ul(\"{my_time()} | Start download of {game_count} game{?s}...\")\n\n  ret <- fast_scraper(\n    game_ids = game_ids,\n    dir = dir,\n    ...,\n    in_builder = builder\n  ) |>\n    clean_pbp(in_builder = builder) |>\n    add_qb_epa(in_builder = builder) |>\n    add_xyac(in_builder = builder) |>\n    add_xpass(in_builder = builder)\n\n  if (isTRUE(decode)) {\n    ret <- decode_player_ids(ret, in_builder = builder)\n  }\n\n  if (isTRUE(rules)) {\n    rule_footer(\"DONE\")\n  }\n\n  make_nflverse_data(ret)\n}\n"
  },
  {
    "path": "R/build_playstats.R",
    "content": "build_playstats <- function(\n  seasons = nflreadr::most_recent_season(),\n  stat_ids = 1:1000,\n  dir = getOption(\"nflfastR.raw_directory\", default = NULL),\n  skip_local = FALSE\n) {\n  if (is_sequential()) {\n    cli::cli_alert_info(\n      \"It is recommended to use parallel processing when using this function. \\\\\n        Please consider running {.code future::plan(\\\"multisession\\\")}! \\\\\n        Will go on sequentially...\",\n      wrap = TRUE\n    )\n  }\n\n  games <- nflreadr::load_schedules(seasons = seasons) |>\n    dplyr::filter(!is.na(.data$result)) |>\n    dplyr::pull(.data$game_id)\n\n  p <- progressr::progressor(along = games)\n\n  l <- furrr::future_map(\n    games,\n    function(id, p = NULL, dir, skip_local) {\n      if (id %in% c(\"2000_03_SD_KC\", \"2000_06_BUF_MIA\", \"1999_01_BAL_STL\")) {\n        cli::cli_alert_warning(\n          \"We are missing raw game data of {.val {id}}. Skipping.\"\n        )\n        return(data.frame())\n      }\n      season <- substr(id, 1, 4)\n      raw_data <- load_raw_game(id, dir = dir, skip_local = skip_local)\n      if (season <= 2000) {\n        drives <- raw_data[[1]][[\"drives\"]] |>\n          purrr::keep(is.list)\n        out <- tibble::tibble(d = drives) |>\n          tidyr::unnest_wider(\"d\") |>\n          tidyr::unnest_longer(\"plays\") |>\n          tidyr::unnest_wider(\"plays\", names_sep = \"_\") |>\n          dplyr::select(\"playId\" = \"plays_id\", \"playStats\" = \"plays_players\") |>\n          dplyr::mutate(\n            playId = uniquify_ids(.data$playId)\n          ) |>\n          tidyr::unnest_longer(\"playStats\") |>\n          tidyr::unnest_longer(\"playStats\") |>\n          tidyr::unnest_wider(\"playStats\") |>\n          dplyr::mutate(\n            playId = as.integer(.data$playId),\n            statId = as.integer(.data$statId),\n            yards = as.integer(.data$yards),\n            team.id = NA_character_\n          ) |>\n          
dplyr::select(-\"sequence\") |>\n          dplyr::rename(\n            team.abbreviation = \"clubcode\",\n            gsis.Player.id = \"playStats_id\"\n          ) |>\n          tidyr::nest(\n            playStats = c(\n              \"statId\",\n              \"yards\",\n              \"playerName\",\n              \"team.id\",\n              \"team.abbreviation\",\n              \"gsis.Player.id\"\n            )\n          )\n      } else {\n        out <- raw_data$data$viewer$gameDetail$plays[, c(\"playId\", \"playStats\")]\n      }\n      out$game_id <- as.character(id)\n      p(sprintf(\"ID=%s\", as.character(id)))\n      out\n    },\n    p = p,\n    dir = dir,\n    skip_local = skip_local\n  )\n\n  out <- data.table::rbindlist(l) |>\n    tidyr::unnest(cols = c(\"playStats\")) |>\n    janitor::clean_names() |>\n    dplyr::filter(.data$stat_id %in% stat_ids) |>\n    dplyr::mutate(\n      season = as.integer(substr(.data$game_id, 1, 4)),\n      week = as.integer(substr(.data$game_id, 6, 7))\n    ) |>\n    decode_player_ids() |>\n    dplyr::select(\n      \"game_id\",\n      \"season\",\n      \"week\",\n      \"play_id\",\n      \"stat_id\",\n      \"yards\",\n      \"team_abbr\" = \"team_abbreviation\",\n      \"player_name\",\n      \"gsis_player_id\",\n    ) |>\n    dplyr::mutate_if(\n      .predicate = is.character,\n      .funs = ~ dplyr::na_if(.x, \"\")\n    )\n  out\n}\n"
  },
  {
    "path": "R/calculate_series_conversion_rates.R",
    "content": "#' Compute Series Conversion Information from Play by Play\n#'\n#' @description A \"Series\" begins on a 1st and 10 and each team attempts to either earn\n#'   a new 1st down (on offense) or prevent the offense from converting a new\n#'   1st down (on defense). Series conversion rate represents how many series\n#'   have been either converted to a new 1st down or ended in a touchdown.\n#'   This function computes series conversion rates on offense and defense from\n#'   nflverse play-by-play data along with other series results.\n#'   The function automatically removes series that ended in a QB kneel down.\n#'\n#' @param pbp Play-by-play data as returned by [`load_pbp()`], [`build_nflfastR_pbp()`], or\n#'   [`fast_scraper()`].\n#' @param weekly If `TRUE`, returns week-by-week stats, otherwise,\n#'   season-by-season stats in argument `pbp`.\n#'\n#' @return A data frame of series information including the following columns:\n#' \\describe{\n#' \\item{season}{The NFL season}\n#' \\item{team}{NFL team abbreviation}\n#' \\item{week}{Week if `weekly` is `TRUE`}\n#' \\item{off_n}{The number of series the offense played (excludes QB kneel\n#' downs, kickoffs, extra point/two point conversion attempts, non-plays, and\n#' plays that do not list a \"posteam\")}\n#' \\item{off_scr}{The rate at which a series ended in either new 1st down or\n#' touchdown while the offense was on the field}\n#' \\item{off_scr_1st}{The rate at which an offense earned a 1st down\n#' or scored a touchdown on 1st down}\n#' \\item{off_scr_2nd}{The rate at which an offense earned a 1st down\n#' or scored a touchdown on 2nd down}\n#' \\item{off_scr_3rd}{The rate at which an offense earned a 1st down\n#' or scored a touchdown on 3rd down}\n#' \\item{off_scr_4th}{The rate at which an offense earned a 1st down\n#' or scored a touchdown on 4th down}\n#' \\item{off_1st}{The rate of series that ended in a new 1st down while the\n#' offense was on the field (does not include offensive 
touchdown)}\n#' \\item{off_td}{The rate of series that ended in an offensive touchdown while the\n#' offense was on the field}\n#' \\item{off_fg}{The rate of series that ended in a field goal attempt while the\n#' offense was on the field}\n#' \\item{off_punt}{The rate of series that ended in a punt while the\n#' offense was on the field}\n#' \\item{off_to}{The rate of series that ended in a turnover (including on downs), in an\n#' opponent score, or at the end of half (or game) while the\n#' offense was on the field}\n#' \\item{def_n}{The number of series the defense played (excludes QB kneel\n#' downs, kickoffs, extra point/two point conversion attempts, non-plays, and\n#' plays that do not list a \"posteam\")}\n#' \\item{def_scr}{The rate at which a series ended in either new 1st down or\n#' touchdown while the defense was on the field}\n#' \\item{def_scr_1st}{The rate at which a defense allowed a\n#' 1st down or touchdown on 1st down}\n#' \\item{def_scr_2nd}{The rate at which a defense allowed a\n#' 1st down or touchdown on 2nd down}\n#' \\item{def_scr_3rd}{The rate at which a defense allowed a\n#' 1st down or touchdown on 3rd down}\n#' \\item{def_scr_4th}{The rate at which a defense allowed a\n#' 1st down or touchdown on 4th down}\n#' \\item{def_1st}{The rate of series that ended in a new 1st down while the\n#' defense was on the field (does not include offensive touchdown)}\n#' \\item{def_td}{The rate of series that ended in an offensive touchdown while the\n#' defense was on the field}\n#' \\item{def_fg}{The rate of series that ended in a field goal attempt while the\n#' defense was on the field}\n#' \\item{def_punt}{The rate of series that ended in a punt while the\n#' defense was on the field}\n#' \\item{def_to}{The rate of series that ended in a turnover (including on downs), in an\n#' opponent score, or at the end of half (or game) while the\n#' defense was on the field}\n#' }\n#' @export\n#'\n#' @examples\n#' \\donttest{\n#' try({# to avoid CRAN test 
problems\n#'   pbp <- nflfastR::load_pbp(2021)\n#'\n#'   weekly <- calculate_series_conversion_rates(pbp, weekly = TRUE)\n#'   dplyr::glimpse(weekly)\n#'\n#'   overall <- calculate_series_conversion_rates(pbp, weekly = FALSE)\n#'   dplyr::glimpse(overall)\n#' })\n#' }\ncalculate_series_conversion_rates <- function(pbp, weekly = FALSE) {\n  if (isTRUE(weekly)) {\n    grp <- c(\"season\", \"team\", \"week\")\n  } else if (isFALSE(weekly)) {\n    grp <- c(\"season\", \"team\")\n  }\n  grp_vars <- lapply(grp, as.symbol)\n\n  # Offense -----------------------------------------------------------------\n\n  off_series <- pbp |>\n    dplyr::filter(\n      !is.na(.data$down),\n      .data$series_result != \"QB kneel\"\n      # .data$rush == 1 | .data$pass == 1\n    ) |>\n    dplyr::group_by(\n      .data$season,\n      .data$week,\n      team = .data$posteam,\n      .data$series\n    ) |>\n    dplyr::summarise(\n      conversion = dplyr::first(.data$series_success),\n      result = dplyr::first(.data$series_result),\n      last_down = dplyr::last(.data$down),\n      .groups = \"drop\"\n    )\n\n  offense <- off_series |>\n    dplyr::group_by(!!!grp_vars) |>\n    dplyr::summarise(\n      off_n = dplyr::n(),\n      off_scr = mean(.data$conversion),\n      off_scr_1st = mean(.data$last_down == 1 * .data$conversion),\n      off_scr_2nd = mean(.data$last_down == 2 * .data$conversion),\n      off_scr_3rd = mean(.data$last_down == 3 * .data$conversion),\n      off_scr_4th = mean(.data$last_down == 4 * .data$conversion),\n      off_1st = mean(.data$result == \"First down\"),\n      off_td = mean(.data$result == \"Touchdown\"),\n      off_fg = mean(.data$result %in% c(\"Field goal\", \"Missed field goal\")),\n      off_punt = mean(.data$result == \"Punt\"),\n      off_to = mean(\n        .data$result %in%\n          c(\n            \"Turnover on downs\",\n            \"Turnover\",\n            \"Opp touchdown\",\n            \"Safety\",\n            \"End of half\"\n          )\n    
  ),\n      .groups = \"drop\"\n    )\n\n  # Defense -----------------------------------------------------------------\n\n  def_series <- pbp |>\n    dplyr::filter(\n      !is.na(.data$down),\n      .data$series_result != \"QB kneel\"\n      # .data$rush == 1 | .data$pass == 1\n    ) |>\n    dplyr::group_by(\n      .data$season,\n      .data$week,\n      team = .data$defteam,\n      .data$series\n    ) |>\n    dplyr::summarise(\n      conversion = dplyr::first(.data$series_success),\n      result = dplyr::first(.data$series_result),\n      last_down = dplyr::last(.data$down),\n      .groups = \"drop\"\n    )\n\n  defense <- def_series |>\n    dplyr::group_by(!!!grp_vars) |>\n    dplyr::summarise(\n      def_n = dplyr::n(),\n      def_scr = mean(.data$conversion),\n      def_scr_1st = mean(.data$last_down == 1 * .data$conversion),\n      def_scr_2nd = mean(.data$last_down == 2 * .data$conversion),\n      def_scr_3rd = mean(.data$last_down == 3 * .data$conversion),\n      def_scr_4th = mean(.data$last_down == 4 * .data$conversion),\n      def_1st = mean(.data$result == \"First down\"),\n      def_td = mean(.data$result == \"Touchdown\"),\n      def_fg = mean(.data$result %in% c(\"Field goal\", \"Missed field goal\")),\n      def_punt = mean(.data$result == \"Punt\"),\n      def_to = mean(\n        .data$result %in%\n          c(\n            \"Turnover on downs\",\n            \"Turnover\",\n            \"Opp touchdown\",\n            \"Safety\",\n            \"End of half\"\n          )\n      ),\n      .groups = \"drop\"\n    )\n\n  # Offense + Defense -------------------------------------------------------\n\n  combined <- dplyr::full_join(offense, defense, by = grp)\n\n  combined\n}\n"
  },
  {
    "path": "R/calculate_standings.R",
    "content": "#' Compute Division Standings and Conference Seeds from Play by Play\n#'\n#' @description\n#' `r lifecycle::badge(\"deprecated\")`\n#'\n#' This function was deprecated and replaced by [nflseedR::nfl_standings()].\n#'\n#' This function calculates division standings as well as playoff\n#'   seeds per conference based on either nflverse play-by-play data or nflverse\n#'   schedule data.\n#'\n#' @param nflverse_object Data object of class `nflverse_data`. Either schedules\n#'   as returned by [`fast_scraper_schedules()`] or [`nflreadr::load_schedules()`].\n#'   Or play-by-play data as returned by [`load_pbp()`], [`build_nflfastR_pbp()`], or\n#'  [`fast_scraper()`].\n#' @param playoff_seeds Number of playoff teams per conference. If `NULL` (the\n#'   default), the function will try to split `nflverse_object` into seasons prior\n#'   2020 (6 seeds) and 2020ff (7 seeds). If set to a numeric, it will be used\n#'   for all seasons in `nflverse_object`!\n#' @inheritParams nflseedR::compute_conference_seeds\n#'\n#' @keywords internal\n#' @return A tibble with NFL regular season standings\n#' @export\n#'\n#' @examples\n#' \\donttest{\n#' try({# to avoid CRAN test problems\n#'   # load nflverse data both schedules and pbp\n#'   # scheds <- fast_scraper_schedules(2014)\n#'   # pbp <- load_pbp(c(2018, 2021))\n#'\n#'   # calculate standings based on pbp\n#'   # calculate_standings(pbp)\n#'\n#'   # calculate standings based on schedules\n#'   # calculate_standings(scheds)\n#' })\n#' }\ncalculate_standings <- function(\n  nflverse_object,\n  tiebreaker_depth = 3,\n  playoff_seeds = NULL\n) {\n  lifecycle::deprecate_warn(\n    \"5.1.0\",\n    \"calculate_standings()\",\n    \"nflseedR::nfl_standings()\"\n  )\n\n  if (!inherits(nflverse_object, \"nflverse_data\")) {\n    cli::cli_abort(\n      \"The function argument {.arg nflverse_object} has to be\n                   of class {.cls nflverse_data}\"\n    )\n  }\n\n  rlang::check_installed(\n    \"nflseedR\",\n    \"to 
compute standings.\",\n    compare = \">=\",\n    version = \"1.0.2\"\n  )\n\n  type <- attr(nflverse_object, \"nflverse_type\")\n\n  if (type == \"play by play data\") {\n    .standings_from_pbp(\n      nflverse_object,\n      tiebreaker_depth = tiebreaker_depth,\n      playoff_seeds = playoff_seeds\n    )\n  } else if (type == \"games and schedules\") {\n    .standings_from_games(\n      nflverse_object,\n      tiebreaker_depth = tiebreaker_depth,\n      playoff_seeds = playoff_seeds\n    )\n  } else {\n    cli::cli_abort(\n      \"Can only handle nflverse_type {.val play by play data} or\n                   {.val games and schedules} and not {.val {type}}\"\n    )\n  }\n}\n\n.standings_from_pbp <- function(pbp, tiebreaker_depth, playoff_seeds) {\n  g <- pbp |>\n    dplyr::filter(.data$season_type == \"REG\") |>\n    dplyr::group_by(.data$game_id) |>\n    dplyr::summarise(\n      sim = dplyr::first(.data$season),\n      game_type = dplyr::first(.data$season_type),\n      week = dplyr::first(.data$week),\n      away_team = dplyr::first(.data$away_team),\n      home_team = dplyr::first(.data$home_team),\n      result = dplyr::last(.data$home_score) - dplyr::last(.data$away_score)\n    ) |>\n    dplyr::ungroup() |>\n    dplyr::select(-\"game_id\")\n\n  if (is.null(playoff_seeds)) {\n    g6 <- g |>\n      dplyr::filter(.data$sim %in% 1999:2019)\n    g7 <- g |>\n      dplyr::filter(.data$sim >= 2020)\n    dplyr::bind_rows(\n      .compute_standings(\n        g6,\n        tiebreaker_depth = tiebreaker_depth,\n        playoff_seeds = 6\n      ),\n      .compute_standings(\n        g7,\n        tiebreaker_depth = tiebreaker_depth,\n        playoff_seeds = 7\n      )\n    )\n  } else {\n    .compute_standings(\n      g,\n      tiebreaker_depth = tiebreaker_depth,\n      playoff_seeds = playoff_seeds\n    )\n  }\n}\n\n.standings_from_games <- function(games, tiebreaker_depth, playoff_seeds) {\n  g <- games |>\n    dplyr::filter(.data$game_type == \"REG\", 
!is.na(.data$result)) |>\n    dplyr::select(\n      \"sim\" = \"season\",\n      \"game_type\",\n      \"week\",\n      \"away_team\",\n      \"home_team\",\n      \"result\"\n    )\n\n  if (is.null(playoff_seeds)) {\n    g6 <- g |>\n      dplyr::filter(.data$sim %in% 1999:2019)\n    g7 <- g |>\n      dplyr::filter(.data$sim >= 2020)\n    dplyr::bind_rows(\n      .compute_standings(\n        g6,\n        tiebreaker_depth = tiebreaker_depth,\n        playoff_seeds = 6\n      ),\n      .compute_standings(\n        g7,\n        tiebreaker_depth = tiebreaker_depth,\n        playoff_seeds = 7\n      )\n    )\n  } else {\n    .compute_standings(\n      g,\n      tiebreaker_depth = tiebreaker_depth,\n      playoff_seeds = playoff_seeds\n    )\n  }\n}\n\n.compute_standings <- function(games, tiebreaker_depth, playoff_seeds) {\n  if (nrow(games) == 0) {\n    return(data.frame())\n  }\n  suppressMessages({\n    div <- nflseedR::compute_division_ranks(\n      games,\n      tiebreaker_depth = tiebreaker_depth\n    )\n    conf <- nflseedR::compute_conference_seeds(\n      div,\n      h2h = div$h2h,\n      tiebreaker_depth = tiebreaker_depth,\n      playoff_seeds = playoff_seeds\n    )\n  })\n  conf$standings |>\n    dplyr::select(-\"exit\", -\"wins\") |>\n    dplyr::select(\"sim\":\"division\", \"div_rank\", \"seed\", dplyr::everything()) |>\n    dplyr::rename(\"season\" = \"sim\", \"wins\" = \"true_wins\") |>\n    dplyr::arrange(.data$season, .data$division, .data$div_rank, .data$seed) |>\n    tibble::as_tibble()\n}\n"
  },
  {
    "path": "R/calculate_stats.R",
    "content": "################################################################################\n# Author: Sebastian Carl\n################################################################################\n\n#' Calculate NFL Stats\n#'\n#' Compute various NFL stats based off nflverse Play-by-Play data.\n#'\n#' @param seasons A numeric vector of 4-digit years associated with given NFL\n#'  seasons - defaults to latest season. If set to TRUE, returns all available\n#'  data since 1999. Ignored if argument `pbp` is not `NULL`.\n#' @param summary_level Summarize stats by `\"season\"` or `\"week\"`.\n#' @param stat_type Calculate `\"player\"` level stats or `\"team\"` level stats.\n#' @param season_type One of `\"REG\"`, `\"POST\"`, or `\"REG+POST\"`. Filters\n#'  data to regular season (\"REG\"), post season (\"POST\") or keeps all data.\n#'  Only applied if `summary_level` == `\"season\"`.\n#' @param pbp This argument allows passing a subset of nflverse play-by-play\n#'  data, created with [build_nflfastR_pbp()] or loaded with [load_pbp()].\n#'  Stats are then calculated based on the `game_id`s and `play_id`s in this\n#'  subset of play-by-play data, rather than using the seasons specified in the\n#'  `seasons` argument. The function will error if required variables are\n#'  missing from the subset, but lists which variables are missing.\n#'  If `pbp = NULL` (the default), all available games and plays from the\n#'  `seasons` argument are used to calculate stats.\n#'  Please use this responsibly, because the output is structurally identical\n#'  to full seasons, even if plays have been filtered out. It may then appear\n#'  as if the stats are incorrect. 
If `pbp` is not `NULL`, the function will add\n#'  the attribute `\"custom_pbp\" = TRUE` to the function output to help identify\n#'  stats that are possibly based on play-by-play subsets.\n#'\n#' @return A tibble of player/team stats summarized by season/week.\n#' @seealso [nfl_stats_variables] for a description of all variables.\n#' @seealso <https://nflfastr.com/articles/stats_variables.html> for a searchable\n#' table of the stats variable descriptions.\n#' @export\n#'\n#' @examples\n#' \\donttest{\n#' try({# to avoid CRAN test problems\n#' stats <- calculate_stats(2023, \"season\", \"player\")\n#' dplyr::glimpse(stats)\n#' })\n#' }\ncalculate_stats <- function(\n  seasons = nflreadr::most_recent_season(),\n  summary_level = c(\"season\", \"week\"),\n  stat_type = c(\"player\", \"team\"),\n  season_type = c(\"REG\", \"POST\", \"REG+POST\"),\n  pbp = NULL\n) {\n  summary_level <- rlang::arg_match(summary_level)\n  stat_type <- rlang::arg_match(stat_type)\n  season_type <- rlang::arg_match(season_type)\n  custom_pbp <- !is.null(pbp)\n\n  if (!custom_pbp) {\n    pbp <- nflreadr::load_pbp(seasons = seasons)\n  }\n\n  # make sure (custom) pbp includes all required variables.\n  # stats_validate_pbp will return all unique seasons in pbp.\n  # We'll use this to download playstats for all seasons listed in pbp.\n  seasons_in_pbp <- stats_validate_pbp(pbp)\n\n  # we don't want groups to mess up something or slow us down.\n  # this is only relevant if a user supplies grouped pbp data\n  pbp <- dplyr::ungroup(pbp)\n\n  if (season_type %in% c(\"REG\", \"POST\") && summary_level == \"season\") {\n    pbp <- dplyr::filter(pbp, .data$season_type == .env$season_type)\n    if (nrow(pbp) == 0) {\n      cli::cli_alert_warning(\n        \"Filtering {.val {seasons}} data to {.arg season_type} == \\\\\n        {.val {season_type}} resulted in 0 rows. 
Returning empty tibble.\"\n      )\n      return(tibble::tibble())\n    }\n  }\n\n  # defensive stats require knowledge of which team is on defense\n  # special teams stats require knowledge of which plays were special teams plays\n  playinfo <- pbp |>\n    dplyr::group_by(.data$game_id, .data$play_id) |>\n    dplyr::summarise(\n      off = .data$posteam,\n      def = .data$defteam,\n      special = as.integer(.data$special == 1)\n    ) |>\n    dplyr::ungroup() |>\n    dplyr::mutate_at(\n      .vars = dplyr::vars(\"off\", \"def\"),\n      .funs = team_name_fn\n    )\n\n  season_type_from_pbp <- pbp |>\n    dplyr::select(\"game_id\", \"season_type\") |>\n    dplyr::distinct()\n  s_type_vctr <- season_type_from_pbp$season_type |>\n    rlang::set_names(season_type_from_pbp$game_id)\n\n  gwfg_attempts_from_pbp <- pbp |>\n    dplyr::mutate(\n      # final_posteam_score = data.table::fifelse(.data$posteam_type == \"home\", .data$home_score, .data$away_score),\n      final_defteam_score = data.table::fifelse(\n        .data$posteam_type == \"home\",\n        .data$away_score,\n        .data$home_score\n      ),\n      identifier = paste(.data$game_id, .data$play_id, sep = \"_\")\n    ) |>\n    dplyr::group_by(.data$game_id, .data$posteam) |>\n    dplyr::mutate(\n      # A game winning field goal attempt is\n      # - a field goal attempt,\n      # - in the posteam's final drive,\n      # - where the posteam trailed the defteam by 2 points or less prior to the kick,\n      # - and the defteam did not score afterwards\n      is_gwfg_attempt = dplyr::case_when(\n        .data$field_goal_attempt == 1 &\n          .data$fixed_drive == max(.data$fixed_drive) &\n          dplyr::between(.data$score_differential, -2, 0) &\n          .data$defteam_score == .data$final_defteam_score ~ 1L,\n        TRUE ~ 0L\n      )\n    ) |>\n    dplyr::ungroup() |>\n    dplyr::filter(\n      is_gwfg_attempt == 1L\n    ) |>\n    dplyr::select(\"identifier\", \"is_gwfg_attempt\")\n  gwfg_vctr <- 
gwfg_attempts_from_pbp$is_gwfg_attempt |>\n    rlang::set_names(gwfg_attempts_from_pbp$identifier)\n\n  # load_playstats defined below\n  # more_stats = all stat IDs of one player in a single play\n  # team_stats = all stat IDs of one team in a single play\n  # all_stats = all stat IDs of a play, regardless of team (we need this for punting)\n  # we need those to identify things like fumbles depending on playtype or\n  # first downs depending on playtype\n  playstats <- load_playstats(seasons = seasons_in_pbp) |>\n    # apply filtering on play stats so that it matches only plays included\n    # in pbp in case it was provided manually\n    dplyr::semi_join(pbp, by = c(\"game_id\", \"play_id\")) |>\n    dplyr::rename(\"player_id\" = \"gsis_player_id\", \"team\" = \"team_abbr\") |>\n    dplyr::group_by(.data$season, .data$week, .data$play_id, .data$player_id) |>\n    dplyr::mutate(\n      # we wrap the collapsed string in \";\" in order to search for the pattern\n      # \";stat_id;\" to avoid matching 1 with 10, 11, 21, etc.\n      more_stats = paste0(\";\", paste(stat_id, collapse = \";\"), \";\")\n    ) |>\n    dplyr::group_by(.data$season, .data$week, .data$play_id, .data$team) |>\n    dplyr::mutate(\n      # we wrap the collapsed string in \";\" in order to search for the pattern\n      # \";stat_id;\" to avoid matching 1 with 10, 11, 21, etc.\n      team_stats = paste0(\";\", paste(stat_id, collapse = \";\"), \";\"),\n      team_play_air_yards = sum((stat_id %in% 111:112) * yards)\n    ) |>\n    # need to group by game and play here to avoid mixing of play IDs of different\n    # games in the same week\n    dplyr::group_by(.data$game_id, .data$play_id) |>\n    dplyr::mutate(\n      # we wrap the collapsed string in \";\" in order to search for the pattern\n      # \";stat_id;\" to avoid matching 1 with 10, 11, 21, etc.\n      all_stats = paste0(\";\", paste(stat_id, collapse = \";\"), \";\"),\n      play_punt_return_yards = sum((stat_id %in% 33:36) * yards)\n    
) |>\n    # compute team targets and team air yards for calculation of target share\n    # and air yard share. Since it's relative, we need to be careful with the groups\n    # depending on summary level\n    dplyr::group_by(\n      !!!rlang::data_syms(\n        if (summary_level == \"season\") {\n          c(\"season\", \"team\")\n        } else {\n          c(\"season\", \"week\", \"team\")\n        }\n      )\n    ) |>\n    dplyr::mutate(\n      team_targets = sum(stat_id == 115),\n      team_air_yards = sum((stat_id %in% 111:112) * yards)\n    ) |>\n    dplyr::ungroup() |>\n    dplyr::left_join(\n      playinfo,\n      by = c(\"game_id\", \"play_id\")\n    ) |>\n    dplyr::mutate(\n      season_type = unname(s_type_vctr[.data$game_id]),\n      is_gwfg_attempt = unname(gwfg_vctr[paste(\n        .data$game_id,\n        .data$play_id,\n        sep = \"_\"\n      )]) %ifna%\n        0L\n    )\n\n  # Check combination of summary_level and stat_type to set a helper that is\n  # used to create the grouping variables\n  grp_id <- data.table::fcase(\n    summary_level == \"season\" && stat_type == \"player\" , \"10\" ,\n    summary_level == \"season\" && stat_type == \"team\"   , \"20\" ,\n    summary_level == \"week\" && stat_type == \"player\"   , \"30\" ,\n    summary_level == \"week\" && stat_type == \"team\"     , \"40\"\n  )\n  # grp_vctr is used as character vector for joining pbp stats\n  grp_vctr <- switch(\n    grp_id,\n    \"10\" = c(\"season\", \"player_id\"),\n    \"20\" = c(\"season\", \"team\"),\n    \"30\" = c(\"season\", \"week\", \"player_id\"),\n    \"40\" = c(\"season\", \"week\", \"team\")\n  )\n  # grp_vars is used as grouping variables\n  grp_vars <- rlang::data_syms(grp_vctr)\n\n  # Stats from PBP #####################\n  # we want passing epa, rushing epa, and receiving epa\n  # since these depend on different player id variables and filters,\n  # we create separate dfs for these stats\n  passing_stats_from_pbp <- pbp |>\n    
dplyr::filter(.data$play_type %in% c(\"pass\", \"qb_spike\")) |>\n    dplyr::select(\n      \"season\",\n      \"week\",\n      \"team\" = \"posteam\",\n      \"player_id\" = \"passer_player_id\",\n      \"qb_epa\",\n      \"cpoe\"\n    ) |>\n    dplyr::group_by(!!!grp_vars) |>\n    dplyr::summarise(\n      passing_epa = sum(.data$qb_epa, na.rm = TRUE),\n      # mean will return NaN if all values are NA, because we remove NA\n      passing_cpoe = if (any(!is.na(.data$cpoe))) {\n        mean(.data$cpoe, na.rm = TRUE)\n      } else {\n        NA_real_\n      }\n    ) |>\n    dplyr::ungroup()\n\n  rushing_stats_from_pbp <- pbp |>\n    dplyr::filter(.data$play_type %in% c(\"run\", \"qb_kneel\")) |>\n    dplyr::select(\n      \"season\",\n      \"week\",\n      \"team\" = \"posteam\",\n      \"player_id\" = \"rusher_player_id\",\n      \"epa\"\n    ) |>\n    dplyr::group_by(!!!grp_vars) |>\n    dplyr::summarise(\n      rushing_epa = sum(.data$epa, na.rm = TRUE)\n    ) |>\n    dplyr::ungroup()\n\n  receiving_stats_from_pbp <- pbp |>\n    dplyr::filter(!is.na(.data$receiver_player_id)) |>\n    dplyr::select(\n      \"season\",\n      \"week\",\n      \"team\" = \"posteam\",\n      \"player_id\" = \"receiver_player_id\",\n      \"epa\"\n    ) |>\n    dplyr::group_by(!!!grp_vars) |>\n    dplyr::summarise(\n      receiving_epa = sum(.data$epa, na.rm = TRUE)\n    ) |>\n    dplyr::ungroup()\n\n  stats <- playstats |>\n    dplyr::group_by(!!!grp_vars) |>\n    dplyr::summarise(\n      player_name = if (.env$stat_type == \"player\") {\n        custom_mode(.data$player_name, na.rm = TRUE)\n      } else {\n        NULL\n      },\n      # Season Type #####################\n      # if summary level is week, then we have to use the season type variable\n      # from playstats as it could be REG or POST depending on the value of\n      # the argument season_type\n      # if summary level is season, then we collapse the values of season_type\n      # this will make sure that season_type 
is only REG+POST if the user asked\n      # for it AND if postseason data is available\n      season_type = if (.env$summary_level == \"week\") {\n        dplyr::first(.data$season_type)\n      } else {\n        paste(unique(.data$season_type), collapse = \"+\")\n      },\n\n      # Game ID #####################\n      # it's not strictly necessary to output game_id because we have\n      # season, week, team, and opponent information but it is convenient\n      # to add this here\n      # Only makes sense in case of weekly stats of course\n      game_id = if (.env$summary_level == \"week\") {\n        dplyr::first(.data$game_id)\n      } else {\n        NULL\n      },\n\n      # Team Info #####################\n      # recent_team if we do a season summary of player stats\n      # team if we do a week summary of player stats\n      recent_team = if (.env$grp_id == \"10\") dplyr::last(.data$team) else NULL,\n      team = if (.env$grp_id == \"30\") dplyr::first(.data$team) else NULL,\n      # opponent team if we do week summaries\n      opponent_team = if (.env$summary_level == \"week\") {\n        data.table::fifelse(\n          dplyr::first(.data$team) == dplyr::first(.data$off),\n          dplyr::first(.data$def),\n          dplyr::first(.data$off)\n        )\n      } else {\n        NULL\n      },\n\n      # number of games is only relevant if we summarise the season\n      games = if (.env$summary_level == \"season\") {\n        dplyr::n_distinct(.data$game_id)\n      } else {\n        NULL\n      },\n\n      # Offense #####################\n      completions = sum(stat_id %in% 15:16),\n      attempts = sum(stat_id %in% c(14:16, 19)),\n      passing_yards = sum((stat_id %in% 15:16) * yards),\n      passing_tds = sum(stat_id == 16),\n      passing_interceptions = sum(stat_id == 19),\n      sacks_suffered = sum(stat_id == 20),\n      sack_yards_lost = sum((stat_id == 20) * yards),\n      sack_fumbles = sum(stat_id == 20 & has_id(52:54, more_stats)),\n      
sack_fumbles_lost = sum(stat_id == 20 & has_id(106, more_stats)),\n      # includes incompletions (111 = complete, 112 = incomplete)\n      passing_air_yards = sum((stat_id %in% 111:112) * yards),\n      # passing yac equals passing yards - air yards on completed passes\n      passing_yards_after_catch = .data$passing_yards -\n        sum((stat_id == 111) * yards),\n      passing_first_downs = sum((stat_id %in% 15:16) & has_id(4, team_stats)),\n      passing_2pt_conversions = sum(stat_id == 77),\n      # this is a player stat and we skip it in team stats\n      pacr = if (.env$stat_type == \"player\") {\n        .data$passing_yards / .data$passing_air_yards\n      } else {\n        NULL\n      },\n      # \"Explosives\" (see #550 for discussion about the definition)\n      passing_10 = sum((stat_id %in% 15:16) * (yards >= 10)),\n      passing_16 = sum((stat_id %in% 15:16) * (yards >= 16)),\n      passing_20 = sum((stat_id %in% 15:16) * (yards >= 20)),\n      passing_40 = sum((stat_id %in% 15:16) * (yards >= 40)),\n      # dakota = requires pbp,\n\n      carries = sum(stat_id %in% 10:11),\n      rushing_yards = sum((stat_id %in% 10:13) * yards),\n      rushing_tds = sum(stat_id %in% c(11, 13)),\n      rushing_fumbles = sum((stat_id %in% 10:11) & has_id(52:54, more_stats)),\n      rushing_fumbles_lost = sum(\n        (stat_id %in% 10:11) & has_id(106, more_stats)\n      ),\n      rushing_first_downs = sum((stat_id %in% 10:11) & has_id(3, team_stats)),\n      rushing_2pt_conversions = sum(stat_id == 75),\n      # \"Explosives\" (see #550 for discussion about the definition)\n      rushing_10 = sum((stat_id %in% 10:13) * (yards >= 10)),\n      rushing_12 = sum((stat_id %in% 10:13) * (yards >= 12)),\n      rushing_20 = sum((stat_id %in% 10:13) * (yards >= 20)),\n      rushing_40 = sum((stat_id %in% 10:13) * (yards >= 40)),\n\n      receptions = sum(stat_id %in% 21:22),\n      targets = sum(stat_id == 115),\n      receiving_yards = sum((stat_id %in% 21:24) * yards),\n    
  receiving_tds = sum(stat_id %in% c(22, 24)),\n      receiving_fumbles = sum((stat_id %in% 21:22) & has_id(52:54, more_stats)),\n      receiving_fumbles_lost = sum(\n        (stat_id %in% 21:22) & has_id(106, more_stats)\n      ),\n      # air_yards are counted in 111:112 but it is a passer stat not a receiver stat\n      # so we count team air yards when a player accounted for a reception\n      # team air yards will always equal the correct air yards as 111 and 112\n      # cannot appear more than once per play.\n      # If this ever changes, we can use pbp instead.\n      receiving_air_yards = if (.env$stat_type == \"player\") {\n        sum((stat_id == 115) * .data$team_play_air_yards)\n      } else {\n        .data$passing_air_yards\n      },\n      receiving_yards_after_catch = sum((stat_id == 113) * yards),\n      receiving_first_downs = sum((stat_id %in% 21:22) & has_id(4, team_stats)),\n      receiving_2pt_conversions = sum(stat_id == 104),\n      # \"Explosives\" (see #550 for discussion about the definition)\n      receiving_10 = sum((stat_id %in% 21:24) * (yards >= 10)),\n      receiving_16 = sum((stat_id %in% 21:24) * (yards >= 16)),\n      receiving_20 = sum((stat_id %in% 21:24) * (yards >= 20)),\n      receiving_40 = sum((stat_id %in% 21:24) * (yards >= 40)),\n      # these are player stats and we skip them in team stats\n      racr = if (.env$stat_type == \"player\") {\n        .data$receiving_yards / .data$receiving_air_yards\n      } else {\n        NULL\n      },\n      target_share = if (.env$stat_type == \"player\") {\n        .data$targets / dplyr::first(.data$team_targets)\n      } else {\n        NULL\n      },\n      air_yards_share = if (.env$stat_type == \"player\") {\n        .data$receiving_air_yards / dplyr::first(.data$team_air_yards)\n      } else {\n        NULL\n      },\n      wopr = if (.env$stat_type == \"player\") {\n        1.5 * .data$target_share + 0.7 * .data$air_yards_share\n      } else {\n        NULL\n      },\n\n      
special_teams_tds = sum((special == 1) & stat_id %in% td_ids()),\n\n      # Defense #####################\n      # def_tackles = ,\n      def_tackles_solo = sum(stat_id == 79),\n      def_tackles_with_assist = sum(stat_id == 80),\n      def_tackle_assists = sum(stat_id == 82),\n      def_tackles_for_loss = sum(stat_id == 402),\n      def_tackles_for_loss_yards = sum((stat_id == 402) * yards),\n      def_fumbles_forced = sum(stat_id == 91),\n      def_sacks = sum(stat_id == 83) + 1 / 2 * sum(stat_id == 84),\n      def_sack_yards = sum((stat_id == 83) * -yards) +\n        1 / 2 * sum((stat_id == 84) * -yards),\n      def_qb_hits = sum(stat_id == 110),\n      def_interceptions = sum(stat_id %in% 25:26),\n      def_interception_yards = sum((stat_id %in% 25:28) * yards),\n      def_pass_defended = sum(stat_id == 85),\n      def_tds = sum(team == def & special != 1 & stat_id %in% td_ids()),\n      # stat ID 54 is a fumble out of bounds. It's never counted alone,\n      # always in combination with 52 or 53.\n      def_fumbles = sum((team == def) & stat_id %in% 52:53),\n      def_safeties = sum(stat_id == 89),\n\n      # Misc #####################\n      # mostly yards gained after blocked punts or fgs\n      misc_yards = sum((stat_id %in% 63:64) * yards),\n      fumble_recovery_own = sum(stat_id %in% 55:56),\n      # 57, 58 don't count as recovery because player received a\n      # lateral after recovery by other player\n      fumble_recovery_yards_own = sum((stat_id %in% 55:58) * yards),\n      fumble_recovery_opp = sum(stat_id %in% 59:60),\n      # 61, 62 don't count as recovery because player received a\n      # lateral after recovery by other player\n      fumble_recovery_yards_opp = sum((stat_id %in% 59:62) * yards),\n      fumble_recovery_tds = sum(stat_id %in% c(56, 58, 60, 62)),\n      penalties = sum(stat_id == 93),\n      penalty_yards = sum((stat_id == 93) * yards),\n      timeouts = if (.env$stat_type == \"team\") sum(stat_id == 68) else NULL,\n      # we are 
missing some fumbles on offense (see 515) so we just add\n      # totals here. These fumble stats count all fumbles regardless of\n      # the unit the player was on. This means that all above fumble stats\n      # are included here but we make sure not to lose any fumbles, esp. on offense\n      fumbles_forced_by_opp = sum(stat_id == 52),\n      fumbles_not_forced = sum(stat_id == 53),\n      fumbles_out_of_bounds = sum(stat_id == 54),\n      # we could tell users to just add the above three stats but fumbles are\n      # a bit confusing overall so it is ok to add a total counter that doesn't\n      # miss any fumbles.\n      # stat ID 54 is a fumble out of bounds. It's never counted alone,\n      # always in combination with 52 or 53. So we cannot add it to the total.\n      fumbles_total = sum(stat_id %in% 52:53),\n      fumbles_lost_total = sum(stat_id == 106),\n\n      # Returning #####################\n      punt_returns = sum(stat_id %in% 33:34),\n      punt_return_yards = sum((stat_id %in% 33:36) * yards),\n      # punt return tds are counted in special teams tds atm\n      # punt_return_tds = sum(stat_id %in% c(34, 36)),\n      kickoff_returns = sum(stat_id %in% 45:46),\n      kickoff_return_yards = sum((stat_id %in% 45:48) * yards),\n      # kickoff return tds are counted in special teams tds atm\n      # kickoff_return_tds = sum(stat_id %in% c(46, 48)),\n\n      # Kicking #####################\n      fg_made = sum(stat_id == 70),\n      fg_att = sum(stat_id %in% 69:71),\n      fg_missed = sum(stat_id == 69),\n      fg_blocked = sum(stat_id == 71),\n      fg_long = max((stat_id == 70) * yards) %0% NA_integer_,\n      # avoid 0/0 = NaN\n      fg_pct = if (.data$fg_att > 0) .data$fg_made / .data$fg_att else NA_real_,\n      fg_made_0_19 = sum((stat_id == 70) * (yards %between% c(0, 19))),\n      fg_made_20_29 = sum((stat_id == 70) * (yards %between% c(20, 29))),\n      fg_made_30_39 = sum((stat_id == 70) * (yards %between% c(30, 39))),\n      fg_made_40_49 
= sum((stat_id == 70) * (yards %between% c(40, 49))),\n      fg_made_50_59 = sum((stat_id == 70) * (yards %between% c(50, 59))),\n      fg_made_60_ = sum((stat_id == 70) * (yards >= 60)),\n      fg_missed_0_19 = sum((stat_id == 69) * (yards %between% c(0, 19))),\n      fg_missed_20_29 = sum((stat_id == 69) * (yards %between% c(20, 29))),\n      fg_missed_30_39 = sum((stat_id == 69) * (yards %between% c(30, 39))),\n      fg_missed_40_49 = sum((stat_id == 69) * (yards %between% c(40, 49))),\n      fg_missed_50_59 = sum((stat_id == 69) * (yards %between% c(50, 59))),\n      fg_missed_60_ = sum((stat_id == 69) * (yards >= 60)),\n      fg_made_list = fg_list(stat_id, yards, collapse_id = 70),\n      fg_missed_list = fg_list(stat_id, yards, collapse_id = 69),\n      fg_blocked_list = fg_list(stat_id, yards, collapse_id = 71),\n      fg_made_distance = sum((stat_id == 70) * yards),\n      fg_missed_distance = sum((stat_id == 69) * yards),\n      fg_blocked_distance = sum((stat_id == 71) * yards),\n      pat_made = sum(stat_id == 72),\n      pat_att = sum(stat_id %in% 72:74),\n      pat_missed = sum(stat_id == 73),\n      pat_blocked = sum(stat_id == 74),\n      # avoid 0/0 = NaN\n      pat_pct = if (.data$pat_att > 0) {\n        .data$pat_made / .data$pat_att\n      } else {\n        NA_real_\n      },\n      gwfg_made = sum((stat_id == 70) * is_gwfg_attempt),\n      gwfg_att = sum((stat_id %in% 69:71) * is_gwfg_attempt),\n      gwfg_missed = sum((stat_id == 69) * is_gwfg_attempt),\n      gwfg_blocked = sum((stat_id == 71) * is_gwfg_attempt),\n      gwfg_distance = if (.env$summary_level == \"week\") {\n        sum((stat_id %in% 69:71) * is_gwfg_attempt * yards)\n      } else {\n        NULL\n      },\n      gwfg_distance_list = if (.env$summary_level == \"season\") {\n        fg_list(stat_id, yards, collapse_id = 69:71, gwfg = is_gwfg_attempt)\n      } else {\n        NULL\n      },\n\n      # Punts #####################\n      # stat ID 2 counts blocked punts that do 
not count as punt\n      pt_att = sum(stat_id %in% c(29, 31, 32)), # 31 probably unnecessary\n      pt_blocked = sum(stat_id == 2),\n      pt_long = max(stat_id %in% c(29, 32) * yards) %0% NA_integer_,\n      pt_yards = sum(stat_id %in% c(29, 32) * yards),\n      pt_inside_20 = sum(stat_id == 30),\n      # the following stats are a bit special as we need opponent team stats\n      # to get the counts right. That's what 'all_stats' is for\n      # stat IDs 37, 38, 39 (punts oob, downed, fair caught) are assigned to\n      # the receiving team (or to the receiver in case of 39)\n      # Also the number of returns, return TDs and the yardage\n      pt_out_of_bounds = sum((stat_id == 29) & has_id(37, all_stats)),\n      pt_downed = sum((stat_id == 29) & has_id(38, all_stats)),\n      pt_touchback = sum(stat_id == 32),\n      pt_fair_caught = sum((stat_id == 29) & has_id(39, all_stats)),\n      pt_returned = sum(\n        (stat_id %in% c(2, 29, 31, 32)) & has_id(33:34, all_stats)\n      ),\n      pt_return_yards = sum(\n        (stat_id %in% c(2, 29, 31, 32)) * .data$play_punt_return_yards\n      ),\n      pt_return_tds = sum(\n        (stat_id %in% c(2, 29, 31, 32)) & has_id(c(34, 36), all_stats)\n      ),\n      pt_net_yards = .data$pt_yards -\n        .data$pt_return_yards -\n        .data$pt_touchback * 20L\n    ) |>\n    dplyr::ungroup() |>\n    dplyr::mutate_if(\n      .predicate = is.character,\n      .funs = ~ dplyr::na_if(.x, \"\")\n    ) |>\n    # Join PBP Stats #####################\n    dplyr::left_join(passing_stats_from_pbp, by = grp_vctr) |>\n    dplyr::left_join(rushing_stats_from_pbp, by = grp_vctr) |>\n    dplyr::left_join(receiving_stats_from_pbp, by = grp_vctr) |>\n    # relocate epa variables. 
This could be done with dplyr::relocate\n    # but we want to be compatible with older dplyr versions\n    dplyr::select(\n      \"season\":\"passing_first_downs\",\n      \"passing_epa\",\n      \"passing_cpoe\",\n      \"passing_2pt_conversions\":\"rushing_first_downs\",\n      \"rushing_epa\",\n      \"rushing_2pt_conversions\":\"receiving_first_downs\",\n      \"receiving_epa\",\n      dplyr::everything()\n    ) |>\n    dplyr::arrange(!!!grp_vars)\n\n  # Apply Player Modifications #####################\n  if (stat_type == \"player\") {\n    # need newer version of nflreadr to use load_players\n    rlang::check_installed(\"nflreadr (>= 1.3.0)\", \"to join player information.\")\n\n    player_info <- nflreadr::load_players() |>\n      dplyr::select(\n        \"player_id\" = \"gsis_id\",\n        \"player_display_name\" = \"display_name\",\n        # \"player_name\" = \"short_name\",\n        \"position\",\n        \"position_group\",\n        \"headshot_url\" = \"headshot\"\n      )\n\n    # load gsis_ids of RBs, FBs and HBs for RACR\n    racr_ids <- player_info |>\n      dplyr::filter(.data$position %in% c(\"RB\", \"FB\", \"HB\")) |>\n      dplyr::pull(\"player_id\")\n\n    stats <- stats |>\n      dplyr::mutate(\n        pacr = dplyr::case_when(\n          is.nan(.data$pacr) ~ NA_real_,\n          .data$passing_air_yards <= 0 ~ 0,\n          TRUE ~ .data$pacr\n        ),\n        racr = dplyr::case_when(\n          is.nan(.data$racr) ~ NA_real_,\n          .data$receiving_air_yards == 0 ~ 0,\n          # following Josh Hermsmeyer's definition, RACR stays < 0 for RBs (and FBs) and is set to\n          # 0 for Receivers. 
The list \"racr_ids\" includes all known RB and FB gsis_ids\n          .data$receiving_air_yards < 0 & !.data$player_id %in% racr_ids ~ 0,\n          TRUE ~ .data$racr\n        ),\n        # Fantasy #####################\n        fantasy_points = 1 /\n          25 *\n          .data$passing_yards +\n          4 * .data$passing_tds +\n          -2 * .data$passing_interceptions +\n          1 / 10 * (.data$rushing_yards + .data$receiving_yards) +\n          6 *\n            (.data$rushing_tds +\n              .data$receiving_tds +\n              .data$special_teams_tds) +\n          2 *\n            (.data$passing_2pt_conversions +\n              .data$rushing_2pt_conversions +\n              .data$receiving_2pt_conversions) +\n          -2 *\n            (.data$sack_fumbles_lost +\n              .data$rushing_fumbles_lost +\n              .data$receiving_fumbles_lost),\n\n        fantasy_points_ppr = .data$fantasy_points + .data$receptions\n      ) |>\n      dplyr::left_join(player_info, by = \"player_id\") |>\n      dplyr::select(\n        \"player_id\",\n        \"player_name\",\n        \"player_display_name\",\n        \"position\",\n        \"position_group\",\n        \"headshot_url\",\n        dplyr::everything()\n      )\n  }\n\n  if (custom_pbp) {\n    attr(stats, \"custom_pbp\") <- TRUE\n  }\n\n  stats\n}\n\n# Silence global vars NOTE\n# We do this differently here because it's only a bunch of variables\n# and the code is more readable\nutils::globalVariables(c(\n  \"stat_id\",\n  \"yards\",\n  \"more_stats\",\n  \"team_stats\",\n  \"team\",\n  \"def\",\n  \"off\",\n  \"special\",\n  \"is_gwfg_attempt\"\n))\n\nload_playstats <- function(seasons = nflreadr::most_recent_season()) {\n  if (isTRUE(seasons)) {\n    seasons <- seq(1999, nflreadr::most_recent_season())\n  }\n\n  stopifnot(\n    is.numeric(seasons),\n    seasons >= 1999,\n    seasons <= nflreadr::most_recent_season()\n  )\n\n  urls <- paste0(\n    
\"https://github.com/nflverse/nflverse-pbp/releases/download/playstats/play_stats_\",\n    seasons,\n    \".rds\"\n  )\n\n  out <- nflreadr::load_from_url(urls, seasons = TRUE, nflverse = FALSE)\n\n  out\n}\n\nfg_list <- function(stat_ids, yards, collapse_id, gwfg = NULL) {\n  if (is.null(gwfg)) {\n    paste(\n      yards[stat_ids == collapse_id],\n      collapse = \";\"\n    )\n  } else {\n    paste(\n      yards[stat_ids %in% collapse_id & gwfg == 1L],\n      collapse = \";\"\n    )\n  }\n}\n\n`%0%` <- function(lhs, rhs) if (lhs != 0) lhs else rhs\n\n`%ifna%` <- function(lhs, rhs) data.table::fifelse(is.na(lhs), rhs, lhs)\n\nhas_id <- function(id, all_ids) {\n  stringr::str_detect(all_ids, paste0(\";\", id, \";\", collapse = \"|\"))\n}\n\ntd_ids <- function() {\n  c(\n    11,\n    13,\n    16,\n    18,\n    22,\n    24,\n    26,\n    28,\n    34,\n    36,\n    46,\n    48,\n    # 56, 58, 60, 62, # 56-62 are separately counted in fumble_recovery_tds\n    64,\n    108\n  )\n}\n\nstats_validate_pbp <- function(pbp) {\n  required_names <- c(\n    \"season\",\n    \"game_id\",\n    \"play_id\",\n    \"posteam\",\n    \"defteam\",\n    \"special\",\n    \"season_type\",\n    \"away_score\",\n    \"home_score\",\n    \"field_goal_attempt\",\n    \"fixed_drive\",\n    \"score_differential\",\n    \"play_type\",\n    \"week\",\n    \"passer_player_id\",\n    \"qb_epa\",\n    \"cpoe\",\n    \"rusher_player_id\",\n    \"epa\",\n    \"receiver_player_id\"\n  )\n  available_names <- names(pbp)\n  missing <- required_names[!required_names %in% available_names]\n  if (length(missing) > 0) {\n    cli::cli_abort(\n      \"You have passed custom pbp to the argument {.arg pbp} but \\\\\n      it is missing the following required variable{?s}: {.val {missing}}\",\n      call = rlang::caller_env()\n    )\n  }\n  unique(pbp$season) |>\n    stats::na.omit() |>\n    as.vector()\n}\n"
  },
  {
    "path": "R/data_documentation.R",
    "content": "################################################################################\n# Author: Sebastian Carl\n# Purpose: Documenting Data Files\n# Code Style Guide: styler::tidyverse_style()\n################################################################################\n\n#' NFL Team names, colors and logo urls.\n#'\n#' @docType data\n#' @format A data frame with 36 rows and 10 variables containing NFL team level\n#' information, including franchises in multiple cities:\n#' \\describe{\n#'   \\item{team_abbr}{Team abbreviation}\n#'   \\item{team_name}{Complete Team name}\n#'   \\item{team_id}{Team id used in the roster function}\n#'   \\item{team_nick}{Nickname}\n#'   \\item{team_conf}{Conference}\n#'   \\item{team_division}{Division}\n#'   \\item{team_color}{Primary color}\n#'   \\item{team_color2}{Secondary color}\n#'   \\item{team_color3}{Tertiary color}\n#'   \\item{team_color4}{Quaternary color}\n#'   \\item{team_logo_wikipedia}{Url to Team logo on wikipedia}\n#'   \\item{team_logo_espn}{Url to higher quality logo on espn}\n#'   \\item{team_wordmark}{Url to team wordmarks}\n#'   \\item{team_conference_logo}{Url to AFC and NFC logos}\n#'   \\item{team_league_logo}{Url to NFL logo}\n#' }\n#' The primary and secondary colors have been taken from nfl.com with some modifications\n#' for better team distinction and most recent team color themes.\n#' The tertiary and quaternary colors are taken from Lee Sharpe's teamcolors.csv\n#' who has taken them from the `teamcolors` package created by Ben Baumer and\n#' Gregory Matthews. 
The Wikipedia logo urls are taken from Lee Sharpe's logos.csv.\n#' Team wordmarks are taken from nfl.com.\n#' @examples\n#' \\donttest{\n#' teams_colors_logos\n#' }\n\"teams_colors_logos\"\n\n#' nflfastR Field Descriptions\n#'\n#' @docType data\n#' @format A data frame including names and descriptions of all variables in\n#' an nflfastR dataset.\n#' @seealso The searchable table on the\n#' [nflfastR website](https://nflfastr.com/articles/field_descriptions.html)\n#' @examples\n#' \\donttest{\n#' field_descriptions\n#' }\n\"field_descriptions\"\n\n#' NFL Stat IDs and their Meanings\n#'\n#' @docType data\n#' @format A data frame including NFL stat IDs, names and descriptions used in\n#' an nflfastR dataset.\n#' @source \\url{http://www.nflgsis.com/gsis/Documentation/Partners/StatIDs.html}\n#' @examples\n#' \\donttest{\n#' stat_ids\n#' }\n\"stat_ids\"\n\n#' NFL Stats Variables\n#'\n#' @docType data\n#' @format A data frame explaining all variables returned by the function\n#' [calculate_stats()].\n#' @examples\n#' \\donttest{\n#' nfl_stats_variables\n#' }\n\"nfl_stats_variables\"\n"
  },
  {
    "path": "R/database.R",
    "content": "#' Update or Create a nflverse Play-by-Play Data Table in a Connected Database\n#'\n#' @description\n#' The nflfastR play-by-play era dates back to 1999. To analyze all the data\n#' efficiently, there is practically no alternative to working with a database.\n#'\n#' This function helps to create and maintain a table containing all\n#' play-by-play data of the nflfastR era in a connected database.\n#' Primarily, the preprocessed data from [load_pbp] is written to the database\n#' and, if necessary, supplemented with the latest games using\n#' [build_nflfastR_pbp].\n#'\n#' @param conn A `DBIConnection` object, as returned by [DBI::dbConnect()]\n#' @inheritParams rlang::args_dots_empty\n#' @inheritParams DBI::dbExistsTable\n#' @param seasons Hybrid argument (logical or numeric) to update parts\n#' of or the complete play by play table within the database.\n#'\n#' It can update the play by play data table either for the whole nflfastR era\n#' (with `seasons = TRUE`) or just for specified seasons\n#' (e.g. `seasons = 2024:2025`).\n#'\n#' Defaults to [most_recent_season]. Please see details for further information.\n#'\n#' @details\n#' ## The `seasons` argument\n#'\n#' The `seasons` argument controls how the table in the connected database is\n#' handled.\n#'\n#' With `seasons = TRUE`, the table in argument `name` will be removed completely\n#' (by calling [DBI::dbRemoveTable]) and all seasons of the nflfastR era will be\n#' added to a fresh table. This is helpful when new columns are added during the\n#' offseason.\n#'\n#' With a numerical vector, e.g. `seasons = 2024:2025`, the table in argument\n#' `name` will be preserved and only rows from the given seasons will be deleted\n#' and re-added (by calling [DBI::dbAppendTable]). 
This is intended to be used\n#' for ongoing seasons because the NFL fixes bugs in the underlying data during\n#' the week. We recommend rebuilding the current season every Thursday during\n#' the season.\n#'\n#' The default behavior is `seasons = most_recent_season()`, which means that\n#' only the most recent season is updated or added.\n#'\n#' To keep the table, and thus also the schema, but update all play-by-play\n#' data of the nflfastR era, set\n#' ```\n#' seasons = seq(1999, most_recent_season())\n#' ```\n#'\n#' If `seasons` contains multiple seasons, it is possible to control whether the\n#' seasons are loaded individually and written to the database, or whether\n#' multiple seasons should be processed in chunks. The latter is more efficient\n#' because fewer write operations are required, but at the same time, the data\n#' must first be stored in memory. The option `\"nflfastR.db_chunk_size\"` can\n#' be used to control how many seasons are loaded together in a chunk and\n#' written to the database. With the following option, for example, 5 seasons\n#' are always loaded together and written to the database.\n#' ```\n#' options(\"nflfastR.db_chunk_size\" = 5L)\n#' ```\n#'\n#' @returns Always returns the database connection invisibly.\n#' @export\n#'\n#' @examples\n#' \\donttest{\n#' con <- DBI::dbConnect(duckdb::duckdb())\n#' try({# to avoid CRAN test problems\n#' update_pbp_db(con, seasons = 2024)\n#' })\n#' }\nupdate_pbp_db <- function(\n  conn,\n  ...,\n  name = \"nflverse_pbp\",\n  seasons = most_recent_season()\n) {\n  rlang::check_installed(\"DBI\", \"to communicate with databases\")\n  rlang::check_dots_empty()\n\n  # Validate connection and table name --------------------------------------\n\n  if (!DBI::dbIsValid(conn)) {\n    cli::cli_abort(\n      \"The connection in argument {.arg conn} is invalid. 
\\\\\n      Do you need to run {.fun DBI::dbConnect}?\"\n    )\n  }\n\n  rule_header(\"Update nflverse Play-by-Play Data in Connected Database\")\n\n  # msg_name is the table name used in cli messages. We need it because `name`\n  # could be a call to DBI::SQL() or DBI::Id()\n  # I don't want to evaluate name in every subsequent function call, so I do it\n  # here once and pass it around\n  msg_name <- DBI::dbQuoteIdentifier(conn = conn, x = name) |>\n    as.character()\n\n  initiated <- FALSE\n  if (!DBI::dbExistsTable(conn = conn, name = name)) {\n    do_it <- confirm(\n      \"Table {.val {msg_name}} does not yet exist in your connected database.\n      Do you wish to create it? (Y/n)\"\n    )\n    if (do_it) {\n      initiated <- db_initiate_pbp(\n        conn = conn,\n        name = name,\n        msg_name = msg_name\n      )\n    } else {\n      rule_footer(\"ABORTED\")\n      return(invisible(conn))\n    }\n  }\n\n  # Validate seasons --------------------------------------------------------\n\n  if (is.numeric(seasons)) {\n    invalid <- setdiff(seasons, valid_seasons())\n    if (length(invalid) > 0) {\n      cli::cli_abort(\n        \"The following {cli::qty(length(invalid))} season{?s} {?is/are} \\\\\n        invalid: {.val {invalid}}\"\n      )\n    }\n    ret <- db_drop_seasons(\n      conn = conn,\n      name = name,\n      seasons = seasons,\n      msg_name = msg_name\n    )\n  } else if (isTRUE(seasons)) {\n    # We need this block inside if (isTRUE(seasons)) to make sure we run\n    # the else block in the right conditions\n    if (isFALSE(initiated)) {\n      do_it <- confirm(\n        \"Purge table {.val {msg_name}} in your connected database? 
(Y/n)\"\n      )\n      if (do_it) {\n        ret <- DBI::dbRemoveTable(conn = conn, name = name)\n        cli_message(\"Removed {.val {msg_name}}\")\n        initiated <- db_initiate_pbp(\n          conn = conn,\n          name = name,\n          msg_name = msg_name\n        )\n      } else {\n        rule_footer(\"ABORTED\")\n        return(invisible(conn))\n      }\n    }\n  } else {\n    cli::cli_abort(\n      \"Argument {.arg seasons} must be either a vector of valid \\\\\n      seasons or scalar TRUE\"\n    )\n  }\n\n  seasons <- if (isTRUE(seasons)) valid_seasons() else seasons\n\n  # Append seasons ----------------------------------------------------------\n  ret <- db_write_pbp_seasons(\n    conn = conn,\n    name = name,\n    seasons = seasons,\n    msg_name = msg_name\n  )\n\n  # Process missing games ---------------------------------------------------\n  db_games <- db_query_game_ids(conn = conn, name = name, seasons = seasons)\n  completed_games <- completed_game_ids(seasons = seasons)\n  missing_games <- setdiff(completed_games, db_games)\n\n  # This block is only relevant on game days\n  if (length(missing_games) > 0) {\n    # we enter this block if some completed games are missing in load_pbp.\n    # This can happen on game days\n    vec <- cli::cli_vec(missing_games, list(\"vec-trunc\" = 5L))\n    cli_message(\n      \"The following {cli::no(length(missing_games))} game{?s} {?is/are} not \\\\\n      yet available via {.fun load_pbp} and {?is/are} therefore parsed directly \\\\\n      with {.fun build_nflfastR_pbp} and appended to table {.val {msg_name}}: \\\\\n      {.val {vec}}\"\n    )\n    # build pbp of missing games. 
If raw pbp isn't ready, the function will\n    # return an empty dataframe for that game\n    new_pbp <- build_nflfastR_pbp(missing_games, rules = FALSE)\n    ret <- DBI::dbAppendTable(\n      conn = conn,\n      name = name,\n      value = new_pbp\n    )\n    # Check how many new games have been added\n    new_ids <- unique(new_pbp[[\"game_id\"]])\n    cli_message(\n      \"Appended {cli::no(length(new_ids))} game{?s} to table {.val {msg_name}}\",\n      .cli_fct = cli::cli_alert_success\n    )\n    # Let user know that some games are still missing\n    still_missing <- setdiff(missing_games, new_ids)\n    vec <- cli::cli_vec(still_missing, list(\"vec-trunc\" = 5L))\n    cli_message(\n      \"Raw pbp data for the following {cli::no(length(still_missing))} game{?s} \\\\\n      still missing: {.val {vec}}. Please try again in about 10 minutes.\",\n      .cli_fct = cli::cli_alert_warning\n    )\n  }\n\n  # Remove Dummy ------------------------------------------------------------\n  ret <- db_remove_dummy(conn = conn, name = name)\n\n  # Finish ------------------------------------------------------------------\n  cli_message(\n    \"Database update completed\",\n    .cli_fct = cli::cli_alert_success\n  )\n  rule_footer(\"DONE\")\n\n  invisible(conn)\n}\n\ndb_query_game_ids <- function(conn, name, seasons) {\n  res <- DBI::dbGetQuery(\n    conn = conn,\n    statement = glue::glue_sql(\n      \"SELECT DISTINCT game_id FROM {`name`} WHERE season IN ({seasons*});\",\n      .con = conn\n    )\n  )\n  res[[\"game_id\"]]\n}\n\ndb_remove_dummy <- function(conn, name) {\n  n_drops <- DBI::dbExecute(\n    conn = conn,\n    statement = glue::glue_sql(\n      \"DELETE FROM {`name`} WHERE game_id IN ({vals*})\",\n      vals = \"9999_99_DEF_TYP\",\n      .con = conn\n    )\n  )\n  invisible(TRUE)\n}\n\ndb_write_pbp_seasons <- function(conn, name, seasons, msg_name) {\n  vec <- cli::cli_vec(seasons, list(\"vec-trunc\" = 5L))\n  cli_message(\n    \"Append {.val {vec}} 
{cli::qty(length(seasons))}\\\\\n    season{?s} to table {.val {msg_name}}\"\n  )\n  chunks <- compute_chunks(\n    seasons,\n    chunk_size = getOption(\"nflfastR.db_chunk_size\", 1L)\n  )\n  p <- progressr::progressor(along = chunks)\n  for (chunk in chunks) {\n    ret <- DBI::dbAppendTable(\n      conn = conn,\n      name = name,\n      value = load_pbp(seasons = chunk)\n    )\n    p(\"Appending...\")\n  }\n  invisible(TRUE)\n}\n\ndb_drop_seasons <- function(conn, name, seasons, msg_name) {\n  vec <- cli::cli_vec(seasons, list(\"vec-trunc\" = 5L))\n  cli_message(\n    \"Drop {.val {vec}} {cli::qty(length(seasons))}\\\\\n    season{?s} from table {.val {msg_name}}\"\n  )\n  n_drops <- DBI::dbExecute(\n    conn = conn,\n    statement = glue::glue_sql(\n      \"DELETE FROM {`name`} WHERE season IN ({vals*})\",\n      vals = seasons,\n      .con = conn\n    )\n  )\n  invisible(TRUE)\n}\n\ndb_initiate_pbp <- function(conn, name, msg_name) {\n  cli_message(\n    \"Initiate table {.val {msg_name}} with nflverse pbp schema\"\n  )\n  ret <- DBI::dbCreateTable(\n    conn = conn,\n    name = name,\n    fields = default_play\n  )\n  invisible(TRUE)\n}\n\ncompleted_game_ids <- function(seasons) {\n  scheds <- nflreadr::load_schedules(seasons = seasons)\n  scheds <- data.table::setDT(scheds)\n  scheds[\n    !is.na(result) & !game_id %in% missing_raw_games,\n    game_id\n  ]\n}\nutils::globalVariables(c(\"result\", \"game_id\"), add = TRUE)\n\nmissing_raw_games <- c(\"1999_01_BAL_STL\", \"2000_06_BUF_MIA\", \"2000_03_SD_KC\")\n\nvalid_seasons <- function() {\n  seq(1999, nflreadr::most_recent_season())\n}\n\n# https://stackoverflow.com/a/3321659\ncompute_chunks <- function(x, chunk_size = 4) {\n  split(x, ceiling(seq_along(x) / chunk_size))\n}\n\nconfirm <- function(msg, ..., .envir = parent.frame()) {\n  cli::cli_alert_info(\n    text = msg,\n    wrap = FALSE,\n    .envir = .envir\n  )\n  ans <- readline()\n  tolower(ans) %in% c(\"\", \"y\", \"yes\", \"yeah\", \"yep\")\n}\n"
  },
  {
    "path": "R/ep_wp_calculators.R",
    "content": "#' Compute expected points\n#'\n#' for provided plays. Returns the data with\n#' probabilities of each scoring event and EP added. The following columns\n#' must be present: season, home_team, posteam, roof (coded as 'outdoors',\n#' 'dome', or 'open'/'closed'/NA (retractable)), half_seconds_remaining,\n#' yardline_100, down, ydstogo, posteam_timeouts_remaining,\n#' defteam_timeouts_remaining\n#'\n#' @param pbp_data Play-by-play dataset to estimate expected points for.\n#' @details Computes expected points for provided plays. Returns the data with\n#' probabilities of each scoring event and EP added. The following columns\n#' must be present:\n#' \\itemize{\n#' \\item{season}\n#' \\item{home_team}\n#' \\item{posteam}\n#' \\item{roof (coded as 'outdoors', 'dome', or 'open'/'closed'/NA (retractable))}\n#' \\item{half_seconds_remaining}\n#' \\item{yardline_100}\n#' \\item{down}\n#' \\item{ydstogo}\n#' \\item{posteam_timeouts_remaining}\n#' \\item{defteam_timeouts_remaining}\n#' }\n#' @return The original pbp_data with the following columns appended to it:\n#' \\describe{\n#' \\item{ep}{expected points.}\n#' \\item{no_score_prob}{probability of no more scoring this half.}\n#' \\item{opp_fg_prob}{probability next score opponent field goal this half.}\n#' \\item{opp_safety_prob}{probability next score opponent safety this half.}\n#' \\item{opp_td_prob}{probability of next score opponent touchdown this half.}\n#' \\item{fg_prob}{probability next score field goal this half.}\n#' \\item{safety_prob}{probability next score safety this half.}\n#' \\item{td_prob}{probability next score touchdown this half.}\n#' }\n#' @export\n#' @examples\n#' \\donttest{\n#' try({# to avoid CRAN test problems\n#' library(dplyr)\n#' data <- tibble::tibble(\n#' \"season\" = 1999:2019,\n#' \"home_team\" = \"SEA\",\n#' \"posteam\" = \"SEA\",\n#' \"roof\" = \"outdoors\",\n#' \"half_seconds_remaining\" = 1800,\n#' \"yardline_100\" = c(rep(80, 17), rep(75, 4)),\n#' \"down\" = 1,\n#' \"ydstogo\" = 10,\n#' 
\"posteam_timeouts_remaining\" = 3,\n#' \"defteam_timeouts_remaining\" = 3\n#' )\n#'\n#' nflfastR::calculate_expected_points(data) |>\n#'   dplyr::select(season, yardline_100, td_prob, ep)\n#' })\n#' }\ncalculate_expected_points <- function(pbp_data) {\n  # drop existing values of ep and the probs before making new ones\n  pbp_data <- pbp_data |> dplyr::select(-dplyr::any_of(drop.cols))\n\n  suppressWarnings(\n    model_data <- pbp_data |>\n      make_model_mutations() |>\n      ep_model_select()\n  )\n\n  preds <- stats::predict(load_model(\"ep\"), as.matrix(model_data))\n\n  # xgboost v3 returns a matrix of predictions instead of a vector as returned\n  # by xgboost v1.\n  if (is.vector(preds)) {\n    preds <- preds |>\n      matrix(ncol = 7, byrow = TRUE) |>\n      as.data.frame()\n  } else if (is.matrix(preds)) {\n    preds <- as.data.frame(preds)\n  }\n\n  colnames(preds) <- c(\n    \"td_prob\",\n    \"opp_td_prob\",\n    \"fg_prob\",\n    \"opp_fg_prob\",\n    \"safety_prob\",\n    \"opp_safety_prob\",\n    \"no_score_prob\"\n  )\n\n  preds <- preds |>\n    dplyr::mutate(\n      ep = (-3 * .data$opp_fg_prob) +\n        (-2 * .data$opp_safety_prob) +\n        (-7 * .data$opp_td_prob) +\n        (3 * .data$fg_prob) +\n        (2 * .data$safety_prob) +\n        (7 * .data$td_prob)\n    ) |>\n    dplyr::bind_cols(pbp_data)\n\n  return(preds)\n}\n\n# helper column for ep calculator\ndrop.cols <- c(\n  \"ep\",\n  \"td_prob\",\n  \"opp_td_prob\",\n  \"fg_prob\",\n  \"opp_fg_prob\",\n  \"safety_prob\",\n  \"opp_safety_prob\",\n  \"no_score_prob\"\n)\n\n\n#' Compute win probability\n#'\n#' for provided plays. Returns the data with\n#' probabilities of winning the game. 
The following columns\n#' must be present: receive_2h_ko (1 if game is in 1st half and possession\n#' team will receive 2nd half kickoff, 0 otherwise),\n#' score_differential, home_team, posteam, half_seconds_remaining,\n#' game_seconds_remaining, spread_line (how many points home team was favored\n#' by), down, ydstogo, yardline_100, posteam_timeouts_remaining,\n#' defteam_timeouts_remaining\n#'\n#' @param pbp_data Play-by-play dataset to estimate win probability for.\n#' @details Computes win probability for provided plays. Returns the data with\n#' spread and non-spread-adjusted win probabilities. The following columns\n#' must be present:\n#' \\itemize{\n#' \\item{receive_2h_ko (1 if game is in 1st half and possession team will receive 2nd half kickoff, 0 otherwise)}\n#' \\item{score_differential}\n#' \\item{home_team}\n#' \\item{posteam}\n#' \\item{half_seconds_remaining}\n#' \\item{game_seconds_remaining}\n#' \\item{spread_line (how many points home team was favored by)}\n#' \\item{down}\n#' \\item{ydstogo}\n#' \\item{yardline_100}\n#' \\item{posteam_timeouts_remaining}\n#' \\item{defteam_timeouts_remaining}\n#' }\n#' @return The original pbp_data with the following columns appended to it:\n#' \\describe{\n#' \\item{wp}{win probability.}\n#' \\item{vegas_wp}{win probability taking into account pre-game spread.}\n#' }\n#' @export\n#' @examples\n#' \\donttest{\n#' try({# to avoid CRAN test problems\n#' library(dplyr)\n#' data <- tibble::tibble(\n#' \"receive_2h_ko\" = 0,\n#' \"home_team\" = \"SEA\",\n#' \"posteam\" = \"SEA\",\n#' \"score_differential\" = 0,\n#' \"half_seconds_remaining\" = 1800,\n#' \"game_seconds_remaining\" = 3600,\n#' \"spread_line\" = c(1, 3, 4, 7, 14),\n#' \"down\" = 1,\n#' \"ydstogo\" = 10,\n#' \"yardline_100\" = 75,\n#' \"posteam_timeouts_remaining\" = 3,\n#' \"defteam_timeouts_remaining\" = 3\n#' )\n#'\n#' nflfastR::calculate_win_probability(data) |>\n#'   dplyr::select(spread_line, wp, vegas_wp)\n#' })\n#' }\ncalculate_win_probability <- function(pbp_data) 
{\n  # drop existing values of wp and vegas_wp before making new ones\n  pbp_data <- pbp_data |> dplyr::select(-dplyr::any_of(drop.cols.wp))\n\n  suppressWarnings(\n    model_data <- pbp_data |>\n      dplyr::mutate(\n        home = dplyr::if_else(.data$posteam == .data$home_team, 1, 0),\n        posteam_spread = dplyr::if_else(\n          .data$home == 1,\n          .data$spread_line,\n          -1 * .data$spread_line\n        ),\n        elapsed_share = (3600 - .data$game_seconds_remaining) / 3600,\n        spread_time = .data$posteam_spread * exp(-4 * .data$elapsed_share),\n        Diff_Time_Ratio = .data$score_differential /\n          (exp(-4 * .data$elapsed_share))\n      )\n  )\n\n  wp <- get_preds_wp(model_data) |>\n    tibble::as_tibble() |>\n    dplyr::rename(wp = \"value\")\n  wp_spread <- get_preds_wp_spread(model_data) |>\n    tibble::as_tibble() |>\n    dplyr::rename(vegas_wp = \"value\")\n\n  preds <- dplyr::bind_cols(\n    pbp_data,\n    wp,\n    wp_spread\n  )\n\n  return(preds)\n}\n\n# helper column for wp calculator\ndrop.cols.wp <- c(\n  \"wp\",\n  \"vegas_wp\"\n)\n"
  },
  {
    "path": "R/helper_add_cp_cpoe.R",
    "content": "################################################################################\n# Author: Ben Baldwin, Sebastian Carl\n# Purpose: Function to add cp and cpoe variables.\n# CP model created by Zach Feldman: https://github.com/z-feldman/Expected_Completion_NFL\n# Code Style Guide: styler::tidyverse_style()\n################################################################################\n\nadd_cp <- function(pbp) {\n  # testing only\n  # pbp <- g\n\n  passes <- prepare_cp_data(pbp)\n\n  if (!nrow(passes |> dplyr::filter(.data$valid_pass == 1)) == 0) {\n    pred <- stats::predict(\n      load_model(\"cp\"),\n      as.matrix(passes |> dplyr::select(-\"complete_pass\", -\"valid_pass\"))\n    ) |>\n      tibble::as_tibble() |>\n      dplyr::rename(cp = \"value\") |>\n      dplyr::bind_cols(passes) |>\n      dplyr::select(\"cp\", \"valid_pass\")\n\n    pbp <- pbp |>\n      dplyr::bind_cols(pred) |>\n      dplyr::mutate(\n        cp = dplyr::if_else(\n          .data$valid_pass == 1,\n          .data$cp,\n          NA_real_\n        ),\n        cpoe = dplyr::if_else(\n          !is.na(.data$cp),\n          100 * (.data$complete_pass - .data$cp),\n          NA_real_\n        )\n      ) |>\n      dplyr::select(-\"valid_pass\")\n\n    user_message(\"added cp and cpoe\", \"done\")\n  } else {\n    pbp <- pbp |>\n      dplyr::mutate(\n        cp = NA_real_,\n        cpoe = NA_real_\n      )\n    user_message(\n      \"No non-NA values for cp calculation detected. 
cp and cpoe set to NA\",\n      \"info\"\n    )\n  }\n\n  return(pbp)\n}\n\n\n### helper function for getting the data ready\nprepare_cp_data <- function(pbp) {\n  # valid pass play: at least -15 air yards, less than 70 air yards, has intended receiver, has pass location\n  passes <- pbp |>\n    dplyr::mutate(\n      receiver_player_name = stringr::str_extract(\n        .data$desc,\n        \"(?<=((to)|(for))\\\\s[:digit:]{0,2}\\\\-{0,1})[A-Z][A-z]*\\\\.\\\\s?[A-Z][A-z]+(\\\\s(I{2,3})|(IV))?\"\n      ),\n      pass_middle = dplyr::if_else(.data$pass_location == \"middle\", 1, 0),\n      air_is_zero = dplyr::if_else(.data$air_yards == 0, 1, 0),\n      distance_to_sticks = .data$air_yards - .data$ydstogo,\n      valid_pass = dplyr::if_else(\n        (.data$complete_pass == 1 |\n          .data$incomplete_pass == 1 |\n          .data$interception == 1) &\n          !is.na(.data$air_yards) &\n          .data$air_yards >= -15 &\n          .data$air_yards < 70 &\n          (!is.na(.data$receiver_player_name) |\n            !is.na(.data$receiver_player_id)) &\n          !is.na(.data$pass_location),\n        1,\n        0\n      )\n    ) |>\n    dplyr::select(\n      \"complete_pass\",\n      \"air_yards\",\n      \"yardline_100\",\n      \"ydstogo\",\n      \"down1\",\n      \"down2\",\n      \"down3\",\n      \"down4\",\n      \"air_is_zero\",\n      \"pass_middle\",\n      \"era2\",\n      \"era3\",\n      \"era4\",\n      \"qb_hit\",\n      \"home\",\n      \"outdoors\",\n      \"retractable\",\n      \"dome\",\n      \"distance_to_sticks\",\n      \"valid_pass\"\n    )\n}\n"
  },
  {
    "path": "R/helper_add_ep_wp.R",
    "content": "################################################################################\n# Author: Sebastian Carl\n# Purpose: Functions to add ep(a) and wp(a) variables\n# Code Style Guide: styler::tidyverse_style()\n################################################################################\n\nadd_ep <- function(pbp) {\n  out <- pbp |> add_ep_variables()\n  user_message(\"added ep variables\", \"done\")\n  return(out)\n}\n\nadd_air_yac_ep <- function(pbp) {\n  if (nrow(pbp |> dplyr::filter(!is.na(.data$air_yards))) == 0) {\n    out <- pbp |>\n      dplyr::mutate(\n        air_epa = NA_real_,\n        yac_epa = NA_real_,\n        comp_air_epa = NA_real_,\n        comp_yac_epa = NA_real_,\n        home_team_comp_air_epa = NA_real_,\n        away_team_comp_air_epa = NA_real_,\n        home_team_comp_yac_epa = NA_real_,\n        away_team_comp_yac_epa = NA_real_,\n        total_home_comp_air_epa = NA_real_,\n        total_away_comp_air_epa = NA_real_,\n        total_home_comp_yac_epa = NA_real_,\n        total_away_comp_yac_epa = NA_real_,\n        home_team_raw_air_epa = NA_real_,\n        away_team_raw_air_epa = NA_real_,\n        home_team_raw_yac_epa = NA_real_,\n        away_team_raw_yac_epa = NA_real_,\n        total_home_raw_air_epa = NA_real_,\n        total_away_raw_air_epa = NA_real_,\n        total_home_raw_yac_epa = NA_real_,\n        total_away_raw_yac_epa = NA_real_\n      )\n    user_message(\n      \"No non-NA air_yards detected. 
air_yac_ep variables set to NA\",\n      \"info\"\n    )\n  } else {\n    out <- pbp |> add_air_yac_ep_variables()\n    user_message(\"added air_yac_ep variables\", \"done\")\n  }\n  return(out)\n}\n\nadd_wp <- function(pbp) {\n  out <- pbp |> add_wp_variables()\n  user_message(\"added wp variables\", \"done\")\n  return(out)\n}\n\nadd_air_yac_wp <- function(pbp) {\n  if (nrow(pbp |> dplyr::filter(!is.na(.data$air_yards))) == 0) {\n    out <- pbp |>\n      dplyr::mutate(\n        air_wpa = NA_real_,\n        yac_wpa = NA_real_,\n        comp_air_wpa = NA_real_,\n        comp_yac_wpa = NA_real_,\n        home_team_comp_air_wpa = NA_real_,\n        away_team_comp_air_wpa = NA_real_,\n        home_team_comp_yac_wpa = NA_real_,\n        away_team_comp_yac_wpa = NA_real_,\n        total_home_comp_air_wpa = NA_real_,\n        total_away_comp_air_wpa = NA_real_,\n        total_home_comp_yac_wpa = NA_real_,\n        total_away_comp_yac_wpa = NA_real_,\n        home_team_raw_air_wpa = NA_real_,\n        away_team_raw_air_wpa = NA_real_,\n        home_team_raw_yac_wpa = NA_real_,\n        away_team_raw_yac_wpa = NA_real_,\n        total_home_raw_air_wpa = NA_real_,\n        total_away_raw_air_wpa = NA_real_,\n        total_home_raw_yac_wpa = NA_real_,\n        total_away_raw_yac_wpa = NA_real_\n      )\n    user_message(\n      \"No non-NA air_yards detected. 
air_yac_wp variables set to NA\",\n      \"info\"\n    )\n  } else {\n    out <- pbp |> add_air_yac_wp_variables()\n    user_message(\"added air_yac_wp variables\", \"done\")\n  }\n  return(out)\n}\n\n#get predictions for a set of pbp data\n#for predict stage of EP\nget_preds <- function(pbp) {\n  if (\"location\" %in% names(pbp)) {\n    pbp <- pbp |>\n      dplyr::mutate(\n        home = dplyr::if_else(.data$location == \"Neutral\", 0, .data$home)\n      )\n  }\n\n  preds <- stats::predict(\n    load_model(\"ep\"),\n    pbp |> ep_model_select() |> as.matrix()\n  )\n\n  # xgboost v3 returns a matrix of predictions instead of a vector as returned\n  # by xgboost v1.\n  if (is.vector(preds)) {\n    preds <- preds |>\n      matrix(ncol = 7, byrow = TRUE) |>\n      as.data.frame()\n  } else if (is.matrix(preds)) {\n    preds <- as.data.frame(preds)\n  }\n\n  colnames(preds) <- c(\n    \"Touchdown\",\n    \"Opp_Touchdown\",\n    \"Field_Goal\",\n    \"Opp_Field_Goal\",\n    \"Safety\",\n    \"Opp_Safety\",\n    \"No_Score\"\n  )\n\n  return(preds)\n}\n\n#get predictions for a set of pbp data\n#for predict stage\nget_preds_wp <- function(pbp) {\n  preds <- stats::predict(load_model(\"wp\"), as.matrix(pbp |> wp_model_select()))\n\n  return(preds)\n}\n\n#get predictions for a set of pbp data\n#for predict stage\nget_preds_wp_spread <- function(pbp) {\n  preds <- stats::predict(\n    load_model(\"wp_spread\"),\n    as.matrix(pbp |> wp_spread_model_select())\n  )\n\n  return(preds)\n}\n\n\n#get the columns needed for ep predictions\n#making sure they're in the right order\nep_model_select <- function(pbp) {\n  pbp <- pbp |>\n    dplyr::select(\n      \"half_seconds_remaining\",\n      \"yardline_100\",\n      \"home\",\n      \"retractable\",\n      \"dome\",\n      \"outdoors\",\n      \"ydstogo\",\n      \"era0\",\n      \"era1\",\n      \"era2\",\n      \"era3\",\n      \"era4\",\n      \"down1\",\n      \"down2\",\n      \"down3\",\n      \"down4\",\n      
\"posteam_timeouts_remaining\",\n      \"defteam_timeouts_remaining\",\n    )\n\n  return(pbp)\n}\n\n#get the columns needed for wp predictions\n#making sure they're in the right order\nwp_model_select <- function(pbp) {\n  pbp <- pbp |>\n    dplyr::select(\n      \"receive_2h_ko\",\n      \"home\",\n      \"half_seconds_remaining\",\n      \"game_seconds_remaining\",\n      \"Diff_Time_Ratio\",\n      \"score_differential\",\n      \"down\",\n      \"ydstogo\",\n      \"yardline_100\",\n      \"posteam_timeouts_remaining\",\n      \"defteam_timeouts_remaining\"\n    )\n\n  return(pbp)\n}\n\n#get the columns needed for wp predictions\n#making sure they're in the right order\nwp_spread_model_select <- function(pbp) {\n  pbp <- pbp |>\n    dplyr::select(\n      \"receive_2h_ko\",\n      \"spread_time\",\n      \"home\",\n      \"half_seconds_remaining\",\n      \"game_seconds_remaining\",\n      \"Diff_Time_Ratio\",\n      \"score_differential\",\n      \"down\",\n      \"ydstogo\",\n      \"yardline_100\",\n      \"posteam_timeouts_remaining\",\n      \"defteam_timeouts_remaining\"\n    )\n\n  return(pbp)\n}\nprepare_wp_data <- function(pbp) {\n  if (any(is.na(pbp$spread_line))) {\n    broken_games <- pbp |>\n      dplyr::filter(is.na(.data$spread_line)) |>\n      dplyr::pull(.data$game_id) |>\n      unique() |>\n      sort()\n    cli::cli_alert_danger(\n      \"The following game{?s} {?is/are} missing valid spread lines: {.val {broken_games}}.\"\n    )\n    cli::cli_alert_warning(\n      \"nflfastR will manually set the spread for the home team to {.val 1.5} points!\"\n    )\n    cli::cli_alert_warning(\n      \"If you see this, please reach out to the package maintainers {.url https://github.com/nflverse/nflfastR/issues}\"\n    )\n    pbp$spread_line[is.na(pbp$spread_line)] <- 1.5\n  }\n\n  pbp <- pbp |>\n    dplyr::group_by(.data$game_id) |>\n    dplyr::mutate(\n      receive_2h_ko = dplyr::if_else(\n        .data$qtr <= 2 &\n          .data$posteam == 
dplyr::first(stats::na.omit(.data$defteam)),\n        1,\n        0\n      )\n    ) |>\n    dplyr::ungroup() |>\n    dplyr::mutate(\n      posteam_spread = dplyr::if_else(\n        .data$home == 1,\n        .data$spread_line,\n        -1 * .data$spread_line\n      ),\n      elapsed_share = (3600 - .data$game_seconds_remaining) / 3600,\n      spread_time = .data$posteam_spread * exp(-4 * .data$elapsed_share),\n      Diff_Time_Ratio = .data$score_differential /\n        (exp(-4 * .data$elapsed_share))\n    )\n\n  return(pbp)\n}\n\n\n#add ep variables\n#All of these are heavily borrowed from nflscrapR (Maksim Horowitz, Ronald Yurko, and Samuel Ventura)\nadd_ep_variables <- function(pbp_data) {\n  #testing\n  #pbp_data <- g\n\n  #get_preds() is defined above\n  base_ep_preds <- get_preds(pbp_data)\n\n  # ----------------------------------------------------------------------------\n  # ---- special case: deal with FG attempts\n  # Now make another dataset to get the EP probabilities from a missed FG:\n  missed_fg_data <- pbp_data\n  # Subtract 5.065401 from half_seconds_remaining:\n  missed_fg_data$half_seconds_remaining <- missed_fg_data$half_seconds_remaining -\n    5.065401\n\n  # Correct the yardline_100:\n  missed_fg_data$yardline_100 <- 100 - (missed_fg_data$yardline_100 + 8)\n  # Now first down:\n  missed_fg_data$down1 <- rep(1, nrow(pbp_data))\n  missed_fg_data$down2 <- rep(0, nrow(pbp_data))\n  missed_fg_data$down3 <- rep(0, nrow(pbp_data))\n  missed_fg_data$down4 <- rep(0, nrow(pbp_data))\n  # 10 ydstogo:\n  missed_fg_data$ydstogo <- rep(10, nrow(pbp_data))\n\n  # Get the new predicted probabilities:\n  missed_fg_ep_preds <- get_preds(missed_fg_data)\n\n  # Find the rows where half_seconds_remaining became 0 or negative and make all the probs equal to 0:\n  end_game_i <- which(missed_fg_data$half_seconds_remaining <= 0)\n  missed_fg_ep_preds[end_game_i, ] <- rep(0, 
ncol(missed_fg_ep_preds))\n\n  # if the half ends, no one scored\n  missed_fg_ep_preds[end_game_i, \"No_Score\"] <- 1\n\n  # Get the probability of making the field goal:\n  make_fg_prob <- as.numeric(mgcv::predict.bam(\n    fastrmodels::fg_model,\n    newdata = pbp_data,\n    type = \"response\"\n  ))\n\n  # Multiply each value of missed_fg_ep_preds by 1 - make_fg_prob\n  missed_fg_ep_preds <- missed_fg_ep_preds * (1 - make_fg_prob)\n  # Find the FG attempts:\n  fg_attempt_i <- which(pbp_data$play_type == \"field_goal\")\n\n  # Now update the probabilities for the FG attempts (also includes Opp_Field_Goal probability from missed_fg_ep_preds)\n  base_ep_preds[fg_attempt_i, \"Field_Goal\"] <- make_fg_prob[fg_attempt_i] +\n    missed_fg_ep_preds[fg_attempt_i, \"Opp_Field_Goal\"]\n  # Update the other columns based on the opposite possession:\n  base_ep_preds[fg_attempt_i, \"Touchdown\"] <- missed_fg_ep_preds[\n    fg_attempt_i,\n    \"Opp_Touchdown\"\n  ]\n  base_ep_preds[fg_attempt_i, \"Opp_Field_Goal\"] <- missed_fg_ep_preds[\n    fg_attempt_i,\n    \"Field_Goal\"\n  ]\n  base_ep_preds[fg_attempt_i, \"Opp_Touchdown\"] <- missed_fg_ep_preds[\n    fg_attempt_i,\n    \"Touchdown\"\n  ]\n  base_ep_preds[fg_attempt_i, \"Safety\"] <- missed_fg_ep_preds[\n    fg_attempt_i,\n    \"Opp_Safety\"\n  ]\n  base_ep_preds[fg_attempt_i, \"Opp_Safety\"] <- missed_fg_ep_preds[\n    fg_attempt_i,\n    \"Safety\"\n  ]\n  base_ep_preds[fg_attempt_i, \"No_Score\"] <- missed_fg_ep_preds[\n    fg_attempt_i,\n    \"No_Score\"\n  ]\n\n  # ----------------------------------------------------------------------------------\n  # ---- special case: deal with kickoffs\n  # Calculate the EP for receiving a touchback (from the point of view of the receiving team)\n  # and update the columns for Kickoff plays:\n  kickoff_data <- pbp_data\n\n  # Change the yard line to be 80 for 2009-2015 and 75 otherwise\n  # (accounting for the fact that Jan 2016 is in the 2015 season):\n  
kickoff_data$yardline_100 <- with(kickoff_data, ifelse(season < 2016, 80, 75))\n  # Now first down:\n  kickoff_data$down1 <- rep(1, nrow(pbp_data))\n  kickoff_data$down2 <- rep(0, nrow(pbp_data))\n  kickoff_data$down3 <- rep(0, nrow(pbp_data))\n  kickoff_data$down4 <- rep(0, nrow(pbp_data))\n  # 10 ydstogo:\n  kickoff_data$ydstogo <- rep(10, nrow(pbp_data))\n\n  # Get the new predicted probabilities:\n  kickoff_preds <- get_preds(kickoff_data)\n\n  # Find the kickoffs:\n  kickoff_i <- which(\n    pbp_data$play_type == \"kickoff\" | pbp_data$kickoff_attempt == 1\n  )\n\n  # Now update the probabilities:\n  base_ep_preds[kickoff_i, \"Field_Goal\"] <- kickoff_preds[\n    kickoff_i,\n    \"Field_Goal\"\n  ]\n  base_ep_preds[kickoff_i, \"Touchdown\"] <- kickoff_preds[kickoff_i, \"Touchdown\"]\n  base_ep_preds[kickoff_i, \"Opp_Field_Goal\"] <- kickoff_preds[\n    kickoff_i,\n    \"Opp_Field_Goal\"\n  ]\n  base_ep_preds[kickoff_i, \"Opp_Touchdown\"] <- kickoff_preds[\n    kickoff_i,\n    \"Opp_Touchdown\"\n  ]\n  base_ep_preds[kickoff_i, \"Safety\"] <- kickoff_preds[kickoff_i, \"Safety\"]\n  base_ep_preds[kickoff_i, \"Opp_Safety\"] <- kickoff_preds[\n    kickoff_i,\n    \"Opp_Safety\"\n  ]\n  base_ep_preds[kickoff_i, \"No_Score\"] <- kickoff_preds[kickoff_i, \"No_Score\"]\n\n  # ----------------------------------------------------------------------------------\n  # Insert probabilities of 0 for everything but No_Score for QB Kneels that\n  # occur on the possession team's side of the field:\n  # Find these QB Kneels:\n  qb_kneels_i <- which(\n    pbp_data$play_type == \"qb_kneel\" & pbp_data$yardline_100 > 50\n  )\n\n  # Now update the probabilities:\n  base_ep_preds[qb_kneels_i, \"Field_Goal\"] <- 0\n  base_ep_preds[qb_kneels_i, \"Touchdown\"] <- 0\n  base_ep_preds[qb_kneels_i, \"Opp_Field_Goal\"] <- 0\n  base_ep_preds[qb_kneels_i, \"Opp_Touchdown\"] <- 0\n  base_ep_preds[qb_kneels_i, \"Safety\"] <- 0\n  base_ep_preds[qb_kneels_i, \"Opp_Safety\"] <- 0\n  
base_ep_preds[qb_kneels_i, \"No_Score\"] <- 1\n\n  # ----------------------------------------------------------------------------------\n  # Create two new columns, ExPoint_Prob and TwoPoint_Prob, for the PAT events:\n  base_ep_preds$ExPoint_Prob <- 0\n  base_ep_preds$TwoPoint_Prob <- 0\n\n  # Find the indices for these types of plays:\n  extrapoint_i <- which(\n    (pbp_data$play_type == \"extra_point\" |\n      pbp_data$play_type_nfl == \"XP_KICK\") &\n      (is.na(pbp_data$play_type_nfl) | pbp_data$play_type_nfl != \"PAT2\")\n  )\n  twopoint_i <- which(pbp_data$two_point_attempt == 1)\n\n  #new: special case for PAT or kickoff with penalty\n  #for inserting NAs\n  st_penalty_i_1 <- which(\n    # pat: prior play was TD or PAT or Timeout and next play is PAT and this play isn't a td and it's not a regular down\n    (pbp_data$touchdown == 0 &\n      is.na(pbp_data$down) &\n      (dplyr::lag(pbp_data$touchdown) == 1 |\n        dplyr::lag(pbp_data$play_type_nfl) == \"XP_KICK\" |\n        dplyr::lag(pbp_data$timeout) == 1) &\n      (dplyr::lead(pbp_data$two_point_attempt) == 1 |\n        dplyr::lead(pbp_data$extra_point_attempt) == 1 |\n        dplyr::lead(pbp_data$play_type_nfl) == \"XP_KICK\")) |\n      #kickoff: prior play was PAT and next play is kickoff\n      ((dplyr::lag(pbp_data$two_point_attempt) == 1 |\n        dplyr::lag(pbp_data$extra_point_attempt) == 1) &\n        dplyr::lead(pbp_data$kickoff_attempt == 1))\n  )\n\n  st_penalty_i_2 <- which(\n    is.na(dplyr::lead(pbp_data$down)) &\n      # has a key term in desc\n      (((stringr::str_detect(pbp_data$desc, 'Kick formation') &\n        is.na(pbp_data$down) &\n        pbp_data$play_type == 'no_play') |\n        (stringr::str_detect(pbp_data$desc, 'Pass formation') &\n          is.na(pbp_data$down) &\n          pbp_data$play_type == 'no_play') |\n        (stringr::str_detect(pbp_data$desc, 'kicks onside') &\n          is.na(pbp_data$down) &\n          pbp_data$play_type == 'no_play') |\n        
(stringr::str_detect(pbp_data$desc, 'Offside on Free Kick') &\n          is.na(pbp_data$down) &\n          pbp_data$play_type == 'no_play') |\n        (stringr::str_detect(pbp_data$desc, 'TWO-POINT CONVERSION')) &\n          # down is NA and play type no play and next play isn't a kickoff\n          is.na(pbp_data$down) &\n          pbp_data$play_type == 'no_play' &\n          dplyr::lead(pbp_data$kickoff_attempt) == 0))\n  )\n\n  # Assign the make_fg_probs of the extra-point PATs:\n  base_ep_preds$ExPoint_Prob[extrapoint_i] <- make_fg_prob[extrapoint_i]\n\n  # Assign the TwoPoint_Prob with the historical success rate:\n  base_ep_preds$TwoPoint_Prob[twopoint_i] <- 0.4735\n\n  # ----------------------------------------------------------------------------------\n  # Insert NAs for timeouts and end of play rows:\n  missing_i <- which(\n    (pbp_data$timeout == 1 &\n      pbp_data$play_type == \"no_play\" &\n      !stringr::str_detect(pbp_data$desc, ' pass ') &\n      !stringr::str_detect(pbp_data$desc, ' sacked ') &\n      !stringr::str_detect(pbp_data$desc, ' scramble ') &\n      !stringr::str_detect(pbp_data$desc, ' punts ') &\n      !stringr::str_detect(pbp_data$desc, ' up the middle ') &\n      !stringr::str_detect(pbp_data$desc, ' left end ') &\n      !stringr::str_detect(pbp_data$desc, ' left guard ') &\n      !stringr::str_detect(pbp_data$desc, ' left tackle ') &\n      !stringr::str_detect(pbp_data$desc, ' right end ') &\n      !stringr::str_detect(pbp_data$desc, ' right guard ') &\n      !stringr::str_detect(pbp_data$desc, ' right tackle ')) |\n      is.na(pbp_data$play_type)\n  )\n\n  # Now update the probabilities for missing and PATs:\n  base_ep_preds$Field_Goal[c(\n    missing_i,\n    extrapoint_i,\n    twopoint_i,\n    st_penalty_i_1,\n    st_penalty_i_2\n  )] <- 0\n  base_ep_preds$Touchdown[c(\n    missing_i,\n    extrapoint_i,\n    twopoint_i,\n    st_penalty_i_1,\n    st_penalty_i_2\n  )] <- 0\n  base_ep_preds$Opp_Field_Goal[c(\n    missing_i,\n    
extrapoint_i,\n    twopoint_i,\n    st_penalty_i_1,\n    st_penalty_i_2\n  )] <- 0\n  base_ep_preds$Opp_Touchdown[c(\n    missing_i,\n    extrapoint_i,\n    twopoint_i,\n    st_penalty_i_1,\n    st_penalty_i_2\n  )] <- 0\n  base_ep_preds$Safety[c(\n    missing_i,\n    extrapoint_i,\n    twopoint_i,\n    st_penalty_i_1,\n    st_penalty_i_2\n  )] <- 0\n  base_ep_preds$Opp_Safety[c(\n    missing_i,\n    extrapoint_i,\n    twopoint_i,\n    st_penalty_i_1,\n    st_penalty_i_2\n  )] <- 0\n  base_ep_preds$No_Score[c(\n    missing_i,\n    extrapoint_i,\n    twopoint_i,\n    st_penalty_i_1,\n    st_penalty_i_2\n  )] <- 0\n\n  # Rename the events to all have _Prob at the end of them:\n  base_ep_preds <- dplyr::rename(\n    base_ep_preds,\n    Field_Goal_Prob = \"Field_Goal\",\n    Touchdown_Prob = \"Touchdown\",\n    Opp_Field_Goal_Prob = \"Opp_Field_Goal\",\n    Opp_Touchdown_Prob = \"Opp_Touchdown\",\n    Safety_Prob = \"Safety\",\n    Opp_Safety_Prob = \"Opp_Safety\",\n    No_Score_Prob = \"No_Score\"\n  )\n\n  # Join them together:\n  pbp_data <- cbind(pbp_data, base_ep_preds)\n\n  # Calculate the ExpPts:\n  pbp_data_ep <- dplyr::mutate(\n    pbp_data,\n    ExpPts = (0 * .data$No_Score_Prob) +\n      (-3 * .data$Opp_Field_Goal_Prob) +\n      (-2 * .data$Opp_Safety_Prob) +\n      (-7 * .data$Opp_Touchdown_Prob) +\n      (3 * .data$Field_Goal_Prob) +\n      (2 * .data$Safety_Prob) +\n      (7 * .data$Touchdown_Prob) +\n      (1 * .data$ExPoint_Prob) +\n      (2 * .data$TwoPoint_Prob)\n  )\n\n  #just going to set these to NA bc we have no way of calculating EPA for them\n  if (length(st_penalty_i_1) > 0) {\n    pbp_data_ep$ExpPts[st_penalty_i_1] <- NA_real_\n  }\n\n  if (length(st_penalty_i_2) > 0) {\n    pbp_data_ep$ExpPts[st_penalty_i_2] <- NA_real_\n  }\n\n  pbp_data_ep$ExpPts[missing_i] <- NA_real_\n\n  #################################################################\n  # Calculate EPA:\n\n  ### Adding Expected Points Added (EPA) column\n\n  # Create multiple types of 
EPA columns\n  # for each of the possible cases,\n  # grouping by GameID (will then just use\n  # an ifelse statement to decide which one\n  # to use as the final EPA):\n  pbp_data_ep |>\n    dplyr::group_by(.data$game_id) |>\n    dplyr::mutate(\n      # Now conditionally assign the EPA, first for possession team\n      # touchdowns:\n      ep = .data$ExpPts,\n      tmp_posteam = .data$posteam\n    ) |>\n    tidyr::fill(\n      .data$ep,\n      .direction = \"up\"\n    ) |>\n    tidyr::fill(\n      .data$tmp_posteam,\n      .direction = \"up\"\n    ) |>\n    dplyr::mutate(\n      # get epa for non-scoring plays\n      home_ep = dplyr::if_else(\n        .data$tmp_posteam == .data$home_team,\n        .data$ep,\n        -.data$ep\n      ),\n      home_epa = dplyr::lead(.data$home_ep) - .data$home_ep,\n      epa = dplyr::if_else(\n        .data$tmp_posteam == .data$home_team,\n        .data$home_epa,\n        -.data$home_epa\n      ),\n\n      # td\n      epa = dplyr::if_else(\n        !is.na(.data$td_team),\n        dplyr::if_else(\n          .data$td_team == .data$posteam,\n          7 - .data$ep,\n          -7 - .data$ep\n        ),\n        .data$epa\n      ),\n      # Offense field goal:\n      epa = dplyr::if_else(\n        is.na(.data$td_team) & .data$field_goal_made == 1,\n        3 - .data$ep,\n        .data$epa,\n        missing = .data$epa\n      ),\n      # Offense extra-point:\n      epa = dplyr::if_else(\n        is.na(.data$td_team) &\n          .data$field_goal_made == 0 &\n          .data$extra_point_good == 1,\n        1 - .data$ep,\n        .data$epa,\n        missing = .data$epa\n      ),\n      # Offense two-point conversion:\n      epa = dplyr::if_else(\n        is.na(.data$td_team) &\n          .data$field_goal_made == 0 &\n          .data$extra_point_good == 0 &\n          (.data$two_point_rush_good == 1 |\n            .data$two_point_pass_good == 1 |\n            .data$two_point_pass_reception_good == 1),\n        2 - .data$ep,\n        
.data$epa,\n        missing = .data$epa\n      ),\n      # Failed PAT (both 1 and 2):\n      epa = dplyr::if_else(\n        is.na(.data$td_team) &\n          .data$field_goal_made == 0 &\n          .data$extra_point_good == 0 &\n          ((.data$extra_point_failed == 1 |\n            .data$extra_point_blocked == 1 |\n            .data$extra_point_aborted == 1) |\n            (.data$two_point_rush_failed == 1 |\n              .data$two_point_pass_failed == 1 |\n              .data$two_point_pass_reception_failed == 1)),\n        0 - .data$ep,\n        .data$epa,\n        missing = .data$epa\n      ),\n      # Opponent scores defensive 2 point:\n      epa = dplyr::if_else(\n        .data$defensive_two_point_conv == 1,\n        -2 - .data$ep,\n        .data$epa,\n        missing = .data$epa\n      ),\n      # Safety:\n      epa = dplyr::case_when(\n        !is.na(.data$safety_team) & .data$safety_team == .data$posteam ~ 2 -\n          .data$ep,\n        !is.na(.data$safety_team) & .data$safety_team == .data$defteam ~ -2 -\n          .data$ep,\n        TRUE ~ .data$epa\n      )\n    ) |>\n    # Now rename each of the expected points columns to match the style of\n    # the updated code:\n    dplyr::rename(\n      no_score_prob = \"No_Score_Prob\",\n      opp_fg_prob = \"Opp_Field_Goal_Prob\",\n      opp_safety_prob = \"Opp_Safety_Prob\",\n      opp_td_prob = \"Opp_Touchdown_Prob\",\n      fg_prob = \"Field_Goal_Prob\",\n      safety_prob = \"Safety_Prob\",\n      td_prob = \"Touchdown_Prob\",\n      extra_point_prob = \"ExPoint_Prob\",\n      two_point_conversion_prob = \"TwoPoint_Prob\"\n    ) |>\n    # Create columns with cumulative epa totals for both teams:\n    dplyr::mutate(\n      # helper for end of game\n      end_game = ifelse(\n        stringr::str_detect(tolower(.data$desc), \"(end of game)|(end game)\"),\n        1,\n        0\n      ),\n\n      # Change epa for plays occurring at end of half with no scoring\n      # plays to be just the difference 
between 0 and starting ep:\n      epa = dplyr::if_else(\n        ((.data$qtr == 2 &\n          (dplyr::lead(.data$qtr) == 3 |\n            dplyr::lead(.data$desc) == \"END QUARTER 2\")) |\n          (.data$qtr == 4 &\n            (dplyr::lead(.data$qtr) == 5 |\n              dplyr::lead(.data$desc) == \"END QUARTER 4\" |\n              dplyr::lead(.data$end_game) == 1))) &\n          .data$sp == 0 &\n          !is.na(.data$play_type),\n        0 - .data$ep,\n        .data$epa\n      ),\n      # last play of OT\n      epa = dplyr::if_else(\n        .data$qtr > 4 & dplyr::lead(.data$end_game) == 1 & .data$sp == 0,\n        0 - .data$ep,\n        .data$epa\n      ),\n      epa = dplyr::if_else(.data$desc == \"END QUARTER 2\", NA_real_, .data$epa),\n      epa = dplyr::if_else(.data$end_game == 1, NA_real_, .data$epa),\n      ep = dplyr::if_else(.data$desc == \"END QUARTER 2\", NA_real_, .data$ep),\n      ep = dplyr::if_else(.data$end_game == 1, NA_real_, .data$ep),\n      home_team_epa = dplyr::if_else(\n        .data$posteam == .data$home_team,\n        .data$epa,\n        -.data$epa\n      ),\n      away_team_epa = dplyr::if_else(\n        .data$posteam == .data$away_team,\n        .data$epa,\n        -.data$epa\n      ),\n      home_team_epa = dplyr::if_else(\n        is.na(.data$home_team_epa),\n        0,\n        .data$home_team_epa\n      ),\n      away_team_epa = dplyr::if_else(\n        is.na(.data$away_team_epa),\n        0,\n        .data$away_team_epa\n      ),\n      total_home_epa = cumsum(.data$home_team_epa),\n      total_away_epa = cumsum(.data$away_team_epa),\n      # Same thing but separating passing and rushing:\n      home_team_rush_epa = dplyr::if_else(\n        .data$play_type == \"run\",\n        .data$home_team_epa,\n        0\n      ),\n      away_team_rush_epa = dplyr::if_else(\n        .data$play_type == \"run\",\n        .data$away_team_epa,\n        0\n      ),\n      home_team_rush_epa = dplyr::if_else(\n        
is.na(.data$home_team_rush_epa),\n        0,\n        .data$home_team_rush_epa\n      ),\n      away_team_rush_epa = dplyr::if_else(\n        is.na(.data$away_team_rush_epa),\n        0,\n        .data$away_team_rush_epa\n      ),\n      total_home_rush_epa = cumsum(.data$home_team_rush_epa),\n      total_away_rush_epa = cumsum(.data$away_team_rush_epa),\n      home_team_pass_epa = dplyr::if_else(\n        .data$play_type == \"pass\",\n        .data$home_team_epa,\n        0\n      ),\n      away_team_pass_epa = dplyr::if_else(\n        .data$play_type == \"pass\",\n        .data$away_team_epa,\n        0\n      ),\n      home_team_pass_epa = dplyr::if_else(\n        is.na(.data$home_team_pass_epa),\n        0,\n        .data$home_team_pass_epa\n      ),\n      away_team_pass_epa = dplyr::if_else(\n        is.na(.data$away_team_pass_epa),\n        0,\n        .data$away_team_pass_epa\n      ),\n      total_home_pass_epa = cumsum(.data$home_team_pass_epa),\n      total_away_pass_epa = cumsum(.data$away_team_pass_epa)\n    ) |>\n    dplyr::ungroup()\n}\n\n\n#################################################################\n# Calculate WP and WPA:\nadd_wp_variables <- function(pbp_data) {\n  #testing only\n  # pbp_data <- g\n\n  # Initialize the df to store predicted win probability\n  OffWinProb <- rep(NA_real_, nrow(pbp_data))\n  OffWinProb_spread <- rep(NA_real_, nrow(pbp_data))\n\n  pbp_data <- pbp_data |>\n    prepare_wp_data()\n\n  # First check if there's any overtime plays:\n  if (any(pbp_data$qtr > 4)) {\n    # Find the rows that are overtime:\n    overtime_i <- which(pbp_data$qtr > 4)\n\n    # Separate the dataset into regular_df and overtime_df:\n    overtime_df <- pbp_data[overtime_i, ]\n\n    # Separate routine for overtime:\n\n    # Create a column that is just the first drive of overtime repeated:\n    overtime_df$First_Drive <- rep(\n      min(overtime_df$drive, na.rm = TRUE),\n      nrow(overtime_df)\n    )\n\n    # Calculate the difference in drive 
number\n    overtime_df <- dplyr::mutate(\n      overtime_df,\n      Drive_Diff = .data$drive - .data$First_Drive\n    )\n\n    # Create an indicator column that means the posteam is losing by 3 and\n    # its the second drive of overtime:\n    overtime_df$One_FG_Game <- ifelse(\n      overtime_df$score_differential == -3 &\n        overtime_df$Drive_Diff == 1,\n      1,\n      0\n    )\n\n    # Now create a copy of the dataset to then make the EP predictions for when\n    # a field goal is scored and its not sudden death:\n    overtime_df_ko <- overtime_df\n\n    overtime_df_ko$yrdline100 <- with(\n      overtime_df_ko,\n      ifelse(\n        game_year < 2016 |\n          (game_year == 2016 & game_month < 4),\n        80,\n        75\n      )\n    )\n\n    # Now first down:\n    overtime_df_ko$down1 <- rep(1, nrow(overtime_df_ko))\n    overtime_df_ko$down2 <- rep(0, nrow(overtime_df_ko))\n    overtime_df_ko$down3 <- rep(0, nrow(overtime_df_ko))\n    overtime_df_ko$down4 <- rep(0, nrow(overtime_df_ko))\n    # 10 ydstogo:\n    overtime_df_ko$ydstogo <- rep(10, nrow(overtime_df_ko))\n\n    # Get the predictions from the EP model and calculate the necessary probability:\n    overtime_df_ko_preds <- get_preds(overtime_df_ko)\n\n    overtime_df_ko_preds <- dplyr::mutate(\n      overtime_df_ko_preds,\n      Win_Back = .data$No_Score +\n        .data$Opp_Field_Goal +\n        .data$Opp_Safety +\n        .data$Opp_Touchdown\n    )\n\n    # Calculate the two possible win probability types, Sudden Death and one Field Goal:\n    overtime_df$Sudden_Death_WP <- overtime_df$fg_prob +\n      overtime_df$td_prob +\n      overtime_df$safety_prob\n    overtime_df$One_FG_WP <- overtime_df$td_prob +\n      (overtime_df$fg_prob * overtime_df_ko_preds$Win_Back)\n\n    # Decide which win probability to use:\n    OffWinProb[overtime_i] <- ifelse(\n      overtime_df$game_year >= 2012 &\n        (overtime_df$Drive_Diff == 0 |\n          (overtime_df$Drive_Diff == 1 & 
overtime_df$One_FG_Game == 1)),\n      overtime_df$One_FG_WP,\n      overtime_df$Sudden_Death_WP\n    )\n    OffWinProb_spread[overtime_i] <- OffWinProb[overtime_i]\n  }\n\n  #regulation plays\n  regular_i <- which(pbp_data$qtr <= 4)\n\n  # df of just the regulation plays:\n  regular_df <- pbp_data[regular_i, ]\n\n  # do predictions for the regular df\n  OffWinProb[regular_i] <- get_preds_wp(regular_df)\n  OffWinProb_spread[regular_i] <- get_preds_wp_spread(regular_df)\n\n  ## set to NA WP for plays down is missing\n  # for kickoffs and PATs, these will get overwritten by the fixes after this\n\n  down_na <- which(is.na(pbp_data$down))\n  OffWinProb[down_na] <- NA_real_\n  OffWinProb_spread[down_na] <- NA_real_\n\n  ## start PAT fix\n\n  make_pat_prob <- as.numeric(\n    mgcv::predict.bam(\n      fastrmodels::fg_model,\n      newdata = pbp_data |>\n        mutate(\n          yardline_100 = ifelse(.data$season >= 2015, 15, 3)\n        ),\n      type = \"response\"\n    )\n  )\n\n  # plays with 1 point PAT attempts\n  pat_i <- which(\n    (pbp_data$kickoff_attempt == 0 &\n      !(stringr::str_detect(pbp_data$desc, 'Onside Kick')) &\n      (stringr::str_detect(pbp_data$desc, 'Kick formation')) &\n      is.na(pbp_data$down)) |\n      # or has PAT indicators\n      stringr::str_detect(pbp_data$desc, 'extra point') |\n      !is.na(pbp_data$extra_point_result)\n  )\n\n  # plays with 2 point PAT attempts\n  two_pt_i <- which(\n    (pbp_data$kickoff_attempt == 0 &\n      !(stringr::str_detect(pbp_data$desc, 'Onside Kick')) &\n      (stringr::str_detect(pbp_data$desc, 'Pass formation')) &\n      is.na(pbp_data$down)) |\n      # or has PAT indicators\n      stringr::str_detect(pbp_data$desc, 'TWO-POINT CONVERSION ATTEMPT') |\n      !is.na(pbp_data$two_point_conv_result)\n  )\n\n  # some rare 2 point PAT attempts have duplicated matches in 1 point PAT attempts\n  # so we remove them in the next line\n  pat_i <- pat_i[!pat_i %in% two_pt_i]\n\n  # make df of post-PAT plays\n  
pat_data <- pbp_data |>\n    dplyr::mutate(\n      # swap timeouts\n      to_pos = .data$posteam_timeouts_remaining,\n      to_def = .data$defteam_timeouts_remaining,\n      posteam_timeouts_remaining = .data$to_def,\n      defteam_timeouts_remaining = .data$to_pos,\n      # swap score\n      score_differential = -.data$score_differential,\n      # 1st and 10\n      down = 1,\n      ydstogo = 10,\n      # flip receive_2h_ko var\n      receive_2h_ko = case_when(\n        .data$qtr <= 2 & .data$receive_2h_ko == 0 ~ 1,\n        .data$qtr <= 2 & .data$receive_2h_ko == 1 ~ 0,\n        TRUE ~ .data$receive_2h_ko\n      ),\n      # switch posteam\n      posteam = if_else(\n        .data$home_team == .data$posteam,\n        .data$away_team,\n        .data$home_team\n      ),\n      yardline_100 = 75\n    ) |>\n    dplyr::mutate(\n      home = case_when(\n        .data$home == 0 ~ 1,\n        .data$home == 1 ~ 0\n      ),\n      posteam_spread = dplyr::if_else(\n        .data$home == 1,\n        .data$spread_line,\n        -1 * .data$spread_line\n      ),\n      elapsed_share = (3600 - .data$game_seconds_remaining) / 3600,\n      spread_time = .data$posteam_spread * exp(-4 * .data$elapsed_share)\n    )\n\n  ## start with spread version\n  # get pat if 0, 1, or 2\n  pat_0 <- get_preds_wp_spread(pat_data |> add_esdtr())\n  pat_1 <- get_preds_wp_spread(\n    pat_data |>\n      dplyr::mutate(score_differential = .data$score_differential - 1) |>\n      add_esdtr()\n  )\n  pat_2 <- get_preds_wp_spread(\n    pat_data |>\n      dplyr::mutate(score_differential = .data$score_differential - 2) |>\n      add_esdtr()\n  )\n\n  # Using nflscrapR version of 2pt make prob on 2nd line here\n  pat_go_for_1 <- 1 - (make_pat_prob * pat_1 + (1 - make_pat_prob) * pat_0)\n  pat_go_for_2 <- 1 - (0.4735 * pat_2 + (1 - 0.4735) * pat_0)\n\n  OffWinProb_spread[two_pt_i] <- pat_go_for_2[two_pt_i]\n  OffWinProb_spread[pat_i] <- pat_go_for_1[pat_i]\n\n  ## repeat for non-spread version\n  # get pat if 
0, 1, or 2\n  pat_0 <- get_preds_wp(pat_data |> add_esdtr())\n  pat_1 <- get_preds_wp(\n    pat_data |>\n      dplyr::mutate(score_differential = .data$score_differential - 1) |>\n      add_esdtr()\n  )\n  pat_2 <- get_preds_wp(\n    pat_data |>\n      dplyr::mutate(score_differential = .data$score_differential - 2) |>\n      add_esdtr()\n  )\n\n  # Using nflscrapR version of 2pt make prob on 2nd line here\n  pat_go_for_1 <- 1 - (make_pat_prob * pat_1 + (1 - make_pat_prob) * pat_0)\n  pat_go_for_2 <- 1 - (0.4735 * pat_2 + (1 - 0.4735) * pat_0)\n\n  OffWinProb[two_pt_i] <- pat_go_for_2[two_pt_i]\n  OffWinProb[pat_i] <- pat_go_for_1[pat_i]\n\n  ## end PAT fix\n\n  ## now we need to fix WP on kickoffs, which will be WP associated with touchback\n  kickoff_data <- pbp_data\n\n  # Change the yard line to be 80 for 2009-2015 and 75 otherwise\n  kickoff_data$yardline_100 <- with(kickoff_data, ifelse(season < 2016, 80, 75))\n  # Now first down:\n  kickoff_data$down <- rep(1, nrow(pbp_data))\n  kickoff_data$down1 <- rep(1, nrow(pbp_data))\n  kickoff_data$down2 <- rep(0, nrow(pbp_data))\n  kickoff_data$down3 <- rep(0, nrow(pbp_data))\n  kickoff_data$down4 <- rep(0, nrow(pbp_data))\n  # 10 ydstogo:\n  kickoff_data$ydstogo <- rep(10, nrow(pbp_data))\n\n  # Get the new predicted probabilities:\n  kickoff_preds <- get_preds_wp(kickoff_data)\n  kickoff_preds_spread <- get_preds_wp_spread(kickoff_data)\n\n  # Find the kickoffs in regulation:\n  kickoff_i <- which(\n    (pbp_data$play_type == \"kickoff\" | pbp_data$kickoff_attempt == 1) &\n      pbp_data$qtr <= 4\n  )\n\n  # Now update the probabilities:\n  OffWinProb[kickoff_i] <- kickoff_preds[kickoff_i]\n  OffWinProb_spread[kickoff_i] <- kickoff_preds_spread[kickoff_i]\n\n  ## end fix for kickoffs\n\n  # Now create the win probability columns and return:\n  pbp_data <- pbp_data |>\n    dplyr::mutate(\n      wp = OffWinProb,\n      vegas_wp = OffWinProb_spread,\n      # for figuring out posteam on NA posteam lines\n      
tmp_posteam = .data$posteam\n    ) |>\n    tidyr::fill(\n      .data$wp,\n      .direction = \"up\"\n    ) |>\n    tidyr::fill(\n      .data$vegas_wp,\n      .direction = \"up\"\n    ) |>\n    tidyr::fill(\n      .data$tmp_posteam,\n      .direction = \"up\"\n    ) |>\n    dplyr::group_by(.data$game_id) |>\n    dplyr::mutate(\n      #add columns for home WP\n      home_wp = dplyr::if_else(\n        .data$tmp_posteam == .data$home_team,\n        .data$wp,\n        1 - .data$wp\n      ),\n      vegas_home_wp = dplyr::if_else(\n        .data$tmp_posteam == .data$home_team,\n        .data$vegas_wp,\n        1 - .data$vegas_wp\n      ),\n\n      # convenience to mark end of game\n      end_game = ifelse(\n        stringr::str_detect(tolower(.data$desc), \"(end of game)|(end game)\"),\n        1,\n        0\n      ),\n\n      # convenience for marking home win prob on last line\n      final_value = dplyr::case_when(\n        .data$home_score > .data$away_score ~ 1,\n        .data$away_score > .data$home_score ~ 0,\n        .data$home_score == .data$away_score ~ .5\n      ),\n\n      #make 1 or 0 the final win prob\n      vegas_home_wp = dplyr::if_else(\n        .data$end_game == 1,\n        .data$final_value,\n        .data$vegas_home_wp\n      ),\n\n      # can we make this and the above into a function? 
feels like a lot of repetition\n      home_wp = dplyr::if_else(\n        .data$end_game == 1,\n        .data$final_value,\n        .data$home_wp\n      ),\n\n      away_wp = 1 - .data$home_wp,\n\n      # make wp of posteam on last line NA because there's no posteam\n      vegas_wp = dplyr::if_else(\n        .data$end_game == 1,\n        NA_real_,\n        .data$vegas_wp\n      ),\n\n      wp = dplyr::if_else(\n        .data$end_game == 1,\n        NA_real_,\n        .data$wp\n      ),\n\n      def_wp = 1 - .data$wp,\n\n      # make wpa\n      vegas_home_wpa = dplyr::lead(.data$vegas_home_wp) - .data$vegas_home_wp,\n      vegas_wpa = dplyr::if_else(\n        .data$tmp_posteam == .data$home_team,\n        .data$vegas_home_wpa,\n        -.data$vegas_home_wpa\n      ),\n      vegas_wpa = dplyr::if_else(\n        stringr::str_detect(\n          tolower(.data$desc),\n          \"( kneels )|(end of game)|(end game)\"\n        ),\n        NA_real_,\n        .data$vegas_wpa\n      ),\n\n      # home wpa isn't saved but needed for next line\n      home_wpa = dplyr::lead(.data$home_wp) - .data$home_wp,\n      wpa = dplyr::if_else(\n        .data$tmp_posteam == .data$home_team,\n        .data$home_wpa,\n        -.data$home_wpa\n      ),\n      wpa = dplyr::if_else(\n        stringr::str_detect(\n          tolower(.data$desc),\n          \"( kneels )|(end of game)|(end game)\"\n        ),\n        NA_real_,\n        .data$wpa\n      )\n    ) |>\n    dplyr::ungroup()\n\n  # Home and Away post:\n\n  pbp_data$home_wp_post <- ifelse(\n    pbp_data$posteam == pbp_data$home_team,\n    pbp_data$home_wp + pbp_data$wpa,\n    pbp_data$home_wp - pbp_data$wpa\n  )\n  pbp_data$away_wp_post <- ifelse(\n    pbp_data$posteam == pbp_data$away_team,\n    pbp_data$away_wp + pbp_data$wpa,\n    pbp_data$away_wp - pbp_data$wpa\n  )\n\n  # If next thing is end of game, and post score differential is tied because it's\n  # overtime then make both the home_wp_post and away_wp_post equal to 0:\n  
pbp_data <- pbp_data |>\n    dplyr::mutate(\n      home_wp_post = dplyr::if_else(\n        .data$qtr == 5 &\n          stringr::str_detect(\n            tolower(dplyr::lead(.data$desc)),\n            \"(end of game)|(end game)\"\n          ) &\n          .data$score_differential_post == 0,\n        0,\n        .data$home_wp_post\n      ),\n      away_wp_post = dplyr::if_else(\n        .data$qtr == 5 &\n          stringr::str_detect(\n            tolower(dplyr::lead(.data$desc)),\n            \"(end of game)|(end game)\"\n          ) &\n          .data$score_differential_post == 0,\n        0,\n        .data$away_wp_post\n      )\n    )\n\n  # For plays with playtype of End of Game, use the previous play's WP_post columns\n  # as the pre and post, since those are already set to be 1 and 0:\n\n  pbp_data$home_wp_post <- with(\n    pbp_data,\n    ifelse(\n      stringr::str_detect(tolower(desc), \"(end of game)|(end game)\"),\n      dplyr::lag(home_wp_post),\n      ifelse(\n        dplyr::lag(play_type) == \"no_play\" & play_type == \"no_play\",\n        dplyr::lag(home_wp_post),\n        home_wp_post\n      )\n    )\n  )\n\n  pbp_data$away_wp_post <- with(\n    pbp_data,\n    ifelse(\n      stringr::str_detect(tolower(desc), \"(end of game)|(end game)\"),\n      dplyr::lag(away_wp_post),\n      ifelse(\n        dplyr::lag(play_type) == \"no_play\" & play_type == \"no_play\",\n        dplyr::lag(away_wp_post),\n        away_wp_post\n      )\n    )\n  )\n\n  # Now drop the unnecessary columns, rename variables back, and return:\n  pbp_data |>\n    dplyr::group_by(.data$game_id) |>\n    dplyr::mutate(\n      # Generate columns to keep track of cumulative rushing and\n      # passing WPA values:\n      home_team_wpa = dplyr::if_else(\n        .data$posteam == .data$home_team,\n        .data$wpa,\n        -.data$wpa\n      ),\n      away_team_wpa = dplyr::if_else(\n        .data$posteam == .data$away_team,\n        .data$wpa,\n        -.data$wpa\n      ),\n      
home_team_wpa = dplyr::if_else(\n        is.na(.data$home_team_wpa),\n        0,\n        .data$home_team_wpa\n      ),\n      away_team_wpa = dplyr::if_else(\n        is.na(.data$away_team_wpa),\n        0,\n        .data$away_team_wpa\n      ),\n      # Same thing but separating passing and rushing:\n      home_team_rush_wpa = dplyr::if_else(\n        .data$play_type == \"run\",\n        .data$home_team_wpa,\n        0\n      ),\n      away_team_rush_wpa = dplyr::if_else(\n        .data$play_type == \"run\",\n        .data$away_team_wpa,\n        0\n      ),\n      home_team_rush_wpa = dplyr::if_else(\n        is.na(.data$home_team_rush_wpa),\n        0,\n        .data$home_team_rush_wpa\n      ),\n      away_team_rush_wpa = dplyr::if_else(\n        is.na(.data$away_team_rush_wpa),\n        0,\n        .data$away_team_rush_wpa\n      ),\n      total_home_rush_wpa = cumsum(.data$home_team_rush_wpa),\n      total_away_rush_wpa = cumsum(.data$away_team_rush_wpa),\n      home_team_pass_wpa = dplyr::if_else(\n        .data$play_type == \"pass\",\n        .data$home_team_wpa,\n        0\n      ),\n      away_team_pass_wpa = dplyr::if_else(\n        .data$play_type == \"pass\",\n        .data$away_team_wpa,\n        0\n      ),\n      home_team_pass_wpa = dplyr::if_else(\n        is.na(.data$home_team_pass_wpa),\n        0,\n        .data$home_team_pass_wpa\n      ),\n      away_team_pass_wpa = dplyr::if_else(\n        is.na(.data$away_team_pass_wpa),\n        0,\n        .data$away_team_pass_wpa\n      ),\n      total_home_pass_wpa = cumsum(.data$home_team_pass_wpa),\n      total_away_pass_wpa = cumsum(.data$away_team_pass_wpa)\n    ) |>\n    dplyr::ungroup()\n}\n\n\n# helper function to get expected score diff to time ratio\n# needed after flipping teams in WP for getting PAT WP\nadd_esdtr <- function(data) {\n  data |>\n    dplyr::mutate(\n      Diff_Time_Ratio = .data$score_differential /\n        (exp(-4 * .data$elapsed_share))\n    
)\n}\n\n\n#################################################################\n# air and YAC EP:\n# as with the rest, heavily borrowed from nflscrapR:\n# https://github.com/maksimhorowitz/nflscrapR/blob/master/R/add_ep_wp_variables.R\nadd_air_yac_ep_variables <- function(pbp_data) {\n  #testing\n  #pbp_data <- g\n\n  # Final all pass attempts that are not sacks:\n  pass_plays_i <- which(\n    !is.na(pbp_data$air_yards) & pbp_data$play_type == 'pass'\n  )\n  pass_pbp_data <- pbp_data[pass_plays_i, ]\n\n  # Using the air_yards need to update the following:\n  # - yrdline100\n  # - TimeSecs_Remaining\n  # - ydstogo\n  # - down\n  # - timeouts\n\n  # Get everything set up for calculation\n  pass_pbp_data <- pass_pbp_data |>\n    dplyr::mutate(\n      posteam_timeouts_pre = .data$posteam_timeouts_remaining,\n      defeam_timeouts_pre = .data$defteam_timeouts_remaining\n    ) |>\n    # Rename the old columns to update for calculating the EP from the air:\n    dplyr::rename(\n      old_yrdline100 = .data$yardline_100,\n      old_ydstogo = .data$ydstogo,\n      old_TimeSecs_Remaining = .data$half_seconds_remaining,\n      old_down = .data$down\n    ) |>\n    dplyr::mutate(\n      Turnover_Ind = dplyr::if_else(\n        .data$old_down == 4 & .data$air_yards < .data$old_ydstogo,\n        1,\n        0\n      ),\n      yardline_100 = dplyr::if_else(\n        .data$Turnover_Ind == 0,\n        .data$old_yrdline100 - .data$air_yards,\n        100 - (.data$old_yrdline100 - .data$air_yards)\n      ),\n      ydstogo = dplyr::if_else(\n        .data$air_yards >= .data$old_ydstogo |\n          .data$Turnover_Ind == 1,\n        10,\n        .data$old_ydstogo - .data$air_yards\n      ),\n      down = dplyr::if_else(\n        .data$air_yards >= .data$old_ydstogo |\n          .data$Turnover_Ind == 1,\n        1,\n        as.numeric(.data$old_down) + 1\n      ),\n      half_seconds_remaining = .data$old_TimeSecs_Remaining - 5.704673,\n      down1 = dplyr::if_else(.data$down == 1, 1, 0),\n   
   down2 = dplyr::if_else(.data$down == 2, 1, 0),\n      down3 = dplyr::if_else(.data$down == 3, 1, 0),\n      down4 = dplyr::if_else(.data$down == 4, 1, 0),\n      posteam_timeouts_remaining = dplyr::if_else(\n        .data$Turnover_Ind == 1,\n        .data$defeam_timeouts_pre,\n        .data$posteam_timeouts_pre\n      ),\n      defteam_timeouts_remaining = dplyr::if_else(\n        .data$Turnover_Ind == 1,\n        .data$posteam_timeouts_pre,\n        .data$defeam_timeouts_pre\n      )\n    )\n\n  #get EP predictions\n  pass_pbp_data_preds <- get_preds(pass_pbp_data)\n\n  # Convert to air EP:\n  pass_pbp_data_preds <- dplyr::mutate(\n    pass_pbp_data_preds,\n    airEP = (.data$Opp_Safety * -2) +\n      (.data$Opp_Field_Goal * -3) +\n      (.data$Opp_Touchdown * -7) +\n      (.data$Safety * 2) +\n      (.data$Field_Goal * 3) +\n      (.data$Touchdown * 7)\n  )\n\n  # Return back to the passing data:\n  pass_pbp_data$airEP <- pass_pbp_data_preds$airEP\n\n  # For the plays that have TimeSecs_Remaining 0 or less, set airEP to 0:\n  pass_pbp_data$airEP[which(pass_pbp_data$half_seconds_remaining <= 0)] <- 0\n\n  # Calculate the airEPA based on 4 scenarios:\n  pass_pbp_data$airEPA <- with(\n    pass_pbp_data,\n    ifelse(\n      old_yrdline100 - air_yards <= 0,\n      7 - ep,\n      ifelse(\n        old_yrdline100 - air_yards > 99,\n        -2 - ep,\n        ifelse(Turnover_Ind == 1, (-1 * airEP) - ep, airEP - ep)\n      )\n    )\n  )\n\n  # If the play is a two-point conversion then change the airEPA to NA since\n  # no air yards are provided:\n  pass_pbp_data$airEPA <- with(\n    pass_pbp_data,\n    ifelse(two_point_attempt == 1, NA, airEPA)\n  )\n  # Calculate the yards after catch EPA:\n  pass_pbp_data <- dplyr::mutate(\n    pass_pbp_data,\n    yacEPA = .data$epa - .data$airEPA\n  )\n\n  # if Yards after catch is 0 make yacEPA set to 0:\n  pass_pbp_data$yacEPA <- ifelse(\n    pass_pbp_data$penalty == 0 &\n      pass_pbp_data$yards_after_catch == 0 &\n      
pass_pbp_data$complete_pass == 1,\n    0,\n    pass_pbp_data$yacEPA\n  )\n\n  # if Yards after catch is 0 make airEPA set to EPA:\n  pass_pbp_data$airEPA <- ifelse(\n    pass_pbp_data$penalty == 0 &\n      pass_pbp_data$yards_after_catch == 0 &\n      pass_pbp_data$complete_pass == 1,\n    pass_pbp_data$epa,\n    pass_pbp_data$airEPA\n  )\n\n  # Now add airEPA and yacEPA to the original dataset:\n  pbp_data$airEPA <- NA\n  pbp_data$yacEPA <- NA\n  pbp_data$airEPA[pass_plays_i] <- pass_pbp_data$airEPA\n  pbp_data$yacEPA[pass_plays_i] <- pass_pbp_data$yacEPA\n\n  # Now change the names to be the right style, calculate the completion form\n  # of the variables, as well as the cumulative totals and return:\n  pbp_data |>\n    dplyr::rename(air_epa = \"airEPA\", yac_epa = \"yacEPA\") |>\n    dplyr::group_by(.data$game_id) |>\n    dplyr::mutate(\n      comp_air_epa = dplyr::if_else(.data$complete_pass == 1, .data$air_epa, 0),\n      comp_yac_epa = dplyr::if_else(.data$complete_pass == 1, .data$yac_epa, 0),\n      home_team_comp_air_epa = dplyr::if_else(\n        .data$posteam == .data$home_team,\n        .data$comp_air_epa,\n        -.data$comp_air_epa\n      ),\n      away_team_comp_air_epa = dplyr::if_else(\n        .data$posteam == .data$away_team,\n        .data$comp_air_epa,\n        -.data$comp_air_epa\n      ),\n      home_team_comp_yac_epa = dplyr::if_else(\n        .data$posteam == .data$home_team,\n        .data$comp_yac_epa,\n        -.data$comp_yac_epa\n      ),\n      away_team_comp_yac_epa = dplyr::if_else(\n        .data$posteam == .data$away_team,\n        .data$comp_yac_epa,\n        -.data$comp_yac_epa\n      ),\n      home_team_comp_air_epa = dplyr::if_else(\n        is.na(.data$home_team_comp_air_epa),\n        0,\n        .data$home_team_comp_air_epa\n      ),\n      away_team_comp_air_epa = dplyr::if_else(\n        is.na(.data$away_team_comp_air_epa),\n        0,\n        .data$away_team_comp_air_epa\n      ),\n      home_team_comp_yac_epa = 
dplyr::if_else(\n        is.na(.data$home_team_comp_yac_epa),\n        0,\n        .data$home_team_comp_yac_epa\n      ),\n      away_team_comp_yac_epa = dplyr::if_else(\n        is.na(.data$away_team_comp_yac_epa),\n        0,\n        .data$away_team_comp_yac_epa\n      ),\n      total_home_comp_air_epa = cumsum(.data$home_team_comp_air_epa),\n      total_away_comp_air_epa = cumsum(.data$away_team_comp_air_epa),\n      total_home_comp_yac_epa = cumsum(.data$home_team_comp_yac_epa),\n      total_away_comp_yac_epa = cumsum(.data$away_team_comp_yac_epa),\n      # Same but for raw - not just completions:\n      home_team_raw_air_epa = dplyr::if_else(\n        .data$posteam == .data$home_team,\n        .data$air_epa,\n        -.data$air_epa\n      ),\n      away_team_raw_air_epa = dplyr::if_else(\n        .data$posteam == .data$away_team,\n        .data$air_epa,\n        -.data$air_epa\n      ),\n      home_team_raw_yac_epa = dplyr::if_else(\n        .data$posteam == .data$home_team,\n        .data$yac_epa,\n        -.data$yac_epa\n      ),\n      away_team_raw_yac_epa = dplyr::if_else(\n        .data$posteam == .data$away_team,\n        .data$yac_epa,\n        -.data$yac_epa\n      ),\n      home_team_raw_air_epa = dplyr::if_else(\n        is.na(.data$home_team_raw_air_epa),\n        0,\n        .data$home_team_raw_air_epa\n      ),\n      away_team_raw_air_epa = dplyr::if_else(\n        is.na(.data$away_team_raw_air_epa),\n        0,\n        .data$away_team_raw_air_epa\n      ),\n      home_team_raw_yac_epa = dplyr::if_else(\n        is.na(.data$home_team_raw_yac_epa),\n        0,\n        .data$home_team_raw_yac_epa\n      ),\n      away_team_raw_yac_epa = dplyr::if_else(\n        is.na(.data$away_team_raw_yac_epa),\n        0,\n        .data$away_team_raw_yac_epa\n      ),\n      total_home_raw_air_epa = cumsum(.data$home_team_raw_air_epa),\n      total_away_raw_air_epa = cumsum(.data$away_team_raw_air_epa),\n      total_home_raw_yac_epa = 
cumsum(.data$home_team_raw_yac_epa),\n      total_away_raw_yac_epa = cumsum(.data$away_team_raw_yac_epa)\n    ) |>\n    dplyr::ungroup()\n}\n\n\n#################################################################\n# air and YAC WP:\n# as with the rest, heavily borrowed from nflscrapR:\n# https://github.com/maksimhorowitz/nflscrapR/blob/master/R/add_ep_wp_variables.R\nadd_air_yac_wp_variables <- function(pbp_data) {\n  #testing\n  #pbp_data <- g\n\n  # Change the names to reflect the old style - will update this later on:\n  pbp_data <- pbp_data |>\n    dplyr::mutate(\n      posteam_timeouts_pre = .data$posteam_timeouts_remaining,\n      defeam_timeouts_pre = .data$defteam_timeouts_remaining\n    )\n\n  # Final all pass attempts that are not sacks:\n  pass_plays_i <- which(\n    !is.na(pbp_data$air_yards) & pbp_data$play_type == 'pass'\n  )\n  pass_pbp_data <- pbp_data[pass_plays_i, ]\n\n  pass_pbp_data <- pass_pbp_data |>\n    dplyr::mutate(\n      half_seconds_remaining = .data$half_seconds_remaining - 5.704673,\n      game_seconds_remaining = .data$game_seconds_remaining - 5.704673,\n      Diff_Time_Ratio = .data$score_differential /\n        (exp(-4 * .data$elapsed_share)),\n      Turnover_Ind = dplyr::if_else(\n        .data$down == 4 & .data$air_yards < .data$ydstogo,\n        1,\n        0\n      ),\n      Diff_Time_Ratio = dplyr::if_else(\n        .data$Turnover_Ind == 1,\n        -1 * .data$Diff_Time_Ratio,\n        .data$Diff_Time_Ratio\n      ),\n      posteam_timeouts_remaining = dplyr::if_else(\n        .data$Turnover_Ind == 1,\n        .data$defeam_timeouts_pre,\n        .data$posteam_timeouts_pre\n      ),\n      defteam_timeouts_remaining = dplyr::if_else(\n        .data$Turnover_Ind == 1,\n        .data$posteam_timeouts_pre,\n        .data$defeam_timeouts_pre\n      )\n    )\n\n  # Calculate the airWP:\n  pass_pbp_data$airWP <- get_preds_wp(pass_pbp_data)\n\n  # Now for plays marked with Turnover_Ind, use 1 - airWP to flip back to the original\n  # 
team with possession:\n  pass_pbp_data$airWP <- ifelse(\n    pass_pbp_data$Turnover_Ind == 1,\n    1 - pass_pbp_data$airWP,\n    pass_pbp_data$airWP\n  )\n\n  # For the plays that have TimeSecs_Remaining 0 or less, set airWP to 0:\n  pass_pbp_data$airWP[which(pass_pbp_data$half_seconds_remaining <= 0)] <- 0\n  pass_pbp_data$airWP[which(pass_pbp_data$game_seconds_remaining <= 0)] <- 0\n\n  # Calculate the airWPA and yacWPA:\n  pass_pbp_data <- dplyr::mutate(\n    pass_pbp_data,\n    airWPA = .data$airWP - .data$wp,\n    yacWPA = .data$wpa - .data$airWPA\n  )\n\n  # If the play is a two-point conversion then change the airWPA to NA since\n  # no air yards are provided:\n  pass_pbp_data$airWPA <- with(\n    pass_pbp_data,\n    ifelse(two_point_attempt == 1, NA, airWPA)\n  )\n  pass_pbp_data$yacWPA <- with(\n    pass_pbp_data,\n    ifelse(two_point_attempt == 1, NA, yacWPA)\n  )\n\n  # Check to see if there is any overtime plays, if so then need to calculate\n  # by essentially taking the same process as the airEP calculation and using\n  # the resulting probabilities for overtime:\n\n  # First check if there's any overtime plays:\n  if (any(pass_pbp_data$qtr == 5 | pass_pbp_data$qtr == 6)) {\n    # Find the rows that are overtime:\n    pass_overtime_i <- which(pass_pbp_data$qtr == 5 | pass_pbp_data$qtr == 6)\n    pass_overtime_df <- pass_pbp_data[pass_overtime_i, ]\n\n    # Find the rows that are overtime:\n\n    # Need to generate same overtime scenario data as before in the wp function:\n    # Find the rows that are overtime:\n    overtime_i <- which(pbp_data$qtr == 5 | pbp_data$qtr == 6)\n\n    overtime_df <- pbp_data[overtime_i, ]\n\n    # Separate routine for overtime:\n\n    # Create a column that is just the first drive of overtime repeated:\n    overtime_df$First_Drive <- rep(\n      min(overtime_df$drive, na.rm = TRUE),\n      nrow(overtime_df)\n    )\n\n    # Calculate the difference in drive number\n    overtime_df <- dplyr::mutate(\n      overtime_df,\n    
  Drive_Diff = .data$drive - .data$First_Drive\n    )\n\n    # Create an indicator column that means the posteam is losing by 3 and\n    # its the second drive of overtime:\n    overtime_df$One_FG_Game <- ifelse(\n      overtime_df$score_differential == -3 &\n        overtime_df$Drive_Diff == 1,\n      1,\n      0\n    )\n\n    # Now create a copy of the dataset to then make the EP predictions for when\n    # a field goal is scored and its not sudden death:\n    overtime_df_ko <- overtime_df\n\n    overtime_df_ko$yardline_100 <- with(\n      overtime_df_ko,\n      ifelse(\n        game_year < 2016 |\n          (game_year == 2016 & game_month < 4),\n        80,\n        75\n      )\n    )\n\n    # Now first down:\n    overtime_df_ko$down1 <- rep(1, nrow(overtime_df_ko))\n    overtime_df_ko$down2 <- rep(0, nrow(overtime_df_ko))\n    overtime_df_ko$down3 <- rep(0, nrow(overtime_df_ko))\n    overtime_df_ko$down4 <- rep(0, nrow(overtime_df_ko))\n    # 10 ydstogo:\n    overtime_df_ko$ydstogo <- rep(10, nrow(overtime_df_ko))\n\n    # Get the predictions from the EP model and calculate the necessary probability:\n    if (nrow(overtime_df_ko) > 1) {\n      overtime_df_ko_preds <- get_preds(overtime_df_ko)\n    } else {\n      overtime_df_ko_preds <- get_preds(overtime_df_ko)\n    }\n\n    overtime_df_ko_preds <- dplyr::mutate(\n      overtime_df_ko_preds,\n      Win_Back = .data$No_Score +\n        .data$Opp_Field_Goal +\n        .data$Opp_Safety +\n        .data$Opp_Touchdown\n    )\n\n    # Calculate the two possible win probability types, Sudden Death and one Field Goal:\n    overtime_df$Sudden_Death_WP <- overtime_df$fg_prob +\n      overtime_df$td_prob +\n      overtime_df$safety_prob\n    overtime_df$One_FG_WP <- overtime_df$td_prob +\n      (overtime_df$fg_prob * overtime_df_ko_preds$Win_Back)\n\n    # Find all Pass Attempts that are also actual plays in overtime:\n    overtime_pass_plays_i <- which(\n      overtime_df$play_type == \"pass\" &\n        
!is.na(overtime_df$air_yards)\n    )\n\n    overtime_pass_df <- overtime_df[overtime_pass_plays_i, ]\n    overtime_df_ko_preds_pass <- overtime_df_ko_preds[overtime_pass_plays_i, ]\n\n    # Using the AirYards need to update the following:\n    # - yardline_100\n    # - half_seconds_remaining\n    # - ydstogo\n    # - down\n\n    # First rename the old columns to update for calculating the EP from the air:\n    overtime_pass_df <- dplyr::rename(\n      overtime_pass_df,\n      old_yrdline100 = \"yardline_100\",\n      old_ydstogo = \"ydstogo\",\n      old_TimeSecs_Remaining = \"half_seconds_remaining\",\n      old_down = \"down\"\n    )\n\n    # Create an indicator column for the air yards failing to convert the first down:\n    overtime_pass_df$Turnover_Ind <- ifelse(\n      overtime_pass_df$old_down == 4 &\n        overtime_pass_df$air_yards < overtime_pass_df$old_ydstogo,\n      1,\n      0\n    )\n    # Adjust the field position variables:\n    overtime_pass_df$yardline_100 <- ifelse(\n      overtime_pass_df$Turnover_Ind == 0,\n      overtime_pass_df$old_yrdline100 - overtime_pass_df$air_yards,\n      100 - (overtime_pass_df$old_yrdline100 - overtime_pass_df$air_yards)\n    )\n\n    overtime_pass_df$ydstogo <- ifelse(\n      overtime_pass_df$air_yards >= overtime_pass_df$old_ydstogo |\n        overtime_pass_df$Turnover_Ind == 1,\n      10,\n      overtime_pass_df$old_ydstogo - overtime_pass_df$air_yards\n    )\n\n    overtime_pass_df$down <- ifelse(\n      overtime_pass_df$air_yards >= overtime_pass_df$old_ydstogo |\n        overtime_pass_df$Turnover_Ind == 1,\n      1,\n      as.numeric(overtime_pass_df$old_down) + 1\n    )\n\n    # Adjust the time with the average incomplete pass time:\n    overtime_pass_df$half_seconds_remaining <- overtime_pass_df$old_TimeSecs_Remaining -\n      5.704673\n\n    overtime_pass_df <- overtime_pass_df |>\n      dplyr::mutate(\n        down1 = dplyr::if_else(.data$down == 1, 1, 0),\n        down2 = dplyr::if_else(.data$down == 2, 
1, 0),\n        down3 = dplyr::if_else(.data$down == 3, 1, 0),\n        down4 = dplyr::if_else(.data$down == 4, 1, 0)\n      )\n\n    # Get the predictions from the EP model and calculate the necessary probability:\n    if (nrow(overtime_df_ko) > 1) {\n      overtime_pass_data_preds <- get_preds(overtime_pass_df)\n    } else {\n      overtime_pass_data_preds <- get_preds(overtime_pass_df)\n    }\n\n    # For the turnover plays flip the scoring probabilities:\n    overtime_pass_data_preds <- dplyr::mutate(\n      overtime_pass_data_preds,\n      old_Opp_Field_Goal = .data$Opp_Field_Goal,\n      old_Opp_Safety = .data$Opp_Safety,\n      old_Opp_Touchdown = .data$Opp_Touchdown,\n      old_Field_Goal = .data$Field_Goal,\n      old_Safety = .data$Safety,\n      old_Touchdown = .data$Touchdown\n    )\n    overtime_pass_data_preds$Opp_Field_Goal <- ifelse(\n      overtime_pass_df$Turnover_Ind == 1,\n      overtime_pass_data_preds$old_Field_Goal,\n      overtime_pass_data_preds$Opp_Field_Goal\n    )\n    overtime_pass_data_preds$Opp_Safety <- ifelse(\n      overtime_pass_df$Turnover_Ind == 1,\n      overtime_pass_data_preds$old_Safety,\n      overtime_pass_data_preds$Opp_Safety\n    )\n    overtime_pass_data_preds$Opp_Touchdown <- ifelse(\n      overtime_pass_df$Turnover_Ind == 1,\n      overtime_pass_data_preds$old_Touchdown,\n      overtime_pass_data_preds$Opp_Touchdown\n    )\n    overtime_pass_data_preds$Field_Goal <- ifelse(\n      overtime_pass_df$Turnover_Ind == 1,\n      overtime_pass_data_preds$old_Opp_Field_Goal,\n      overtime_pass_data_preds$Field_Goal\n    )\n    overtime_pass_data_preds$Safety <- ifelse(\n      overtime_pass_df$Turnover_Ind == 1,\n      overtime_pass_data_preds$old_Opp_Safety,\n      overtime_pass_data_preds$Safety\n    )\n    overtime_pass_data_preds$Touchdown <- ifelse(\n      overtime_pass_df$Turnover_Ind == 1,\n      overtime_pass_data_preds$old_Opp_Touchdown,\n      overtime_pass_data_preds$Touchdown\n    )\n\n    # Calculate the two 
possible win probability types, Sudden Death and one Field Goal:\n    pass_overtime_df$Sudden_Death_airWP <- with(\n      overtime_pass_data_preds,\n      Field_Goal + Touchdown + Safety\n    )\n    pass_overtime_df$One_FG_airWP <- overtime_pass_data_preds$Touchdown +\n      (overtime_pass_data_preds$Field_Goal * overtime_df_ko_preds_pass$Win_Back)\n\n    # Decide which win probability to use:\n    pass_overtime_df$airWP <- ifelse(\n      overtime_pass_df$game_year >= 2012 &\n        (overtime_pass_df$Drive_Diff == 0 |\n          (overtime_pass_df$Drive_Diff == 1 &\n            overtime_pass_df$One_FG_Game == 1)),\n      pass_overtime_df$One_FG_airWP,\n      pass_overtime_df$Sudden_Death_airWP\n    )\n\n    # For the plays that have TimeSecs_Remaining 0 or less, set airWP to 0:\n    pass_overtime_df$airWP[which(\n      overtime_pass_df$half_seconds_remaining <= 0\n    )] <- 0\n\n    # Calculate the airWPA and yacWPA:\n    pass_overtime_df <- dplyr::mutate(\n      pass_overtime_df,\n      airWPA = .data$airWP - .data$wp,\n      yacWPA = .data$wpa - .data$airWPA\n    )\n\n    # If the play is a two-point conversion then change the airWPA to NA since\n    # no air yards are provided:\n    pass_overtime_df$airWPA <- with(\n      pass_overtime_df,\n      ifelse(two_point_attempt == 1, NA, airWPA)\n    )\n    pass_overtime_df$yacWPA <- with(\n      pass_overtime_df,\n      ifelse(two_point_attempt == 1, NA, yacWPA)\n    )\n\n    pass_overtime_df <- pass_pbp_data[pass_overtime_i, ]\n\n    # Now update the overtime rows in the original pass_pbp_data for airWPA and yacWPA:\n    pass_pbp_data$airWPA[pass_overtime_i] <- pass_overtime_df$airWPA\n    pass_pbp_data$yacWPA[pass_overtime_i] <- pass_overtime_df$yacWPA\n  }\n\n  # if Yards after catch is 0 make yacWPA set to 0:\n  pass_pbp_data$yacWPA <- ifelse(\n    pass_pbp_data$penalty == 0 &\n      pass_pbp_data$yards_after_catch == 0 &\n      pass_pbp_data$complete_pass == 1,\n    0,\n    pass_pbp_data$yacWPA\n  )\n  # if Yards 
after catch is 0 make airWPA set to WPA:\n  pass_pbp_data$airWPA <- ifelse(\n    pass_pbp_data$penalty == 0 &\n      pass_pbp_data$yards_after_catch == 0 &\n      pass_pbp_data$complete_pass == 1,\n    pass_pbp_data$wpa,\n    pass_pbp_data$airWPA\n  )\n\n  # Now add airWPA and yacWPA to the original dataset:\n  pbp_data$airWPA <- NA\n  pbp_data$yacWPA <- NA\n  pbp_data$airWPA[pass_plays_i] <- pass_pbp_data$airWPA\n  pbp_data$yacWPA[pass_plays_i] <- pass_pbp_data$yacWPA\n\n  # Now change the names to be the right style, calculate the completion form\n  # of the variables, as well as the cumulative totals and return:\n  pbp_data |>\n    dplyr::rename(air_wpa = \"airWPA\", yac_wpa = \"yacWPA\") |>\n    dplyr::group_by(.data$game_id) |>\n    dplyr::mutate(\n      comp_air_wpa = dplyr::if_else(.data$complete_pass == 1, .data$air_wpa, 0),\n      comp_yac_wpa = dplyr::if_else(.data$complete_pass == 1, .data$yac_wpa, 0),\n      home_team_comp_air_wpa = dplyr::if_else(\n        .data$posteam == .data$home_team,\n        .data$comp_air_wpa,\n        -.data$comp_air_wpa\n      ),\n      away_team_comp_air_wpa = dplyr::if_else(\n        .data$posteam == .data$away_team,\n        .data$comp_air_wpa,\n        -.data$comp_air_wpa\n      ),\n      home_team_comp_yac_wpa = dplyr::if_else(\n        .data$posteam == .data$home_team,\n        .data$comp_yac_wpa,\n        -.data$comp_yac_wpa\n      ),\n      away_team_comp_yac_wpa = dplyr::if_else(\n        .data$posteam == .data$away_team,\n        .data$comp_yac_wpa,\n        -.data$comp_yac_wpa\n      ),\n      home_team_comp_air_wpa = dplyr::if_else(\n        is.na(.data$home_team_comp_air_wpa),\n        0,\n        .data$home_team_comp_air_wpa\n      ),\n      away_team_comp_air_wpa = dplyr::if_else(\n        is.na(.data$away_team_comp_air_wpa),\n        0,\n        .data$away_team_comp_air_wpa\n      ),\n      home_team_comp_yac_wpa = dplyr::if_else(\n        is.na(.data$home_team_comp_yac_wpa),\n        0,\n        
.data$home_team_comp_yac_wpa\n      ),\n      away_team_comp_yac_wpa = dplyr::if_else(\n        is.na(.data$away_team_comp_yac_wpa),\n        0,\n        .data$away_team_comp_yac_wpa\n      ),\n      total_home_comp_air_wpa = cumsum(.data$home_team_comp_air_wpa),\n      total_away_comp_air_wpa = cumsum(.data$away_team_comp_air_wpa),\n      total_home_comp_yac_wpa = cumsum(.data$home_team_comp_yac_wpa),\n      total_away_comp_yac_wpa = cumsum(.data$away_team_comp_yac_wpa),\n      # Same but for raw - not just completions:\n      home_team_raw_air_wpa = dplyr::if_else(\n        .data$posteam == .data$home_team,\n        .data$air_wpa,\n        -.data$air_wpa\n      ),\n      away_team_raw_air_wpa = dplyr::if_else(\n        .data$posteam == .data$away_team,\n        .data$air_wpa,\n        -.data$air_wpa\n      ),\n      home_team_raw_yac_wpa = dplyr::if_else(\n        .data$posteam == .data$home_team,\n        .data$yac_wpa,\n        -.data$yac_wpa\n      ),\n      away_team_raw_yac_wpa = dplyr::if_else(\n        .data$posteam == .data$away_team,\n        .data$yac_wpa,\n        -.data$yac_wpa\n      ),\n      home_team_raw_air_wpa = dplyr::if_else(\n        is.na(.data$home_team_raw_air_wpa),\n        0,\n        .data$home_team_raw_air_wpa\n      ),\n      away_team_raw_air_wpa = dplyr::if_else(\n        is.na(.data$away_team_raw_air_wpa),\n        0,\n        .data$away_team_raw_air_wpa\n      ),\n      home_team_raw_yac_wpa = dplyr::if_else(\n        is.na(.data$home_team_raw_yac_wpa),\n        0,\n        .data$home_team_raw_yac_wpa\n      ),\n      away_team_raw_yac_wpa = dplyr::if_else(\n        is.na(.data$away_team_raw_yac_wpa),\n        0,\n        .data$away_team_raw_yac_wpa\n      ),\n      total_home_raw_air_wpa = cumsum(.data$home_team_raw_air_wpa),\n      total_away_raw_air_wpa = cumsum(.data$away_team_raw_air_wpa),\n      total_home_raw_yac_wpa = cumsum(.data$home_team_raw_yac_wpa),\n      total_away_raw_yac_wpa = cumsum(.data$away_team_raw_yac_wpa)\n 
   ) |>\n    dplyr::ungroup()\n}\n"
  },
  {
    "path": "R/helper_add_fixed_drives.R",
    "content": "################################################################################\n# Author: Sebastian Carl, Ben Baldwin\n# Purpose: Function to add drive variables\n# Code Style Guide: styler::tidyverse_style()\n################################################################################\n\n## fixed_drive =\n##  starts at 1, each new drive, numbers shared across both teams\n## fixed_drive_result =\n##  result of  given drive\nadd_drive_results <- function(d) {\n  drive_df <- d |>\n    dplyr::mutate(\n      old_posteam = .data$posteam,\n      posteam = dplyr::case_when(\n        # on kickoffs the kicking team is the defteam but this should be swapped\n        # in terms of this function if the kickoff is recovered\n        .data$kickoff_attempt == 1 &\n          (.data$own_kickoff_recovery == 1 |\n            .data$fumble_lost == 1) ~ .data$defteam,\n        # if a kickoff has to be replayed due to a penalty and is then recovered,\n        # the prior (reversed) kickoff shouldn't be a new drive/series\n        stringr::str_detect(.data$desc, kickoff_finder) &\n          .data$own_kickoff_recovery == 0 &\n          dplyr::lead(.data$own_kickoff_recovery == 1) ~ .data$defteam,\n        TRUE ~ .data$posteam\n      )\n    ) |>\n    dplyr::group_by(.data$game_id, .data$game_half) |>\n    dplyr::mutate(\n      row = 1:dplyr::n(),\n      new_drive = dplyr::if_else(\n        # change in posteam\n        .data$posteam != dplyr::lag(.data$posteam) |\n          # change in posteam in t-2 and na posteam in t-1\n          (.data$posteam != dplyr::lag(.data$posteam, 2) &\n            is.na(dplyr::lag(.data$posteam))) |\n          # change in posteam in t-3 and na posteam in t-1 and t-2\n          (.data$posteam != dplyr::lag(.data$posteam, 3) &\n            is.na(dplyr::lag(.data$posteam, 2)) &\n            is.na(dplyr::lag(.data$posteam))),\n        1,\n        0\n      ),\n      # PAT after defensive TD is not a new drive\n      new_drive = dplyr::if_else(\n 
       dplyr::lag(.data$touchdown == 1) &\n          (dplyr::lag(.data$posteam) != dplyr::lag(.data$td_team)) &\n          # this last part is needed because otherwise it was overwriting\n          # the existing value of new_drive with NA on plays following timeouts\n          !is.na(dplyr::lag(.data$posteam)),\n        0,\n        .data$new_drive\n      ),\n      # PAT after defensive TD is not a new drive even if a Timeout follows the TD\n      new_drive = dplyr::if_else(\n        dplyr::lag(stringr::str_detect(\n          .data$desc,\n          \"(Timeout)|(Two-Minute Warning)\"\n        )) &\n          dplyr::lag(.data$touchdown == 1, 2L) &\n          (dplyr::lag(.data$posteam, 2L) != dplyr::lag(.data$td_team, 2L)),\n        0,\n        .data$new_drive,\n        missing = .data$new_drive\n      ),\n      # PAT after defensive TD is not a new drive even if 2 Timeouts follow the TD\n      new_drive = dplyr::if_else(\n        dplyr::lag(stringr::str_detect(\n          .data$desc,\n          \"(Timeout)|(Two-Minute Warning)\"\n        )) &\n          dplyr::lag(\n            stringr::str_detect(.data$desc, \"(Timeout)|(Two-Minute Warning)\"),\n            2L\n          ) &\n          dplyr::lag(.data$touchdown == 1, 3L) &\n          (dplyr::lag(.data$posteam, 3L) != dplyr::lag(.data$td_team, 3L)),\n        0,\n        .data$new_drive,\n        missing = .data$new_drive\n      ),\n      # if same team has the ball as prior play, but prior play was a punt with lost fumble, it's a new drive\n      # or if the prior play was a lost fumble or interception\n      new_drive = dplyr::if_else(\n        # this line is to prevent it from overwriting already-defined new drives with NA\n        # when there's a timeout on prior line bc if_else is obnoxious like that\n        (.data$new_drive != 1 | is.na(.data$new_drive)) &\n          (\n            # same team has ball after lost fumble on punt, fg, pass or rush\n            (.data$posteam == dplyr::lag(.data$posteam) &\n     
         dplyr::lag(.data$fumble_lost) == 1 &\n              dplyr::lag(.data$play_type) %in%\n                c(\"punt\", \"pass\", \"run\", \"field_goal\") &\n              # but not if the play resulted in a touchdown because otherwise the\n              # following extra point or 2pt conversion will be new drives\n              dplyr::lag(.data$touchdown) == 0) |\n\n              # same team has ball after lost fumble on punt, fg, pass or rush 2 plays earlier with prior play missing posteam\n              (is.na(dplyr::lag(.data$posteam)) &\n                # posteam is same as posteam 2 plays ago\n                .data$posteam == dplyr::lag(.data$posteam, 2) &\n                # lost fumble 2 plays ago\n                dplyr::lag(.data$fumble_lost, 2) == 1 &\n                dplyr::lag(.data$play_type, 2) %in%\n                  c(\"punt\", \"pass\", \"run\", \"field_goal\") &\n                # but not if the lost fumble 2 plays ago resulted in a touchdown because otherwise the\n                # following extra point or 2pt conversion will be new drives\n                dplyr::lag(.data$touchdown, 2) == 0)\n          ),\n        1,\n        .data$new_drive\n      ),\n      # first observation of a half is also a new drive\n      new_drive = dplyr::if_else(.data$row == 1, 1, .data$new_drive),\n\n      # if you recovered an onside kick or muffed return, it's a new drive\n      new_drive = dplyr::case_when(\n        .data$play_type == \"kickoff\" &\n          (.data$own_kickoff_recovery == 1 | .data$fumble_lost == 1) ~ 1,\n        TRUE ~ .data$new_drive\n      ),\n\n      # if it's a kickoff and the prior play was a safety, it's a new drive\n      new_drive = dplyr::case_when(\n        # safety prior play\n        .data$kickoff_attempt == 1 & dplyr::lag(.data$safety) == 1 ~ 1,\n        # safety 2 plays ago and timeout on previous play\n        .data$kickoff_attempt == 1 &\n          dplyr::lag(.data$safety, 2) == 1 &\n          
(is.na(dplyr::lag(.data$play_type)) |\n            dplyr::lag(.data$play_type) == \"no_play\") ~ 1,\n        TRUE ~ .data$new_drive\n      ),\n\n      # if there's a missing, make it not a new drive (0)\n      new_drive = dplyr::if_else(is.na(.data$new_drive), 0, .data$new_drive)\n    ) |>\n    dplyr::group_by(.data$game_id) |>\n    dplyr::mutate(\n      fixed_drive = cumsum(.data$new_drive),\n      tmp_result = dplyr::case_when(\n        .data$touchdown == 1 & .data$posteam == .data$td_team ~ \"Touchdown\",\n        .data$touchdown == 1 & .data$posteam != .data$td_team ~ \"Opp touchdown\",\n        .data$field_goal_result == \"made\" ~ \"Field goal\",\n        .data$field_goal_result %in%\n          c(\"blocked\", \"missed\") ~ \"Missed field goal\",\n        .data$safety == 1 ~ \"Safety\",\n        .data$play_type == \"punt\" | .data$punt_attempt == 1 ~ \"Punt\",\n        .data$interception == 1 | .data$fumble_lost == 1 ~ \"Turnover\",\n        .data$down == 4 &\n          .data$yards_gained < .data$ydstogo &\n          .data$play_type != \"no_play\" ~ \"Turnover on downs\",\n        stringr::str_detect(\n          .data$desc,\n          \"(END QUARTER 2)|(END QUARTER 4)|(END GAME)\"\n        ) ~ \"End of half\"\n      )\n    ) |>\n    dplyr::group_by(.data$game_id, .data$fixed_drive) |>\n    dplyr::mutate(\n      fixed_drive_result = dplyr::if_else(\n        # if it's end of half, take the first thing we see\n        dplyr::last(stats::na.omit(.data$tmp_result)) == \"End of half\",\n        dplyr::first(stats::na.omit(.data$tmp_result)),\n        # otherwise take the last\n        dplyr::last(stats::na.omit(.data$tmp_result))\n      )\n    ) |>\n    dplyr::ungroup() |>\n    dplyr::mutate(posteam = .data$old_posteam) |>\n    dplyr::select(-\"row\", -\"new_drive\", -\"tmp_result\", -\"old_posteam\")\n\n  user_message(\"added fixed drive variables\", \"done\")\n  return(drive_df)\n}\n"
  },
  {
    "path": "R/helper_add_game_data.R",
    "content": "################################################################################\n# Author: Ben Baldwin\n# Purpose: Function to add Lee Sharpe's game data\n# Code Style Guide: styler::tidyverse_style()\n################################################################################\n\n# Thanks Lee!\nadd_game_data <- function(pbp, games = NULL, ...) {\n  out <- pbp\n  warn <- 0\n  tryCatch(\n    expr = {\n      # `games` can be passed to supply a locally loaded games file,\n      # e.g. for unit tests\n      if (is.null(games)) {\n        games <- nflreadr::load_schedules()\n      } else {\n        stopifnot(\n          inherits(games, \"nflverse_data\"),\n          isTRUE(attr(games, \"nflverse_type\") == \"games and schedules\")\n        )\n      }\n\n      out <- out |>\n        dplyr::left_join(\n          games |>\n            dplyr::select(\n              \"game_id\",\n              \"old_game_id\",\n              \"away_score\",\n              \"home_score\",\n              \"location\",\n              \"result\",\n              \"total\",\n              \"spread_line\",\n              \"total_line\",\n              \"div_game\",\n              \"roof\",\n              \"surface\",\n              \"temp\",\n              \"wind\",\n              \"home_coach\",\n              \"away_coach\",\n              \"stadium\",\n              \"stadium_id\",\n              \"gameday\"\n            ) |>\n            dplyr::rename(game_stadium = \"stadium\"),\n          by = c(\"game_id\")\n        ) |>\n        dplyr::mutate(\n          game_date = .data$gameday\n        )\n\n      user_message(\"added game variables\", \"done\")\n    },\n    error = function(e) {\n      message(\"The following error has occurred:\")\n      message(e)\n    },\n    warning = function(w) {\n      if (warn == 1) {\n        message(\n          \"Warning: The data hosting servers are down, so we can't add game data at the moment!\"\n        )\n      } else {\n        
message(\"The following warning has occurred:\")\n        message(w)\n      }\n    },\n    finally = {}\n  )\n  return(out)\n}\n"
  },
  {
    "path": "R/helper_add_nflscrapr_mutations.R",
    "content": "################################################################################\n# Author: Sebastian Carl, Ben Baldwin (Code mostly extracted from nflscrapR)\n# Purpose: Add variables mostly needed for ep(a) and wp(a) calculation\n# Code Style Guide: styler::tidyverse_style()\n################################################################################\n\nadd_nflscrapr_mutations <- function(pbp) {\n  #testing only\n  #pbp <- combined\n\n  out <-\n    pbp |>\n    dplyr::mutate(index = 1:dplyr::n()) |>\n    # remove duplicate plays. can't do this with play_id because duplicate plays\n    # sometimes have different play_ids\n    dplyr::group_by(\n      .data$game_id,\n      .data$quarter,\n      .data$time,\n      .data$play_description,\n      .data$down\n    ) |>\n    dplyr::slice(1) |>\n    dplyr::ungroup() |>\n    dplyr::mutate(\n      # Modify the time column for the quarter end:\n      time = dplyr::if_else(\n        .data$quarter_end == 1 |\n          (.data$play_description == \"END GAME\" & is.na(.data$time)),\n        \"00:00\",\n        .data$time\n      ),\n      time = dplyr::if_else(\n        .data$play_description == 'GAME',\n        \"15:00\",\n        .data$time\n      ),\n      # Create a column with the time in seconds remaining for the quarter:\n      quarter_seconds_remaining = time_to_seconds(.data$time),\n      play_description = dplyr::case_when(\n        stringr::str_detect(\n          .data$play_description,\n          \"(?<=kicks )[:alpha:]{1,}.[:alpha:]{1,}(?= yards)\"\n        ) ~\n          stringr::str_replace(\n            .data$play_description,\n            \"(?<=kicks )[:alpha:]{1,}.[:alpha:]{1,}(?= yards)\",\n            as.character(.data$kick_distance)\n          ),\n        TRUE ~ .data$play_description\n      )\n    ) |>\n    #put plays in the right order\n    dplyr::group_by(.data$game_id) |>\n    # the !is.na(drive), drive part is to make the initial GAME line show up first\n    # 
https://stackoverflow.com/questions/43343590/how-to-sort-putting-nas-first-in-dplyr\n    dplyr::arrange(\n      .data$order_sequence,\n      .data$quarter,\n      !is.na(.data$quarter_seconds_remaining),\n      -.data$quarter_seconds_remaining,\n      !is.na(.data$drive),\n      .data$drive,\n      .data$index,\n      .by_group = TRUE\n    ) |>\n    dplyr::mutate(\n      # Using the various two point indicators, create a column denoting the result\n      # outcome for two point conversions:\n      two_point_conv_result = dplyr::if_else(\n        (.data$two_point_rush_good == 1 |\n          .data$two_point_pass_good == 1 |\n          .data$two_point_pass_reception_good == 1) &\n          .data$two_point_attempt == 1,\n        \"success\",\n        NA_character_\n      ),\n      two_point_conv_result = dplyr::if_else(\n        (.data$two_point_rush_failed == 1 |\n          .data$two_point_pass_failed == 1 |\n          .data$two_point_pass_reception_failed == 1) &\n          .data$two_point_attempt == 1,\n        \"failure\",\n        .data$two_point_conv_result\n      ),\n      two_point_conv_result = dplyr::if_else(\n        (.data$two_point_rush_safety == 1 |\n          .data$two_point_pass_safety == 1) &\n          .data$two_point_attempt == 1,\n        \"safety\",\n        .data$two_point_conv_result\n      ),\n      two_point_conv_result = dplyr::if_else(\n        .data$two_point_return == 1 &\n          .data$two_point_attempt == 1,\n        \"return\",\n        .data$two_point_conv_result\n      ),\n      # If the result was a success, make the yards_gained to be 2:\n      yards_gained = dplyr::if_else(\n        !is.na(.data$two_point_conv_result) &\n          .data$two_point_conv_result == \"success\",\n        2,\n        .data$yards_gained\n      ),\n      # Fix yards_gained for plays with laterals\n      yards_gained = dplyr::case_when(\n        !is.na(.data$passing_yards) &\n          .data$yards_gained != .data$passing_yards &\n          .data$penalty == 
0 ~ .data$passing_yards,\n        !is.na(.data$rushing_yards) &\n          !is.na(.data$lateral_rushing_yards) &\n          .data$yards_gained != .data$rushing_yards &\n          .data$penalty == 0 ~ .data$rushing_yards +\n          .data$lateral_rushing_yards,\n        TRUE ~ yards_gained\n      ),\n      # Extract the penalty type:\n      penalty_type = dplyr::if_else(\n        .data$penalty == 1,\n        .data$play_description |>\n          stringr::str_extract(\n            \"(?<=PENALTY on .{1,50}, ).{1,50}(?=, [0-9]{1,2} yard)\"\n          ) |>\n          # Face Mask penalties include the yardage as string (either 5 Yards or 15 Yards)\n          # We remove the 15 Yards part and just keep the additional info if it's a\n          # 5 yard Face Mask penalty\n          stringr::str_remove(\"\\\\([0-9]{2}+ Yards\\\\)\") |>\n          stringr::str_squish(),\n        NA_character_\n      ),\n      # The new \"dynamic Kickoff\" in the 2024 season introduces new penalty types\n      penalty_type = dplyr::if_else(\n        .data$penalty == 1 &\n          stringr::str_detect(\n            tolower(.data$play_description),\n            \"kickoff short of landing zone\"\n          ),\n        \"Kickoff Short of Landing Zone\",\n        .data$penalty_type\n      ),\n      penalty_type = dplyr::if_else(\n        .data$penalty == 1 &\n          stringr::str_detect(\n            tolower(.data$play_description),\n            \"kickoff out of bounds\"\n          ),\n        \"Kickoff Out of Bounds\",\n        .data$penalty_type\n      ),\n      # Make plays marked with down == 0 as NA:\n      down = dplyr::if_else(\n        .data$down == 0,\n        NA_real_,\n        .data$down\n      ),\n      # Using the field goal indicators make a column with the field goal result:\n      field_goal_result = dplyr::if_else(\n        .data$field_goal_attempt == 1 &\n          .data$field_goal_made == 1,\n        \"made\",\n        NA_character_\n      ),\n      field_goal_result = 
dplyr::if_else(\n        .data$field_goal_attempt == 1 &\n          .data$field_goal_missed == 1,\n        \"missed\",\n        .data$field_goal_result\n      ),\n      field_goal_result = dplyr::if_else(\n        .data$field_goal_attempt == 1 &\n          .data$field_goal_blocked == 1,\n        \"blocked\",\n        .data$field_goal_result\n      ),\n\n      # Using the indicators make a column with the extra point result:\n      extra_point_result = dplyr::if_else(\n        .data$extra_point_attempt == 1 &\n          .data$extra_point_good == 1,\n        \"good\",\n        NA_character_\n      ),\n      extra_point_result = dplyr::if_else(\n        .data$extra_point_attempt == 1 &\n          .data$extra_point_failed == 1,\n        \"failed\",\n        .data$extra_point_result\n      ),\n      extra_point_result = dplyr::if_else(\n        .data$extra_point_attempt == 1 &\n          .data$extra_point_blocked == 1,\n        \"blocked\",\n        .data$extra_point_result\n      ),\n      extra_point_result = dplyr::if_else(\n        .data$extra_point_attempt == 1 &\n          .data$extra_point_safety == 1,\n        \"safety\",\n        .data$extra_point_result\n      ),\n      extra_point_result = dplyr::if_else(\n        .data$extra_point_attempt == 1 &\n          .data$extra_point_aborted == 1,\n        \"aborted\",\n        .data$extra_point_result\n      ),\n\n      # find kickoffs with penalty: a play where the next play is a kickoff\n      # and the prior play wasn't a safety or PAT\n      lead_ko = dplyr::case_when(\n        dplyr::lead(.data$kickoff_attempt) == 1 &\n          .data$game_id == dplyr::lead(.data$game_id) &\n          !stringr::str_detect(\n            tolower(.data$play_description),\n            \"(injured sf )|(tonight's attendance )|(injury update )|(end quarter)|(timeout)|( captains:)|( captains )|( captians:)|( humidity:)|(note - )|( deferred)|(game start )|( game has been suspended)\"\n          ) &\n          
!stringr::str_detect(.data$play_description, \"GAME \") &\n          !.data$play_description %in%\n            c(\"GAME\", \"Two-Minute Warning\", \"The game has resumed.\") &\n          is.na(.data$two_point_conv_result) &\n          is.na(.data$extra_point_result) &\n          is.na(.data$field_goal_result) &\n          (.data$safety == 0 | is.na(.data$safety)) &\n          # because things too messed up before\n          .data$season > 2000 ~ 1,\n        TRUE ~ 0\n      ),\n\n      # we overwrite kickoff_attempt for kickoffs with penalties because\n      # those mess with ep/epa/wp/wpa. Since this is inconsistent compared to\n      # all other *_attempt variables, we will restore kickoff_attempt after\n      # models are applied. That's done with a temporary copy of kickoff_attempt.\n      # See #556, #202, #199 for example\n      copy_of_kickoff_attempt = .data$kickoff_attempt,\n      kickoff_attempt = dplyr::if_else(\n        .data$lead_ko == 1,\n        1,\n        .data$kickoff_attempt\n      ),\n\n      # https://github.com/nflverse/nflfastR/issues/199#issuecomment-792321171\n      kickoff_attempt = dplyr::if_else(\n        .data$game_id == \"2014_02_ATL_CIN\" & .data$play_id == 3498,\n        1,\n        .data$kickoff_attempt\n      ),\n\n      # Make the possession team for kickoffs be the return team, since that is\n      # more intuitive from the EPA / WPA point of view:\n      posteam = dplyr::case_when(\n        # kickoff_finder is defined below\n        (.data$lead_ko == 1 |\n          .data$kickoff_attempt == 1 |\n          stringr::str_detect(.data$play_description, kickoff_finder)) &\n          .data$posteam == .data$home_team ~ .data$away_team,\n        (.data$lead_ko == 1 |\n          .data$kickoff_attempt == 1 |\n          stringr::str_detect(.data$play_description, kickoff_finder)) &\n          .data$posteam == .data$away_team ~ .data$home_team,\n        TRUE ~ .data$posteam\n      ),\n\n      # Fill in the rows with missing posteam with the 
lead:\n      posteam = dplyr::if_else(\n        (.data$quarter_end == 1 | .data$posteam == \"\"),\n        dplyr::lead(.data$posteam),\n        .data$posteam\n      ),\n      posteam_id = dplyr::if_else(\n        (.data$quarter_end == 1 | .data$posteam_id == \"\"),\n        dplyr::lead(.data$posteam_id),\n        .data$posteam_id\n      ),\n\n      # remove posteam from END Q2 plays or END Q4 plays (when game goes in OT)\n      # because it doesn't make sense and breaks fixed_drive and fixed_drive_result\n      posteam = dplyr::if_else(\n        stringr::str_detect(\n          .data$play_description,\n          \"(END QUARTER 2)|(END QUARTER 4)\"\n        ),\n        NA_character_,\n        .data$posteam\n      ),\n\n      # Denote whether the home or away team has possession:\n      posteam_type = dplyr::if_else(\n        .data$posteam == .data$home_team,\n        \"home\",\n        \"away\"\n      ),\n\n      # manual posteam adjustments for rare plays with issues related to game\n      # delays.\n      posteam = dplyr::case_when(\n        # 2025_01_CAR_JAX, 1317: Game resumed after weather delay\n        # AND it was delayed right after a PAT.\n        # Prior two plays were delay info that shouldn't have posteam in order\n        # to get correct fixed drive results #529\n        # https://github.com/nflverse/nflfastR/issues/529\n        .data$game_id == \"2025_01_CAR_JAX\" &\n          .data$play_id %in% c(1282, 1303) ~ NA_character_,\n        TRUE ~ .data$posteam\n      ),\n\n      # Column denoting which team is on defense:\n      defteam = dplyr::if_else(\n        .data$posteam == .data$home_team,\n        .data$away_team,\n        .data$home_team\n      ),\n\n      yardline = dplyr::if_else(\n        stringr::str_detect(.data$yardline, \"50\"),\n        \"MID 50\",\n        .data$yardline\n      ),\n      yardline = dplyr::if_else(\n        nchar(.data$yardline) == 0 |\n          is.null(.data$yardline) |\n          .data$yardline == \"NULL\" |\n          
is.na(.data$yardline),\n        dplyr::lead(.data$yardline),\n        .data$yardline\n      ),\n      yardline_number = dplyr::if_else(\n        .data$yardline == \"MID 50\",\n        50,\n        .data$yardline_number\n      ),\n      yardline_100 = dplyr::if_else(\n        .data$yardline_side == .data$posteam | .data$yardline == \"MID 50\",\n        100 - .data$yardline_number,\n        .data$yardline_number\n      ),\n      # Set the kick_distance for extra points by adding 18 to the yardline_100:\n      kick_distance = dplyr::if_else(\n        .data$extra_point_attempt == 1,\n        .data$yardline_100 + 18,\n        .data$kick_distance\n      ),\n      # Create a column with the time in seconds remaining for each half:\n      half_seconds_remaining = dplyr::if_else(\n        .data$quarter %in% c(1, 3),\n        .data$quarter_seconds_remaining + 900,\n        .data$quarter_seconds_remaining\n      ),\n      # Create a column with the time in seconds remaining for the game:\n      game_seconds_remaining = dplyr::if_else(\n        .data$quarter %in% c(1, 2, 3, 4),\n        .data$quarter_seconds_remaining +\n          (900 * (4 - as.numeric(.data$quarter))),\n        .data$quarter_seconds_remaining\n      ),\n      # Add column for replay or challenge:\n      replay_or_challenge = stringr::str_detect(\n        .data$play_description,\n        \"(Replay Official reviewed)|( challenge(d)? 
)|(Challenged)\"\n      ) |>\n        as.numeric(),\n      # Result of replay or challenge:\n      replay_or_challenge_result = dplyr::if_else(\n        .data$replay_or_challenge == 1,\n        dplyr::if_else(\n          stringr::str_detect(\n            tolower(.data$play_description),\n            \"( upheld)|( reversed)|( confirmed)\"\n          ),\n          stringr::str_extract(\n            tolower(.data$play_description),\n            \"( upheld)|( reversed)|( confirmed)\"\n          ) |>\n            stringr::str_trim(),\n          \"denied\"\n        ),\n        NA_character_\n      ),\n\n      # Create the column denoting the categorical description of the pass length:\n      pass_length = dplyr::if_else(\n        .data$two_point_attempt == 0 &\n          .data$sack == 0 &\n          .data$pass_attempt == 1,\n        .data$play_description |>\n          stringr::str_extract(\"pass (incomplete )?(short|deep)\") |>\n          stringr::str_extract(\"short|deep\"),\n        NA_character_\n      ),\n      # Create the column denoting the categorical location of the pass:\n      pass_location = dplyr::if_else(\n        .data$two_point_attempt == 0 &\n          .data$sack == 0 &\n          .data$pass_attempt == 1,\n        .data$play_description |>\n          stringr::str_extract(\"(short|deep) (left|middle|right)\") |>\n          stringr::str_extract(\"left|middle|right\"),\n        NA_character_\n      ),\n      # Indicator columns for both QB kneels, spikes, scrambles,\n      # no huddle, shotgun plays:\n      qb_kneel = dplyr::if_else(\n        stringr::str_detect(.data$play_description, \" kneels \") &\n          .data$kickoff_attempt != 1,\n        1,\n        0\n      ),\n      qb_spike = stringr::str_detect(.data$play_description, \" spiked \") |>\n        as.numeric(),\n      qb_scramble = stringr::str_detect(\n        .data$play_description,\n        \" scrambles \"\n      ) |>\n        as.numeric(),\n      shotgun = 
stringr::str_detect(.data$play_description, \"Shotgun\") |>\n        as.numeric(),\n      no_huddle = stringr::str_detect(.data$play_description, \"No Huddle\") |>\n        as.numeric(),\n\n      # Create a play type column: either pass, run, field_goal, extra_point,\n      # kickoff, punt, qb_kneel, qb_spike, or no_play (which includes timeouts and\n      # penalties):\n      play_type = translate_play_type_nfl(\n        .data$play_type_nfl,\n        qb_spike = .data$qb_spike,\n        qb_kneel = .data$qb_kneel,\n        pass_attempt = .data$pass_attempt,\n        rush_attempt = .data$rush_attempt,\n        punt_attempt = .data$punt_attempt,\n        field_goal_attempt = .data$field_goal_attempt,\n        penalty = .data$penalty,\n        is_penalty_enforced_between_downs = stringr::str_detect(\n          tolower(.data$play_description),\n          \"enforced between downs\"\n        )\n      ),\n\n      # Indicator for QB dropbacks (exclude spikes and kneels):\n      qb_dropback = dplyr::if_else(\n        .data$play_type == \"pass\" |\n          (.data$play_type == \"run\" &\n            .data$qb_scramble == 1),\n        1,\n        0\n      ),\n      # Columns denoting the run location and gap:\n      run_location = dplyr::if_else(\n        .data$two_point_attempt == 0 &\n          .data$rush_attempt == 1,\n        .data$play_description |>\n          stringr::str_extract(\" (left|middle|right) \") |>\n          stringr::str_trim(),\n        NA_character_\n      ),\n      run_gap = dplyr::if_else(\n        .data$two_point_attempt == 0 &\n          .data$rush_attempt == 1,\n        .data$play_description |>\n          stringr::str_extract(\" (guard|tackle|end) \") |>\n          stringr::str_trim(),\n        NA_character_\n      ),\n      game_half = dplyr::case_when(\n        .data$quarter %in% c(1, 2) ~ \"Half1\",\n        .data$quarter %in% c(3, 4) ~ \"Half2\",\n        .data$quarter >= 5 ~ \"Overtime\",\n        FALSE ~ NA_character_\n      ),\n      # Create 
columns to denote the timeouts remaining for each team, making\n      # columns for both home/away and pos/def (this will involve creating\n      # temporary columns that will not be included):\n      # Initialize both home and away to have 3 timeouts for each\n      # half except overtime where they have 2:\n\n      # extract timeouts from failed challenges when it's not otherwise there\n      tmp_timeout = stringr::str_extract(\n        .data$play_description,\n        \"(?<=by\\\\s)[:upper:]{2,3}(?=\\\\s)\"\n      ),\n      timeout_team = dplyr::if_else(\n        .data$replay_or_challenge == 1 &\n          .data$timeout == 1 &\n          is.na(.data$timeout_team),\n        .data$tmp_timeout,\n        .data$timeout_team\n      ),\n\n      home_timeouts_remaining = dplyr::if_else(\n        .data$quarter %in% c(1, 2, 3, 4),\n        3,\n        2\n      ),\n      away_timeouts_remaining = dplyr::if_else(\n        .data$quarter %in% c(1, 2, 3, 4),\n        3,\n        2\n      ),\n      home_timeout_used = dplyr::if_else(\n        .data$timeout == 1 &\n          .data$timeout_team == .data$home_team,\n        1,\n        0\n      ),\n      away_timeout_used = dplyr::if_else(\n        .data$timeout == 1 &\n          .data$timeout_team == .data$away_team,\n        1,\n        0\n      ),\n      home_timeout_used = dplyr::if_else(\n        is.na(.data$home_timeout_used),\n        0,\n        .data$home_timeout_used\n      ),\n      away_timeout_used = dplyr::if_else(\n        is.na(.data$away_timeout_used),\n        0,\n        .data$away_timeout_used\n      )\n    ) |>\n    # replace empty strings in yard line variables\n    dplyr::mutate_at(\n      .vars = c(\"yardline\", \"drive_start_yard_line\", \"drive_end_yard_line\"),\n      .funs = ~ dplyr::na_if(.x, \"\")\n    ) |>\n    # fix cases where a yardline variable misses the blank space between team name\n    # and yard number. 
At the point of adding this, the only spot where this happened\n    # was in the variable drive_start_yard_line in the games\n    # \"2000_01_CAR_WAS\", \"2000_02_NE_NYJ\", and \"2000_03_ATL_CAR\"\n    dplyr::mutate_at(\n      .vars = c(\"yardline\", \"drive_start_yard_line\", \"drive_end_yard_line\"),\n      .funs = ~ dplyr::case_when(\n        stringr::str_detect(.x, \"[:upper:]{2,3}(?=[:digit:]{1,2})\") ~\n          stringr::str_c(\n            stringr::str_extract(.x, \"[:upper:]{2,3}\"),\n            stringr::str_extract(.x, \"[:digit:]{1,2}\"),\n            sep = \" \"\n          ),\n        TRUE ~ .x\n      )\n    ) |>\n    # Group by the game_half to then create cumulative timeouts used for both\n    # the home and away teams:\n    dplyr::group_by(.data$game_id, .data$game_half) |>\n    dplyr::mutate(\n      total_home_timeouts_used = dplyr::if_else(\n        cumsum(.data$home_timeout_used) > 3,\n        3,\n        cumsum(.data$home_timeout_used)\n      ),\n      total_away_timeouts_used = dplyr::if_else(\n        cumsum(.data$away_timeout_used) > 3,\n        3,\n        cumsum(.data$away_timeout_used)\n      )\n    ) |>\n    dplyr::ungroup() |>\n    dplyr::group_by(.data$game_id) |>\n    # Now just take the difference between the timeouts remaining\n    # columns and the total timeouts used, and create the columns for both\n    # the pos and def team timeouts remaining:\n    dplyr::mutate(\n      home_timeouts_remaining = .data$home_timeouts_remaining -\n        .data$total_home_timeouts_used,\n      away_timeouts_remaining = .data$away_timeouts_remaining -\n        .data$total_away_timeouts_used,\n      posteam_timeouts_remaining = dplyr::if_else(\n        .data$posteam == .data$home_team,\n        .data$home_timeouts_remaining,\n        .data$away_timeouts_remaining\n      ),\n      defteam_timeouts_remaining = dplyr::if_else(\n        .data$defteam == .data$home_team,\n        .data$home_timeouts_remaining,\n        .data$away_timeouts_remaining\n      
),\n      # Same type of logic to calculate the score for each team and the score\n      # differential in the game. First create columns to track how many points\n      # were scored on a particular play based on various scoring indicators for\n      # both the home and away teams:\n      home_points_scored = dplyr::if_else(\n        .data$touchdown == 1 &\n          .data$td_team == .data$home_team,\n        6,\n        0\n      ),\n      home_points_scored = dplyr::if_else(\n        .data$posteam == .data$home_team &\n          .data$field_goal_made == 1,\n        3,\n        .data$home_points_scored\n      ),\n      home_points_scored = dplyr::if_else(\n        .data$posteam == .data$home_team &\n          (.data$extra_point_good == 1 |\n            .data$extra_point_safety == 1 |\n            .data$two_point_rush_safety == 1 |\n            .data$two_point_pass_safety == 1),\n        1,\n        .data$home_points_scored\n      ),\n      home_points_scored = dplyr::if_else(\n        .data$posteam == .data$home_team &\n          (.data$two_point_rush_good == 1 |\n            .data$two_point_pass_good == 1 |\n            .data$two_point_pass_reception_good == 1),\n        2,\n        .data$home_points_scored\n      ),\n      home_points_scored = dplyr::if_else(\n        .data$defteam == .data$home_team &\n          (.data$two_point_return == 1 | .data$defensive_two_point_conv == 1),\n        2,\n        .data$home_points_scored\n      ),\n      home_points_scored = dplyr::if_else(\n        .data$safety_team == .data$home_team & .data$safety == 1,\n        2,\n        .data$home_points_scored\n      ),\n      away_points_scored = dplyr::if_else(\n        .data$touchdown == 1 &\n          .data$td_team == .data$away_team,\n        6,\n        0\n      ),\n      away_points_scored = dplyr::if_else(\n        .data$posteam == .data$away_team &\n          .data$field_goal_made == 1,\n        3,\n        .data$away_points_scored\n      ),\n      away_points_scored = 
dplyr::if_else(\n        .data$posteam == .data$away_team &\n          (.data$extra_point_good == 1 |\n            .data$extra_point_safety == 1 |\n            .data$two_point_rush_safety == 1 |\n            .data$two_point_pass_safety == 1),\n        1,\n        .data$away_points_scored\n      ),\n      away_points_scored = dplyr::if_else(\n        .data$posteam == .data$away_team &\n          (.data$two_point_rush_good == 1 |\n            .data$two_point_pass_good == 1 |\n            .data$two_point_pass_reception_good == 1),\n        2,\n        .data$away_points_scored\n      ),\n      away_points_scored = dplyr::if_else(\n        .data$defteam == .data$away_team &\n          (.data$two_point_return == 1 | .data$defensive_two_point_conv == 1),\n        2,\n        .data$away_points_scored\n      ),\n      away_points_scored = dplyr::if_else(\n        .data$safety_team == .data$away_team & .data$safety == 1,\n        2,\n        .data$away_points_scored\n      ),\n      home_points_scored = dplyr::if_else(\n        is.na(.data$home_points_scored),\n        0,\n        .data$home_points_scored\n      ),\n      away_points_scored = dplyr::if_else(\n        is.na(.data$away_points_scored),\n        0,\n        .data$away_points_scored\n      ),\n      # Now create cumulative totals:\n      total_home_score = cumsum(.data$home_points_scored),\n      total_away_score = cumsum(.data$away_points_scored),\n      posteam_score = dplyr::if_else(\n        .data$posteam == .data$home_team,\n        dplyr::lag(.data$total_home_score),\n        dplyr::lag(.data$total_away_score)\n      ),\n      defteam_score = dplyr::if_else(\n        .data$defteam == .data$home_team,\n        dplyr::lag(.data$total_home_score),\n        dplyr::lag(.data$total_away_score)\n      ),\n      score_differential = .data$posteam_score - .data$defteam_score,\n      abs_score_differential = abs(.data$score_differential),\n      # Make post score differential columns to be used for the final\n      # 
game indicators in the win probability calculations:\n      posteam_score_post = dplyr::if_else(\n        .data$posteam == .data$home_team,\n        .data$total_home_score,\n        .data$total_away_score\n      ),\n      defteam_score_post = dplyr::if_else(\n        .data$defteam == .data$home_team,\n        .data$total_home_score,\n        .data$total_away_score\n      ),\n      score_differential_post = .data$posteam_score_post -\n        .data$defteam_score_post,\n      abs_score_differential_post = abs(\n        .data$posteam_score_post - .data$defteam_score_post\n      ),\n      # Create a variable for whether or not a touchback occurred, this\n      # will apply to any type of play:\n      touchback = as.numeric(stringr::str_detect(\n        tolower(.data$play_description),\n        \"touchback\"\n      )),\n      # There are a few plays with air_yards prior to 2006 (most likely accidentally).\n      # They are set to NA so the air_yac ep and wp calculations do not crash\n      air_yards = dplyr::if_else(.data$season < 2006, NA_real_, .data$air_yards)\n    ) |>\n    dplyr::rename(\n      ydstogo = \"yards_to_go\",\n      desc = \"play_description\",\n      yrdln = \"yardline\",\n      side_of_field = \"yardline_side\",\n      qtr = \"quarter\"\n    ) |>\n    dplyr::filter(\n      !is.na(.data$desc),\n      .data$desc != \"\",\n      !is.na(.data$qtr)\n    ) |>\n    dplyr::ungroup() |>\n    dplyr::mutate(\n      game_id = as.character(.data$game_id),\n      # kick distance is NA on kickoffs and punts that result in touchbacks\n      # (unless the kick/punt was caught between the end zones);\n      # we use yardline_100 to add it in those cases\n      is_relevant_touchback = as.numeric(\n        is.na(.data$kick_distance) &\n          .data$touchback == 1 &\n          .data$play_type %in% c(\"punt\", \"kickoff\")\n      ),\n      kick_distance = dplyr::case_when(\n        .data$is_relevant_touchback == 1 &\n          .data$kickoff_attempt == 0 ~ yardline_100,\n        # 
gotta reverse yardline_100 on kickoffs\n        .data$is_relevant_touchback == 1 & .data$kickoff_attempt == 1 ~ 100 -\n          yardline_100,\n        TRUE ~ .data$kick_distance\n      ),\n      # drop helper variable\n      is_relevant_touchback = NULL\n    ) |>\n    fix_scrambles() |>\n    make_model_mutations()\n\n  user_message(\"added nflscrapR variables\", \"done\")\n  return(out)\n}\n\n# to help find kickoffs on plays with penalties\n# otherwise win prob breaks down the road\nkickoff_finder <- \"(Offside on Free Kick)|(Delay of Kickoff)|(Onside Kick formation)|(kicks onside)|( kicks [:digit:]+ yards from)\"\n\n\n##some steps to prepare the data for the EP/WP/CP/FG models\nmake_model_mutations <- function(pbp) {\n  pbp <- pbp |>\n    dplyr::mutate(\n      #for EP, CP, and WP model, xgb needs 0/1 for eras\n      era0 = dplyr::if_else(.data$season <= 2001, 1, 0),\n      era1 = dplyr::if_else(.data$season > 2001 & .data$season <= 2005, 1, 0),\n      era2 = dplyr::if_else(.data$season > 2005 & .data$season <= 2013, 1, 0),\n      era3 = dplyr::if_else(.data$season > 2013 & .data$season <= 2017, 1, 0),\n      era4 = dplyr::if_else(.data$season > 2017, 1, 0),\n      #for fg model, an era factor\n      era = dplyr::case_when(\n        .data$era0 == 1 ~ 0,\n        .data$era1 == 1 ~ 1,\n        .data$era2 == 1 ~ 2,\n        .data$era3 == 1 | era4 == 1 ~ 3\n      ),\n      era = as.factor(.data$era),\n      down1 = dplyr::if_else(.data$down == 1, 1, 0),\n      down2 = dplyr::if_else(.data$down == 2, 1, 0),\n      down3 = dplyr::if_else(.data$down == 3, 1, 0),\n      down4 = dplyr::if_else(.data$down == 4, 1, 0),\n      home = dplyr::if_else(.data$posteam == .data$home_team, 1, 0),\n      model_roof = dplyr::if_else(\n        is.na(.data$roof) | .data$roof == 'open' | .data$roof == 'closed',\n        as.character('retractable'),\n        as.character(.data$roof)\n      ),\n      model_roof = as.factor(.data$model_roof),\n      retractable = 
dplyr::if_else(.data$model_roof == 'retractable', 1, 0),\n      dome = dplyr::if_else(.data$model_roof == 'dome', 1, 0),\n      outdoors = dplyr::if_else(.data$model_roof == 'outdoors', 1, 0)\n    )\n\n  return(pbp)\n}\n\n\nfix_scrambles <- function(pbp) {\n  # skip the code below if the data contains no season <= 2005\n  if (min(pbp$season) > 2005) {\n    return(pbp)\n  }\n\n  pbp |>\n    dplyr::mutate(\n      scramble_id = paste0(.data$game_id, \"_\", .data$play_id),\n      qb_scramble = dplyr::if_else(\n        .data$scramble_id %in% scramble_fix,\n        1,\n        .data$qb_scramble\n      )\n    ) |>\n    dplyr::select(-\"scramble_id\")\n\n  # Some notes on the scramble_fix:\n  # This marks scrambles in the 1999 - 2005 seasons using charting data\n  # because the NFL did not put scrambles in the play description during\n  # these seasons. Data from Aaron Schatz!\n}\n\ntranslate_play_type_nfl <- function(\n  play_type_nfl,\n  qb_spike,\n  qb_kneel,\n  pass_attempt,\n  rush_attempt,\n  punt_attempt,\n  field_goal_attempt,\n  penalty,\n  is_penalty_enforced_between_downs\n) {\n  # I want the arg name to be descriptive, but I want a short variable name\n  # for the code below\n  x <- play_type_nfl\n\n  out <- dplyr::case_when(\n    x == \"COMMENT\" ~ NA_character_,\n    x == \"END_GAME\" ~ NA_character_,\n    x == \"END_QUARTER\" ~ NA_character_,\n    x == \"FIELD_GOAL\" ~ \"field_goal\",\n    x == \"FREE_KICK\" ~ \"kickoff\",\n    x == \"GAME_START\" ~ NA_character_,\n    x == \"INTERCEPTION\" ~ \"pass\",\n    x == \"KICK_OFF\" ~ \"kickoff\",\n    x == \"PASS\" ~ \"pass\",\n    x == \"PAT2\" & pass_attempt == 1 ~ \"pass\",\n    x == \"PAT2\" & rush_attempt == 1 ~ \"run\",\n    x == \"PENALTY\" &\n      pass_attempt == 1 &\n      is_penalty_enforced_between_downs ~ \"pass\",\n    x == \"PENALTY\" &\n      rush_attempt == 1 &\n      is_penalty_enforced_between_downs ~ \"run\",\n    x == \"PENALTY\" ~ \"no_play\",\n    x == \"PUNT\" ~ \"punt\",\n    x == \"RUSH\" ~ \"run\",\n    x == 
\"SACK\" ~ \"pass\",\n    x == \"TIMEOUT\" ~ \"no_play\",\n    x == \"XP_KICK\" ~ \"extra_point\",\n\n    # UNSPECIFIED is a mix of all sorts of weird plays\n    x == \"UNSPECIFIED\" & penalty == 1 ~ \"no_play\",\n\n    # the following lines imply penalty == 0 because penalty == 1 triggers above\n    x == \"UNSPECIFIED\" & pass_attempt == 1 ~ \"pass\",\n    x == \"UNSPECIFIED\" & rush_attempt == 1 ~ \"run\",\n    x == \"UNSPECIFIED\" & punt_attempt == 1 ~ \"punt\",\n    x == \"UNSPECIFIED\" & field_goal_attempt == 1 ~ \"field_goal\",\n\n    # most of the remaining UNSPECIFIED plays will be declined penalties\n    # from punt or fg formation. These don't really count as play so we define\n    # them as no_play\n    x == \"UNSPECIFIED\" ~ \"no_play\",\n\n    # default\n    TRUE ~ \"\"\n  )\n\n  # every play_type_nfl that we do not catch in the above cases\n  # will be an empty string. We try to resolve these as good as we can\n  # also need to replace passes and runs that were spikes and kneel downs\n  dplyr::case_when(\n    out == \"\" & penalty == 1 ~ \"no_play\",\n    out == \"\" & pass_attempt == 1 ~ \"pass\",\n    out == \"\" & rush_attempt == 1 ~ \"run\",\n    out == \"\" & punt_attempt == 1 ~ \"punt\",\n    out == \"\" & field_goal_attempt == 1 ~ \"field_goal\",\n    qb_spike == 1 & out %in% c(\"pass\", \"run\") ~ \"qb_spike\",\n    qb_kneel == 1 & out %in% c(\"pass\", \"run\") ~ \"qb_kneel\",\n    TRUE ~ out\n  )\n}\n\n# we overwrite kickoff_attempt for kickoffs with penalties because\n# those mess with ep/epa/wp/wpa. Since this is inconsistent compared to\n# all other *_attempt variables, we will restore kickoff_attempt after\n# models are applied. That's done with a temporary copy of kickoff_attempt.\n# See #556, #202, #199 for example\nrestore_kickoff_attempt <- function(pbp) {\n  pbp |>\n    dplyr::mutate(\n      kickoff_attempt = .data$copy_of_kickoff_attempt,\n      copy_of_kickoff_attempt = NULL\n    )\n}\n"
  },
  {
    "path": "R/helper_add_series_data.R",
    "content": "################################################################################\n# Author: Sebastian Carl, Ben Baldwin\n# Purpose: Function to add series variables analogue Lee Sharpe's Version\n# Code Style Guide: styler::tidyverse_style()\n################################################################################\n\n## series =\n##  starts at 1, each new first down increments, numbers shared across both teams\n##  NA: kickoffs, extra point/two point conversion attempts, non-plays, no posteam\n## series_success =\n##  1: scored touchdown, gained enough yards for first down\n##  0: everything else\nadd_series_data <- function(pbp) {\n  out <-\n    pbp |>\n    dplyr::mutate(\n      old_posteam = .data$posteam,\n      posteam = dplyr::case_when(\n        # on kickoffs the kicking team is the defteam but this should be swapped\n        # in terms of this function if the kickoff is recovered\n        .data$kickoff_attempt == 1 &\n          (.data$own_kickoff_recovery == 1 |\n            .data$fumble_lost == 1) ~ .data$defteam,\n        # if a kickoff has to be replayed due to a penalty and is then recovered,\n        # the prior (reversed) kickoff shouldn't be a new drive/series\n        stringr::str_detect(.data$desc, kickoff_finder) &\n          .data$own_kickoff_recovery == 0 &\n          dplyr::lead(.data$own_kickoff_recovery == 1) ~ .data$defteam,\n        TRUE ~ .data$posteam\n      )\n    ) |>\n    dplyr::group_by(.data$game_id, .data$game_half) |>\n    dplyr::mutate(\n      row = 1:dplyr::n(),\n      new_series = dplyr::if_else(\n        # a new drive\n        .data$fixed_drive != dplyr::lag(.data$fixed_drive) |\n          # or a first down on the prior play except touchdown plays\n          ((dplyr::lag(.data$first_down_rush) == 1 |\n            dplyr::lag(.data$first_down_pass) == 1 |\n            dplyr::lag(.data$first_down_penalty) == 1) &\n            dplyr::lag(.data$touchdown) == 0) |\n          # or the first play\n          
.data$row == 1,\n        1,\n        0\n      ),\n      new_series = dplyr::if_else(is.na(.data$new_series), 0, .data$new_series)\n    ) |>\n    # now compute series number with cumsum (for the calculation NA are being replaced with 0)\n    dplyr::group_by(.data$game_id) |>\n    dplyr::mutate(\n      series = cumsum(.data$new_series),\n      tmp_result = dplyr::case_when(\n        (.data$first_down_penalty == 1 |\n          .data$first_down_rush == 1 |\n          .data$first_down_pass == 1) &\n          touchdown == 0 ~ \"First down\",\n        .data$touchdown == 1 & .data$posteam == .data$td_team ~ \"Touchdown\",\n        .data$touchdown == 1 & .data$posteam != .data$td_team ~ \"Opp touchdown\",\n        .data$field_goal_result == \"made\" ~ \"Field goal\",\n        .data$field_goal_result %in%\n          c(\"blocked\", \"missed\") ~ \"Missed field goal\",\n        .data$safety == 1 ~ \"Safety\",\n        .data$play_type == \"punt\" | .data$punt_attempt == 1 ~ \"Punt\",\n        .data$interception == 1 | .data$fumble_lost == 1 ~ \"Turnover\",\n        .data$down == 4 &\n          .data$yards_gained < .data$ydstogo &\n          .data$play_type != \"no_play\" ~ \"Turnover on downs\",\n        .data$qb_kneel == 1 ~ \"QB kneel\",\n        stringr::str_detect(\n          .data$desc,\n          \"(END QUARTER 2)|(END QUARTER 4)|(END GAME)\"\n        ) ~ \"End of half\"\n      )\n    ) |>\n    dplyr::group_by(.data$game_id, .data$series) |>\n    dplyr::mutate(\n      series_result = dplyr::if_else(\n        # if it's end of half, take the first thing we see\n        dplyr::last(stats::na.omit(.data$tmp_result)) == \"End of half\",\n        dplyr::first(stats::na.omit(.data$tmp_result)),\n        # otherwise take the last\n        dplyr::last(stats::na.omit(.data$tmp_result))\n      ),\n      series_success = dplyr::if_else(\n        .data$series_result %in% c(\"Touchdown\", \"First down\"),\n        1,\n        0\n      )\n    ) |>\n    dplyr::ungroup() |>\n    
dplyr::mutate(posteam = .data$old_posteam) |>\n    dplyr::select(-\"row\", -\"tmp_result\", -\"new_series\", -\"old_posteam\")\n\n  user_message(\"added series variables\", \"done\")\n  return(out)\n}\n"
  },
  {
    "path": "R/helper_add_xpass.R",
    "content": "################################################################################\n# Author: Ben Baldwin\n# Stlyeguide: styler::tidyverse_style()\n################################################################################\n\n#' Add expected pass columns\n#'\n#' @inheritParams clean_pbp\n#' @description Build columns from the expected dropback model. Will return\n#' `NA` on data prior to 2006 since that was before NFL started marking scrambles.\n#' Must be run on a dataframe that has already had [clean_pbp()] run on it.\n#' Note that the functions [build_nflfastR_pbp()] and\n#' the database function [update_db()] already include this function.\n#' @return The input Data Frame of the parameter `pbp` with the following columns\n#' added:\n#' \\describe{\n#' \\item{xpass}{Probability of dropback scaled from 0 to 1.}\n#' \\item{pass_oe}{Dropback percent over expected on a given play scaled from 0 to 100.}\n#' }\n#' @export\nadd_xpass <- function(pbp, ...) {\n  if (nrow(pbp) == 0) {\n    user_message(\"Nothing to do. 
Return passed data frame.\", \"info\")\n    return(pbp)\n  }\n  pbp <- pbp |> dplyr::select(-dplyr::any_of(c(\"xpass\", \"pass_oe\")))\n  plays <- prepare_xpass_data(pbp)\n\n  if (!nrow(plays |> dplyr::filter(.data$valid_play == 1)) == 0) {\n    user_message(\"Computing xpass...\", \"todo\")\n\n    pred <- stats::predict(\n      load_model(\"xpass\"),\n      as.matrix(plays |> dplyr::select(-\"valid_play\"))\n    ) |>\n      tibble::as_tibble() |>\n      dplyr::rename(xpass = \"value\") |>\n      dplyr::bind_cols(plays) |>\n      dplyr::select(\"xpass\", \"valid_play\")\n\n    pbp <- pbp |>\n      dplyr::bind_cols(pred) |>\n      dplyr::mutate(\n        xpass = dplyr::if_else(\n          .data$valid_play == 1,\n          .data$xpass,\n          NA_real_\n        ),\n        pass_oe = dplyr::if_else(\n          !is.na(.data$xpass),\n          100 * (.data$pass - .data$xpass),\n          NA_real_\n        ),\n        pass_oe = dplyr::if_else(\n          .data$rush == 0 & .data$pass == 0,\n          NA_real_,\n          .data$pass_oe\n        )\n      ) |>\n      dplyr::select(-\"valid_play\")\n\n    message_completed(\"added xpass and pass_oe\", ...)\n  } else {\n    pbp <- pbp |>\n      dplyr::mutate(\n        xpass = NA_real_,\n        pass_oe = NA_real_\n      )\n    user_message(\n      \"No non-NA values for xpass calculation detected. 
xpass and pass_oe set to NA\",\n      \"info\"\n    )\n  }\n  return(pbp)\n}\n\nprepare_xpass_data <- function(pbp) {\n  plays <- pbp |>\n    dplyr::mutate(\n      valid_play = dplyr::if_else(\n        .data$season >= 2006 &\n          .data$play_type %in% c(\"no_play\", \"pass\", \"run\") &\n          !is.na(.data$posteam) &\n          !is.na(.data$down) &\n          !is.na(.data$defteam_timeouts_remaining) &\n          !is.na(.data$posteam_timeouts_remaining) &\n          !is.na(.data$yardline_100) &\n          !is.na(.data$score_differential),\n        1,\n        0\n      )\n    ) |>\n    make_model_mutations() |>\n    dplyr::select(\n      \"valid_play\",\n      \"down\",\n      \"ydstogo\",\n      \"yardline_100\",\n      \"qtr\",\n      \"wp\",\n      \"vegas_wp\",\n      \"era2\",\n      \"era3\",\n      \"era4\",\n      \"score_differential\",\n      \"home\",\n      \"half_seconds_remaining\",\n      \"posteam_timeouts_remaining\",\n      \"defteam_timeouts_remaining\",\n      \"outdoors\",\n      \"retractable\",\n      \"dome\"\n    )\n\n  return(plays)\n}\n"
  },
  {
    "path": "R/helper_add_xyac.R",
    "content": "################################################################################\n# Author: Ben Baldwin, Sebastian Carl\n# Purpose: Function to add expected yac variables.\n# Code Style Guide: styler::tidyverse_style()\n################################################################################\n#' Add expected yards after completion (xyac) variables\n#'\n#' @inheritParams clean_pbp\n#' @details Build columns that capture what we should expect after the catch.\n#' @return The input Data Frame of the parameter 'pbp' with the following columns\n#' added:\n#' \\describe{\n#' \\item{xyac_epa}{Expected value of EPA gained after the catch, starting from where the catch was made. Zero yards after the catch would be listed as zero EPA.}\n#' \\item{xyac_success}{Probability play earns positive EPA (relative to where play started) based on where ball was caught.}\n#' \\item{xyac_fd}{Probability play earns a first down based on where the ball was caught.}\n#' \\item{xyac_mean_yardage}{Average expected yards after the catch based on where the ball was caught.}\n#' \\item{xyac_median_yardage}{Median expected yards after the catch based on where the ball was caught.}\n#' }\n#' @export\nadd_xyac <- function(pbp, ...) {\n  if (nrow(pbp) == 0) {\n    user_message(\"Nothing to do. 
Return passed data frame.\", \"info\")\n  } else {\n    # testing only\n    # pbp <- g\n\n    pbp <- pbp |> dplyr::select(-dplyr::any_of(drop.cols.xyac))\n\n    # for joining at the end\n    pbp <- pbp |>\n      dplyr::mutate(index = 1:dplyr::n())\n\n    # prepare_xyac_data helper function shown below\n    passes <- prepare_xyac_data(pbp) |>\n      dplyr::filter(.data$valid_pass == 1, .data$distance_to_goal != 0)\n\n    if (!nrow(passes) == 0) {\n      user_message(\"Computing xyac...\", \"todo\")\n      join_data <- passes |>\n        dplyr::select(\n          \"index\",\n          \"distance_to_goal\",\n          \"season\",\n          \"week\",\n          \"home_team\",\n          \"posteam\",\n          \"roof\",\n          \"half_seconds_remaining\",\n          \"down\",\n          \"ydstogo\",\n          \"posteam_timeouts_remaining\",\n          \"defteam_timeouts_remaining\",\n          \"original_spot\" = \"yardline_100\",\n          \"original_ep\" = \"ep\",\n          \"air_epa\",\n          \"air_yards\"\n        ) |>\n        dplyr::mutate(\n          down = as.integer(.data$down),\n          ydstogo = as.integer(.data$ydstogo),\n          original_ydstogo = .data$ydstogo\n        ) |>\n        dplyr::select(\n          \"index\":\"ydstogo\",\n          \"original_ydstogo\",\n          dplyr::everything()\n        )\n\n      preds <- stats::predict(\n        load_model(\"xyac\"),\n        as.matrix(passes |> xyac_model_select())\n      )\n\n      # xgboost v3 returns a matrix of predictions but the below code is designed\n      # to work with a vector as returned by xgboost v1.\n      # We switch back to the vector (transposing might be expensive) in order\n      # to keep the rest of the code (for now).\n      if (is.matrix(preds)) {\n        preds <- preds |>\n          t() |>\n          as.vector(\"numeric\")\n      }\n\n      xyac_vars <- preds |>\n        tibble::as_tibble() |>\n        dplyr::rename(prob = \"value\") |>\n        
dplyr::bind_cols(\n          tibble::tibble(\n            \"yac\" = rep_len(-5:70, length.out = nrow(passes) * 76),\n            \"index\" = rep(\n              passes$index,\n              times = rep_len(76, length.out = nrow(passes))\n            )\n          ) |>\n            dplyr::left_join(join_data, by = \"index\") |>\n            dplyr::mutate(\n              half_seconds_remaining = dplyr::if_else(\n                .data$half_seconds_remaining <= 6,\n                0,\n                .data$half_seconds_remaining - 6\n              )\n            )\n        ) |>\n        dplyr::group_by(.data$index) |>\n        dplyr::mutate(\n          max_loss = dplyr::if_else(\n            .data$distance_to_goal < 95,\n            -5,\n            .data$distance_to_goal - 99\n          ),\n          max_gain = dplyr::if_else(\n            .data$distance_to_goal > 70,\n            70,\n            .data$distance_to_goal\n          ),\n          cum_prob = cumsum(.data$prob),\n          prob = dplyr::case_when(\n            # truncate probs at loss greater than max loss\n            .data$yac == .data$max_loss ~ .data$cum_prob,\n            # same for gains bigger than possible\n            .data$yac == .data$max_gain ~ 1 - dplyr::lag(.data$cum_prob, 1),\n            TRUE ~ .data$prob\n          ),\n          # get updated end result for each possibility\n          yardline_100 = .data$distance_to_goal - .data$yac\n        ) |>\n        dplyr::filter(\n          .data$yac >= .data$max_loss,\n          .data$yac <= .data$max_gain\n        ) |>\n        dplyr::select(-\"cum_prob\") |>\n        dplyr::mutate(\n          posteam_timeouts_pre = .data$posteam_timeouts_remaining,\n          defeam_timeouts_pre = .data$defteam_timeouts_remaining,\n          gain = .data$original_spot - .data$yardline_100,\n          turnover = dplyr::if_else(\n            .data$down == 4 & .data$gain < .data$ydstogo,\n            as.integer(1),\n            as.integer(0)\n          ),\n         
 down = dplyr::if_else(.data$gain >= .data$ydstogo, 1, .data$down + 1),\n          ydstogo = dplyr::if_else(\n            .data$gain >= .data$ydstogo,\n            10,\n            .data$ydstogo - .data$gain\n          ),\n          # possession change if 4th down failed\n          down = dplyr::if_else(\n            .data$turnover == 1,\n            as.integer(1),\n            as.integer(.data$down)\n          ),\n          ydstogo = dplyr::if_else(\n            .data$turnover == 1,\n            as.integer(10),\n            as.integer(.data$ydstogo)\n          ),\n          # save yardline_100 for yards gained calculation\n          yardline_100_noflip = .data$yardline_100,\n          # flip yardline_100 and timeouts for turnovers for EP calculation\n          yardline_100 = dplyr::if_else(\n            .data$turnover == 1,\n            as.integer(100 - .data$yardline_100),\n            as.integer(.data$yardline_100)\n          ),\n          posteam_timeouts_remaining = dplyr::if_else(\n            .data$turnover == 1,\n            .data$defeam_timeouts_pre,\n            .data$posteam_timeouts_pre\n          ),\n          defteam_timeouts_remaining = dplyr::if_else(\n            .data$turnover == 1,\n            .data$posteam_timeouts_pre,\n            .data$defeam_timeouts_pre\n          ),\n          # ydstogo can't be bigger than yardline\n          ydstogo = dplyr::if_else(\n            .data$ydstogo >= .data$yardline_100,\n            as.integer(.data$yardline_100),\n            as.integer(.data$ydstogo)\n          )\n        ) |>\n        dplyr::ungroup() |>\n        nflfastR::calculate_expected_points() |>\n        dplyr::group_by(.data$index) |>\n        dplyr::mutate(\n          ep = dplyr::case_when(\n            .data$yardline_100 == 0 ~ 7,\n            .data$turnover == 1 ~ -1 * .data$ep,\n            TRUE ~ ep\n          ),\n          epa = .data$ep - .data$original_ep,\n          wt_epa = .data$epa * .data$prob,\n          wt_yardln = 
.data$yardline_100_noflip * .data$prob,\n          med = dplyr::if_else(\n            cumsum(.data$prob) > .5 & dplyr::lag(cumsum(.data$prob) < .5),\n            .data$yac,\n            as.integer(0)\n          )\n        ) |>\n        dplyr::summarise(\n          xyac_epa = sum(.data$wt_epa) - dplyr::first(.data$air_epa),\n          xyac_mean_yardage = (dplyr::first(.data$original_spot) -\n            dplyr::first(.data$air_yards)) -\n            sum(.data$wt_yardln),\n          xyac_median_yardage = max(.data$med),\n          xyac_success = sum((.data$ep > .data$original_ep) * .data$prob),\n          xyac_fd = sum((.data$gain >= .data$original_ydstogo) * .data$prob)\n        ) |>\n        dplyr::ungroup()\n\n      pbp <- pbp |>\n        dplyr::left_join(xyac_vars, by = \"index\") |>\n        dplyr::select(-\"index\")\n\n      message_completed(\"added xyac variables\", ...)\n    } else {\n      # means no valid pass plays in the pbp\n      pbp <- pbp |>\n        dplyr::mutate(\n          xyac_epa = NA_real_,\n          xyac_mean_yardage = NA_real_,\n          xyac_median_yardage = NA_real_,\n          xyac_success = NA_real_,\n          xyac_fd = NA_real_\n        ) |>\n        dplyr::select(-\"index\")\n      user_message(\n        \"No non-NA values for xyac calculation detected. 
xyac variables set to NA\",\n        \"info\"\n      )\n    }\n  }\n\n  return(pbp)\n}\n\n\n### helper function for getting the data ready\nprepare_xyac_data <- function(pbp) {\n  # valid pass play: at least -15 air yards, less than 70 air yards, has intended receiver, has pass location\n  passes <- pbp |>\n    make_model_mutations() |>\n    dplyr::mutate(\n      receiver_player_name = stringr::str_extract(\n        .data$desc,\n        glue::glue('{receiver_finder}{big_parser}')\n      ),\n      pass_middle = dplyr::if_else(.data$pass_location == \"middle\", 1, 0),\n      air_is_zero = dplyr::if_else(.data$air_yards == 0, 1, 0),\n      distance_to_sticks = .data$air_yards - .data$ydstogo,\n      distance_to_goal = .data$yardline_100 - .data$air_yards,\n      valid_pass = dplyr::if_else(\n        (.data$complete_pass == 1 |\n          .data$incomplete_pass == 1 |\n          .data$interception == 1) &\n          !is.na(.data$air_yards) &\n          .data$air_yards >= -15 &\n          .data$air_yards < 70 &\n          !is.na(.data$receiver_player_name) &\n          !is.na(.data$pass_location),\n        1,\n        0\n      )\n    )\n  return(passes)\n}\n\n### another helper function for getting the data ready\nxyac_model_select <- function(pbp) {\n  pbp |>\n    dplyr::select(\n      \"air_yards\",\n      \"yardline_100\",\n      \"ydstogo\",\n      \"distance_to_goal\",\n      \"down1\",\n      \"down2\",\n      \"down3\",\n      \"down4\",\n      \"air_is_zero\",\n      \"pass_middle\",\n      \"era2\",\n      \"era3\",\n      \"era4\",\n      \"qb_hit\",\n      \"home\",\n      \"outdoors\",\n      \"retractable\",\n      \"dome\",\n      \"distance_to_sticks\"\n    )\n}\n\n# These columns are being generated by add_xyac and the function tries to drop\n# them in case it is being used on a pbp dataset where the columns already exist\ndrop.cols.xyac <- c(\n  \"xyac_epa\",\n  \"xyac_mean_yardage\",\n  \"xyac_median_yardage\",\n  \"xyac_success\",\n  \"xyac_fd\"\n)\n"
  },
  {
    "path": "R/helper_additional_functions.R",
    "content": "################################################################################\n# Author: Ben Baldwin, Sebastian Carl, Tan Ho\n# Stlyeguide: styler::tidyverse_style()\n################################################################################\n\n#' Clean Play by Play Data\n#'\n#' @param pbp is a Data frame of play-by-play data scraped using [fast_scraper()].\n#' @param ... Additional arguments passed to a message function (for internal use).\n#' @details Build columns that capture what happens on all plays, including\n#' penalties, using string extraction from play description.\n#' Loosely based on Ben's nflfastR guide (<https://nflfastr.com/articles/beginners_guide.html>)\n#' but updated to work with the RS data, which has a different player format in\n#' the play description; e.g. 24-M.Lynch instead of M.Lynch.\n#' The function also standardizes team abbreviations so that, for example,\n#' the Chargers are always represented by 'LAC' regardless of which year it was.\n#' Starting in 2022, play-by-play data was missing gsis player IDs of rookies.\n#' This functions tries to fix as many as possible.\n#' @seealso For information on parallel processing and progress updates please\n#' see [nflfastR].\n#' @return The input Data Frame of the parameter 'pbp' with the following columns\n#' added:\n#' \\describe{\n#' \\item{success}{Binary indicator wheter epa > 0 in the given play. 
}\n#' \\item{passer}{Name of the dropback player (scrambles included) including plays with penalties.}\n#' \\item{passer_jersey_number}{Jersey number of the passer.}\n#' \\item{rusher}{Name of the rusher (no scrambles) including plays with penalties.}\n#' \\item{rusher_jersey_number}{Jersey number of the rusher.}\n#' \\item{receiver}{Name of the receiver including plays with penalties.}\n#' \\item{receiver_jersey_number}{Jersey number of the receiver.}\n#' \\item{pass}{Binary indicator if the play was a pass play (sacks and scrambles included).}\n#' \\item{rush}{Binary indicator if the play was a rushing play.}\n#' \\item{special}{Binary indicator if the play was a special teams play.}\n#' \\item{first_down}{Binary indicator if the play ended in a first down.}\n#' \\item{aborted_play}{Binary indicator if the play description indicates \"Aborted\".}\n#' \\item{play}{Binary indicator: 1 if the play was a 'normal' play (including penalties), 0 otherwise.}\n#' \\item{passer_id}{ID of the player in the 'passer' column.}\n#' \\item{rusher_id}{ID of the player in the 'rusher' column.}\n#' \\item{receiver_id}{ID of the player in the 'receiver' column.}\n#' \\item{name}{Name of the 'passer' if it is not 'NA', or name of the 'rusher' otherwise.}\n#' \\item{fantasy}{Name of the rusher on rush plays or receiver on pass plays.}\n#' \\item{fantasy_id}{ID of the rusher on rush plays or receiver on pass plays.}\n#' \\item{fantasy_player_name}{Name of the rusher on rush plays or receiver on pass plays (from official stats).}\n#' \\item{fantasy_player_id}{ID of the rusher on rush plays or receiver on pass plays (from official stats).}\n#' \\item{jersey_number}{Jersey number of the player listed in the 'name' column.}\n#' \\item{id}{ID of the player in the 'name' column.}\n#' \\item{out_of_bounds}{= 1 if play description contains \"ran ob\", \"pushed ob\", or \"sacked ob\"; = 0 otherwise.}\n#' \\item{home_opening_kickoff}{= 1 if the home team received the opening kickoff, 0 
otherwise.}\n#' }\n#' @export\nclean_pbp <- function(pbp, ...) {\n  if (nrow(pbp) == 0) {\n    user_message(\"Nothing to clean. Return passed data frame.\", \"info\")\n    r <- pbp\n  } else {\n    user_message(\"Cleaning up play-by-play...\", \"todo\")\n\n    # drop existing values of clean_pbp\n    pbp <- pbp |> dplyr::select(-dplyr::any_of(drop.cols))\n\n    r <- pbp |>\n      dplyr::mutate(\n        aborted_play = dplyr::if_else(\n          stringr::str_detect(.data$desc, 'Aborted'),\n          1,\n          0\n        ),\n        #get rid of extraneous spaces that mess with player name finding\n        #if there is a space or dash, and then a capital letter, and then a period, and then a space, take out the space\n        desc = stringr::str_replace_all(\n          .data$desc,\n          \"(((\\\\s)|(\\\\-))[A-Z]\\\\.)\\\\s+\",\n          \"\\\\1\"\n        ),\n        success = dplyr::if_else(\n          is.na(.data$epa),\n          NA_real_,\n          dplyr::if_else(.data$epa > 0, 1, 0)\n        ),\n        passer = stringr::str_extract(\n          .data$desc,\n          glue::glue('{big_parser}{pass_finder}')\n        ),\n        passer_jersey_number = stringr::str_extract(\n          stringr::str_extract(\n            .data$desc,\n            glue::glue('{number_parser}{big_parser}{pass_finder}')\n          ),\n          \"[:digit:]*\"\n        ) |>\n          as.integer(),\n        rusher = stringr::str_extract(\n          .data$desc,\n          glue::glue('{big_parser}{rush_finder}')\n        ),\n        rusher_jersey_number = stringr::str_extract(\n          stringr::str_extract(\n            .data$desc,\n            glue::glue('{number_parser}{big_parser}{rush_finder}')\n          ),\n          \"[:digit:]*\"\n        ) |>\n          as.integer(),\n        #get rusher_player_name as a measure of last resort\n        #finds things like aborted snaps and \"F.Last to NYG 44.\"\n        rusher = dplyr::if_else(\n          is.na(.data$rusher) &\n           
 is.na(.data$passer) &\n            !is.na(.data$rusher_player_name),\n          .data$rusher_player_name,\n          .data$rusher\n        ),\n        receiver = stringr::str_extract(\n          .data$desc,\n          glue::glue('{receiver_finder}{big_parser}')\n        ),\n        receiver_jersey_number = stringr::str_extract(\n          stringr::str_extract(\n            .data$desc,\n            glue::glue('{receiver_number}{big_parser}')\n          ),\n          \"[:digit:]*\"\n        ) |>\n          as.integer(),\n        #overwrite all these weird plays messing with the parser\n        receiver = dplyr::case_when(\n          stringr::str_detect(.data$desc, glue::glue('{abnormal_play}')) &\n            !is.na(.data$receiver_player_name) ~ .data$receiver_player_name,\n          TRUE ~ .data$receiver\n        ),\n        rusher = dplyr::case_when(\n          stringr::str_detect(.data$desc, glue::glue('{abnormal_play}')) &\n            !is.na(.data$rusher_player_name) ~ .data$rusher_player_name,\n          TRUE ~ .data$rusher\n        ),\n        passer = dplyr::case_when(\n          stringr::str_detect(.data$desc, glue::glue('{abnormal_play}')) &\n            !is.na(.data$passer_player_name) ~ .data$passer_player_name,\n          TRUE ~ .data$passer\n        ),\n        # fix the plays where scramble was fixed using charting data from 1999 to 2005\n        passer = dplyr::case_when(\n          is.na(.data$passer) &\n            .data$qb_scramble == 1 &\n            !is.na(.data$rusher) &\n            .data$season <= 2005 ~ .data$rusher,\n          TRUE ~ .data$passer\n        ),\n        # finally, for rusher, if there was already a passer (eg from scramble), set rusher to NA\n        rusher = dplyr::if_else(\n          !is.na(.data$passer),\n          NA_character_,\n          .data$rusher\n        ),\n        # if no pass is thrown, there shouldn't be a receiver\n        receiver = dplyr::if_else(\n          stringr::str_detect(.data$desc, ' pass '),\n        
  .data$receiver,\n          NA_character_\n        ),\n        # if there's a pass, sack, or scramble, it's a pass play...\n        pass = dplyr::if_else(\n          stringr::str_detect(.data$desc, \"( pass )|(sacked)|(scramble)\") |\n            .data$qb_scramble == 1,\n          1,\n          0\n        ),\n        # ...unless it says \"backward(s) pass\" or \"lateral pass\" and there's a rusher\n        pass = dplyr::if_else(\n          stringr::str_detect(\n            stringr::str_to_lower(.data$desc),\n            \"(backward pass)|(backwards pass)|(lateral pass)\"\n          ) &\n            !is.na(.data$rusher),\n          0,\n          .data$pass\n        ),\n        # and make sure there's no pass on a kickoff (sometimes there's forward pass on kickoff but that's not a pass play)\n        pass = dplyr::case_when(\n          .data$kickoff_attempt == 1 ~ 0,\n          TRUE ~ .data$pass\n        ),\n        # in very rare cases, the pass logic can fail. We do a hard coded overwrite here because it's not worth the time\n        # to overthink the logic to catch weird play descriptions.\n        pass = fix_weird_pass_plays(.data$pass, .data$game_id, .data$play_id),\n        #if there's a rusher and it wasn't a QB kneel or pass play, it's a run play\n        rush = dplyr::if_else(\n          !is.na(.data$rusher) & .data$qb_kneel == 0 & .data$pass == 0,\n          1,\n          0\n        ),\n        #fix some common QBs with inconsistent names\n        passer = dplyr::case_when(\n          passer == \"Jos.Allen\" ~ \"J.Allen\",\n          passer == \"Alex Smith\" | passer == \"Ale.Smith\" ~ \"A.Smith\",\n          passer == \"Ryan\" & .data$posteam == \"ATL\" ~ \"M.Ryan\",\n          passer == \"Tr.Brown\" ~ \"T.Brown\",\n          passer == \"Sh.Hill\" ~ \"S.Hill\",\n          passer == \"Matt.Moore\" | passer == \"Mat.Moore\" ~ \"M.Moore\",\n          passer == \"Jo.Freeman\" ~ \"J.Freeman\",\n          passer == \"G.Minshew\" ~ \"G.Minshew II\",\n          
passer == \"R.Griffin\" ~ \"R.Griffin III\",\n          passer == \"Randel El\" ~ \"A.Randle El\",\n          passer == \"Randle El\" ~ \"A.Randle El\",\n          season <= 2003 & passer == \"Van Pelt\" ~ \"A.Van Pelt\",\n          season > 2003 & passer == \"Van Pelt\" ~ \"B.Van Pelt\",\n          passer == \"Dom.Davis\" ~ \"D.Davis\",\n          TRUE ~ .data$passer\n        ),\n        rusher = dplyr::case_when(\n          rusher == \"D.Johnson\" &\n            posteam == \"HOU\" &\n            season == 2020 &\n            rusher_jersey_number == 31 ~ \"Da.Johnson\",\n          rusher == \"D.Johnson\" &\n            posteam == \"HOU\" &\n            season == 2020 &\n            rusher_jersey_number == 25 ~ \"Du.Johnson\",\n          rusher == \"Jos.Allen\" ~ \"J.Allen\",\n          rusher == \"Alex Smith\" | rusher == \"Ale.Smith\" ~ \"A.Smith\",\n          rusher == \"Ryan\" & .data$posteam == \"ATL\" ~ \"M.Ryan\",\n          rusher == \"Tr.Brown\" ~ \"T.Brown\",\n          rusher == \"Sh.Hill\" ~ \"S.Hill\",\n          rusher == \"Matt.Moore\" | rusher == \"Mat.Moore\" ~ \"M.Moore\",\n          rusher == \"Jo.Freeman\" ~ \"J.Freeman\",\n          rusher == \"G.Minshew\" ~ \"G.Minshew II\",\n          rusher == \"R.Griffin\" ~ \"R.Griffin III\",\n          rusher == \"Randel El\" ~ \"A.Randle El\",\n          rusher == \"Randle El\" ~ \"A.Randle El\",\n          season <= 2003 & rusher == \"Van Pelt\" ~ \"A.Van Pelt\",\n          season > 2003 & rusher == \"Van Pelt\" ~ \"B.Van Pelt\",\n          rusher == \"Dom.Davis\" ~ \"D.Davis\",\n          TRUE ~ rusher\n        ),\n        receiver = dplyr::case_when(\n          receiver == \"F.R\" ~ \"F.Jones\",\n          receiver_player_name == \"D.Wells\" &\n            receiver_player_id == \"00-0017421\" ~ \"D.Wells\",\n          receiver_player_name == \"D.Hayes\" &\n            receiver_player_id == \"00-0007144\" ~ \"D.Hayes\",\n          receiver_player_name == \"DanielThomas\" ~ \"D.Thomas\",\n          
receiver_player_name == \"JulioJones\" ~ \"J.Jones\",\n          receiver_player_name == \"Andre' Davis\" ~ \"A.Davis\",\n          receiver_player_name == \"A.al-Jabbar\" ~ \"A.al-Jabbar\",\n          receiver_player_name == \"A.St. Brown\" ~ \"A.St. Brown\",\n          TRUE ~ receiver\n        ),\n        first_down = dplyr::if_else(\n          .data$first_down_rush == 1 |\n            .data$first_down_pass == 1 |\n            .data$first_down_penalty == 1,\n          1,\n          0\n        ),\n        # easy filter: special is 1 if a \"special teams\" play, or 0 otherwise\n        # with thanks to Lee Sharpe for the code\n        special = dplyr::if_else(\n          .data$play_type %in%\n            c(\"extra_point\", \"field_goal\", \"kickoff\", \"punt\"),\n          1,\n          0\n        ),\n        # easy filter: play is 1 if a \"normal\" play (including penalties), or 0 otherwise\n        # with thanks to Lee Sharpe for the code\n        play = dplyr::if_else(\n          !is.na(.data$epa) &\n            !is.na(.data$posteam) &\n            .data$desc != \"*** play under review ***\" &\n            substr(.data$desc, 1, 8) != \"Timeout \" &\n            .data$play_type %in% c(\"no_play\", \"pass\", \"run\"),\n          1,\n          0\n        )\n      ) |>\n      #standardize team names (eg Chargers are always LAC even when they were playing in SD)\n      dplyr::mutate_at(\n        dplyr::vars(\n          \"posteam\",\n          \"defteam\",\n          \"home_team\",\n          \"away_team\",\n          \"timeout_team\",\n          \"td_team\",\n          \"return_team\",\n          \"penalty_team\",\n          \"side_of_field\",\n          \"forced_fumble_player_1_team\",\n          \"forced_fumble_player_2_team\",\n          \"solo_tackle_1_team\",\n          \"solo_tackle_2_team\",\n          \"assist_tackle_1_team\",\n          \"assist_tackle_2_team\",\n          \"assist_tackle_3_team\",\n          \"assist_tackle_4_team\",\n          
\"tackle_with_assist_1_team\",\n          \"tackle_with_assist_2_team\",\n          \"fumbled_1_team\",\n          \"fumbled_2_team\",\n          \"fumble_recovery_1_team\",\n          \"fumble_recovery_2_team\",\n          \"yrdln\",\n          \"end_yard_line\",\n          \"drive_start_yard_line\",\n          \"drive_end_yard_line\"\n        ),\n        team_name_fn\n      ) |>\n\n      #Seb's stuff for fixing player ids\n      dplyr::mutate(index = 1:dplyr::n()) |> # to re-sort after all the group_bys\n\n      dplyr::group_by(.data$passer, .data$posteam, .data$season) |>\n      dplyr::mutate(\n        passer_id = dplyr::if_else(\n          is.na(.data$passer),\n          NA_character_,\n          custom_mode(.data$passer_player_id)\n        )\n      ) |>\n\n      dplyr::group_by(.data$passer_id) |>\n      dplyr::mutate(\n        passer = dplyr::if_else(\n          is.na(.data$passer_id),\n          NA_character_,\n          custom_mode(.data$passer)\n        )\n      ) |>\n\n      dplyr::group_by(.data$rusher, .data$posteam, .data$season) |>\n      dplyr::mutate(\n        rusher_id = dplyr::if_else(\n          is.na(.data$rusher),\n          NA_character_,\n          custom_mode(.data$rusher_player_id)\n        )\n      ) |>\n\n      dplyr::group_by(.data$rusher_id) |>\n      dplyr::mutate(\n        rusher = dplyr::if_else(\n          is.na(.data$rusher_id),\n          NA_character_,\n          custom_mode(.data$rusher)\n        )\n      ) |>\n\n      dplyr::group_by(.data$receiver, .data$posteam, .data$season) |>\n      dplyr::mutate(\n        receiver_id = dplyr::if_else(\n          is.na(.data$receiver),\n          NA_character_,\n          custom_mode(.data$receiver_player_id)\n        )\n      ) |>\n\n      dplyr::group_by(.data$receiver_id) |>\n      dplyr::mutate(\n        receiver = dplyr::if_else(\n          is.na(.data$receiver_id),\n          NA_character_,\n          custom_mode(.data$receiver)\n        )\n      ) |>\n\n      dplyr::ungroup() |>\n   
   dplyr::mutate(\n        # if there's an aborted snap and qb didn't get a pass off,\n        # then charge it to whoever was charged with the fumble\n        # this has to go after all the custom_mode stuff or it gets messed up\n        rusher = dplyr::if_else(\n          .data$aborted_play == 1 &\n            is.na(.data$passer) &\n            !is.na(.data$fumbled_1_player_name),\n          .data$fumbled_1_player_name,\n          .data$rusher\n        ),\n        rusher_id = dplyr::if_else(\n          .data$aborted_play == 1 &\n            is.na(.data$passer) &\n            !is.na(.data$fumbled_1_player_id),\n          .data$fumbled_1_player_id,\n          .data$rusher_id\n        ),\n\n        name = dplyr::if_else(!is.na(.data$passer), .data$passer, .data$rusher),\n        jersey_number = dplyr::if_else(\n          !is.na(.data$passer_jersey_number),\n          .data$passer_jersey_number,\n          .data$rusher_jersey_number\n        ),\n        id = dplyr::if_else(\n          !is.na(.data$passer_id),\n          .data$passer_id,\n          .data$rusher_id\n        )\n      ) |>\n      dplyr::arrange(.data$index) |>\n      dplyr::select(-\"index\") |>\n      # add action player\n      dplyr::mutate(\n        fantasy_player_name = case_when(\n          !is.na(.data$rusher_player_name) ~ .data$rusher_player_name,\n          is.na(.data$rusher_player_name) &\n            !is.na(.data$receiver_player_name) ~ .data$receiver_player_name,\n          TRUE ~ NA_character_\n        ),\n        fantasy_player_id = case_when(\n          !is.na(.data$rusher_player_id) ~ .data$rusher_player_id,\n          is.na(.data$rusher_player_id) &\n            !is.na(.data$receiver_player_id) ~ .data$receiver_player_id,\n          TRUE ~ NA_character_\n        ),\n        fantasy = case_when(\n          !is.na(.data$rusher) ~ .data$rusher,\n          is.na(.data$rusher) & !is.na(.data$receiver) ~ .data$receiver,\n          .data$qb_scramble == 1 ~ .data$passer,\n          TRUE ~ 
NA_character_\n        ),\n        fantasy_id = case_when(\n          !is.na(.data$rusher_id) ~ .data$rusher_id,\n          is.na(.data$rusher_id) &\n            !is.na(.data$receiver_id) ~ .data$receiver_id,\n          .data$qb_scramble == 1 ~ .data$passer_id,\n          TRUE ~ NA_character_\n        ),\n        out_of_bounds = dplyr::if_else(\n          stringr::str_detect(.data$desc, \"(ran ob)|(pushed ob)|(sacked ob)\"),\n          1,\n          0\n        )\n      ) |>\n      dplyr::group_by(.data$game_id) |>\n      dplyr::mutate(\n        home_opening_kickoff = dplyr::if_else(\n          .data$home_team == dplyr::first(stats::na.omit(.data$posteam)),\n          1,\n          0\n        )\n      ) |>\n      dplyr::ungroup()\n  }\n\n  message_completed(\"Cleaning completed\", ...)\n\n  return(r)\n}\n\n#these things are used in clean_pbp() above\n\n# look for First[period or space]Last[maybe - or ' in last][maybe more letters in last][maybe Jr. or II or IV]\nbig_parser <- \"(?<=)[A-Z][A-z]*+(\\\\.|\\\\s)+[A-Z][A-z]*+\\\\'*\\\\-*[A-Z]*+[a-z]*+(\\\\s((Jr.)|(Sr.)|I{2,3})|(IV))?\"\n# maybe some spaces and letters, and then a rush direction unless they fumbled\nrush_finder <- \"(?=\\\\s*[a-z]*+\\\\s*((FUMBLES) | (left end)|(left tackle)|(left guard)|(up the middle)|(right guard)|(right tackle)|(right end)))\"\n# maybe some spaces and letters, and then pass / sack / scramble\npass_finder <- \"(?=\\\\s*[a-z]*+\\\\s*(( pass)|(sack)|(scramble)))\"\n# to or for, maybe a jersey number and a dash\nreceiver_finder <- \"(?<=((to)|(for))\\\\s[:digit:]{0,2}\\\\-{0,1})\"\n# weird play finder\nabnormal_play <- \"(Lateral)|(lateral)|(pitches to)|(Direct snap to)|(New quarterback for)|(Aborted)|(backwards pass)|(Pass back to)|(Flea-flicker)\"\n# look for 1-2 numbers before a dash\nnumber_parser <- \"((?<=)[:digit:]{1,2}(-))?\"\n# special case for receivers\nreceiver_number <- \"(?<=((to)|(for))\\\\s)[:digit:]{0,2}\\\\-{0,1}\"\n\n# These columns are being generated by clean_pbp and 
the function tries to drop\n# them in case it is being used on a pbp dataset where the columns already exist\ndrop.cols <- c(\n  \"success\",\n  \"passer\",\n  \"rusher\",\n  \"receiver\",\n  \"pass\",\n  \"rush\",\n  \"special\",\n  \"first_down\",\n  \"play\",\n  \"passer_id\",\n  \"rusher_id\",\n  \"receiver_id\",\n  \"name\",\n  \"id\",\n  \"passer_jersey_number\",\n  \"rusher_jersey_number\",\n  \"receiver_jersey_number\",\n  \"jersey_number\",\n  \"aborted_play\",\n  \"fantasy\",\n  \"fantasy_id\",\n  \"fantasy_player_name\",\n  \"fantasy_player_id\",\n  \"out_of_bounds\"\n)\n\n# fixes team names on columns with yard line\n# example: 'SD 49' --> 'LAC 49'\n# thanks to awgymer for the contribution:\n# https://github.com/nflverse/nflfastR/issues/29#issuecomment-654592195\nteam_name_fn <- function(var) {\n  stringr::str_replace_all(\n    var,\n    c(\n      \"JAC\" = \"JAX\",\n      \"STL\" = \"LA\",\n      \"SL\" = \"LA\",\n      \"LAR\" = \"LA\",\n      \"ARZ\" = \"ARI\",\n      \"BLT\" = \"BAL\",\n      \"CLV\" = \"CLE\",\n      \"HST\" = \"HOU\",\n      \"SD\" = \"LAC\",\n      \"OAK\" = \"LV\"\n    )\n  )\n}\n\n#' Compute QB epa\n#'\n#' @inheritParams clean_pbp\n#' @details Add the variable 'qb_epa', which gives QB credit for EPA for up to the point where\n#' a receiver lost a fumble after a completed catch and makes EPA work more\n#' like passing yards on plays with fumbles\n#' @export\nadd_qb_epa <- function(pbp, ...) {\n  if (nrow(pbp) == 0) {\n    user_message(\"Nothing to do. 
Return passed data frame.\", \"info\")\n  } else {\n    # drop an existing qb_epa column so it can be recomputed\n    pbp <- pbp |> dplyr::select(-dplyr::any_of(\"qb_epa\"))\n\n    fumbles_df <- pbp |>\n      dplyr::filter(\n        .data$complete_pass == 1 &\n          .data$fumble_lost == 1 &\n          !is.na(.data$epa) &\n          !is.na(.data$down)\n      ) |>\n      dplyr::mutate(\n        half_seconds_remaining = dplyr::if_else(\n          .data$half_seconds_remaining <= 6,\n          0,\n          .data$half_seconds_remaining - 6\n        ),\n        down = as.numeric(.data$down),\n        # save old stuff for testing/checking\n        posteam_timeouts_pre = .data$posteam_timeouts_remaining,\n        defeam_timeouts_pre = .data$defteam_timeouts_remaining,\n        down_old = .data$down,\n        ydstogo_old = .data$ydstogo,\n        epa_old = .data$epa,\n        # update yard line, down, yards to go from play result\n        yardline_100 = .data$yardline_100 - .data$yards_gained,\n        down = dplyr::if_else(\n          .data$yards_gained >= .data$ydstogo,\n          1,\n          .data$down + 1\n        ),\n        # if the fumble spot would have resulted in turnover on downs, need to give other team the ball and fix\n        change = dplyr::if_else(.data$down == 5, 1, 0),\n        down = dplyr::if_else(.data$down == 5, 1, .data$down),\n        # yards to go is 10 if it's a first down, update otherwise\n        ydstogo = dplyr::if_else(\n          .data$down == 1,\n          10,\n          .data$ydstogo - .data$yards_gained\n        ),\n        # 10 yards to go if possession change\n        ydstogo = dplyr::if_else(.data$change == 1, 10, .data$ydstogo),\n        # flip field and timeouts for possession change\n        yardline_100 = dplyr::if_else(\n          .data$change == 1,\n          100 - .data$yardline_100,\n          .data$yardline_100\n        ),\n        posteam_timeouts_remaining = dplyr::if_else(\n          .data$change == 1,\n          
.data$defeam_timeouts_pre,\n          .data$posteam_timeouts_pre\n        ),\n        defteam_timeouts_remaining = dplyr::if_else(\n          .data$change == 1,\n          .data$posteam_timeouts_pre,\n          .data$defeam_timeouts_pre\n        ),\n        # fix yards to go for goal line (eg can't have 1st & 10 inside opponent 10 yard line)\n        ydstogo = dplyr::if_else(\n          .data$yardline_100 < .data$ydstogo,\n          .data$yardline_100,\n          .data$ydstogo\n        ),\n        ep_old = .data$ep\n      ) |>\n      dplyr::select(\n        \"game_id\",\n        \"play_id\",\n        \"season\",\n        \"home_team\",\n        \"posteam\",\n        \"roof\",\n        \"half_seconds_remaining\",\n        \"yardline_100\",\n        \"down\",\n        \"ydstogo\",\n        \"posteam_timeouts_remaining\",\n        \"defteam_timeouts_remaining\",\n        \"down_old\",\n        \"ep_old\",\n        \"change\"\n      )\n\n    if (nrow(fumbles_df) > 0) {\n      new_ep_df <- calculate_expected_points(fumbles_df) |>\n        dplyr::mutate(\n          ep = dplyr::if_else(.data$change == 1, -.data$ep, .data$ep),\n          fixed_epa = .data$ep - .data$ep_old\n        ) |>\n        dplyr::select(\"game_id\", \"play_id\", \"fixed_epa\")\n\n      pbp <- pbp |>\n        dplyr::left_join(new_ep_df, by = c(\"game_id\", \"play_id\")) |>\n        dplyr::mutate(\n          qb_epa = dplyr::if_else(\n            !is.na(.data$fixed_epa),\n            .data$fixed_epa,\n            .data$epa\n          )\n        ) |>\n        dplyr::select(-\"fixed_epa\")\n    } else {\n      pbp <- pbp |> dplyr::mutate(qb_epa = .data$epa)\n    }\n  }\n\n  message_completed(\"added qb_epa\", ...)\n\n  return(pbp)\n}\n\n# Function that fixes false \"pass\" positives in some hard coded plays where\n# the parser logic reached its limit\nfix_weird_pass_plays <- function(pass, game_id, play_id) {\n  combined_id <- paste(game_id, play_id, sep = \"_\")\n  false_positives <- c(\n    
\"1999_01_ARI_PHI_1611\",\n    \"1999_01_SF_JAX_1788\",\n    \"1999_01_SF_JAX_2081\",\n    \"1999_11_ATL_TB_1740\",\n    \"2001_09_MIN_PHI_1307\",\n    \"2001_14_NE_BUF_452\",\n    \"2002_16_PIT_TB_527\",\n    \"2003_02_HOU_NO_3924\",\n    \"2003_15_PIT_NYJ_873\",\n    \"2004_05_BUF_NYJ_2555\",\n    \"2005_07_SD_PHI_321\",\n    \"2011_02_STL_NYG_1369\",\n    \"2016_05_NE_CLE_912\",\n    \"2016_06_CAR_NO_2690\",\n    \"2020_10_BAL_NE_2013\"\n  )\n  data.table::fifelse(combined_id %chin% false_positives, 0, pass, pass)\n}\n"
  },
  {
    "path": "R/helper_database_functions.R",
    "content": "################################################################################\n# Author: Sebastian Carl\n# Purpose: Create and update database with nflfastR pbp data\n# Code Style Guide: styler::tidyverse_style()\n################################################################################\n\n#' Update or Create a nflfastR Play-by-Play Database\n#'\n#' `update_db` updates or creates a database with `nflfastR`\n#' play by play data of all completed games since 1999.\n#'\n#' @details This function creates and updates a data table with the name `tblname`\n#' within a SQLite database (other drivers via `db_connection`) located in\n#' `dbdir` and named `dbname`.\n#' The data table combines all play by play data for every available game back\n#' to the 1999 season and adds the most recent completed games as soon as they\n#' are available for `nflfastR`.\n#'\n#' The argument `force_rebuild` is of hybrid type. It can rebuild the play\n#' by play data table either for the whole nflfastR era (with `force_rebuild = TRUE`)\n#' or just for specified seasons (e.g. `force_rebuild = c(2019, 2020)`).\n#' Please note the following behavior:\n#' * `force_rebuild = TRUE`: The data table with the name `tblname`\n#'   will be removed completely and rebuilt from scratch. This is helpful when\n#'   new columns are added during the Off-Season.\n#' * `force_rebuild = c(2019, 2020)`: The data table with the name `tblname`\n#'   will be preserved and only rows from the 2019 and 2020 seasons will be\n#'   deleted and re-added. This is intended to be used for ongoing seasons because\n#'   the NFL fixes bugs in the underlying data during the week and we recommend\n#'   rebuilding the current season every Thursday during the season.\n#'\n#' The parameter `db_connection` is intended for advanced users who want\n#' to use other DBI drivers, such as MariaDB, Postgres or odbc. 
Please note that\n#' the arguments `dbdir` and `dbname` are dropped in case a `db_connection`\n#' is provided but the argument `tblname` will still be used to write the\n#' data table into the database.\n#'\n#' @param dbdir Directory in which the database is or shall be located. Can also\n#'   be set globally with `options(nflfastR.dbdirectory)`\n#' @param dbname File name of an existing or desired SQLite database within `dbdir`\n#' @param tblname The name of the play by play data table within the database\n#' @param force_rebuild Hybrid parameter (logical or numeric) to rebuild parts\n#' of or the complete play by play data table within the database (please see details for further information)\n#' @param db_connection A `DBIConnection` object, as returned by\n#' [DBI::dbConnect()] (please see details for further information)\n#' @export\nupdate_db <- function(\n  dbdir = getOption(\"nflfastR.dbdirectory\", default = \".\"),\n  dbname = \"pbp_db\",\n  tblname = \"nflfastR_pbp\",\n  force_rebuild = FALSE,\n  db_connection = NULL\n) {\n  rule_header(\"Update nflfastR Play-by-Play Database\")\n\n  if (\n    !is_installed(\"DBI\") |\n      !is_installed(\"purrr\") |\n      (!is_installed(\"RSQLite\") & is.null(db_connection))\n  ) {\n    cli::cli_abort(\n      \"{my_time()} | Packages {.pkg DBI}, {.pkg RSQLite} and {.pkg purrr} required for database communication. Please install them.\"\n    )\n  }\n\n  if (any(force_rebuild == \"NEW\")) {\n    cli::cli_abort(\n      \"{my_time()} | The argument {.arg force_rebuild = NEW} is only for internal usage!\"\n    )\n  }\n\n  if (!(is.logical(force_rebuild) | is.numeric(force_rebuild))) {\n    cli::cli_abort(\n      \"{my_time()} | The argument {.arg force_rebuild} has to be either logical or numeric!\"\n    )\n  }\n\n  if (!dir.exists(dbdir) & is.null(db_connection)) {\n    cli::cli_alert_danger(\n      \"{my_time()} | Directory {.file {dbdir}} doesn't exist yet. 
Trying to create it...\"\n    )\n    dir.create(dbdir)\n  }\n\n  if (is.null(db_connection)) {\n    connection <- DBI::dbConnect(RSQLite::SQLite(), file.path(dbdir, dbname))\n  } else {\n    connection <- db_connection\n  }\n\n  # create db if it doesn't exist or user forces rebuild\n  if (!DBI::dbExistsTable(connection, tblname)) {\n    build_db(tblname, connection, rebuild = \"NEW\")\n  } else if (\n    DBI::dbExistsTable(connection, tblname) & all(force_rebuild != FALSE)\n  ) {\n    build_db(tblname, connection, rebuild = force_rebuild)\n  }\n\n  # get completed games using Lee's file (thanks Lee!)\n  user_message(\"Checking for missing completed games...\", \"todo\")\n  completed_games <- nflreadr::load_schedules() |>\n    # completed games since 1999, excluding the broken games\n    dplyr::filter(\n      .data$season >= 1999,\n      !is.na(.data$result),\n      !.data$game_id %in%\n        c(\"1999_01_BAL_STL\", \"2000_06_BUF_MIA\", \"2000_03_SD_KC\")\n    ) |>\n    dplyr::arrange(.data$gameday) |>\n    dplyr::pull(.data$game_id)\n\n  # get_missing_games() is defined below\n  missing <- get_missing_games(completed_games, connection, tblname)\n\n  # rebuild db if number of missing games is too large\n  if (length(missing) > 16) {\n    # limit set to >16 to make sure this doesn't get triggered on gameday (e.g. week 17)\n    build_db(\n      tblname,\n      connection,\n      show_message = FALSE,\n      rebuild = as.numeric(unique(stringr::str_sub(missing, 1, 4)))\n    )\n    missing <- get_missing_games(completed_games, connection, tblname)\n  }\n\n  # if there are missing games, scrape and write to db\n  if (length(missing) > 0) {\n    new_pbp <- build_nflfastR_pbp(missing, rules = FALSE)\n\n    if (nrow(new_pbp) == 0) {\n      user_message(\n        \"Raw data of new games are not yet ready. 
Please try again in about 10 minutes.\",\n        \"oops\"\n      )\n    } else {\n      user_message(\"Appending new data to database...\", \"todo\")\n      DBI::dbWriteTable(connection, tblname, new_pbp, append = TRUE)\n    }\n  }\n\n  # Remove default play which is just a helper to define columns correctly\n  DBI::dbExecute(\n    connection,\n    glue::glue_sql(\n      \"DELETE FROM {`tblname`} WHERE game_id IN ({vals*})\",\n      vals = \"9999_99_DEF_TYP\",\n      .con = connection\n    )\n  )\n\n  message_completed(\"Database update completed\", in_builder = TRUE)\n  cli::cli_alert_info(\n    \"{my_time()} | Path to your db: {.file {DBI::dbGetInfo(connection)$dbname}}\"\n  )\n  if (is.null(db_connection)) {\n    DBI::dbDisconnect(connection)\n  }\n  rule_footer(\"DONE\")\n}\n\n# this is a helper function to build nflfastR database from Scratch\nbuild_db <- function(\n  tblname = \"nflfastR_pbp\",\n  db_conn,\n  rebuild = FALSE,\n  show_message = TRUE\n) {\n  valid_seasons <- nflreadr::load_schedules() |>\n    dplyr::filter(.data$season >= 1999 & !is.na(.data$result)) |>\n    dplyr::group_by(.data$season) |>\n    dplyr::summarise() |>\n    dplyr::ungroup()\n\n  if (all(rebuild == TRUE)) {\n    cli::cli_ul(\n      \"{my_time()} | Purging the complete data table {.val {tblname}}\n                in your connected database...\"\n    )\n    DBI::dbRemoveTable(db_conn, tblname)\n    seasons <- valid_seasons |> dplyr::pull(\"season\")\n    cli::cli_ul(\n      \"{my_time()} | Starting download of {length(seasons)} seasons\n                between {min(seasons)} and {max(seasons)}...\"\n    )\n  } else if (is.numeric(rebuild) & all(rebuild %in% valid_seasons$season)) {\n    # s <- glue::glue_collapse(rebuild, sep = \", \", last = \", and \")\n    # string <- stringr::str_c(stringr::str_sub(s, 1, 11), \"...\", stringr::str_sub(s, -16, -1))\n    if (show_message) {\n      cli::cli_ul(\n        \"{my_time()} | Purging\n                                  
{cli::qty(length(rebuild))}season{?s} {rebuild}\n                                  from the data table {.val {tblname}} in your\n                                  connected database...\"\n      )\n    }\n    DBI::dbExecute(\n      db_conn,\n      glue::glue_sql(\n        \"DELETE FROM {`tblname`} WHERE season IN ({vals*})\",\n        vals = rebuild,\n        .con = db_conn\n      )\n    )\n    seasons <- valid_seasons |>\n      dplyr::filter(.data$season %in% rebuild) |>\n      dplyr::pull(\"season\")\n    cli::cli_ul(\n      \"{my_time()} | Starting download of the {length(rebuild)}\n                season{?s} {rebuild}\"\n    )\n  } else if (all(rebuild == \"NEW\")) {\n    cli::cli_alert_info(\n      \"{my_time()} | Can't find the data table {.val {tblname}}\n                        in your database. Will load the play by play data from\n                        scratch.\"\n    )\n    seasons <- valid_seasons |> dplyr::pull(\"season\")\n    cli::cli_ul(\n      \"{my_time()} | Starting download of {length(seasons)} seasons\n                between {min(seasons)} and {max(seasons)}...\"\n    )\n  } else {\n    seasons <- NULL\n    cli::cli_alert_danger(\n      \"{my_time()} | At least one invalid value passed to argument {.arg force_rebuild}. 
Please try again with valid input.\"\n    )\n  }\n\n  if (!is.null(seasons)) {\n    # this function lives in R/utils.R\n    write_pbp(seasons, dbConnection = db_conn, tablename = tblname)\n  }\n}\n\n# this is a helper function to check a list of completed games\n# against the games that exist in a database connection\nget_missing_games <- function(completed_games, dbConnection, tablename) {\n  db_ids <- dplyr::tbl(dbConnection, tablename) |>\n    dplyr::select(\"game_id\") |>\n    dplyr::filter(.data$game_id != \"9999_99_DEF_TYP\") |>\n    dplyr::distinct() |>\n    dplyr::collect() |>\n    dplyr::pull(\"game_id\")\n\n  need_scrape <- completed_games[\n    !completed_games %in% c(db_ids, \"9999_99_DEF_TYP\")\n  ]\n\n  cli::cli_alert_info(\n    \"{my_time()} | You have {length(db_ids)} game{?s} and are missing {length(need_scrape)}.\"\n  )\n  return(need_scrape)\n}\n"
  },
  {
    "path": "R/helper_decode_player_ids.R",
    "content": "################################################################################\n# Author: Sebastian Carl\n# Purpose: Function to decode play-by-play player IDs.\n# Code Style Guide: styler::tidyverse_style()\n################################################################################\n\n#' Decode the player IDs in nflfastR play-by-play data\n#'\n#' @inheritParams clean_pbp\n#' @param fast If `TRUE` the IDs will be decoded with the high efficient\n#' function [decode_ids][gsisdecoder::decode_ids]. If `FALSE` an nflfastR internal\n#' function will be used for decoding (it is generally not recommended to do this,\n#' unless there is a problem with [decode_ids][gsisdecoder::decode_ids]\n#' which can take several days to fix on CRAN.)\n#'\n#' @description Takes all columns ending with \\code{'player_id'} as well as the\n#' variables \\code{'passer_id'}, \\code{'rusher_id'}, \\code{'fantasy_id'},\n#' \\code{'receiver_id'}, and \\code{'id'} of an nflfastR play-by-play data set\n#' and decodes the player IDs to the commonly known GSIS ID format 00-00xxxxx.\n#'\n#' The function uses by default the high efficient [decode_ids][gsisdecoder::decode_ids]\n#' of the package [`gsisdecoder`](https://cran.r-project.org/package=gsisdecoder).\n#' In the unlikely event that there is a problem with this function, an nflfastR\n#' internal decoder can be used with the option `fast = FALSE`.\n#'\n#' The 2022 play by play data introduced new player IDs that can't be decoded\n#' with gsisdecoder. 
In that case, IDs are joined through [nflreadr::load_players].\n#'\n#' @return The input data frame of the parameter `pbp` with decoded player IDs.\n#' @export\n#' @examples\n#' \\donttest{\n#' # Decode data frame consisting of some names and ids\n#' decode_player_ids(data.frame(\n#'   name = c(\"P.Mahomes\", \"B.Baldwin\", \"P.Mahomes\", \"S.Carl\", \"J.Jones\"),\n#'   id = c(\n#'     \"32013030-2d30-3033-3338-3733fa30c4fa\",\n#'     NA_character_,\n#'     \"00-0033873\",\n#'     NA_character_,\n#'     \"32013030-2d30-3032-3739-3434d4d3846d\"\n#'   )\n#' ))\n#' }\ndecode_player_ids <- function(pbp, ..., fast = TRUE) {\n  # need newer version of nflreadr to use load_players\n  rlang::check_installed(\"nflreadr (>= 1.3.0)\", \"to decode player IDs.\")\n\n  if (isFALSE(fast)) {\n    if (nrow(pbp) > 1000 && is_sequential()) {\n      cli::cli_alert_info(c(\n        \"It is recommended to use parallel processing when trying to decode big data frames.\",\n        \"Please consider running {.code future::plan(\\\"multisession\\\")}! \",\n        \"Will go on sequentially...\"\n      ))\n    }\n    decode_gsis <- decode_ids\n  } else if (isTRUE(fast)) {\n    rlang::check_installed(\"gsisdecoder\", \"to run fast decoding of player IDs.\")\n    decode_gsis <- gsisdecoder::decode_ids\n  }\n\n  user_message(\"Decode player ids...\", \"todo\")\n\n  players <- nflreadr::load_players()\n\n  id_vector <- players$gsis_id\n  names(id_vector) <- players$esb_id\n\n  ret <- pbp |>\n    dplyr::mutate_at(\n      dplyr::vars(\n        dplyr::any_of(c(\n          \"passer_id\",\n          \"rusher_id\",\n          \"receiver_id\",\n          \"id\",\n          \"fantasy_id\"\n        )),\n        dplyr::ends_with(\"player_id\")\n      ),\n      function(id, id_vec = id_vector) {\n        chars <- nchar(id)\n        dplyr::case_when(\n          is.na(chars) ~ NA_character_,\n          # this means it's gsis ID. 
30 30 2d 30 30 translates to 00-00\n          stringr::str_sub(id, 5, 16) == \"3030-2d30-30\" ~ decode_gsis(id),\n          # if it's not gsis, it is likely elias. We drop names to avoid confusion\n          nchar(id) == 36 ~ unname(id_vec[extract_elias(\n            id,\n            decoder = decode_gsis\n          )]),\n          TRUE ~ id\n        )\n      }\n    )\n\n  message_completed(\"Decoding of player ids completed\", ...)\n\n  ret\n}\n\ndecode_ids <- function(var) {\n  furrr::future_map_chr(var, convert_to_gsis_id)\n}\n\nconvert_to_gsis_id <- function(new_id) {\n  if (is.na(new_id) | stringr::str_length(new_id) != 36) {\n    ret <- new_id\n  } else {\n    to_decode <- new_id |>\n      stringr::str_sub(5, -9) |>\n      stringr::str_replace_all(\"-\", \"\")\n    hex_raw <- sapply(seq(1, nchar(to_decode), by = 2), function(x) {\n      substr(to_decode, x, x + 1)\n    })\n    ret <- rawToChar(as.raw(strtoi(hex_raw, 16L)))\n  }\n  return(ret)\n}\n\nextract_elias <- function(smart_id, decoder) {\n  name_abbr <- decoder(smart_id) |> substr(1, 3)\n  id_no <- stringr::str_remove_all(smart_id, \"-\") |>\n    stringr::str_sub(11, 16)\n  elias_id <- paste0(name_abbr, id_no)\n  elias_id\n}\n"
  },
  {
    "path": "R/helper_get_scheds_and_rosters.R",
    "content": "################################################################################\n# Author: Sebastian Carl\n# Purpose: Function for loading schedules and rosters from nflfastR repos\n# Code Style Guide: styler::tidyverse_style()\n################################################################################\n\nget_scheds_and_rosters <- function(season, type) {\n  type <- match.arg(type, choices = c(\"schedule\", \"roster\"))\n\n  switch(\n    type,\n    \"schedule\" = nflreadr::load_schedules(season),\n    \"roster\" = nflreadr::load_rosters(season)\n  )\n}\n"
  },
  {
    "path": "R/helper_scrape_gc.R",
    "content": "################################################################################\n# Author: Ben Baldwin\n# Stlyeguide: styler::tidyverse_style()\n################################################################################\n\n# Build a tidy version of scraped gamecenter data\n# Data exist since 1999\n#\n# @param gameId Specifies the game\n\nget_pbp_gc <- function(\n  gameId,\n  dir = getOption(\"nflfastR.raw_directory\", default = NULL),\n  ...\n) {\n  # testing only\n  # gameId = '2013120812'\n  # gameId = '2019_01_GB_CHI'\n  # gameId = '2009_18_NYJ_CIN'\n  # gameId = '2007_01_ARI_SF'\n  # gameId = '1999_01_BAL_STL'\n  # gameId <- \"2000_03_PIT_CLE\"\n\n  if (gameId %in% c(\"2000_03_SD_KC\", \"2000_06_BUF_MIA\", \"1999_01_BAL_STL\")) {\n    cli::cli_abort(\"You asked for GameID {.val {gameId}} is broken. Skipping.\")\n  }\n\n  season <- as.integer(substr(gameId, 1, 4))\n\n  raw <- fetch_raw(game_id = gameId, dir = dir)\n\n  game_json <- raw[[1]]\n\n  date_parse <- names(raw)[1] |> stringr::str_extract(pattern = \"[0-9]{8}\")\n  date_year <- stringr::str_sub(date_parse, 1, 4)\n  date_month <- stringr::str_sub(date_parse, 5, 6)\n  date_day <- stringr::str_sub(\n    date_parse,\n    nchar(date_parse) - 1,\n    nchar(date_parse)\n  )\n\n  week <- as.integer(substr(gameId, 6, 7))\n  if (week <= 17) {\n    season_type <- \"REG\"\n  } else {\n    season_type <- \"POST\"\n  }\n\n  if (date_year < 1999) {\n    cli::cli_abort(\n      \"You asked a game from {date_year}, but data only goes back to 1999.\"\n    )\n  }\n\n  # excluding last element since it's \"crntdrv\" and not an actual\n  drives <- game_json$drives[-length(game_json$drives)]\n\n  # list of plays\n  # each play has \"players\" column which is a list of player stats from the play\n  plays <- suppressWarnings(furrr::future_map_dfr(\n    seq_along(drives),\n    function(x) {\n      cbind(\n        \"drive\" = x,\n        data.frame(do.call(\n          rbind,\n          drives[[x]]$plays\n    
    ))[, c(1:11)]\n      ) |>\n        dplyr::mutate(\n          play_id = names(drives[[x]]$plays),\n          play_id = as.numeric(.data$play_id)\n        )\n    }\n  ))\n\n  # some 2000 games have play_ids like 2767.375 and 2767.703 which results in\n  # duplicates that can be fixed. We save play IDs as numeric first and then\n  # check whether or not there are duplicates when we convert them to integer\n  # If there are duplicates, we multiply all play IDs by 10 and check again\n  # If there are still duplicates, we multiply all play IDs by 100 and so on\n  # As soon as play IDs are unique, we save them as integer and go on\n  plays$play_id <- uniquify_ids(plays$play_id)\n\n  plays$quarter_end <- dplyr::if_else(\n    stringr::str_detect(\n      plays$desc,\n      \"(END QUARTER)|(END GAME)|(End of quarter)\"\n    ),\n    1,\n    0\n  )\n  plays$home_team <- game_json$home$abbr\n  plays$away_team <- game_json$away$abbr\n\n  # get df with 1 line per statId\n  stats <- furrr::future_map_dfr(seq_along(plays$play_id), function(x) {\n    dplyr::bind_rows(plays[x, ]$players[[1]], .id = \"player_id\") |>\n      dplyr::mutate(play_id = plays[x, ]$play_id)\n  }) |>\n    dplyr::mutate(\n      sequence = as.numeric(.data$sequence),\n      statId = as.numeric(.data$statId),\n      play_id = as.character(.data$play_id),\n      yards = as.integer(.data$yards)\n    ) |>\n    dplyr::arrange(.data$play_id, .data$sequence) |>\n    dplyr::rename(\n      playId = \"play_id\",\n      teamAbbr = \"clubcode\",\n      player.esbId = \"player_id\",\n      player.displayName = \"playerName\",\n      playStatSeq = \"sequence\"\n    )\n\n  pbp_stats <- lapply(unique(stats$playId), sum_play_stats, stats)\n  pbp_stats <- data.table::rbindlist(pbp_stats) |> tibble::as_tibble()\n\n  # drive info\n  d <- tibble::tibble(drives) |>\n    tidyr::unnest_wider(drives) |>\n    # dplyr::select(-plays) |>\n    tidyr::unnest_wider(\"start\", names_sep = \"_\") |>\n    tidyr::unnest_wider(\"end\", 
names_sep = \"_\") |>\n    dplyr::mutate(drive = 1:dplyr::n()) |>\n    dplyr::rename(\n      drive_play_count = \"numplays\",\n      drive_time_of_possession = \"postime\",\n      drive_first_downs = \"fds\",\n      drive_inside20 = \"redzone\",\n      drive_quarter_start = \"start_qtr\",\n      drive_quarter_end = \"end_qtr\",\n      drive_end_transition = \"result\",\n      drive_game_clock_start = \"start_time\",\n      drive_game_clock_end = \"end_time\",\n      drive_start_yard_line = \"start_yrdln\",\n      drive_end_yard_line = \"end_yrdln\"\n    ) |>\n    dplyr::mutate(\n      drive_inside20 = dplyr::if_else(.data$drive_inside20, 1, 0),\n      drive_how_ended_description = .data$drive_end_transition,\n      drive_ended_with_score = dplyr::if_else(\n        .data$drive_how_ended_description == \"Touchdown\" |\n          .data$drive_how_ended_description == \"Field Goal\",\n        1,\n        0\n      ),\n      drive_start_transition = dplyr::lag(.data$drive_how_ended_description, 1),\n      drive_how_started_description = .data$drive_start_transition\n    ) |>\n    dplyr::select(\n      \"drive\",\n      \"drive_play_count\",\n      \"drive_time_of_possession\",\n      \"drive_first_downs\",\n      \"drive_inside20\",\n      \"drive_ended_with_score\",\n      \"drive_quarter_start\",\n      \"drive_quarter_end\",\n      \"drive_end_transition\",\n      \"drive_how_ended_description\",\n      \"drive_game_clock_start\",\n      \"drive_game_clock_end\",\n      \"drive_start_yard_line\",\n      \"drive_end_yard_line\",\n      \"drive_start_transition\",\n      \"drive_how_started_description\"\n    )\n\n  combined <- plays |>\n    dplyr::left_join(pbp_stats, by = \"play_id\") |>\n    dplyr::mutate_if(is.logical, as.numeric) |>\n    dplyr::mutate_if(is.integer, as.numeric) |>\n    dplyr::select(-\"players\", -\"note\") |>\n    #Weirdly formatted and missing anyway\n    dplyr::mutate(note = NA_character_) |>\n    dplyr::rename(\n      yardline = \"yrdln\",\n     
 quarter = \"qtr\",\n      play_description = \"desc\",\n      yards_to_go = \"ydstogo\"\n    ) |>\n    tidyr::unnest(\n      cols = c(\n        \"sp\",\n        \"quarter\",\n        \"down\",\n        \"time\",\n        \"yardline\",\n        \"yards_to_go\",\n        \"ydsnet\",\n        \"posteam\",\n        \"play_description\",\n        \"note\"\n      )\n    ) |>\n    dplyr::left_join(d, by = \"drive\") |>\n    dplyr::mutate(\n      posteam_id = .data$posteam,\n      game_id = gameId,\n      game_year = as.integer(date_year),\n      game_month = as.integer(date_month),\n      game_date = as.Date(\n        paste(date_month, date_day, date_year, sep = \"/\"),\n        format = \"%m/%d/%Y\"\n      ),\n      season = season,\n\n      # fix up yardline before doing stuff. from nflscrapr\n      yardline = dplyr::if_else(\n        .data$yardline == \"50\",\n        \"MID 50\",\n        .data$yardline\n      ),\n      yardline = dplyr::if_else(\n        nchar(.data$yardline) == 0 |\n          is.null(.data$yardline) |\n          .data$yardline == \"NULL\",\n        dplyr::lag(.data$yardline),\n        .data$yardline\n      ),\n\n      # have to do all this nonsense to make goal_to_go and yardline_side for compatibility with later functions\n      yardline_side = furrr::future_map_chr(\n        stringr::str_split(.data$yardline, \" \"),\n        function(x) x[1]\n      ),\n      yardline_number = as.numeric(furrr::future_map_chr(\n        stringr::str_split(.data$yardline, \" \"),\n        function(x) x[2]\n      )),\n      goal_to_go = dplyr::if_else(\n        .data$yardline_side != .data$posteam &\n          ((.data$yards_to_go == .data$yardline_number) |\n            (.data$yards_to_go <= 1 & .data$yardline_number == 1)),\n        1,\n        0\n      ),\n      down = as.double(.data$down),\n      quarter = as.double(.data$quarter),\n      week = week,\n      season_type = season_type,\n      # missing from older gc data\n      drive_real_start_time = 
NA_character_,\n      start_time = NA_character_,\n      stadium = NA_character_,\n      weather = NA_character_,\n      nfl_api_id = NA_character_,\n      play_clock = NA_character_,\n      play_deleted = NA_real_,\n      play_type_nfl = NA_character_,\n      drive_yards_penalized = NA_real_,\n      end_clock_time = NA_character_,\n      end_yard_line = NA_character_,\n      order_sequence = NA_real_,\n      time_of_day = NA_character_,\n      special_teams_play = NA_real_,\n      st_play_type = NA_character_,\n      # there seems to be no easy way to find the safety scoring team. Will hard code the plays\n      # as there are only 6 of them in the game center data\n      safety_team = dplyr::case_when(\n        .data$safety == 1 &\n          .data$game_id == \"1999_04_PHI_NYG\" &\n          .data$play_id == 827 ~ .data$posteam,\n        .data$safety == 1 &\n          .data$game_id == \"2000_03_ATL_CAR\" &\n          .data$play_id == 3423 ~ .data$posteam,\n        .data$safety == 1 &\n          .data$game_id == \"2000_16_OAK_SEA\" &\n          .data$play_id == 3590 ~ .data$posteam,\n        .data$safety == 1 &\n          .data$game_id == \"2001_14_DAL_SEA\" &\n          .data$play_id == 2552 ~ .data$posteam,\n        .data$safety == 1 &\n          .data$game_id == \"2003_03_NO_TEN\" &\n          .data$play_id == 416 ~ .data$posteam,\n        .data$safety == 1 &\n          .data$game_id == \"2009_08_STL_DET\" &\n          .data$play_id == 987 ~ .data$posteam,\n        .data$safety == 1 & .data$posteam == .data$home_team ~ .data$away_team,\n        .data$safety == 1 & .data$posteam == .data$away_team ~ .data$home_team,\n        TRUE ~ NA_character_\n      )\n    ) |>\n    dplyr::group_by(.data$drive) |>\n    dplyr::mutate(\n      drive_play_id_started = min(.data$play_id, na.rm = TRUE),\n      drive_play_seq_started = min(.data$play_id, na.rm = TRUE),\n      drive_play_id_ended = max(.data$play_id, na.rm = TRUE),\n      drive_play_seq_ended = max(.data$play_id, 
na.rm = TRUE)\n    ) |>\n    dplyr::ungroup()\n\n  # missing space in side of field breaks parser\n  if (gameId %in% c('2000_01_CAR_WAS', '2000_02_NE_NYJ', '2000_03_ATL_CAR')) {\n    combined <- combined |>\n      dplyr::mutate(\n        yardline_number = dplyr::case_when(\n          .data$yardline %in% c(\"WAS20\", \"NYJ20\", \"ATL20\") ~ 20,\n          TRUE ~ .data$yardline_number\n        ),\n        yardline = dplyr::case_when(\n          .data$yardline == \"WAS20\" ~ \"WAS 20\",\n          .data$yardline == \"NYJ20\" ~ \"NYJ 20\",\n          .data$yardline == \"ATL20\" ~ \"ATL 20\",\n          TRUE ~ .data$yardline\n        ),\n        yardline_side = dplyr::case_when(\n          .data$yardline_side == \"WAS20\" ~ \"WAS\",\n          .data$yardline_side == \"NYJ20\" ~ \"NYJ\",\n          .data$yardline_side == \"ATL20\" ~ \"ATL\",\n          TRUE ~ .data$yardline_side\n        )\n      )\n  }\n  return(combined)\n}\n"
  },
  {
    "path": "R/helper_scrape_nfl.R",
    "content": "################################################################################\n# Author: Sebastian Carl, Ben Baldwin\n# Purpose: Function for scraping pbp data from the new NFL web site\n# Code Style Guide: styler::tidyverse_style()\n################################################################################\n\n# Build a tidy version of scraped NFL data\n#\n# @param id Specifies the game\nget_pbp_nfl <- function(\n  id,\n  dir = getOption(\"nflfastR.raw_directory\", default = NULL),\n  ...\n) {\n  #testing\n  #id = '2022_01_PHI_DET'\n  # id = '2015_01_CAR_JAX'\n  #id = '2011_01_NO_GB'\n\n  season <- substr(id, 1, 4)\n  week <- as.integer(substr(id, 6, 7))\n\n  raw_data <- fetch_raw(game_id = id, dir = dir)\n\n  season_type <- dplyr::case_when(\n    season <= 2020 & week <= 17 ~ \"REG\",\n    season >= 2021 & week <= 18 ~ \"REG\",\n    TRUE ~ \"POST\"\n  )\n\n  game_id <- raw_data$data$viewer$gameDetail$id\n  home_team <- raw_data$data$viewer$gameDetail$homeTeam$abbreviation\n  away_team <- raw_data$data$viewer$gameDetail$visitorTeam$abbreviation\n  home_team <- data.table::fcase(\n    home_team == \"JAC\" , \"JAX\" ,\n    home_team == \"SD\"  , \"LAC\" ,\n    default = home_team\n  )\n  away_team <- data.table::fcase(\n    away_team == \"JAC\" , \"JAX\" ,\n    away_team == \"SD\"  , \"LAC\" ,\n    default = away_team\n  )\n\n  # if home team and away team are the same, the game is messed up and needs fixing\n  if (home_team == away_team) {\n    # get correct home and away from the game ID\n    id_parts <- stringr::str_split(id, \"_\")\n    away_team <- id_parts[[1]][3]\n    home_team <- id_parts[[1]][4]\n    bad_game <- 1\n  } else {\n    bad_game <- 0\n  }\n\n  weather <- ifelse(\n    is.null(raw_data$data$viewer$gameDetail$weather$shortDescription),\n    NA_character_,\n    raw_data$data$viewer$gameDetail$weather$shortDescription\n  )\n  stadium <- ifelse(\n    is.null(raw_data$data$viewer$gameDetail$stadium),\n    NA_character_,\n    
raw_data$data$viewer$gameDetail$stadium\n  )\n  start_time <- raw_data$data$viewer$gameDetail$startTime\n\n  game_info <- tibble::tibble(\n    game_id = as.character(game_id),\n    home_team,\n    away_team,\n    weather,\n    stadium,\n    start_time\n  )\n\n  plays <- raw_data$data$viewer$gameDetail$plays |>\n    dplyr::mutate(game_id = as.character(game_id))\n\n  # We have this issue https://github.com/nflverse/nflfastR/issues/309 with 2013 postseason games\n  # where the driveSequenceNumber in the plays df is NA for all plays. That prevents drive information\n  # from being joined.\n  # In this case, we compute our own driveSequenceNumber by incrementing a counter depending on the\n  # value of driveTimeOfPossession.\n  # driveTimeOfPossession will be a constant value during a drive so this should actually be accurate\n  if (all(is.na(plays$driveSequenceNumber))) {\n    plays <- plays |>\n      dplyr::mutate(\n        # First, create a trigger for cumsum\n        drive_trigger = dplyr::case_when(\n          # this is the first play of the first drive\n          is.na(dplyr::lag(.data$driveTimeOfPossession)) &\n            !is.na(.data$driveTimeOfPossession) ~ 1,\n          # if driveTimeOfPossession changes, there is a new drive\n          dplyr::lag(.data$driveTimeOfPossession) !=\n            .data$driveTimeOfPossession ~ 1,\n          TRUE ~ 0\n        ),\n        # Now create the drive number by accumulating triggers\n        driveSequenceNumber = cumsum(.data$drive_trigger),\n        # driveSequenceNumber should be NA on plays where driveTimeOfPossession is NA\n        driveSequenceNumber = ifelse(\n          is.na(.data$driveTimeOfPossession),\n          NA_real_,\n          .data$driveSequenceNumber\n        ),\n        # drop the helper\n        drive_trigger = NULL\n      )\n  }\n\n  drives <- raw_data$data$viewer$gameDetail$drives |>\n    dplyr::mutate(ydsnet = .data$yards + .data$yardsPenalized) |>\n    # these are already in plays\n    
dplyr::select(\n      -\"possessionTeam.abbreviation\",\n      -\"possessionTeam.nickName\",\n      -\"possessionTeam.franchise.currentLogo.url\"\n    ) |>\n    janitor::clean_names()\n  colnames(drives) <- paste0(\"drive_\", colnames(drives))\n\n  stats <- tidyr::unnest(\n    plays |> dplyr::select(-\"yards\"),\n    cols = c(\"playStats\")\n  ) |>\n    dplyr::mutate(\n      yards = as.integer(.data$yards),\n      statId = as.numeric(.data$statId),\n      team.abbreviation = as.character(.data$team.abbreviation)\n    ) |>\n    dplyr::rename(\n      player.esbId = \"gsisPlayer.id\",\n      player.displayName = \"playerName\",\n      teamAbbr = \"team.abbreviation\"\n    ) |>\n    dplyr::select(\n      \"playId\",\n      \"statId\",\n      \"yards\",\n      \"teamAbbr\",\n      \"player.displayName\",\n      \"player.esbId\"\n    )\n\n  # there was a penalty on this play so these stat IDs shouldn't exist\n  if (id == \"2020_10_DEN_LV\") {\n    stats <- stats |>\n      dplyr::filter(!(.data$playId == 979 & .data$statId %in% c(8, 10, 79)))\n  }\n\n  pbp_stats <- lapply(unique(stats$playId), sum_play_stats, stats)\n  pbp_stats <- data.table::rbindlist(pbp_stats) |> tibble::as_tibble()\n\n  combined <- game_info |>\n    dplyr::bind_cols(plays |> dplyr::select(-\"playStats\", -\"game_id\")) |>\n    dplyr::left_join(\n      drives,\n      by = c(\"driveSequenceNumber\" = \"drive_order_sequence\")\n    ) |>\n    dplyr::left_join(pbp_stats, by = c(\"playId\" = \"play_id\")) |>\n    dplyr::mutate_if(is.logical, as.numeric) |>\n    dplyr::mutate_if(is.integer, as.numeric) |>\n    dplyr::mutate_if(is.factor, as.character) |>\n    # The abbreviations SD <-> LAC and JAC <-> JAX are mixed up in the raw json data\n    # to make sure team names match, we normalize the names here\n    # We also remove new line characters esp. 
from desc\n    dplyr::mutate_if(\n      .predicate = is.character,\n      .funs = ~ team_name_fn(.x) |>\n        stringr::str_replace_all(\"[\\r\\n]\", \" \") |>\n        stringr::str_squish()\n    ) |>\n    janitor::clean_names() |>\n    dplyr::select(\n      -\"drive_play_count\",\n      -\"drive_time_of_possession\",\n      -\"next_play_type\"\n    ) |>\n    dplyr::rename(\n      time = \"clock_time\",\n      play_type_nfl = \"play_type\",\n      posteam = \"possession_team_abbreviation\",\n      yardline = \"yard_line\",\n      sp = \"scoring_play\",\n      drive = \"drive_sequence_number\",\n      nfl_api_id = \"game_id\",\n      drive_play_count = \"drive_play_count_2\",\n      drive_time_of_possession = \"drive_time_of_possession_2\",\n      ydsnet = \"drive_ydsnet\"\n    ) |>\n    dplyr::mutate(\n      posteam_id = .data$posteam,\n      # have to do all this nonsense to make goal_to_go and yardline_side for compatibility with later functions\n      yardline_side = str_split_and_extract(.data$yardline, \" \", 1),\n      yardline_number = as.numeric(str_split_and_extract(\n        .data$yardline,\n        \" \",\n        2\n      )),\n      quarter_end = dplyr::if_else(\n        stringr::str_detect(.data$play_description, \"END QUARTER\"),\n        1,\n        0\n      ),\n      game_year = as.integer(season),\n      season = as.integer(season),\n      # this is only needed for epa and dropped later\n      game_month = as.integer(11),\n      game_id = id,\n      play_description = .data$play_description_with_jersey_numbers,\n      week = week,\n      season_type = season_type,\n      play_clock = as.character(.data$play_clock),\n      st_play_type = as.character(.data$st_play_type),\n\n      # fix muffed punt td in JAC game\n      td_team = dplyr::if_else(\n        id == \"2011_14_TB_JAX\" & .data$play_id == 1343 & .data$td_team != \"JAX\",\n        'JAX',\n        .data$td_team\n      ),\n\n      # kickoff return TDs in old JAC games\n      td_team = 
dplyr::if_else(\n        id == \"2006_14_IND_JAX\" &\n          .data$play_id == 2078 &\n          .data$td_team != \"JAX\",\n        'JAX',\n        .data$td_team\n      ),\n      td_team = dplyr::if_else(\n        id == \"2007_17_JAX_HOU\" &\n          .data$play_id %in% c(1907, 2042) &\n          .data$td_team != \"JAX\",\n        'HOU',\n        .data$td_team\n      ),\n      td_team = dplyr::if_else(\n        id == \"2008_09_JAX_CIN\" &\n          .data$play_id == 3145 &\n          .data$td_team != \"JAX\",\n        'JAX',\n        .data$td_team\n      ),\n      td_team = dplyr::if_else(\n        id == \"2009_15_IND_JAX\" &\n          .data$play_id == 1088 &\n          .data$td_team != \"JAX\",\n        'IND',\n        .data$td_team\n      ),\n      td_team = dplyr::if_else(\n        id == \"2010_15_JAX_IND\" &\n          .data$play_id == 3848 &\n          .data$td_team != \"JAX\",\n        'IND',\n        .data$td_team\n      ),\n\n      time = dplyr::case_when(\n        id == '2012_04_NO_GB' & .data$play_id == 1085 ~ '3:34',\n        id == '2012_16_BUF_MIA' & .data$play_id == 2571 ~ '8:31',\n        TRUE ~ .data$time\n      ),\n      drive_real_start_time = as.character(.data$drive_real_start_time),\n      # get the safety team to ensure the correct team gets the points\n      # usage of base ifelse is important here for non-scoring games (i.e. early live games)\n      safety_team = ifelse(\n        .data$safety == 1,\n        .data$scoring_team_abbreviation,\n        NA_character_\n      ),\n\n      # can't trust the goal_to_go variable so we overwrite it here\n      goal_to_go = as.numeric(stringr::str_detect(\n        tolower(.data$pre_play_by_play),\n        \"goal\"\n      ))\n    ) |>\n    dplyr::mutate_if(\n      .predicate = is.character,\n      .funs = ~ dplyr::na_if(.x, \"\")\n    ) |>\n    # Data in 2023 pbp introduced separate \"plays\" for TV timeouts and two minute warnings\n    # These mess up some of our logic. 
Since they are useless, we remove them here\n    dplyr::filter(\n      !(is.na(.data$timeout_team) &\n        stringr::str_detect(\n          tolower(.data$play_description),\n          \"timeout at|two-minute\"\n        ))\n    ) |>\n    # Data in 2024 pbp introduced separate \"plays\" for injury updates\n    # These mess up some of our logic. Since they are useless, we remove them here\n    dplyr::filter(\n      !(is.na(.data$timeout_team) &\n        stringr::str_starts(\n          tolower(.data$play_description),\n          \"\\\\*\\\\* injury update:\"\n        ))\n    ) |>\n    fix_posteams()\n\n  # fix for games where home_team == away_team and fields are messed up\n  if (bad_game == 1) {\n    combined <- combined |>\n      fix_bad_games()\n  }\n\n  # nfl didn't fill in first downs on this game\n  if (id == '2018_01_ATL_PHI') {\n    combined <- combined |>\n      dplyr::mutate(\n        first_down_pass = dplyr::if_else(\n          .data$pass_attempt == 1 & .data$first_down == 1,\n          1,\n          .data$first_down_pass\n        ),\n        first_down_rush = dplyr::if_else(\n          .data$rush_attempt == 1 & .data$first_down == 1,\n          1,\n          .data$first_down_rush\n        ),\n\n        third_down_converted = dplyr::if_else(\n          .data$first_down == 1 & .data$down == 3,\n          1,\n          .data$third_down_converted\n        ),\n        fourth_down_converted = dplyr::if_else(\n          .data$first_down == 1 & .data$down == 4,\n          1,\n          .data$fourth_down_converted\n        ),\n\n        third_down_failed = dplyr::if_else(\n          .data$first_down == 0 & .data$down == 3,\n          1,\n          .data$third_down_failed\n        ),\n        fourth_down_failed = dplyr::if_else(\n          .data$first_down == 0 &\n            .data$down == 4 &\n            .data$play_type_nfl != \"FIELD_GOAL\" &\n            .data$play_type_nfl != \"PUNT\" &\n            .data$play_type_nfl != \"PENALTY\",\n          1,\n          
.data$fourth_down_failed\n        )\n      )\n  }\n\n  return(combined)\n}\n\n# helper function to manually fill in fields for problematic games\nfix_bad_games <- function(pbp) {\n  fixed <- pbp |>\n    dplyr::mutate(\n      #if team has the ball and scored, make them the scoring team\n      td_team = dplyr::if_else(\n        .data$drive_how_ended_description == 'Touchdown' &\n          !is.na(.data$td_team),\n        .data$posteam,\n        .data$td_team\n      ),\n      #if the defensive team scored, fill in the right team\n      td_team = dplyr::if_else(\n        #game involving the jags\n        #defensive TD\n        .data$drive_how_ended_description != 'Touchdown' &\n          !is.na(.data$td_team),\n        #if home team has ball, then away team scored, otherwise home team scored\n        dplyr::if_else(\n          .data$posteam == .data$home_team,\n          .data$away_team,\n          .data$home_team\n        ),\n        .data$td_team\n      ),\n      # fill in return team\n      return_team = dplyr::if_else(\n        !is.na(.data$return_team),\n        dplyr::if_else(\n          # if the home team has the ball, return team is away team (this is before we flip posteam for kickoffs)\n          .data$posteam == .data$home_team,\n          .data$away_team,\n          .data$home_team\n        ),\n        .data$return_team\n      ),\n      fumble_recovery_1_team = dplyr::if_else(\n        !is.na(.data$fumble_recovery_1_team),\n        # assign possession based on fumble_lost\n        dplyr::case_when(\n          .data$fumble_lost == 1 &\n            .data$posteam == .data$home_team ~ .data$away_team,\n          .data$fumble_lost == 1 &\n            .data$posteam == .data$away_team ~ .data$home_team,\n          .data$fumble_lost == 0 &\n            .data$posteam == .data$home_team ~ .data$home_team,\n          .data$fumble_lost == 0 &\n            .data$posteam == .data$away_team ~ .data$away_team\n        ),\n        .data$fumble_recovery_1_team\n      ),\n     
 timeout_team = dplyr::if_else(\n        # if there's a timeout in the affected seasons\n        !is.na(.data$timeout_team),\n        # extract from play description\n        stringr::str_extract(\n          .data$play_description,\n          \"(?<=Timeout #[1-3] by )[:upper:]+\"\n        ),\n        .data$timeout_team\n      )\n    )\n\n  return(fixed)\n}\n\nfix_posteams <- function(pbp) {\n  # Data source switch in 2023 introduced new problems\n  # 1. Definition of posteam on kick offs changed to receiving team. That's our\n  #    definition and we swap teams later.\n  # 2. Posteam doesn't change on the PAT after defensive TD\n  #\n  # We adjust both things here\n  # We need the variable pre_play_by_play which usually looks like \"KC  1-10  NYJ 40\"\n  if (\"pre_play_by_play\" %in% names(pbp)) {\n    # Let's be as explicit as possible about what we want to extract from the string\n    # It's really only the first valid team abbreviation followed by a blank space\n    valid_team_abbrs <- paste(\n      nflfastR::teams_colors_logos$team_abbr,\n      collapse = \" |\"\n    )\n    posteam_regex <- paste0(\"^\", valid_team_abbrs, \"(?=[:space:])\")\n\n    pbp <- pbp |>\n      dplyr::mutate(\n        parsed_posteam = stringr::str_extract(\n          .data$pre_play_by_play,\n          posteam_regex\n        ) |>\n          stringr::str_trim(),\n        posteam = dplyr::case_when(\n          stringr::str_detect(\n            .data$play_description,\n            \"^Timeout \"\n          ) ~ NA_character_,\n          is.na(.data$parsed_posteam) ~ .data$posteam,\n          .data$play_description == \"GAME\" ~ NA_character_,\n          TRUE ~ .data$parsed_posteam\n        ),\n        # drop helper\n        parsed_posteam = NULL\n      )\n  }\n\n  pbp\n}\n"
  },
  {
    "path": "R/helper_tidy_play_stats.R",
    "content": "################################################################################\n# Author: Sebastian Carl\n# Purpose: Create a single row with all play stats of a given play built in the\n#          Scraper Functions\n# Code Style Guide: styler::tidyverse_style()\n################################################################################\n\n# Build a single row for tidy data structure\n#\n# This is a sub-function for the get_pbp_nfl and get_pbp_gc functions.\n#\n# @param play_Id (integer) Specifies the play_Id for which the stats should be combined\n# @param stats A dataframe including multiple rows for each play_Id holding\n# gsis stat ids and stats\nsum_play_stats <- function(play_Id, stats) {\n  play_stats <- stats[stats$playId == play_Id, ]\n\n  row <- c(\"play_id\" = as.integer(play_Id), tidy_play_stats_row)\n\n  for (index in seq_along(play_stats$playId)) {\n    stat_id <- play_stats$statId[index]\n    if (stat_id == 2) {\n      row$punt_blocked <- 1\n      row$punt_attempt <- 1\n      row$kick_distance <- play_stats$yards[index]\n      row$punter_player_id <- play_stats$player.esbId[index]\n      row$punter_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 3) {\n      row$first_down_rush <- 1\n    } else if (stat_id == 4) {\n      row$first_down_pass <- 1\n    } else if (stat_id == 5) {\n      row$first_down_penalty <- 1\n    } else if (stat_id == 6) {\n      row$third_down_converted <- 1\n    } else if (stat_id == 7) {\n      row$third_down_failed <- 1\n    } else if (stat_id == 8) {\n      row$fourth_down_converted <- 1\n    } else if (stat_id == 9) {\n      row$fourth_down_failed <- 1\n    } else if (stat_id == 10) {\n      row$rush_attempt <- 1\n      row$rusher_player_id <- play_stats$player.esbId[index]\n      row$rusher_player_name <- play_stats$player.displayName[index]\n      row$yards_gained <- play_stats$yards[index]\n      row$rushing_yards <- play_stats$yards[index]\n    } else if (stat_id == 11) {\n 
     row$rush_attempt <- 1\n      row$touchdown <- 1\n      row$first_down_rush <- 1\n      row$rush_touchdown <- 1\n      row$rusher_player_id <- play_stats$player.esbId[index]\n      row$rusher_player_name <- play_stats$player.displayName[index]\n      row$yards_gained <- play_stats$yards[index]\n      row$rushing_yards <- play_stats$yards[index]\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 12) {\n      row$rush_attempt <- 1\n      row$lateral_rush <- 1\n      row$lateral_rusher_player_id <- play_stats$player.esbId[index]\n      row$lateral_rusher_player_name <- play_stats$player.displayName[index]\n      row$yards_gained <- play_stats$yards[index]\n      row$lateral_rushing_yards <- play_stats$yards[index]\n    } else if (stat_id == 13) {\n      row$rush_attempt <- 1\n      row$touchdown <- 1\n      row$rush_touchdown <- 1\n      row$lateral_rush <- 1\n      row$lateral_rusher_player_id <- play_stats$player.esbId[index]\n      row$lateral_rusher_player_name <- play_stats$player.displayName[index]\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n      row$yards_gained <- play_stats$yards[index]\n      row$lateral_rushing_yards <- play_stats$yards[index]\n    } else if (stat_id == 14) {\n      row$incomplete_pass <- 1\n      row$pass_attempt <- 1\n      row$passer_player_id <- play_stats$player.esbId[index]\n      row$passer_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 15) {\n      row$pass_attempt <- 1\n      row$complete_pass <- 1\n      row$passer_player_id <- play_stats$player.esbId[index]\n      row$passer_player_name <- play_stats$player.displayName[index]\n      row$yards_gained <- play_stats$yards[index]\n      row$passing_yards <- 
play_stats$yards[index]\n    } else if (stat_id == 16) {\n      row$pass_attempt <- 1\n      row$touchdown <- 1\n      row$pass_touchdown <- 1\n      row$complete_pass <- 1\n      row$passer_player_id <- play_stats$player.esbId[index]\n      row$passer_player_name <- play_stats$player.displayName[index]\n      row$yards_gained <- play_stats$yards[index]\n      row$passing_yards <- play_stats$yards[index]\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 19) {\n      row$interception <- 1\n      row$pass_attempt <- 1\n      row$passer_player_id <- play_stats$player.esbId[index]\n      row$passer_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 20) {\n      row$pass_attempt <- 1\n      row$sack <- 1\n      row$passer_player_id <- play_stats$player.esbId[index]\n      row$passer_player_name <- play_stats$player.displayName[index]\n      row$yards_gained <- play_stats$yards[index]\n    } else if (stat_id == 21) {\n      row$pass_attempt <- 1\n      row$complete_pass <- 1\n      row$receiver_player_id <- play_stats$player.esbId[index]\n      row$receiver_player_name <- play_stats$player.displayName[index]\n      row$yards_gained <- play_stats$yards[index]\n      row$receiving_yards <- play_stats$yards[index]\n    } else if (stat_id == 22) {\n      row$pass_attempt <- 1\n      row$touchdown <- 1\n      row$pass_touchdown <- 1\n      row$complete_pass <- 1\n      row$receiver_player_id <- play_stats$player.esbId[index]\n      row$receiver_player_name <- play_stats$player.displayName[index]\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n      row$yards_gained <- play_stats$yards[index]\n      row$receiving_yards <- play_stats$yards[index]\n    } else if (stat_id == 23) 
{\n      row$pass_attempt <- 1\n      row$complete_pass <- 1\n      row$lateral_reception <- 1\n      row$lateral_receiver_player_id <- play_stats$player.esbId[index]\n      row$lateral_receiver_player_name <- play_stats$player.displayName[index]\n      row$yards_gained <- play_stats$yards[index]\n      row$lateral_receiving_yards <- play_stats$yards[index]\n    } else if (stat_id == 24) {\n      row$pass_attempt <- 1\n      row$touchdown <- 1\n      row$pass_touchdown <- 1\n      row$complete_pass <- 1\n      row$lateral_reception <- 1\n      row$lateral_receiver_player_id <- play_stats$player.esbId[index]\n      row$lateral_receiver_player_name <- play_stats$player.displayName[index]\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n      row$yards_gained <- play_stats$yards[index]\n      row$lateral_receiving_yards <- play_stats$yards[index]\n    } else if (stat_id == 25) {\n      row$pass_attempt <- 1\n      row$interception_player_id <- play_stats$player.esbId[index]\n      row$interception_player_name <- play_stats$player.displayName[index]\n      row$return_team <- play_stats$teamAbbr[index]\n      row$return_yards <- play_stats$yards[index]\n      row$return_penalty_fix <- 1\n    } else if (stat_id == 26) {\n      row$pass_attempt <- 1\n      row$touchdown <- 1\n      row$return_touchdown <- 1\n      row$interception_player_id <- play_stats$player.esbId[index]\n      row$interception_player_name <- play_stats$player.displayName[index]\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n      row$return_team <- play_stats$teamAbbr[index]\n      row$return_yards <- play_stats$yards[index]\n      row$return_penalty_fix <- 1\n    } else if (stat_id == 27) {\n      row$pass_attempt <- 1\n      row$lateral_return <- 1\n   
   row$lateral_interception_player_id <- play_stats$player.esbId[index]\n      row$lateral_interception_player_name <- play_stats$player.displayName[\n        index\n      ]\n      row$return_yards <- play_stats$yards[index]\n      row$return_penalty_fix <- 1\n    } else if (stat_id == 28) {\n      row$pass_attempt <- 1\n      row$touchdown <- 1\n      row$return_touchdown <- 1\n      row$lateral_return <- 1\n      row$lateral_interception_player_id <- play_stats$player.esbId[index]\n      row$lateral_interception_player_name <- play_stats$player.displayName[\n        index\n      ]\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n      row$return_yards <- play_stats$yards[index]\n      row$return_penalty_fix <- 1\n    } else if (stat_id == 29) {\n      row$punt_attempt <- 1\n      row$punter_player_id <- play_stats$player.esbId[index]\n      row$punter_player_name <- play_stats$player.displayName[index]\n      row$kick_distance <- play_stats$yards[index]\n    } else if (stat_id == 30) {\n      # yards always zero for stat_id 30 (punt inside 20) so we don't write kick_distance here\n      row$punt_inside_twenty <- 1\n      row$punt_attempt <- 1\n      row$punter_player_id <- play_stats$player.esbId[index]\n      row$punter_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 31) {\n      row$punt_in_endzone <- 1\n      row$punt_attempt <- 1\n      row$punter_player_id <- play_stats$player.esbId[index]\n      row$punter_player_name <- play_stats$player.displayName[index]\n      row$kick_distance <- play_stats$yards[index]\n    } else if (stat_id == 32) {\n      row$punt_attempt <- 1\n      row$kick_distance <- play_stats$yards[index]\n      row$punter_player_id <- play_stats$player.esbId[index]\n      row$punter_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 33) {\n      
row$punt_attempt <- 1\n      row$punt_returner_player_id <- play_stats$player.esbId[index]\n      row$punt_returner_player_name <- play_stats$player.displayName[index]\n      row$return_yards <- play_stats$yards[index]\n      row$return_team <- play_stats$teamAbbr[index]\n      row$return_penalty_fix <- 1\n    } else if (stat_id == 34) {\n      row$touchdown <- 1\n      row$return_touchdown <- 1\n      row$punt_attempt <- 1\n      row$punt_returner_player_id <- play_stats$player.esbId[index]\n      row$punt_returner_player_name <- play_stats$player.displayName[index]\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n      row$return_team <- play_stats$teamAbbr[index]\n      row$return_yards <- play_stats$yards[index]\n      row$return_penalty_fix <- 1\n    } else if (stat_id == 35) {\n      row$punt_attempt <- 1\n      row$lateral_return <- 1\n      row$lateral_punt_returner_player_id <- play_stats$player.esbId[index]\n      row$lateral_punt_returner_player_name <- play_stats$player.displayName[\n        index\n      ]\n      row$return_yards <- play_stats$yards[index]\n      row$return_penalty_fix <- 1\n    } else if (stat_id == 36) {\n      row$touchdown <- 1\n      row$return_touchdown <- 1\n      row$punt_attempt <- 1\n      row$lateral_return <- 1\n      row$lateral_punt_returner_player_id <- play_stats$player.esbId[index]\n      row$lateral_punt_returner_player_name <- play_stats$player.displayName[\n        index\n      ]\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n      row$return_yards <- play_stats$yards[index]\n      row$return_team <- play_stats$teamAbbr[index]\n      row$return_penalty_fix <- 1\n    } else if (stat_id == 37) {\n      row$punt_out_of_bounds <- 1\n      row$punt_attempt <- 1\n      
row$return_yards <- 0\n      row$return_team <- play_stats$teamAbbr[index]\n    } else if (stat_id == 38) {\n      row$punt_downed <- 1\n      row$punt_attempt <- 1\n      row$return_team <- play_stats$teamAbbr[index]\n    } else if (stat_id == 39) {\n      row$punt_fair_catch <- 1\n      row$punt_attempt <- 1\n      row$punt_returner_player_id <- play_stats$player.esbId[index]\n      row$punt_returner_player_name <- play_stats$player.displayName[index]\n      row$return_team <- play_stats$teamAbbr[index]\n    } else if (stat_id == 40) {\n      row$punt_attempt <- 1\n      row$return_team <- play_stats$teamAbbr[index]\n    } else if (stat_id == 41) {\n      row$kickoff_attempt <- 1\n      row$kicker_player_id <- play_stats$player.esbId[index]\n      row$kicker_player_name <- play_stats$player.displayName[index]\n      row$kick_distance <- play_stats$yards[index]\n    } else if (stat_id == 42) {\n      # yards always zero for stat_id 42 so we don't write kick_distance here\n      row$kickoff_inside_twenty <- 1\n      row$kickoff_attempt <- 1\n      row$kicker_player_id <- play_stats$player.esbId[index]\n      row$kicker_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 43) {\n      row$kickoff_in_endzone <- 1\n      row$kickoff_attempt <- 1\n      row$kick_distance <- play_stats$yards[index]\n      row$kicker_player_id <- play_stats$player.esbId[index]\n      row$kicker_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 44) {\n      row$kickoff_attempt <- 1\n      row$kick_distance <- play_stats$yards[index]\n      row$kicker_player_id <- play_stats$player.esbId[index]\n      row$kicker_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 45) {\n      row$kickoff_attempt <- 1\n      row$kickoff_returner_player_id <- play_stats$player.esbId[index]\n      row$kickoff_returner_player_name <- play_stats$player.displayName[index]\n      row$return_yards <- play_stats$yards[index]\n      
row$return_team <- play_stats$teamAbbr[index]\n      row$return_penalty_fix <- 1\n    } else if (stat_id == 46) {\n      row$touchdown <- 1\n      row$return_touchdown <- 1\n      row$kickoff_attempt <- 1\n      row$kickoff_returner_player_id <- play_stats$player.esbId[index]\n      row$kickoff_returner_player_name <- play_stats$player.displayName[index]\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n      row$return_yards <- play_stats$yards[index]\n      row$return_team <- play_stats$teamAbbr[index]\n      row$return_penalty_fix <- 1\n    } else if (stat_id == 47) {\n      row$kickoff_attempt <- 1\n      row$lateral_return <- 1\n      row$lateral_kickoff_returner_player_id <- play_stats$player.esbId[index]\n      row$lateral_kickoff_returner_player_name <- play_stats$player.displayName[\n        index\n      ]\n      row$return_yards <- play_stats$yards[index]\n      row$return_team <- play_stats$teamAbbr[index]\n      row$return_penalty_fix <- 1\n    } else if (stat_id == 48) {\n      row$touchdown <- 1\n      row$return_touchdown <- 1\n      row$kickoff_attempt <- 1\n      row$lateral_return <- 1\n      row$lateral_kickoff_returner_player_id <- play_stats$player.esbId[index]\n      row$lateral_kickoff_returner_player_name <- play_stats$player.displayName[\n        index\n      ]\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n      row$return_yards <- play_stats$yards[index]\n      row$return_team <- play_stats$teamAbbr[index]\n      row$return_penalty_fix <- 1\n    } else if (stat_id == 49) {\n      row$kickoff_out_of_bounds <- 1\n      row$kickoff_attempt <- 1\n      row$return_team <- play_stats$teamAbbr[index]\n    } else if (stat_id == 50) {\n      row$kickoff_fair_catch <- 1\n      row$kickoff_attempt <- 
1\n      row$kickoff_returner_player_id <- play_stats$player.esbId[index]\n      row$kickoff_returner_player_name <- play_stats$player.displayName[index]\n      row$return_team <- play_stats$teamAbbr[index]\n    } else if (stat_id == 51) {\n      row$kickoff_attempt <- 1\n      row$return_team <- play_stats$teamAbbr[index]\n    } else if (stat_id == 52) {\n      row$fumble_forced <- 1\n      row$fumble <- 1\n      row$fumbled_1_player_id <-\n        if_else(\n          is.na(row$fumbled_1_player_id),\n          play_stats$player.esbId[index],\n          row$fumbled_1_player_id\n        )\n      row$fumbled_1_player_name <-\n        if_else(\n          is.na(row$fumbled_1_player_name),\n          play_stats$player.displayName[index],\n          row$fumbled_1_player_name\n        )\n      row$fumbled_1_team <-\n        if_else(\n          is.na(row$fumbled_1_team),\n          play_stats$teamAbbr[index],\n          row$fumbled_1_team\n        )\n      row$fumbled_2_player_id <-\n        if_else(\n          is.na(row$fumbled_2_player_id) &\n            row$fumbled_1_player_id != play_stats$player.esbId[index],\n          play_stats$player.esbId[index],\n          row$fumbled_2_player_id\n        )\n      row$fumbled_2_player_name <-\n        if_else(\n          is.na(row$fumbled_2_player_name) &\n            row$fumbled_1_player_name != play_stats$player.displayName[index],\n          play_stats$player.displayName[index],\n          row$fumbled_2_player_name\n        )\n      row$fumbled_2_team <-\n        if_else(\n          is.na(row$fumbled_2_team) &\n            row$fumbled_1_player_name != play_stats$player.displayName[index],\n          # row$fumbled_1_team != play_stats$teamAbbr[index], # can't use team here because multiple players of the same team are possible\n          play_stats$teamAbbr[index],\n          row$fumbled_2_team\n        )\n    } else if (stat_id == 53) {\n      row$fumble_not_forced <- 1\n      row$fumble <- 1\n      row$fumbled_1_player_id 
<-\n        if_else(\n          is.na(row$fumbled_1_player_id),\n          play_stats$player.esbId[index],\n          row$fumbled_1_player_id\n        )\n      row$fumbled_1_player_name <-\n        if_else(\n          is.na(row$fumbled_1_player_name),\n          play_stats$player.displayName[index],\n          row$fumbled_1_player_name\n        )\n      row$fumbled_1_team <-\n        if_else(\n          is.na(row$fumbled_1_team),\n          play_stats$teamAbbr[index],\n          row$fumbled_1_team\n        )\n      row$fumbled_2_player_id <-\n        if_else(\n          is.na(row$fumbled_2_player_id) &\n            row$fumbled_1_player_id != play_stats$player.esbId[index],\n          play_stats$player.esbId[index],\n          row$fumbled_2_player_id\n        )\n      row$fumbled_2_player_name <-\n        if_else(\n          is.na(row$fumbled_2_player_name) &\n            row$fumbled_1_player_name != play_stats$player.displayName[index],\n          play_stats$player.displayName[index],\n          row$fumbled_2_player_name\n        )\n      row$fumbled_2_team <-\n        if_else(\n          is.na(row$fumbled_2_team) &\n            row$fumbled_1_player_name != play_stats$player.displayName[index],\n          # row$fumbled_1_team != play_stats$teamAbbr[index],\n          play_stats$teamAbbr[index],\n          row$fumbled_2_team\n        )\n    } else if (stat_id == 54) {\n      row$fumble_out_of_bounds <- 1\n      row$fumble <- 1\n      row$fumbled_1_player_id <-\n        if_else(\n          is.na(row$fumbled_1_player_id),\n          play_stats$player.esbId[index],\n          row$fumbled_1_player_id\n        )\n      row$fumbled_1_player_name <-\n        if_else(\n          is.na(row$fumbled_1_player_name),\n          play_stats$player.displayName[index],\n          row$fumbled_1_player_name\n        )\n      row$fumbled_1_team <-\n        if_else(\n          is.na(row$fumbled_1_team),\n          play_stats$teamAbbr[index],\n          row$fumbled_1_team\n        )\n    
  row$fumbled_2_player_id <-\n        if_else(\n          is.na(row$fumbled_2_player_id) &\n            row$fumbled_1_player_id != play_stats$player.esbId[index],\n          play_stats$player.esbId[index],\n          row$fumbled_2_player_id\n        )\n      row$fumbled_2_player_name <-\n        if_else(\n          is.na(row$fumbled_2_player_name) &\n            row$fumbled_1_player_name != play_stats$player.displayName[index],\n          play_stats$player.displayName[index],\n          row$fumbled_2_player_name\n        )\n      row$fumbled_2_team <-\n        if_else(\n          is.na(row$fumbled_2_team) &\n            row$fumbled_1_player_name != play_stats$player.displayName[index],\n          # row$fumbled_1_team != play_stats$teamAbbr[index],\n          play_stats$teamAbbr[index],\n          row$fumbled_2_team\n        )\n    } else if (stat_id == 55) {\n      row$fumble <- 1\n      row$fumble_recovery_1_player_id <-\n        if_else(\n          is.na(row$fumble_recovery_1_player_id),\n          play_stats$player.esbId[index],\n          row$fumble_recovery_1_player_id\n        )\n      row$fumble_recovery_1_player_name <-\n        if_else(\n          is.na(row$fumble_recovery_1_player_name),\n          play_stats$player.displayName[index],\n          row$fumble_recovery_1_player_name\n        )\n      row$fumble_recovery_1_team <-\n        if_else(\n          is.na(row$fumble_recovery_1_team),\n          play_stats$teamAbbr[index],\n          row$fumble_recovery_1_team\n        )\n      row$fumble_recovery_1_yards <-\n        if_else(\n          is.na(row$fumble_recovery_1_yards),\n          play_stats$yards[index],\n          row$fumble_recovery_1_yards\n        )\n      row$fumble_recovery_2_player_id <-\n        if_else(\n          is.na(row$fumble_recovery_2_player_id) &\n            row$fumble_recovery_1_player_id != play_stats$player.esbId[index],\n          play_stats$player.esbId[index],\n          row$fumble_recovery_2_player_id\n        )\n      
row$fumble_recovery_2_player_name <-\n        if_else(\n          is.na(row$fumble_recovery_2_player_name) &\n            row$fumble_recovery_1_player_name !=\n              play_stats$player.displayName[index],\n          play_stats$player.displayName[index],\n          row$fumble_recovery_2_player_name\n        )\n      row$fumble_recovery_2_team <-\n        if_else(\n          is.na(row$fumble_recovery_2_team) &\n            row$fumble_recovery_1_player_name !=\n              play_stats$player.displayName[index],\n          # row$fumble_recovery_1_team != play_stats$teamAbbr[index],\n          play_stats$teamAbbr[index],\n          row$fumble_recovery_2_team\n        )\n      row$fumble_recovery_2_yards <-\n        if_else(\n          is.na(row$fumble_recovery_2_yards) &\n            row$fumble_recovery_1_player_name !=\n              play_stats$player.displayName[index],\n          # row$fumble_recovery_1_yards != play_stats$yards[index],\n          play_stats$yards[index],\n          row$fumble_recovery_2_yards\n        )\n    } else if (stat_id == 56) {\n      row$touchdown <- 1\n      row$fumble <- 1\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n      row$fumble_recovery_1_player_id <-\n        if_else(\n          is.na(row$fumble_recovery_1_player_id),\n          play_stats$player.esbId[index],\n          row$fumble_recovery_1_player_id\n        )\n      row$fumble_recovery_1_player_name <-\n        if_else(\n          is.na(row$fumble_recovery_1_player_name),\n          play_stats$player.displayName[index],\n          row$fumble_recovery_1_player_name\n        )\n      row$fumble_recovery_1_team <-\n        if_else(\n          is.na(row$fumble_recovery_1_team),\n          play_stats$teamAbbr[index],\n          row$fumble_recovery_1_team\n        )\n      row$fumble_recovery_1_yards <-\n        if_else(\n          
is.na(row$fumble_recovery_1_yards),\n          play_stats$yards[index],\n          row$fumble_recovery_1_yards\n        )\n      row$fumble_recovery_2_player_id <-\n        if_else(\n          is.na(row$fumble_recovery_2_player_id) &\n            row$fumble_recovery_1_player_id != play_stats$player.esbId[index],\n          play_stats$player.esbId[index],\n          row$fumble_recovery_2_player_id\n        )\n      row$fumble_recovery_2_player_name <-\n        if_else(\n          is.na(row$fumble_recovery_2_player_name) &\n            row$fumble_recovery_1_player_name !=\n              play_stats$player.displayName[index],\n          play_stats$player.displayName[index],\n          row$fumble_recovery_2_player_name\n        )\n      row$fumble_recovery_2_team <-\n        if_else(\n          is.na(row$fumble_recovery_2_team) &\n            row$fumble_recovery_1_player_name !=\n              play_stats$player.displayName[index],\n          # row$fumble_recovery_1_team != play_stats$teamAbbr[index],\n          play_stats$teamAbbr[index],\n          row$fumble_recovery_2_team\n        )\n      row$fumble_recovery_2_yards <-\n        if_else(\n          is.na(row$fumble_recovery_2_yards) &\n            row$fumble_recovery_1_player_name !=\n              play_stats$player.displayName[index],\n          # row$fumble_recovery_1_yards != play_stats$yards[index],\n          play_stats$yards[index],\n          row$fumble_recovery_2_yards\n        )\n    } else if (stat_id == 57) {\n      row$fumble <- 1\n      row$lateral_recovery <- 1\n    } else if (stat_id == 58) {\n      row$touchdown <- 1\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n      row$fumble <- 1\n      row$lateral_recovery <- 1\n    } else if (stat_id == 59) {\n      row$fumble <- 1\n      row$fumble_recovery_1_player_id <-\n        if_else(\n          
is.na(row$fumble_recovery_1_player_id),\n          play_stats$player.esbId[index],\n          row$fumble_recovery_1_player_id\n        )\n      row$fumble_recovery_1_player_name <-\n        if_else(\n          is.na(row$fumble_recovery_1_player_name),\n          play_stats$player.displayName[index],\n          row$fumble_recovery_1_player_name\n        )\n      row$fumble_recovery_1_team <-\n        if_else(\n          is.na(row$fumble_recovery_1_team),\n          play_stats$teamAbbr[index],\n          row$fumble_recovery_1_team\n        )\n      row$fumble_recovery_1_yards <-\n        if_else(\n          is.na(row$fumble_recovery_1_yards),\n          play_stats$yards[index],\n          row$fumble_recovery_1_yards\n        )\n      row$fumble_recovery_2_player_id <-\n        if_else(\n          is.na(row$fumble_recovery_2_player_id) &\n            row$fumble_recovery_1_player_id != play_stats$player.esbId[index],\n          play_stats$player.esbId[index],\n          row$fumble_recovery_2_player_id\n        )\n      row$fumble_recovery_2_player_name <-\n        if_else(\n          is.na(row$fumble_recovery_2_player_name) &\n            row$fumble_recovery_1_player_name !=\n              play_stats$player.displayName[index],\n          play_stats$player.displayName[index],\n          row$fumble_recovery_2_player_name\n        )\n      row$fumble_recovery_2_team <-\n        if_else(\n          is.na(row$fumble_recovery_2_team) &\n            row$fumble_recovery_1_player_name !=\n              play_stats$player.displayName[index],\n          # row$fumble_recovery_1_team != play_stats$teamAbbr[index],\n          play_stats$teamAbbr[index],\n          row$fumble_recovery_2_team\n        )\n      row$fumble_recovery_2_yards <-\n        if_else(\n          is.na(row$fumble_recovery_2_yards) &\n            row$fumble_recovery_1_player_name !=\n              play_stats$player.displayName[index],\n          # row$fumble_recovery_1_yards != play_stats$yards[index],\n          
play_stats$yards[index],\n          row$fumble_recovery_2_yards\n        )\n    } else if (stat_id == 60) {\n      row$touchdown <- 1\n      row$return_touchdown <- 1\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n      row$fumble <- 1\n      row$fumble_recovery_1_player_id <-\n        if_else(\n          is.na(row$fumble_recovery_1_player_id),\n          play_stats$player.esbId[index],\n          row$fumble_recovery_1_player_id\n        )\n      row$fumble_recovery_1_player_name <-\n        if_else(\n          is.na(row$fumble_recovery_1_player_name),\n          play_stats$player.displayName[index],\n          row$fumble_recovery_1_player_name\n        )\n      row$fumble_recovery_1_team <-\n        if_else(\n          is.na(row$fumble_recovery_1_team),\n          play_stats$teamAbbr[index],\n          row$fumble_recovery_1_team\n        )\n      row$fumble_recovery_1_yards <-\n        if_else(\n          is.na(row$fumble_recovery_1_yards),\n          play_stats$yards[index],\n          row$fumble_recovery_1_yards\n        )\n      row$fumble_recovery_2_player_id <-\n        if_else(\n          is.na(row$fumble_recovery_2_player_id) &\n            row$fumble_recovery_1_player_id != play_stats$player.esbId[index],\n          play_stats$player.esbId[index],\n          row$fumble_recovery_2_player_id\n        )\n      row$fumble_recovery_2_player_name <-\n        if_else(\n          is.na(row$fumble_recovery_2_player_name) &\n            row$fumble_recovery_1_player_name !=\n              play_stats$player.displayName[index],\n          play_stats$player.displayName[index],\n          row$fumble_recovery_2_player_name\n        )\n      row$fumble_recovery_2_team <-\n        if_else(\n          is.na(row$fumble_recovery_2_team) &\n            row$fumble_recovery_1_player_name !=\n              play_stats$player.displayName[index],\n          # 
row$fumble_recovery_1_team != play_stats$teamAbbr[index],\n          play_stats$teamAbbr[index],\n          row$fumble_recovery_2_team\n        )\n      row$fumble_recovery_2_yards <-\n        if_else(\n          is.na(row$fumble_recovery_2_yards) &\n            row$fumble_recovery_1_player_name !=\n              play_stats$player.displayName[index],\n          # row$fumble_recovery_1_yards != play_stats$yards[index],\n          play_stats$yards[index],\n          row$fumble_recovery_2_yards\n        )\n    } else if (stat_id == 61) {\n      row$fumble <- 1\n      row$lateral_recovery <- 1\n    } else if (stat_id == 62) {\n      row$touchdown <- 1\n      row$return_touchdown <- 1\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n      row$fumble <- 1\n      row$lateral_recovery <- 1\n    } else if (stat_id == 63) {\n      NULL\n    } else if (stat_id == 64) {\n      row$touchdown <- 1\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 68) {\n      row$timeout <- 1\n      row$timeout_team <- play_stats$teamAbbr[index]\n    } else if (stat_id == 69) {\n      row$field_goal_missed <- 1\n      row$field_goal_attempt <- 1\n      row$kicker_player_id <- play_stats$player.esbId[index]\n      row$kicker_player_name <- play_stats$player.displayName[index]\n      row$kick_distance <- play_stats$yards[index]\n    } else if (stat_id == 70) {\n      row$field_goal_made <- 1\n      row$field_goal_attempt <- 1\n      row$kicker_player_id <- play_stats$player.esbId[index]\n      row$kicker_player_name <- play_stats$player.displayName[index]\n      row$kick_distance <- play_stats$yards[index]\n    } else if (stat_id == 71) {\n      row$field_goal_blocked <- 1\n      row$field_goal_attempt <- 1\n      
row$kicker_player_id <- play_stats$player.esbId[index]\n      row$kicker_player_name <- play_stats$player.displayName[index]\n      row$kick_distance <- play_stats$yards[index]\n    } else if (stat_id == 72) {\n      row$extra_point_good <- 1\n      row$extra_point_attempt <- 1\n      row$kicker_player_id <- play_stats$player.esbId[index]\n      row$kicker_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 73) {\n      row$extra_point_failed <- 1\n      row$extra_point_attempt <- 1\n      row$kicker_player_id <- play_stats$player.esbId[index]\n      row$kicker_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 74) {\n      row$extra_point_blocked <- 1\n      row$extra_point_attempt <- 1\n      row$kicker_player_id <- play_stats$player.esbId[index]\n      row$kicker_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 75) {\n      row$two_point_rush_good <- 1\n      row$rush_attempt <- 1\n      row$two_point_attempt <- 1\n      row$rusher_player_id <- play_stats$player.esbId[index]\n      row$rusher_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 76) {\n      row$two_point_rush_failed <- 1\n      row$rush_attempt <- 1\n      row$two_point_attempt <- 1\n      row$rusher_player_id <- play_stats$player.esbId[index]\n      row$rusher_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 77) {\n      row$two_point_pass_good <- 1\n      row$pass_attempt <- 1\n      row$two_point_attempt <- 1\n      row$passer_player_id <- play_stats$player.esbId[index]\n      row$passer_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 78) {\n      row$two_point_pass_failed <- 1\n      row$pass_attempt <- 1\n      row$two_point_attempt <- 1\n      row$passer_player_id <- play_stats$player.esbId[index]\n      row$passer_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 79) {\n      row$solo_tackle 
<- 1\n      row$solo_tackle_1_player_id <-\n        if_else(\n          is.na(row$solo_tackle_1_player_id),\n          play_stats$player.esbId[index],\n          row$solo_tackle_1_player_id\n        )\n      row$solo_tackle_1_player_name <-\n        if_else(\n          is.na(row$solo_tackle_1_player_name),\n          play_stats$player.displayName[index],\n          row$solo_tackle_1_player_name\n        )\n      row$solo_tackle_1_team <-\n        if_else(\n          is.na(row$solo_tackle_1_team),\n          play_stats$teamAbbr[index],\n          row$solo_tackle_1_team\n        )\n      row$solo_tackle_2_player_id <-\n        if_else(\n          is.na(row$solo_tackle_2_player_id) &\n            row$solo_tackle_1_player_id != play_stats$player.esbId[index],\n          play_stats$player.esbId[index],\n          row$solo_tackle_2_player_id\n        )\n      row$solo_tackle_2_player_name <-\n        if_else(\n          is.na(row$solo_tackle_2_player_name) &\n            row$solo_tackle_1_player_name !=\n              play_stats$player.displayName[index],\n          play_stats$player.displayName[index],\n          row$solo_tackle_2_player_name\n        )\n      row$solo_tackle_2_team <-\n        if_else(\n          is.na(row$solo_tackle_2_team) &\n            row$solo_tackle_1_player_name !=\n              play_stats$player.displayName[index],\n          # row$solo_tackle_1_team != play_stats$teamAbbr[index],\n          play_stats$teamAbbr[index],\n          row$solo_tackle_2_team\n        )\n    } else if (stat_id == 80) {\n      row$tackle_with_assist <- 1\n      row$tackle_with_assist_1_player_id <-\n        if_else(\n          is.na(row$tackle_with_assist_1_player_id),\n          play_stats$player.esbId[index],\n          row$tackle_with_assist_1_player_id\n        )\n      row$tackle_with_assist_1_player_name <-\n        if_else(\n          is.na(row$tackle_with_assist_1_player_name),\n          play_stats$player.displayName[index],\n          
row$tackle_with_assist_1_player_name\n        )\n      row$tackle_with_assist_1_team <-\n        if_else(\n          is.na(row$tackle_with_assist_1_team),\n          play_stats$teamAbbr[index],\n          row$tackle_with_assist_1_team\n        )\n      row$tackle_with_assist_2_player_id <-\n        if_else(\n          is.na(row$tackle_with_assist_2_player_id) &\n            row$tackle_with_assist_1_player_id !=\n              play_stats$player.esbId[index],\n          play_stats$player.esbId[index],\n          row$tackle_with_assist_2_player_id\n        )\n      row$tackle_with_assist_2_player_name <-\n        if_else(\n          is.na(row$tackle_with_assist_2_player_name) &\n            row$tackle_with_assist_1_player_name !=\n              play_stats$player.displayName[index],\n          play_stats$player.displayName[index],\n          row$tackle_with_assist_2_player_name\n        )\n      row$tackle_with_assist_2_team <-\n        if_else(\n          is.na(row$tackle_with_assist_2_team) &\n            row$tackle_with_assist_1_player_name !=\n              play_stats$player.displayName[index],\n          # row$tackle_with_assist_1_team != play_stats$teamAbbr[index],\n          play_stats$teamAbbr[index],\n          row$tackle_with_assist_2_team\n        )\n    } else if (stat_id == 82) {\n      # =81\n      row$assist_tackle <- 1\n      row$assist_tackle_1_player_id <-\n        if_else(\n          is.na(row$assist_tackle_1_player_id),\n          play_stats$player.esbId[index],\n          row$assist_tackle_1_player_id\n        )\n      row$assist_tackle_1_player_name <-\n        if_else(\n          is.na(row$assist_tackle_1_player_name),\n          play_stats$player.displayName[index],\n          row$assist_tackle_1_player_name\n        )\n      row$assist_tackle_1_team <-\n        if_else(\n          is.na(row$assist_tackle_1_team),\n          play_stats$teamAbbr[index],\n          row$assist_tackle_1_team\n        )\n      row$assist_tackle_2_player_id <-\n       
 if_else(\n          is.na(row$assist_tackle_2_player_id) &\n            row$assist_tackle_1_player_id != play_stats$player.esbId[index],\n          play_stats$player.esbId[index],\n          row$assist_tackle_2_player_id\n        )\n      row$assist_tackle_2_player_name <-\n        if_else(\n          is.na(row$assist_tackle_2_player_name) &\n            row$assist_tackle_1_player_name !=\n              play_stats$player.displayName[index],\n          play_stats$player.displayName[index],\n          row$assist_tackle_2_player_name\n        )\n      row$assist_tackle_2_team <-\n        if_else(\n          is.na(row$assist_tackle_2_team) &\n            row$assist_tackle_1_player_name !=\n              play_stats$player.displayName[index],\n          # row$assist_tackle_1_team != play_stats$teamAbbr[index],\n          play_stats$teamAbbr[index],\n          row$assist_tackle_2_team\n        )\n      row$assist_tackle_3_player_id <-\n        if_else(\n          (is.na(row$assist_tackle_3_player_id) &\n            row$assist_tackle_1_player_id != play_stats$player.esbId[index] &\n            row$assist_tackle_2_player_id != play_stats$player.esbId[index]),\n          play_stats$player.esbId[index],\n          row$assist_tackle_3_player_id\n        )\n      row$assist_tackle_3_player_name <-\n        if_else(\n          (is.na(row$assist_tackle_3_player_name) &\n            row$assist_tackle_1_player_name !=\n              play_stats$player.displayName[index] &\n            row$assist_tackle_2_player_name !=\n              play_stats$player.displayName[index]),\n          play_stats$player.displayName[index],\n          row$assist_tackle_3_player_name\n        )\n      row$assist_tackle_3_team <-\n        if_else(\n          (is.na(row$assist_tackle_3_team) &\n            row$assist_tackle_1_player_name !=\n              play_stats$player.displayName[index] &\n            row$assist_tackle_2_player_name !=\n              play_stats$player.displayName[index]),\n          
# row$assist_tackle_1_team != play_stats$teamAbbr[index] &\n          #  row$assist_tackle_2_team != play_stats$teamAbbr[index]),\n          play_stats$teamAbbr[index],\n          row$assist_tackle_3_team\n        )\n      row$assist_tackle_4_player_id <-\n        if_else(\n          (is.na(row$assist_tackle_4_player_id) &\n            row$assist_tackle_1_player_id != play_stats$player.esbId[index] &\n            row$assist_tackle_2_player_id != play_stats$player.esbId[index] &\n            row$assist_tackle_3_player_id != play_stats$player.esbId[index]),\n          play_stats$player.esbId[index],\n          row$assist_tackle_4_player_id\n        )\n      row$assist_tackle_4_player_name <-\n        if_else(\n          (is.na(row$assist_tackle_4_player_name) &\n            row$assist_tackle_1_player_name !=\n              play_stats$player.displayName[index] &\n            row$assist_tackle_2_player_name !=\n              play_stats$player.displayName[index] &\n            row$assist_tackle_3_player_name !=\n              play_stats$player.displayName[index]),\n          play_stats$player.displayName[index],\n          row$assist_tackle_4_player_name\n        )\n      row$assist_tackle_4_team <-\n        if_else(\n          (is.na(row$assist_tackle_4_team) &\n            row$assist_tackle_1_player_name !=\n              play_stats$player.displayName[index] &\n            row$assist_tackle_2_player_name !=\n              play_stats$player.displayName[index] &\n            row$assist_tackle_3_player_name !=\n              play_stats$player.displayName[index]),\n          # row$assist_tackle_1_team != play_stats$teamAbbr[index] &\n          #  row$assist_tackle_2_team != play_stats$teamAbbr[index] &\n          #  row$assist_tackle_3_team != play_stats$teamAbbr[index]),\n          play_stats$teamAbbr[index],\n          row$assist_tackle_4_team\n        )\n    } else if (stat_id == 83) {\n      row$sack <- 1\n      row$sack_player_id <- play_stats$player.esbId[index]\n   
   row$sack_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 84) {\n      row$sack <- 1\n      row$assist_tackle <- 1\n      row$half_sack_1_player_id <-\n        if_else(\n          is.na(row$half_sack_1_player_id),\n          play_stats$player.esbId[index],\n          row$half_sack_1_player_id\n        )\n      row$half_sack_1_player_name <-\n        if_else(\n          is.na(row$half_sack_1_player_name),\n          play_stats$player.displayName[index],\n          row$half_sack_1_player_name\n        )\n      row$half_sack_2_player_id <-\n        if_else(\n          is.na(row$half_sack_2_player_id) &\n            row$half_sack_1_player_id != play_stats$player.esbId[index],\n          play_stats$player.esbId[index],\n          row$half_sack_2_player_id\n        )\n      row$half_sack_2_player_name <-\n        if_else(\n          is.na(row$half_sack_2_player_name) &\n            row$half_sack_1_player_name != play_stats$player.displayName[index],\n          play_stats$player.displayName[index],\n          row$half_sack_2_player_name\n        )\n    } else if (stat_id == 85) {\n      row$pass_defense_1_player_id <-\n        if_else(\n          is.na(row$pass_defense_1_player_id),\n          play_stats$player.esbId[index],\n          row$pass_defense_1_player_id\n        )\n      row$pass_defense_1_player_name <-\n        if_else(\n          is.na(row$pass_defense_1_player_name),\n          play_stats$player.displayName[index],\n          row$pass_defense_1_player_name\n        )\n      row$pass_defense_2_player_id <-\n        if_else(\n          is.na(row$pass_defense_2_player_id) &\n            row$pass_defense_1_player_id != play_stats$player.esbId[index],\n          play_stats$player.esbId[index],\n          row$pass_defense_2_player_id\n        )\n      row$pass_defense_2_player_name <-\n        if_else(\n          is.na(row$pass_defense_2_player_name) &\n            row$pass_defense_1_player_name !=\n              
play_stats$player.displayName[index],\n          play_stats$player.displayName[index],\n          row$pass_defense_2_player_name\n        )\n    } else if (stat_id == 86) {\n      row$punt_attempt <- 1\n      row$blocked_player_id <- play_stats$player.esbId[index]\n      row$blocked_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 87) {\n      row$blocked_player_id <- play_stats$player.esbId[index]\n      row$blocked_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 88) {\n      row$field_goal_attempt <- 1\n      row$blocked_player_id <- play_stats$player.esbId[index]\n      row$blocked_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 89) {\n      row$safety <- 1\n      row$safety_player_id <- play_stats$player.esbId[index]\n      row$safety_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 91) {\n      row$fumble <- 1\n      row$forced_fumble_player_1_player_id <-\n        if_else(\n          is.na(row$forced_fumble_player_1_player_id),\n          play_stats$player.esbId[index],\n          row$forced_fumble_player_1_player_id\n        )\n      row$forced_fumble_player_1_player_name <-\n        if_else(\n          is.na(row$forced_fumble_player_1_player_name),\n          play_stats$player.displayName[index],\n          row$forced_fumble_player_1_player_name\n        )\n      row$forced_fumble_player_1_team <-\n        if_else(\n          is.na(row$forced_fumble_player_1_team),\n          play_stats$teamAbbr[index],\n          row$forced_fumble_player_1_team\n        )\n      row$forced_fumble_player_2_player_id <-\n        if_else(\n          is.na(row$forced_fumble_player_2_player_id) &\n            row$forced_fumble_player_1_player_id !=\n              play_stats$player.esbId[index],\n          play_stats$player.esbId[index],\n          row$forced_fumble_player_2_player_id\n        )\n      row$forced_fumble_player_2_player_name <-\n        
if_else(\n          is.na(row$forced_fumble_player_2_player_name) &\n            row$forced_fumble_player_1_player_name !=\n              play_stats$player.displayName[index],\n          play_stats$player.displayName[index],\n          row$forced_fumble_player_2_player_name\n        )\n      row$forced_fumble_player_2_team <-\n        if_else(\n          is.na(row$forced_fumble_player_2_team) &\n            row$forced_fumble_player_1_player_name !=\n              play_stats$player.displayName[index],\n          # row$forced_fumble_player_1_team != play_stats$teamAbbr[index],\n          play_stats$teamAbbr[index],\n          row$forced_fumble_player_2_team\n        )\n    } else if (stat_id == 93) {\n      row$penalty <- 1\n      row$penalty_player_id <- play_stats$player.esbId[index]\n      row$penalty_player_name <- play_stats$player.displayName[index]\n      row$penalty_team <- play_stats$teamAbbr[index]\n      row$penalty_yards <- play_stats$yards[index]\n    } else if (stat_id == 95) {\n      row$tackled_for_loss <- 1\n    } else if (stat_id == 96) {\n      row$extra_point_safety <- 1\n      row$extra_point_attempt <- 1\n    } else if (stat_id == 99) {\n      row$two_point_rush_safety <- 1\n      row$rush_attempt <- 1\n      row$two_point_attempt <- 1\n      row$rusher_player_id <- play_stats$player.esbId[index]\n      row$rusher_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 100) {\n      row$two_point_pass_safety <- 1\n      row$pass_attempt <- 1\n      row$two_point_attempt <- 1\n      row$passer_player_id <- play_stats$player.esbId[index]\n      row$passer_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 102) {\n      row$kickoff_downed <- 1\n      row$kickoff_attempt <- 1\n    } else if (stat_id == 103) {\n      row$lateral_sack_player_id <- play_stats$player.esbId[index]\n      row$lateral_sack_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 104) {\n      
row$two_point_pass_reception_good <- 1\n      row$pass_attempt <- 1\n      row$two_point_attempt <- 1\n      row$receiver_player_id <- play_stats$player.esbId[index]\n      row$receiver_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 105) {\n      row$two_point_pass_reception_failed <- 1\n      row$pass_attempt <- 1\n      row$two_point_attempt <- 1\n      row$receiver_player_id <- play_stats$player.esbId[index]\n      row$receiver_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 106) {\n      row$fumble_lost <- 1\n      row$fumble <- 1\n      row$fumbled_1_player_id <-\n        if_else(\n          is.na(row$fumbled_1_player_id),\n          play_stats$player.esbId[index],\n          row$fumbled_1_player_id\n        )\n      row$fumbled_1_player_name <-\n        if_else(\n          is.na(row$fumbled_1_player_name),\n          play_stats$player.displayName[index],\n          row$fumbled_1_player_name\n        )\n      row$fumbled_1_team <-\n        if_else(\n          is.na(row$fumbled_1_team),\n          play_stats$teamAbbr[index],\n          row$fumbled_1_team\n        )\n      row$fumbled_2_player_id <-\n        if_else(\n          is.na(row$fumbled_2_player_id) &\n            row$fumbled_1_player_id != play_stats$player.esbId[index],\n          play_stats$player.esbId[index],\n          row$fumbled_2_player_id\n        )\n      row$fumbled_2_player_name <-\n        if_else(\n          is.na(row$fumbled_2_player_name) &\n            row$fumbled_1_player_name != play_stats$player.displayName[index],\n          play_stats$player.displayName[index],\n          row$fumbled_2_player_name\n        )\n      row$fumbled_2_team <-\n        if_else(\n          is.na(row$fumbled_2_team) &\n            row$fumbled_1_player_name != play_stats$player.displayName[index],\n          # row$fumbled_1_team != play_stats$teamAbbr[index],\n          play_stats$teamAbbr[index],\n          row$fumbled_2_team\n        )\n    } 
else if (stat_id == 107) {\n      row$own_kickoff_recovery <- 1\n      row$kickoff_attempt <- 1\n      row$own_kickoff_recovery_player_id <- play_stats$player.esbId[index]\n      row$own_kickoff_recovery_player_name <- play_stats$player.displayName[\n        index\n      ]\n    } else if (stat_id == 108) {\n      row$own_kickoff_recovery_td <- 1\n      row$touchdown <- 1\n      row$td_team <- play_stats$teamAbbr[index]\n      row$td_player_id <- play_stats$player.esbId[index]\n      row$td_player_name <- play_stats$player.displayName[index]\n      row$kickoff_attempt <- 1\n      row$own_kickoff_recovery_player_id <- play_stats$player.esbId[index]\n      row$own_kickoff_recovery_player_name <- play_stats$player.displayName[\n        index\n      ]\n    } else if (stat_id == 110) {\n      row$qb_hit <- 1\n      row$qb_hit_1_player_id <-\n        if_else(\n          is.na(row$qb_hit_1_player_id),\n          play_stats$player.esbId[index],\n          row$qb_hit_1_player_id\n        )\n      row$qb_hit_1_player_name <-\n        if_else(\n          is.na(row$qb_hit_1_player_name),\n          play_stats$player.displayName[index],\n          row$qb_hit_1_player_name\n        )\n      row$qb_hit_2_player_id <-\n        if_else(\n          is.na(row$qb_hit_2_player_id) &\n            row$qb_hit_1_player_id != play_stats$player.esbId[index],\n          play_stats$player.esbId[index],\n          row$qb_hit_2_player_id\n        )\n      row$qb_hit_2_player_name <-\n        if_else(\n          is.na(row$qb_hit_2_player_name) &\n            row$qb_hit_1_player_name != play_stats$player.displayName[index],\n          play_stats$player.displayName[index],\n          row$qb_hit_2_player_name\n        )\n    } else if (stat_id == 111) {\n      row$pass_attempt <- 1\n      row$complete_pass <- 1\n      row$passer_player_id <- play_stats$player.esbId[index]\n      row$passer_player_name <- play_stats$player.displayName[index]\n      row$air_yards <- play_stats$yards[index]\n    } else 
if (stat_id == 112) {\n      row$pass_attempt <- 1\n      row$passer_player_id <- play_stats$player.esbId[index]\n      row$passer_player_name <- play_stats$player.displayName[index]\n      row$air_yards <- play_stats$yards[index]\n    } else if (stat_id == 113) {\n      row$pass_attempt <- 1\n      row$complete_pass <- 1\n      if (is.na(row$receiver_player_id)) {\n        row$receiver_player_id <- play_stats$player.esbId[index]\n        row$receiver_player_name <- play_stats$player.displayName[index]\n      }\n      if (is.na(row$yards_after_catch)) {\n        row$yards_after_catch <- play_stats$yards[index]\n      }\n    } else if (stat_id == 115) {\n      row$pass_attempt <- 1\n      row$receiver_player_id <- play_stats$player.esbId[index]\n      row$receiver_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 120) {\n      row$tackle_for_loss_1_player_id <-\n        if_else(\n          is.na(row$tackle_for_loss_1_player_id),\n          play_stats$player.esbId[index],\n          row$tackle_for_loss_1_player_id\n        )\n      row$tackle_for_loss_1_player_name <-\n        if_else(\n          is.na(row$tackle_for_loss_1_player_name),\n          play_stats$player.displayName[index],\n          row$tackle_for_loss_1_player_name\n        )\n      row$tackle_for_loss_2_player_id <-\n        if_else(\n          is.na(row$tackle_for_loss_2_player_id) &\n            row$tackle_for_loss_1_player_id != play_stats$player.esbId[index],\n          play_stats$player.esbId[index],\n          row$tackle_for_loss_2_player_id\n        )\n      row$tackle_for_loss_2_player_name <-\n        if_else(\n          is.na(row$tackle_for_loss_2_player_name) &\n            row$tackle_for_loss_1_player_name !=\n              play_stats$player.displayName[index],\n          play_stats$player.displayName[index],\n          row$tackle_for_loss_2_player_name\n        )\n    } else if (stat_id == 301) {\n      row$extra_point_aborted <- 1\n      
row$extra_point_attempt <- 1\n    } else if (stat_id == 402) {\n      # tackle for loss player information is recorded in stat id 120\n      NULL\n    } else if (stat_id == 403) {\n      row$defensive_two_point_attempt <- 1\n    } else if (stat_id == 404) {\n      row$defensive_two_point_conv <- 1\n    } else if (stat_id == 405) {\n      row$defensive_extra_point_attempt <- 1\n    } else if (stat_id == 406) {\n      row$defensive_extra_point_conv <- 1\n    } else if (stat_id == 410) {\n      row$kickoff_attempt <- 1\n      row$kicker_player_id <- play_stats$player.esbId[index]\n      row$kicker_player_name <- play_stats$player.displayName[index]\n    } else if (stat_id == 420) {\n      row$two_point_return <- 1\n      row$two_point_attempt <- 1\n    } else {\n      NULL\n    }\n  }\n  return(row)\n}\n"
  },
  {
    "path": "R/helper_variable_selector.R",
    "content": "################################################################################\n# Author: Ben Baldwin, Sebastian Carl\n# Purpose: Build the final output of the pbp functions\n# Code Style Guide: styler::tidyverse_style()\n################################################################################\n\nselect_variables <- function(pbp) {\n  suppressWarnings(\n    out <-\n      pbp |>\n      dplyr::select(\n        dplyr::any_of(\n          c(nflscrapr_cols, new_cols, api_cols)\n        )\n      )\n  )\n\n  return(out)\n}\n\n# columns that are not in gamecenter that we created\nnew_cols <- c(\n  \"season\",\n  \"cp\",\n  \"cpoe\",\n  \"series\",\n  \"series_success\",\n  \"series_result\"\n)\n\n# original nflscrapr columns\nnflscrapr_cols <-\n  c(\n    \"play_id\",\n    \"game_id\",\n    \"old_game_id\",\n    \"home_team\",\n    \"away_team\",\n    #added these to new gc scraper\n    \"season_type\",\n    \"week\",\n    \"posteam\",\n    \"posteam_type\",\n    \"defteam\",\n    \"side_of_field\",\n    \"yardline_100\",\n    \"game_date\",\n    \"quarter_seconds_remaining\",\n    \"half_seconds_remaining\",\n    \"game_seconds_remaining\",\n    \"game_half\",\n    \"quarter_end\",\n    \"drive\",\n    \"sp\",\n    \"qtr\",\n    \"down\",\n    \"goal_to_go\",\n    \"time\",\n    \"yrdln\",\n    \"ydstogo\",\n    \"ydsnet\",\n    \"desc\",\n    \"play_type\",\n    \"yards_gained\",\n    \"shotgun\",\n    \"no_huddle\",\n    \"qb_dropback\",\n    \"qb_kneel\",\n    \"qb_spike\",\n    \"qb_scramble\",\n    \"pass_length\",\n    \"pass_location\",\n    \"air_yards\",\n    \"yards_after_catch\",\n    \"run_location\",\n    \"run_gap\",\n    \"field_goal_result\",\n    \"kick_distance\",\n    \"extra_point_result\",\n    \"two_point_conv_result\",\n    \"home_timeouts_remaining\",\n    \"away_timeouts_remaining\",\n    \"timeout\",\n    \"timeout_team\",\n    \"td_team\",\n    \"td_player_name\",\n    \"td_player_id\",\n    
\"posteam_timeouts_remaining\",\n    \"defteam_timeouts_remaining\",\n    \"total_home_score\",\n    \"total_away_score\",\n    \"posteam_score\",\n    \"defteam_score\",\n    \"score_differential\",\n    \"posteam_score_post\",\n    \"defteam_score_post\",\n    \"score_differential_post\",\n    \"no_score_prob\",\n    \"opp_fg_prob\",\n    \"opp_safety_prob\",\n    \"opp_td_prob\",\n    \"fg_prob\",\n    \"safety_prob\",\n    \"td_prob\",\n    \"extra_point_prob\",\n    \"two_point_conversion_prob\",\n    \"ep\",\n    \"epa\",\n    \"total_home_epa\",\n    \"total_away_epa\",\n    \"total_home_rush_epa\",\n    \"total_away_rush_epa\",\n    \"total_home_pass_epa\",\n    \"total_away_pass_epa\",\n    \"air_epa\",\n    \"yac_epa\",\n    \"comp_air_epa\",\n    \"comp_yac_epa\",\n    \"total_home_comp_air_epa\",\n    \"total_away_comp_air_epa\",\n    \"total_home_comp_yac_epa\",\n    \"total_away_comp_yac_epa\",\n    \"total_home_raw_air_epa\",\n    \"total_away_raw_air_epa\",\n    \"total_home_raw_yac_epa\",\n    \"total_away_raw_yac_epa\",\n    \"wp\",\n    \"def_wp\",\n    \"home_wp\",\n    \"away_wp\",\n    \"wpa\",\n    \"vegas_wpa\",\n    \"vegas_home_wpa\",\n    \"home_wp_post\",\n    \"away_wp_post\",\n    \"vegas_wp\",\n    \"vegas_home_wp\",\n    \"total_home_rush_wpa\",\n    \"total_away_rush_wpa\",\n    \"total_home_pass_wpa\",\n    \"total_away_pass_wpa\",\n    \"air_wpa\",\n    \"yac_wpa\",\n    \"comp_air_wpa\",\n    \"comp_yac_wpa\",\n    \"total_home_comp_air_wpa\",\n    \"total_away_comp_air_wpa\",\n    \"total_home_comp_yac_wpa\",\n    \"total_away_comp_yac_wpa\",\n    \"total_home_raw_air_wpa\",\n    \"total_away_raw_air_wpa\",\n    \"total_home_raw_yac_wpa\",\n    \"total_away_raw_yac_wpa\",\n    \"punt_blocked\",\n    \"first_down_rush\",\n    \"first_down_pass\",\n    \"first_down_penalty\",\n    \"third_down_converted\",\n    \"third_down_failed\",\n    \"fourth_down_converted\",\n    \"fourth_down_failed\",\n    \"incomplete_pass\",\n    
\"touchback\",\n    \"interception\",\n    \"punt_inside_twenty\",\n    \"punt_in_endzone\",\n    \"punt_out_of_bounds\",\n    \"punt_downed\",\n    \"punt_fair_catch\",\n    \"kickoff_inside_twenty\",\n    \"kickoff_in_endzone\",\n    \"kickoff_out_of_bounds\",\n    \"kickoff_downed\",\n    \"kickoff_fair_catch\",\n    \"fumble_forced\",\n    \"fumble_not_forced\",\n    \"fumble_out_of_bounds\",\n    \"solo_tackle\",\n    \"safety\",\n    \"penalty\",\n    \"tackled_for_loss\",\n    \"fumble_lost\",\n    \"own_kickoff_recovery\",\n    \"own_kickoff_recovery_td\",\n    \"qb_hit\",\n    \"rush_attempt\",\n    \"pass_attempt\",\n    \"sack\",\n    \"touchdown\",\n    \"pass_touchdown\",\n    \"rush_touchdown\",\n    \"return_touchdown\",\n    \"extra_point_attempt\",\n    \"two_point_attempt\",\n    \"field_goal_attempt\",\n    \"kickoff_attempt\",\n    \"punt_attempt\",\n    \"fumble\",\n    \"complete_pass\",\n    \"assist_tackle\",\n    \"lateral_reception\",\n    \"lateral_rush\",\n    \"lateral_return\",\n    \"lateral_recovery\",\n    \"passer_player_id\",\n    \"passer_player_name\",\n    \"passing_yards\",\n    \"receiver_player_id\",\n    \"receiver_player_name\",\n    \"receiving_yards\",\n    \"rusher_player_id\",\n    \"rusher_player_name\",\n    \"rushing_yards\",\n    \"lateral_receiver_player_id\",\n    \"lateral_receiver_player_name\",\n    \"lateral_receiving_yards\",\n    \"lateral_rusher_player_id\",\n    \"lateral_rusher_player_name\",\n    \"lateral_rushing_yards\",\n    \"lateral_sack_player_id\",\n    \"lateral_sack_player_name\",\n    \"interception_player_id\",\n    \"interception_player_name\",\n    \"lateral_interception_player_id\",\n    \"lateral_interception_player_name\",\n    \"punt_returner_player_id\",\n    \"punt_returner_player_name\",\n    \"lateral_punt_returner_player_id\",\n    \"lateral_punt_returner_player_name\",\n    \"kickoff_returner_player_name\",\n    \"kickoff_returner_player_id\",\n    
\"lateral_kickoff_returner_player_id\",\n    \"lateral_kickoff_returner_player_name\",\n    \"punter_player_id\",\n    \"punter_player_name\",\n    \"kicker_player_name\",\n    \"kicker_player_id\",\n    \"own_kickoff_recovery_player_id\",\n    \"own_kickoff_recovery_player_name\",\n    \"blocked_player_id\",\n    \"blocked_player_name\",\n    \"tackle_for_loss_1_player_id\",\n    \"tackle_for_loss_1_player_name\",\n    \"tackle_for_loss_2_player_id\",\n    \"tackle_for_loss_2_player_name\",\n    \"qb_hit_1_player_id\",\n    \"qb_hit_1_player_name\",\n    \"qb_hit_2_player_id\",\n    \"qb_hit_2_player_name\",\n    \"forced_fumble_player_1_team\",\n    \"forced_fumble_player_1_player_id\",\n    \"forced_fumble_player_1_player_name\",\n    \"forced_fumble_player_2_team\",\n    \"forced_fumble_player_2_player_id\",\n    \"forced_fumble_player_2_player_name\",\n    \"solo_tackle_1_team\",\n    \"solo_tackle_2_team\",\n    \"solo_tackle_1_player_id\",\n    \"solo_tackle_2_player_id\",\n    \"solo_tackle_1_player_name\",\n    \"solo_tackle_2_player_name\",\n    \"assist_tackle_1_player_id\",\n    \"assist_tackle_1_player_name\",\n    \"assist_tackle_1_team\",\n    \"assist_tackle_2_player_id\",\n    \"assist_tackle_2_player_name\",\n    \"assist_tackle_2_team\",\n    \"assist_tackle_3_player_id\",\n    \"assist_tackle_3_player_name\",\n    \"assist_tackle_3_team\",\n    \"assist_tackle_4_player_id\",\n    \"assist_tackle_4_player_name\",\n    \"assist_tackle_4_team\",\n    #new in nflfastR v4.0\n    \"tackle_with_assist\",\n    \"tackle_with_assist_1_player_id\",\n    \"tackle_with_assist_1_player_name\",\n    \"tackle_with_assist_1_team\",\n    \"tackle_with_assist_2_player_id\",\n    \"tackle_with_assist_2_player_name\",\n    \"tackle_with_assist_2_team\",\n    \"pass_defense_1_player_id\",\n    \"pass_defense_1_player_name\",\n    \"pass_defense_2_player_id\",\n    \"pass_defense_2_player_name\",\n    \"fumbled_1_team\",\n    \"fumbled_1_player_id\",\n    
\"fumbled_1_player_name\",\n    \"fumbled_2_player_id\",\n    \"fumbled_2_player_name\",\n    \"fumbled_2_team\",\n    \"fumble_recovery_1_team\",\n    \"fumble_recovery_1_yards\",\n    \"fumble_recovery_1_player_id\",\n    \"fumble_recovery_1_player_name\",\n    \"fumble_recovery_2_team\",\n    \"fumble_recovery_2_yards\",\n    \"fumble_recovery_2_player_id\",\n    \"fumble_recovery_2_player_name\",\n    #new in nflfastR v4.1\n    \"sack_player_id\",\n    \"sack_player_name\",\n    \"half_sack_1_player_id\",\n    \"half_sack_1_player_name\",\n    \"half_sack_2_player_id\",\n    \"half_sack_2_player_name\",\n    \"return_team\",\n    \"return_yards\",\n    \"penalty_team\",\n    \"penalty_player_id\",\n    \"penalty_player_name\",\n    \"penalty_yards\",\n    \"replay_or_challenge\",\n    \"replay_or_challenge_result\",\n    \"penalty_type\",\n    \"defensive_two_point_attempt\",\n    \"defensive_two_point_conv\",\n    \"defensive_extra_point_attempt\",\n    \"defensive_extra_point_conv\",\n    #new in nflfastR > v4.1\n    \"safety_player_name\",\n    \"safety_player_id\"\n  )\n\n\n# these are columns in the RS data that aren't in nflscrapR\nrs_cols <- c(\n  \"season_type\",\n  \"week\",\n  \"game_key\",\n  \"game_time_eastern\",\n  \"game_time_local\",\n  \"iso_time\",\n  \"game_type\",\n  \"site_id\",\n  \"site_city\",\n  \"site_fullname\",\n  \"site_state\",\n  \"roof_type\",\n  \"drive_start_time\",\n  \"drive_end_time\",\n  \"drive_start_yardline\",\n  \"drive_end_yardline\",\n  \"drive_how_started\",\n  \"drive_how_ended\",\n  \"drive_play_count\",\n  \"drive_yards_penalized\",\n  \"drive_time_of_possession\",\n  \"drive_inside20\",\n  \"drive_first_downs\",\n  \"drive_possession_team_abbr\",\n  \"scoring_team_abbr\",\n  \"scoring_type\",\n  \"alert_play_type\",\n  \"play_type_nfl\",\n  \"time_of_day\",\n  \"yards\",\n  \"end_yardline_side\",\n  \"end_yardline_number\"\n)\n\n\n# these are columns in the new API that aren't in nflscrapR\napi_cols <- c(\n  
\"order_sequence\",\n  \"start_time\",\n  \"time_of_day\",\n  \"stadium\",\n  \"weather\",\n  \"nfl_api_id\",\n  \"play_clock\",\n  \"play_deleted\",\n  \"play_type_nfl\",\n  \"special_teams_play\",\n  \"st_play_type\",\n  \"end_clock_time\",\n  \"end_yard_line\",\n\n  \"fixed_drive\",\n  \"fixed_drive_result\",\n  \"drive_real_start_time\",\n\n  \"drive_play_count\",\n  \"drive_time_of_possession\",\n  \"drive_first_downs\",\n  \"drive_inside20\",\n  \"drive_ended_with_score\",\n  \"drive_quarter_start\",\n  \"drive_quarter_end\",\n  \"drive_yards_penalized\",\n\n  \"drive_start_transition\",\n  \"drive_end_transition\",\n\n  \"drive_game_clock_start\",\n  \"drive_game_clock_end\",\n  \"drive_start_yard_line\",\n  \"drive_end_yard_line\",\n  \"drive_play_id_started\",\n  \"drive_play_id_ended\",\n  \"away_score\",\n  \"home_score\",\n  \"location\",\n  \"result\",\n  \"total\",\n  \"spread_line\",\n  \"total_line\",\n  \"div_game\",\n  \"roof\",\n  \"surface\",\n  \"temp\",\n  \"wind\",\n  \"home_coach\",\n  \"away_coach\",\n  \"stadium_id\",\n  \"game_stadium\"\n)\n"
  },
  {
    "path": "R/nflfastR-package.R",
    "content": "#' @details # Parallel Processing and Progress Updates in nflfastR\n#'\n#' ## Preface\n#'\n#' Prior to nflfastR v4.0, parallel processing could be activated with an\n#' argument `pp` in the relevant functions and progress updates were always\n#' shown. Both of these methods are bad practice and were therefore removed\n#' in nflfastR v4.0\n#'\n#' The next sections describe how to make nflfastR work in parallel processes\n#' and show progress updates if the user wants to.\n#'\n#' ## More Speed Using Parallel Processing\n#'\n#' Nearly all nflfastR functions support parallel processing\n#' using [furrr::future_map()] if it is enabled by a call to [future::plan()]\n#' prior to the function call.\n#' Please see the documentation of the functions for detailed information.\n#'\n#' As an example, the following code block will resolve all function calls in the\n#' current session using multiple sessions in the background and load play-by-play\n#' data for the 2018 through 2020 seasons or build them freshly for the 2018 and\n#' 2019 Super Bowls:\n#' ```\n#' future::plan(\"multisession\")\n#' load_pbp(2018:2020)\n#' build_nflfastR_pbp(c(\"2018_21_NE_LA\", \"2019_21_SF_KC\"))\n#' ```\n#' We recommend choosing a default parallel processing method and saving it\n#' as an environment variable in the R user profile to make sure all futures\n#' will be resolved with the chosen method by default.\n#' This can be done by following the below given steps.\n#'\n#' First, run the following line and the file `.Renviron` should be opened automatically.\n#' If you haven't saved any environment variables yet, this will be an empty file.\n#' ```\n#' usethis::edit_r_environ()\n#'```\n#' In the opened file `.Renviron` add the next line, then save the file and restart your R session.\n#' Please note that this example sets \"multisession\" as default. 
For most users\n#' this should be the appropriate plan but please make sure it truly is.\n#' ```\n#' R_FUTURE_PLAN=\"multisession\"\n#' ```\n#' After the session is freshly restarted please check if the above method worked\n#' by running the next line. If the output is `FALSE` you successfully set up a\n#' default non-sequential [future::plan()]. If the output is `TRUE` all functions\n#' will behave like they were called with [purrr::map()] and NOT in multisession.\n#' ```\n#' inherits(future::plan(), \"sequential\")\n#' ```\n#' For more information on possible plans please see\n#' [the future package Readme](https://github.com/futureverse/future/blob/develop/README.md).\n#'\n#' For more information on `.Renviron` please see\n#' [this book chapter](https://rstats.wtf/r-startup.html).\n#'\n#' ## Get Progress Updates while Functions are Running\n#'\n#' Most nflfastR functions are able to show progress updates\n#' using [progressr::progressor()] if they are turned on before the function is\n#' called. There are at least two basic ways to do this by either activating\n#' progress updates globally (for the current session) with\n#' ```\n#' progressr::handlers(global = TRUE)\n#' ```\n#' or by piping the function call into [progressr::with_progress()]:\n#' ```\n#' load_pbp(2018:2020) |>\n#'   progressr::with_progress()\n#' ```\n#'\n#' Just like in the previous section, it is possible to activate global\n#' progression handlers by default. This can be done by following the below given steps.\n#'\n#' First, run the following line and the file `.Rprofile` should be opened automatically.\n#' If you haven't saved any code yet, this will be an empty file.\n#' ```\n#' usethis::edit_r_profile()\n#'```\n#' In the opened file `.Rprofile` add the next line, then save the file and restart your R\n#' session. 
All code in this file will be executed when a new R session starts.\n#' The part `if (requireNamespace(\"progressr\", quietly = TRUE))` makes sure this will only run if the\n#' package progressr is installed to avoid crashing R sessions.\n#' ```\n#' if (requireNamespace(\"progressr\", quietly = TRUE)) progressr::handlers(global = TRUE)\n#' ```\n#'\n#' After the session is freshly restarted please check if the above method worked\n#' by running the next line. If the output is `TRUE` you successfully activated\n#' global progression handlers for all sessions.\n#' ```\n#' progressr::handlers(global = NA)\n#' ```\n#'\n#' For more information on how to work with progress handlers please see [progressr::progressr].\n#'\n#' For more information on `.Rprofile` please see\n#' [this book chapter](https://rstats.wtf/r-startup.html).\n#'\n\"_PACKAGE\"\n\n# The following block is used by usethis to automatically manage\n# roxygen namespace tags. Modify with care!\n## usethis namespace: start\n#' @import dplyr\n#' @import fastrmodels\n#' @importFrom data.table %between% %chin%\n#' @importFrom rlang .data := .env %||%\n# We have to import something from xgboost because it is listed as dependency to\n# be able to apply models.\n#' @importFrom xgboost getinfo\n## usethis namespace: end\nNULL\n\n\n# Re-Exports --------------------------------------------------------------\n\n#' @importFrom nflreadr load_pbp\n#' @export\nnflreadr::load_pbp\n\n#' @importFrom nflreadr load_player_stats\n#' @export\nnflreadr::load_player_stats\n\n#' @importFrom nflreadr load_team_stats\n#' @export\nnflreadr::load_team_stats\n\n#' @importFrom nflreadr load_schedules\n#' @export\nnflreadr::load_schedules\n\n#' @importFrom nflreadr load_rosters\n#' @export\nnflreadr::load_rosters\n\n#' @importFrom nflreadr nflverse_sitrep\n#' @export\nnflreadr::nflverse_sitrep\n\n#' @importFrom nflreadr most_recent_season\n#' @export\nnflreadr::most_recent_season\n"
  },
  {
    "path": "R/report.R",
    "content": "#' Get a Situation Report on System, nflverse Package Versions and Dependencies\n#'\n#' @description\n#'\n#' `r lifecycle::badge(\"deprecated\")`\n#'\n#' This function was deprecated. Please use [`nflreadr::nflverse_sitrep`].\n#'\n#' This function gives a quick overview of the versions of R and\n#'   the operating system as well as the versions of nflverse packages, options,\n#'   and their dependencies. It's primarily designed to help you get a quick\n#'   idea of what's going on when you're helping someone else debug a problem.\n#' @details See [`nflreadr::nflverse_sitrep`] for details.\n#' @inheritDotParams nflreadr::nflverse_sitrep\n#' @inherit nflreadr::nflverse_sitrep\n#' @keywords internal\n#' @examples\n#' \\donttest{\n#' \\dontshow{\n#' # set CRAN mirror to avoid failing checks in weird scenarios\n#' old_ops <- options(repos = c(\"CRAN\" = \"https://cran.rstudio.com/\"))\n#' }\n#'\n#' # report(recursive = FALSE)\n#' nflverse_sitrep(pkg = \"nflreadr\", recursive = TRUE)\n#'\n#' \\dontshow{\n#' # restore old options\n#' options(old_ops)\n#' }\n#' }\n#' @export\nreport <- function(...) {\n  lifecycle::deprecate_warn(\n    \"5.2.0\",\n    \"report()\",\n    \"nflreadr::nflverse_sitrep()\"\n  )\n  nflreadr::nflverse_sitrep(...)\n}\n"
  },
  {
    "path": "R/save_raw_pbp.R",
    "content": "#' Download Raw PBP Data to Local Filesystem\n#'\n#' The functions [build_nflfastR_pbp()] and [fast_scraper()] support loading\n#' raw pbp data from local file systems instead of Github servers.\n#' This function is intended to help setting this up. It loads raw pbp data\n#' and saves it in the given directory split by season in subdirectories.\n#'\n#' @param game_ids A vector of nflverse game IDs.\n#' @param dir Path to local directory (defaults to option \"nflfastR.raw_directory\").\n#'   nflfastR will download the raw game files split by season into one sub\n#'   directory per season.\n#'\n#' @returns The function returns a data frame with one row for each downloaded file and\n#' the following columns:\n#'  - `success` if the HTTP request was successfully performed, regardless of the\n#'  response status code. This is `FALSE` in case of a network error, or in case\n#'  you tried to resume from a server that did not support this. A value of `NA`\n#'  means the download was interrupted while in progress.\n#'  - `status_code` the HTTP status code from the request. A successful download is\n#'  usually `200` for full requests or `206` for resumed requests. 
Anything else\n#'  could indicate that the downloaded file contains an error page instead of the\n#'  requested content.\n#'  - `resumefrom` the file size before the request, in case a download was resumed.\n#'  - `url` final url (after redirects) of the request.\n#'  - `destfile` downloaded file on disk.\n#'  - `error` if `success == FALSE` this column contains an error message.\n#'  - `type` the `Content-Type` response header value.\n#'  - `modified` the `Last-Modified` response header value.\n#'  - `time` total elapsed download time for this file in seconds.\n#'  - `headers` vector with http response headers for the request.\n#' @export\n#'\n#' @seealso [build_nflfastR_pbp()], [missing_raw_pbp()]\n#'\n#' @examples\n#' \\donttest{\n#' # CREATE LOCAL TEMP DIRECTORY\n#' local_dir <- tempdir()\n#'\n#' # LOAD AND SAVE A GAME TO TEMP DIRECTORY\n#' save_raw_pbp(\"2021_20_BUF_KC\", dir = local_dir)\n#'\n#' # REMOVE THE DIRECTORY\n#' unlink(file.path(local_dir, 2021))\n#' }\nsave_raw_pbp <- function(\n  game_ids,\n  dir = getOption(\"nflfastR.raw_directory\", default = NULL)\n) {\n  verify_game_ids(game_ids = game_ids)\n  if (is.null(dir)) {\n    cli::cli_abort(\n      \"Invalid argument {.arg dir}. Do you need to set \\\\\n                   {.code options(nflfastR.raw_directory)}?\"\n    )\n  } else if (!dir.exists(dir)) {\n    cli::cli_abort(\n      \"You've asked to save raw pbp to {.path {dir}} which \\\\\n                   doesn't exist. 
Please create it.\"\n    )\n  }\n  seasons <- substr(game_ids, 1, 4)\n  season_folders <- file.path(dir, unique(seasons)) |> sort()\n  missing_season_folders <- season_folders[!dir.exists(season_folders)]\n  created_folders <- vapply(\n    missing_season_folders,\n    dir.create,\n    FUN.VALUE = logical(1L)\n  )\n  to_load <- raw_pbp_urls(game_ids)\n  save_to <- file.path(\n    dir,\n    seasons,\n    paste0(game_ids, \".rds\")\n  )\n  dl <- curl::multi_download(to_load, save_to)\n  failed <- dl$status_code != 200\n  if (any(failed)) {\n    cli::cli_alert_danger(\n      \"Failed to download: {.var {game_ids[failed]}}\"\n    )\n    file.remove(save_to[failed])\n  }\n  dl\n}\n\n#' Compute Missing Raw PBP Data on Local Filesystem\n#'\n#' Uses [nflreadr::load_schedules()] to load game IDs of finished games and\n#' compares these IDs to all files saved under `dir`.\n#' This function is intended to serve as input for [save_raw_pbp()].\n#'\n#' @inheritParams save_raw_pbp\n#' @inheritParams nflreadr::load_schedules\n#' @param verbose If `TRUE`, will print number of missing game files as well as\n#'   oldest and most recent missing ID to console.\n#'\n#' @return A character vector of missing game IDs. If no files are missing,\n#'  returns `NULL` invisibly.\n#' @export\n#'\n#' @seealso [save_raw_pbp()]\n#'\n#' @examples\n#' \\donttest{\n#' try(\n#' missing <- missing_raw_pbp(tempdir())\n#' )\n#' }\nmissing_raw_pbp <- function(\n  dir = getOption(\"nflfastR.raw_directory\", default = NULL),\n  seasons = TRUE,\n  verbose = TRUE\n) {\n  if (is.null(dir)) {\n    cli::cli_abort(\n      \"Invalid argument {.arg dir}. Do you need to set \\\\\n                   {.code options(nflfastR.raw_directory)}?\"\n    )\n  } else if (!dir.exists(dir)) {\n    cli::cli_abort(\n      \"You've asked to check raw pbp in {.path {dir}} which \\\\\n                   doesn't exist. 
Please create it.\"\n    )\n  }\n  local_games <- sapply(list.files(dir, full.names = TRUE), list.files) |>\n    unlist(use.names = FALSE) |>\n    tools::file_path_sans_ext()\n\n  finished_games <- nflreadr::load_schedules(seasons = seasons) |>\n    dplyr::filter(!is.na(.data$result)) |>\n    dplyr::pull(.data$game_id)\n\n  local_missing_games <- finished_games[!finished_games %in% local_games]\n\n  if (length(local_missing_games) == 0) {\n    cli::cli_alert_success(\"No missing games!\")\n    return(invisible(NULL))\n  }\n\n  if (isTRUE(verbose)) {\n    cli::cli_alert_info(\n      \"You are missing {length(local_missing_games)} game file{?s}. \\\\\n       The oldest missing game is {.val {local_missing_games[[1]]}}. \\\\\n       The most recent missing game is \\\\\n       {.val {local_missing_games[length(local_missing_games)]}}.\"\n    )\n  }\n\n  local_missing_games\n}\n\n\nverify_game_ids <- function(game_ids) {\n  # game_ids <- c(\n  #   \"2021_02_LAC_KC\",\n  #   \"Hello World\",\n  #   \"2028_01_LAC_JAX\",\n  #   \"2022_27_LAC_BUF\",\n  #   \"2021_02_LAC_KAC\"\n  # )\n  season_check <- substr(game_ids, 1, 4) %in%\n    seq.int(1999, as.integer(format(Sys.Date(), \"%Y\")) + 1, 1)\n  week_check <- as.integer(substr(game_ids, 6, 7)) %in% seq_len(22)\n  team_name_check <-\n    vapply(\n      stringr::str_extract_all(game_ids, \"(?<=_)[:upper:]{2,3}\"),\n      function(t) all(t %in% nflfastR::teams_colors_logos$team_abbr),\n      FUN.VALUE = logical(1L)\n    )\n  combined_check <- season_check & week_check & team_name_check\n\n  if (any(combined_check == FALSE)) {\n    cli::cli_abort(\n      \"The game IDs {.val {game_ids[!combined_check]}} seem to be invalid!\"\n    )\n  }\n\n  invisible(NULL)\n}\n"
  },
  {
    "path": "R/top-level_scraper.R",
    "content": "################################################################################\n# Author: Sebastian Carl\n# Purpose: Top-Level functions which will be made available through the package\n# Code Style Guide: styler::tidyverse_style()\n################################################################################\n\n# pbp ---------------------------------------------------------------------\n\n#' Get NFL Play by Play Data\n#'\n#' @description Load and parse NFL play-by-play data and add all of the original\n#'   nflfastR variables. As nflfastR now provides multiple functions which add\n#'   information to the output of this function, it is recommended to use\n#'   \\code{\\link{build_nflfastR_pbp}} instead.\n#'\n#' @param game_ids Vector of character ids or a data frame including the variable\n#' `game_id` (see details for further information).\n#' @param dir Path to local directory (defaults to option \"nflfastR.raw_directory\")\n#'   where nflfastR searches for raw game play-by-play data.\n#'   See [save_raw_pbp()] for additional information.\n#' @param ... 
Additional arguments passed to the scraping functions (for internal use)\n#' @param in_builder If \\code{TRUE}, the final message will be suppressed (for usage inside of \\code{\\link{build_nflfastR_pbp}}).\n#' @details To load valid game_ids please use the package function\n#' \\code{\\link{fast_scraper_schedules}} (the function can directly handle the\n#' output of that function)\n#' @seealso For information on parallel processing and progress updates please\n#' see [nflfastR].\n#' @seealso [build_nflfastR_pbp()], [save_raw_pbp()]\n#' @return Data frame where each individual row represents a single play for\n#' all passed game_ids containing the following\n#' detailed information (description partly extracted from nflscrapR):\n#' \\describe{\n#' \\item{play_id}{Numeric play id that when used with game_id and drive provides the unique identifier for a single play.}\n#' \\item{game_id}{Ten digit identifier for NFL game.}\n#' \\item{old_game_id}{Legacy NFL game ID.}\n#' \\item{home_team}{String abbreviation for the home team.}\n#' \\item{away_team}{String abbreviation for the away team.}\n#' \\item{season_type}{'REG' or 'POST' indicating if the game belongs to regular or post season.}\n#' \\item{week}{Season week.}\n#' \\item{posteam}{String abbreviation for the team with possession.}\n#' \\item{posteam_type}{String indicating whether the posteam team is home or away.}\n#' \\item{defteam}{String abbreviation for the team on defense.}\n#' \\item{side_of_field}{String abbreviation for which team's side of the field the team with possession is currently on.}\n#' \\item{yardline_100}{Numeric distance in the number of yards from the opponent's endzone for the posteam.}\n#' \\item{game_date}{Date of the game.}\n#' \\item{quarter_seconds_remaining}{Numeric seconds remaining in the quarter.}\n#' \\item{half_seconds_remaining}{Numeric seconds remaining in the half.}\n#' \\item{game_seconds_remaining}{Numeric seconds remaining in the game.}\n#' \\item{game_half}{String 
indicating which half the play is in, either Half1, Half2, or Overtime.}\n#' \\item{quarter_end}{Binary indicator for whether or not the row of the data is marking the end of a quarter.}\n#' \\item{drive}{Numeric drive number in the game.}\n#' \\item{sp}{Binary indicator for whether or not a score occurred on the play.}\n#' \\item{qtr}{Quarter of the game (5 is overtime).}\n#' \\item{down}{The down for the given play.}\n#' \\item{goal_to_go}{Binary indicator for whether or not the posteam is in a goal down situation.}\n#' \\item{time}{Time at start of play provided in string format as minutes:seconds remaining in the quarter.}\n#' \\item{yrdln}{String indicating the current field position for a given play.}\n#' \\item{ydstogo}{Numeric yards in distance from either the first down marker or the endzone in goal down situations.}\n#' \\item{ydsnet}{Numeric value for total yards gained on the given drive.}\n#' \\item{desc}{Detailed string description for the given play.}\n#' \\item{play_type}{String indicating the type of play: pass (includes sacks), run (includes scrambles), punt, field_goal, kickoff, extra_point, qb_kneel, qb_spike, no_play (timeouts and penalties), and missing for rows indicating end of play.}\n#' \\item{yards_gained}{Numeric yards gained (or lost) by the possessing team, excluding yards gained via fumble recoveries and laterals.}\n#' \\item{shotgun}{Binary indicator for whether or not the play was in shotgun formation.}\n#' \\item{no_huddle}{Binary indicator for whether or not the play was in no_huddle formation.}\n#' \\item{qb_dropback}{Binary indicator for whether or not the QB dropped back on the play (pass attempt, sack, or scrambled).}\n#' \\item{qb_kneel}{Binary indicator for whether or not the QB took a knee.}\n#' \\item{qb_spike}{Binary indicator for whether or not the QB spiked the ball.}\n#' \\item{qb_scramble}{Binary indicator for whether or not the QB scrambled.}\n#' \\item{pass_length}{String indicator for pass length: short or 
deep.}\n#' \\item{pass_location}{String indicator for pass location: left, middle, or right.}\n#' \\item{air_yards}{Numeric value for distance in yards perpendicular to the line of scrimmage at where the targeted receiver either caught or didn't catch the ball.}\n#' \\item{yards_after_catch}{Numeric value for distance in yards perpendicular to the yard line where the receiver made the reception to where the play ended.}\n#' \\item{run_location}{String indicator for location of run: left, middle, or right.}\n#' \\item{run_gap}{String indicator for line gap of run: end, guard, or tackle}\n#' \\item{field_goal_result}{String indicator for result of field goal attempt: made, missed, or blocked.}\n#' \\item{kick_distance}{Numeric distance in yards for kickoffs, field goals, and punts.}\n#' \\item{extra_point_result}{String indicator for the result of the extra point attempt: good, failed, blocked, safety (touchback in defensive endzone is 1 point apparently), or aborted.}\n#' \\item{two_point_conv_result}{String indicator for result of two point conversion attempt: success, failure, safety (touchback in defensive endzone is 1 point apparently), or return.}\n#' \\item{home_timeouts_remaining}{Numeric timeouts remaining in the half for the home team.}\n#' \\item{away_timeouts_remaining}{Numeric timeouts remaining in the half for the away team.}\n#' \\item{timeout}{Binary indicator for whether or not a timeout was called by either team.}\n#' \\item{timeout_team}{String abbreviation for which team called the timeout.}\n#' \\item{td_team}{String abbreviation for which team scored the touchdown.}\n#' \\item{td_player_name}{String name of the player who scored a touchdown.}\n#' \\item{td_player_id}{Unique identifier of the player who scored a touchdown.}\n#' \\item{posteam_timeouts_remaining}{Number of timeouts remaining for the possession team.}\n#' \\item{defteam_timeouts_remaining}{Number of timeouts remaining for the team on defense.}\n#' \\item{total_home_score}{Score for 
the home team at the end of the play.}\n#' \\item{total_away_score}{Score for the away team at the end of the play.}\n#' \\item{posteam_score}{Score the posteam at the start of the play.}\n#' \\item{defteam_score}{Score the defteam at the start of the play.}\n#' \\item{score_differential}{Score differential between the posteam and defteam at the start of the play.}\n#' \\item{posteam_score_post}{Score for the posteam at the end of the play.}\n#' \\item{defteam_score_post}{Score for the defteam at the end of the play.}\n#' \\item{score_differential_post}{Score differential between the posteam and defteam at the end of the play.}\n#' \\item{no_score_prob}{Predicted probability of no score occurring for the rest of the half based on the expected points model.}\n#' \\item{opp_fg_prob}{Predicted probability of the defteam scoring a FG next.}\n#' \\item{opp_safety_prob}{Predicted probability of the defteam scoring a safety next.}\n#' \\item{opp_td_prob}{Predicted probability of the defteam scoring a TD next.}\n#' \\item{fg_prob}{Predicted probability of the posteam scoring a FG next.}\n#' \\item{safety_prob}{Predicted probability of the posteam scoring a safety next.}\n#' \\item{td_prob}{Predicted probability of the posteam scoring a TD next.}\n#' \\item{extra_point_prob}{Predicted probability of the posteam scoring an extra point.}\n#' \\item{two_point_conversion_prob}{Predicted probability of the posteam scoring the two point conversion.}\n#' \\item{ep}{Using the scoring event probabilities, the estimated expected points with respect to the possession team for the given play.}\n#' \\item{epa}{Expected points added (EPA) by the posteam for the given play.}\n#' \\item{total_home_epa}{Cumulative total EPA for the home team in the game so far.}\n#' \\item{total_away_epa}{Cumulative total EPA for the away team in the game so far.}\n#' \\item{total_home_rush_epa}{Cumulative total rushing EPA for the home team in the game so far.}\n#' \\item{total_away_rush_epa}{Cumulative 
total rushing EPA for the away team in the game so far.}\n#' \\item{total_home_pass_epa}{Cumulative total passing EPA for the home team in the game so far.}\n#' \\item{total_away_pass_epa}{Cumulative total passing EPA for the away team in the game so far.}\n#' \\item{air_epa}{EPA from the air yards alone. For completions this represents the actual value provided through the air. For incompletions this represents the hypothetical value that could've been added through the air if the pass was completed.}\n#' \\item{yac_epa}{EPA from the yards after catch alone. For completions this represents the actual value provided after the catch. For incompletions this represents the difference between the hypothetical air_epa and the play's raw observed EPA (how much the incomplete pass cost the posteam).}\n#' \\item{comp_air_epa}{EPA from the air yards alone only for completions.}\n#' \\item{comp_yac_epa}{EPA from the yards after catch alone only for completions.}\n#' \\item{total_home_comp_air_epa}{Cumulative total completions air EPA for the home team in the game so far.}\n#' \\item{total_away_comp_air_epa}{Cumulative total completions air EPA for the away team in the game so far.}\n#' \\item{total_home_comp_yac_epa}{Cumulative total completions yac EPA for the home team in the game so far.}\n#' \\item{total_away_comp_yac_epa}{Cumulative total completions yac EPA for the away team in the game so far.}\n#' \\item{total_home_raw_air_epa}{Cumulative total raw air EPA for the home team in the game so far.}\n#' \\item{total_away_raw_air_epa}{Cumulative total raw air EPA for the away team in the game so far.}\n#' \\item{total_home_raw_yac_epa}{Cumulative total raw yac EPA for the home team in the game so far.}\n#' \\item{total_away_raw_yac_epa}{Cumulative total raw yac EPA for the away team in the game so far.}\n#' \\item{wp}{Estimated win probability for the posteam given the current situation at the start of the given play.}\n#' \\item{def_wp}{Estimated win probability for the 
defteam.}\n#' \\item{home_wp}{Estimated win probability for the home team.}\n#' \\item{away_wp}{Estimated win probability for the away team.}\n#' \\item{wpa}{Win probability added (WPA) for the posteam.}\n#' \\item{vegas_wpa}{Win probability added (WPA) for the posteam: spread_adjusted model.}\n#' \\item{vegas_home_wpa}{Win probability added (WPA) for the home team: spread_adjusted model.}\n#' \\item{home_wp_post}{Estimated win probability for the home team at the end of the play.}\n#' \\item{away_wp_post}{Estimated win probability for the away team at the end of the play.}\n#' \\item{vegas_wp}{Estimated win probability for the posteam given the current situation at the start of the given play, incorporating pre-game Vegas line.}\n#' \\item{vegas_home_wp}{Estimated win probability for the home team incorporating pre-game Vegas line.}\n#' \\item{total_home_rush_wpa}{Cumulative total rushing WPA for the home team in the game so far.}\n#' \\item{total_away_rush_wpa}{Cumulative total rushing WPA for the away team in the game so far.}\n#' \\item{total_home_pass_wpa}{Cumulative total passing WPA for the home team in the game so far.}\n#' \\item{total_away_pass_wpa}{Cumulative total passing WPA for the away team in the game so far.}\n#' \\item{air_wpa}{WPA through the air (same logic as air_epa).}\n#' \\item{yac_wpa}{WPA from yards after the catch (same logic as yac_epa).}\n#' \\item{comp_air_wpa}{The air_wpa for completions only.}\n#' \\item{comp_yac_wpa}{The yac_wpa for completions only.}\n#' \\item{total_home_comp_air_wpa}{Cumulative total completions air WPA for the home team in the game so far.}\n#' \\item{total_away_comp_air_wpa}{Cumulative total completions air WPA for the away team in the game so far.}\n#' \\item{total_home_comp_yac_wpa}{Cumulative total completions yac WPA for the home team in the game so far.}\n#' \\item{total_away_comp_yac_wpa}{Cumulative total completions yac WPA for the away team in the game so far.}\n#' 
\\item{total_home_raw_air_wpa}{Cumulative total raw air WPA for the home team in the game so far.}\n#' \\item{total_away_raw_air_wpa}{Cumulative total raw air WPA for the away team in the game so far.}\n#' \\item{total_home_raw_yac_wpa}{Cumulative total raw yac WPA for the home team in the game so far.}\n#' \\item{total_away_raw_yac_wpa}{Cumulative total raw yac WPA for the away team in the game so far.}\n#' \\item{punt_blocked}{Binary indicator for if the punt was blocked.}\n#' \\item{first_down_rush}{Binary indicator for if a running play converted the first down.}\n#' \\item{first_down_pass}{Binary indicator for if a passing play converted the first down.}\n#' \\item{first_down_penalty}{Binary indicator for if a penalty converted the first down.}\n#' \\item{third_down_converted}{Binary indicator for if the first down was converted on third down.}\n#' \\item{third_down_failed}{Binary indicator for if the posteam failed to convert first down on third down.}\n#' \\item{fourth_down_converted}{Binary indicator for if the first down was converted on fourth down.}\n#' \\item{fourth_down_failed}{Binary indicator for if the posteam failed to convert first down on fourth down.}\n#' \\item{incomplete_pass}{Binary indicator for if the pass was incomplete.}\n#' \\item{touchback}{Binary indicator for if a touchback occurred on the play.}\n#' \\item{interception}{Binary indicator for if the pass was intercepted.}\n#' \\item{punt_inside_twenty}{Binary indicator for if the punt ended inside the twenty yard line.}\n#' \\item{punt_in_endzone}{Binary indicator for if the punt was in the endzone.}\n#' \\item{punt_out_of_bounds}{Binary indicator for if the punt went out of bounds.}\n#' \\item{punt_downed}{Binary indicator for if the punt was downed.}\n#' \\item{punt_fair_catch}{Binary indicator for if the punt was caught with a fair catch.}\n#' \\item{kickoff_inside_twenty}{Binary indicator for if the kickoff ended inside the twenty yard line.}\n#' \\item{kickoff_in_endzone}{Binary 
indicator for if the kickoff was in the endzone.}\n#' \\item{kickoff_out_of_bounds}{Binary indicator for if the kickoff went out of bounds.}\n#' \\item{kickoff_downed}{Binary indicator for if the kickoff was downed.}\n#' \\item{kickoff_fair_catch}{Binary indicator for if the kickoff was caught with a fair catch.}\n#' \\item{fumble_forced}{Binary indicator for if the fumble was forced.}\n#' \\item{fumble_not_forced}{Binary indicator for if the fumble was not forced.}\n#' \\item{fumble_out_of_bounds}{Binary indicator for if the fumble went out of bounds.}\n#' \\item{solo_tackle}{Binary indicator if the play had a solo tackle (could be multiple due to fumbles).}\n#' \\item{safety}{Binary indicator for whether or not a safety occurred.}\n#' \\item{penalty}{Binary indicator for whether or not a penalty occurred.}\n#' \\item{tackled_for_loss}{Binary indicator for whether or not a tackle for loss on a run play occurred.}\n#' \\item{fumble_lost}{Binary indicator for if the fumble was lost.}\n#' \\item{own_kickoff_recovery}{Binary indicator for if the kicking team recovered the kickoff.}\n#' \\item{own_kickoff_recovery_td}{Binary indicator for if the kicking team recovered the kickoff and scored a TD.}\n#' \\item{qb_hit}{Binary indicator if the QB was hit on the play.}\n#' \\item{rush_attempt}{Binary indicator for if the play was a run.}\n#' \\item{pass_attempt}{Binary indicator for if the play was a pass attempt (includes sacks).}\n#' \\item{sack}{Binary indicator for if the play ended in a sack.}\n#' \\item{touchdown}{Binary indicator for if the play resulted in a TD.}\n#' \\item{pass_touchdown}{Binary indicator for if the play resulted in a passing TD.}\n#' \\item{rush_touchdown}{Binary indicator for if the play resulted in a rushing TD.}\n#' \\item{return_touchdown}{Binary indicator for if the play resulted in a return TD.}\n#' \\item{extra_point_attempt}{Binary indicator for extra point attempt.}\n#' \\item{two_point_attempt}{Binary indicator for two point conversion 
attempt.}\n#' \\item{field_goal_attempt}{Binary indicator for field goal attempt.}\n#' \\item{kickoff_attempt}{Binary indicator for kickoff.}\n#' \\item{punt_attempt}{Binary indicator for punts.}\n#' \\item{fumble}{Binary indicator for if a fumble occurred.}\n#' \\item{complete_pass}{Binary indicator for if the pass was completed.}\n#' \\item{assist_tackle}{Binary indicator for if an assist tackle occurred.}\n#' \\item{lateral_reception}{Binary indicator for if a lateral occurred on the reception.}\n#' \\item{lateral_rush}{Binary indicator for if a lateral occurred on a run.}\n#' \\item{lateral_return}{Binary indicator for if a lateral occurred on a return.}\n#' \\item{lateral_recovery}{Binary indicator for if a lateral occurred on a fumble recovery.}\n#' \\item{passer_player_id}{Unique identifier for the player that attempted the pass.}\n#' \\item{passer_player_name}{String name for the player that attempted the pass.}\n#' \\item{passing_yards}{Numeric yards by the passer_player_name, including yards gained in pass plays with laterals.\n#' This should equal official passing statistics.}\n#' \\item{receiver_player_id}{Unique identifier for the receiver that was targeted on the pass.}\n#' \\item{receiver_player_name}{String name for the targeted receiver.}\n#' \\item{receiving_yards}{Numeric yards by the receiver_player_name, excluding yards gained in pass plays with laterals.\n#' This should equal official receiving statistics but could miss yards gained in pass plays with laterals.\n#' Please see the description of `lateral_receiver_player_name` for further information.}\n#' \\item{rusher_player_id}{Unique identifier for the player that attempted the run.}\n#' \\item{rusher_player_name}{String name for the player that attempted the run.}\n#' \\item{rushing_yards}{Numeric yards by the rusher_player_name, excluding yards gained in rush plays with laterals.\n#' This should equal official rushing statistics but could miss yards gained in rush plays with laterals.\n#' 
Please see the description of `lateral_rusher_player_name` for further information.}\n#' \\item{lateral_receiver_player_id}{Unique identifier for the player that received the last(!) lateral on a pass play.}\n#' \\item{lateral_receiver_player_name}{String name for the player that received the last(!) lateral on a pass play.\n#' If there were multiple laterals in the same play, this will only be the last player who received a lateral.\n#' Please see \\url{https://github.com/mrcaseb/nfl-data/tree/master/data/lateral_yards}\n#' for a list of plays where multiple players recorded lateral receiving yards.}\n#' \\item{lateral_receiving_yards}{Numeric yards by the `lateral_receiver_player_name` in pass plays with laterals.\n#' Please see the description of `lateral_receiver_player_name` for further information.}\n#' \\item{lateral_rusher_player_id}{Unique identifier for the player that received the last(!) lateral on a run play.}\n#' \\item{lateral_rusher_player_name}{String name for the player that received the last(!) 
lateral on a run play.\n#' If there were multiple laterals in the same play, this will only be the last player who received a lateral.\n#' Please see \\url{https://github.com/mrcaseb/nfl-data/tree/master/data/lateral_yards}\n#' for a list of plays where multiple players recorded lateral rushing yards.}\n#' \\item{lateral_rushing_yards}{Numeric yards by the `lateral_rusher_player_name` in run plays with laterals.\n#' Please see the description of `lateral_rusher_player_name` for further information.}\n#' \\item{lateral_sack_player_id}{Unique identifier for the player that received the lateral on a sack.}\n#' \\item{lateral_sack_player_name}{String name for the player that received the lateral on a sack.}\n#' \\item{interception_player_id}{Unique identifier for the player that intercepted the pass.}\n#' \\item{interception_player_name}{String name for the player that intercepted the pass.}\n#' \\item{lateral_interception_player_id}{Unique identifier for the player that received the lateral on an interception.}\n#' \\item{lateral_interception_player_name}{String name for the player that received the lateral on an interception.}\n#' \\item{punt_returner_player_id}{Unique identifier for the punt returner.}\n#' \\item{punt_returner_player_name}{String name for the punt returner.}\n#' \\item{lateral_punt_returner_player_id}{Unique identifier for the player that received the lateral on a punt return.}\n#' \\item{lateral_punt_returner_player_name}{String name for the player that received the lateral on a punt return.}\n#' \\item{kickoff_returner_player_name}{String name for the kickoff returner.}\n#' \\item{kickoff_returner_player_id}{Unique identifier for the kickoff returner.}\n#' \\item{lateral_kickoff_returner_player_id}{Unique identifier for the player that received the lateral on a kickoff return.}\n#' \\item{lateral_kickoff_returner_player_name}{String name for the player that received the lateral on a kickoff return.}\n#' \\item{punter_player_id}{Unique identifier 
for the punter.}\n#' \\item{punter_player_name}{String name for the punter.}\n#' \\item{kicker_player_name}{String name for the kicker on FG or kickoff.}\n#' \\item{kicker_player_id}{Unique identifier for the kicker on FG or kickoff.}\n#' \\item{own_kickoff_recovery_player_id}{Unique identifier for the player that recovered their own kickoff.}\n#' \\item{own_kickoff_recovery_player_name}{String name for the player that recovered their own kickoff.}\n#' \\item{blocked_player_id}{Unique identifier for the player that blocked the punt or FG.}\n#' \\item{blocked_player_name}{String name for the player that blocked the punt or FG.}\n#' \\item{tackle_for_loss_1_player_id}{Unique identifier for one of the potential players with the tackle for loss.}\n#' \\item{tackle_for_loss_1_player_name}{String name for one of the potential players with the tackle for loss.}\n#' \\item{tackle_for_loss_2_player_id}{Unique identifier for one of the potential players with the tackle for loss.}\n#' \\item{tackle_for_loss_2_player_name}{String name for one of the potential players with the tackle for loss.}\n#' \\item{qb_hit_1_player_id}{Unique identifier for one of the potential players that hit the QB. No sack as the QB was not the ball carrier. For sacks please see `sack_player` or `half_sack_*_player`.}\n#' \\item{qb_hit_1_player_name}{String name for one of the potential players that hit the QB. No sack as the QB was not the ball carrier. For sacks please see `sack_player` or `half_sack_*_player`.}\n#' \\item{qb_hit_2_player_id}{Unique identifier for one of the potential players that hit the QB. No sack as the QB was not the ball carrier. For sacks please see `sack_player` or `half_sack_*_player`.}\n#' \\item{qb_hit_2_player_name}{String name for one of the potential players that hit the QB. No sack as the QB was not the ball carrier. 
For sacks please see `sack_player` or `half_sack_*_player`.}\n#' \\item{forced_fumble_player_1_team}{Team of one of the players with a forced fumble.}\n#' \\item{forced_fumble_player_1_player_id}{Unique identifier of one of the players with a forced fumble.}\n#' \\item{forced_fumble_player_1_player_name}{String name of one of the players with a forced fumble.}\n#' \\item{forced_fumble_player_2_team}{Team of one of the players with a forced fumble.}\n#' \\item{forced_fumble_player_2_player_id}{Unique identifier of one of the players with a forced fumble.}\n#' \\item{forced_fumble_player_2_player_name}{String name of one of the players with a forced fumble.}\n#' \\item{solo_tackle_1_team}{Team of one of the players with a solo tackle.}\n#' \\item{solo_tackle_2_team}{Team of one of the players with a solo tackle.}\n#' \\item{solo_tackle_1_player_id}{Unique identifier of one of the players with a solo tackle.}\n#' \\item{solo_tackle_2_player_id}{Unique identifier of one of the players with a solo tackle.}\n#' \\item{solo_tackle_1_player_name}{String name of one of the players with a solo tackle.}\n#' \\item{solo_tackle_2_player_name}{String name of one of the players with a solo tackle.}\n#' \\item{assist_tackle_1_player_id}{Unique identifier of one of the players with a tackle assist.}\n#' \\item{assist_tackle_1_player_name}{String name of one of the players with a tackle assist.}\n#' \\item{assist_tackle_1_team}{Team of one of the players with a tackle assist.}\n#' \\item{assist_tackle_2_player_id}{Unique identifier of one of the players with a tackle assist.}\n#' \\item{assist_tackle_2_player_name}{String name of one of the players with a tackle assist.}\n#' \\item{assist_tackle_2_team}{Team of one of the players with a tackle assist.}\n#' \\item{assist_tackle_3_player_id}{Unique identifier of one of the players with a tackle assist.}\n#' \\item{assist_tackle_3_player_name}{String name of one of the players with a tackle assist.}\n#' 
\\item{assist_tackle_3_team}{Team of one of the players with a tackle assist.}\n#' \\item{assist_tackle_4_player_id}{Unique identifier of one of the players with a tackle assist.}\n#' \\item{assist_tackle_4_player_name}{String name of one of the players with a tackle assist.}\n#' \\item{assist_tackle_4_team}{Team of one of the players with a tackle assist.}\n#' \\item{tackle_with_assist}{Binary indicator for if there has been a tackle with assist.}\n#' \\item{tackle_with_assist_1_player_id}{Unique identifier of one of the players with a tackle with assist.}\n#' \\item{tackle_with_assist_1_player_name}{String name of one of the players with a tackle with assist.}\n#' \\item{tackle_with_assist_1_team}{Team of one of the players with a tackle with assist.}\n#' \\item{tackle_with_assist_2_player_id}{Unique identifier of one of the players with a tackle with assist.}\n#' \\item{tackle_with_assist_2_player_name}{String name of one of the players with a tackle with assist.}\n#' \\item{tackle_with_assist_2_team}{Team of one of the players with a tackle with assist.}\n#' \\item{pass_defense_1_player_id}{Unique identifier of one of the players with a pass defense.}\n#' \\item{pass_defense_1_player_name}{String name of one of the players with a pass defense.}\n#' \\item{pass_defense_2_player_id}{Unique identifier of one of the players with a pass defense.}\n#' \\item{pass_defense_2_player_name}{String name of one of the players with a pass defense.}\n#' \\item{fumbled_1_team}{Team of one of the first player with a fumble.}\n#' \\item{fumbled_1_player_id}{Unique identifier of the first player who fumbled on the play.}\n#' \\item{fumbled_1_player_name}{String name of one of the first player who fumbled on the play.}\n#' \\item{fumbled_2_player_id}{Unique identifier of the second player who fumbled on the play.}\n#' \\item{fumbled_2_player_name}{String name of one of the second player who fumbled on the play.}\n#' \\item{fumbled_2_team}{Team of one of the second player with a 
fumble.}\n#' \\item{fumble_recovery_1_team}{Team of one of the players with a fumble recovery.}\n#' \\item{fumble_recovery_1_yards}{Yards gained by one of the players with a fumble recovery.}\n#' \\item{fumble_recovery_1_player_id}{Unique identifier of one of the players with a fumble recovery.}\n#' \\item{fumble_recovery_1_player_name}{String name of one of the players with a fumble recovery.}\n#' \\item{fumble_recovery_2_team}{Team of one of the players with a fumble recovery.}\n#' \\item{fumble_recovery_2_yards}{Yards gained by one of the players with a fumble recovery.}\n#' \\item{fumble_recovery_2_player_id}{Unique identifier of one of the players with a fumble recovery.}\n#' \\item{fumble_recovery_2_player_name}{String name of one of the players with a fumble recovery.}\n#' \\item{sack_player_id}{Unique identifier of the player who recorded a solo sack.}\n#' \\item{sack_player_name}{String name of the player who recorded a solo sack.}\n#' \\item{half_sack_1_player_id}{Unique identifier of the first player who recorded half a sack.}\n#' \\item{half_sack_1_player_name}{String name of the first player who recorded half a sack.}\n#' \\item{half_sack_2_player_id}{Unique identifier of the second player who recorded half a sack.}\n#' \\item{half_sack_2_player_name}{String name of the second player who recorded half a sack.}\n#' \\item{return_team}{String abbreviation of the return team.}\n#' \\item{return_yards}{Yards gained by the return team.}\n#' \\item{penalty_team}{String abbreviation of the team with the penalty.}\n#' \\item{penalty_player_id}{Unique identifier for the player with the penalty.}\n#' \\item{penalty_player_name}{String name for the player with the penalty.}\n#' \\item{penalty_yards}{Yards gained (or lost) by the posteam from the penalty.}\n#' \\item{replay_or_challenge}{Binary indicator for whether or not a replay or challenge.}\n#' \\item{replay_or_challenge_result}{String indicating the result of the replay or challenge.}\n#' 
\\item{penalty_type}{String indicating the penalty type of the first penalty in the given play. Will be `NA` if `desc` is missing the type.}\n#' \\item{defensive_two_point_attempt}{Binary indicator whether or not the defense was able to have an attempt on a two point conversion, this results following a turnover.}\n#' \\item{defensive_two_point_conv}{Binary indicator whether or not the defense successfully scored on the two point conversion.}\n#' \\item{defensive_extra_point_attempt}{Binary indicator whether or not the defense was able to have an attempt on an extra point attempt, this results following a blocked attempt that the defense recovers the ball.}\n#' \\item{defensive_extra_point_conv}{Binary indicator whether or not the defense successfully scored on an extra point attempt.}\n#' \\item{safety_player_name}{String name for the player who scored a safety.}\n#' \\item{safety_player_id}{Unique identifier for the player who scored a safety.}\n#' \\item{season}{4 digit number indicating to which season the game belongs to.}\n#' \\item{cp}{Numeric value indicating the probability for a complete pass based on comparable game situations.}\n#' \\item{cpoe}{For a single pass play this is 1 - cp when the pass was completed or 0 - cp when the pass was incomplete. Analyzed for a whole game or season an indicator for the passer how much over or under expectation his completion percentage was.}\n#' \\item{series}{Starts at 1, each new first down increments, numbers shared across both teams NA: kickoffs, extra point/two point conversion attempts, non-plays, no posteam}\n#' \\item{series_success}{1: scored touchdown, gained enough yards for first down.}\n#' \\item{series_result}{Possible values: First down, Touchdown, Opp touchdown, Field goal, Missed field goal, Safety, Turnover, Punt, Turnover on downs, QB kneel, End of half}\n#' \\item{start_time}{Kickoff time in eastern time zone.}\n#' \\item{order_sequence}{Column provided by NFL to fix out-of-order plays. 
Available 2011 and beyond with source \"nfl\".}\n#' \\item{time_of_day}{Time of day of play in UTC \"HH:MM:SS\" format. Available 2011 and beyond with source \"nfl\".}\n#' \\item{stadium}{Game site name.}\n#' \\item{weather}{String describing the weather including temperature, humidity and wind (direction and speed). Doesn't change during the game!}\n#' \\item{nfl_api_id}{UUID of the game in the new NFL API.}\n#' \\item{play_clock}{Time on the playclock when the ball was snapped.}\n#' \\item{play_deleted}{Binary indicator for deleted plays.}\n#' \\item{play_type_nfl}{Play type as listed in the NFL source. Slightly different to the regular play_type variable.}\n#' \\item{special_teams_play}{Binary indicator for whether play is special teams play from NFL source. Available 2011 and beyond with source \"nfl\".}\n#' \\item{st_play_type}{Type of special teams play from NFL source. Available 2011 and beyond with source \"nfl\".}\n#' \\item{end_clock_time}{Game time at the end of a given play.}\n#' \\item{end_yard_line}{String indicating the yardline at the end of the given play consisting of team half and yard line number.}\n#' \\item{drive_real_start_time}{Local day time when the drive started (currently not used by the NFL and therefore mostly 'NA').}\n#' \\item{drive_play_count}{Numeric value of how many regular plays happened in a given drive.}\n#' \\item{drive_time_of_possession}{Time of possession in a given drive.}\n#' \\item{drive_first_downs}{Number of first downs in a given drive.}\n#' \\item{drive_inside20}{Binary indicator if the offense was able to get inside the opponents 20 yard line.}\n#' \\item{drive_ended_with_score}{Binary indicator the drive ended with a score.}\n#' \\item{drive_quarter_start}{Numeric value indicating in which quarter the given drive has started.}\n#' \\item{drive_quarter_end}{Numeric value indicating in which quarter the given drive has ended.}\n#' \\item{drive_yards_penalized}{Numeric value of how many yards the offense gained or 
lost through penalties in the given drive.}\n#' \\item{drive_start_transition}{String indicating how the offense got the ball.}\n#' \\item{drive_end_transition}{String indicating how the offense lost the ball.}\n#' \\item{drive_game_clock_start}{Game time at the beginning of a given drive.}\n#' \\item{drive_game_clock_end}{Game time at the end of a given drive.}\n#' \\item{drive_start_yard_line}{String indicating where a given drive started, consisting of team half and yard line number.}\n#' \\item{drive_end_yard_line}{String indicating where a given drive ended, consisting of team half and yard line number.}\n#' \\item{drive_play_id_started}{Play_id of the first play in the given drive.}\n#' \\item{drive_play_id_ended}{Play_id of the last play in the given drive.}\n#' \\item{fixed_drive}{Manually created drive number in a game.}\n#' \\item{fixed_drive_result}{Manually created drive result.}\n#' \\item{away_score}{Total points scored by the away team.}\n#' \\item{home_score}{Total points scored by the home team.}\n#' \\item{location}{Either 'Home' or 'Neutral' indicating if the home team played at home or at a neutral site.}\n#' \\item{result}{Equals home_score - away_score and means the game outcome from the perspective of the home team.}\n#' \\item{total}{Equals home_score + away_score and means the total points scored in the given game.}\n#' \\item{spread_line}{The closing spread line for the game. A positive number means the home team was favored by that many points, a negative number means the away team was favored by that many points. (Source: Pro-Football-Reference)}\n#' \\item{total_line}{The closing total line for the game. (Source: Pro-Football-Reference)}\n#' \\item{div_game}{Binary indicator for whether the given game was a division game.}\n#' \\item{roof}{One of 'dome', 'outdoors', 'closed', 'open' indicating the roof status of the stadium the game was played in. 
(Source: Pro-Football-Reference)}\n#' \\item{surface}{What type of ground the game was played on. (Source: Pro-Football-Reference)}\n#' \\item{temp}{The temperature at the stadium only for 'roof' = 'outdoors' or 'open'. (Source: Pro-Football-Reference)}\n#' \\item{wind}{The speed of the wind in miles/hour only for 'roof' = 'outdoors' or 'open'. (Source: Pro-Football-Reference)}\n#' \\item{home_coach}{First and last name of the home team coach. (Source: Pro-Football-Reference)}\n#' \\item{away_coach}{First and last name of the away team coach. (Source: Pro-Football-Reference)}\n#' \\item{stadium_id}{ID of the stadium the game was played in. (Source: Pro-Football-Reference)}\n#' \\item{game_stadium}{Name of the stadium the game was played in. (Source: Pro-Football-Reference)}\n#' }\n#' @export\n#' @examples\n#' \\donttest{\n#' # Get pbp data for two games\n#' try({# to avoid CRAN test problems\n#' fast_scraper(c(\"2019_01_GB_CHI\", \"2013_21_SEA_DEN\"))\n#' })\n#'\n#'\n#' # It is also possible to directly use the\n#' # output of `fast_scraper_schedules` as input\n#' try({# to avoid CRAN test problems\n#' library(dplyr, warn.conflicts = FALSE)\n#' fast_scraper_schedules(2020) |>\n#'   slice_tail(n = 3) |>\n#'   fast_scraper()\n#' })\n#'\n#' \\dontshow{\n#' # Close open connections for R CMD Check\n#' future::plan(\"sequential\")\n#' }\n#' }\nfast_scraper <- function(\n  game_ids,\n  dir = getOption(\"nflfastR.raw_directory\", default = NULL),\n  ...,\n  in_builder = FALSE\n) {\n  if (!is.vector(game_ids) && is.data.frame(game_ids)) {\n    game_ids <- game_ids$game_id\n  }\n\n  if (!is.vector(game_ids)) {\n    cli::cli_abort(\"Param {.code game_ids} is not a valid vector!\")\n  }\n\n  if (length(game_ids) > 1 && is_sequential()) {\n    cli::cli_alert_info(\n      c(\n        \"It is recommended to use parallel processing when trying to load multiple games.\",\n        \"Please consider running {.code future::plan(\\\"multisession\\\")}! 
\",\n        \"Will go on sequentially...\"\n      )\n    )\n  }\n\n  # nflfastR v6 stopped supporting the 1999 and 2000 seasons because of\n  # inconsistent data sources. Data is still available through load_pbp\n  # but we will not fix any issues.\n  # It's possible to install nflfastR v5.2.0 to parse those seasons.\n  # try pak::pak(\"nflverse/nflfastR@v5.2.0\")\n  game_ids <- check_for_dropped_seasons(game_ids)\n\n  suppressWarnings({\n    p <- progressr::progressor(along = game_ids)\n    pbp <- furrr::future_map_dfr(\n      game_ids,\n      function(x, p, dir, ...) {\n        plays <- please_work(get_pbp_nfl)(x, dir = dir, ...)\n        p(sprintf(\"ID=%s\", as.character(x)))\n        return(plays)\n      },\n      p,\n      dir = dir,\n      ...\n    )\n\n    if (length(pbp) != 0) {\n      user_message(\"Download finished. Adding variables...\", \"done\")\n      pbp <- pbp |>\n        add_game_data(...) |>\n        add_nflscrapr_mutations() |>\n        add_ep() |>\n        add_air_yac_ep() |>\n        add_wp() |>\n        add_air_yac_wp() |>\n        add_cp() |>\n        add_drive_results() |>\n        add_series_data() |>\n        restore_kickoff_attempt() |>\n        select_variables()\n    }\n  })\n\n  if (!in_builder) {\n    str <- paste0(my_time(), \" | Procedure completed.\")\n    cli::cli_alert_success(\"{.field {str}}\")\n  }\n  make_nflverse_data(pbp)\n}\n\n\n# roster ------------------------------------------------------------------\n\n#' Load Team Rosters for Multiple Seasons\n#'\n#' @description\n#' `r lifecycle::badge(\"deprecated\")`\n#'\n#' This function was deprecated. 
Please use [`nflreadr::load_rosters`].\n#'\n#' @details See [`nflreadr::load_rosters`] for details.\n#' @inheritDotParams nflreadr::load_rosters\n#' @inherit nflreadr::load_rosters\n#' @seealso For information on parallel processing and progress updates please\n#' see [nflfastR].\n#' @keywords internal\n#' @examples\n#' \\donttest{\n#' # Roster of the 2019 and 2020 seasons\n#' try({# to avoid CRAN test problems\n#' # fast_scraper_roster(2019:2020)\n#' })\n#' }\n#' @export\nfast_scraper_roster <- function(...) {\n  lifecycle::deprecate_warn(\n    \"5.2.0\",\n    \"fast_scraper_roster()\",\n    \"nflreadr::load_rosters()\"\n  )\n  nflreadr::load_rosters(...)\n}\n\n# schedules ---------------------------------------------------------------\n\n#' Load NFL Season Schedules\n#'\n#' @description\n#' `r lifecycle::badge(\"deprecated\")`\n#'\n#' This function was deprecated. Please use [`nflreadr::load_schedules`].\n#'\n#' @details See [`nflreadr::load_schedules`] for details.\n#' @inheritDotParams nflreadr::load_schedules\n#' @inherit nflreadr::load_schedules\n#' @seealso For information on parallel processing and progress updates please\n#' see [nflfastR].\n#' @keywords internal\n#' @examples\n#'\\donttest{\n#' # Get schedules for the whole 2015 - 2018 seasons\n#' try({# to avoid CRAN test problems\n#' # fast_scraper_schedules(2015:2018)\n#' })\n#' }\n#' @export\nfast_scraper_schedules <- function(...) {\n  lifecycle::deprecate_warn(\n    \"5.2.0\",\n    \"fast_scraper_schedules()\",\n    \"nflreadr::load_schedules()\"\n  )\n  nflreadr::load_schedules(...)\n}\n"
  },
  {
    "path": "R/utils.R",
    "content": "# The function `message_completed` to create the green \"...completed\" message\n# only exists to hide the option `in_builder` in dots\nmessage_completed <- function(x, in_builder = FALSE) {\n  if (isFALSE(in_builder)) {\n    str <- paste0(my_time(), \" | \", x)\n    cli::cli_alert_success(\"{.field {str}}\")\n  } else if (in_builder) {\n    cli::cli_alert_success(\"{my_time()} | {x}\")\n  }\n}\n\nuser_message <- function(x, type) {\n  if (type == \"done\") {\n    cli::cli_alert_success(\"{my_time()} | {x}\")\n  } else if (type == \"todo\") {\n    cli::cli_ul(\"{my_time()} | {x}\")\n  } else if (type == \"info\") {\n    cli::cli_alert_info(\"{my_time()} | {x}\")\n  } else if (type == \"oops\") {\n    cli::cli_alert_danger(\"{my_time()} | {x}\")\n  }\n}\n\ncli_message <- function(\n  msg,\n  ...,\n  .cli_fct = cli::cli_alert_info,\n  .envir = parent.frame()\n) {\n  .cli_fct(c(my_time(), \" | \", msg), ..., .envir = .envir)\n}\n\nmy_time <- function() strftime(Sys.time(), format = \"%H:%M:%S\")\n\n# custom mode function from https://stackoverflow.com/questions/2547402/is-there-a-built-in-function-for-finding-the-mode/8189441\ncustom_mode <- function(x, na.rm = TRUE) {\n  if (na.rm) {\n    x <- x[!is.na(x)]\n  }\n  ux <- unique(x)\n  return(ux[which.max(tabulate(match(x, ux)))])\n}\n\nrule_header <- function(x) {\n  print(cli::rule(\n    left = cli::style_bold(x),\n    right = paste(\"nflfastR version\", utils::packageVersion(\"nflfastR\")),\n  ))\n}\n\nrule_footer <- function(x) {\n  print(cli::rule(\n    left = cli::style_bold(x)\n  ))\n}\n\n# read rds that has been pre-fetched\nread_raw_rds <- function(raw) {\n  con <- gzcon(rawConnection(raw))\n  ret <- readRDS(con)\n  on.exit(close(con))\n  ret\n}\n\n# helper to make sure the output of the\n# schedule scraper is not named 'invalid' if the source file not yet exists\nmaybe_valid <- function(id) {\n  all(\n    length(id) == 1,\n    is.character(id),\n    substr(id, 1, 4) %in%\n      seq.int(1999, 
as.integer(format(Sys.Date(), \"%Y\")) + 1, 1),\n    as.integer(substr(id, 6, 7)) %in% seq_len(22),\n    stringr::str_extract_all(id, \"(?<=_)[:upper:]{2,3}\")[[1]] %in%\n      nflfastR::teams_colors_logos$team_abbr\n  )\n}\n\n# some 2000 games have play_ids like 2767.375 and 2767.703 which results in\n# duplicates that can be fixed. We save play IDs as numeric first and then\n# check whether or not there are duplicates when we convert them to integer\n# If there are duplicates, we multiply all play IDs by 10 and check again\n# If there are still duplicates, we multiply all play IDs by 100 and so on\n# As soon as play IDs are unique, we save them as integer and go on\nuniquify_ids <- function(ids) {\n  ids <- as.numeric(ids)\n  int_ids <- as.integer(ids)\n  mult <- 10\n  while (anyDuplicated(int_ids) > 0) {\n    int_ids <- as.integer(ids * mult)\n    mult <- mult * 10\n  }\n  int_ids\n}\n\n# check if a package is installed\nis_installed <- function(pkg) requireNamespace(pkg, quietly = TRUE)\n\n# load raw game files esp. 
for debugging\nload_raw_game <- function(\n  game_id,\n  dir = getOption(\"nflfastR.raw_directory\", default = NULL),\n  skip_local = FALSE\n) {\n  # game_id <- \"2022_19_LAC_JAX\"\n\n  season <- substr(game_id, 1, 4)\n\n  local_file <- file.path(\n    dir,\n    season,\n    paste0(game_id, \".rds\")\n  )\n\n  if (\n    length(local_file) == 1 && file.exists(local_file) && isFALSE(skip_local)\n  ) {\n    # cli::cli_progress_step(\"Load locally from {.path {local_file}}\")\n    raw <- readRDS(local_file)\n  } else {\n    to_load <- raw_pbp_urls(game_id)\n    raw <- nflreadr::rds_from_url(to_load)\n  }\n\n  raw\n}\n\n# Identify sessions with sequential future resolving\nis_sequential <- function() inherits(future::plan(), \"sequential\")\n\n# take a time string of the format \"MM:SS\" and convert it to seconds\ntime_to_seconds <- function(time) {\n  as.numeric(strptime(time, format = \"%M:%S\")) -\n    as.numeric(strptime(\"0\", format = \"%S\"))\n}\n\n# write season pbp to a connected db\nwrite_pbp <- function(seasons, dbConnection, tablename) {\n  p <- progressr::progressor(along = seasons)\n  purrr::walk(\n    seasons,\n    function(x, p) {\n      pbp <- nflreadr::load_pbp(x)\n      if (!DBI::dbExistsTable(dbConnection, tablename)) {\n        pbp <- dplyr::bind_rows(default_play, pbp)\n      }\n      DBI::dbWriteTable(dbConnection, tablename, pbp, append = TRUE)\n      p(\"loading...\")\n    },\n    p\n  )\n}\n\nmake_nflverse_data <- function(data, type = c(\"play by play\")) {\n  attr(data, \"nflverse_timestamp\") <- Sys.time()\n  attr(data, \"nflverse_type\") <- type\n  attr(data, \"nflfastR_version\") <- utils::packageVersion(\"nflfastR\")\n  class(data) <- c(\"nflverse_data\", \"tbl_df\", \"tbl\", \"data.table\", \"data.frame\")\n  data\n}\n\nstr_split_and_extract <- function(string, pattern, i) {\n  split_list <- stringr::str_split(string, pattern, simplify = TRUE, n = i + 1)\n  split_list[, i]\n}\n\n# slightly modified version of purrr::possibly\nplease_work 
<- function(.f, otherwise = data.frame(), quiet = FALSE) {\n  function(...) {\n    tryCatch(\n      expr = .f(...),\n      error = function(e) {\n        if (isFALSE(quiet)) {\n          cli::cli_alert_warning(conditionMessage(e))\n        }\n        otherwise\n      }\n    )\n  }\n}\n\n# THIS IS CALLED FROM INSIDE get_pbp_gc AND get_pbp_nfl\n# MODIFY WITH CAUTION\nfetch_raw <- function(\n  game_id,\n  dir = getOption(\"nflfastR.raw_directory\", default = NULL)\n) {\n  season <- substr(game_id, 1, 4)\n\n  if (is.null(dir)) {\n    to_load <- raw_pbp_urls(game_id)\n\n    fetched <- curl::curl_fetch_memory(to_load)\n\n    # scalar condition: use && so maybe_valid() only runs for a 404 response\n    if (fetched$status_code == 404 && maybe_valid(game_id)) {\n      cli::cli_abort(\n        \"The requested GameID {.val {game_id}} is not loaded yet, please try again later!\"\n      )\n    } else if (fetched$status_code == 500) {\n      cli::cli_abort(\n        \"The data hosting servers are down, please try again later!\"\n      )\n    } else if (fetched$status_code == 404) {\n      cli::cli_abort(\"The requested GameID {.val {game_id}} is invalid!\")\n    }\n\n    out <- read_raw_rds(fetched$content)\n  } else {\n    # build path to locally stored game files\n    local_file <- file.path(\n      dir,\n      season,\n      paste0(game_id, \".rds\")\n    )\n\n    if (!file.exists(local_file)) {\n      cli::cli_abort(\"File {.path {local_file}} doesn't exist!\")\n    }\n\n    out <- readRDS(local_file)\n  }\n\n  out\n}\n\nrelease_bullets <- function() {\n  c(\n    '`devtools::check_mac_release()`',\n    '`nflfastR:::my_rhub_check()`',\n    '`pkgdown::check_pkgdown()`',\n    '`nflfastR:::nflverse_thanks()`',\n    NULL\n  )\n}\n\nload_model <- function(name) {\n  model <- switch(\n    name,\n    \"ep\" = fastrmodels::ep_model,\n    \"cp\" = fastrmodels::cp_model,\n    \"wp\" = fastrmodels::wp_model,\n    \"wp_spread\" = fastrmodels::wp_model_spread,\n    \"fg\" = fastrmodels::fg_model,\n    \"xpass\" = fastrmodels::xpass_model,\n    \"xyac\" = 
fastrmodels::xyac_model\n  )\n\n  # fastrmodels v2 introduced raw model vectors to make sure the models\n  # are compatible with future xgboost versions\n  out <- if (is.raw(model)) {\n    xgboost::xgb.load.raw(model)\n  } else {\n    model\n  }\n  out\n}\n\nmy_rhub_check <- function() {\n  cli::cli_text(\"Please run the following code\")\n  cli::cli_text(\n    \"{.run rhub::rhub_check(platforms = nflfastR:::rhub_check_platforms())}\"\n  )\n}\n\nrhub_check_platforms <- function() {\n  # plts created with\n  # out <- paste0('\"', rhub::rhub_platforms()$name, '\"', collapse = \",\\n\")\n  # cli::cli_code(paste0(\n  #   \"plts <- c(\\n\", out, \"\\n)\"\n  # ))\n\n  plts <- c(\n    \"linux\",\n    \"m1-san\",\n    \"macos\",\n    \"macos-arm64\",\n    \"windows\",\n    \"atlas\",\n    \"c23\",\n    \"clang-asan\",\n    \"clang-ubsan\",\n    \"clang16\",\n    \"clang17\",\n    \"clang18\",\n    \"clang19\",\n    \"clang20\",\n    \"donttest\",\n    \"gcc-asan\",\n    \"gcc13\",\n    \"gcc14\",\n    \"gcc15\",\n    \"intel\",\n    \"mkl\",\n    \"nold\",\n    \"noremap\",\n    \"nosuggests\",\n    \"rchk\",\n    \"ubuntu-clang\",\n    \"ubuntu-gcc12\",\n    \"ubuntu-next\",\n    \"ubuntu-release\",\n    \"valgrind\"\n  )\n  exclude <- c(\"rchk\", \"nosuggests\", \"valgrind\")\n  plts[!plts %in% exclude]\n}\n\nnflverse_thanks <- function() {\n  cli::cli_text(\"Run the following code and copy/paste its output to NEWS.md\")\n\n  cli::cli_code(\n    '\n    contributors <- usethis::use_tidy_thanks()\n    paste(\n      \"Thank you to\",\n      glue::glue_collapse(\n        paste0(\"&#x0040;\", contributors), sep = \", \", last = \", and \"\n      ),\n      \"for their questions, feedback, and contributions towards this release.\"\n    )'\n  )\n}\n\ncheck_for_dropped_seasons <- function(game_ids) {\n  dropped_support <- grep(\"1999|2000\", game_ids, value = TRUE)\n  if (length(dropped_support)) {\n    seasons <- substr(dropped_support, 1, 4) |> unique() |> sort()\n    
cli::cli_alert_warning(\n      \"You have supplied game ID(s) of the {seasons} \\\\\n      {cli::qty(length(seasons))}season{?s}. \\\\\n      nflfastR v6 has discontinued support for the parser for {?this/these} \\\\\n      season{?s} because of too many inconsistencies between the data sources. \\\\\n      The data is still available for download, however. \\\\\n      Please run {.run nflfastR::load_pbp(c({paste(seasons, collapse = ', ')}))}\"\n    )\n    game_ids <- game_ids[!game_ids %in% dropped_support]\n  }\n  game_ids\n}\n\nraw_pbp_urls <- function(game_ids) {\n  # URL pattern (the release tag is \"raw_pbp_\" followed by the season)\n  # https://github.com/nflverse/nflverse-pbp/releases/download/raw_pbp_{season}/{game_id}.rds\n  file.path(\n    \"https://github.com/nflverse/nflverse-pbp/releases/download\",\n    paste0(\"raw_pbp_\", substr(game_ids, 1, 4)),\n    paste0(game_ids, \".rds\"),\n    fsep = \"/\"\n  )\n}\n"
  },
  {
    "path": "README.Rmd",
    "content": "---\noutput: github_document\n---\n\n<!-- README.md is generated from README.Rmd. Please edit that file -->\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#>\",\n  fig.path = \"man/figures/readme-\"\n)\n```\n\n# **nflfastR** <img src=\"man/figures/logo.png\" align=\"right\" width=\"25%\" min-width=\"120px\"/>\n\n\n<!-- badges: start -->\n[![CRAN status](https://www.r-pkg.org/badges/version-last-release/nflfastR)](https://CRAN.R-project.org/package=nflfastR)\n[![CRAN downloads](https://cranlogs.r-pkg.org/badges/grand-total/nflfastR)](https://CRAN.R-project.org/package=nflfastR)\n[![Dev status](https://img.shields.io/github/r-package/v/nflverse/nflfastR/master?label=dev%20version&style=flat-square&logo=github)](https://nflfastr.com/)\n[![R build status](https://img.shields.io/github/actions/workflow/status/nflverse/nflfastR/R-CMD-check.yaml?label=R%20check&style=flat-square&logo=github)](https://github.com/nflverse/nflfastR/actions)\n[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)\n[![nflverse support](https://img.shields.io/discord/789805604076126219?color=7289da&label=nflverse%20support&logo=discord&logoColor=fff&style=flat-square)](https://discord.com/invite/5Er2FBnnQa)\n<!-- [![Twitter Follow](https://img.shields.io/twitter/follow/nflfastR.svg?style=social)](https://twitter.com/nflfastR) -->\n<!-- badges: end -->\n\n`nflfastR` is a set of functions to efficiently scrape NFL play-by-play data. 
`nflfastR` expands upon the features of nflscrapR:\n  \n* The package contains NFL play-by-play data back to 1999\n* As suggested by the package name, it obtains games **much** faster\n* Includes completion probability (`cp`), completion percentage over expected (`cpoe`), and expected yards after the catch (`xyac_epa` and `xyac_mean_yardage`) in play-by-play going back to 2006\n* Includes drive information, including drive starting position and drive result\n* Includes series information, including series number and series success\n* Hosts [a release of play-by-play data going back to 1999](https://github.com/nflverse/nflverse-data/releases/tag/pbp) for very quick access\n* Features models for Expected Points, Win Probability, Completion Probability, and Yards After the Catch (see section below)\n* Includes a function `update_db()` that creates and updates a database\n\nWe owe a debt of gratitude to the original [`nflscrapR`](https://github.com/maksimhorowitz/nflscrapR) team, Maksim Horowitz, Ronald Yurko, and Samuel Ventura, without whose contributions and inspiration this package would not exist.\n\n\n## Installation\n\nThe easiest way to get nflfastR is to install it from [CRAN](https://cran.r-project.org/package=nflfastR) with:\n\n```{r, eval=FALSE}\ninstall.packages(\"nflfastR\")\n```\n\nTo get a bug fix or to use a feature from the development version, you can install the development version of nflfastR either from [GitHub](https://github.com/nflverse/nflfastR/) with:\n\n``` {r eval = FALSE}\nif (!require(\"pak\")) install.packages(\"pak\")\npak::pak(\"nflverse/nflfastR\")\n```\n\nor prebuilt from the [development repo](https://nflverse.r-universe.dev) with:\n\n```{r eval = FALSE}\ninstall.packages(\"nflfastR\", repos = c(\"https://nflverse.r-universe.dev\", getOption(\"repos\")))\n```\n\n## Usage\n\nWe have provided some application examples in the **[Getting Started](https://nflfastr.com/articles/nflfastR.html)** article. 
However, these require a basic knowledge of R. For this reason we have the **[nflfastR beginner's guide](https://nflfastr.com/articles/beginners_guide.html)**, which we recommend to all those who are looking for an introduction to nflfastR with R.\n\nYou can find column names and descriptions in the **[Field Descriptions](https://nflfastr.com/articles/field_descriptions.html)** article, or by accessing the `field_descriptions` dataframe from the package.\n\n## Data access\n\nEven though `nflfastR` is very fast, **we recommend downloading the data from [here](https://github.com/nflverse/nflverse-data/releases/tag/pbp) or using the `nflreadr` package**. These data sets include play-by-play data of complete seasons going back to 1999 and are updated nightly during the season. The files contain both regular season and postseason data, and one can use game_type or week to figure out which games occurred in the postseason.\n\n## nflfastR models\n\n`nflfastR` uses its own models for Expected Points, Win Probability, Completion Probability, and Expected Yards After the Catch. To read about the models, please see [this post on Open Source Football](https://opensourcefootball.com/posts/2020-09-28-nflfastr-ep-wp-and-cp-models/). For a more detailed description of the motivation for Expected Points models, we highly recommend this paper [from the nflscrapR team located here](https://arxiv.org/pdf/1802.00998). \n\nHere is a visualization of the Expected Points model by down and yardline.\n\n``` {r epa-model, warning = FALSE, message = FALSE, results = 'hide', fig.keep = 'all', dpi = 600, echo=FALSE, eval = FALSE}\n\n# This code was used to create the ep model image. 
Since we don't want to include \n# the resulting png file in the package for file size reasons it was uploaded to\n# the nflfastR repo and embedded remotely with the next chunk\n\nlibrary(tidyverse)\n\ndf <- nflreadr::load_pbp(2014:2019) |>\n        filter(!is.na(posteam) & !is.na(ep), !is.na(down)) |>\n        select(ep, down, yardline_100, air_yards, pass_location, cp)\n\ndf |>\n  ggplot(aes(x = yardline_100, y = ep, color = as.factor(down))) + \n  geom_smooth(size = 2) + \n  labs(x = \"Yards from opponent's end zone\",\n       y = \"Expected points value\",\n       color = \"Down\",\n       title = \"Expected Points by Yardline and Down\") +\n  theme_bw() + \n  scale_y_continuous(expand=c(0,0), breaks = scales::pretty_breaks(10)) + \n  scale_x_continuous(expand=c(0,0), breaks = seq(from = 5, to = 95, by = 10)) +\n  theme(\n    plot.title = element_text(size = 18, hjust = 0.5),\n    plot.subtitle = element_text(size = 16, hjust = 0.5),\n    axis.title = element_text(size = 18),\n    axis.text = element_text(size = 16),\n    legend.text = element_text(size = 16),\n    legend.title = element_text(size = 16),\n    legend.position = c(.90, .80)) +\n    annotate(\"text\", x = 14, y = -2.2, size = 3, label = \"2014-2019 | Model: @nflfastR\")\n```\n\n```{r echo=FALSE, fig.align='center', fig.cap='', out.width='100%'}\nknitr::include_graphics('man/figures/readme-epa-model-1.png')\n```\n\nHere is a visualization of the Completion Probability model by air yards and pass direction.\n\n``` {r cp-model, warning = FALSE, message = FALSE, results = 'hide', fig.keep = 'all', dpi = 600, echo=FALSE, eval = FALSE}\n\n# This code was used to create the cp model image. 
Since we don't want to include \n# the resulting png file in the package for file size reasons it was uploaded to\n# the nflfastR repo and embedded remotely with the next chunk\n\ndf |>\n  filter(!is.na(cp), between(air_yards, -5, 45)) |>\n  mutate(pass_middle = if_else(pass_location == \"middle\", \"Yes\", \"No\")) |>\n  ggplot(aes(x = air_yards, y = cp, color = as.factor(pass_middle))) + \n  geom_smooth(size = 2) + \n  labs(x = \"Air yards\",\n       y = \"Expected completion %\",\n       color = \"Pass middle\",\n       title = \"Expected Completion % by Air Yards and Pass Direction\") +\n  theme_bw() + \n  scale_y_continuous(expand=c(0,0), breaks = scales::pretty_breaks(5)) + \n  scale_x_continuous(expand=c(0,0)) +\n  theme(\n    plot.title = element_text(size = 18, hjust = 0.5),\n    plot.subtitle = element_text(size = 16, hjust = 0.5),\n    axis.title = element_text(size = 18),\n    axis.text = element_text(size = 16),\n    legend.text = element_text(size = 16),\n    legend.title = element_text(size = 16),\n    legend.position = c(.80, .80)) +\n    annotate(\"text\", x = 2, y = .32, size = 3, label = \"2014-2019 | Model: @nflfastR\")\n```\n\n```{r echo=FALSE, fig.align='center', fig.cap='', out.width='100%'}\nknitr::include_graphics('man/figures/readme-cp-model-1.png')\n```\n\n`nflfastR` includes two win probability models: one with and one without incorporating the pre-game spread.\n\n## Special thanks\n\n* To Nick Shoemaker for [finding and making available JSON-formatted NFL play-by-play back to 1999](https://github.com/CroppedClamp/nfl_pbps) (`nflfastR` uses this source for 1999 and 2000 and previously also used it for 2001-2010)\n* To Lau Sze Yui for developing a scraping function to access JSON-formatted NFL play-by-play beginning in 2001\n* To Aaron Schatz and FTN Fantasy for providing charting data to correctly mark scrambles in the 1999-2005 seasons\n* To Lee Sharpe for curating a resource for game information\n* To Timo Riske, Lau Sze Yui, Sean 
Clement, and Daniel Houston for many helpful discussions regarding the development of the new `nflfastR` models\n* To Zach Feldman and Josh Hermsmeyer for many helpful discussions about CPOE models as well as Peter Owen for many helpful suggestions for the CP model\n* To Florian Schmitt for the logo design\n* The many users who found and reported bugs in `nflfastR` 1.0\n* And of course, the original [`nflscrapR`](https://github.com/maksimhorowitz/nflscrapR) team, Maksim Horowitz, Ronald Yurko, and Samuel Ventura, whose work represented a dramatic step forward for the state of public NFL research\n"
  },
  {
    "path": "README.md",
    "content": "\n<!-- README.md is generated from README.Rmd. Please edit that file -->\n\n# **nflfastR** <img src=\"man/figures/logo.png\" align=\"right\" width=\"25%\" min-width=\"120px\"/>\n\n<!-- badges: start -->\n\n[![CRAN\nstatus](https://www.r-pkg.org/badges/version-last-release/nflfastR)](https://CRAN.R-project.org/package=nflfastR)\n[![CRAN\ndownloads](https://cranlogs.r-pkg.org/badges/grand-total/nflfastR)](https://CRAN.R-project.org/package=nflfastR)\n[![Dev\nstatus](https://img.shields.io/github/r-package/v/nflverse/nflfastR/master?label=dev%20version&style=flat-square&logo=github)](https://nflfastr.com/)\n[![R build\nstatus](https://img.shields.io/github/actions/workflow/status/nflverse/nflfastR/R-CMD-check.yaml?label=R%20check&style=flat-square&logo=github)](https://github.com/nflverse/nflfastR/actions)\n[![Lifecycle:\nstable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)\n[![nflverse\nsupport](https://img.shields.io/discord/789805604076126219?color=7289da&label=nflverse%20support&logo=discord&logoColor=fff&style=flat-square)](https://discord.com/invite/5Er2FBnnQa)\n<!-- [![Twitter Follow](https://img.shields.io/twitter/follow/nflfastR.svg?style=social)](https://twitter.com/nflfastR) -->\n<!-- badges: end -->\n\n`nflfastR` is a set of functions to efficiently scrape NFL play-by-play\ndata. 
`nflfastR` expands upon the features of nflscrapR:\n\n- The package contains NFL play-by-play data back to 1999\n- As suggested by the package name, it obtains games **much** faster\n- Includes completion probability (`cp`), completion percentage over\n  expected (`cpoe`), and expected yards after the catch (`xyac_epa` and\n  `xyac_mean_yardage`) in play-by-play going back to 2006\n- Includes drive information, including drive starting position and\n  drive result\n- Includes series information, including series number and series\n  success\n- Hosts [a release of play-by-play data going back to\n  1999](https://github.com/nflverse/nflverse-data/releases/tag/pbp) for\n  very quick access\n- Features models for Expected Points, Win Probability, Completion\n  Probability, and Yards After the Catch (see section below)\n- Includes a function `update_db()` that creates and updates a database\n\nWe owe a debt of gratitude to the original\n[`nflscrapR`](https://github.com/maksimhorowitz/nflscrapR) team, Maksim\nHorowitz, Ronald Yurko, and Samuel Ventura, without whose contributions\nand inspiration this package would not exist.\n\n## Installation\n\nThe easiest way to get nflfastR is to install it from\n[CRAN](https://cran.r-project.org/package=nflfastR) with:\n\n``` r\ninstall.packages(\"nflfastR\")\n```\n\nTo get a bug fix or to use a feature from the development version, you\ncan install the development version of nflfastR either from\n[GitHub](https://github.com/nflverse/nflfastR/) with:\n\n``` r\nif (!require(\"pak\")) install.packages(\"pak\")\npak::pak(\"nflverse/nflfastR\")\n```\n\nor prebuilt from the [development repo](https://nflverse.r-universe.dev)\nwith:\n\n``` r\ninstall.packages(\"nflfastR\", repos = c(\"https://nflverse.r-universe.dev\", getOption(\"repos\")))\n```\n\n## Usage\n\nWe have provided some application examples in the **[Getting\nStarted](https://nflfastr.com/articles/nflfastR.html)** article.\nHowever, these require a basic knowledge of R. 
For this reason we have\nthe **[nflfastR beginner’s\nguide](https://nflfastr.com/articles/beginners_guide.html)**, which we\nrecommend to all those who are looking for an introduction to nflfastR\nwith R.\n\nYou can find column names and descriptions in the **[Field\nDescriptions](https://nflfastr.com/articles/field_descriptions.html)**\narticle, or by accessing the `field_descriptions` dataframe from the\npackage.\n\n## Data access\n\nEven though `nflfastR` is very fast, **we recommend downloading the data\nfrom [here](https://github.com/nflverse/nflverse-data/releases/tag/pbp)\nor using the `nflreadr` package**. These data sets include play-by-play\ndata of complete seasons going back to 1999 and are updated nightly\nduring the season. The files contain both regular season and postseason\ndata, and one can use game_type or week to figure out which games\noccurred in the postseason.\n\n## nflfastR models\n\n`nflfastR` uses its own models for Expected Points, Win Probability,\nCompletion Probability, and Expected Yards After the Catch. 
To read\nabout the models, please see [this post on Open Source\nFootball](https://opensourcefootball.com/posts/2020-09-28-nflfastr-ep-wp-and-cp-models/).\nFor a more detailed description of the motivation for Expected Points\nmodels, we highly recommend this paper [from the nflscrapR team located\nhere](https://arxiv.org/pdf/1802.00998).\n\nHere is a visualization of the Expected Points model by down and\nyardline.\n\n<img src=\"man/figures/readme-epa-model-1.png\" alt=\"\" width=\"100%\" style=\"display: block; margin: auto;\" />\n\nHere is a visualization of the Completion Probability model by air yards\nand pass direction.\n\n<img src=\"man/figures/readme-cp-model-1.png\" alt=\"\" width=\"100%\" style=\"display: block; margin: auto;\" />\n\n`nflfastR` includes two win probability models: one with and one without\nincorporating the pre-game spread.\n\n## Special thanks\n\n- To Nick Shoemaker for [finding and making available JSON-formatted NFL\n  play-by-play back to 1999](https://github.com/CroppedClamp/nfl_pbps)\n  (`nflfastR` uses this source for 1999 and 2000 and previously also\n  used it for 2001-2010)\n- To Lau Sze Yui for developing a scraping function to access\n  JSON-formatted NFL play-by-play beginning in 2001\n- To Aaron Schatz and FTN Fantasy for providing charting data to\n  correctly mark scrambles in the 1999-2005 seasons\n- To Lee Sharpe for curating a resource for game information\n- To Timo Riske, Lau Sze Yui, Sean Clement, and Daniel Houston for many\n  helpful discussions regarding the development of the new `nflfastR`\n  models\n- To Zach Feldman and Josh Hermsmeyer for many helpful discussions about\n  CPOE models as well as Peter Owen for many helpful suggestions for the\n  CP model\n- To Florian Schmitt for the logo design\n- The many users who found and reported bugs in `nflfastR` 1.0\n- And of course, the original\n  [`nflscrapR`](https://github.com/maksimhorowitz/nflscrapR) team,\n  Maksim Horowitz, Ronald Yurko, and Samuel Ventura, 
whose work\n  represented a dramatic step forward for the state of public NFL\n  research\n"
  },
  {
    "path": "air.toml",
    "content": ""
  },
  {
    "path": "cran-comments.md",
    "content": "## Release summary\n\nThis is a minor release that \n* deprecates old functions, and\n* fixes bugs.\n\n## R CMD check results\n\n0 errors | 0 warnings | 0 notes\n\n## revdepcheck results\n\nWe checked 3 reverse dependencies, comparing R CMD check results across CRAN and dev versions of this package.\n\n * We saw 0 new problems\n * We failed to check 0 packages\n"
  },
  {
    "path": "data-raw/MODELS.R",
    "content": "################################################################################\n# Author: Ben Baldwin\n# Purpose: Estimate nflfastR models for EP, CP, Field Goals, and WP\n################################################################################\n\nlibrary(tidyverse)\nlibrary(xgboost)\nsource('R/helper_add_ep_wp.R')\nsource('R/helper_add_cp_cpoe.R')\nsource('R/helper_add_nflscrapr_mutations.R')\n\n\n################################################################################\n# Estimate EP model\n################################################################################\n\n# from remote\n# pbp_data <- readRDS(url('https://github.com/nflverse/nflfastR-data/blob/master/models/cal_data.rds?raw=true'))\n\n# from local\npbp_data <- readRDS('../nflfastR-data/models/cal_data.rds')\n\n#function in helper_add_ep_wp.R\nmodel_data <- pbp_data |>\n  make_model_mutations() |>\n  mutate(\n    label = case_when(\n      Next_Score_Half == \"Touchdown\" ~ 0,\n      Next_Score_Half == \"Opp_Touchdown\" ~ 1,\n      Next_Score_Half == \"Field_Goal\" ~ 2,\n      Next_Score_Half == \"Opp_Field_Goal\" ~ 3,\n      Next_Score_Half == \"Safety\" ~ 4,\n      Next_Score_Half == \"Opp_Safety\" ~ 5,\n      Next_Score_Half == \"No_Score\" ~ 6\n    ),\n    label = as.factor(label),\n    # Calculate the drive difference between the next score drive and the\n    # current play drive:\n    Drive_Score_Dist = Drive_Score_Half - drive,\n    # Create a weight column based on difference in drives between play and next score:\n    Drive_Score_Dist_W = (max(Drive_Score_Dist) - Drive_Score_Dist) /\n      (max(Drive_Score_Dist) - min(Drive_Score_Dist)),\n    # Create a weight column based on score differential:\n    ScoreDiff_W = (max(abs(score_differential), na.rm = T) -\n      abs(score_differential)) /\n      (max(abs(score_differential), na.rm = T) -\n        min(abs(score_differential), na.rm = T)),\n    # Add these weights together and scale again:\n    Total_W = 
Drive_Score_Dist_W + ScoreDiff_W,\n    Total_W_Scaled = (Total_W - min(Total_W, na.rm = T)) /\n      (max(Total_W, na.rm = T) - min(Total_W, na.rm = T))\n  ) |>\n  filter(\n    !is.na(defteam_timeouts_remaining),\n    !is.na(posteam_timeouts_remaining),\n    !is.na(yardline_100)\n  ) |>\n  select(\n    label,\n    half_seconds_remaining,\n    yardline_100,\n    home,\n    retractable,\n    dome,\n    outdoors,\n    ydstogo,\n    era0,\n    era1,\n    era2,\n    era3,\n    era4,\n    down1,\n    down2,\n    down3,\n    down4,\n    posteam_timeouts_remaining,\n    defteam_timeouts_remaining,\n    Total_W_Scaled\n  )\n\nnrounds = 525\nparams <-\n  list(\n    booster = \"gbtree\",\n    objective = \"multi:softprob\",\n    eval_metric = c(\"mlogloss\"),\n    num_class = 7,\n    eta = 0.025,\n    gamma = 1,\n    subsample = 0.8,\n    colsample_bytree = 0.8,\n    max_depth = 5,\n    min_child_weight = 1\n  )\n\nmodel_data <- model_data |>\n  mutate(label = as.numeric(label), label = label - 1)\n\nfull_train = xgboost::xgb.DMatrix(\n  model.matrix(~ . 
+ 0, data = model_data |> select(-label, -Total_W_Scaled)),\n  label = model_data$label,\n  weight = model_data$Total_W_Scaled\n)\n\nset.seed(2013) #GoHawks\nep_model <- xgboost::xgboost(\n  params = params,\n  data = full_train,\n  nrounds = nrounds,\n  verbose = 2\n)\n\n################################################################################\n# Estimate FG model\n################################################################################\n\nfg_model_data <- pbp_data |>\n  filter(\n    play_type %in%\n      c(\"field_goal\", \"extra_point\", \"run\") &\n      (!is.na(extra_point_result) | !is.na(field_goal_result))\n  ) |>\n  make_model_mutations()\n\n#estimate model\nfg_model <- mgcv::bam(\n  sp ~ s(yardline_100, by = interaction(era, model_roof)) + model_roof + era,\n  data = fg_model_data,\n  family = \"binomial\"\n)\n\n################################################################################\n# Estimate CP model\n################################################################################\n\nmodel_vars <- pbp_data |>\n  filter(season >= 2006) |>\n  make_model_mutations() |>\n  prepare_cp_data() |>\n  filter(valid_pass == 1) |>\n  select(-valid_pass)\n\nnrounds = 560\nparams <-\n  list(\n    booster = \"gbtree\",\n    objective = \"binary:logistic\",\n    eval_metric = c(\"logloss\"),\n    eta = 0.025,\n    gamma = 5,\n    subsample = 0.8,\n    colsample_bytree = 0.8,\n    max_depth = 4,\n    min_child_weight = 6,\n    base_score = mean(model_vars$complete_pass)\n  )\n\nfull_train = xgboost::xgb.DMatrix(\n  model.matrix(~ . 
+ 0, data = model_vars |> dplyr::select(-complete_pass)),\n  label = model_vars$complete_pass\n)\nset.seed(2013) #GoHawks\ncp_model <- xgboost::xgboost(\n  params = params,\n  data = full_train,\n  nrounds = nrounds,\n  verbose = 2\n)\n\n\n################################################################################\n# Estimate WP model: spread\n################################################################################\n\nmodel_data <-\n  readRDS(url(\n    'https://github.com/guga31bb/metrics/blob/master/wp_tuning/cal_data.rds?raw=true'\n  )) |>\n  filter(Winner != \"TIE\") |>\n  make_model_mutations() |>\n  prepare_wp_data() |>\n  mutate(label = ifelse(posteam == Winner, 1, 0)) |>\n  filter(\n    qtr <= 4 &\n      !is.na(ep) &\n      !is.na(score_differential) &\n      !is.na(play_type) &\n      !is.na(label),\n    !is.na(yardline_100)\n  ) |>\n  select(\n    label,\n    receive_2h_ko,\n    spread_time,\n    home,\n    half_seconds_remaining,\n    game_seconds_remaining,\n    Diff_Time_Ratio,\n    score_differential,\n    down,\n    ydstogo,\n    yardline_100,\n    posteam_timeouts_remaining,\n    defteam_timeouts_remaining\n  )\n\n\nnrounds = 534\nparams <-\n  list(\n    booster = \"gbtree\",\n    objective = \"binary:logistic\",\n    eval_metric = c(\"logloss\"),\n    eta = 0.05,\n    gamma = .79012017,\n    subsample = 0.9224245,\n    colsample_bytree = 5 / 12,\n    max_depth = 5,\n    min_child_weight = 7,\n    monotone_constraints = \"(0, 0, 0, 0, 0, 1, 1, -1, -1, -1, 1, -1)\"\n  )\n\n\nfull_train = xgboost::xgb.DMatrix(\n  model.matrix(~ . 
+ 0, data = model_data |> select(-label)),\n  label = model_data$label\n)\nset.seed(2013) #GoHawks\nwp_model_spread <- xgboost::xgboost(\n  params = params,\n  data = full_train,\n  nrounds = nrounds,\n  verbose = 2\n)\n\nimportance <- xgboost::xgb.importance(\n  feature_names = colnames(wp_model_spread),\n  model = wp_model_spread\n)\nxgboost::xgb.ggplot.importance(importance_matrix = importance)\n\n#xgboost::xgb.plot.tree(model = wp_model_spread, trees = 1, show_node_id = TRUE)\n\n################################################################################\n# Estimate WP model: no spread\n################################################################################\n\nmodel_data <- model_data |>\n  select(\n    -spread_time\n  )\n\nnrounds = 65\nparams <-\n  list(\n    booster = \"gbtree\",\n    objective = \"binary:logistic\",\n    eval_metric = c(\"logloss\"),\n    eta = 0.2,\n    gamma = 0,\n    subsample = 0.8,\n    colsample_bytree = 0.8,\n    max_depth = 4,\n    min_child_weight = 1\n  )\n\n\nfull_train = xgboost::xgb.DMatrix(\n  model.matrix(~ . + 0, data = model_data |> select(-label)),\n  label = model_data$label\n)\nset.seed(2013) #GoHawks\nwp_model <- xgboost::xgboost(\n  params = params,\n  data = full_train,\n  nrounds = nrounds,\n  verbose = 2\n)\n\n\n################################################################################\n# save models to use in package\n################################################################################\n\nusethis::use_data(\n  ep_model,\n  wp_model,\n  wp_model_spread,\n  fg_model,\n  cp_model,\n  internal = TRUE,\n  overwrite = TRUE\n)\n"
  },
  {
    "path": "data-raw/_tune_spread_wp.R",
    "content": "library(tidyverse)\nlibrary(tidymodels)\nsource('R/helper_add_ep_wp.R')\nsource('R/helper_add_nflscrapr_mutations.R')\n\nset.seed(2013)\n\nmodel_data <-\n  # readRDS('data-raw/cal_data.rds') |>\n  readRDS(url(\n    'https://github.com/nflverse/nflfastR-data/blob/master/models/cal_data.rds?raw=true'\n  )) |>\n  filter(Winner != \"TIE\") |>\n  make_model_mutations() |>\n  prepare_wp_data() |>\n  mutate(label = ifelse(posteam == Winner, 1, 0)) |>\n  filter(\n    !is.na(ep) &\n      !is.na(score_differential) &\n      !is.na(play_type) &\n      !is.na(label) &\n      !is.na(yardline_100),\n    qtr <= 4\n  ) |>\n  select(\n    label,\n    receive_2h_ko,\n    spread_time,\n    home,\n    half_seconds_remaining,\n    game_seconds_remaining,\n    ExpScoreDiff_Time_Ratio,\n    score_differential,\n    # ep,\n    down,\n    ydstogo,\n    yardline_100,\n    posteam_timeouts_remaining,\n    defteam_timeouts_remaining,\n    season\n  )\n\n\nfolds <- map(0:9, function(x) {\n  f <- which(model_data$season %in% c(2000 + x, 2010 + x))\n  return(f)\n})\n\n\nfull_train = xgboost::xgb.DMatrix(\n  model.matrix(~ . 
+ 0, data = model_data |> select(-label, -season)),\n  label = model_data$label\n)\n\n#params\nnrounds = 5000\n\n\n# #################################################################################\n# try tidymodels\n\ngrid <- grid_latin_hypercube(\n  finalize(mtry(), model_data),\n  min_n(),\n  tree_depth(),\n  learn_rate(),\n  loss_reduction(),\n  sample_size = sample_prop(),\n  size = 20\n)\n\ngrid <- grid |>\n  mutate(\n    # it was making dumb learn rates\n    learn_rate = .1 * ((1:nrow(grid)) / nrow(grid)),\n    # has to be between 0 and 1\n    mtry = mtry / length(model_data)\n  )\n\n# bonus round at the end: do more searching after finding good ones\ngrid <- grid |>\n  head(6) |>\n  mutate(\n    learn_rate = c(0.01, 0.02, .03, .04, .05, .06),\n    min_n = 14,\n    tree_depth = 5,\n    mtry = 0.5714286,\n    loss_reduction = 3.445502e-01,\n    sample_size = 0.7204741\n  )\n\ngrid\n\n# function to search over hyperparameter grid\nget_metrics <- function(df, row = 1) {\n  # testing only\n  # df <- grid |> dplyr::slice(1)\n\n  params <-\n    list(\n      booster = \"gbtree\",\n      objective = \"binary:logistic\",\n      eval_metric = c(\"logloss\"),\n      eta = df$learn_rate,\n      gamma = df$loss_reduction,\n      subsample = df$sample_size,\n      colsample_bytree = df$mtry,\n      max_depth = df$tree_depth,\n      min_child_weight = df$min_n\n    )\n\n  #train\n  wp_cv_model <- xgboost::xgb.cv(\n    data = full_train,\n    params = params,\n    nrounds = nrounds,\n    folds = folds,\n    metrics = list(\"logloss\"),\n    early_stopping_rounds = 10,\n    print_every_n = 10\n  )\n\n  output <- params\n  output$iter = wp_cv_model$best_iteration\n  output$logloss = wp_cv_model$evaluation_log[output$iter]$test_logloss_mean\n  output$error = wp_cv_model$evaluation_log[output$iter]$test_error_mean\n\n  this_param <- bind_rows(output)\n\n  if (row == 1) {\n    saveRDS(this_param, \"data-raw/modeling.rds\")\n  } else {\n    prev <- 
readRDS(\"data-raw/modeling.rds\")\n    for_save <- bind_rows(prev, this_param)\n    saveRDS(for_save, \"data-raw/modeling.rds\")\n  }\n\n  return(this_param)\n}\n\n# get results\nresults <- map_df(1:nrow(grid), function(x) {\n  message(glue::glue(\"Row {x}\"))\n  get_metrics(grid |> dplyr::slice(x), row = x)\n})\n\n# plot\nresults |>\n  select(\n    logloss,\n    eta,\n    gamma,\n    subsample,\n    colsample_bytree,\n    max_depth,\n    min_child_weight\n  ) |>\n  pivot_longer(\n    eta:min_child_weight,\n    values_to = \"value\",\n    names_to = \"parameter\"\n  ) |>\n  ggplot(aes(value, logloss, color = parameter)) +\n  geom_point(alpha = 0.8, show.legend = FALSE, size = 3) +\n  facet_wrap(~parameter, scales = \"free_x\") +\n  labs(x = NULL, y = \"logloss\") +\n  theme_minimal()\n\n# final best model\n#\n# eta 0.02\n# gamma 0.3445502\n# subsample 0.7204741\n# colsample_bytree 0.5714286\n# max_depth 5\n# min_child_weight 14\n# iter 760\n# logloss 0.4485878\n\n# https://parsnip.tidymodels.org/reference/boost_tree.html\n# https://xgboost.readthedocs.io/en/latest/parameter.html\n"
  },
  {
    "path": "data-raw/build_scramble_fix.R",
    "content": "library(tidyverse)\n\npbp <- nflfastR::load_pbp(1999:2005) |>\n  # plays that could plausibly be scramble\n  filter(\n    !is.na(rusher_player_id) | penalty == 1,\n    is.na(passer_player_id),\n    is.na(receiver_player_id)\n  ) |>\n  select(\n    season,\n    game_id,\n    play_id,\n    week,\n    away_team,\n    home_team,\n    posteam,\n    qtr,\n    down,\n    ydstogo,\n    time,\n    desc\n  ) |>\n  # not in scramble data this year\n  mutate(\n    time = case_when(\n      nchar(time) == 3 ~ paste0(\"00\", time),\n      nchar(time) == 4 ~ paste0(\"0\", time),\n      TRUE ~ time\n    )\n  )\n\n# Thank you to Aaron Schatz and Football Outsiders\n# For the charting data to fix scrambles in 2005\ns <- readxl::read_xlsx(\"data-raw/scrambles_2005.xlsx\") |>\n  as_tibble() |>\n  janitor::clean_names() |>\n  select(\n    season = year,\n    week,\n    qtr,\n    away_team = away,\n    home_team = home,\n    posteam = offense,\n    down,\n    ydstogo = togo,\n    date_time = time\n  )\n\n# Thank you to Aaron Schatz\n# For the charting data to fix scrambles in 1999 - 2004\ns2 <- readxl::read_xlsx(\n  \"data-raw/Scrambles 1999-2004 UPDATE for NFLfastR.xlsx\",\n  sheet = 1\n) |>\n  as_tibble() |>\n  janitor::clean_names() |>\n  filter(type %in% c(\"scramble\", \"assume scramble\")) |>\n  select(\n    season = year,\n    week,\n    qtr,\n    away_team = away,\n    home_team = home,\n    posteam = offense,\n    down,\n    ydstogo = togo,\n    date_time = time,\n    yards_gained = yards\n  )\n# s3 is a correction. 
The plays in s3 should be rushes and are therefore excluded from scramble_fix\n# see #475\ns3 <- readxl::read_xlsx(\n  \"data-raw/Scrambles.1999-2003.FURTHER.UPDATE.for.NFLfastR.xlsx\",\n  sheet = 1\n) |>\n  as_tibble() |>\n  janitor::clean_names() |>\n  select(\n    season = year,\n    week,\n    qtr,\n    away_team = away,\n    home_team = home,\n    posteam = offense,\n    down,\n    ydstogo = togo,\n    date_time = time,\n    yards_gained = yards\n  ) |>\n  mutate(\n    time = paste0(\n      formatC(lubridate::hour(date_time), width = 2, flag = \"0\"),\n      \":\",\n      formatC(lubridate::minute(date_time), width = 2, flag = \"0\")\n    )\n  ) |>\n  select(-date_time) |>\n  mutate_at(vars(home_team, away_team, posteam), nflfastR:::team_name_fn) |>\n  dplyr::left_join(\n    pbp,\n    by = c(\n      \"week\",\n      \"away_team\",\n      \"home_team\",\n      \"posteam\",\n      \"qtr\",\n      \"down\",\n      \"ydstogo\",\n      \"time\",\n      \"season\"\n    )\n  ) |>\n  mutate(no_scramble_id = paste0(game_id, \"_\", play_id))\n\ndat <- bind_rows(\n  s2,\n  s\n) |>\n  mutate(\n    time = paste0(\n      formatC(lubridate::hour(date_time), width = 2, flag = \"0\"),\n      \":\",\n      formatC(lubridate::minute(date_time), width = 2, flag = \"0\")\n    )\n  ) |>\n  select(-date_time) |>\n  mutate_at(vars(home_team, away_team, posteam), nflfastR:::team_name_fn)\n\nd <- dat |>\n  dplyr::left_join(\n    pbp,\n    by = c(\n      \"week\",\n      \"away_team\",\n      \"home_team\",\n      \"posteam\",\n      \"qtr\",\n      \"down\",\n      \"ydstogo\",\n      \"time\",\n      \"season\"\n    )\n  ) |>\n  mutate(scramble_id = paste0(game_id, \"_\", play_id)) |>\n  filter(scramble_id != \"2005_09_CIN_BAL_1725\") |>\n  filter(!scramble_id %in% s3$no_scramble_id)\n\n# number of non-matched plays by season\nnrow(d)\nd |> filter(is.na(desc)) |> group_by(season) |> summarise(n = n())\n\n# drop non-matched plays\nd <- d |>\n  filter(!is.na(desc))\nd |> group_by(season) |> 
summarise(n = n())\n\nscramble_fix <- d$scramble_id\nscramble_fix <- scramble_fix |>\n  unique()\nlength(scramble_fix)\nsaveRDS(scramble_fix, file = \"data-raw/scramble_fix.rds\")\n"
  },
  {
    "path": "data-raw/build_stat_id_df.R",
    "content": "stat_ids <- \"https://www.nflgsis.com/gsis/Documentation/Partners/StatIDs_files/sheet001.html\" |>\n  xml2::read_html() |>\n  rvest::html_table(fill = TRUE) |>\n  as.data.frame() |>\n  dplyr::rename(\"stat_id\" = X1, \"name\" = X2, \"comment\" = X3) |>\n  dplyr::select(1:3) |>\n  dplyr::slice(-1) |>\n  dplyr::mutate(stat_id = as.integer(stat_id)) |>\n  dplyr::filter(!is.na(stat_id)) |>\n  dplyr::group_by(stat_id, name) |>\n  dplyr::summarise(comment = paste0(comment, collapse = \" \")) |>\n  dplyr::ungroup() |>\n  dplyr::mutate(comment = stringr::str_squish(comment))\n\nusethis::use_data(stat_ids, overwrite = TRUE)\n"
  },
  {
    "path": "data-raw/compare_dfs.R",
    "content": "library(tidyverse)\nfuture::plan(\"multisession\")\n\n# function for comparing revisions against data in repo\n# make sure to build package first\ncompare_pbp <- function(id, cols) {\n  s <- substr(id[1], 1, 4) |> as.integer()\n  # no idea why this is necessary\n  games <- id\n\n  new_pbp <- build_nflfastR_pbp(\n    id\n    # comment this out to use the \"normal\" way\n    # , dir = \"../nflfastR-raw/raw\"\n  ) |>\n    filter(!stringr::str_detect(desc, \"GAME\")) |>\n    select(all_of(cols)) |>\n    # necessary to pass the equality checks\n    mutate(\n      ep = round(ep, 2),\n      epa = round(epa, 2),\n      vegas_home_wp = round(vegas_home_wp, 2),\n      vegas_home_wpa = round(vegas_home_wpa, 2),\n      home_wp = round(home_wp, 2)\n    )\n\n  repo_pbp <- readRDS(url(glue::glue(\n    \"https://raw.githubusercontent.com/guga31bb/nflfastR-data/master/data/play_by_play_{s}.rds\"\n  ))) |>\n    filter(game_id %in% games) |>\n    filter(!stringr::str_detect(desc, \"GAME\")) |>\n    select(all_of(cols)) |>\n    mutate(\n      ep = round(ep, 2),\n      epa = round(epa, 2),\n      vegas_home_wp = round(vegas_home_wp, 2),\n      vegas_home_wpa = round(vegas_home_wpa, 2),\n      home_wp = round(home_wp, 2)\n    )\n\n  sum <- arsenal::diffs(arsenal::comparedf(\n    new_pbp |> select(-desc, -game_id, -play_id),\n    repo_pbp |> select(-desc, -game_id, -play_id)\n  ))\n  dfs <- bind_cols(\n    new_pbp |> select(-desc, -game_id, -play_id),\n    repo_pbp |> select(-desc, -game_id, -play_id)\n  )\n\n  dfs$desc <- new_pbp$desc\n  dfs$play_id <- new_pbp$play_id\n  dfs$game_id <- new_pbp$game_id\n\n  return(\n    list(sum, dfs)\n  )\n}\n\ncols <- c(\n  # DO NOT REMOVE THESE ONES OR THE COMPARISON WILL BREAK\n  \"game_id\",\n  \"play_id\",\n  \"desc\",\n  \"ep\",\n  \"epa\",\n  \"vegas_home_wp\",\n  \"vegas_home_wpa\",\n  \"home_wp\"\n\n  # here is stuff you can choose whether to include\n  # , \"posteam_timeouts_remaining\", \"defteam_timeouts_remaining\"\n)\n\nid 
<- \"2002_05_PHI_JAX\"\nid <- \"2006_01_MIA_PIT\"\nid <- \"2006_02_PIT_JAX\"\nid <- \"2017_08_LAC_NE\"\nid <- \"2006_04_JAX_WAS\"\nid <- \"2019_01_SF_TB\"\nid <- \"2017_12_JAX_ARI\"\n\nids <- nflfastR::fast_scraper_schedules(2020) |>\n  dplyr::slice(1:20) |>\n  pull(game_id)\n\ncompared <- compare_pbp(\n  id = ids,\n  cols = cols\n)\n\n# summary table\ncompared[[1]]\n\n# get row numbers of things with differences\nobs <- compared[[1]]$..row.names.. |> unique()\n\n# dfs\ncompared[[2]] |> arrange(play_id)\n\n# dfs with differences\ncompared[[2]][obs, ] |> arrange(play_id)\n\n# play description of plays with differences\ncompared[[2]][obs, ] |> arrange(play_id) |> select(desc)\n"
  },
  {
    "path": "data-raw/create_field_descriptions.R",
    "content": "library(dplyr)\nlibrary(tidyr)\nlibrary(stringr)\nlibrary(usethis)\n\nx <- readLines(\"data-raw/variable_list.txt\")\n\nfield_descriptions <- tibble(x = x) |>\n  separate(x, \"{\", into = c(NA, \"Field\", \"Description\")) |>\n  mutate_all(str_remove_all, \"\\\\}\")\n\nusethis::use_data(field_descriptions, overwrite = TRUE)\n# save(field_descriptions, file = \"vignettes/field_descriptions.rda\")\n"
  },
  {
    "path": "data-raw/default_play.R",
    "content": "### Create datatype dataframe\n### This is a db that is stored on Seb's local machine\nconnection <- DBI::dbConnect(duckdb::duckdb(), \"../_data_cache/pbp_db\")\npbp_db <- dplyr::tbl(connection, \"nflfastR_pbp\")\n\n### This is heavy, only run if necessary\nsk <- pbp_db |>\n  dplyr::collect() |>\n  skimr::skim()\n\nreadr::write_csv(sk, \"data-raw/pbp_datatypes.csv\")\n\nsk <- readr::read_csv(\"data-raw/pbp_datatypes.csv\")\n\nrandom_play <- pbp_db |>\n  dplyr::filter(season == 1999) |>\n  dplyr::collect() |>\n  dplyr::slice_sample(n = 1) |>\n  as.list()\n\ndefault_play <-\n  purrr::map(\n    seq_along(random_play),\n    function(i, play, sk) {\n      val <- play[[i]]\n      var <- names(play[i])\n      if (is.character(val)) {\n        max_char <- sk$character.max[sk$skim_variable == var]\n        rnd_char <- ifelse(is.na(val), 0L, nchar(val))\n        if (is.na(max_char)) {\n          max_char <- switch(\n            var,\n            \"lateral_sack_player_id\" = 10L,\n            \"lateral_sack_player_name\" = 20L,\n            \"tackle_with_assist_2_player_id\" = 10L,\n            \"tackle_with_assist_2_player_name\" = 20L,\n            \"tackle_with_assist_2_team\" = 3L,\n            \"drive_real_start_time\" = 20L\n          )\n        }\n        if (max_char > rnd_char) {\n          val <- strsplit(val, character(0L))[[1]] |>\n            sample(size = max_char, replace = TRUE) |>\n            paste0(collapse = \"\")\n        }\n      }\n      val\n    },\n    play = random_play,\n    sk = sk\n  ) |>\n  purrr::set_names(names(random_play)) |>\n  tibble::as_tibble_row() |>\n  dplyr::mutate(game_id = \"9999_99_DEF_TYP\")\n\nreadr::write_csv(default_play, \"data-raw/pbp_defaultplay.csv\")\nsaveRDS(default_play, \"data-raw/pbp_defaultplay.rds\")\n"
  },
  {
    "path": "data-raw/nfl_stats_variables.R",
    "content": "s1 <- calculate_stats(2023, \"season\", \"player\")\ns2 <- calculate_stats(2023, \"week\", \"player\")\ns3 <- calculate_stats(2023, \"season\", \"team\")\ns4 <- calculate_stats(2023, \"week\", \"team\")\n\nn1 <- names(s1)\nn2 <- names(s2)\nn3 <- names(s3)\nn4 <- names(s4)\n\nsetdiff(n1, n2)\nsetdiff(n2, n1)\n\nsetdiff(n1, n3)\nsetdiff(n3, n1)\n\n# tibble::tibble(\n#   variable = c(n1, n2, n3, n4) |> unique(),\n#   description = \"\"\n# ) |>\n#   jsonlite::write_json(\"data-raw/nfl_stats_variables.json\", pretty = TRUE)\n\nnfl_stats_variables <- jsonlite::fromJSON(\"data-raw/nfl_stats_variables.json\")\n\nusethis::use_data(nfl_stats_variables, overwrite = TRUE)\n\ns_old_1 <- nflreadr::load_player_stats(2023, \"offense\")\ns_old_2 <- nflreadr::load_player_stats(2023, \"defense\")\ns_old_3 <- nflreadr::load_player_stats(2023, \"kicking\")\nn_old_1 <- names(s_old_1)\nn_old_2 <- names(s_old_2)\nn_old_3 <- names(s_old_3)\n\n\n# Differences to old offense stats ----------------------------------------\n\n# recent_team -> team (recent team in weekly data never made sense)\n# interceptions -> passing_interceptions (all passing stats have the passing prefix)\n# sacks -> sacks_suffered (to make clear it's not on defensive side)\n# sack_yards -> sack_yards_lost (to make clear it's not on defensive side)\n# dakota -> not implemented at the moment\nsetdiff(n_old_1, n2)\nsetdiff(n2, n_old_1)\n\n# Differences to old defense stats ----------------------------------------\n\n# def_tackles -> there is def_tackles_solo and def_tackles_with_assist\n# def_fumble_recovery_own -> fumble_recovery_own (it is not exclusive to defense)\n# def_fumble_recovery_yards_own -> fumble_recovery_yards_own (it is not exclusive to defense)\n# def_fumble_recovery_opp -> fumble_recovery_opp (it is not exclusive to defense)\n# def_fumble_recovery_yards_opp -> fumble_recovery_yards_opp (it is not exclusive to defense)\n# def_safety -> def_safeties (we use plural everywhere)\n# def_penalty -> 
penalties (it is not exclusive to defense)\n# def_penalty_yards -> penalty_yards (it is not exclusive to defense)\nsetdiff(n_old_2, n2)\nsetdiff(n2, n_old_2)\n\n\n# Differences to old kicking stats ----------------------------------------\n\n# No differences\nsetdiff(n_old_3, n2)\nsetdiff(n2, n_old_3)\n"
  },
  {
    "path": "data-raw/nfl_stats_variables.json",
    "content": "[\n  {\n    \"variable\": \"player_id\",\n    \"description\": \"GSIS player ID. Available if stat_type = 'player'.\"\n  },\n  {\n    \"variable\": \"player_name\",\n    \"description\": \"Short player name as listed in play-by-play data. Please keep in mind that this name is not always unique for one player and can change from season to season and sometimes even within a season. Do not group by this variable. Available if stat_type = 'player'.\"\n  },\n  {\n    \"variable\": \"player_display_name\",\n    \"description\": \"Full name of player. Available if stat_type = 'player'.\"\n  },\n  {\n    \"variable\": \"position\",\n    \"description\": \"Position of player. Available if stat_type = 'player'.\"\n  },\n  {\n    \"variable\": \"position_group\",\n    \"description\": \"Position group of player. Available if stat_type = 'player'.\"\n  },\n  {\n    \"variable\": \"headshot_url\",\n    \"description\": \"URL to a player headshot image. Available if stat_type = 'player'.\"\n  },\n  {\n    \"variable\": \"season\",\n    \"description\": \"The NFL season\"\n  },\n  {\n    \"variable\": \"week\",\n    \"description\": \"The NFL week. Available if summary_level = 'week'\"\n  },\n  {\n    \"variable\": \"season_type\",\n    \"description\": \"One of 'REG', 'POST', or 'REG+POST'\"\n  },\n  {\n    \"variable\": \"game_id\",\n    \"description\": \"The nflverse game id of the form '{season}_{week}_{away abbreviation}_{home abbreviation}'. Available if summary_level = 'week'\"\n  },\n  {\n    \"variable\": \"recent_team\",\n    \"description\": \"Most recent team player appears in data with. Available if stat_type = 'player' & summary_level = 'season'.\"\n  },\n  {\n    \"variable\": \"team\",\n    \"description\": \"Team stats are counted for.\"\n  },\n  {\n    \"variable\": \"opponent_team\",\n    \"description\": \"The opponent team in that week. 
Available if summary_level = 'week'.\"\n  },\n  {\n    \"variable\": \"games\",\n    \"description\": \"The number of games where stats were counted. Available if summary_level = 'season'.\"\n  },\n  {\n    \"variable\": \"completions\",\n    \"description\": \"The number of completed passes.\"\n  },\n  {\n    \"variable\": \"attempts\",\n    \"description\": \"The number of pass attempts as defined by the NFL.\"\n  },\n  {\n    \"variable\": \"passing_yards\",\n    \"description\": \"Yards gained on pass plays.\"\n  },\n  {\n    \"variable\": \"passing_tds\",\n    \"description\": \"The number of passing touchdowns.\"\n  },\n  {\n    \"variable\": \"passing_interceptions\",\n    \"description\": \"The number of interceptions thrown.\"\n  },\n  {\n    \"variable\": \"sacks_suffered\",\n    \"description\": \"The number of times sacked.\"\n  },\n  {\n    \"variable\": \"sack_yards_lost\",\n    \"description\": \"Yards lost on sack plays.\"\n  },\n  {\n    \"variable\": \"sack_fumbles\",\n    \"description\": \"The number of sacks with a fumble.\"\n  },\n  {\n    \"variable\": \"sack_fumbles_lost\",\n    \"description\": \"The number of sacks with a lost fumble.\"\n  },\n  {\n    \"variable\": \"passing_air_yards\",\n    \"description\": \"Passing air yards (includes incomplete passes).\"\n  },\n  {\n    \"variable\": \"passing_yards_after_catch\",\n    \"description\": \"Yards after the catch gained on plays in which player was the passer (this is an unofficial stat and may differ slightly between different sources).\"\n  },\n  {\n    \"variable\": \"passing_first_downs\",\n    \"description\": \"First downs on pass attempts.\"\n  },\n  {\n    \"variable\": \"passing_epa\",\n    \"description\": \"Total expected points added on pass attempts and sacks. 
NOTE: this uses the variable `qb_epa`, which gives the QB credit for EPA up to the point where a receiver lost a fumble after a completed catch and makes EPA work more like passing yards on plays with fumbles.\"\n  },\n  {\n    \"variable\": \"passing_cpoe\",\n    \"description\": \"Completion percentage over expectation\"\n  },\n  {\n    \"variable\": \"passing_2pt_conversions\",\n    \"description\": \"Two-point conversion passes.\"\n  },\n  {\n    \"variable\": \"pacr\",\n    \"description\": \"Passing Air Conversion Ratio. PACR = `passing_yards` / `passing_air_yards`. Available if stat_type = 'player'.\"\n  },\n  {\n    \"variable\": \"passing_10\",\n    \"description\": \"The number of passes that gained 10 or more yards. Some define this as an 'explosive' play.\"\n  },\n  {\n    \"variable\": \"passing_16\",\n    \"description\": \"The number of passes that gained 16 or more yards. Some define this as an 'explosive' play.\"\n  },\n  {\n    \"variable\": \"passing_20\",\n    \"description\": \"The number of passes that gained 20 or more yards. Some define this as an 'explosive' play.\"\n  },\n  {\n    \"variable\": \"passing_40\",\n    \"description\": \"The number of passes that gained 40 or more yards. Some define this as an 'explosive' play.\"\n  },\n  {\n    \"variable\": \"carries\",\n    \"description\": \"The number of official rush attempts (incl. scrambles and kneel downs). Rushes after a lateral reception don't count as a carry.\"\n  },\n  {\n    \"variable\": \"rushing_yards\",\n    \"description\": \"Yards gained when rushing with the ball (incl. scrambles and kneel downs). Also includes yards gained after obtaining a lateral on a play that started with a rushing attempt.\"\n  },\n  {\n    \"variable\": \"rushing_tds\",\n    \"description\": \"The number of rushing touchdowns (incl. scrambles). 
Also includes touchdowns after obtaining a lateral on a play that started with a rushing attempt.\"\n  },\n  {\n    \"variable\": \"rushing_fumbles\",\n    \"description\": \"The number of rushes with a fumble.\"\n  },\n  {\n    \"variable\": \"rushing_fumbles_lost\",\n    \"description\": \"The number of rushes with a lost fumble.\"\n  },\n  {\n    \"variable\": \"rushing_first_downs\",\n    \"description\": \"First downs on rush attempts (incl. scrambles).\"\n  },\n  {\n    \"variable\": \"rushing_epa\",\n    \"description\": \"Expected points added on rush attempts (incl. scrambles and kneel downs).\"\n  },\n  {\n    \"variable\": \"rushing_2pt_conversions\",\n    \"description\": \"Two-point conversion rushes.\"\n  },\n  {\n    \"variable\": \"rushing_10\",\n    \"description\": \"The number of runs that gained 10 or more yards. Some define this as an 'explosive' play.\"\n  },\n  {\n    \"variable\": \"rushing_12\",\n    \"description\": \"The number of runs that gained 12 or more yards. Some define this as an 'explosive' play.\"\n  },\n  {\n    \"variable\": \"rushing_20\",\n    \"description\": \"The number of runs that gained 20 or more yards. Some define this as an 'explosive' play.\"\n  },\n  {\n    \"variable\": \"rushing_40\",\n    \"description\": \"The number of runs that gained 40 or more yards. Some define this as an 'explosive' play.\"\n  },\n  {\n    \"variable\": \"receptions\",\n    \"description\": \"The number of pass receptions. Lateral receptions officially don't count as receptions.\"\n  },\n  {\n    \"variable\": \"targets\",\n    \"description\": \"The number of pass plays where the player was the targeted receiver.\"\n  },\n  {\n    \"variable\": \"receiving_yards\",\n    \"description\": \"Yards gained after a pass reception. Includes yards gained after receiving a lateral on a play that started as a pass play.\"\n  },\n  {\n    \"variable\": \"receiving_tds\",\n    \"description\": \"The number of touchdowns following a pass reception. 
Also includes touchdowns after receiving a lateral on a play that started as a pass play.\"\n  },\n  {\n    \"variable\": \"receiving_fumbles\",\n    \"description\": \"The number of fumbles after a pass reception.\"\n  },\n  {\n    \"variable\": \"receiving_fumbles_lost\",\n    \"description\": \"The number of fumbles lost after a pass reception.\"\n  },\n  {\n    \"variable\": \"receiving_air_yards\",\n    \"description\": \"Receiving air yards (incl. incomplete passes).\"\n  },\n  {\n    \"variable\": \"receiving_yards_after_catch\",\n    \"description\": \"Yards after the catch gained on plays in which player was receiver (this is an unofficial stat and may differ slightly between different sources).\"\n  },\n  {\n    \"variable\": \"receiving_first_downs\",\n    \"description\": \"First downs on receptions.\"\n  },\n  {\n    \"variable\": \"receiving_epa\",\n    \"description\": \"Expected points added on receptions.\"\n  },\n  {\n    \"variable\": \"receiving_2pt_conversions\",\n    \"description\": \"Two-point conversion receptions.\"\n  },\n  {\n    \"variable\": \"receiving_10\",\n    \"description\": \"The number of receptions that gained 10 or more yards. Some define this as an 'explosive' play.\"\n  },\n  {\n    \"variable\": \"receiving_16\",\n    \"description\": \"The number of receptions that gained 16 or more yards. Some define this as an 'explosive' play.\"\n  },\n  {\n    \"variable\": \"receiving_20\",\n    \"description\": \"The number of receptions that gained 20 or more yards. Some define this as an 'explosive' play.\"\n  },\n  {\n    \"variable\": \"receiving_40\",\n    \"description\": \"The number of receptions that gained 40 or more yards. Some define this as an 'explosive' play.\"\n  },\n  {\n    \"variable\": \"racr\",\n    \"description\": \"Receiver Air Conversion Ratio. RACR = `receiving_yards` / `receiving_air_yards`. 
Available if stat_type = 'player'.\"\n  },\n  {\n    \"variable\": \"target_share\",\n    \"description\": \"The player's share of his team's total targets. Available if stat_type = 'player'.\"\n  },\n  {\n    \"variable\": \"air_yards_share\",\n    \"description\": \"The player's `receiving_air_yards` as a share of his team's total air yards. Available if stat_type = 'player'.\"\n  },\n  {\n    \"variable\": \"wopr\",\n    \"description\": \"Weighted Opportunity Rating. WOPR = 1.5 × `target_share` + 0.7 × `air_yards_share`. Available if stat_type = 'player'.\"\n  },\n  {\n    \"variable\": \"special_teams_tds\",\n    \"description\": \"The number of touchdowns scored in special teams plays.\"\n  },\n  {\n    \"variable\": \"def_tackles_solo\",\n    \"description\": \"Solo tackles.\"\n  },\n  {\n    \"variable\": \"def_tackles_with_assist\",\n    \"description\": \"Tackles where another player assisted.\"\n  },\n  {\n    \"variable\": \"def_tackle_assists\",\n    \"description\": \"Assists on another player's tackle.\"\n  },\n  {\n    \"variable\": \"def_tackles_for_loss\",\n    \"description\": \"Tackles for loss.\"\n  },\n  {\n    \"variable\": \"def_tackles_for_loss_yards\",\n    \"description\": \"Yards lost by the opposing team through a tackle.\"\n  },\n  {\n    \"variable\": \"def_fumbles_forced\",\n    \"description\": \"Forced fumbles.\"\n  },\n  {\n    \"variable\": \"def_sacks\",\n    \"description\": \"Number of sacks.\"\n  },\n  {\n    \"variable\": \"def_sack_yards\",\n    \"description\": \"Yards lost by the opposing team through a sack.\"\n  },\n  {\n    \"variable\": \"def_qb_hits\",\n    \"description\": \"Number of QB hits.\"\n  },\n  {\n    \"variable\": \"def_interceptions\",\n    \"description\": \"Interceptions caught.\"\n  },\n  {\n    \"variable\": \"def_interception_yards\",\n    \"description\": \"Yards gained after interceptions.\"\n  },\n  {\n    \"variable\": \"def_pass_defended\",\n    \"description\": \"Number of 
defended passes.\"\n  },\n  {\n    \"variable\": \"def_tds\",\n    \"description\": \"Defensive touchdowns.\"\n  },\n  {\n    \"variable\": \"def_fumbles\",\n    \"description\": \"Number of fumbles while playing on defense.\"\n  },\n  {\n    \"variable\": \"def_safeties\",\n    \"description\": \"Tackles that resulted in a safety.\"\n  },\n  {\n    \"variable\": \"misc_yards\",\n    \"description\": \"Yardage gained/lost that doesn't fall into any other category. Examples are blocked field goals or blocked punts.\"\n  },\n  {\n    \"variable\": \"fumble_recovery_own\",\n    \"description\": \"Recovered fumbles where the ball was fumbled by own team.\"\n  },\n  {\n    \"variable\": \"fumble_recovery_yards_own\",\n    \"description\": \"Yardage gained/lost by a player after he recovered a fumble by his own team. Includes yardage gained/lost where a teammate recovered a fumble and lateraled the ball to the player.\"\n  },\n  {\n    \"variable\": \"fumble_recovery_opp\",\n    \"description\": \"Recovered fumbles where the ball was fumbled by opposing team.\"\n  },\n  {\n    \"variable\": \"fumble_recovery_yards_opp\",\n    \"description\": \"Yardage gained/lost by a player after he recovered a fumble by the opposing team. Includes yardage gained/lost where a teammate recovered a fumble and lateraled the ball to the player.\"\n  },\n  {\n    \"variable\": \"fumble_recovery_tds\",\n    \"description\": \"Touchdowns scored after a fumble recovery. This can happen in any unit, and either the player's own team or the opposing team may have fumbled the ball initially. Includes touchdowns where a teammate recovered a fumble and lateraled the ball to the player.\"\n  },\n  {\n    \"variable\": \"penalties\",\n    \"description\": \"Penalties caused.\"\n  },\n  {\n    \"variable\": \"penalty_yards\",\n    \"description\": \"Yardage lost through penalties.\"\n  },\n  {\n    \"variable\": \"timeouts\",\n    \"description\": \"Number of timeouts taken by team. 
Available if stat_type = 'team'.\"\n  },\n  {\n    \"variable\": \"fumbles_forced_by_opp\",\n    \"description\": \"The number of fumbles by the player. The fumble was forced by the opponent.\"\n  },\n  {\n    \"variable\": \"fumbles_not_forced\",\n    \"description\": \"The number of fumbles by the player. The fumble was NOT forced by the opponent.\"\n  },\n  {\n    \"variable\": \"fumbles_out_of_bounds\",\n    \"description\": \"The number of fumbles by the player, and the ball went out of bounds. The fumble may or may not have been forced.\"\n  },\n  {\n    \"variable\": \"fumbles_total\",\n    \"description\": \"The total number of fumbles by the player. Equals `fumbles_forced_by_opp` + `fumbles_not_forced`.\"\n  },\n  {\n    \"variable\": \"fumbles_lost_total\",\n    \"description\": \"The total number of fumbles lost by the player.\"\n  },\n  {\n    \"variable\": \"punt_returns\",\n    \"description\": \"Number of punts returned.\"\n  },\n  {\n    \"variable\": \"punt_return_yards\",\n    \"description\": \"Yardage gained/lost by a player during a punt return.\"\n  },\n  {\n    \"variable\": \"kickoff_returns\",\n    \"description\": \"Number of kickoffs returned.\"\n  },\n  {\n    \"variable\": \"kickoff_return_yards\",\n    \"description\": \"Yardage gained/lost by a player during a kickoff return.\"\n  },\n  {\n    \"variable\": \"fg_made\",\n    \"description\": \"Successful field goal attempts.\"\n  },\n  {\n    \"variable\": \"fg_att\",\n    \"description\": \"Attempted field goals.\"\n  },\n  {\n    \"variable\": \"fg_missed\",\n    \"description\": \"Missed field goals.\"\n  },\n  {\n    \"variable\": \"fg_blocked\",\n    \"description\": \"Attempted field goals that were blocked.\"\n  },\n  {\n    \"variable\": \"fg_long\",\n    \"description\": \"Distance of longest made field goal.\"\n  },\n  {\n    \"variable\": \"fg_pct\",\n    \"description\": \"Percentage of successful field goal attempts.\"\n  },\n  {\n    \"variable\": \"fg_made_0_19\",\n    
\"description\": \"Successful field goal attempts where distance was between 0 and 19 yards.\"\n  },\n  {\n    \"variable\": \"fg_made_20_29\",\n    \"description\": \"Successful field goal attempts where distance was between 20 and 29 yards.\"\n  },\n  {\n    \"variable\": \"fg_made_30_39\",\n    \"description\": \"Successful field goal attempts where distance was between 30 and 39 yards.\"\n  },\n  {\n    \"variable\": \"fg_made_40_49\",\n    \"description\": \"Successful field goal attempts where distance was between 40 and 49 yards.\"\n  },\n  {\n    \"variable\": \"fg_made_50_59\",\n    \"description\": \"Successful field goal attempts where distance was between 50 and 59 yards.\"\n  },\n  {\n    \"variable\": \"fg_made_60_\",\n    \"description\": \"Successful field goal attempts where distance was 60+ yards.\"\n  },\n  {\n    \"variable\": \"fg_missed_0_19\",\n    \"description\": \"Missed field goal attempts where distance was between 0 and 19 yards.\"\n  },\n  {\n    \"variable\": \"fg_missed_20_29\",\n    \"description\": \"Missed field goal attempts where distance was between 20 and 29 yards.\"\n  },\n  {\n    \"variable\": \"fg_missed_30_39\",\n    \"description\": \"Missed field goal attempts where distance was between 30 and 39 yards.\"\n  },\n  {\n    \"variable\": \"fg_missed_40_49\",\n    \"description\": \"Missed field goal attempts where distance was between 40 and 49 yards.\"\n  },\n  {\n    \"variable\": \"fg_missed_50_59\",\n    \"description\": \"Missed field goal attempts where distance was between 50 and 59 yards.\"\n  },\n  {\n    \"variable\": \"fg_missed_60_\",\n    \"description\": \"Missed field goal attempts where distance was 60+ yards.\"\n  },\n  {\n    \"variable\": \"fg_made_list\",\n    \"description\": \"Distances of all successful field goal attempts.\"\n  },\n  {\n    \"variable\": \"fg_missed_list\",\n    \"description\": \"Distances of all missed field goal attempts.\"\n  },\n  {\n    \"variable\": \"fg_blocked_list\",\n    
\"description\": \"Distances of all blocked field goal attempts.\"\n  },\n  {\n    \"variable\": \"fg_made_distance\",\n    \"description\": \"Sum of distances of all made field goals.\"\n  },\n  {\n    \"variable\": \"fg_missed_distance\",\n    \"description\": \"Sum of distances of all missed field goals.\"\n  },\n  {\n    \"variable\": \"fg_blocked_distance\",\n    \"description\": \"Sum of distances of all blocked field goals.\"\n  },\n  {\n    \"variable\": \"pat_made\",\n    \"description\": \"Successful extra point attempts.\"\n  },\n  {\n    \"variable\": \"pat_att\",\n    \"description\": \"Attempted extra points.\"\n  },\n  {\n    \"variable\": \"pat_missed\",\n    \"description\": \"Missed extra points.\"\n  },\n  {\n    \"variable\": \"pat_blocked\",\n    \"description\": \"Extra points blocked by opponent.\"\n  },\n  {\n    \"variable\": \"pat_pct\",\n    \"description\": \"Percentage of successful extra point attempts.\"\n  },\n  {\n    \"variable\": \"gwfg_made\",\n    \"description\": \"Successful game winning field goal attempts.\"\n  },\n  {\n    \"variable\": \"gwfg_att\",\n    \"description\": \"Attempted game winning field goals.\"\n  },\n  {\n    \"variable\": \"gwfg_missed\",\n    \"description\": \"Missed game winning field goals.\"\n  },\n  {\n    \"variable\": \"gwfg_blocked\",\n    \"description\": \"Game winning field goal attempts blocked by opponent.\"\n  },\n  {\n    \"variable\": \"gwfg_distance\",\n    \"description\": \"Distance of game winning field goal attempt. Available if summary_level = 'week'.\"\n  },\n  {\n    \"variable\": \"gwfg_distance_list\",\n    \"description\": \"Distances of game winning field goal attempts. 
Available if summary_level = 'season'.\"\n  },\n  {\n    \"variable\": \"pt_att\",\n    \"description\": \"Kicked punts.\"\n  },\n  {\n    \"variable\": \"pt_blocked\",\n    \"description\": \"Kicked punts that were blocked.\"\n  },\n  {\n    \"variable\": \"pt_long\",\n    \"description\": \"Longest punt kicked.\"\n  },\n  {\n    \"variable\": \"pt_yards\",\n    \"description\": \"Length of punts kicked.\"\n  },\n  {\n    \"variable\": \"pt_inside_20\",\n    \"description\": \"The number of punts where the RETURN ended inside the opponent's 20 yard line.\"\n  },\n  {\n    \"variable\": \"pt_out_of_bounds\",\n    \"description\": \"The number of punts that went out of bounds without return.\"\n  },\n  {\n    \"variable\": \"pt_downed\",\n    \"description\": \"The number of punts downed without return.\"\n  },\n  {\n    \"variable\": \"pt_touchback\",\n    \"description\": \"The number of punts that resulted in a touchback.\"\n  },\n  {\n    \"variable\": \"pt_fair_caught\",\n    \"description\": \"The number of punts that resulted in a fair catch.\"\n  },\n  {\n    \"variable\": \"pt_returned\",\n    \"description\": \"The number of punts that were returned by the opponent team.\"\n  },\n  {\n    \"variable\": \"pt_return_yards\",\n    \"description\": \"The punt return yardage of the opponent team.\"\n  },\n  {\n    \"variable\": \"pt_return_tds\",\n    \"description\": \"The number of punts that were returned for a touchdown by the opponent team.\"\n  },\n  {\n    \"variable\": \"pt_net_yards\",\n    \"description\": \"Net punt yardage. Equals `pt_yards` - `pt_return_yards` - `pt_touchback` * `20`.\"\n  },\n  {\n    \"variable\": \"fantasy_points\",\n    \"description\": \"Standard fantasy points.\"\n  },\n  {\n    \"variable\": \"fantasy_points_ppr\",\n    \"description\": \"PPR fantasy points.\"\n  }\n]\n"
  },
  {
    "path": "data-raw/pbp_datatypes.csv",
    "content": "skim_type,skim_variable,n_missing,complete_rate,character.min,character.max,character.empty,character.n_unique,character.whitespace,numeric.mean,numeric.sd,numeric.p0,numeric.p25,numeric.p50,numeric.p75,numeric.p100,numeric.hist\ncharacter,game_id,0,1,13,15,0,6134,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,old_game_id,0,1,10,10,0,6134,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,home_team,0,1,2,3,0,32,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,away_team,0,1,2,3,0,32,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,season_type,0,1,3,4,0,2,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,posteam,74782,0.931927869867191,0,3,104,33,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,posteam_type,74782,0.931927869867191,4,4,0,2,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,defteam,74782,0.931927869867191,2,3,0,32,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,side_of_field,85173,0.9224692099729649,0,5,79,38,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,game_date,0,1,10,10,0,1169,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,game_half,0,1,5,8,0,3,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,time,1137,0.9989650181599716,0,5,62,1502,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,yrdln,7356,0.9933040225019798,0,6,2,1659,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,desc,0,1,4,1079,0,990769,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,play_type,50328,0.9541877167590596,3,11,0,9,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,pass_length,806758,0.26562895400384134,4,5,0,2,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,pass_location,806698,0.26568357045977953,4,6,0,3,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,run_location,778466,0.2913824335272217,4,6,0,3,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,run_gap,870143,0.20793121967648853,3,6,0,3,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,field_goal_result,1075311,0.02117206914443326,4,7,0,3,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,extra_point_result,1070162,0.025859071338194206,4,7,0,4,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,two_point_conv_result,1096605,0.0017886889319752575,7,7,0,2,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,timeout_team,1053072,0.04141565853791751,2,3,0,32,0,NA,NA,NA,NA,NA,NA,NA
,NA\ncharacter,td_team,1068075,0.02775881373057698,2,3,0,32,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,td_player_name,1068075,0.02775881373057698,5,19,0,2989,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,td_player_id,1068076,0.02775790345631135,10,10,0,3264,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,passer_player_id,653500,0.40513576740671964,10,10,0,767,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,passer_player_name,653500,0.40513576740671964,5,17,0,802,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,receiver_player_id,715364,0.34882256023739955,10,10,0,3088,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,receiver_player_name,715364,0.34882256023739955,5,18,0,2971,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,rusher_player_id,764576,0.3040261430769091,10,10,0,2262,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,rusher_player_name,764576,0.3040261430769091,5,18,0,2206,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,lateral_receiver_player_id,1098347,2.0299116123689842e-4,10,10,0,181,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,lateral_receiver_player_name,1098347,2.0299116123689842e-4,6,16,0,181,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,lateral_rusher_player_id,1098531,3.550069635982478e-5,10,10,0,37,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,lateral_rusher_player_name,1098531,3.550069635982478e-5,6,13,0,37,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,lateral_sack_player_id,1098570,0,NA,NA,0,0,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,lateral_sack_player_name,1098570,0,NA,NA,0,0,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,interception_player_id,1086906,0.01061743903438106,10,10,0,1982,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,interception_player_name,1086906,0.01061743903438106,5,19,0,1904,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,lateral_interception_player_id,1098491,7.19116669852804e-5,10,10,0,70,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,lateral_interception_player_name,1098491,7.19116669852804e-5,5,14,0,70,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,punt_returner_player_id,1059306,0.03574100876594122,10,10,0,930,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,punt_returner_player_name,1059305,0.03574191904020685,
5,19,0,910,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,lateral_punt_returner_player_id,1098509,5.552673020381427e-5,10,10,0,56,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,lateral_punt_returner_player_name,1098509,5.552673020381427e-5,5,19,0,56,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,kickoff_returner_player_name,1059285,0.03576012452551958,5,19,0,2072,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,kickoff_returner_player_id,1059286,0.035759214251253946,10,10,0,2222,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,lateral_kickoff_returner_player_id,1098419,1.374514141110339e-4,10,10,0,132,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,lateral_kickoff_returner_player_name,1098419,1.374514141110339e-4,6,16,0,132,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,punter_player_id,1041530,0.05192204411189094,10,10,0,222,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,punter_player_name,1041530,0.05192204411189094,5,16,0,243,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,kicker_player_name,985912,0.10254967821804706,3,14,0,334,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,kicker_player_id,985912,0.10254967821804706,10,10,0,310,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,own_kickoff_recovery_player_id,1098310,2.3667130906546152e-4,10,10,0,238,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,own_kickoff_recovery_player_name,1098310,2.3667130906546152e-4,5,19,0,236,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,blocked_player_id,1097590,8.920687803235516e-4,10,10,0,622,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,blocked_player_name,1097589,8.929790545891825e-4,4,19,0,603,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,tackle_for_loss_1_player_id,1048762,0.04533894062280963,10,10,0,3367,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,tackle_for_loss_1_player_name,1048699,0.04539628790154471,4,19,0,3226,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,tackle_for_loss_2_player_id,1098559,1.0013016922050255e-5,10,10,0,11,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,tackle_for_loss_2_player_name,1098559,1.0013016922050255e-5,6,12,0,11,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,qb_hit_1_player_id,1050497,0.0437596147719308,10,10,0,2982,0,NA,NA,NA,NA,NA,NA
,NA,NA\ncharacter,qb_hit_1_player_name,1050490,0.043765986691790215,4,19,0,2829,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,qb_hit_2_player_id,1096391,0.0019834876248213673,10,10,0,1015,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,qb_hit_2_player_name,1096391,0.0019834876248213673,5,19,0,995,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,forced_fumble_player_1_team,1087072,0.010466333506285452,2,3,0,32,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,forced_fumble_player_1_player_id,1087105,0.010436294455519413,10,10,0,2977,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,forced_fumble_player_1_player_name,1087079,0.010459961586426036,4,19,0,2720,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,forced_fumble_player_2_team,1098504,6.0078101531968464e-5,2,3,0,26,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,forced_fumble_player_2_player_id,1098504,6.0078101531968464e-5,10,10,0,65,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,forced_fumble_player_2_player_name,1098504,6.0078101531968464e-5,6,14,0,65,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,solo_tackle_1_team,599256,0.45451268467189165,2,3,0,32,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,solo_tackle_2_team,1092308,0.005700137451414067,2,3,0,32,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,solo_tackle_1_player_id,599641,0.45416222907962167,10,10,0,8252,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,solo_tackle_2_player_id,1092376,0.005638238801350837,10,10,0,3134,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,solo_tackle_1_player_name,599306,0.4544671709586098,4,19,0,7320,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,solo_tackle_2_player_name,1092313,0.005695586080085913,4,18,0,2863,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,assist_tackle_1_player_id,956715,0.12912695595182833,10,10,0,5669,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,assist_tackle_1_player_name,956715,0.12912695595182833,5,19,0,5142,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,assist_tackle_1_team,956715,0.12912695595182833,2,3,0,32,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,assist_tackle_2_player_id,1039980,0.05333296922362707,10,10,0,4638,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,assist_tackle_2_player_name,10
39980,0.05333296922362707,5,19,0,4165,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,assist_tackle_2_team,1039980,0.05333296922362707,2,3,0,32,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,assist_tackle_3_player_id,1098567,2.730822796892518e-6,10,10,0,3,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,assist_tackle_3_player_name,1098567,2.730822796892518e-6,8,9,0,3,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,assist_tackle_3_team,1098567,2.730822796892518e-6,3,3,0,3,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,assist_tackle_4_player_id,1098567,2.730822796892518e-6,10,10,0,3,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,assist_tackle_4_player_name,1098567,2.730822796892518e-6,7,8,0,3,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,assist_tackle_4_team,1098567,2.730822796892518e-6,3,3,0,3,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,tackle_with_assist_1_player_id,1015313,0.0757867045340761,10,10,0,4738,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,tackle_with_assist_1_player_name,1015313,0.0757867045340761,5,19,0,4380,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,tackle_with_assist_1_team,1015313,0.0757867045340761,2,3,0,32,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,tackle_with_assist_2_player_id,1098570,0,NA,NA,0,0,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,tackle_with_assist_2_player_name,1098570,0,NA,NA,0,0,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,tackle_with_assist_2_team,1098570,0,NA,NA,0,0,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,pass_defense_1_player_id,1042237,0.05127848020608605,10,10,0,3379,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,pass_defense_1_player_name,1042236,0.05127939048035168,4,19,0,3197,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,pass_defense_2_player_id,1096864,0.0015529278971754268,10,10,0,996,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,pass_defense_2_player_name,1096864,0.0015529278971754268,5,19,0,963,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fumbled_1_team,1081551,0.01549195772686307,2,3,0,32,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fumbled_1_player_id,1081555,0.015488316629800547,10,10,0,2696,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fumbled_1_player_name,1081555,0.015488316629800547,
5,19,0,2517,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fumbled_2_player_id,1098456,1.0377126628258182e-4,10,10,0,111,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fumbled_2_player_name,1098456,1.0377126628258182e-4,5,12,0,110,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fumbled_2_team,1098456,1.0377126628258182e-4,2,3,0,30,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fumble_recovery_1_team,1082877,0.014284934050629472,2,3,0,32,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fumble_recovery_1_player_id,1082887,0.014275831307973053,10,10,0,5004,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fumble_recovery_1_player_name,1082887,0.014275831307973053,5,19,0,4398,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fumble_recovery_2_team,1098447,1.1196373467325937e-4,2,3,0,31,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fumble_recovery_2_player_id,1098447,1.1196373467325937e-4,10,10,0,120,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fumble_recovery_2_player_name,1098447,1.1196373467325937e-4,5,13,0,118,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,sack_player_id,1073024,0.023253866389943312,10,10,0,2582,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,sack_player_name,1072977,0.023296649280428183,4,19,0,2488,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,half_sack_1_player_id,1096089,0.002258390453043546,10,10,0,1118,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,half_sack_1_player_name,1096089,0.002258390453043546,5,18,0,1084,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,half_sack_2_player_id,1096089,0.002258390453043546,10,10,0,1094,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,half_sack_2_player_name,1096089,0.002258390453043546,5,19,0,1063,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,return_team,971115,0.11601900652666652,2,3,0,32,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,penalty_team,1021292,0.07034417469983711,2,3,0,32,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,penalty_player_id,1024660,0.06727837097317424,10,47,0,7741,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,penalty_player_name,1024735,0.06721010040325148,5,19,0,6791,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,replay_or_challenge_result,1090418,0.007420555813466567,6,8,0,3,0,NA,NA,NA
,NA,NA,NA,NA,NA\ncharacter,penalty_type,1021527,0.07013026024741253,7,38,0,69,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,safety_player_name,1098320,2.275685664090421e-4,4,15,0,217,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,safety_player_id,1098325,2.230171950808879e-4,10,47,0,224,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,series_result,248,0.9997742519821222,4,17,0,11,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,start_time,91628,0.916593389588283,7,8,0,48,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,time_of_day,157093,0.8570022847884068,7,8,0,62195,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,stadium,569653,0.48145953375752115,3,35,0,149,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,weather,563302,0.4872406856185769,22,123,0,2682,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,nfl_api_id,91628,0.916593389588283,36,36,0,5619,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,play_clock,409460,0.6272790991925867,1,2,0,47,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,play_type_nfl,91848,0.916393129249843,4,11,0,15,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,st_play_type,1097598,8.847865861983939e-4,7,7,0,1,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,end_clock_time,893921,0.1862867181881901,4,5,0,902,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,end_yard_line,249182,0.7731760379402314,2,7,0,2240,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fixed_drive_result,556,0.9994938875083063,4,17,0,9,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,drive_real_start_time,1098570,0,NA,NA,0,0,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,drive_time_of_possession,14081,0.9871824280655762,4,5,0,675,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,drive_start_transition,18201,0.9834320980911548,4,19,0,29,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,drive_end_transition,50801,0.9537571570314136,4,19,0,31,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,drive_game_clock_start,14081,0.9871824280655762,4,5,0,1500,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,drive_game_clock_end,14081,0.9871824280655762,4,5,0,1500,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,drive_start_yard_line,16021,0.9854164959902418,0,6,241,1614,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,drive_end_yard_line
,16021,0.9854164959902418,0,6,241,1634,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,location,0,1,4,7,0,2,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,roof,4387,0.9960066267966539,4,8,0,4,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,surface,0,1,5,10,0,9,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,home_coach,0,1,7,20,0,151,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,away_coach,0,1,7,20,0,151,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,stadium_id,0,1,5,5,0,59,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,game_stadium,0,1,8,35,0,94,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,passer,616172,0.4391144851943891,5,16,0,737,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,rusher,771805,0.2974457704106247,5,18,0,2248,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,receiver,669086,0.3909482327025132,4,18,0,2722,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,passer_id,616172,0.4391144851943891,10,10,0,765,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,rusher_id,771806,0.29744486013635907,10,10,0,2456,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,receiver_id,669086,0.3909482327025132,10,10,0,3088,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,name,289407,0.7365602556050138,5,18,0,2325,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,id,289408,0.7365593453307482,10,10,0,2538,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fantasy_player_name,381375,0.6528441519429804,5,18,0,3442,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fantasy_player_id,381375,0.6528441519429804,10,10,0,3587,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fantasy,326758,0.7025606015092347,4,18,0,3342,0,NA,NA,NA,NA,NA,NA,NA,NA\ncharacter,fantasy_id,326759,0.7025596912349691,10,10,0,3764,0,NA,NA,NA,NA,NA,NA,NA,NA\nnumeric,play_id,0,1,NA,NA,NA,NA,NA,2084.4356254039344,1219.3576690394557,1,1036,2066,3102,5921,▇▇▇▃▁\nnumeric,week,0,1,NA,NA,NA,NA,NA,9.522483774361215,5.284186738284365,1,5,10,14,22,▇▆▆▆▁\nnumeric,yardline_100,85487,0.922183383853555,NA,NA,NA,NA,NA,49.03448977033471,24.905220727748304,1,30,51,70,99,▅▆▆▇▃\nnumeric,quarter_seconds_remaining,1199,0.9989085811555022,NA,NA,NA,NA,NA,416.734325036838,282.64536464321304,0,150,400,663,900,▇▅▅▅▆\nnumeric,half_seconds_remai
ning,1199,0.9989085811555022,NA,NA,NA,NA,NA,815.5808409371124,558.844159835077,0,287,806,1296,1800,▇▅▅▅▅\nnumeric,game_seconds_remaining,1199,0.9989085811555022,NA,NA,NA,NA,NA,1714.1759377639833,1058.2081334837549,0,801,1800,2607,3600,▇▆▇▆▆\nnumeric,quarter_end,0,1,NA,NA,NA,NA,NA,0.01757557552090445,0.13140277920743262,0,0,0,0,1,▇▁▁▁▁\nnumeric,drive,14081,0.9871824280655762,NA,NA,NA,NA,NA,12.280475873890838,7.123816430268896,1,6,12,18,38,▇▇▇▂▁\nnumeric,sp,0,1,NA,NA,NA,NA,NA,0.07154209563341436,0.2577283155762031,0,0,0,0,1,▇▁▁▁▁\nnumeric,qtr,0,1,NA,NA,NA,NA,NA,2.5646176392947195,1.1313198255404997,1,2,3,4,6,▇▃▅▁▁\nnumeric,down,184365,0.8321772850159753,NA,NA,NA,NA,NA,2.0047494817901894,1.0067423782563265,1,1,2,3,4,▇▆▁▃▂\nnumeric,goal_to_go,13,0.9999881664345467,NA,NA,NA,NA,NA,0.04769165368751918,0.21311311831669968,0,0,0,0,1,▇▁▁▁▁\nnumeric,ydstogo,0,1,NA,NA,NA,NA,NA,7.136612141238155,4.9257457058179615,0,3,9,10,50,▇▁▁▁▁\nnumeric,ydsnet,14081,0.9871824280655762,NA,NA,NA,NA,NA,38.70946777698989,28.730318595751015,-39,12,37,65,99,▁▇▇▇▅\nnumeric,yards_gained,50525,0.9540083927287292,NA,NA,NA,NA,NA,3.9359645816734976,7.807584532205186,-38,0,0,6,99,▁▇▁▁▁\nnumeric,shotgun,0,1,NA,NA,NA,NA,NA,0.2984989577359661,0.45759973839118473,0,0,0,1,1,▇▁▁▁▃\nnumeric,no_huddle,0,1,NA,NA,NA,NA,NA,0.043568457176147136,0.20413300724484748,0,0,0,0,1,▇▁▁▁▁\nnumeric,qb_dropback,50328,0.9541877167590596,NA,NA,NA,NA,NA,0.4362780731930223,0.4959231297944922,0,0,0,1,1,▇▁▁▁▆\nnumeric,qb_kneel,0,1,NA,NA,NA,NA,NA,0.00735046469501261,0.08541921332781825,0,0,0,0,1,▇▁▁▁▁\nnumeric,qb_spike,0,1,NA,NA,NA,NA,NA,0.0015210682978781499,0.03897122055563391,0,0,0,0,1,▇▁▁▁▁\nnumeric,qb_scramble,0,1,NA,NA,NA,NA,NA,0.014260356645457247,0.1185622691649317,0,0,0,0,1,▇▁▁▁▁\nnumeric,air_yards,804184,0.267971999963589,NA,NA,NA,NA,NA,8.308679760586442,10.100947914566389,-93,2,6,13,78,▁▁▇▃▁\nnumeric,yards_after_catch,915119,0.1669907243052332,NA,NA,NA,NA,NA,5.121536541092717,7.023980347723798,-72,1,3,7,91,▁▁▇▁▁\nnumeric,k
ick_distance,964499,0.1220413810681158,NA,NA,NA,NA,NA,41.25097895890983,15.058408894839559,-2,31,41,53,88,▁▇▇▆▁\nnumeric,home_timeouts_remaining,0,1,NA,NA,NA,NA,NA,2.5088369425707966,0.7915343388518676,-1,2,3,3,3,▁▁▁▃▇\nnumeric,away_timeouts_remaining,0,1,NA,NA,NA,NA,NA,2.4807158396824964,0.813320107324933,0,2,3,3,3,▁▁▁▃▇\nnumeric,timeout,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.04347332414161605,0.20392016552278353,0,0,0,0,1,▇▁▁▁▁\nnumeric,posteam_timeouts_remaining,74782,0.931927869867191,NA,NA,NA,NA,NA,2.5355298167198677,0.7704586202045379,-1,2,3,3,3,▁▁▁▂▇\nnumeric,defteam_timeouts_remaining,74782,0.931927869867191,NA,NA,NA,NA,NA,2.5542592802416126,0.7423333886351559,-1,2,3,3,3,▁▁▁▂▇\nnumeric,total_home_score,0,1,NA,NA,NA,NA,NA,11.808507423286635,10.161861231134086,0,3,10,17,62,▇▅▁▁▁\nnumeric,total_away_score,0,1,NA,NA,NA,NA,NA,10.394369043392777,9.571924926888174,0,3,7,17,59,▇▃▁▁▁\nnumeric,posteam_score,75540,0.9312378819738387,NA,NA,NA,NA,NA,10.229523083389541,9.55150831293642,0,0,7,17,62,▇▃▁▁▁\nnumeric,defteam_score,75540,0.9312378819738387,NA,NA,NA,NA,NA,11.41926434219915,10.011129519281944,0,3,10,17,62,▇▅▁▁▁\nnumeric,score_differential,75540,0.9312378819738387,NA,NA,NA,NA,NA,-1.1897412588096141,10.834589890553639,-59,-7,0,4,59,▁▂▇▁▁\nnumeric,posteam_score_post,74782,0.931927869867191,NA,NA,NA,NA,NA,10.472527515462184,9.629131505015522,0,3,7,17,62,▇▃▁▁▁\nnumeric,defteam_score_post,74782,0.931927869867191,NA,NA,NA,NA,NA,11.424630880612003,10.018617374348864,0,3,10,17,62,▇▅▁▁▁\nnumeric,score_differential_post,74782,0.931927869867191,NA,NA,NA,NA,NA,-0.9521033651498161,10.915129362273541,-59,-7,0,6,59,▁▂▇▁▁\nnumeric,no_score_prob,0,1,NA,NA,NA,NA,NA,0.1318170844272374,0.2041972997826963,0,0.006777436123229563,0.03241235762834549,0.17009594291448593,1,▇▁▁▁▁\nnumeric,opp_fg_prob,0,1,NA,NA,NA,NA,NA,0.0900442372636615,0.07039615565190616,0,0.02218738803640008,0.08674044534564018,0.14487330988049507,0.42820701003074646,▇▆▂▁▁\nnumeric,opp_safety_prob,0,1,NA,NA,NA,NA,NA
,0.0024300688393332974,0.005656309098386118,0,4.499515416682698e-4,0.0013044169172644615,0.002646706940140575,0.3727701008319855,▇▁▁▁▁\nnumeric,opp_td_prob,0,1,NA,NA,NA,NA,NA,0.13686450993658264,0.11023031029440622,0,0.030053429771214724,0.12771396338939667,0.22834347561001778,0.5410013794898987,▇▅▅▁▁\nnumeric,fg_prob,0,1,NA,NA,NA,NA,NA,0.2320199917491022,0.16120988501708008,0,0.14781925082206726,0.21269237995147705,0.2992970049381256,0.9976794235436337,▇▇▂▁▁\nnumeric,safety_prob,0,1,NA,NA,NA,NA,NA,0.0026556686697281505,0.002310675572790794,0,0.001050723367370665,0.0024312720634043217,0.003875198948662728,0.3445877730846405,▇▁▁▁▁\nnumeric,td_prob,0,1,NA,NA,NA,NA,NA,0.2910022321016449,0.174859975956561,0,0.1884789690375328,0.3141609728336334,0.39814645051956177,0.9335877299308777,▅▇▅▁▁\nnumeric,extra_point_prob,0,1,NA,NA,NA,NA,NA,0.02504230941374404,0.1535238490465528,0,0,0,0,0.9963630685616458,▇▁▁▁▁\nnumeric,two_point_conversion_prob,0,1,NA,NA,NA,NA,NA,8.469442092902586e-4,0.02000777681298465,0,0,0,0,0.4735,▇▁▁▁▁\nnumeric,ep,17843,0.9837579762782527,NA,NA,NA,NA,NA,1.687473113671753,1.7057576445967129,-3.8036762471310794,0.4870520166296046,1.4019992728717625,2.8423898457549512,6.593665642023552,▁▃▇▅▁\nnumeric,epa,18018,0.9835986782817663,NA,NA,NA,NA,NA,-0.007954446230637966,1.2560468088794987,-13.58485902735265,-0.5379002675181255,-0,0.5221822524326853,9.579868719680235,▁▁▇▃▁\nnumeric,total_home_epa,0,1,NA,NA,NA,NA,NA,0.24846134966507072,12.107505734241665,-63.419816149475174,-6.520948685618038,0.1629172118846327,7.009262653625,65.34455354430793,▁▂▇▁▁\nnumeric,total_away_epa,0,1,NA,NA,NA,NA,NA,-0.22035254191216969,12.12565119448133,-65.34455354430793,-6.986512561861116,-0.1464349998728821,6.554438007757298,73.00163108506858,▁▂▇▁▁\nnumeric,total_home_rush_epa,0,1,NA,NA,NA,NA,NA,-0.06469372946176401,5.308144641341979,-32.110373332952676,-2.927859448711388,0,2.827283695190957,33.16040635136608,▁▁▇▁▁\nnumeric,total_away_rush_epa,0,1,NA,NA,NA,NA,NA,0.06469372946176401,5.3
08144641341979,-33.16040635136608,-2.827283695190957,0,2.927859448711388,32.110373332952676,▁▁▇▁▁\nnumeric,total_home_pass_epa,0,1,NA,NA,NA,NA,NA,-0.1637843757034343,10.596039297207732,-60.38258939230582,-6.080178061468196,0,5.74475710087931,53.981246434958145,▁▁▇▂▁\nnumeric,total_away_pass_epa,0,1,NA,NA,NA,NA,NA,0.16382928846033679,10.595990115688988,-53.981246434958145,-5.74475710087931,0,6.080178061468196,60.38258939230582,▁▂▇▁▁\nnumeric,air_epa,805471,0.26680047698371523,NA,NA,NA,NA,NA,0.5055065487066353,1.354761434024009,-11.79624876496382,-0.4898784961551428,0.2779173366725445,1.3612197960610501,7.454971208120696,▁▁▅▇▁\nnumeric,yac_epa,805500,0.2667740790300117,NA,NA,NA,NA,NA,-0.36660423146545995,1.924923089183378,-14,-0.8769923797808588,0,0.5234156415099278,9.848225327499676,▁▁▇▅▁\nnumeric,comp_air_epa,364010,0.6686510645657537,NA,NA,NA,NA,NA,0.05577522160824954,0.6080947516859125,-11.79624876496382,0,0,0,7.423614024184644,▁▁▁▇▁\nnumeric,comp_yac_epa,364019,0.668642872097363,NA,NA,NA,NA,NA,0.1588122908959473,0.5770241464224364,-10.914922855328768,0,0,0,9.848225327499676,▁▁▇▁▁\nnumeric,total_home_comp_air_epa,328090,0.7013481161874073,NA,NA,NA,NA,NA,-0.1715972755615049,6.026821816064234,-34.2570441190619,-3.49196311540436,0,3.224452398380345,30.63045173761202,▁▁▇▂▁\nnumeric,total_away_comp_air_epa,328090,0.7013481161874073,NA,NA,NA,NA,NA,0.1715972755615049,6.026821816064234,-30.63045173761202,-3.224452398380345,0,3.49196311540436,34.2570441190619,▁▂▇▁▁\nnumeric,total_home_comp_yac_epa,328090,0.7013481161874073,NA,NA,NA,NA,NA,-0.06711449665539697,6.386129449480004,-32.63186197145842,-3.6308591519482434,0,3.5630386369884945,35.91268990805838,▁▂▇▁▁\nnumeric,total_away_comp_yac_epa,328090,0.7013481161874073,NA,NA,NA,NA,NA,0.06711449665539697,6.386129449480004,-35.91268990805838,-3.5630386369884945,0,3.6308591519482434,32.63186197145842,▁▁▇▂▁\nnumeric,total_home_raw_air_epa,328090,0.7013481161874073,NA,NA,NA,NA,NA,-0.5431016530492653,9.369514500698392,-67.523505587
68376,-5.610673192248214,-0.13607840356417,4.56955827208003,62.42438889516052,▁▁▇▁▁\nnumeric,total_away_raw_air_epa,328090,0.7013481161874073,NA,NA,NA,NA,NA,0.5431016530492653,9.369514500698392,-62.42438889516052,-4.56955827208003,0.13607840356417,5.610673192248214,67.52350558768376,▁▁▇▁▁\nnumeric,total_home_raw_yac_epa,328090,0.7013481161874073,NA,NA,NA,NA,NA,0.28727892012474604,12.25002005575581,-66.52721119566374,-6.30992435511531,0,6.773971046050304,83.97783668081532,▁▃▇▁▁\nnumeric,total_away_raw_yac_epa,328090,0.7013481161874073,NA,NA,NA,NA,NA,-0.28727892012474604,12.25002005575581,-83.97783668081532,-6.773971046050304,0,6.30992435511531,66.52721119566374,▁▁▇▃▁\nnumeric,wp,6298,0.994267092675023,NA,NA,NA,NA,NA,0.5092682336602672,0.3000892471142395,0,0.2650805339217186,0.5197814106941223,0.7546021823708784,0.999977398025294,▇▆▇▇▇\nnumeric,def_wp,6298,0.994267092675023,NA,NA,NA,NA,NA,0.49073176633973276,0.3000892471142395,2.2601974706049077e-5,0.24539781762912163,0.4802185893058777,0.7349194660782814,1,▇▇▇▆▇\nnumeric,home_wp,0,1,NA,NA,NA,NA,NA,0.5651212434687914,0.29463946094535143,0,0.34911757707595825,0.58470419049263,0.8092272281646729,1,▅▅▇▇▇\nnumeric,away_wp,0,1,NA,NA,NA,NA,NA,0.43487875653120855,0.29463946094535143,0,0.19077277183532715,0.41529580950737,0.6508824229240417,1,▇▇▇▅▅\nnumeric,wpa,14462,0.9868356135703688,NA,NA,NA,NA,NA,2.3231512244109806e-4,0.041179407416532865,-0.999494194984436,-0.01393437385559082,-0,0.01011665165424347,0.9997061546600889,▁▁▇▁▁\nnumeric,vegas_wpa,14462,0.9868356135703688,NA,NA,NA,NA,NA,5.26771687176859e-4,0.0399261106770947,-0.9998733997344971,-0.011666126549243927,0,0.008488729596138,0.9998747144678148,▁▁▇▁▁\nnumeric,vegas_home_wpa,6134,0.9944163776545873,NA,NA,NA,NA,NA,4.930193089403416e-6,0.039854870927841776,-0.9998747144678148,-0.010171317495405674,0,0.010242700576782227,0.9998733997344971,▁▁▇▁▁\nnumeric,home_wp_post,84757,0.9228478840674695,NA,NA,NA,NA,NA,0.5654515316738287,0.2932497903348318,0,0.3504820764064789,0.588
352620601654,0.8072708249092102,1,▅▅▇▇▇\nnumeric,away_wp_post,84757,0.9228478840674695,NA,NA,NA,NA,NA,0.4345613652772663,0.293342597120003,-0.9994123093201779,0.19272546470165253,0.41165006160736084,0.6495179235935211,1.9977154731750488,▁▃▇▂▁\nnumeric,vegas_wp,6298,0.994267092675023,NA,NA,NA,NA,NA,0.5093007509900243,0.3234289117543515,0,0.22046128660440445,0.5167959928512573,0.799371525645256,0.9999985694885254,▇▆▆▆▇\nnumeric,vegas_home_wp,0,1,NA,NA,NA,NA,NA,0.5658029819469882,0.3181085215410204,0,0.2971372455358505,0.60519078373909,0.8528302535414696,1,▅▃▅▅▇\nnumeric,total_home_rush_wpa,0,1,NA,NA,NA,NA,NA,-0.005736035352490824,0.15295769414640503,-1.0790940578000532,-0.08720871806144714,0,0.08142374248139406,1.0827744475655492,▁▁▇▁▁\nnumeric,total_away_rush_wpa,0,1,NA,NA,NA,NA,NA,0.005736035352490824,0.15295769414640503,-1.0827744475655492,-0.08142374248139406,0,0.08720871806144714,1.0790940578000532,▁▁▇▁▁\nnumeric,total_home_pass_wpa,0,1,NA,NA,NA,NA,NA,-0.011704553292779056,0.2612775797505144,-1.8669247376815488,-0.1780981719493866,0.0036022216081619263,0.1692317584797332,1.7035376865755814,▁▁▇▁▁\nnumeric,total_away_pass_wpa,0,1,NA,NA,NA,NA,NA,0.011704553292779056,0.2612775797505144,-1.7035376865755814,-0.1692317584797332,-0.0036022216081619263,0.1780981719493866,1.8669247376815488,▁▁▇▁▁\nnumeric,air_wpa,805469,0.2668022975322465,NA,NA,NA,NA,NA,0.0031944892416476147,0.04235070909282636,-0.9980781078338623,0,0,0,0.9912653528153896,▁▁▇▁▁\nnumeric,yac_wpa,805469,0.2668022975322465,NA,NA,NA,NA,NA,3.550292191821682e-4,0.05655707094397412,-0.99085608497262,-0.015901625156402588,0,0.011798828840255737,1,▁▁▇▁▁\nnumeric,comp_air_wpa,364008,0.668652885114285,NA,NA,NA,NA,NA,0.001331281430309251,0.021105782500084776,-0.99611663818359375,0,0,0,0.9882552844937891,▁▁▇▁▁\nnumeric,comp_yac_wpa,364008,0.668652885114285,NA,NA,NA,NA,NA,0.004087812878937085,0.026691557398439505,-0.9883022714639083,0,0,0,1,▁▁▇▁▁\nnumeric,total_home_comp_air_wpa,328090,0.7013481161874073,NA,NA,NA,NA,NA,
-0.006595088517105849,0.1518415145504115,-1.9444250103439105,-0.04968388275176994,0,0.04606897134688148,3.0175069247978037,▁▇▃▁▁\nnumeric,total_away_comp_air_wpa,328090,0.7013481161874073,NA,NA,NA,NA,NA,0.006595088517105849,0.1518415145504115,-3.0175069247978037,-0.04606897134688148,0,0.04968388275176994,1.9444250103439105,▁▁▃▇▁\nnumeric,total_home_comp_yac_wpa,328090,0.7013481161874073,NA,NA,NA,NA,NA,-1.2000175624890671e-4,0.23046011134609773,-2.555464788224241,-0.11639988422393799,0,0.11682116985321045,2.106908860423353,▁▁▇▁▁\nnumeric,total_away_comp_yac_wpa,328090,0.7013481161874073,NA,NA,NA,NA,NA,1.2000175624890671e-4,0.23046011134609773,-2.106908860423353,-0.11682116985321045,0,0.11639988422393799,2.555464788224241,▁▁▇▁▁\nnumeric,total_home_raw_air_wpa,328090,0.7013481161874073,NA,NA,NA,NA,NA,-0.011168386565439632,0.1927320921131443,-2.9340697821495256,-0.05357837677001953,0,0.04698159838693505,3.905425994589482,▁▁▇▁▁\nnumeric,total_away_raw_air_wpa,328090,0.7013481161874073,NA,NA,NA,NA,NA,0.011168386565439632,0.1927320921131443,-3.905425994589482,-0.04698159838693505,0,0.05357837677001953,2.9340697821495256,▁▁▇▁▁\nnumeric,total_home_raw_yac_wpa,328090,0.7013481161874073,NA,NA,NA,NA,NA,-0.0026903383881771385,0.30053654108944405,-3.7097175012884205,-0.1533583104610443,0,0.14303510306494469,3.265507598971233,▁▁▇▁▁\nnumeric,total_away_raw_yac_wpa,328090,0.7013481161874073,NA,NA,NA,NA,NA,0.0026903383881771385,0.30053654108944405,-3.265507598971233,-0.14303510306494469,0,0.1533583104610443,3.7097175012884205,▁▁▇▁▁\nnumeric,punt_blocked,50525,0.9540083927287292,NA,NA,NA,NA,NA,3.024679283809378e-4,0.017388983007877862,0,0,0,0,1,▇▁▁▁▁\nnumeric,first_down_rush,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.07168966981379617,0.25797349583877194,0,0,0,0,1,▇▁▁▁▁\nnumeric,first_down_pass,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.13458868655448952,0.3412838747160773,0,0,0,0,1,▇▁▁▁▁\nnumeric,first_down_penalty,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.019788272450133332,0.13927209
063948912,0,0,0,0,1,▇▁▁▁▁\nnumeric,third_down_converted,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.06017108044024827,0.23780364899861764,0,0,0,0,1,▇▁▁▁▁\nnumeric,third_down_failed,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.09510087830198129,0.2933543648216251,0,0,0,0,1,▇▁▁▁▁\nnumeric,fourth_down_converted,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.005742119851723924,0.07555893963498897,0,0,0,0,1,▇▁▁▁▁\nnumeric,fourth_down_failed,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.00593295135227972,0.07679685584657968,0,0,0,0,1,▇▁▁▁▁\nnumeric,incomplete_pass,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.14255017675767742,0.34961370180594376,0,0,0,0,1,▇▁▁▁▁\nnumeric,touchback,0,1,NA,NA,NA,NA,NA,0.023482345230618002,0.15142967201489163,0,0,0,0,1,▇▁▁▁▁\nnumeric,interception,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.01112929311241407,0.10490682746207507,0,0,0,0,1,▇▁▁▁▁\nnumeric,punt_inside_twenty,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.017976327352356058,0.132865329756134,0,0,0,0,1,▇▁▁▁▁\nnumeric,punt_in_endzone,50525,0.9540083927287292,NA,NA,NA,NA,NA,3.339551259726445e-5,0.005778791326959958,0,0,0,0,1,▇▁▁▁▁\nnumeric,punt_out_of_bounds,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.0051133300573925735,0.07132453131353966,0,0,0,0,1,▇▁▁▁▁\nnumeric,punt_downed,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.007430978631642728,0.08588228121016313,0,0,0,0,1,▇▁▁▁▁\nnumeric,punt_fair_catch,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.012493738341388016,0.11107500445965617,0,0,0,0,1,▇▁▁▁▁\nnumeric,kickoff_inside_twenty,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.0064730044988526225,0.08019420707197665,0,0,0,0,1,▇▁▁▁▁\nnumeric,kickoff_in_endzone,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.011002390164544478,0.10431369976053685,0,0,0,0,1,▇▁▁▁▁\nnumeric,kickoff_out_of_bounds,50525,0.9540083927287292,NA,NA,NA,NA,NA,7.356554346425972e-4,0.0271130032851308,0,0,0,0,1,▇▁▁▁▁\nnumeric,kickoff_downed,50525,0.9540083927287292,NA,NA,NA,NA,NA,3.8166300111159326e-5,0.006177773050218164,0,0,0,0,1,▇▁▁▁▁\nnumeric,kickoff_fair_ca
tch,50525,0.9540083927287292,NA,NA,NA,NA,NA,9.636990778067718e-5,0.009816349248312638,0,0,0,0,1,▇▁▁▁▁\nnumeric,fumble_forced,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.01097281128195831,0.10417494444176967,0,0,0,0,1,▇▁▁▁▁\nnumeric,fumble_not_forced,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.005325153023009507,0.072779123533286,0,0,0,0,1,▇▁▁▁▁\nnumeric,fumble_out_of_bounds,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.001254717116154364,0.03539977396551467,0,0,0,0,1,▇▁▁▁▁\nnumeric,solo_tackle,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.4764241993425855,0.49944411061962346,0,0,0,1,1,▇▁▁▁▇\nnumeric,safety,50525,0.9540083927287292,NA,NA,NA,NA,NA,3.730755835865827e-4,0.019311570470469743,0,0,0,0,1,▇▁▁▁▁\nnumeric,penalty,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.07373538349975431,0.2613398972362713,0,0,0,0,1,▇▁▁▁▁\nnumeric,tackled_for_loss,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.029865129836982186,0.17021525049721078,0,0,0,0,1,▇▁▁▁▁\nnumeric,fumble_lost,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.007655205644795784,0.08715853784738917,0,0,0,0,1,▇▁▁▁▁\nnumeric,own_kickoff_recovery,50525,0.9540083927287292,NA,NA,NA,NA,NA,2.4712679321975634e-4,0.015718331886929632,0,0,0,0,1,▇▁▁▁▁\nnumeric,own_kickoff_recovery_td,50525,0.9540083927287292,NA,NA,NA,NA,NA,9.54157502778982e-7,9.768098600950877e-4,0,0,0,0,1,▇▁▁▁▁\nnumeric,qb_hit,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.04588161767863027,0.20922843164238794,0,0,0,0,1,▇▁▁▁▁\nnumeric,rush_attempt,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.3186847892981695,0.4659667386623199,0,0,0,1,1,▇▁▁▁▃\nnumeric,pass_attempt,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.4246687880768479,0.49429287030579566,0,0,0,1,1,▇▁▁▁▆\nnumeric,sack,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.026963536871031302,0.16197694152013542,0,0,0,0,1,▇▁▁▁▁\nnumeric,touchdown,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.0290970330472451,0.16807862050375538,0,0,0,0,1,▇▁▁▁▁\nnumeric,pass_touchdown,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.016936295674326952,0.1290328386472942,0,0,0,0,1,▇▁▁▁
▁\nnumeric,rush_touchdown,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.00968946944072058,0.09795709662084069,0,0,0,0,1,▇▁▁▁▁\nnumeric,return_touchdown,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.0022794822741389917,0.04768949994212801,0,0,0,0,1,▇▁▁▁▁\nnumeric,extra_point_attempt,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.027105706338945365,0.16239153975780735,0,0,0,0,1,▇▁▁▁▁\nnumeric,two_point_attempt,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.0018749194929607034,0.04325974983135766,0,0,0,0,1,▇▁▁▁▁\nnumeric,field_goal_attempt,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.022192749357136384,0.14731005375929296,0,0,0,0,1,▇▁▁▁▁\nnumeric,kickoff_attempt,50498,0.9540329701339013,NA,NA,NA,NA,NA,0.05856372462960561,0.2348064466583969,0,0,0,0,1,▇▁▁▁▁\nnumeric,punt_attempt,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.0547314285169053,0.22745537719220404,0,0,0,0,1,▇▁▁▁▁\nnumeric,fumble,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.016238806539795515,0.12639273295859219,0,0,0,0,1,▇▁▁▁▁\nnumeric,complete_pass,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.24267374015428733,0.4286996283554495,0,0,0,0,1,▇▁▁▁▂\nnumeric,assist_tackle,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.13535201255671275,0.3420993377960261,0,0,0,0,1,▇▁▁▁▁\nnumeric,lateral_reception,50525,0.9540083927287292,NA,NA,NA,NA,NA,2.1277712311971333e-4,0.014585336883167023,0,0,0,0,1,▇▁▁▁▁\nnumeric,lateral_rush,50525,0.9540083927287292,NA,NA,NA,NA,NA,3.721214260838033e-5,0.00610006502996325,0,0,0,0,1,▇▁▁▁▁\nnumeric,lateral_return,50525,0.9540083927287292,NA,NA,NA,NA,NA,2.776598333086835e-4,0.016660822404177992,0,0,0,0,1,▇▁▁▁▁\nnumeric,lateral_recovery,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.001992280865802518,0.044590509975986924,0,0,0,0,1,▇▁▁▁▁\nnumeric,passing_yards,844237,0.23151278480206083,NA,NA,NA,NA,NA,11.454679494992785,10.155601152890332,-22,5,9,15,99,▁▇▁▁▁\nnumeric,receiving_yards,844237,0.23151278480206083,NA,NA,NA,NA,NA,11.449501244431513,10.148341932961369,-22,5,9,15,99,▁▇▁▁▁\nnumeric,rushing_yards,765120,0.30353095387640294,NA,
NA,NA,NA,NA,4.177474883790673,6.277634572361271,-34,1,3,6,99,▁▇▁▁▁\nnumeric,lateral_receiving_yards,1098347,2.0299116123689842e-4,NA,NA,NA,NA,NA,5.941704035874439,10.321900567115344,-21,0,4,10,62,▁▇▂▁▁\nnumeric,lateral_rushing_yards,1098531,3.550069635982478e-5,NA,NA,NA,NA,NA,9.692307692307692,9.878615517086033,0,4,6,12,44,▇▃▁▁▁\nnumeric,tackle_with_assist,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.07944029120886999,0.2704248529974212,0,0,0,0,1,▇▁▁▁▁\nnumeric,fumble_recovery_1_yards,1082877,0.014284934050629472,NA,NA,NA,NA,NA,2.3068246989103423,9.281173192373814,-100,0,0,0,104,▁▁▇▁▁\nnumeric,fumble_recovery_2_yards,1098447,1.1196373467325937e-4,NA,NA,NA,NA,NA,3.926829268292683,12.171702422588814,-16,0,0,1.5,77,▇▂▁▁▁\nnumeric,return_yards,50525,0.9540083927287292,NA,NA,NA,NA,NA,1.215521280097707,5.810803247891786,-100,0,0,0,109,▁▁▇▁▁\nnumeric,penalty_yards,1021292,0.07034417469983711,NA,NA,NA,NA,NA,8.364502186909599,5.267519745457818,0,5,5,10,66,▇▂▁▁▁\nnumeric,replay_or_challenge,0,1,NA,NA,NA,NA,NA,0.0074205558134665915,0.08582247881242441,0,0,0,0,1,▇▁▁▁▁\nnumeric,defensive_two_point_attempt,50525,0.9540083927287292,NA,NA,NA,NA,NA,4.7707875138949524e-5,0.006906927291951021,0,0,0,0,1,▇▁▁▁▁\nnumeric,defensive_two_point_conv,50525,0.9540083927287292,NA,NA,NA,NA,NA,1.0495732530568788e-5,0.0032396963414267495,0,0,0,0,1,▇▁▁▁▁\nnumeric,defensive_extra_point_attempt,50525,0.9540083927287292,NA,NA,NA,NA,NA,0,0,0,0,0,0,0,▁▁▇▁▁\nnumeric,defensive_extra_point_conv,50525,0.9540083927287292,NA,NA,NA,NA,NA,0,0,0,0,0,0,0,▁▁▇▁▁\nnumeric,season,0,1,NA,NA,NA,NA,NA,2010.1182082161356,6.627584365671061,1999,2004,2010,2016,2021,▇▆▇▆▇\nnumeric,cp,811614,0.2612086621699118,NA,NA,NA,NA,NA,0.6348154096544791,0.17004701657348936,0.09544014185667038,0.5279483497142792,0.6781390309333801,0.7639535367488861,0.926070511341095,▁▃▃▇▆\nnumeric,cpoe,811614,0.2612086621699118,NA,NA,NA,NA,NA,0.06548366550596109,44.90345874861157,-92.05909371376038,-44.26913559436798,20.371761918067932,32.09476172924042,8
5.26444137096405,▃▃▁▇▁\nnumeric,series,0,1,NA,NA,NA,NA,NA,29.29300545254285,17.06057955007624,1,15,29,44,82,▇▇▇▃▁\nnumeric,series_success,0,1,NA,NA,NA,NA,NA,0.550455592269951,0.49744794547782784,0,0,1,1,1,▆▁▁▁▇\nnumeric,order_sequence,91628,0.916593389588283,NA,NA,NA,NA,NA,2089.286688328847,1223.2124087724426,1,1038,2071,3110,5921,▇▇▇▃▁\nnumeric,play_deleted,91628,0.916593389588283,NA,NA,NA,NA,NA,0,0,0,0,0,0,0,▁▁▇▁▁\nnumeric,special_teams_play,91628,0.916593389588283,NA,NA,NA,NA,NA,0.15910747590228633,0.36577646163240873,0,0,0,0,1,▇▁▁▁▂\nnumeric,fixed_drive,0,1,NA,NA,NA,NA,NA,12.20017932403033,7.136661236285931,1,6,12,18,38,▇▆▆▁▁\nnumeric,drive_play_count,14081,0.9871824280655762,NA,NA,NA,NA,NA,7.189677350346567,3.602393232327467,0,4,7,10,24,▅▇▅▁▁\nnumeric,drive_first_downs,14081,0.9871824280655762,NA,NA,NA,NA,NA,2.3803994323593884,1.8755856899745,0,1,2,4,9,▇▇▅▁▁\nnumeric,drive_inside20,14081,0.9871824280655762,NA,NA,NA,NA,NA,0.37966636821581407,0.4853040636890067,0,0,0,1,1,▇▁▁▁▅\nnumeric,drive_ended_with_score,14081,0.9871824280655762,NA,NA,NA,NA,NA,0.4375959553301142,0.4960906793535729,0,0,0,1,1,▇▁▁▁▆\nnumeric,drive_quarter_start,14081,0.9871824280655762,NA,NA,NA,NA,NA,2.5190509078469216,1.130511492046773,1,2,3,4,5,▇▇▇▇▁\nnumeric,drive_quarter_end,14081,0.9871824280655762,NA,NA,NA,NA,NA,2.6349700181375746,1.1228361579111892,1,2,3,4,6,▇▃▅▁▁\nnumeric,drive_yards_penalized,105709,0.9037758176538591,NA,NA,NA,NA,NA,0.0910671282284217,7.611621086271443,-40,0,0,0,77,▁▇▁▁▁\nnumeric,drive_play_id_started,14081,0.9871824280655762,NA,NA,NA,NA,NA,1994.7812905432884,1209.712470152506,10,953,1986,3005,5764,▇▇▇▃▁\nnumeric,drive_play_id_ended,14081,0.9871824280655762,NA,NA,NA,NA,NA,2192.7340221984737,1213.3905594679325,34,1149,2164,3206,5921,▇▇▇▅▁\nnumeric,away_score,0,1,NA,NA,NA,NA,NA,21.140365202035373,10.101971087488094,0,14,21,28,59,▃▇▆▂▁\nnumeric,home_score,0,1,NA,NA,NA,NA,NA,23.40198530817335,10.330063582737433,0,16,23,30,62,▂▇▆▂▁\nnumeric,result,0,1,NA,NA,NA,NA,NA,2.261620
1061379795,14.548871420505362,-49,-7,3,11,59,▁▃▇▂▁\nnumeric,total,0,1,NA,NA,NA,NA,NA,44.54235051020873,14.34748784077857,3,34,44,54,106,▁▇▆▂▁\nnumeric,spread_line,0,1,NA,NA,NA,NA,NA,2.339101286217539,5.952923025351251,-19,-3,3,6.5,27,▁▅▇▂▁\nnumeric,total_line,0,1,NA,NA,NA,NA,NA,43.481633851279376,5.017518021658727,30,40,43.5,47,63.5,▂▇▇▂▁\nnumeric,div_game,0,1,NA,NA,NA,NA,NA,0.380814149303185,0.4855872193867445,0,0,0,1,1,▇▁▁▁▅\nnumeric,temp,317103,0.7113492995439525,NA,NA,NA,NA,NA,58.22617333809361,16.860510369502123,-6,46,59,71,109,▁▃▇▇▁\nnumeric,wind,317103,0.7113492995439525,NA,NA,NA,NA,NA,8.494006784675488,5.383877047778842,0,5,8,12,71,▇▁▁▁▁\nnumeric,aborted_play,0,1,NA,NA,NA,NA,NA,0.002607935771047818,0.051001341255008464,0,0,0,0,1,▇▁▁▁▁\nnumeric,success,18018,0.9835986782817663,NA,NA,NA,NA,NA,0.41713494584249533,0.49308580166962673,0,0,0,1,1,▇▁▁▁▆\nnumeric,passer_jersey_number,656502,0.40240312406127965,NA,NA,NA,NA,NA,9.03425717310459,4.860778988279489,1,5,9,12,92,▇▁▁▁▁\nnumeric,rusher_jersey_number,794957,0.2763711006126146,NA,NA,NA,NA,NA,27.7503301900775,10.75144904832512,1,22,28,33,99,▂▇▁▁▁\nnumeric,receiver_jersey_number,702732,0.3603211447609165,NA,NA,NA,NA,NA,53.07991400522436,32.341174900030225,1,18,80,84,99,▅▃▁▁▇\nnumeric,pass,0,1,NA,NA,NA,NA,NA,0.43929380922471944,0.49630130225070546,0,0,0,1,1,▇▁▁▁▆\nnumeric,rush,0,1,NA,NA,NA,NA,NA,0.2900707283104399,0.4537949849222303,0,0,0,1,1,▇▁▁▁▃\nnumeric,first_down,50525,0.9540083927287292,NA,NA,NA,NA,NA,0.223826267001894,0.41680719159581336,0,0,0,0,1,▇▁▁▁▂\nnumeric,special,0,1,NA,NA,NA,NA,NA,0.15393466051321264,0.360886269285787,0,0,0,0,1,▇▁▁▁▂\nnumeric,play,0,1,NA,NA,NA,NA,NA,0.752477311413929,0.43157314184813556,0,1,1,1,1,▂▁▁▁▇\nnumeric,jersey_number,367370,0.6655925430332159,NA,NA,NA,NA,NA,16.811021608315098,12.101034042875968,1,8,12,26,99,▇▅▁▁▁\nnumeric,out_of_bounds,0,1,NA,NA,NA,NA,NA,0.06955041554020228,0.25438792059600834,0,0,0,0,1,▇▁▁▁▁\nnumeric,home_opening_kickoff,0,1,NA,NA,NA,NA,NA,0.4877540803043956
5,0.4998502424557511,0,0,0,1,1,▇▁▁▁▇\nnumeric,qb_epa,18018,0.9835986782817663,NA,NA,NA,NA,NA,-7.359714411560366e-4,1.2454625465492268,-13.58485902735265,-0.5353464668150991,0,0.5254294750135534,9.579868719680235,▁▁▇▃▁\nnumeric,xyac_epa,831471,0.24313334607717307,NA,NA,NA,NA,NA,0.6872020672438497,0.5040984516422667,-1.2820399739124755,0.30739913538713626,0.5661891744628638,0.920687893258089,13.028142577118448,▇▁▁▁▁\nnumeric,xyac_mean_yardage,831419,0.2431806803389861,NA,NA,NA,NA,NA,5.217610647958738,2.983742858931366,-77.34254090196919,3.6430345882508846,4.496286971581867,6.384836019860813,78.85111491640419,▁▁▇▁▁\nnumeric,xyac_median_yardage,831419,0.2431806803389861,NA,NA,NA,NA,NA,3.4332418744455384,2.3961516593295906,0,2,3,5,48,▇▁▁▁▁\nnumeric,xyac_success,831419,0.2431806803389861,NA,NA,NA,NA,NA,0.7963682448146061,0.24753272979377464,0.010269990365486592,0.5846049897518242,0.9878865564242005,1,1,▁▁▃▂▇\nnumeric,xyac_fd,831419,0.2431806803389861,NA,NA,NA,NA,NA,0.598139960266696,0.35654205947727374,0,0.24512977158883587,0.5134540184517391,0.9991877254215069,1,▂▆▂▁▇\nnumeric,xpass,522120,0.5247276004260084,NA,NA,NA,NA,NA,0.6137813038965859,0.2410066250760552,0.01067630760371685,0.44443003088235855,0.5740557610988617,0.8400976061820984,0.998187243938446,▁▃▇▅▆\nnumeric,pass_oe,538745,0.5095942907598059,NA,NA,NA,NA,NA,-0.26308551422534004,41.94502679344846,-99.23535585403442,-41.299596428871155,4.3928563594818115,34.751224517822266,97.72196300327778,▂▇▇▇▂\n"
  },
  {
    "path": "data-raw/replace_models.R",
    "content": "# Helper script to replace the internal calls to the models\n# with a call to the fastrmodels package.\n# Each pattern keeps its trailing comma, so only occurrences that are\n# directly followed by a comma are rewritten.\nmodels <- c(\n  \"ep_model,\",\n  \"wp_model,\",\n  \"wp_model_spread,\",\n  \"fg_model,\",\n  \"cp_model,\",\n  \"xyac_model,\",\n  \"xpass_model,\"\n)\n\npurrr::walk(models, function(model) {\n  xfun::gsub_dir(\n    # paste0(model,\"(?![:alpha:]+)\"),\n    model,\n    paste0(\"fastrmodels::\", model),\n    dir = usethis::proj_path(\"R\"),\n    ext = \"R\"\n  )\n})\n"
  },
  {
    "path": "data-raw/teams_colors_logos.R",
    "content": "teams_colors_logos <- nflreadr::load_teams()\n\nusethis::use_data(teams_colors_logos, overwrite = TRUE)\n"
  },
  {
    "path": "data-raw/tidy_play_stats_row.R",
    "content": "# Script to create the tidy_play_stats_row tibble that is used in\n# the internal function `sum_play_stats`\n\nlibrary(tidyverse)\n\ntidy_play_stats_row <-\n  as_tibble_row(\n    matrix(NA, ncol = length(pbp_stat_columns)),\n    .name_repair = \"minimal\"\n  ) |>\n  set_names(pbp_stat_columns) |>\n  modify_at(indicator_stats, function(x) {\n    x <- 0\n  }) |>\n  modify_if(is.na, function(x) {\n    x <- NA_character_\n  }) |>\n  modify_at(\n    c(\n      \"air_yards\",\n      \"yards_after_catch\",\n      \"penalty_yards\",\n      \"kick_distance\",\n      \"fumble_recovery_1_yards\",\n      \"fumble_recovery_2_yards\",\n      \"rushing_yards\",\n      \"lateral_rushing_yards\",\n      \"passing_yards\",\n      \"receiving_yards\",\n      \"lateral_receiving_yards\"\n    ),\n    as.integer\n  )\n\ntidy_play_stats_row <- nflfastR:::tidy_play_stats_row\nscramble_fix <- readRDS(\"data-raw/scramble_fix.rds\")\ndefault_play <- readRDS(\"data-raw/pbp_defaultplay.rds\")\nusethis::use_data(\n  tidy_play_stats_row,\n  scramble_fix,\n  default_play,\n  internal = TRUE,\n  overwrite = TRUE\n)\n\n# stats character vectors -------------------------------------------------\n\npbp_stat_columns <-\n  c(\n    # \"play_id\",\n    \"punt_blocked\",\n    \"first_down_rush\",\n    \"first_down_pass\",\n    \"first_down_penalty\",\n    \"third_down_converted\",\n    \"third_down_failed\",\n    \"fourth_down_converted\",\n    \"fourth_down_failed\",\n    \"incomplete_pass\",\n    \"interception\",\n    \"punt_inside_twenty\",\n    \"punt_in_endzone\",\n    \"punt_out_of_bounds\",\n    \"punt_downed\",\n    \"punt_fair_catch\",\n    \"kickoff_inside_twenty\",\n    \"kickoff_in_endzone\",\n    \"kickoff_out_of_bounds\",\n    \"kickoff_fair_catch\",\n    \"fumble_forced\",\n    \"fumble_not_forced\",\n    \"fumble_out_of_bounds\",\n    \"timeout\",\n    \"field_goal_missed\",\n    \"field_goal_made\",\n    \"field_goal_blocked\",\n    \"extra_point_good\",\n    
\"extra_point_failed\",\n    \"extra_point_blocked\",\n    \"two_point_rush_good\",\n    \"two_point_rush_failed\",\n    \"two_point_pass_good\",\n    \"two_point_pass_failed\",\n    \"solo_tackle\",\n    \"safety\",\n    \"penalty\",\n    \"tackled_for_loss\",\n    \"extra_point_safety\",\n    \"two_point_rush_safety\",\n    \"two_point_pass_safety\",\n    \"kickoff_downed\",\n    \"two_point_pass_reception_good\",\n    \"two_point_pass_reception_failed\",\n    \"fumble_lost\",\n    \"own_kickoff_recovery\",\n    \"own_kickoff_recovery_td\",\n    \"qb_hit\",\n    \"extra_point_aborted\",\n    \"two_point_return\",\n    \"rush_attempt\",\n    \"pass_attempt\",\n    \"sack\",\n    \"touchdown\",\n    \"pass_touchdown\",\n    \"rush_touchdown\",\n    \"return_touchdown\",\n    \"extra_point_attempt\",\n    \"two_point_attempt\",\n    \"field_goal_attempt\",\n    \"kickoff_attempt\",\n    \"punt_attempt\",\n    \"fumble\",\n    \"complete_pass\",\n    \"assist_tackle\",\n    \"lateral_reception\",\n    \"lateral_rush\",\n    \"lateral_return\",\n    \"lateral_recovery\",\n    \"passer_player_id\",\n    \"passer_player_name\",\n    \"receiver_player_id\",\n    \"receiver_player_name\",\n    \"rusher_player_id\",\n    \"rusher_player_name\",\n    \"lateral_receiver_player_id\",\n    \"lateral_receiver_player_name\",\n    \"lateral_rusher_player_id\",\n    \"lateral_rusher_player_name\",\n    \"lateral_sack_player_id\",\n    \"lateral_sack_player_name\",\n    \"interception_player_id\",\n    \"interception_player_name\",\n    \"lateral_interception_player_id\",\n    \"lateral_interception_player_name\",\n    \"punt_returner_player_id\",\n    \"punt_returner_player_name\",\n    \"lateral_punt_returner_player_id\",\n    \"lateral_punt_returner_player_name\",\n    \"kickoff_returner_player_name\",\n    \"kickoff_returner_player_id\",\n    \"lateral_kickoff_returner_player_id\",\n    \"lateral_kickoff_returner_player_name\",\n    \"punter_player_id\",\n    
\"punter_player_name\",\n    \"kicker_player_name\",\n    \"kicker_player_id\",\n    \"own_kickoff_recovery_player_id\",\n    \"own_kickoff_recovery_player_name\",\n    \"blocked_player_id\",\n    \"blocked_player_name\",\n    \"tackle_for_loss_1_player_id\",\n    \"tackle_for_loss_1_player_name\",\n    \"tackle_for_loss_2_player_id\",\n    \"tackle_for_loss_2_player_name\",\n    \"qb_hit_1_player_id\",\n    \"qb_hit_1_player_name\",\n    \"qb_hit_2_player_id\",\n    \"qb_hit_2_player_name\",\n    \"forced_fumble_player_1_team\",\n    \"forced_fumble_player_1_player_id\",\n    \"forced_fumble_player_1_player_name\",\n    \"forced_fumble_player_2_team\",\n    \"forced_fumble_player_2_player_id\",\n    \"forced_fumble_player_2_player_name\",\n    \"solo_tackle_1_team\",\n    \"solo_tackle_2_team\",\n    \"solo_tackle_1_player_id\",\n    \"solo_tackle_2_player_id\",\n    \"solo_tackle_1_player_name\",\n    \"solo_tackle_2_player_name\",\n    \"assist_tackle_1_player_id\",\n    \"assist_tackle_1_player_name\",\n    \"assist_tackle_1_team\",\n    \"assist_tackle_2_player_id\",\n    \"assist_tackle_2_player_name\",\n    \"assist_tackle_2_team\",\n    \"assist_tackle_3_player_id\",\n    \"assist_tackle_3_player_name\",\n    \"assist_tackle_3_team\",\n    \"assist_tackle_4_player_id\",\n    \"assist_tackle_4_player_name\",\n    \"assist_tackle_4_team\",\n    # new for stat ID 80 -> tackle_with_assist\n    \"tackle_with_assist\",\n    \"tackle_with_assist_1_player_id\",\n    \"tackle_with_assist_1_player_name\",\n    \"tackle_with_assist_1_team\",\n    \"tackle_with_assist_2_player_id\",\n    \"tackle_with_assist_2_player_name\",\n    \"tackle_with_assist_2_team\",\n\n    \"pass_defense_1_player_id\",\n    \"pass_defense_1_player_name\",\n    \"pass_defense_2_player_id\",\n    \"pass_defense_2_player_name\",\n    \"fumbled_1_team\",\n    \"fumbled_1_player_id\",\n    \"fumbled_1_player_name\",\n    \"fumbled_2_player_id\",\n    \"fumbled_2_player_name\",\n    
\"fumbled_2_team\",\n    \"fumble_recovery_1_team\",\n    \"fumble_recovery_1_yards\",\n    \"fumble_recovery_1_player_id\",\n    \"fumble_recovery_1_player_name\",\n    \"fumble_recovery_2_team\",\n    \"fumble_recovery_2_yards\",\n    \"fumble_recovery_2_player_id\",\n    \"fumble_recovery_2_player_name\",\n    \"td_team\",\n    \"return_team\",\n    \"timeout_team\",\n    \"yards_gained\",\n    \"return_yards\",\n    \"air_yards\",\n    \"yards_after_catch\",\n    \"penalty_team\",\n    \"penalty_player_id\",\n    \"penalty_player_name\",\n    \"penalty_yards\",\n    \"kick_distance\",\n    \"defensive_two_point_attempt\",\n    \"defensive_two_point_conv\",\n    \"defensive_extra_point_attempt\",\n    \"defensive_extra_point_conv\",\n    \"penalty_fix\",\n    \"return_penalty_fix\",\n    #new in nflfastR v4.0\n    \"rushing_yards\",\n    \"lateral_rushing_yards\",\n    \"passing_yards\",\n    \"receiving_yards\",\n    \"lateral_receiving_yards\",\n    # new in nflfastR v4.1\n    \"td_player_id\",\n    \"td_player_name\",\n    \"sack_player_id\",\n    \"sack_player_name\",\n    \"half_sack_1_player_id\",\n    \"half_sack_1_player_name\",\n    \"half_sack_2_player_id\",\n    \"half_sack_2_player_name\",\n    # new in nflfastR > v4.1\n    \"safety_player_name\",\n    \"safety_player_id\"\n  )\n\nindicator_stats <- c(\n  \"punt_blocked\",\n  \"first_down_rush\",\n  \"first_down_pass\",\n  \"first_down_penalty\",\n  \"third_down_converted\",\n  \"third_down_failed\",\n  \"fourth_down_converted\",\n  \"fourth_down_failed\",\n  \"incomplete_pass\",\n  \"interception\",\n  \"punt_inside_twenty\",\n  \"punt_in_endzone\",\n  \"punt_out_of_bounds\",\n  \"punt_downed\",\n  \"punt_fair_catch\",\n  \"kickoff_inside_twenty\",\n  \"kickoff_in_endzone\",\n  \"kickoff_out_of_bounds\",\n  \"kickoff_fair_catch\",\n  \"fumble_forced\",\n  \"fumble_not_forced\",\n  \"fumble_out_of_bounds\",\n  \"timeout\",\n  \"field_goal_missed\",\n  \"field_goal_made\",\n  \"field_goal_blocked\",\n 
 \"extra_point_good\",\n  \"extra_point_failed\",\n  \"extra_point_blocked\",\n  \"two_point_rush_good\",\n  \"two_point_rush_failed\",\n  \"two_point_pass_good\",\n  \"two_point_pass_failed\",\n  \"solo_tackle\",\n  \"safety\",\n  \"penalty\",\n  \"tackled_for_loss\",\n  \"extra_point_safety\",\n  \"two_point_rush_safety\",\n  \"two_point_pass_safety\",\n  \"kickoff_downed\",\n  \"two_point_pass_reception_good\",\n  \"two_point_pass_reception_failed\",\n  \"fumble_lost\",\n  \"own_kickoff_recovery\",\n  \"own_kickoff_recovery_td\",\n  \"qb_hit\",\n  \"extra_point_aborted\",\n  \"two_point_return\",\n  \"defensive_two_point_attempt\",\n  \"defensive_two_point_conv\",\n  \"defensive_extra_point_attempt\",\n  \"defensive_extra_point_conv\",\n  \"rush_attempt\",\n  \"pass_attempt\",\n  \"sack\",\n  \"touchdown\",\n  \"pass_touchdown\",\n  \"rush_touchdown\",\n  \"return_touchdown\",\n  \"extra_point_attempt\",\n  \"two_point_attempt\",\n  \"field_goal_attempt\",\n  \"kickoff_attempt\",\n  \"punt_attempt\",\n  \"fumble\",\n  \"complete_pass\",\n  \"assist_tackle\",\n  # new for stat ID 80 -> tackle_with_assist\n  \"tackle_with_assist\",\n\n  \"lateral_reception\",\n  \"lateral_rush\",\n  \"lateral_return\",\n  \"lateral_recovery\",\n  \"penalty_fix\",\n  \"yards_gained\",\n  \"return_yards\",\n  \"return_penalty_fix\"\n)\n"
  },
  {
    "path": "data-raw/variable_list.txt",
    "content": "#' \\item{play_id}{Numeric play id that when used with game_id and drive provides the unique identifier for a single play.}\n#' \\item{game_id}{Ten digit identifier for NFL game.}\n#' \\item{old_game_id}{Legacy NFL game ID.}\n#' \\item{home_team}{String abbreviation for the home team.}\n#' \\item{away_team}{String abbreviation for the away team.}\n#' \\item{season_type}{'REG' or 'POST' indicating if the game belongs to regular or post season.}\n#' \\item{week}{Season week.}\n#' \\item{posteam}{String abbreviation for the team with possession.}\n#' \\item{posteam_type}{String indicating whether the posteam team is home or away.}\n#' \\item{defteam}{String abbreviation for the team on defense.}\n#' \\item{side_of_field}{String abbreviation for which team's side of the field the team with possession is currently on.}\n#' \\item{yardline_100}{Numeric distance in the number of yards from the opponent's endzone for the posteam.}\n#' \\item{game_date}{Date of the game.}\n#' \\item{quarter_seconds_remaining}{Numeric seconds remaining in the quarter.}\n#' \\item{half_seconds_remaining}{Numeric seconds remaining in the half.}\n#' \\item{game_seconds_remaining}{Numeric seconds remaining in the game.}\n#' \\item{game_half}{String indicating which half the play is in, either Half1, Half2, or Overtime.}\n#' \\item{quarter_end}{Binary indicator for whether or not the row of the data is marking the end of a quarter.}\n#' \\item{drive}{Numeric drive number in the game.}\n#' \\item{sp}{Binary indicator for whether or not a score occurred on the play.}\n#' \\item{qtr}{Quarter of the game (5 is overtime).}\n#' \\item{down}{The down for the given play.}\n#' \\item{goal_to_go}{Binary indicator for whether or not the posteam is in a goal down situation.}\n#' \\item{time}{Time at start of play provided in string format as minutes:seconds remaining in the quarter.}\n#' \\item{yrdln}{String indicating the current field position for a given play.}\n#' \\item{ydstogo}{Numeric 
yards in distance from either the first down marker or the endzone in goal down situations.}\n#' \\item{ydsnet}{Numeric value for total yards gained on the given drive.}\n#' \\item{desc}{Detailed string description for the given play.}\n#' \\item{play_type}{String indicating the type of play: pass (includes sacks), run (includes scrambles), punt, field_goal, kickoff, extra_point, qb_kneel, qb_spike, no_play (timeouts and penalties), and missing for rows indicating end of play.}\n#' \\item{yards_gained}{Numeric yards gained (or lost) by the possessing team, excluding yards gained via fumble recoveries and laterals.}\n#' \\item{shotgun}{Binary indicator for whether or not the play was in shotgun formation.}\n#' \\item{no_huddle}{Binary indicator for whether or not the play was in no_huddle formation.}\n#' \\item{qb_dropback}{Binary indicator for whether or not the QB dropped back on the play (pass attempt, sack, or scrambled).}\n#' \\item{qb_kneel}{Binary indicator for whether or not the QB took a knee.}\n#' \\item{qb_spike}{Binary indicator for whether or not the QB spiked the ball.}\n#' \\item{qb_scramble}{Binary indicator for whether or not the QB scrambled.}\n#' \\item{pass_length}{String indicator for pass length: short or deep.}\n#' \\item{pass_location}{String indicator for pass location: left, middle, or right.}\n#' \\item{air_yards}{Numeric value for distance in yards perpendicular to the line of scrimmage at where the targeted receiver either caught or didn't catch the ball.}\n#' \\item{yards_after_catch}{Numeric value for distance in yards perpendicular to the yard line where the receiver made the reception to where the play ended.}\n#' \\item{run_location}{String indicator for location of run: left, middle, or right.}\n#' \\item{run_gap}{String indicator for line gap of run: end, guard, or tackle}\n#' \\item{field_goal_result}{String indicator for result of field goal attempt: made, missed, or blocked.}\n#' \\item{kick_distance}{Numeric distance in yards 
for kickoffs, field goals, and punts.}\n#' \\item{extra_point_result}{String indicator for the result of the extra point attempt: good, failed, blocked, safety (touchback in defensive endzone is 1 point apparently), or aborted.}\n#' \\item{two_point_conv_result}{String indicator for result of two point conversion attempt: success, failure, safety (touchback in defensive endzone is 1 point apparently), or return.}\n#' \\item{home_timeouts_remaining}{Numeric timeouts remaining in the half for the home team.}\n#' \\item{away_timeouts_remaining}{Numeric timeouts remaining in the half for the away team.}\n#' \\item{timeout}{Binary indicator for whether or not a timeout was called by either team.}\n#' \\item{timeout_team}{String abbreviation for which team called the timeout.}\n#' \\item{td_team}{String abbreviation for which team scored the touchdown.}\n#' \\item{td_player_name}{String name of the player who scored a touchdown.}\n#' \\item{td_player_id}{Unique identifier of the player who scored a touchdown.}\n#' \\item{posteam_timeouts_remaining}{Number of timeouts remaining for the possession team.}\n#' \\item{defteam_timeouts_remaining}{Number of timeouts remaining for the team on defense.}\n#' \\item{total_home_score}{Score for the home team at the end of the play.}\n#' \\item{total_away_score}{Score for the away team at the end of the play.}\n#' \\item{posteam_score}{Score for the posteam at the start of the play.}\n#' \\item{defteam_score}{Score for the defteam at the start of the play.}\n#' \\item{score_differential}{Score differential between the posteam and defteam at the start of the play.}\n#' \\item{posteam_score_post}{Score for the posteam at the end of the play.}\n#' \\item{defteam_score_post}{Score for the defteam at the end of the play.}\n#' \\item{score_differential_post}{Score differential between the posteam and defteam at the end of the play.}\n#' \\item{no_score_prob}{Predicted probability of no score occurring for the rest of the half based on the expected 
points model.}\n#' \\item{opp_fg_prob}{Predicted probability of the defteam scoring a FG next.}\n#' \\item{opp_safety_prob}{Predicted probability of the defteam scoring a safety next.}\n#' \\item{opp_td_prob}{Predicted probability of the defteam scoring a TD next.}\n#' \\item{fg_prob}{Predicted probability of the posteam scoring a FG next.}\n#' \\item{safety_prob}{Predicted probability of the posteam scoring a safety next.}\n#' \\item{td_prob}{Predicted probability of the posteam scoring a TD next.}\n#' \\item{extra_point_prob}{Predicted probability of the posteam scoring an extra point.}\n#' \\item{two_point_conversion_prob}{Predicted probability of the posteam scoring the two point conversion.}\n#' \\item{ep}{Using the scoring event probabilities, the estimated expected points with respect to the possession team for the given play.}\n#' \\item{epa}{Expected points added (EPA) by the posteam for the given play.}\n#' \\item{total_home_epa}{Cumulative total EPA for the home team in the game so far.}\n#' \\item{total_away_epa}{Cumulative total EPA for the away team in the game so far.}\n#' \\item{total_home_rush_epa}{Cumulative total rushing EPA for the home team in the game so far.}\n#' \\item{total_away_rush_epa}{Cumulative total rushing EPA for the away team in the game so far.}\n#' \\item{total_home_pass_epa}{Cumulative total passing EPA for the home team in the game so far.}\n#' \\item{total_away_pass_epa}{Cumulative total passing EPA for the away team in the game so far.}\n#' \\item{air_epa}{EPA from the air yards alone. For completions this represents the actual value provided through the air. For incompletions this represents the hypothetical value that could've been added through the air if the pass was completed.}\n#' \\item{yac_epa}{EPA from the yards after catch alone. For completions this represents the actual value provided after the catch. 
For incompletions this represents the difference between the hypothetical air_epa and the play's raw observed EPA (how much the incomplete pass cost the posteam).}\n#' \\item{comp_air_epa}{EPA from the air yards alone only for completions.}\n#' \\item{comp_yac_epa}{EPA from the yards after catch alone only for completions.}\n#' \\item{total_home_comp_air_epa}{Cumulative total completions air EPA for the home team in the game so far.}\n#' \\item{total_away_comp_air_epa}{Cumulative total completions air EPA for the away team in the game so far.}\n#' \\item{total_home_comp_yac_epa}{Cumulative total completions yac EPA for the home team in the game so far.}\n#' \\item{total_away_comp_yac_epa}{Cumulative total completions yac EPA for the away team in the game so far.}\n#' \\item{total_home_raw_air_epa}{Cumulative total raw air EPA for the home team in the game so far.}\n#' \\item{total_away_raw_air_epa}{Cumulative total raw air EPA for the away team in the game so far.}\n#' \\item{total_home_raw_yac_epa}{Cumulative total raw yac EPA for the home team in the game so far.}\n#' \\item{total_away_raw_yac_epa}{Cumulative total raw yac EPA for the away team in the game so far.}\n#' \\item{wp}{Estimated win probability for the posteam given the current situation at the start of the given play.}\n#' \\item{def_wp}{Estimated win probability for the defteam.}\n#' \\item{home_wp}{Estimated win probability for the home team.}\n#' \\item{away_wp}{Estimated win probability for the away team.}\n#' \\item{wpa}{Win probability added (WPA) for the posteam.}\n#' \\item{vegas_wpa}{Win probability added (WPA) for the posteam: spread_adjusted model.}\n#' \\item{vegas_home_wpa}{Win probability added (WPA) for the home team: spread_adjusted model.}\n#' \\item{home_wp_post}{Estimated win probability for the home team at the end of the play.}\n#' \\item{away_wp_post}{Estimated win probability for the away team at the end of the play.}\n#' \\item{vegas_wp}{Estimated win probability for the posteam 
given the current situation at the start of the given play, incorporating pre-game Vegas line.}\n#' \\item{vegas_home_wp}{Estimated win probability for the home team incorporating pre-game Vegas line.}\n#' \\item{total_home_rush_wpa}{Cumulative total rushing WPA for the home team in the game so far.}\n#' \\item{total_away_rush_wpa}{Cumulative total rushing WPA for the away team in the game so far.}\n#' \\item{total_home_pass_wpa}{Cumulative total passing WPA for the home team in the game so far.}\n#' \\item{total_away_pass_wpa}{Cumulative total passing WPA for the away team in the game so far.}\n#' \\item{air_wpa}{WPA through the air (same logic as air_epa).}\n#' \\item{yac_wpa}{WPA from yards after the catch (same logic as yac_epa).}\n#' \\item{comp_air_wpa}{The air_wpa for completions only.}\n#' \\item{comp_yac_wpa}{The yac_wpa for completions only.}\n#' \\item{total_home_comp_air_wpa}{Cumulative total completions air WPA for the home team in the game so far.}\n#' \\item{total_away_comp_air_wpa}{Cumulative total completions air WPA for the away team in the game so far.}\n#' \\item{total_home_comp_yac_wpa}{Cumulative total completions yac WPA for the home team in the game so far.}\n#' \\item{total_away_comp_yac_wpa}{Cumulative total completions yac WPA for the away team in the game so far.}\n#' \\item{total_home_raw_air_wpa}{Cumulative total raw air WPA for the home team in the game so far.}\n#' \\item{total_away_raw_air_wpa}{Cumulative total raw air WPA for the away team in the game so far.}\n#' \\item{total_home_raw_yac_wpa}{Cumulative total raw yac WPA for the home team in the game so far.}\n#' \\item{total_away_raw_yac_wpa}{Cumulative total raw yac WPA for the away team in the game so far.}\n#' \\item{punt_blocked}{Binary indicator for if the punt was blocked.}\n#' \\item{first_down_rush}{Binary indicator for if a running play converted the first down.}\n#' \\item{first_down_pass}{Binary indicator for if a passing play converted the first down.}\n#' 
\\item{first_down_penalty}{Binary indicator for if a penalty converted the first down.}\n#' \\item{third_down_converted}{Binary indicator for if the first down was converted on third down.}\n#' \\item{third_down_failed}{Binary indicator for if the posteam failed to convert first down on third down.}\n#' \\item{fourth_down_converted}{Binary indicator for if the first down was converted on fourth down.}\n#' \\item{fourth_down_failed}{Binary indicator for if the posteam failed to convert first down on fourth down.}\n#' \\item{incomplete_pass}{Binary indicator for if the pass was incomplete.}\n#' \\item{touchback}{Binary indicator for if a touchback occurred on the play.}\n#' \\item{interception}{Binary indicator for if the pass was intercepted.}\n#' \\item{punt_inside_twenty}{Binary indicator for if the punt ended inside the twenty yard line.}\n#' \\item{punt_in_endzone}{Binary indicator for if the punt was in the endzone.}\n#' \\item{punt_out_of_bounds}{Binary indicator for if the punt went out of bounds.}\n#' \\item{punt_downed}{Binary indicator for if the punt was downed.}\n#' \\item{punt_fair_catch}{Binary indicator for if the punt was caught with a fair catch.}\n#' \\item{kickoff_inside_twenty}{Binary indicator for if the kickoff ended inside the twenty yard line.}\n#' \\item{kickoff_in_endzone}{Binary indicator for if the kickoff was in the endzone.}\n#' \\item{kickoff_out_of_bounds}{Binary indicator for if the kickoff went out of bounds.}\n#' \\item{kickoff_downed}{Binary indicator for if the kickoff was downed.}\n#' \\item{kickoff_fair_catch}{Binary indicator for if the kickoff was caught with a fair catch.}\n#' \\item{fumble_forced}{Binary indicator for if the fumble was forced.}\n#' \\item{fumble_not_forced}{Binary indicator for if the fumble was not forced.}\n#' \\item{fumble_out_of_bounds}{Binary indicator for if the fumble went out of bounds.}\n#' \\item{solo_tackle}{Binary indicator if the play had a solo tackle (could be multiple due to fumbles).}\n#' 
\\item{safety}{Binary indicator for whether or not a safety occurred.}\n#' \\item{penalty}{Binary indicator for whether or not a penalty occurred.}\n#' \\item{tackled_for_loss}{Binary indicator for whether or not a tackle for loss on a run play occurred.}\n#' \\item{fumble_lost}{Binary indicator for if the fumble was lost.}\n#' \\item{own_kickoff_recovery}{Binary indicator for if the kicking team recovered the kickoff.}\n#' \\item{own_kickoff_recovery_td}{Binary indicator for if the kicking team recovered the kickoff and scored a TD.}\n#' \\item{qb_hit}{Binary indicator if the QB was hit on the play.}\n#' \\item{rush_attempt}{Binary indicator for if the play was a run.}\n#' \\item{pass_attempt}{Binary indicator for if the play was a pass attempt (includes sacks).}\n#' \\item{sack}{Binary indicator for if the play ended in a sack.}\n#' \\item{touchdown}{Binary indicator for if the play resulted in a TD.}\n#' \\item{pass_touchdown}{Binary indicator for if the play resulted in a passing TD.}\n#' \\item{rush_touchdown}{Binary indicator for if the play resulted in a rushing TD.}\n#' \\item{return_touchdown}{Binary indicator for if the play resulted in a return TD.}\n#' \\item{extra_point_attempt}{Binary indicator for extra point attempt.}\n#' \\item{two_point_attempt}{Binary indicator for two point conversion attempt.}\n#' \\item{field_goal_attempt}{Binary indicator for field goal attempt.}\n#' \\item{kickoff_attempt}{Binary indicator for kickoff.}\n#' \\item{punt_attempt}{Binary indicator for punts.}\n#' \\item{fumble}{Binary indicator for if a fumble occurred.}\n#' \\item{complete_pass}{Binary indicator for if the pass was completed.}\n#' \\item{assist_tackle}{Binary indicator for if an assist tackle occurred.}\n#' \\item{lateral_reception}{Binary indicator for if a lateral occurred on the reception.}\n#' \\item{lateral_rush}{Binary indicator for if a lateral occurred on a run.}\n#' \\item{lateral_return}{Binary indicator for if a lateral occurred on a return.}\n#' 
\\item{lateral_recovery}{Binary indicator for if a lateral occurred on a fumble recovery.}\n#' \\item{passer_player_id}{Unique identifier for the player that attempted the pass.}\n#' \\item{passer_player_name}{String name for the player that attempted the pass.}\n#' \\item{passing_yards}{Numeric yards by the passer_player_name, including yards gained in pass plays with laterals. This should equal official passing statistics.}\n#' \\item{receiver_player_id}{Unique identifier for the receiver that was targeted on the pass.}\n#' \\item{receiver_player_name}{String name for the targeted receiver.}\n#' \\item{receiving_yards}{Numeric yards by the receiver_player_name, excluding yards gained in pass plays with laterals. This should equal official receiving statistics but could miss yards gained in pass plays with laterals. Please see the description of `lateral_receiver_player_name` for further information.}\n#' \\item{rusher_player_id}{Unique identifier for the player that attempted the run.}\n#' \\item{rusher_player_name}{String name for the player that attempted the run.}\n#' \\item{rushing_yards}{Numeric yards by the rusher_player_name, excluding yards gained in rush plays with laterals. This should equal official rushing statistics but could miss yards gained in rush plays with laterals. Please see the description of `lateral_rusher_player_name` for further information.}\n#' \\item{lateral_receiver_player_id}{Unique identifier for the player that received the last(!) lateral on a pass play.}\n#' \\item{lateral_receiver_player_name}{String name for the player that received the last(!) lateral on a pass play. If there were multiple laterals in the same play, this will only be the last player who received a lateral. 
Please see <https://github.com/mrcaseb/nfl-data/tree/master/data/lateral_yards> for a list of plays where multiple players recorded lateral receiving yards.}\n#' \\item{lateral_receiving_yards}{Numeric yards by the `lateral_receiver_player_name` in pass plays with laterals. Please see the description of `lateral_receiver_player_name` for further information.}\n#' \\item{lateral_rusher_player_id}{Unique identifier for the player that received the last(!) lateral on a run play.}\n#' \\item{lateral_rusher_player_name}{String name for the player that received the last(!) lateral on a run play. If there were multiple laterals in the same play, this will only be the last player who received a lateral. Please see <https://github.com/mrcaseb/nfl-data/tree/master/data/lateral_yards> for a list of plays where multiple players recorded lateral rushing yards.}\n#' \\item{lateral_rushing_yards}{Numeric yards by the `lateral_rusher_player_name` in run plays with laterals. Please see the description of `lateral_rusher_player_name` for further information.}\n#' \\item{lateral_sack_player_id}{Unique identifier for the player that received the lateral on a sack.}\n#' \\item{lateral_sack_player_name}{String name for the player that received the lateral on a sack.}\n#' \\item{interception_player_id}{Unique identifier for the player that intercepted the pass.}\n#' \\item{interception_player_name}{String name for the player that intercepted the pass.}\n#' \\item{lateral_interception_player_id}{Unique identifier for the player that received the lateral on an interception.}\n#' \\item{lateral_interception_player_name}{String name for the player that received the lateral on an interception.}\n#' \\item{punt_returner_player_id}{Unique identifier for the punt returner.}\n#' \\item{punt_returner_player_name}{String name for the punt returner.}\n#' \\item{lateral_punt_returner_player_id}{Unique identifier for the player that received the lateral on a punt return.}\n#' 
\\item{lateral_punt_returner_player_name}{String name for the player that received the lateral on a punt return.}\n#' \\item{kickoff_returner_player_name}{String name for the kickoff returner.}\n#' \\item{kickoff_returner_player_id}{Unique identifier for the kickoff returner.}\n#' \\item{lateral_kickoff_returner_player_id}{Unique identifier for the player that received the lateral on a kickoff return.}\n#' \\item{lateral_kickoff_returner_player_name}{String name for the player that received the lateral on a kickoff return.}\n#' \\item{punter_player_id}{Unique identifier for the punter.}\n#' \\item{punter_player_name}{String name for the punter.}\n#' \\item{kicker_player_name}{String name for the kicker on FG or kickoff.}\n#' \\item{kicker_player_id}{Unique identifier for the kicker on FG or kickoff.}\n#' \\item{own_kickoff_recovery_player_id}{Unique identifier for the player that recovered their own kickoff.}\n#' \\item{own_kickoff_recovery_player_name}{String name for the player that recovered their own kickoff.}\n#' \\item{blocked_player_id}{Unique identifier for the player that blocked the punt or FG.}\n#' \\item{blocked_player_name}{String name for the player that blocked the punt or FG.}\n#' \\item{tackle_for_loss_1_player_id}{Unique identifier for one of the potential players with the tackle for loss.}\n#' \\item{tackle_for_loss_1_player_name}{String name for one of the potential players with the tackle for loss.}\n#' \\item{tackle_for_loss_2_player_id}{Unique identifier for one of the potential players with the tackle for loss.}\n#' \\item{tackle_for_loss_2_player_name}{String name for one of the potential players with the tackle for loss.}\n#' \\item{qb_hit_1_player_id}{Unique identifier for one of the potential players that hit the QB. No sack as the QB was not the ball carrier. For sacks please see `sack_player` or `half_sack_*_player`.}\n#' \\item{qb_hit_1_player_name}{String name for one of the potential players that hit the QB. 
No sack as the QB was not the ball carrier. For sacks please see `sack_player` or `half_sack_*_player`.}\n#' \\item{qb_hit_2_player_id}{Unique identifier for one of the potential players that hit the QB. No sack as the QB was not the ball carrier. For sacks please see `sack_player` or `half_sack_*_player`.}\n#' \\item{qb_hit_2_player_name}{String name for one of the potential players that hit the QB. No sack as the QB was not the ball carrier. For sacks please see `sack_player` or `half_sack_*_player`.}\n#' \\item{forced_fumble_player_1_team}{Team of one of the players with a forced fumble.}\n#' \\item{forced_fumble_player_1_player_id}{Unique identifier of one of the players with a forced fumble.}\n#' \\item{forced_fumble_player_1_player_name}{String name of one of the players with a forced fumble.}\n#' \\item{forced_fumble_player_2_team}{Team of one of the players with a forced fumble.}\n#' \\item{forced_fumble_player_2_player_id}{Unique identifier of one of the players with a forced fumble.}\n#' \\item{forced_fumble_player_2_player_name}{String name of one of the players with a forced fumble.}\n#' \\item{solo_tackle_1_team}{Team of one of the players with a solo tackle.}\n#' \\item{solo_tackle_2_team}{Team of one of the players with a solo tackle.}\n#' \\item{solo_tackle_1_player_id}{Unique identifier of one of the players with a solo tackle.}\n#' \\item{solo_tackle_2_player_id}{Unique identifier of one of the players with a solo tackle.}\n#' \\item{solo_tackle_1_player_name}{String name of one of the players with a solo tackle.}\n#' \\item{solo_tackle_2_player_name}{String name of one of the players with a solo tackle.}\n#' \\item{assist_tackle_1_player_id}{Unique identifier of one of the players with a tackle assist.}\n#' \\item{assist_tackle_1_player_name}{String name of one of the players with a tackle assist.}\n#' \\item{assist_tackle_1_team}{Team of one of the players with a tackle assist.}\n#' \\item{assist_tackle_2_player_id}{Unique identifier of one of 
the players with a tackle assist.}\n#' \\item{assist_tackle_2_player_name}{String name of one of the players with a tackle assist.}\n#' \\item{assist_tackle_2_team}{Team of one of the players with a tackle assist.}\n#' \\item{assist_tackle_3_player_id}{Unique identifier of one of the players with a tackle assist.}\n#' \\item{assist_tackle_3_player_name}{String name of one of the players with a tackle assist.}\n#' \\item{assist_tackle_3_team}{Team of one of the players with a tackle assist.}\n#' \\item{assist_tackle_4_player_id}{Unique identifier of one of the players with a tackle assist.}\n#' \\item{assist_tackle_4_player_name}{String name of one of the players with a tackle assist.}\n#' \\item{assist_tackle_4_team}{Team of one of the players with a tackle assist.}\n#' \\item{tackle_with_assist}{Binary indicator for if there has been a tackle with assist.}\n#' \\item{tackle_with_assist_1_player_id}{Unique identifier of one of the players with a tackle with assist.}\n#' \\item{tackle_with_assist_1_player_name}{String name of one of the players with a tackle with assist.}\n#' \\item{tackle_with_assist_1_team}{Team of one of the players with a tackle with assist.}\n#' \\item{tackle_with_assist_2_player_id}{Unique identifier of one of the players with a tackle with assist.}\n#' \\item{tackle_with_assist_2_player_name}{String name of one of the players with a tackle with assist.}\n#' \\item{tackle_with_assist_2_team}{Team of one of the players with a tackle with assist.}\n#' \\item{pass_defense_1_player_id}{Unique identifier of one of the players with a pass defense.}\n#' \\item{pass_defense_1_player_name}{String name of one of the players with a pass defense.}\n#' \\item{pass_defense_2_player_id}{Unique identifier of one of the players with a pass defense.}\n#' \\item{pass_defense_2_player_name}{String name of one of the players with a pass defense.}\n#' \\item{fumbled_1_team}{Team of the first player who fumbled on the play.}\n#' \\item{fumbled_1_player_id}{Unique 
identifier of the first player who fumbled on the play.}\n#' \\item{fumbled_1_player_name}{String name of the first player who fumbled on the play.}\n#' \\item{fumbled_2_player_id}{Unique identifier of the second player who fumbled on the play.}\n#' \\item{fumbled_2_player_name}{String name of the second player who fumbled on the play.}\n#' \\item{fumbled_2_team}{Team of the second player who fumbled on the play.}\n#' \\item{fumble_recovery_1_team}{Team of one of the players with a fumble recovery.}\n#' \\item{fumble_recovery_1_yards}{Yards gained by one of the players with a fumble recovery.}\n#' \\item{fumble_recovery_1_player_id}{Unique identifier of one of the players with a fumble recovery.}\n#' \\item{fumble_recovery_1_player_name}{String name of one of the players with a fumble recovery.}\n#' \\item{fumble_recovery_2_team}{Team of one of the players with a fumble recovery.}\n#' \\item{fumble_recovery_2_yards}{Yards gained by one of the players with a fumble recovery.}\n#' \\item{fumble_recovery_2_player_id}{Unique identifier of one of the players with a fumble recovery.}\n#' \\item{fumble_recovery_2_player_name}{String name of one of the players with a fumble recovery.}\n#' \\item{sack_player_id}{Unique identifier of the player who recorded a solo sack.}\n#' \\item{sack_player_name}{String name of the player who recorded a solo sack.}\n#' \\item{half_sack_1_player_id}{Unique identifier of the first player who recorded half a sack.}\n#' \\item{half_sack_1_player_name}{String name of the first player who recorded half a sack.}\n#' \\item{half_sack_2_player_id}{Unique identifier of the second player who recorded half a sack.}\n#' \\item{half_sack_2_player_name}{String name of the second player who recorded half a sack.}\n#' \\item{return_team}{String abbreviation of the return team.}\n#' \\item{return_yards}{Yards gained by the return team.}\n#' \\item{penalty_team}{String abbreviation of the team with the penalty.}\n#' 
\\item{penalty_player_id}{Unique identifier for the player with the penalty.}\n#' \\item{penalty_player_name}{String name for the player with the penalty.}\n#' \\item{penalty_yards}{Yards gained (or lost) by the posteam from the penalty.}\n#' \\item{replay_or_challenge}{Binary indicator for whether or not a replay or challenge occurred.}\n#' \\item{replay_or_challenge_result}{String indicating the result of the replay or challenge.}\n#' \\item{penalty_type}{String indicating the penalty type of the first penalty in the given play. Will be `NA` if `desc` is missing the type.}\n#' \\item{defensive_two_point_attempt}{Binary indicator for whether or not the defense had an attempt on a two point conversion; this occurs following a turnover.}\n#' \\item{defensive_two_point_conv}{Binary indicator for whether or not the defense successfully scored on the two point conversion.}\n#' \\item{defensive_extra_point_attempt}{Binary indicator for whether or not the defense had an attempt on an extra point attempt; this occurs following a blocked attempt where the defense recovers the ball.}\n#' \\item{defensive_extra_point_conv}{Binary indicator for whether or not the defense successfully scored on an extra point attempt.}\n#' \\item{safety_player_name}{String name for the player who scored a safety.}\n#' \\item{safety_player_id}{Unique identifier for the player who scored a safety.}\n#' \\item{season}{4 digit number indicating which season the game belongs to.}\n#' \\item{cp}{Numeric value indicating the probability for a complete pass based on comparable game situations.}\n#' \\item{cpoe}{For a single pass play this is 1 - cp when the pass was completed or 0 - cp when the pass was incomplete. 
Analyzed for a whole game or season an indicator for the passer how much over or under expectation his completion percentage was.}\n#' \\item{series}{Starts at 1, each new first down increments, numbers shared across both teams NA: kickoffs, extra point/two point conversion attempts, non-plays, no posteam}\n#' \\item{series_success}{1: scored touchdown, gained enough yards for first down.}\n#' \\item{series_result}{Possible values: First down, Touchdown, Opp touchdown, Field goal, Missed field goal, Safety, Turnover, Punt, Turnover on downs, QB kneel, End of half}\n#' \\item{order_sequence}{Column provided by NFL to fix out-of-order plays. Available 2011 and beyond with source \"nfl\".}\n#' \\item{start_time}{Kickoff time in eastern time zone.}\n#' \\item{time_of_day}{Time of day of play in UTC \"HH:MM:SS\" format. Available 2011 and beyond with source \"nfl\".}\n#' \\item{stadium}{Game site name.}\n#' \\item{weather}{String describing the weather including temperature, humidity and wind (direction and speed). Doesn't change during the game!}\n#' \\item{nfl_api_id}{UUID of the game in the new NFL API.}\n#' \\item{play_clock}{Time on the playclock when the ball was snapped.}\n#' \\item{play_deleted}{Binary indicator for deleted plays.}\n#' \\item{play_type_nfl}{Play type as listed in the NFL source. Slightly different to the regular play_type variable.}\n#' \\item{special_teams_play}{Binary indicator for whether play is special teams play from NFL source. Available 2011 and beyond with source \"nfl\".}\n#' \\item{st_play_type}{Type of special teams play from NFL source. 
Available 2011 and beyond with source \"nfl\".}\n#' \\item{end_clock_time}{Game time at the end of a given play.}\n#' \\item{end_yard_line}{String indicating the yardline at the end of the given play consisting of team half and yard line number.}\n#' \\item{fixed_drive}{Manually created drive number in a game.}\n#' \\item{fixed_drive_result}{Manually created drive result.}\n#' \\item{drive_real_start_time}{Local day time when the drive started (currently not used by the NFL and therefore mostly 'NA').}\n#' \\item{drive_play_count}{Numeric value of how many regular plays happened in a given drive.}\n#' \\item{drive_time_of_possession}{Time of possession in a given drive.}\n#' \\item{drive_first_downs}{Number of first downs in a given drive.}\n#' \\item{drive_inside20}{Binary indicator if the offense was able to get inside the opponents 20 yard line.}\n#' \\item{drive_ended_with_score}{Binary indicator the drive ended with a score.}\n#' \\item{drive_quarter_start}{Numeric value indicating in which quarter the given drive has started.}\n#' \\item{drive_quarter_end}{Numeric value indicating in which quarter the given drive has ended.}\n#' \\item{drive_yards_penalized}{Numeric value of how many yards the offense gained or lost through penalties in the given drive.}\n#' \\item{drive_start_transition}{String indicating how the offense got the ball.}\n#' \\item{drive_end_transition}{String indicating how the offense lost the ball.}\n#' \\item{drive_game_clock_start}{Game time at the beginning of a given drive.}\n#' \\item{drive_game_clock_end}{Game time at the end of a given drive.}\n#' \\item{drive_start_yard_line}{String indicating where a given drive started consisting of team half and yard line number.}\n#' \\item{drive_end_yard_line}{String indicating where a given drive ended consisting of team half and yard line number.}\n#' \\item{drive_play_id_started}{Play_id of the first play in the given drive.}\n#' \\item{drive_play_id_ended}{Play_id of the last play in the 
given drive.}\n#' \\item{away_score}{Total points scored by the away team.}\n#' \\item{home_score}{Total points scored by the home team.}\n#' \\item{location}{Either 'Home' or 'Neutral' indicating if the home team played at home or at a neutral site.}\n#' \\item{result}{Equals home_score - away_score and means the game outcome from the perspective of the home team.}\n#' \\item{total}{Equals home_score + away_score and means the total points scored in the given game.}\n#' \\item{spread_line}{The closing spread line for the game. A positive number means the home team was favored by that many points, a negative number means the away team was favored by that many points. (Source: Pro-Football-Reference)}\n#' \\item{total_line}{The closing total line for the game. (Source: Pro-Football-Reference)}\n#' \\item{div_game}{Binary indicator for if the given game was a division game.}\n#' \\item{roof}{One of 'dome', 'outdoors', 'closed', 'open' indicating the roof status of the stadium the game was played in. (Source: Pro-Football-Reference)}\n#' \\item{surface}{What type of ground the game was played on. (Source: Pro-Football-Reference)}\n#' \\item{temp}{The temperature at the stadium only for 'roof' = 'outdoors' or 'open'. (Source: Pro-Football-Reference)}\n#' \\item{wind}{The speed of the wind in miles/hour only for 'roof' = 'outdoors' or 'open'. (Source: Pro-Football-Reference)}\n#' \\item{home_coach}{First and last name of the home team coach. (Source: Pro-Football-Reference)}\n#' \\item{away_coach}{First and last name of the away team coach. (Source: Pro-Football-Reference)}\n#' \\item{stadium_id}{ID of the stadium the game was played in. (Source: Pro-Football-Reference)}\n#' \\item{game_stadium}{Name of the stadium the game was played in. (Source: Pro-Football-Reference)}\n#' \\item{success}{Binary indicator whether epa > 0 in the given play. 
}\n#' \\item{passer}{Name of the dropback player (scrambles included) including plays with penalties.}\n#' \\item{passer_jersey_number}{Jersey number of the passer.}\n#' \\item{rusher}{Name of the rusher (no scrambles) including plays with penalties.}\n#' \\item{rusher_jersey_number}{Jersey number of the rusher.}\n#' \\item{receiver}{Name of the receiver including plays with penalties.}\n#' \\item{receiver_jersey_number}{Jersey number of the receiver.}\n#' \\item{pass}{Binary indicator if the play was a pass play (sacks and scrambles included).}\n#' \\item{rush}{Binary indicator if the play was a rushing play.}\n#' \\item{first_down}{Binary indicator if the play ended in a first down.}\n#' \\item{aborted_play}{Binary indicator if the play description indicates \"Aborted\".}\n#' \\item{special}{Binary indicator if the play was a special teams play.}\n#' \\item{play}{Binary indicator: 1 if the play was a 'normal' play (including penalties), 0 otherwise.}\n#' \\item{passer_id}{ID of the player in the 'passer' column.}\n#' \\item{rusher_id}{ID of the player in the 'rusher' column.}\n#' \\item{receiver_id}{ID of the player in the 'receiver' column.}\n#' \\item{name}{Name of the 'passer' if it is not 'NA', or name of the 'rusher' otherwise.}\n#' \\item{jersey_number}{Jersey number of the player listed in the 'name' column.}\n#' \\item{id}{ID of the player in the 'name' column.}\n#' \\item{fantasy_player_name}{Name of the rusher on rush plays or receiver on pass plays (from official stats).}\n#' \\item{fantasy_player_id}{ID of the rusher on rush plays or receiver on pass plays (from official stats).}\n#' \\item{fantasy}{Name of the rusher on rush plays or receiver on pass plays.}\n#' \\item{fantasy_id}{ID of the rusher on rush plays or receiver on pass plays.}\n#' \\item{out_of_bounds}{1 if play description contains ran ob, pushed ob, or sacked ob; 0 otherwise.}\n#' \\item{home_opening_kickoff}{= 1 if the home team received the opening kickoff, 0 otherwise.}\n#' 
\\item{qb_epa}{Gives QB credit for EPA for up to the point where a receiver lost a fumble after a completed catch and makes EPA work more like passing yards on plays with fumbles.}\n#' \\item{xyac_epa}{Expected value of EPA gained after the catch, starting from where the catch was made. Zero yards after the catch would be listed as zero EPA.}\n#' \\item{xyac_mean_yardage}{Average expected yards after the catch based on where the ball was caught.}\n#' \\item{xyac_median_yardage}{Median expected yards after the catch based on where the ball was caught.}\n#' \\item{xyac_success}{Probability play earns positive EPA (relative to where play started) based on where ball was caught.}\n#' \\item{xyac_fd}{Probability play earns a first down based on where the ball was caught.}\n#' \\item{xpass}{Probability of dropback scaled from 0 to 1.}\n#' \\item{pass_oe}{Dropback percent over expected on a given play scaled from 0 to 100.}\n"
  },
  {
    "path": "data-raw/wordmarks.R",
    "content": "library(dplyr)\n\nteams <- nflfastR::teams_colors_logos |>\n  dplyr::filter(!team_abbr %in% c(\"LAR\", \"OAK\", \"SD\", \"STL\"))\n\npurrr::walk(teams$team_abbr, function(x) {\n  load <- glue::glue(\n    \"https://static.www.nfl.com/league/apps/clubs/wordmarks/{x}_fullcolor.png\"\n  ) |>\n    magick::image_read() |>\n    magick::image_trim()\n\n  info <- magick::image_info(load)\n\n  rl <- (700 - info$width) / 2\n  tb <- (192 - info$height) / 2\n\n  image <- magick::image_border(load, \"transparent\", glue::glue(\"{rl}x{tb}\"))\n\n  magick::image_write(\n    image,\n    path = glue::glue(\"wordmarks/{x}.png\"),\n    format = \"png\"\n  )\n\n  if (x == \"LA\") {\n    magick::image_write(image, path = \"wordmarks/LAR.png\", format = \"png\")\n    magick::image_write(image, path = \"wordmarks/STL.png\", format = \"png\")\n  } else if (x == \"LAC\") {\n    magick::image_write(image, path = \"wordmarks/SD.png\", format = \"png\")\n  } else if (x == \"LV\") {\n    magick::image_write(image, path = \"wordmarks/OAK.png\", format = \"png\")\n  }\n})\n"
  },
  {
    "path": "man/add_qb_epa.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/helper_additional_functions.R\n\\name{add_qb_epa}\n\\alias{add_qb_epa}\n\\title{Compute QB epa}\n\\usage{\nadd_qb_epa(pbp, ...)\n}\n\\arguments{\n\\item{pbp}{is a Data frame of play-by-play data scraped using \\code{\\link[=fast_scraper]{fast_scraper()}}.}\n\n\\item{...}{Additional arguments passed to a message function (for internal use).}\n}\n\\description{\nCompute QB epa\n}\n\\details{\nAdd the variable 'qb_epa', which gives QB credit for EPA for up to the point where\na receiver lost a fumble after a completed catch and makes EPA work more\nlike passing yards on plays with fumbles\n}\n"
  },
  {
    "path": "man/add_xpass.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/helper_add_xpass.R\n\\name{add_xpass}\n\\alias{add_xpass}\n\\title{Add expected pass columns}\n\\usage{\nadd_xpass(pbp, ...)\n}\n\\arguments{\n\\item{pbp}{is a Data frame of play-by-play data scraped using \\code{\\link[=fast_scraper]{fast_scraper()}}.}\n\n\\item{...}{Additional arguments passed to a message function (for internal use).}\n}\n\\value{\nThe input Data Frame of the parameter \\code{pbp} with the following columns\nadded:\n\\describe{\n\\item{xpass}{Probability of dropback scaled from 0 to 1.}\n\\item{pass_oe}{Dropback percent over expected on a given play scaled from 0 to 100.}\n}\n}\n\\description{\nBuild columns from the expected dropback model. Will return\n\\code{NA} on data prior to 2006 since that was before NFL started marking scrambles.\nMust be run on a dataframe that has already had \\code{\\link[=clean_pbp]{clean_pbp()}} run on it.\nNote that the functions \\code{\\link[=build_nflfastR_pbp]{build_nflfastR_pbp()}} and\nthe database function \\code{\\link[=update_db]{update_db()}} already include this function.\n}\n"
  },
  {
    "path": "man/add_xyac.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/helper_add_xyac.R\n\\name{add_xyac}\n\\alias{add_xyac}\n\\title{Add expected yards after completion (xyac) variables}\n\\usage{\nadd_xyac(pbp, ...)\n}\n\\arguments{\n\\item{pbp}{is a Data frame of play-by-play data scraped using \\code{\\link[=fast_scraper]{fast_scraper()}}.}\n\n\\item{...}{Additional arguments passed to a message function (for internal use).}\n}\n\\value{\nThe input Data Frame of the parameter 'pbp' with the following columns\nadded:\n\\describe{\n\\item{xyac_epa}{Expected value of EPA gained after the catch, starting from where the catch was made. Zero yards after the catch would be listed as zero EPA.}\n\\item{xyac_success}{Probability play earns positive EPA (relative to where play started) based on where ball was caught.}\n\\item{xyac_fd}{Probability play earns a first down based on where the ball was caught.}\n\\item{xyac_mean_yardage}{Average expected yards after the catch based on where the ball was caught.}\n\\item{xyac_median_yardage}{Median expected yards after the catch based on where the ball was caught.}\n}\n}\n\\description{\nAdd expected yards after completion (xyac) variables\n}\n\\details{\nBuild columns that capture what we should expect after the catch.\n}\n"
  },
  {
    "path": "man/build_nflfastR_pbp.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/build_nflfastR_pbp.R\n\\name{build_nflfastR_pbp}\n\\alias{build_nflfastR_pbp}\n\\title{Build a Complete nflfastR Data Set}\n\\usage{\nbuild_nflfastR_pbp(\n  game_ids,\n  dir = getOption(\"nflfastR.raw_directory\", default = NULL),\n  ...,\n  decode = TRUE,\n  rules = TRUE\n)\n}\n\\arguments{\n\\item{game_ids}{Vector of character ids or a data frame including the variable\n\\code{game_id} (see details for further information).}\n\n\\item{dir}{Path to local directory (defaults to option \"nflfastR.raw_directory\")\nwhere nflfastR searches for raw game play-by-play data.\nSee \\code{\\link[=save_raw_pbp]{save_raw_pbp()}} for additional information.}\n\n\\item{...}{Additional arguments passed to the scraping functions (for internal use)}\n\n\\item{decode}{If \\code{TRUE}, the function \\code{\\link[=decode_player_ids]{decode_player_ids()}} will be executed.}\n\n\\item{rules}{If \\code{FALSE}, printing of the header and footer in the console output will be suppressed.}\n}\n\\value{\nAn nflfastR play-by-play data frame like it can be loaded from \\url{https://github.com/nflverse/nflverse-data}.\n}\n\\description{\n\\code{build_nflfastR_pbp} is a convenient wrapper around 6 nflfastR functions:\n\n\\itemize{\n\\item{\\code{\\link[=fast_scraper]{fast_scraper()}}}\n\\item{\\code{\\link[=clean_pbp]{clean_pbp()}}}\n\\item{\\code{\\link[=add_qb_epa]{add_qb_epa()}}}\n\\item{\\code{\\link[=add_xyac]{add_xyac()}}}\n\\item{\\code{\\link[=add_xpass]{add_xpass()}}}\n\\item{\\code{\\link[=decode_player_ids]{decode_player_ids()}}}\n}\n\nPlease see either the documentation of each function or\n\\href{https://nflfastr.com/articles/field_descriptions.html}{the nflfastR Field Descriptions website}\nto learn about the output.\n}\n\\details{\nTo load valid game_ids please use the package function \\code{\\link[=fast_scraper_schedules]{fast_scraper_schedules()}}.\n}\n\\examples{\n\\donttest{\n# 
Build nflfastR pbp for the 2018 and 2019 Super Bowls\ntry({# to avoid CRAN test problems\nbuild_nflfastR_pbp(c(\"2018_21_NE_LA\", \"2019_21_SF_KC\"))\n})\n\n# It is also possible to directly use the\n# output of `load_schedules` as input\ntry({# to avoid CRAN test problems\nnflreadr::load_schedules(2025) |>\n  dplyr::slice_tail(n = 3) |>\n  build_nflfastR_pbp()\n})\n\n\\dontshow{\n# Close open connections for R CMD Check\nfuture::plan(\"sequential\")\n}\n}\n}\n\\seealso{\nFor information on parallel processing and progress updates please\nsee \\link{nflfastR}.\n}\n"
  },
  {
    "path": "man/calculate_expected_points.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/ep_wp_calculators.R\n\\name{calculate_expected_points}\n\\alias{calculate_expected_points}\n\\title{Compute expected points}\n\\usage{\ncalculate_expected_points(pbp_data)\n}\n\\arguments{\n\\item{pbp_data}{Play-by-play dataset to estimate expected points for.}\n}\n\\value{\nThe original pbp_data with the following columns appended to it:\n\\describe{\n\\item{ep}{expected points.}\n\\item{no_score_prob}{probability of no more scoring this half.}\n\\item{opp_fg_prob}{probability next score opponent field goal this half.}\n\\item{opp_safety_prob}{probability next score opponent safety  this half.}\n\\item{opp_td_prob}{probability of next score opponent touchdown this half.}\n\\item{fg_prob}{probability next score field goal this half.}\n\\item{safety_prob}{probability next score safety this half.}\n\\item{td_prob}{probability text score touchdown this half.}\n}\n}\n\\description{\nfor provided plays. Returns the data with\nprobabilities of each scoring event and EP added. The following columns\nmust be present: season, home_team, posteam, roof (coded as 'open',\n'closed', or 'retractable'), half_seconds_remaining, yardline_100,\nydstogo, posteam_timeouts_remaining, defteam_timeouts_remaining\n}\n\\details{\nComputes expected points for provided plays. Returns the data with\nprobabilities of each scoring event and EP added. 
The following columns\nmust be present:\n\\itemize{\n\\item{season}\n\\item{home_team}\n\\item{posteam}\n\\item{roof (coded as 'outdoors', 'dome', or 'open'/'closed'/NA (retractable))}\n\\item{half_seconds_remaining}\n\\item{yardline_100}\n\\item{down}\n\\item{ydstogo}\n\\item{posteam_timeouts_remaining}\n\\item{defteam_timeouts_remaining}\n}\n}\n\\examples{\n\\donttest{\ntry({# to avoid CRAN test problems\nlibrary(dplyr)\ndata <- tibble::tibble(\n\"season\" = 1999:2019,\n\"home_team\" = \"SEA\",\n\"posteam\" = \"SEA\",\n\"roof\" = \"outdoors\",\n\"half_seconds_remaining\" = 1800,\n\"yardline_100\" = c(rep(80, 17), rep(75, 4)),\n\"down\" = 1,\n\"ydstogo\" = 10,\n\"posteam_timeouts_remaining\" = 3,\n\"defteam_timeouts_remaining\" = 3\n)\n\nnflfastR::calculate_expected_points(data) |>\n  dplyr::select(season, yardline_100, td_prob, ep)\n})\n}\n}\n"
  },
  {
    "path": "man/calculate_player_stats.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/aggregate_game_stats.R\n\\name{calculate_player_stats}\n\\alias{calculate_player_stats}\n\\title{Get Official Game Stats}\n\\usage{\ncalculate_player_stats(pbp, weekly = FALSE)\n}\n\\arguments{\n\\item{pbp}{A Data frame of NFL play-by-play data typically loaded with\n\\code{\\link[=load_pbp]{load_pbp()}} or \\code{\\link[=build_nflfastR_pbp]{build_nflfastR_pbp()}}. If the data doesn't include the variable\n\\code{qb_epa}, the function \\code{add_qb_epa()} will be called to add it.}\n\n\\item{weekly}{If \\code{TRUE}, returns week-by-week stats, otherwise, stats\nfor the entire Data frame.}\n}\n\\value{\nA data frame including the following columns (all ID columns are\ndecoded to the gsis ID format):\n\\describe{\n\\item{player_id}{ID of the player. Use this to join to other sources.}\n\\item{player_name}{Name of the player}\n\\item{player_display_name}{Full name of the player}\n\\item{position}{Position of the player}\n\\item{position_group}{Position group of the player}\n\\item{headshot_url}{URL to a player headshot image}\n\\item{games}{The number of games where the player recorded passing, rushing or receiving stats.}\n\\item{recent_team}{Most recent team player appears in \\code{pbp} with.}\n\\item{season}{Season if \\code{weekly} is \\code{TRUE}}\n\\item{week}{Week if \\code{weekly} is \\code{TRUE}}\n\\item{season_type}{\\code{REG} or \\code{POST} if \\code{weekly} is \\code{TRUE}}\n\\item{opponent_team}{The player's opponent team if \\code{weekly} is \\code{TRUE}}\n\\item{completions}{The number of completed passes.}\n\\item{attempts}{The number of pass attempts as defined by the NFL.}\n\\item{passing_yards}{Yards gained on pass plays.}\n\\item{passing_tds}{The number of passing touchdowns.}\n\\item{interceptions}{The number of interceptions thrown.}\n\\item{sacks}{The Number of times sacked.}\n\\item{sack_yards}{Yards lost on sack plays.}\n\\item{sack_fumbles}{The 
number of sacks with a fumble.}\n\\item{sack_fumbles_lost}{The number of sacks with a lost fumble.}\n\\item{passing_air_yards}{Passing air yards (includes incomplete passes).}\n\\item{passing_yards_after_catch}{Yards after the catch gained on plays in\nwhich player was the passer (this is an unofficial stat and may differ slightly\nbetween different sources).}\n\\item{passing_first_downs}{First downs on pass attempts.}\n\\item{passing_epa}{Total expected points added on pass attempts and sacks.\nNOTE: this uses the variable \\code{qb_epa}, which gives QB credit for EPA for up\nto the point where a receiver lost a fumble after a completed catch and makes\nEPA work more like passing yards on plays with fumbles.}\n\\item{passing_2pt_conversions}{Two-point conversion passes.}\n\\item{pacr}{Passing Air Conversion Ratio. PACR = \\code{passing_yards} / \\code{passing_air_yards}}\n\\item{dakota}{Adjusted EPA + CPOE composite based on coefficients which best predict adjusted EPA/play in the following year.}\n\\item{carries}{The number of official rush attempts (incl. scrambles and kneel downs).\nRushes after a lateral reception don't count as carry.}\n\\item{rushing_yards}{Yards gained when rushing with the ball (incl. scrambles and kneel downs).\nAlso includes yards gained after obtaining a lateral on a play that started\nwith a rushing attempt.}\n\\item{rushing_tds}{The number of rushing touchdowns (incl. scrambles).\nAlso includes touchdowns after obtaining a lateral on a play that started\nwith a rushing attempt.}\n\\item{rushing_fumbles}{The number of rushes with a fumble.}\n\\item{rushing_fumbles_lost}{The number of rushes with a lost fumble.}\n\\item{rushing_first_downs}{First downs on rush attempts (incl. scrambles).}\n\\item{rushing_epa}{Expected points added on rush attempts (incl. scrambles and kneel downs).}\n\\item{rushing_2pt_conversions}{Two-point conversion rushes}\n\\item{receptions}{The number of pass receptions. 
Lateral receptions officially\ndon't count as reception.}\n\\item{targets}{The number of pass plays where the player was the targeted receiver.}\n\\item{receiving_yards}{Yards gained after a pass reception. Includes yards\ngained after receiving a lateral on a play that started as a pass play.}\n\\item{receiving_tds}{The number of touchdowns following a pass reception.\nAlso includes touchdowns after receiving a lateral on a play that started\nas a pass play.}\n\\item{receiving_air_yards}{Receiving air yards (incl. incomplete passes).}\n\\item{receiving_yards_after_catch}{Yards after the catch gained on plays in\nwhich player was receiver (this is an unofficial stat and may differ slightly\nbetween different sources).}\n\\item{receiving_fumbles}{The number of fumbles after a pass reception.}\n\\item{receiving_fumbles_lost}{The number of fumbles lost after a pass reception.}\n\\item{receiving_2pt_conversions}{Two-point conversion receptions}\n\\item{racr}{Receiver Air Conversion Ratio. RACR = \\code{receiving_yards} / \\code{receiving_air_yards}}\n\\item{target_share}{The share of targets of the player in all targets of his team}\n\\item{air_yards_share}{The share of receiving_air_yards of the player in all air_yards of his team}\n\\item{wopr}{Weighted Opportunity Rating. 
WOPR = 1.5 × \\code{target_share} + 0.7 × \\code{air_yards_share}}\n\\item{fantasy_points}{Standard fantasy points.}\n\\item{fantasy_points_ppr}{PPR fantasy points.}\n}\n}\n\\description{\n\\ifelse{html}{\\href{https://lifecycle.r-lib.org/articles/stages.html#deprecated}{\\figure{lifecycle-deprecated.svg}{options: alt='[Deprecated]'}}}{\\strong{[Deprecated]}}\n\nThis function was deprecated because we have a new, much better and\nharmonized approach in \\code{\\link[=calculate_stats]{calculate_stats()}}.\n\nBuild columns that aggregate official passing, rushing, and receiving stats\neither at the game level or at the level of the entire data frame passed.\n}\n\\examples{\n\\donttest{\ntry({# to avoid CRAN test problems\n# pbp <- nflfastR::load_pbp(2020)\n\n# weekly <- calculate_player_stats(pbp, weekly = TRUE)\n# dplyr::glimpse(weekly)\n\n# overall <- calculate_player_stats(pbp, weekly = FALSE)\n# dplyr::glimpse(overall)\n})\n}\n}\n\\seealso{\nThe function \\code{\\link[=load_player_stats]{load_player_stats()}} and the corresponding examples\non \\href{https://nflfastr.com/articles/nflfastR.html#example-11-replicating-official-stats}{the nflfastR website}\n}\n\\keyword{internal}\n"
  },
  {
    "path": "man/calculate_player_stats_def.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/aggregate_game_stats_def.R\n\\name{calculate_player_stats_def}\n\\alias{calculate_player_stats_def}\n\\title{Get Official Game Stats on Defense}\n\\usage{\ncalculate_player_stats_def(pbp, weekly = FALSE)\n}\n\\arguments{\n\\item{pbp}{A Data frame of NFL play-by-play data typically loaded with\n\\code{\\link[=load_pbp]{load_pbp()}} or \\code{\\link[=build_nflfastR_pbp]{build_nflfastR_pbp()}}. If the data doesn't include the variable\n\\code{qb_epa}, the function \\code{add_qb_epa()} will be called to add it.}\n\n\\item{weekly}{If \\code{TRUE}, returns week-by-week stats, otherwise, stats\nfor the entire Data frame.}\n}\n\\value{\nA data frame of defensive player stats. See dictionary (# TODO)\n}\n\\description{\n\\ifelse{html}{\\href{https://lifecycle.r-lib.org/articles/stages.html#deprecated}{\\figure{lifecycle-deprecated.svg}{options: alt='[Deprecated]'}}}{\\strong{[Deprecated]}}\n\nThis function was deprecated because we have a new, much better and\nharmonized approach in \\code{\\link[=calculate_stats]{calculate_stats()}}.\n\nBuild columns that aggregate official defense stats\neither at the game level or at the level of the entire data frame passed.\n}\n\\examples{\n\\donttest{\ntry({# to avoid CRAN test problems\n  # pbp <- nflfastR::load_pbp(2020)\n\n  # weekly <- calculate_player_stats_def(pbp, weekly = TRUE)\n  # dplyr::glimpse(weekly)\n\n  # overall <- calculate_player_stats_def(pbp, weekly = FALSE)\n  # dplyr::glimpse(overall)\n})\n}\n\n}\n\\seealso{\nThe function \\code{\\link[=load_player_stats]{load_player_stats()}} and the corresponding examples\non \\href{https://nflfastr.com/articles/nflfastR.html#example-11-replicating-official-stats}{the nflfastR website}\n}\n\\keyword{internal}\n"
  },
  {
    "path": "man/calculate_player_stats_kicking.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/aggregate_game_stats_kicking.R\n\\name{calculate_player_stats_kicking}\n\\alias{calculate_player_stats_kicking}\n\\title{Summarize Kicking Stats}\n\\usage{\ncalculate_player_stats_kicking(pbp, weekly = FALSE)\n}\n\\arguments{\n\\item{pbp}{A Data frame of NFL play-by-play data typically loaded with\n\\code{\\link[=load_pbp]{load_pbp()}} or \\code{\\link[=build_nflfastR_pbp]{build_nflfastR_pbp()}}.}\n\n\\item{weekly}{If \\code{TRUE}, returns week-by-week stats, otherwise, stats for\nthe entire data frame in argument \\code{pbp}.}\n}\n\\value{\na dataframe of kicking stats\n}\n\\description{\n\\ifelse{html}{\\href{https://lifecycle.r-lib.org/articles/stages.html#deprecated}{\\figure{lifecycle-deprecated.svg}{options: alt='[Deprecated]'}}}{\\strong{[Deprecated]}}\n\nThis function was deprecated because we have a new, much better and\nharmonized approach in \\code{\\link[=calculate_stats]{calculate_stats()}}.\n\nBuild columns that aggregate kicking stats at the game level.\n}\n\\examples{\n\\donttest{\ntry({# to avoid CRAN test problems\n    # pbp <- nflreadr::load_pbp(2021)\n    # weekly <- calculate_player_stats_kicking(pbp, weekly = TRUE)\n    # dplyr::glimpse(weekly)\n\n    # overall <- calculate_player_stats_kicking(pbp, weekly = FALSE)\n    # dplyr::glimpse(overall)\n})\n}\n\n}\n\\seealso{\n\\url{https://nflreadr.nflverse.com/reference/load_player_stats.html} for the nflreadr function to download this from repo (\\code{stat_type = \"kicking\"})\n}\n\\keyword{internal}\n"
  },
  {
    "path": "man/calculate_series_conversion_rates.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/calculate_series_conversion_rates.R\n\\name{calculate_series_conversion_rates}\n\\alias{calculate_series_conversion_rates}\n\\title{Compute Series Conversion Information from Play by Play}\n\\usage{\ncalculate_series_conversion_rates(pbp, weekly = FALSE)\n}\n\\arguments{\n\\item{pbp}{Play-by-play data as returned by \\code{\\link[=load_pbp]{load_pbp()}}, \\code{\\link[=build_nflfastR_pbp]{build_nflfastR_pbp()}}, or\n\\code{\\link[=fast_scraper]{fast_scraper()}}.}\n\n\\item{weekly}{If \\code{TRUE}, returns week-by-week stats, otherwise,\nseason-by-season stats in argument \\code{pbp}.}\n}\n\\value{\nA data frame of series information including the following columns:\n\\describe{\n\\item{season}{The NFL season}\n\\item{team}{NFL team abbreviation}\n\\item{week}{Week if \\code{weekly} is \\code{TRUE}}\n\\item{off_n}{The number of series the offense played (excludes QB kneel\ndowns, kickoffs, extra point/two point conversion attempts, non-plays, and\nplays that do not list a \"posteam\")}\n\\item{off_scr}{The rate at which a series ended in either new 1st down or\ntouchdown while the offense was on the field}\n\\item{off_scr_1st}{The rate at which an offense earned a 1st down\nor scored a touchdown on 1st down}\n\\item{off_scr_2nd}{The rate at which an offense earned a 1st down\nor scored a touchdown on 2nd down}\n\\item{off_scr_3rd}{The rate at which an offense earned a 1st down\nor scored a touchdown on 3rd down}\n\\item{off_scr_4th}{The rate at which an offense earned a 1st down\nor scored a touchdown on 4th down}\n\\item{off_1st}{The rate of series that ended in a new 1st down while the\noffense was on the field (does not include offensive touchdown)}\n\\item{off_td}{The rate of series that ended in an offensive touchdown while the\noffense was on the field}\n\\item{off_fg}{The rate of series that ended in a field goal attempt while the\noffense was on the 
field}\n\\item{off_punt}{The rate of series that ended in a punt while the\noffense was on the field}\n\\item{off_to}{The rate of series that ended in a turnover (including on downs), in an\nopponent score, or at the end of half (or game) while the\noffense was on the field}\n\\item{def_n}{The number of series the defense played (excludes QB kneel\ndowns, kickoffs, extra point/two point conversion attempts, non-plays, and\nplays that do not list a \"posteam\")}\n\\item{def_scr}{The rate at which a series ended in either new 1st down or\ntouchdown while the defense was on the field}\n\\item{def_scr_1st}{The rate at which a defense allowed a\n1st down or touchdown on 1st down}\n\\item{def_scr_2nd}{The rate at which a defense allowed a\n1st down or touchdown on 2nd down}\n\\item{def_scr_3rd}{The rate at which a defense allowed a\n1st down or touchdown on 3rd down}\n\\item{def_scr_4th}{The rate at which a defense allowed a\n1st down or touchdown on 4th down}\n\\item{def_1st}{The rate of series that ended in a new 1st down while the\ndefense was on the field (does not include offensive touchdown)}\n\\item{def_td}{The rate of series that ended in an offensive touchdown while the\ndefense was on the field}\n\\item{def_fg}{The rate of series that ended in a field goal attempt while the\ndefense was on the field}\n\\item{def_punt}{The rate of series that ended in a punt while the\ndefense was on the field}\n\\item{def_to}{The rate of series that ended in a turnover (including on downs), in an\nopponent score, or at the end of half (or game) while the\ndefense was on the field}\n}\n}\n\\description{\nA \"Series\" begins on a 1st and 10 and each team attempts to either earn\na new 1st down (on offense) or prevent the offense from converting a new\n1st down (on defense). 
Series conversion rate represents how many series\nhave been either converted to a new 1st down or ended in a touchdown.\nThis function computes series conversion rates on offense and defense from\nnflverse play-by-play data along with other series results.\nThe function automatically removes series that ended in a QB kneel down.\n}\n\\examples{\n\\donttest{\ntry({# to avoid CRAN test problems\n  pbp <- nflfastR::load_pbp(2021)\n\n  weekly <- calculate_series_conversion_rates(pbp, weekly = TRUE)\n  dplyr::glimpse(weekly)\n\n  overall <- calculate_series_conversion_rates(pbp, weekly = FALSE)\n  dplyr::glimpse(overall)\n})\n}\n}\n"
  },
  {
    "path": "man/calculate_standings.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/calculate_standings.R\n\\name{calculate_standings}\n\\alias{calculate_standings}\n\\title{Compute Division Standings and Conference Seeds from Play by Play}\n\\usage{\ncalculate_standings(\n  nflverse_object,\n  tiebreaker_depth = 3,\n  playoff_seeds = NULL\n)\n}\n\\arguments{\n\\item{nflverse_object}{Data object of class \\code{nflverse_data}. Either schedules\nas returned by \\code{\\link[=fast_scraper_schedules]{fast_scraper_schedules()}} or \\code{\\link[nflreadr:load_schedules]{nflreadr::load_schedules()}}.\nOr play-by-play data as returned by \\code{\\link[=load_pbp]{load_pbp()}}, \\code{\\link[=build_nflfastR_pbp]{build_nflfastR_pbp()}}, or\n\\code{\\link[=fast_scraper]{fast_scraper()}}.}\n\n\\item{tiebreaker_depth}{A single value equal to 1, 2, or 3. The default is 3. The\nvalue controls the depth of tiebreakers that shall be applied. The deepest\ncurrently implemented tiebreaker is strength of schedule. The following\nvalues are valid:\n\\describe{\n\\item{tiebreaker_depth = 1}{Break all ties with a coinflip. Fastest variant.}\n\\item{tiebreaker_depth = 2}{Apply head-to-head and division win percentage tiebreakers. Random if still tied.}\n\\item{tiebreaker_depth = 3}{Apply all tiebreakers through strength of schedule. Random if still tied.}\n}}\n\n\\item{playoff_seeds}{Number of playoff teams per conference. If \\code{NULL} (the\ndefault), the function will try to split \\code{nflverse_object} into seasons prior\n2020 (6 seeds) and 2020ff (7 seeds). 
If set to a numeric, it will be used\nfor all seasons in \\code{nflverse_object}!}\n}\n\\value{\nA tibble with NFL regular season standings\n}\n\\description{\n\\ifelse{html}{\\href{https://lifecycle.r-lib.org/articles/stages.html#deprecated}{\\figure{lifecycle-deprecated.svg}{options: alt='[Deprecated]'}}}{\\strong{[Deprecated]}}\n\nThis function was deprecated and replaced by \\code{\\link[nflseedR:nfl_standings]{nflseedR::nfl_standings()}}.\n\nThis function calculates division standings as well as playoff\nseeds per conference based on either nflverse play-by-play data or nflverse\nschedule data.\n}\n\\examples{\n\\donttest{\ntry({# to avoid CRAN test problems\n  # load nflverse data both schedules and pbp\n  # scheds <- fast_scraper_schedules(2014)\n  # pbp <- load_pbp(c(2018, 2021))\n\n  # calculate standings based on pbp\n  # calculate_standings(pbp)\n\n  # calculate standings based on schedules\n  # calculate_standings(scheds)\n})\n}\n}\n\\keyword{internal}\n"
  },
  {
    "path": "man/calculate_stats.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/calculate_stats.R\n\\name{calculate_stats}\n\\alias{calculate_stats}\n\\title{Calculate NFL Stats}\n\\usage{\ncalculate_stats(\n  seasons = nflreadr::most_recent_season(),\n  summary_level = c(\"season\", \"week\"),\n  stat_type = c(\"player\", \"team\"),\n  season_type = c(\"REG\", \"POST\", \"REG+POST\"),\n  pbp = NULL\n)\n}\n\\arguments{\n\\item{seasons}{A numeric vector of 4-digit years associated with given NFL\nseasons - defaults to latest season. If set to TRUE, returns all available\ndata since 1999. Ignored if argument \\code{pbp} is not \\code{NULL}.}\n\n\\item{summary_level}{Summarize stats by \\code{\"season\"} or \\code{\"week\"}.}\n\n\\item{stat_type}{Calculate \\code{\"player\"} level stats or \\code{\"team\"} level stats.}\n\n\\item{season_type}{One of \\code{\"REG\"}, \\code{\"POST\"}, or \\code{\"REG+POST\"}. Filters\ndata to regular season (\"REG\"), post season (\"POST\") or keeps all data.\nOnly applied if \\code{summary_level} == \\code{\"season\"}.}\n\n\\item{pbp}{This argument allows passing a subset of nflverse play-by-play\ndata, created with \\code{\\link[=build_nflfastR_pbp]{build_nflfastR_pbp()}} or loaded with \\code{\\link[=load_pbp]{load_pbp()}}.\nStats are then calculated based on the \\code{game_id}s and \\code{play_id}s in this\nsubset of play-by-play data, rather then using the seasons specified in the\n\\code{seasons} argument. The function will error if required variables are\nmissing from the subset, but lists which variables are missing.\nIf \\code{pbp = NULL} (the default), all available games and plays from the\n\\code{seasons} argument are used to calculate stats.\nPlease use this responsibly, because the output is structurally identical\nto full seasons, even if plays have been filtered out. It may then appear\nas if the stats are incorrect. 
If \\code{pbp} is not \\code{NULL}, the function will add\nthe attribute \\code{\"custom_pbp\" = TRUE} to the function output to help identify\nstats that are possibly based on play-by-play subsets.}\n}\n\\value{\nA tibble of player/team stats summarized by season/week.\n}\n\\description{\nCompute various NFL stats based off nflverse Play-by-Play data.\n}\n\\examples{\n\\donttest{\ntry({# to avoid CRAN test problems\nstats <- calculate_stats(2023, \"season\", \"player\")\ndplyr::glimpse(stats)\n})\n}\n}\n\\seealso{\n\\link{nfl_stats_variables} for a description of all variables.\n\n\\url{https://nflfastr.com/articles/stats_variables.html} for a searchable\ntable of the stats variable descriptions.\n}\n"
  },
  {
    "path": "man/calculate_win_probability.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/ep_wp_calculators.R\n\\name{calculate_win_probability}\n\\alias{calculate_win_probability}\n\\title{Compute win probability}\n\\usage{\ncalculate_win_probability(pbp_data)\n}\n\\arguments{\n\\item{pbp_data}{Play-by-play dataset to estimate win probability for.}\n}\n\\value{\nThe original pbp_data with the following columns appended to it:\n\\describe{\n\\item{wp}{win probability.}\n\\item{vegas_wp}{win probability taking into account pre-game spread.}\n}\n}\n\\description{\nfor provided plays. Returns the data with\nprobabilities of winning the game. The following columns\nmust be present: receive_h2_ko (1 if game is in 1st half and possession\nteam will receive 2nd half kickoff, 0 otherwise),\nhome_team, posteam, half_seconds_remaining, game_seconds_remaining,\nspread_line (how many points home team was favored by), down, ydstogo,\nyardline_100, posteam_timeouts_remaining, defteam_timeouts_remaining\n}\n\\details{\nComputes win probability for provided plays. Returns the data with\nspread and non-spread-adjusted win probabilities. 
The following columns\nmust be present:\n\\itemize{\n\\item{receive_2h_ko (1 if game is in 1st half and possession team will receive 2nd half kickoff, 0 otherwise)}\n\\item{score_differential}\n\\item{home_team}\n\\item{posteam}\n\\item{half_seconds_remaining}\n\\item{game_seconds_remaining}\n\\item{spread_line (how many points home team was favored by)}\n\\item{down}\n\\item{ydstogo}\n\\item{yardline_100}\n\\item{posteam_timeouts_remaining}\n\\item{defteam_timeouts_remaining}\n}\n}\n\\examples{\n\\donttest{\ntry({# to avoid CRAN test problems\nlibrary(dplyr)\ndata <- tibble::tibble(\n\"receive_2h_ko\" = 0,\n\"home_team\" = \"SEA\",\n\"posteam\" = \"SEA\",\n\"score_differential\" = 0,\n\"half_seconds_remaining\" = 1800,\n\"game_seconds_remaining\" = 3600,\n\"spread_line\" = c(1, 3, 4, 7, 14),\n\"down\" = 1,\n\"ydstogo\" = 10,\n\"yardline_100\" = 75,\n\"posteam_timeouts_remaining\" = 3,\n\"defteam_timeouts_remaining\" = 3\n)\n\nnflfastR::calculate_win_probability(data) |>\n  dplyr::select(spread_line, wp, vegas_wp)\n})\n}\n}\n"
  },
  {
    "path": "man/clean_pbp.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/helper_additional_functions.R\n\\name{clean_pbp}\n\\alias{clean_pbp}\n\\title{Clean Play by Play Data}\n\\usage{\nclean_pbp(pbp, ...)\n}\n\\arguments{\n\\item{pbp}{is a Data frame of play-by-play data scraped using \\code{\\link[=fast_scraper]{fast_scraper()}}.}\n\n\\item{...}{Additional arguments passed to a message function (for internal use).}\n}\n\\value{\nThe input Data Frame of the parameter 'pbp' with the following columns\nadded:\n\\describe{\n\\item{success}{Binary indicator wheter epa > 0 in the given play. }\n\\item{passer}{Name of the dropback player (scrambles included) including plays with penalties.}\n\\item{passer_jersey_number}{Jersey number of the passer.}\n\\item{rusher}{Name of the rusher (no scrambles) including plays with penalties.}\n\\item{rusher_jersey_number}{Jersey number of the rusher.}\n\\item{receiver}{Name of the receiver including plays with penalties.}\n\\item{receiver_jersey_number}{Jersey number of the receiver.}\n\\item{pass}{Binary indicator if the play was a pass play (sacks and scrambles included).}\n\\item{rush}{Binary indicator if the play was a rushing play.}\n\\item{special}{Binary indicator if the play was a special teams play.}\n\\item{first_down}{Binary indicator if the play ended in a first down.}\n\\item{aborted_play}{Binary indicator if the play description indicates \"Aborted\".}\n\\item{play}{Binary indicator: 1 if the play was a 'normal' play (including penalties), 0 otherwise.}\n\\item{passer_id}{ID of the player in the 'passer' column.}\n\\item{rusher_id}{ID of the player in the 'rusher' column.}\n\\item{receiver_id}{ID of the player in the 'receiver' column.}\n\\item{name}{Name of the 'passer' if it is not 'NA', or name of the 'rusher' otherwise.}\n\\item{fantasy}{Name of the rusher on rush plays or receiver on pass plays.}\n\\item{fantasy_id}{ID of the rusher on rush plays or receiver on pass 
plays.}\n\\item{fantasy_player_name}{Name of the rusher on rush plays or receiver on pass plays (from official stats).}\n\\item{fantasy_player_id}{ID of the rusher on rush plays or receiver on pass plays (from official stats).}\n\\item{jersey_number}{Jersey number of the player listed in the 'name' column.}\n\\item{id}{ID of the player in the 'name' column.}\n\\item{out_of_bounds}{= 1 if play description contains \"ran ob\", \"pushed ob\", or \"sacked ob\"; = 0 otherwise.}\n\\item{home_opening_kickoff}{= 1 if the home team received the opening kickoff, 0 otherwise.}\n}\n}\n\\description{\nClean Play by Play Data\n}\n\\details{\nBuild columns that capture what happens on all plays, including\npenalties, using string extraction from play description.\nLoosely based on Ben's nflfastR guide (\\url{https://nflfastr.com/articles/beginners_guide.html})\nbut updated to work with the RS data, which has a different player format in\nthe play description; e.g. 24-M.Lynch instead of M.Lynch.\nThe function also standardizes team abbreviations so that, for example,\nthe Chargers are always represented by 'LAC' regardless of which year it was.\nStarting in 2022, play-by-play data was missing GSIS player IDs of rookies.\nThis function tries to fix as many as possible.\n}\n\\seealso{\nFor information on parallel processing and progress updates please\nsee \\link{nflfastR}.\n}\n"
  },
  {
    "path": "man/decode_player_ids.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/helper_decode_player_ids.R\n\\name{decode_player_ids}\n\\alias{decode_player_ids}\n\\title{Decode the player IDs in nflfastR play-by-play data}\n\\usage{\ndecode_player_ids(pbp, ..., fast = TRUE)\n}\n\\arguments{\n\\item{pbp}{is a Data frame of play-by-play data scraped using \\code{\\link[=fast_scraper]{fast_scraper()}}.}\n\n\\item{...}{Additional arguments passed to a message function (for internal use).}\n\n\\item{fast}{If \\code{TRUE} the IDs will be decoded with the high efficient\nfunction \\link[gsisdecoder:decode_ids]{decode_ids}. If \\code{FALSE} an nflfastR internal\nfunction will be used for decoding (it is generally not recommended to do this,\nunless there is a problem with \\link[gsisdecoder:decode_ids]{decode_ids}\nwhich can take several days to fix on CRAN.)}\n}\n\\value{\nThe input data frame of the parameter \\code{pbp} with decoded player IDs.\n}\n\\description{\nTakes all columns ending with \\code{'player_id'} as well as the\nvariables \\code{'passer_id'}, \\code{'rusher_id'}, \\code{'fantasy_id'},\n\\code{'receiver_id'}, and \\code{'id'} of an nflfastR play-by-play data set\nand decodes the player IDs to the commonly known GSIS ID format 00-00xxxxx.\n\nThe function uses by default the high efficient \\link[gsisdecoder:decode_ids]{decode_ids}\nof the package \\href{https://cran.r-project.org/package=gsisdecoder}{\\code{gsisdecoder}}.\nIn the unlikely event that there is a problem with this function, an nflfastR\ninternal decoder can be used with the option \\code{fast = FALSE}.\n\nThe 2022 play by play data introduced new player IDs that can't be decoded\nwith gsisdecoder. 
In that case, IDs are joined through \\link[nflreadr:load_players]{nflreadr::load_players}.\n}\n\\examples{\n\\donttest{\n# Decode data frame consisting of some names and ids\ndecode_player_ids(data.frame(\n  name = c(\"P.Mahomes\", \"B.Baldwin\", \"P.Mahomes\", \"S.Carl\", \"J.Jones\"),\n  id = c(\n    \"32013030-2d30-3033-3338-3733fa30c4fa\",\n    NA_character_,\n    \"00-0033873\",\n    NA_character_,\n    \"32013030-2d30-3032-3739-3434d4d3846d\"\n  )\n))\n}\n}\n"
  },
  {
    "path": "man/fast_scraper.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/top-level_scraper.R\n\\name{fast_scraper}\n\\alias{fast_scraper}\n\\title{Get NFL Play by Play Data}\n\\usage{\nfast_scraper(\n  game_ids,\n  dir = getOption(\"nflfastR.raw_directory\", default = NULL),\n  ...,\n  in_builder = FALSE\n)\n}\n\\arguments{\n\\item{game_ids}{Vector of character ids or a data frame including the variable\n\\code{game_id} (see details for further information).}\n\n\\item{dir}{Path to local directory (defaults to option \"nflfastR.raw_directory\")\nwhere nflfastR searches for raw game play-by-play data.\nSee \\code{\\link[=save_raw_pbp]{save_raw_pbp()}} for additional information.}\n\n\\item{...}{Additional arguments passed to the scraping functions (for internal use)}\n\n\\item{in_builder}{If \\code{TRUE}, the final message will be suppressed (for usage inside of \\code{\\link{build_nflfastR_pbp}}).}\n}\n\\value{\nData frame where each individual row represents a single play for\nall passed game_ids containing the following\ndetailed information (description partly extracted from nflscrapR):\n\\describe{\n\\item{play_id}{Numeric play id that when used with game_id and drive provides the unique identifier for a single play.}\n\\item{game_id}{Ten digit identifier for NFL game.}\n\\item{old_game_id}{Legacy NFL game ID.}\n\\item{home_team}{String abbreviation for the home team.}\n\\item{away_team}{String abbreviation for the away team.}\n\\item{season_type}{'REG' or 'POST' indicating if the game belongs to regular or post season.}\n\\item{week}{Season week.}\n\\item{posteam}{String abbreviation for the team with possession.}\n\\item{posteam_type}{String indicating whether the posteam team is home or away.}\n\\item{defteam}{String abbreviation for the team on defense.}\n\\item{side_of_field}{String abbreviation for which team's side of the field the team with possession is currently on.}\n\\item{yardline_100}{Numeric distance in the number of yards 
from the opponent's endzone for the posteam.}\n\\item{game_date}{Date of the game.}\n\\item{quarter_seconds_remaining}{Numeric seconds remaining in the quarter.}\n\\item{half_seconds_remaining}{Numeric seconds remaining in the half.}\n\\item{game_seconds_remaining}{Numeric seconds remaining in the game.}\n\\item{game_half}{String indicating which half the play is in, either Half1, Half2, or Overtime.}\n\\item{quarter_end}{Binary indicator for whether or not the row of the data is marking the end of a quarter.}\n\\item{drive}{Numeric drive number in the game.}\n\\item{sp}{Binary indicator for whether or not a score occurred on the play.}\n\\item{qtr}{Quarter of the game (5 is overtime).}\n\\item{down}{The down for the given play.}\n\\item{goal_to_go}{Binary indicator for whether or not the posteam is in a goal down situation.}\n\\item{time}{Time at start of play provided in string format as minutes:seconds remaining in the quarter.}\n\\item{yrdln}{String indicating the current field position for a given play.}\n\\item{ydstogo}{Numeric yards in distance from either the first down marker or the endzone in goal down situations.}\n\\item{ydsnet}{Numeric value for total yards gained on the given drive.}\n\\item{desc}{Detailed string description for the given play.}\n\\item{play_type}{String indicating the type of play: pass (includes sacks), run (includes scrambles), punt, field_goal, kickoff, extra_point, qb_kneel, qb_spike, no_play (timeouts and penalties), and missing for rows indicating end of play.}\n\\item{yards_gained}{Numeric yards gained (or lost) by the possessing team, excluding yards gained via fumble recoveries and laterals.}\n\\item{shotgun}{Binary indicator for whether or not the play was in shotgun formation.}\n\\item{no_huddle}{Binary indicator for whether or not the play was in no_huddle formation.}\n\\item{qb_dropback}{Binary indicator for whether or not the QB dropped back on the play (pass attempt, sack, or scrambled).}\n\\item{qb_kneel}{Binary 
indicator for whether or not the QB took a knee.}\n\\item{qb_spike}{Binary indicator for whether or not the QB spiked the ball.}\n\\item{qb_scramble}{Binary indicator for whether or not the QB scrambled.}\n\\item{pass_length}{String indicator for pass length: short or deep.}\n\\item{pass_location}{String indicator for pass location: left, middle, or right.}\n\\item{air_yards}{Numeric value for distance in yards perpendicular to the line of scrimmage at where the targeted receiver either caught or didn't catch the ball.}\n\\item{yards_after_catch}{Numeric value for distance in yards perpendicular to the yard line where the receiver made the reception to where the play ended.}\n\\item{run_location}{String indicator for location of run: left, middle, or right.}\n\\item{run_gap}{String indicator for line gap of run: end, guard, or tackle}\n\\item{field_goal_result}{String indicator for result of field goal attempt: made, missed, or blocked.}\n\\item{kick_distance}{Numeric distance in yards for kickoffs, field goals, and punts.}\n\\item{extra_point_result}{String indicator for the result of the extra point attempt: good, failed, blocked, safety (touchback in defensive endzone is 1 point apparently), or aborted.}\n\\item{two_point_conv_result}{String indicator for result of two point conversion attempt: success, failure, safety (touchback in defensive endzone is 1 point apparently), or return.}\n\\item{home_timeouts_remaining}{Numeric timeouts remaining in the half for the home team.}\n\\item{away_timeouts_remaining}{Numeric timeouts remaining in the half for the away team.}\n\\item{timeout}{Binary indicator for whether or not a timeout was called by either team.}\n\\item{timeout_team}{String abbreviation for which team called the timeout.}\n\\item{td_team}{String abbreviation for which team scored the touchdown.}\n\\item{td_player_name}{String name of the player who scored a touchdown.}\n\\item{td_player_id}{Unique identifier of the player who scored a 
touchdown.}\n\\item{posteam_timeouts_remaining}{Number of timeouts remaining for the possession team.}\n\\item{defteam_timeouts_remaining}{Number of timeouts remaining for the team on defense.}\n\\item{total_home_score}{Score for the home team at the end of the play.}\n\\item{total_away_score}{Score for the away team at the end of the play.}\n\\item{posteam_score}{Score for the posteam at the start of the play.}\n\\item{defteam_score}{Score for the defteam at the start of the play.}\n\\item{score_differential}{Score differential between the posteam and defteam at the start of the play.}\n\\item{posteam_score_post}{Score for the posteam at the end of the play.}\n\\item{defteam_score_post}{Score for the defteam at the end of the play.}\n\\item{score_differential_post}{Score differential between the posteam and defteam at the end of the play.}\n\\item{no_score_prob}{Predicted probability of no score occurring for the rest of the half based on the expected points model.}\n\\item{opp_fg_prob}{Predicted probability of the defteam scoring a FG next.}\n\\item{opp_safety_prob}{Predicted probability of the defteam scoring a safety next.}\n\\item{opp_td_prob}{Predicted probability of the defteam scoring a TD next.}\n\\item{fg_prob}{Predicted probability of the posteam scoring a FG next.}\n\\item{safety_prob}{Predicted probability of the posteam scoring a safety next.}\n\\item{td_prob}{Predicted probability of the posteam scoring a TD next.}\n\\item{extra_point_prob}{Predicted probability of the posteam scoring an extra point.}\n\\item{two_point_conversion_prob}{Predicted probability of the posteam scoring the two point conversion.}\n\\item{ep}{Using the scoring event probabilities, the estimated expected points with respect to the possession team for the given play.}\n\\item{epa}{Expected points added (EPA) by the posteam for the given play.}\n\\item{total_home_epa}{Cumulative total EPA for the home team in the game so far.}\n\\item{total_away_epa}{Cumulative total EPA for the away 
team in the game so far.}\n\\item{total_home_rush_epa}{Cumulative total rushing EPA for the home team in the game so far.}\n\\item{total_away_rush_epa}{Cumulative total rushing EPA for the away team in the game so far.}\n\\item{total_home_pass_epa}{Cumulative total passing EPA for the home team in the game so far.}\n\\item{total_away_pass_epa}{Cumulative total passing EPA for the away team in the game so far.}\n\\item{air_epa}{EPA from the air yards alone. For completions this represents the actual value provided through the air. For incompletions this represents the hypothetical value that could've been added through the air if the pass was completed.}\n\\item{yac_epa}{EPA from the yards after catch alone. For completions this represents the actual value provided after the catch. For incompletions this represents the difference between the hypothetical air_epa and the play's raw observed EPA (how much the incomplete pass cost the posteam).}\n\\item{comp_air_epa}{EPA from the air yards alone only for completions.}\n\\item{comp_yac_epa}{EPA from the yards after catch alone only for completions.}\n\\item{total_home_comp_air_epa}{Cumulative total completions air EPA for the home team in the game so far.}\n\\item{total_away_comp_air_epa}{Cumulative total completions air EPA for the away team in the game so far.}\n\\item{total_home_comp_yac_epa}{Cumulative total completions yac EPA for the home team in the game so far.}\n\\item{total_away_comp_yac_epa}{Cumulative total completions yac EPA for the away team in the game so far.}\n\\item{total_home_raw_air_epa}{Cumulative total raw air EPA for the home team in the game so far.}\n\\item{total_away_raw_air_epa}{Cumulative total raw air EPA for the away team in the game so far.}\n\\item{total_home_raw_yac_epa}{Cumulative total raw yac EPA for the home team in the game so far.}\n\\item{total_away_raw_yac_epa}{Cumulative total raw yac EPA for the away team in the game so far.}\n\\item{wp}{Estimated win probability for the 
posteam given the current situation at the start of the given play.}\n\\item{def_wp}{Estimated win probability for the defteam.}\n\\item{home_wp}{Estimated win probability for the home team.}\n\\item{away_wp}{Estimated win probability for the away team.}\n\\item{wpa}{Win probability added (WPA) for the posteam.}\n\\item{vegas_wpa}{Win probability added (WPA) for the posteam: spread_adjusted model.}\n\\item{vegas_home_wpa}{Win probability added (WPA) for the home team: spread_adjusted model.}\n\\item{home_wp_post}{Estimated win probability for the home team at the end of the play.}\n\\item{away_wp_post}{Estimated win probability for the away team at the end of the play.}\n\\item{vegas_wp}{Estimated win probability for the posteam given the current situation at the start of the given play, incorporating pre-game Vegas line.}\n\\item{vegas_home_wp}{Estimated win probability for the home team incorporating pre-game Vegas line.}\n\\item{total_home_rush_wpa}{Cumulative total rushing WPA for the home team in the game so far.}\n\\item{total_away_rush_wpa}{Cumulative total rushing WPA for the away team in the game so far.}\n\\item{total_home_pass_wpa}{Cumulative total passing WPA for the home team in the game so far.}\n\\item{total_away_pass_wpa}{Cumulative total passing WPA for the away team in the game so far.}\n\\item{air_wpa}{WPA through the air (same logic as air_epa).}\n\\item{yac_wpa}{WPA from yards after the catch (same logic as yac_epa).}\n\\item{comp_air_wpa}{The air_wpa for completions only.}\n\\item{comp_yac_wpa}{The yac_wpa for completions only.}\n\\item{total_home_comp_air_wpa}{Cumulative total completions air WPA for the home team in the game so far.}\n\\item{total_away_comp_air_wpa}{Cumulative total completions air WPA for the away team in the game so far.}\n\\item{total_home_comp_yac_wpa}{Cumulative total completions yac WPA for the home team in the game so far.}\n\\item{total_away_comp_yac_wpa}{Cumulative total completions yac WPA for the away team in the 
game so far.}\n\\item{total_home_raw_air_wpa}{Cumulative total raw air WPA for the home team in the game so far.}\n\\item{total_away_raw_air_wpa}{Cumulative total raw air WPA for the away team in the game so far.}\n\\item{total_home_raw_yac_wpa}{Cumulative total raw yac WPA for the home team in the game so far.}\n\\item{total_away_raw_yac_wpa}{Cumulative total raw yac WPA for the away team in the game so far.}\n\\item{punt_blocked}{Binary indicator for if the punt was blocked.}\n\\item{first_down_rush}{Binary indicator for if a running play converted the first down.}\n\\item{first_down_pass}{Binary indicator for if a passing play converted the first down.}\n\\item{first_down_penalty}{Binary indicator for if a penalty converted the first down.}\n\\item{third_down_converted}{Binary indicator for if the first down was converted on third down.}\n\\item{third_down_failed}{Binary indicator for if the posteam failed to convert first down on third down.}\n\\item{fourth_down_converted}{Binary indicator for if the first down was converted on fourth down.}\n\\item{fourth_down_failed}{Binary indicator for if the posteam failed to convert first down on fourth down.}\n\\item{incomplete_pass}{Binary indicator for if the pass was incomplete.}\n\\item{touchback}{Binary indicator for if a touchback occurred on the play.}\n\\item{interception}{Binary indicator for if the pass was intercepted.}\n\\item{punt_inside_twenty}{Binary indicator for if the punt ended inside the twenty yard line.}\n\\item{punt_in_endzone}{Binary indicator for if the punt was in the endzone.}\n\\item{punt_out_of_bounds}{Binary indicator for if the punt went out of bounds.}\n\\item{punt_downed}{Binary indicator for if the punt was downed.}\n\\item{punt_fair_catch}{Binary indicator for if the punt was caught with a fair catch.}\n\\item{kickoff_inside_twenty}{Binary indicator for if the kickoff ended inside the twenty yard line.}\n\\item{kickoff_in_endzone}{Binary indicator for if the kickoff was in the 
endzone.}\n\\item{kickoff_out_of_bounds}{Binary indicator for if the kickoff went out of bounds.}\n\\item{kickoff_downed}{Binary indicator for if the kickoff was downed.}\n\\item{kickoff_fair_catch}{Binary indicator for if the kickoff was caught with a fair catch.}\n\\item{fumble_forced}{Binary indicator for if the fumble was forced.}\n\\item{fumble_not_forced}{Binary indicator for if the fumble was not forced.}\n\\item{fumble_out_of_bounds}{Binary indicator for if the fumble went out of bounds.}\n\\item{solo_tackle}{Binary indicator if the play had a solo tackle (could be multiple due to fumbles).}\n\\item{safety}{Binary indicator for whether or not a safety occurred.}\n\\item{penalty}{Binary indicator for whether or not a penalty occurred.}\n\\item{tackled_for_loss}{Binary indicator for whether or not a tackle for loss on a run play occurred.}\n\\item{fumble_lost}{Binary indicator for if the fumble was lost.}\n\\item{own_kickoff_recovery}{Binary indicator for if the kicking team recovered the kickoff.}\n\\item{own_kickoff_recovery_td}{Binary indicator for if the kicking team recovered the kickoff and scored a TD.}\n\\item{qb_hit}{Binary indicator if the QB was hit on the play.}\n\\item{rush_attempt}{Binary indicator for if the play was a run.}\n\\item{pass_attempt}{Binary indicator for if the play was a pass attempt (includes sacks).}\n\\item{sack}{Binary indicator for if the play ended in a sack.}\n\\item{touchdown}{Binary indicator for if the play resulted in a TD.}\n\\item{pass_touchdown}{Binary indicator for if the play resulted in a passing TD.}\n\\item{rush_touchdown}{Binary indicator for if the play resulted in a rushing TD.}\n\\item{return_touchdown}{Binary indicator for if the play resulted in a return TD.}\n\\item{extra_point_attempt}{Binary indicator for extra point attempt.}\n\\item{two_point_attempt}{Binary indicator for two point conversion attempt.}\n\\item{field_goal_attempt}{Binary indicator for field goal 
attempt.}\n\\item{kickoff_attempt}{Binary indicator for kickoff.}\n\\item{punt_attempt}{Binary indicator for punts.}\n\\item{fumble}{Binary indicator for if a fumble occurred.}\n\\item{complete_pass}{Binary indicator for if the pass was completed.}\n\\item{assist_tackle}{Binary indicator for if an assist tackle occurred.}\n\\item{lateral_reception}{Binary indicator for if a lateral occurred on the reception.}\n\\item{lateral_rush}{Binary indicator for if a lateral occurred on a run.}\n\\item{lateral_return}{Binary indicator for if a lateral occurred on a return.}\n\\item{lateral_recovery}{Binary indicator for if a lateral occurred on a fumble recovery.}\n\\item{passer_player_id}{Unique identifier for the player that attempted the pass.}\n\\item{passer_player_name}{String name for the player that attempted the pass.}\n\\item{passing_yards}{Numeric yards by the passer_player_name, including yards gained in pass plays with laterals.\nThis should equal official passing statistics.}\n\\item{receiver_player_id}{Unique identifier for the receiver that was targeted on the pass.}\n\\item{receiver_player_name}{String name for the targeted receiver.}\n\\item{receiving_yards}{Numeric yards by the receiver_player_name, excluding yards gained in pass plays with laterals.\nThis should equal official receiving statistics but could miss yards gained in pass plays with laterals.\nPlease see the description of \\code{lateral_receiver_player_name} for further information.}\n\\item{rusher_player_id}{Unique identifier for the player that attempted the run.}\n\\item{rusher_player_name}{String name for the player that attempted the run.}\n\\item{rushing_yards}{Numeric yards by the rusher_player_name, excluding yards gained in rush plays with laterals.\nThis should equal official rushing statistics but could miss yards gained in rush plays with laterals.\nPlease see the description of \\code{lateral_rusher_player_name} for further information.}\n\\item{lateral_receiver_player_id}{Unique 
identifier for the player that received the last(!) lateral on a pass play.}\n\\item{lateral_receiver_player_name}{String name for the player that received the last(!) lateral on a pass play.\nIf there were multiple laterals in the same play, this will only be the last player who received a lateral.\nPlease see \\url{https://github.com/mrcaseb/nfl-data/tree/master/data/lateral_yards}\nfor a list of plays where multiple players recorded lateral receiving yards.}\n\\item{lateral_receiving_yards}{Numeric yards by the \\code{lateral_receiver_player_name} in pass plays with laterals.\nPlease see the description of \\code{lateral_receiver_player_name} for further information.}\n\\item{lateral_rusher_player_id}{Unique identifier for the player that received the last(!) lateral on a run play.}\n\\item{lateral_rusher_player_name}{String name for the player that received the last(!) lateral on a run play.\nIf there were multiple laterals in the same play, this will only be the last player who received a lateral.\nPlease see \\url{https://github.com/mrcaseb/nfl-data/tree/master/data/lateral_yards}\nfor a list of plays where multiple players recorded lateral rushing yards.}\n\\item{lateral_rushing_yards}{Numeric yards by the \\code{lateral_rusher_player_name} in run plays with laterals.\nPlease see the description of \\code{lateral_rusher_player_name} for further information.}\n\\item{lateral_sack_player_id}{Unique identifier for the player that received the lateral on a sack.}\n\\item{lateral_sack_player_name}{String name for the player that received the lateral on a sack.}\n\\item{interception_player_id}{Unique identifier for the player that intercepted the pass.}\n\\item{interception_player_name}{String name for the player that intercepted the pass.}\n\\item{lateral_interception_player_id}{Unique identifier for the player that received the lateral on an interception.}\n\\item{lateral_interception_player_name}{String name for the player that received the lateral on an 
interception.}\n\\item{punt_returner_player_id}{Unique identifier for the punt returner.}\n\\item{punt_returner_player_name}{String name for the punt returner.}\n\\item{lateral_punt_returner_player_id}{Unique identifier for the player that received the lateral on a punt return.}\n\\item{lateral_punt_returner_player_name}{String name for the player that received the lateral on a punt return.}\n\\item{kickoff_returner_player_name}{String name for the kickoff returner.}\n\\item{kickoff_returner_player_id}{Unique identifier for the kickoff returner.}\n\\item{lateral_kickoff_returner_player_id}{Unique identifier for the player that received the lateral on a kickoff return.}\n\\item{lateral_kickoff_returner_player_name}{String name for the player that received the lateral on a kickoff return.}\n\\item{punter_player_id}{Unique identifier for the punter.}\n\\item{punter_player_name}{String name for the punter.}\n\\item{kicker_player_name}{String name for the kicker on FG or kickoff.}\n\\item{kicker_player_id}{Unique identifier for the kicker on FG or kickoff.}\n\\item{own_kickoff_recovery_player_id}{Unique identifier for the player that recovered their own kickoff.}\n\\item{own_kickoff_recovery_player_name}{String name for the player that recovered their own kickoff.}\n\\item{blocked_player_id}{Unique identifier for the player that blocked the punt or FG.}\n\\item{blocked_player_name}{String name for the player that blocked the punt or FG.}\n\\item{tackle_for_loss_1_player_id}{Unique identifier for one of the potential players with the tackle for loss.}\n\\item{tackle_for_loss_1_player_name}{String name for one of the potential players with the tackle for loss.}\n\\item{tackle_for_loss_2_player_id}{Unique identifier for one of the potential players with the tackle for loss.}\n\\item{tackle_for_loss_2_player_name}{String name for one of the potential players with the tackle for loss.}\n\\item{qb_hit_1_player_id}{Unique identifier for one of the potential players that hit 
the QB. No sack as the QB was not the ball carrier. For sacks please see \\code{sack_player} or \\verb{half_sack_*_player}.}\n\\item{qb_hit_1_player_name}{String name for one of the potential players that hit the QB. No sack as the QB was not the ball carrier. For sacks please see \\code{sack_player} or \\verb{half_sack_*_player}.}\n\\item{qb_hit_2_player_id}{Unique identifier for one of the potential players that hit the QB. No sack as the QB was not the ball carrier. For sacks please see \\code{sack_player} or \\verb{half_sack_*_player}.}\n\\item{qb_hit_2_player_name}{String name for one of the potential players that hit the QB. No sack as the QB was not the ball carrier. For sacks please see \\code{sack_player} or \\verb{half_sack_*_player}.}\n\\item{forced_fumble_player_1_team}{Team of one of the players with a forced fumble.}\n\\item{forced_fumble_player_1_player_id}{Unique identifier of one of the players with a forced fumble.}\n\\item{forced_fumble_player_1_player_name}{String name of one of the players with a forced fumble.}\n\\item{forced_fumble_player_2_team}{Team of one of the players with a forced fumble.}\n\\item{forced_fumble_player_2_player_id}{Unique identifier of one of the players with a forced fumble.}\n\\item{forced_fumble_player_2_player_name}{String name of one of the players with a forced fumble.}\n\\item{solo_tackle_1_team}{Team of one of the players with a solo tackle.}\n\\item{solo_tackle_2_team}{Team of one of the players with a solo tackle.}\n\\item{solo_tackle_1_player_id}{Unique identifier of one of the players with a solo tackle.}\n\\item{solo_tackle_2_player_id}{Unique identifier of one of the players with a solo tackle.}\n\\item{solo_tackle_1_player_name}{String name of one of the players with a solo tackle.}\n\\item{solo_tackle_2_player_name}{String name of one of the players with a solo tackle.}\n\\item{assist_tackle_1_player_id}{Unique identifier of one of the players with a tackle 
assist.}\n\\item{assist_tackle_1_player_name}{String name of one of the players with a tackle assist.}\n\\item{assist_tackle_1_team}{Team of one of the players with a tackle assist.}\n\\item{assist_tackle_2_player_id}{Unique identifier of one of the players with a tackle assist.}\n\\item{assist_tackle_2_player_name}{String name of one of the players with a tackle assist.}\n\\item{assist_tackle_2_team}{Team of one of the players with a tackle assist.}\n\\item{assist_tackle_3_player_id}{Unique identifier of one of the players with a tackle assist.}\n\\item{assist_tackle_3_player_name}{String name of one of the players with a tackle assist.}\n\\item{assist_tackle_3_team}{Team of one of the players with a tackle assist.}\n\\item{assist_tackle_4_player_id}{Unique identifier of one of the players with a tackle assist.}\n\\item{assist_tackle_4_player_name}{String name of one of the players with a tackle assist.}\n\\item{assist_tackle_4_team}{Team of one of the players with a tackle assist.}\n\\item{tackle_with_assist}{Binary indicator for if there has been a tackle with assist.}\n\\item{tackle_with_assist_1_player_id}{Unique identifier of one of the players with a tackle with assist.}\n\\item{tackle_with_assist_1_player_name}{String name of one of the players with a tackle with assist.}\n\\item{tackle_with_assist_1_team}{Team of one of the players with a tackle with assist.}\n\\item{tackle_with_assist_2_player_id}{Unique identifier of one of the players with a tackle with assist.}\n\\item{tackle_with_assist_2_player_name}{String name of one of the players with a tackle with assist.}\n\\item{tackle_with_assist_2_team}{Team of one of the players with a tackle with assist.}\n\\item{pass_defense_1_player_id}{Unique identifier of one of the players with a pass defense.}\n\\item{pass_defense_1_player_name}{String name of one of the players with a pass defense.}\n\\item{pass_defense_2_player_id}{Unique identifier of one of the players with a pass 
defense.}\n\\item{pass_defense_2_player_name}{String name of one of the players with a pass defense.}\n\\item{fumbled_1_team}{Team of the first player who fumbled on the play.}\n\\item{fumbled_1_player_id}{Unique identifier of the first player who fumbled on the play.}\n\\item{fumbled_1_player_name}{String name of the first player who fumbled on the play.}\n\\item{fumbled_2_player_id}{Unique identifier of the second player who fumbled on the play.}\n\\item{fumbled_2_player_name}{String name of the second player who fumbled on the play.}\n\\item{fumbled_2_team}{Team of the second player who fumbled on the play.}\n\\item{fumble_recovery_1_team}{Team of one of the players with a fumble recovery.}\n\\item{fumble_recovery_1_yards}{Yards gained by one of the players with a fumble recovery.}\n\\item{fumble_recovery_1_player_id}{Unique identifier of one of the players with a fumble recovery.}\n\\item{fumble_recovery_1_player_name}{String name of one of the players with a fumble recovery.}\n\\item{fumble_recovery_2_team}{Team of one of the players with a fumble recovery.}\n\\item{fumble_recovery_2_yards}{Yards gained by one of the players with a fumble recovery.}\n\\item{fumble_recovery_2_player_id}{Unique identifier of one of the players with a fumble recovery.}\n\\item{fumble_recovery_2_player_name}{String name of one of the players with a fumble recovery.}\n\\item{sack_player_id}{Unique identifier of the player who recorded a solo sack.}\n\\item{sack_player_name}{String name of the player who recorded a solo sack.}\n\\item{half_sack_1_player_id}{Unique identifier of the first player who recorded half a sack.}\n\\item{half_sack_1_player_name}{String name of the first player who recorded half a sack.}\n\\item{half_sack_2_player_id}{Unique identifier of the second player who recorded half a sack.}\n\\item{half_sack_2_player_name}{String name of the second player who recorded half a sack.}\n\\item{return_team}{String abbreviation of the return 
team.}\n\\item{return_yards}{Yards gained by the return team.}\n\\item{penalty_team}{String abbreviation of the team with the penalty.}\n\\item{penalty_player_id}{Unique identifier for the player with the penalty.}\n\\item{penalty_player_name}{String name for the player with the penalty.}\n\\item{penalty_yards}{Yards gained (or lost) by the posteam from the penalty.}\n\\item{replay_or_challenge}{Binary indicator for whether or not a replay or challenge occurred.}\n\\item{replay_or_challenge_result}{String indicating the result of the replay or challenge.}\n\\item{penalty_type}{String indicating the penalty type of the first penalty in the given play. Will be \\code{NA} if \\code{desc} is missing the type.}\n\\item{defensive_two_point_attempt}{Binary indicator whether or not the defense was able to have an attempt on a two point conversion; this happens following a turnover.}\n\\item{defensive_two_point_conv}{Binary indicator whether or not the defense successfully scored on the two point conversion.}\n\\item{defensive_extra_point_attempt}{Binary indicator whether or not the defense was able to have an attempt on an extra point attempt; this happens following a blocked attempt on which the defense recovers the ball.}\n\\item{defensive_extra_point_conv}{Binary indicator whether or not the defense successfully scored on an extra point attempt.}\n\\item{safety_player_name}{String name for the player who scored a safety.}\n\\item{safety_player_id}{Unique identifier for the player who scored a safety.}\n\\item{season}{4 digit number indicating the season to which the game belongs.}\n\\item{cp}{Numeric value indicating the probability for a complete pass based on comparable game situations.}\n\\item{cpoe}{For a single pass play this is 1 - cp when the pass was completed or 0 - cp when the pass was incomplete. 
Analyzed for a whole game or season, it indicates how far the passer's completion percentage was over or under expectation.}\n\\item{series}{Starts at 1, each new first down increments, numbers shared across both teams. NA: kickoffs, extra point/two point conversion attempts, non-plays, no posteam.}\n\\item{series_success}{1: scored touchdown, gained enough yards for first down.}\n\\item{series_result}{Possible values: First down, Touchdown, Opp touchdown, Field goal, Missed field goal, Safety, Turnover, Punt, Turnover on downs, QB kneel, End of half}\n\\item{start_time}{Kickoff time in eastern time zone.}\n\\item{order_sequence}{Column provided by NFL to fix out-of-order plays. Available 2011 and beyond with source \"nfl\".}\n\\item{time_of_day}{Time of day of play in UTC \"HH:MM:SS\" format. Available 2011 and beyond with source \"nfl\".}\n\\item{stadium}{Game site name.}\n\\item{weather}{String describing the weather including temperature, humidity and wind (direction and speed). Doesn't change during the game!}\n\\item{nfl_api_id}{UUID of the game in the new NFL API.}\n\\item{play_clock}{Time on the playclock when the ball was snapped.}\n\\item{play_deleted}{Binary indicator for deleted plays.}\n\\item{play_type_nfl}{Play type as listed in the NFL source. Slightly different to the regular play_type variable.}\n\\item{special_teams_play}{Binary indicator for whether play is special teams play from NFL source. Available 2011 and beyond with source \"nfl\".}\n\\item{st_play_type}{Type of special teams play from NFL source. 
Available 2011 and beyond with source \"nfl\".}\n\\item{end_clock_time}{Game time at the end of a given play.}\n\\item{end_yard_line}{String indicating the yardline at the end of the given play consisting of team half and yard line number.}\n\\item{drive_real_start_time}{Local day time when the drive started (currently not used by the NFL and therefore mostly 'NA').}\n\\item{drive_play_count}{Numeric value of how many regular plays happened in a given drive.}\n\\item{drive_time_of_possession}{Time of possession in a given drive.}\n\\item{drive_first_downs}{Number of first downs in a given drive.}\n\\item{drive_inside20}{Binary indicator if the offense was able to get inside the opponents 20 yard line.}\n\\item{drive_ended_with_score}{Binary indicator the drive ended with a score.}\n\\item{drive_quarter_start}{Numeric value indicating in which quarter the given drive has started.}\n\\item{drive_quarter_end}{Numeric value indicating in which quarter the given drive has ended.}\n\\item{drive_yards_penalized}{Numeric value of how many yards the offense gained or lost through penalties in the given drive.}\n\\item{drive_start_transition}{String indicating how the offense got the ball.}\n\\item{drive_end_transition}{String indicating how the offense lost the ball.}\n\\item{drive_game_clock_start}{Game time at the beginning of a given drive.}\n\\item{drive_game_clock_end}{Game time at the end of a given drive.}\n\\item{drive_start_yard_line}{String indicating where a given drive started consisting of team half and yard line number.}\n\\item{drive_end_yard_line}{String indicating where a given drive ended consisting of team half and yard line number.}\n\\item{drive_play_id_started}{Play_id of the first play in the given drive.}\n\\item{drive_play_id_ended}{Play_id of the last play in the given drive.}\n\\item{fixed_drive}{Manually created drive number in a game.}\n\\item{fixed_drive_result}{Manually created drive result.}\n\\item{away_score}{Total points scored by the away 
team.}\n\\item{home_score}{Total points scored by the home team.}\n\\item{location}{Either 'Home' or 'Neutral' indicating if the home team played at home or at a neutral site.}\n\\item{result}{Equals home_score - away_score and means the game outcome from the perspective of the home team.}\n\\item{total}{Equals home_score + away_score and means the total points scored in the given game.}\n\\item{spread_line}{The closing spread line for the game. A positive number means the home team was favored by that many points, a negative number means the away team was favored by that many points. (Source: Pro-Football-Reference)}\n\\item{total_line}{The closing total line for the game. (Source: Pro-Football-Reference)}\n\\item{div_game}{Binary indicator for if the given game was a division game.}\n\\item{roof}{One of 'dome', 'outdoors', 'closed', 'open' indicating the roof status of the stadium the game was played in. (Source: Pro-Football-Reference)}\n\\item{surface}{What type of ground the game was played on. (Source: Pro-Football-Reference)}\n\\item{temp}{The temperature at the stadium only for 'roof' = 'outdoors' or 'open'. (Source: Pro-Football-Reference)}\n\\item{wind}{The speed of the wind in miles/hour only for 'roof' = 'outdoors' or 'open'. (Source: Pro-Football-Reference)}\n\\item{home_coach}{First and last name of the home team coach. (Source: Pro-Football-Reference)}\n\\item{away_coach}{First and last name of the away team coach. (Source: Pro-Football-Reference)}\n\\item{stadium_id}{ID of the stadium the game was played in. (Source: Pro-Football-Reference)}\n\\item{game_stadium}{Name of the stadium the game was played in. (Source: Pro-Football-Reference)}\n}\n}\n\\description{\nLoad and parse NFL play-by-play data and add all of the original\nnflfastR variables. 
As nflfastR now provides multiple functions which add\ninformation to the output of this function, it is recommended to use\n\\code{\\link{build_nflfastR_pbp}} instead.\n}\n\\details{\nTo load valid game_ids please use the package function\n\\code{\\link{fast_scraper_schedules}} (the function can directly handle the\noutput of that function)\n}\n\\examples{\n\\donttest{\n# Get pbp data for two games\ntry({# to avoid CRAN test problems\nfast_scraper(c(\"2019_01_GB_CHI\", \"2013_21_SEA_DEN\"))\n})\n\n\n# It is also possible to directly use the\n# output of `fast_scraper_schedules` as input\ntry({# to avoid CRAN test problems\nlibrary(dplyr, warn.conflicts = FALSE)\nfast_scraper_schedules(2020) |>\n  slice_tail(n = 3) |>\n  fast_scraper()\n})\n\n\\dontshow{\n# Close open connections for R CMD Check\nfuture::plan(\"sequential\")\n}\n}\n}\n\\seealso{\nFor information on parallel processing and progress updates please\nsee \\link{nflfastR}.\n\n\\code{\\link[=build_nflfastR_pbp]{build_nflfastR_pbp()}}, \\code{\\link[=save_raw_pbp]{save_raw_pbp()}}\n}\n"
  },
  {
    "path": "man/fast_scraper_roster.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/top-level_scraper.R\n\\name{fast_scraper_roster}\n\\alias{fast_scraper_roster}\n\\title{Load Team Rosters for Multiple Seasons}\n\\usage{\nfast_scraper_roster(...)\n}\n\\arguments{\n\\item{...}{\n  Arguments passed on to \\code{\\link[nflreadr:load_rosters]{nflreadr::load_rosters}}\n  \\describe{\n    \\item{\\code{seasons}}{a numeric vector of seasons to return, defaults to returning\nthis year's data if it is March or later. If set to \\code{TRUE}, will return all available data.\nData available back to 1920.}\n    \\item{\\code{file_type}}{One of \\code{c(\"rds\", \"csv\", \"parquet\")}. Can also be set globally with\n\\code{options(nflreadr.prefer)}}\n  }}\n}\n\\value{\nA tibble of season-level roster data.\n}\n\\description{\n\\ifelse{html}{\\href{https://lifecycle.r-lib.org/articles/stages.html#deprecated}{\\figure{lifecycle-deprecated.svg}{options: alt='[Deprecated]'}}}{\\strong{[Deprecated]}}\n\nThis function was deprecated. Please use \\code{\\link[nflreadr:load_rosters]{nflreadr::load_rosters}}.\n}\n\\details{\nSee \\code{\\link[nflreadr:load_rosters]{nflreadr::load_rosters}} for details.\n}\n\\examples{\n\\donttest{\n# Roster of the 2019 and 2020 seasons\ntry({# to avoid CRAN test problems\n# fast_scraper_roster(2019:2020)\n})\n}\n}\n\\seealso{\nFor information on parallel processing and progress updates please\nsee \\link{nflfastR}.\n}\n\\keyword{internal}\n"
  },
  {
    "path": "man/fast_scraper_schedules.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/top-level_scraper.R\n\\name{fast_scraper_schedules}\n\\alias{fast_scraper_schedules}\n\\title{Load NFL Season Schedules}\n\\usage{\nfast_scraper_schedules(...)\n}\n\\arguments{\n\\item{...}{\n  Arguments passed on to \\code{\\link[nflreadr:load_schedules]{nflreadr::load_schedules}}\n  \\describe{\n    \\item{\\code{seasons}}{a numeric vector of seasons to return, default \\code{TRUE} returns all available data.}\n  }}\n}\n\\value{\nA tibble of game information for past and/or future games.\n}\n\\description{\n\\ifelse{html}{\\href{https://lifecycle.r-lib.org/articles/stages.html#deprecated}{\\figure{lifecycle-deprecated.svg}{options: alt='[Deprecated]'}}}{\\strong{[Deprecated]}}\n\nThis function was deprecated. Please use \\code{\\link[nflreadr:load_schedules]{nflreadr::load_schedules}}.\n}\n\\details{\nSee \\code{\\link[nflreadr:load_schedules]{nflreadr::load_schedules}} for details.\n}\n\\examples{\n\\donttest{\n# Get schedules for the whole 2015 - 2018 seasons\ntry({# to avoid CRAN test problems\n# fast_scraper_schedules(2015:2018)\n})\n}\n}\n\\seealso{\nFor information on parallel processing and progress updates please\nsee \\link{nflfastR}.\n}\n\\keyword{internal}\n"
  },
  {
    "path": "man/field_descriptions.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/data_documentation.R\n\\docType{data}\n\\name{field_descriptions}\n\\alias{field_descriptions}\n\\title{nflfastR Field Descriptions}\n\\format{\nA data frame including names and descriptions of all variables in\nan nflfastR dataset.\n}\n\\usage{\nfield_descriptions\n}\n\\description{\nnflfastR Field Descriptions\n}\n\\examples{\n\\donttest{\nfield_descriptions\n}\n}\n\\seealso{\nThe searchable table on the\n\\href{https://nflfastr.com/articles/field_descriptions.html}{nflfastR website}\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/missing_raw_pbp.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/save_raw_pbp.R\n\\name{missing_raw_pbp}\n\\alias{missing_raw_pbp}\n\\title{Compute Missing Raw PBP Data on Local Filesystem}\n\\usage{\nmissing_raw_pbp(\n  dir = getOption(\"nflfastR.raw_directory\", default = NULL),\n  seasons = TRUE,\n  verbose = TRUE\n)\n}\n\\arguments{\n\\item{dir}{Path to local directory (defaults to option \"nflfastR.raw_directory\").\nnflfastR will download the raw game files split by season into one sub\ndirectory per season.}\n\n\\item{seasons}{a numeric vector of seasons to return, default \\code{TRUE} returns all available data.}\n\n\\item{verbose}{If \\code{TRUE}, will print number of missing game files as well as\noldest and most recent missing ID to console.}\n}\n\\value{\nA character vector of missing game IDs. If no files are missing,\nreturns \\code{NULL} invisibly.\n}\n\\description{\nUses \\code{\\link[nflreadr:load_schedules]{nflreadr::load_schedules()}} to load game IDs of finished games and\ncompares these IDs to all files saved under \\code{dir}.\nThis function is intended to serve as input for \\code{\\link[=save_raw_pbp]{save_raw_pbp()}}.\n}\n\\examples{\n\\donttest{\ntry(\nmissing <- missing_raw_pbp(tempdir())\n)\n}\n}\n\\seealso{\n\\code{\\link[=save_raw_pbp]{save_raw_pbp()}}\n}\n"
  },
  {
    "path": "man/nfl_stats_variables.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/data_documentation.R\n\\docType{data}\n\\name{nfl_stats_variables}\n\\alias{nfl_stats_variables}\n\\title{NFL Stats Variables}\n\\format{\nA data frame explaining all variables returned by the function\n\\code{\\link[=calculate_stats]{calculate_stats()}}.\n}\n\\usage{\nnfl_stats_variables\n}\n\\description{\nNFL Stats Variables\n}\n\\examples{\n\\donttest{\nnfl_stats_variables\n}\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/nflfastR-package.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/nflfastR-package.R\n\\docType{package}\n\\name{nflfastR-package}\n\\alias{nflfastR}\n\\alias{nflfastR-package}\n\\title{nflfastR: Functions to Efficiently Access NFL Play by Play Data}\n\\description{\n\\if{html}{\\figure{logo.png}{options: style='float: right' alt='logo' width='120'}}\n\nA set of functions to access National Football League play-by-play data from \\url{https://www.nfl.com/}.\n}\n\\section{Parallel Processing and Progress Updates in nflfastR}{\n\\subsection{Preface}{\n\nPrior to nflfastR v4.0, parallel processing could be activated with an\nargument \\code{pp} in the relevant functions and progress updates were always\nshown. Both of these methods are bad practice and were therefore removed\nin nflfastR v4.0\n\nThe next sections describe how to make nflfastR work in parallel processes\nand show progress updates if the user wants to.\n}\n\n\\subsection{More Speed Using Parallel Processing}{\n\nNearly all nflfastR functions support parallel processing\nusing \\code{\\link[furrr:future_map]{furrr::future_map()}} if it is enabled by a call to \\code{\\link[future:plan]{future::plan()}}\nprior to the function call.\nPlease see the documentation of the functions for detailed information.\n\nAs an example, the following code block will resolve all function calls in the\ncurrent session using multiple sessions in the background and load play-by-play\ndata for the 2018 through 2020 seasons or build them freshly for the 2018 and\n2019 Super Bowls:\n\n\\if{html}{\\out{<div class=\"sourceCode\">}}\\preformatted{future::plan(\"multisession\")\nload_pbp(2018:2020)\nbuild_nflfastR_pbp(c(\"2018_21_NE_LA\", \"2019_21_SF_KC\"))\n}\\if{html}{\\out{</div>}}\n\nWe recommend choosing a default parallel processing method and saving it\nas an environment variable in the R user profile to make sure all futures\nwill be resolved with the chosen method by default.\nThis can be done 
by following the steps below.\n\nFirst, run the following line and the file \\code{.Renviron} should be opened automatically.\nIf you haven't saved any environment variables yet, this will be an empty file.\n\n\\if{html}{\\out{<div class=\"sourceCode\">}}\\preformatted{usethis::edit_r_environ()\n}\\if{html}{\\out{</div>}}\n\nIn the opened file \\code{.Renviron} add the next line, then save the file and restart your R session.\nPlease note that this example sets \"multisession\" as default. For most users\nthis should be the appropriate plan but please make sure it truly is.\n\n\\if{html}{\\out{<div class=\"sourceCode\">}}\\preformatted{R_FUTURE_PLAN=\"multisession\"\n}\\if{html}{\\out{</div>}}\n\nAfter the session is freshly restarted please check if the above method worked\nby running the next line. If the output is \\code{FALSE}, you successfully set up a\ndefault non-sequential \\code{\\link[future:plan]{future::plan()}}. If the output is \\code{TRUE}, all functions\nwill behave like they were called with \\code{\\link[purrr:map]{purrr::map()}} and NOT in multisession.\n\n\\if{html}{\\out{<div class=\"sourceCode\">}}\\preformatted{inherits(future::plan(), \"sequential\")\n}\\if{html}{\\out{</div>}}\n\nFor more information on possible plans please see\n\\href{https://github.com/futureverse/future/blob/develop/README.md}{the future package Readme}.\n\nFor more information on \\code{.Renviron} please see\n\\href{https://rstats.wtf/r-startup.html}{this book chapter}.\n}\n\n\\subsection{Get Progress Updates while Functions are Running}{\n\nMost nflfastR functions are able to show progress updates\nusing \\code{\\link[progressr:progressor]{progressr::progressor()}} if they are turned on before the function is\ncalled. 
There are at least two basic ways to do this: either by activating\nprogress updates globally (for the current session) with\n\n\\if{html}{\\out{<div class=\"sourceCode\">}}\\preformatted{progressr::handlers(global = TRUE)\n}\\if{html}{\\out{</div>}}\n\nor by piping the function call into \\code{\\link[progressr:with_progress]{progressr::with_progress()}}:\n\n\\if{html}{\\out{<div class=\"sourceCode\">}}\\preformatted{load_pbp(2018:2020) |>\n  progressr::with_progress()\n}\\if{html}{\\out{</div>}}\n\nJust like in the previous section, it is possible to activate global\nprogression handlers by default. This can be done by following the steps below.\n\nFirst, run the following line and the file \\code{.Rprofile} should be opened automatically.\nIf you haven't saved any code yet, this will be an empty file.\n\n\\if{html}{\\out{<div class=\"sourceCode\">}}\\preformatted{usethis::edit_r_profile()\n}\\if{html}{\\out{</div>}}\n\nIn the opened file \\code{.Rprofile} add the next line, then save the file and restart your R\nsession. All code in this file will be executed when a new R session starts.\nThe part \\verb{if (requireNamespace(\"progressr\", quietly = TRUE))} makes sure this will only run if the\npackage progressr is installed to avoid crashing R sessions.\n\n\\if{html}{\\out{<div class=\"sourceCode\">}}\\preformatted{if (requireNamespace(\"progressr\", quietly = TRUE)) progressr::handlers(global = TRUE)\n}\\if{html}{\\out{</div>}}\n\nAfter the session is freshly restarted please check if the above method worked\nby running the next line. 
If the output is \\code{TRUE} you successfully activated\nglobal progression handlers for all sessions.\n\n\\if{html}{\\out{<div class=\"sourceCode\">}}\\preformatted{progressr::handlers(global = NA)\n}\\if{html}{\\out{</div>}}\n\nFor more information on how to work with progress handlers please see \\link[progressr:progressr]{progressr::progressr}.\n\nFor more information on \\code{.Rprofile} please see\n\\href{https://rstats.wtf/r-startup.html}{this book chapter}.\n}\n}\n\n\\seealso{\nUseful links:\n\\itemize{\n  \\item \\url{https://nflfastr.com/}\n  \\item \\url{https://github.com/nflverse/nflfastR}\n  \\item Report bugs at \\url{https://github.com/nflverse/nflfastR/issues}\n}\n\n}\n\\author{\n\\strong{Maintainer}: Ben Baldwin \\email{bbaldwin206@gmail.com}\n\nAuthors:\n\\itemize{\n  \\item Sebastian Carl \\email{mrcaseb@gmail.com}\n}\n\nOther contributors:\n\\itemize{\n  \\item Lee Sharpe [contributor]\n  \\item Maksim Horowitz \\email{maksim.horowitz@gmail.com} [contributor]\n  \\item Ron Yurko \\email{ryurko@stat.cmu.edu} [contributor]\n  \\item Samuel Ventura \\email{samventura22@gmail.com} [contributor]\n  \\item Tan Ho [contributor]\n  \\item John Edwards \\email{edwards1860@gmail.com} [contributor]\n}\n\n}\n"
  },
  {
    "path": "man/reexports.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/nflfastR-package.R\n\\docType{import}\n\\name{reexports}\n\\alias{reexports}\n\\alias{load_pbp}\n\\alias{load_player_stats}\n\\alias{load_team_stats}\n\\alias{load_schedules}\n\\alias{load_rosters}\n\\alias{nflverse_sitrep}\n\\alias{most_recent_season}\n\\title{Objects exported from other packages}\n\\keyword{internal}\n\\description{\nThese objects are imported from other packages. Follow the links\nbelow to see their documentation.\n\n\\describe{\n  \\item{nflreadr}{\\code{\\link[nflreadr]{load_pbp}}, \\code{\\link[nflreadr]{load_player_stats}}, \\code{\\link[nflreadr]{load_rosters}}, \\code{\\link[nflreadr]{load_schedules}}, \\code{\\link[nflreadr]{load_team_stats}}, \\code{\\link[nflreadr:latest_season]{most_recent_season}}, \\code{\\link[nflreadr:sitrep]{nflverse_sitrep}}}\n}}\n\n"
  },
  {
    "path": "man/report.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/report.R\n\\name{report}\n\\alias{report}\n\\title{Get a Situation Report on System, nflverse Package Versions and Dependencies}\n\\usage{\nreport(...)\n}\n\\arguments{\n\\item{...}{\n  Arguments passed on to \\code{\\link[nflreadr:sitrep]{nflreadr::nflverse_sitrep}}\n  \\describe{\n    \\item{\\code{pkg}}{a character vector naming installed packages, or \\code{NULL}\n(the default) meaning all nflverse packages. The function checks internally\nif all packages are installed and informs if that is not the case.}\n    \\item{\\code{recursive}}{a logical indicating whether dependencies of \\code{pkg} and their\ndependencies (and so on) should be included.\nCan also be a character vector listing the types of dependencies, a subset\nof \\code{c(\"Depends\", \"Imports\", \"LinkingTo\", \"Suggests\", \"Enhances\")}.\nCharacter string \\code{\"all\"} is shorthand for that vector, character string\n\\code{\"most\"} for the same vector without \\code{\"Enhances\"}, character string \\code{\"strong\"}\n(default) for the first three elements of that vector.}\n    \\item{\\code{redact_path}}{a logical indicating whether options that contain \"path\"\nin the name should be redacted, default = TRUE}\n  }}\n}\n\\description{\n\\ifelse{html}{\\href{https://lifecycle.r-lib.org/articles/stages.html#deprecated}{\\figure{lifecycle-deprecated.svg}{options: alt='[Deprecated]'}}}{\\strong{[Deprecated]}}\n\nThis function was deprecated. Please use \\code{\\link[nflreadr:sitrep]{nflreadr::nflverse_sitrep}}.\n\nThis function gives a quick overview of the versions of R and\nthe operating system as well as the versions of nflverse packages, options,\nand their dependencies. 
It's primarily designed to help you get a quick\nidea of what's going on when you're helping someone else debug a problem.\n}\n\\details{\nSee \\code{\\link[nflreadr:sitrep]{nflreadr::nflverse_sitrep}} for details.\n}\n\\examples{\n\\donttest{\n\\dontshow{\n# set CRAN mirror to avoid failing checks in weird scenarios\nold_ops <- options(repos = c(\"CRAN\" = \"https://cran.rstudio.com/\"))\n}\n\n# report(recursive = FALSE)\nnflverse_sitrep(pkg = \"nflreadr\", recursive = TRUE)\n\n\\dontshow{\n# restore old options\noptions(old_ops)\n}\n}\n}\n\\keyword{internal}\n"
  },
  {
    "path": "man/save_raw_pbp.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/save_raw_pbp.R\n\\name{save_raw_pbp}\n\\alias{save_raw_pbp}\n\\title{Download Raw PBP Data to Local Filesystem}\n\\usage{\nsave_raw_pbp(\n  game_ids,\n  dir = getOption(\"nflfastR.raw_directory\", default = NULL)\n)\n}\n\\arguments{\n\\item{game_ids}{A vector of nflverse game IDs.}\n\n\\item{dir}{Path to local directory (defaults to option \"nflfastR.raw_directory\").\nnflfastR will download the raw game files split by season into one sub\ndirectory per season.}\n}\n\\value{\nThe function returns a data frame with one row for each downloaded file and\nthe following columns:\n\\itemize{\n\\item \\code{success} if the HTTP request was successfully performed, regardless of the\nresponse status code. This is \\code{FALSE} in case of a network error, or in case\nyou tried to resume from a server that did not support this. A value of \\code{NA}\nmeans the download was interrupted while in progress.\n\\item \\code{status_code} the HTTP status code from the request. A successful download is\nusually \\code{200} for full requests or \\code{206} for resumed requests. 
Anything else\ncould indicate that the downloaded file contains an error page instead of the\nrequested content.\n\\item \\code{resumefrom} the file size before the request, in case a download was resumed.\n\\item \\code{url} final url (after redirects) of the request.\n\\item \\code{destfile} downloaded file on disk.\n\\item \\code{error} if \\code{success == FALSE} this column contains an error message.\n\\item \\code{type} the \\code{Content-Type} response header value.\n\\item \\code{modified} the \\code{Last-Modified} response header value.\n\\item \\code{time} total elapsed download time for this file in seconds.\n\\item \\code{headers} vector with http response headers for the request.\n}\n}\n\\description{\nThe functions \\code{\\link[=build_nflfastR_pbp]{build_nflfastR_pbp()}} and \\code{\\link[=fast_scraper]{fast_scraper()}} support loading\nraw pbp data from local file systems instead of Github servers.\nThis function is intended to help setting this up. It loads raw pbp data\nand saves it in the given directory split by season in subdirectories.\n}\n\\examples{\n\\donttest{\n# CREATE LOCAL TEMP DIRECTORY\nlocal_dir <- tempdir()\n\n# LOAD AND SAVE A GAME TO TEMP DIRECTORY\nsave_raw_pbp(\"2021_20_BUF_KC\", dir = local_dir)\n\n# REMOVE THE DIRECTORY\nunlink(file.path(local_dir, 2021))\n}\n}\n\\seealso{\n\\code{\\link[=build_nflfastR_pbp]{build_nflfastR_pbp()}}, \\code{\\link[=missing_raw_pbp]{missing_raw_pbp()}}\n}\n"
  },
  {
    "path": "man/stat_ids.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/data_documentation.R\n\\docType{data}\n\\name{stat_ids}\n\\alias{stat_ids}\n\\title{NFL Stat IDs and their Meanings}\n\\format{\nA data frame including NFL stat IDs, names and descriptions used in\nan nflfastR dataset.\n}\n\\source{\n\\url{http://www.nflgsis.com/gsis/Documentation/Partners/StatIDs.html}\n}\n\\usage{\nstat_ids\n}\n\\description{\nNFL Stat IDs and their Meanings\n}\n\\examples{\n\\donttest{\nstat_ids\n}\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/teams_colors_logos.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/data_documentation.R\n\\docType{data}\n\\name{teams_colors_logos}\n\\alias{teams_colors_logos}\n\\title{NFL Team names, colors and logo urls.}\n\\format{\nA data frame with 36 rows and 10 variables containing NFL team level\ninformation, including franchises in multiple cities:\n\\describe{\n\\item{team_abbr}{Team abbreviation}\n\\item{team_name}{Complete Team name}\n\\item{team_id}{Team id used in the roster function}\n\\item{team_nick}{Nickname}\n\\item{team_conf}{Conference}\n\\item{team_division}{Division}\n\\item{team_color}{Primary color}\n\\item{team_color2}{Secondary color}\n\\item{team_color3}{Tertiary color}\n\\item{team_color4}{Quaternary color}\n\\item{team_logo_wikipedia}{Url to Team logo on wikipedia}\n\\item{team_logo_espn}{Url to higher quality logo on espn}\n\\item{team_wordmark}{Url to team wordmarks}\n\\item{team_conference_logo}{Url to AFC and NFC logos}\n\\item{team_league_logo}{Url to NFL logo}\n}\nThe primary and secondary colors have been taken from nfl.com with some modifications\nfor better team distinction and most recent team color themes.\nThe tertiary and quaternary colors are taken from Lee Sharpe's teamcolors.csv\nwho has taken them from the \\code{teamcolors} package created by Ben Baumer and\nGregory Matthews. The Wikipeadia logo urls are taken from Lee Sharpe's logos.csv\nTeam wordmarks from nfl.com\n}\n\\usage{\nteams_colors_logos\n}\n\\description{\nNFL Team names, colors and logo urls.\n}\n\\examples{\n\\donttest{\nteams_colors_logos\n}\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/update_db.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/helper_database_functions.R\n\\name{update_db}\n\\alias{update_db}\n\\title{Update or Create a nflfastR Play-by-Play Database}\n\\usage{\nupdate_db(\n  dbdir = getOption(\"nflfastR.dbdirectory\", default = \".\"),\n  dbname = \"pbp_db\",\n  tblname = \"nflfastR_pbp\",\n  force_rebuild = FALSE,\n  db_connection = NULL\n)\n}\n\\arguments{\n\\item{dbdir}{Directory in which the database is or shall be located. Can also\nbe set globally with \\code{options(nflfastR.dbdirectory)}}\n\n\\item{dbname}{File name of an existing or desired SQLite database within \\code{dbdir}}\n\n\\item{tblname}{The name of the play by play data table within the database}\n\n\\item{force_rebuild}{Hybrid parameter (logical or numeric) to rebuild parts\nof or the complete play by play data table within the database (please see details for further information)}\n\n\\item{db_connection}{A \\code{DBIConnection} object, as returned by\n\\code{\\link[DBI:dbConnect]{DBI::dbConnect()}} (please see details for further information)}\n}\n\\description{\n\\code{update_db} updates or creates a database with \\code{nflfastR}\nplay by play data of all completed games since 1999.\n}\n\\details{\nThis function creates and updates a data table with the name \\code{tblname}\nwithin a SQLite database (other drivers via \\code{db_connection}) located in\n\\code{dbdir} and named \\code{dbname}.\nThe data table combines all play by play data for every available game back\nto the 1999 season and adds the most recent completed games as soon as they\nare available for \\code{nflfastR}.\n\nThe argument \\code{force_rebuild} is of hybrid type. It can rebuild the play\nby play data table either for the whole nflfastR era (with \\code{force_rebuild = TRUE})\nor just for specified seasons (e.g. 
\\code{force_rebuild = c(2019, 2020)}).\nPlease note the following behavior:\n\\itemize{\n\\item \\code{force_rebuild = TRUE}: The data table with the name \\code{tblname}\nwill be removed completely and rebuilt from scratch. This is helpful when\nnew columns are added during the Off-Season.\n\\item \\code{force_rebuild = c(2019, 2020)}: The data table with the name \\code{tblname}\nwill be preserved and only rows from the 2019 and 2020 seasons will be\ndeleted and re-added. This is intended to be used for ongoing seasons because\nthe NFL fixes bugs in the underlying data during the week and we recommend\nrebuilding the current season every Thursday during the season.\n}\n\nThe parameter \\code{db_connection} is intended for advanced users who want\nto use other DBI drivers, such as MariaDB, Postgres or odbc. Please note that\nthe arguments \\code{dbdir} and \\code{dbname} are dropped in case a \\code{db_connection}\nis provided but the argument \\code{tblname} will still be used to write the\ndata table into the database.\n}\n"
  },
  {
    "path": "man/update_pbp_db.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/database.R\n\\name{update_pbp_db}\n\\alias{update_pbp_db}\n\\title{Update or Create a nflverse Play-by-Play Data Table in a Connected Database}\n\\usage{\nupdate_pbp_db(conn, ..., name = \"nflverse_pbp\", seasons = most_recent_season())\n}\n\\arguments{\n\\item{conn}{A \\code{DBIConnection} object, as returned by \\code{\\link[DBI:dbConnect]{DBI::dbConnect()}}}\n\n\\item{...}{These dots are for future extensions and must be empty.}\n\n\\item{name}{The table name, passed on to \\code{\\link[DBI:dbQuoteIdentifier]{dbQuoteIdentifier()}}. Options are:\n\\itemize{\n\\item a character string with the unquoted DBMS table name,\ne.g. \\code{\"table_name\"},\n\\item a call to \\code{\\link[DBI:Id]{Id()}} with components to the fully qualified table name,\ne.g. \\code{Id(schema = \"my_schema\", table = \"table_name\")}\n\\item a call to \\code{\\link[DBI:SQL]{SQL()}} with the quoted and fully qualified table name\ngiven verbatim, e.g. \\code{SQL('\"my_schema\".\"table_name\"')}\n}}\n\n\\item{seasons}{Hybrid argument (logical or numeric) to update parts\nof or the complete play by play table within the database.\n\nIt can update the play by play data table either for the whole nflfastR era\n(with \\code{seasons = TRUE}) or just for specified seasons\n(e.g. \\code{seasons = 2024:2025}).\n\nDefaults to \\link{most_recent_season}. Please see details for further information.}\n}\n\\value{\nAlways returns the database connection invisibly.\n}\n\\description{\nThe nflfastR play-by-play era dates back to 1999. 
To analyze all the data\nefficiently, there is practically no alternative to working with a database.\n\nThis function helps to create and maintain a table containing all\nplay-by-play data of the nflfastR era in a connected database.\nPrimarily, the preprocessed data from \\link{load_pbp} is written to the database\nand, if necessary, supplemented with the latest games using\n\\link{build_nflfastR_pbp}.\n}\n\\details{\n\\subsection{The \\code{seasons} argument}{\n\nThe \\code{seasons} argument controls how the table in the connected database is\nhandled.\n\nWith \\code{seasons = TRUE}, the table in argument \\code{name} will be removed completely\n(by calling \\link[DBI:dbRemoveTable]{DBI::dbRemoveTable}) and all seasons of the nflfastR era will be\nadded to a fresh table. This is helpful when new columns are added during the\noffseason.\n\nWith a numerical vector, e.g. \\code{seasons = 2024:2025}, the table in argument\n\\code{name} will be preserved and only rows from the given seasons will be deleted\nand re-added (by calling \\link[DBI:dbAppendTable]{DBI::dbAppendTable}). This is intended to be used\nfor ongoing seasons because the NFL fixes bugs in the underlying data during\nthe week and we recommend rebuilding the current season every Thursday during\nthe season.\n\nThe default behavior is \\code{seasons = most_recent_season()}, which means that\nonly the most recent season is updated or added.\n\nTo keep the table, and thus also the schema, but update all play-by-play\ndata of the nflfastR era, set\n\n\\if{html}{\\out{<div class=\"sourceCode\">}}\\preformatted{seasons = seq(1999, most_recent_season())\n}\\if{html}{\\out{</div>}}\n\nIf \\code{seasons} contains multiple seasons, it is possible to control whether the\nseasons are loaded individually and written to the database, or whether\nmultiple seasons should be processed in chunks. 
The latter is more efficient\nbecause fewer write operations are required, but at the same time, the data\nmust first be stored in memory. The option \\verb{“nflfastR.db_chunk_size”} can\nbe used to control how many seasons are loaded together in a chunk and\nwritten to the database. With the following option, for example, 5 seasons\nare always loaded together and written to the database.\n\n\\if{html}{\\out{<div class=\"sourceCode\">}}\\preformatted{options(\"nflfastR.db_chunk_size\" = 5L)\n}\\if{html}{\\out{</div>}}\n}\n}\n\\examples{\n\\donttest{\ncon <- DBI::dbConnect(duckdb::duckdb())\ntry({# to avoid CRAN test problems\nupdate_pbp_db(con, seasons = 2024)\n})\n}\n}\n"
  },
  {
    "path": "nflfastR.Rproj",
    "content": "Version: 1.0\nProjectId: e1e14382-386c-49b3-9b3f-206a4cc98503\n\nRestoreWorkspace: Default\nSaveWorkspace: Default\nAlwaysSaveHistory: Default\n\nEnableCodeIndexing: Yes\nUseSpacesForTab: Yes\nNumSpacesForTab: 2\nEncoding: UTF-8\n\nRnwWeave: Sweave\nLaTeX: pdfLaTeX\n\nAutoAppendNewline: Yes\nStripTrailingWhitespace: Yes\n\nBuildType: Package\nPackageUseDevtools: Yes\nPackageInstallArgs: --no-multiarch --with-keep.source\nPackageRoxygenize: rd,collate,namespace\n\nUseNativePipeOperator: Yes\n"
  },
  {
    "path": "pkgdown/_pkgdown.yml",
    "content": "url: https://nflfastr.com/\n\ntemplate:\n  bootstrap: 5\n  light-switch: true\n  bslib:\n    font_scale: 1.1\n    base_font: {google: \"Roboto\"}\n    heading_font: {google: \"Kanit\"}\n    code_font: {google: \"Fira Code\"}\n  opengraph:\n    image:\n      src: man/figures/card.png\n      alt: \"nflfastR social preview card\"\n    twitter:\n      site: \"@nflfastR\"\n      card: summary_large_image\n\ntoc:\n  depth: 3\n\nauthors:\n  Sebastian Carl:\n    href: https://mrcaseb.com\n  Ben Baldwin:\n    href: https://bsky.app/profile/rbsdm.com\n  Lee Sharpe:\n    href: https://twitter.com/LeeSharpeNFL\n  Maksim Horowitz:\n    href: https://twitter.com/bklynmaks\n  Ron Yurko:\n    href: https://twitter.com/Stat_Ron\n  Samuel Ventura:\n    href: https://twitter.com/stat_sam\n  Tan Ho:\n    href: https://tanho.ca\n  John Edwards:\n    href: https://johnbedwards.io\nhome:\n  title: An R package to quickly obtain clean and tidy NFL play by play data\n  links:\n  - text: nflverse Discord Chat\n    href: https://discord.gg/5Er2FBnnQa\n  - text: nflfastR Beginner's Guide\n    href: articles/beginners_guide.html\n  - text: nflfastR stats landing page\n    href: https://rbsdm.com/stats/\n  - text: Lee Sharpe's nfl game data\n    href: https://nflgamedata.com\n\nnavbar:\n  bg: dark\n  type: light\n  structure:\n    left:  [home, intro, reference, news, articles]\n    right: [search, lightswitch, stats, games, discord, github, more]\n  components:\n    games:\n      icon: \"fas fa-football-ball fa-lg\"\n      href: http://nflgamedata.com/\n      aria-label: Games\n    stats:\n      icon: \"fas fa-chart-line fa-lg\"\n      href: https://rbsdm.com/stats/\n      aria-label: Stats\n    reference:\n      text: \"Functions\"\n      href: reference/index.html\n    discord:\n      icon: \"fab fa-discord fa-lg\"\n      href: https://discord.com/invite/5Er2FBnnQa\n      aria-label: Discord\n    articles:\n      text: \"Articles\"\n      menu:\n      - text: A beginner’s 
guide to nflfastR\n        href: articles/beginners_guide.html\n      - text: Field Descriptions\n        href: articles/field_descriptions.html\n      - text: Stats Variable Descriptions\n        href: articles/stats_variables.html\n      - text: nflfastR models\n        href: https://www.opensourcefootball.com/posts/2020-09-28-nflfastr-ep-wp-and-cp-models/\n      - text: Open Source Football\n        href: https://www.opensourcefootball.com/\n    more:\n      text: \"Packages & More\"\n      menu:\n        - text: \"nflverse Packages\"\n        - text: nflfastR\n          href: https://nflfastr.com\n        - text: nflseedR\n          href: https://nflseedr.com\n        - text: nfl4th\n          href: https://www.nfl4th.com\n        - text: nflreadr\n          href: https://nflreadr.nflverse.com/\n        - text: nflplotR\n          href: https://nflplotr.nflverse.com/\n        - text: nflverse\n          href: https://nflverse.nflverse.com/\n        - text: \"Open Source Football\"\n          href: https://www.opensourcefootball.com\n        - text: \"nflverse Data\"\n        - text: nflverse GitHub\n          href: https://github.com/nflverse\n        - text: ffverse\n        - text: \"ffverse.com\"\n          href: https://www.ffverse.com\nreference:\n  - title: Main Functions\n    contents:\n      - build_nflfastR_pbp\n      - update_db\n      - update_pbp_db\n  - title: Load Functions\n    desc: >\n      These functions access precomputed data using the nflreadr package.\n      See <https://nflreadr.nflverse.com> for info and more data load functions.\n    contents:\n      - reexports\n  - title: Utility Functions\n    contents:\n      - save_raw_pbp\n      - missing_raw_pbp\n      - starts_with(\"calculate_\")\n  - title: Documentation\n    contents:\n      - nflfastR-package\n      - teams_colors_logos\n      - field_descriptions\n      - stat_ids\n      - nfl_stats_variables\n  - title: Lower Level Functions\n    desc: >\n      These functions are wrapped 
in the above listed main functions and\n      typically not used by the enduser.\n    contents:\n      - fast_scraper\n      - add_qb_epa\n      - add_xpass\n      - add_xyac\n      - clean_pbp\n      - decode_player_ids\n  - title: Deprecated\n    desc: 'These functions are no longer recommended for use, see nflreadr for latest versions.'\n    contents:\n      - fast_scraper_roster\n      - fast_scraper_schedules\n      - report\n"
  },
  {
    "path": "pkgdown/extra.css",
    "content": "/*\nCheck: https://www.w3schools.com/css/css_rwd_mediaqueries.asp\nfor Responsive Web Design - Media Queries\n*/\n.row > main {\n  max-width: 100%;\n}\n\n@media only screen and (min-width: 640px) {\n  main + .col-md-3 {\n    margin-left: unset;\n    padding-left: 5rem;\n    max-width: 75%;\n  }\n}\n\nh4.author,h4.date {\n  padding-top:0px;\n  margin-top:0px;\n}\n\n.navbar-brand {\n  font-weight: 300;\n  font-size: 1.5rem;\n  font-family: 'Kanit', sans-serif;\n}\n\n.me-auto {\n    color: #009E8D !important;\n}\n\n/*\nfrom gt custom css\ndraws lines between function names on reference page\n*/\n\ndt {\n  text-decoration: underline;\n  text-decoration-style: solid;\n  text-underline-offset: 4px;\n  font-family: monospace;\n  border-top-style: dotted;\n  border-top-width: 1px;\n  border-top-color: gray;\n  margin-bottom: 5px;\n  padding-top: 5px;\n}\n\n.active .nav-link {\n  color: #F85714 !important;\n}\n"
  },
  {
    "path": "tests/testthat/_snaps/build_nflfastR_pbp.md",
    "content": "# default_play is synced with build_nflfastR_pbp\n\n    {\n      \"type\": \"character\",\n      \"attributes\": {\n        \"names\": {\n          \"type\": \"character\",\n          \"attributes\": {},\n          \"value\": [\"play_id\", \"game_id\", \"old_game_id\", \"home_team\", \"away_team\", \"season_type\", \"week\", \"posteam\", \"posteam_type\", \"defteam\", \"side_of_field\", \"yardline_100\", \"game_date\", \"quarter_seconds_remaining\", \"half_seconds_remaining\", \"game_seconds_remaining\", \"game_half\", \"quarter_end\", \"drive\", \"sp\", \"qtr\", \"down\", \"goal_to_go\", \"time\", \"yrdln\", \"ydstogo\", \"ydsnet\", \"desc\", \"play_type\", \"yards_gained\", \"shotgun\", \"no_huddle\", \"qb_dropback\", \"qb_kneel\", \"qb_spike\", \"qb_scramble\", \"pass_length\", \"pass_location\", \"air_yards\", \"yards_after_catch\", \"run_location\", \"run_gap\", \"field_goal_result\", \"kick_distance\", \"extra_point_result\", \"two_point_conv_result\", \"home_timeouts_remaining\", \"away_timeouts_remaining\", \"timeout\", \"timeout_team\", \"td_team\", \"td_player_name\", \"td_player_id\", \"posteam_timeouts_remaining\", \"defteam_timeouts_remaining\", \"total_home_score\", \"total_away_score\", \"posteam_score\", \"defteam_score\", \"score_differential\", \"posteam_score_post\", \"defteam_score_post\", \"score_differential_post\", \"no_score_prob\", \"opp_fg_prob\", \"opp_safety_prob\", \"opp_td_prob\", \"fg_prob\", \"safety_prob\", \"td_prob\", \"extra_point_prob\", \"two_point_conversion_prob\", \"ep\", \"epa\", \"total_home_epa\", \"total_away_epa\", \"total_home_rush_epa\", \"total_away_rush_epa\", \"total_home_pass_epa\", \"total_away_pass_epa\", \"air_epa\", \"yac_epa\", \"comp_air_epa\", \"comp_yac_epa\", \"total_home_comp_air_epa\", \"total_away_comp_air_epa\", \"total_home_comp_yac_epa\", \"total_away_comp_yac_epa\", \"total_home_raw_air_epa\", \"total_away_raw_air_epa\", \"total_home_raw_yac_epa\", \"total_away_raw_yac_epa\", 
\"wp\", \"def_wp\", \"home_wp\", \"away_wp\", \"wpa\", \"vegas_wpa\", \"vegas_home_wpa\", \"home_wp_post\", \"away_wp_post\", \"vegas_wp\", \"vegas_home_wp\", \"total_home_rush_wpa\", \"total_away_rush_wpa\", \"total_home_pass_wpa\", \"total_away_pass_wpa\", \"air_wpa\", \"yac_wpa\", \"comp_air_wpa\", \"comp_yac_wpa\", \"total_home_comp_air_wpa\", \"total_away_comp_air_wpa\", \"total_home_comp_yac_wpa\", \"total_away_comp_yac_wpa\", \"total_home_raw_air_wpa\", \"total_away_raw_air_wpa\", \"total_home_raw_yac_wpa\", \"total_away_raw_yac_wpa\", \"punt_blocked\", \"first_down_rush\", \"first_down_pass\", \"first_down_penalty\", \"third_down_converted\", \"third_down_failed\", \"fourth_down_converted\", \"fourth_down_failed\", \"incomplete_pass\", \"touchback\", \"interception\", \"punt_inside_twenty\", \"punt_in_endzone\", \"punt_out_of_bounds\", \"punt_downed\", \"punt_fair_catch\", \"kickoff_inside_twenty\", \"kickoff_in_endzone\", \"kickoff_out_of_bounds\", \"kickoff_downed\", \"kickoff_fair_catch\", \"fumble_forced\", \"fumble_not_forced\", \"fumble_out_of_bounds\", \"solo_tackle\", \"safety\", \"penalty\", \"tackled_for_loss\", \"fumble_lost\", \"own_kickoff_recovery\", \"own_kickoff_recovery_td\", \"qb_hit\", \"rush_attempt\", \"pass_attempt\", \"sack\", \"touchdown\", \"pass_touchdown\", \"rush_touchdown\", \"return_touchdown\", \"extra_point_attempt\", \"two_point_attempt\", \"field_goal_attempt\", \"kickoff_attempt\", \"punt_attempt\", \"fumble\", \"complete_pass\", \"assist_tackle\", \"lateral_reception\", \"lateral_rush\", \"lateral_return\", \"lateral_recovery\", \"passer_player_id\", \"passer_player_name\", \"passing_yards\", \"receiver_player_id\", \"receiver_player_name\", \"receiving_yards\", \"rusher_player_id\", \"rusher_player_name\", \"rushing_yards\", \"lateral_receiver_player_id\", \"lateral_receiver_player_name\", \"lateral_receiving_yards\", \"lateral_rusher_player_id\", \"lateral_rusher_player_name\", \"lateral_rushing_yards\", 
\"lateral_sack_player_id\", \"lateral_sack_player_name\", \"interception_player_id\", \"interception_player_name\", \"lateral_interception_player_id\", \"lateral_interception_player_name\", \"punt_returner_player_id\", \"punt_returner_player_name\", \"lateral_punt_returner_player_id\", \"lateral_punt_returner_player_name\", \"kickoff_returner_player_name\", \"kickoff_returner_player_id\", \"lateral_kickoff_returner_player_id\", \"lateral_kickoff_returner_player_name\", \"punter_player_id\", \"punter_player_name\", \"kicker_player_name\", \"kicker_player_id\", \"own_kickoff_recovery_player_id\", \"own_kickoff_recovery_player_name\", \"blocked_player_id\", \"blocked_player_name\", \"tackle_for_loss_1_player_id\", \"tackle_for_loss_1_player_name\", \"tackle_for_loss_2_player_id\", \"tackle_for_loss_2_player_name\", \"qb_hit_1_player_id\", \"qb_hit_1_player_name\", \"qb_hit_2_player_id\", \"qb_hit_2_player_name\", \"forced_fumble_player_1_team\", \"forced_fumble_player_1_player_id\", \"forced_fumble_player_1_player_name\", \"forced_fumble_player_2_team\", \"forced_fumble_player_2_player_id\", \"forced_fumble_player_2_player_name\", \"solo_tackle_1_team\", \"solo_tackle_2_team\", \"solo_tackle_1_player_id\", \"solo_tackle_2_player_id\", \"solo_tackle_1_player_name\", \"solo_tackle_2_player_name\", \"assist_tackle_1_player_id\", \"assist_tackle_1_player_name\", \"assist_tackle_1_team\", \"assist_tackle_2_player_id\", \"assist_tackle_2_player_name\", \"assist_tackle_2_team\", \"assist_tackle_3_player_id\", \"assist_tackle_3_player_name\", \"assist_tackle_3_team\", \"assist_tackle_4_player_id\", \"assist_tackle_4_player_name\", \"assist_tackle_4_team\", \"tackle_with_assist\", \"tackle_with_assist_1_player_id\", \"tackle_with_assist_1_player_name\", \"tackle_with_assist_1_team\", \"tackle_with_assist_2_player_id\", \"tackle_with_assist_2_player_name\", \"tackle_with_assist_2_team\", \"pass_defense_1_player_id\", \"pass_defense_1_player_name\", \"pass_defense_2_player_id\", 
\"pass_defense_2_player_name\", \"fumbled_1_team\", \"fumbled_1_player_id\", \"fumbled_1_player_name\", \"fumbled_2_player_id\", \"fumbled_2_player_name\", \"fumbled_2_team\", \"fumble_recovery_1_team\", \"fumble_recovery_1_yards\", \"fumble_recovery_1_player_id\", \"fumble_recovery_1_player_name\", \"fumble_recovery_2_team\", \"fumble_recovery_2_yards\", \"fumble_recovery_2_player_id\", \"fumble_recovery_2_player_name\", \"sack_player_id\", \"sack_player_name\", \"half_sack_1_player_id\", \"half_sack_1_player_name\", \"half_sack_2_player_id\", \"half_sack_2_player_name\", \"return_team\", \"return_yards\", \"penalty_team\", \"penalty_player_id\", \"penalty_player_name\", \"penalty_yards\", \"replay_or_challenge\", \"replay_or_challenge_result\", \"penalty_type\", \"defensive_two_point_attempt\", \"defensive_two_point_conv\", \"defensive_extra_point_attempt\", \"defensive_extra_point_conv\", \"safety_player_name\", \"safety_player_id\", \"season\", \"cp\", \"cpoe\", \"series\", \"series_success\", \"series_result\", \"order_sequence\", \"start_time\", \"time_of_day\", \"stadium\", \"weather\", \"nfl_api_id\", \"play_clock\", \"play_deleted\", \"play_type_nfl\", \"special_teams_play\", \"st_play_type\", \"end_clock_time\", \"end_yard_line\", \"fixed_drive\", \"fixed_drive_result\", \"drive_real_start_time\", \"drive_play_count\", \"drive_time_of_possession\", \"drive_first_downs\", \"drive_inside20\", \"drive_ended_with_score\", \"drive_quarter_start\", \"drive_quarter_end\", \"drive_yards_penalized\", \"drive_start_transition\", \"drive_end_transition\", \"drive_game_clock_start\", \"drive_game_clock_end\", \"drive_start_yard_line\", \"drive_end_yard_line\", \"drive_play_id_started\", \"drive_play_id_ended\", \"away_score\", \"home_score\", \"location\", \"result\", \"total\", \"spread_line\", \"total_line\", \"div_game\", \"roof\", \"surface\", \"temp\", \"wind\", \"home_coach\", \"away_coach\", \"stadium_id\", \"game_stadium\", \"aborted_play\", \"success\", 
\"passer\", \"passer_jersey_number\", \"rusher\", \"rusher_jersey_number\", \"receiver\", \"receiver_jersey_number\", \"pass\", \"rush\", \"first_down\", \"special\", \"play\", \"passer_id\", \"rusher_id\", \"receiver_id\", \"name\", \"jersey_number\", \"id\", \"fantasy_player_name\", \"fantasy_player_id\", \"fantasy\", \"fantasy_id\", \"out_of_bounds\", \"home_opening_kickoff\", \"qb_epa\", \"xyac_epa\", \"xyac_mean_yardage\", \"xyac_median_yardage\", \"xyac_success\", \"xyac_fd\", \"xpass\", \"pass_oe\"]\n        }\n      },\n      \"value\": [\"numeric\", \"character\", \"character\", \"character\", \"character\", \"character\", \"integer\", \"character\", \"character\", \"character\", \"character\", \"numeric\", \"character\", \"numeric\", \"numeric\", \"numeric\", \"character\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"character\", \"character\", \"numeric\", \"numeric\", \"character\", \"character\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"character\", \"character\", \"numeric\", \"numeric\", \"character\", \"character\", \"character\", \"numeric\", \"character\", \"character\", \"numeric\", \"numeric\", \"numeric\", \"character\", \"character\", \"character\", \"character\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", 
\"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"character\", \"character\", \"numeric\", \"character\", \"character\", \"numeric\", \"character\", \"character\", \"numeric\", \"character\", \"character\", \"numeric\", \"character\", \"character\", \"numeric\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"numeric\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", 
\"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"numeric\", \"character\", \"character\", \"character\", \"numeric\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"numeric\", \"character\", \"character\", \"character\", \"numeric\", \"numeric\", \"character\", \"character\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"character\", \"character\", \"integer\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"character\", \"numeric\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"numeric\", \"character\", \"numeric\", \"character\", \"character\", \"character\", \"numeric\", \"character\", \"character\", \"numeric\", \"character\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"numeric\", \"numeric\", \"integer\", \"integer\", \"character\", \"integer\", \"integer\", \"numeric\", \"numeric\", \"integer\", \"character\", \"character\", \"integer\", \"integer\", \"character\", \"character\", \"character\", \"character\", \"numeric\", \"numeric\", \"character\", \"integer\", \"character\", \"integer\", \"character\", \"integer\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"character\", \"character\", \"character\", \"character\", \"integer\", \"character\", \"character\", \"character\", \"character\", \"character\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"numeric\"]\n    }\n\n"
  },
  {
    "path": "tests/testthat/_snaps/stats/calculate_stats.md",
    "content": "# calculate_stats works\n\n    {\n      \"type\": \"character\",\n      \"attributes\": {\n        \"names\": {\n          \"type\": \"character\",\n          \"attributes\": {},\n          \"value\": [\"player_id\", \"player_name\", \"player_display_name\", \"position\", \"position_group\", \"headshot_url\", \"season\", \"season_type\", \"recent_team\", \"games\", \"completions\", \"attempts\", \"passing_yards\", \"passing_tds\", \"passing_interceptions\", \"sacks_suffered\", \"sack_yards_lost\", \"sack_fumbles\", \"sack_fumbles_lost\", \"passing_air_yards\", \"passing_yards_after_catch\", \"passing_first_downs\", \"passing_epa\", \"passing_cpoe\", \"passing_2pt_conversions\", \"pacr\", \"passing_10\", \"passing_16\", \"passing_20\", \"passing_40\", \"carries\", \"rushing_yards\", \"rushing_tds\", \"rushing_fumbles\", \"rushing_fumbles_lost\", \"rushing_first_downs\", \"rushing_epa\", \"rushing_2pt_conversions\", \"rushing_10\", \"rushing_12\", \"rushing_20\", \"rushing_40\", \"receptions\", \"targets\", \"receiving_yards\", \"receiving_tds\", \"receiving_fumbles\", \"receiving_fumbles_lost\", \"receiving_air_yards\", \"receiving_yards_after_catch\", \"receiving_first_downs\", \"receiving_epa\", \"receiving_2pt_conversions\", \"receiving_10\", \"receiving_16\", \"receiving_20\", \"receiving_40\", \"racr\", \"target_share\", \"air_yards_share\", \"wopr\", \"special_teams_tds\", \"def_tackles_solo\", \"def_tackles_with_assist\", \"def_tackle_assists\", \"def_tackles_for_loss\", \"def_tackles_for_loss_yards\", \"def_fumbles_forced\", \"def_sacks\", \"def_sack_yards\", \"def_qb_hits\", \"def_interceptions\", \"def_interception_yards\", \"def_pass_defended\", \"def_tds\", \"def_fumbles\", \"def_safeties\", \"misc_yards\", \"fumble_recovery_own\", \"fumble_recovery_yards_own\", \"fumble_recovery_opp\", \"fumble_recovery_yards_opp\", \"fumble_recovery_tds\", \"penalties\", \"penalty_yards\", \"fumbles_forced_by_opp\", \"fumbles_not_forced\", 
\"fumbles_out_of_bounds\", \"fumbles_total\", \"fumbles_lost_total\", \"punt_returns\", \"punt_return_yards\", \"kickoff_returns\", \"kickoff_return_yards\", \"fg_made\", \"fg_att\", \"fg_missed\", \"fg_blocked\", \"fg_long\", \"fg_pct\", \"fg_made_0_19\", \"fg_made_20_29\", \"fg_made_30_39\", \"fg_made_40_49\", \"fg_made_50_59\", \"fg_made_60_\", \"fg_missed_0_19\", \"fg_missed_20_29\", \"fg_missed_30_39\", \"fg_missed_40_49\", \"fg_missed_50_59\", \"fg_missed_60_\", \"fg_made_list\", \"fg_missed_list\", \"fg_blocked_list\", \"fg_made_distance\", \"fg_missed_distance\", \"fg_blocked_distance\", \"pat_made\", \"pat_att\", \"pat_missed\", \"pat_blocked\", \"pat_pct\", \"gwfg_made\", \"gwfg_att\", \"gwfg_missed\", \"gwfg_blocked\", \"gwfg_distance_list\", \"pt_att\", \"pt_blocked\", \"pt_long\", \"pt_yards\", \"pt_inside_20\", \"pt_out_of_bounds\", \"pt_downed\", \"pt_touchback\", \"pt_fair_caught\", \"pt_returned\", \"pt_return_yards\", \"pt_return_tds\", \"pt_net_yards\", \"fantasy_points\", \"fantasy_points_ppr\"]\n        }\n      },\n      \"value\": [\"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"integer\", \"character\", \"character\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"numeric\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"numeric\", 
\"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"character\", \"character\", \"character\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"character\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"numeric\"]\n    }\n\n---\n\n    {\n      \"type\": \"character\",\n      \"attributes\": {\n        \"names\": {\n          \"type\": \"character\",\n          \"attributes\": {},\n          \"value\": [\"player_id\", \"player_name\", \"player_display_name\", \"position\", \"position_group\", \"headshot_url\", \"season\", \"week\", \"season_type\", \"game_id\", \"team\", \"opponent_team\", \"completions\", \"attempts\", \"passing_yards\", \"passing_tds\", \"passing_interceptions\", \"sacks_suffered\", \"sack_yards_lost\", \"sack_fumbles\", \"sack_fumbles_lost\", \"passing_air_yards\", \"passing_yards_after_catch\", \"passing_first_downs\", \"passing_epa\", \"passing_cpoe\", \"passing_2pt_conversions\", \"pacr\", \"passing_10\", \"passing_16\", \"passing_20\", \"passing_40\", \"carries\", \"rushing_yards\", \"rushing_tds\", \"rushing_fumbles\", \"rushing_fumbles_lost\", \"rushing_first_downs\", \"rushing_epa\", \"rushing_2pt_conversions\", \"rushing_10\", \"rushing_12\", \"rushing_20\", \"rushing_40\", \"receptions\", \"targets\", 
\"receiving_yards\", \"receiving_tds\", \"receiving_fumbles\", \"receiving_fumbles_lost\", \"receiving_air_yards\", \"receiving_yards_after_catch\", \"receiving_first_downs\", \"receiving_epa\", \"receiving_2pt_conversions\", \"receiving_10\", \"receiving_16\", \"receiving_20\", \"receiving_40\", \"racr\", \"target_share\", \"air_yards_share\", \"wopr\", \"special_teams_tds\", \"def_tackles_solo\", \"def_tackles_with_assist\", \"def_tackle_assists\", \"def_tackles_for_loss\", \"def_tackles_for_loss_yards\", \"def_fumbles_forced\", \"def_sacks\", \"def_sack_yards\", \"def_qb_hits\", \"def_interceptions\", \"def_interception_yards\", \"def_pass_defended\", \"def_tds\", \"def_fumbles\", \"def_safeties\", \"misc_yards\", \"fumble_recovery_own\", \"fumble_recovery_yards_own\", \"fumble_recovery_opp\", \"fumble_recovery_yards_opp\", \"fumble_recovery_tds\", \"penalties\", \"penalty_yards\", \"fumbles_forced_by_opp\", \"fumbles_not_forced\", \"fumbles_out_of_bounds\", \"fumbles_total\", \"fumbles_lost_total\", \"punt_returns\", \"punt_return_yards\", \"kickoff_returns\", \"kickoff_return_yards\", \"fg_made\", \"fg_att\", \"fg_missed\", \"fg_blocked\", \"fg_long\", \"fg_pct\", \"fg_made_0_19\", \"fg_made_20_29\", \"fg_made_30_39\", \"fg_made_40_49\", \"fg_made_50_59\", \"fg_made_60_\", \"fg_missed_0_19\", \"fg_missed_20_29\", \"fg_missed_30_39\", \"fg_missed_40_49\", \"fg_missed_50_59\", \"fg_missed_60_\", \"fg_made_list\", \"fg_missed_list\", \"fg_blocked_list\", \"fg_made_distance\", \"fg_missed_distance\", \"fg_blocked_distance\", \"pat_made\", \"pat_att\", \"pat_missed\", \"pat_blocked\", \"pat_pct\", \"gwfg_made\", \"gwfg_att\", \"gwfg_missed\", \"gwfg_blocked\", \"gwfg_distance\", \"pt_att\", \"pt_blocked\", \"pt_long\", \"pt_yards\", \"pt_inside_20\", \"pt_out_of_bounds\", \"pt_downed\", \"pt_touchback\", \"pt_fair_caught\", \"pt_returned\", \"pt_return_yards\", \"pt_return_tds\", \"pt_net_yards\", \"fantasy_points\", \"fantasy_points_ppr\"]\n        }\n      },\n   
   \"value\": [\"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"integer\", \"integer\", \"character\", \"character\", \"character\", \"character\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"numeric\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"character\", \"character\", \"character\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"numeric\"]\n    }\n\n---\n\n    {\n      \"type\": \"character\",\n      
\"attributes\": {\n        \"names\": {\n          \"type\": \"character\",\n          \"attributes\": {},\n          \"value\": [\"season\", \"team\", \"season_type\", \"games\", \"completions\", \"attempts\", \"passing_yards\", \"passing_tds\", \"passing_interceptions\", \"sacks_suffered\", \"sack_yards_lost\", \"sack_fumbles\", \"sack_fumbles_lost\", \"passing_air_yards\", \"passing_yards_after_catch\", \"passing_first_downs\", \"passing_epa\", \"passing_cpoe\", \"passing_2pt_conversions\", \"passing_10\", \"passing_16\", \"passing_20\", \"passing_40\", \"carries\", \"rushing_yards\", \"rushing_tds\", \"rushing_fumbles\", \"rushing_fumbles_lost\", \"rushing_first_downs\", \"rushing_epa\", \"rushing_2pt_conversions\", \"rushing_10\", \"rushing_12\", \"rushing_20\", \"rushing_40\", \"receptions\", \"targets\", \"receiving_yards\", \"receiving_tds\", \"receiving_fumbles\", \"receiving_fumbles_lost\", \"receiving_air_yards\", \"receiving_yards_after_catch\", \"receiving_first_downs\", \"receiving_epa\", \"receiving_2pt_conversions\", \"receiving_10\", \"receiving_16\", \"receiving_20\", \"receiving_40\", \"special_teams_tds\", \"def_tackles_solo\", \"def_tackles_with_assist\", \"def_tackle_assists\", \"def_tackles_for_loss\", \"def_tackles_for_loss_yards\", \"def_fumbles_forced\", \"def_sacks\", \"def_sack_yards\", \"def_qb_hits\", \"def_interceptions\", \"def_interception_yards\", \"def_pass_defended\", \"def_tds\", \"def_fumbles\", \"def_safeties\", \"misc_yards\", \"fumble_recovery_own\", \"fumble_recovery_yards_own\", \"fumble_recovery_opp\", \"fumble_recovery_yards_opp\", \"fumble_recovery_tds\", \"penalties\", \"penalty_yards\", \"timeouts\", \"fumbles_forced_by_opp\", \"fumbles_not_forced\", \"fumbles_out_of_bounds\", \"fumbles_total\", \"fumbles_lost_total\", \"punt_returns\", \"punt_return_yards\", \"kickoff_returns\", \"kickoff_return_yards\", \"fg_made\", \"fg_att\", \"fg_missed\", \"fg_blocked\", \"fg_long\", \"fg_pct\", \"fg_made_0_19\", 
\"fg_made_20_29\", \"fg_made_30_39\", \"fg_made_40_49\", \"fg_made_50_59\", \"fg_made_60_\", \"fg_missed_0_19\", \"fg_missed_20_29\", \"fg_missed_30_39\", \"fg_missed_40_49\", \"fg_missed_50_59\", \"fg_missed_60_\", \"fg_made_list\", \"fg_missed_list\", \"fg_blocked_list\", \"fg_made_distance\", \"fg_missed_distance\", \"fg_blocked_distance\", \"pat_made\", \"pat_att\", \"pat_missed\", \"pat_blocked\", \"pat_pct\", \"gwfg_made\", \"gwfg_att\", \"gwfg_missed\", \"gwfg_blocked\", \"gwfg_distance_list\", \"pt_att\", \"pt_blocked\", \"pt_long\", \"pt_yards\", \"pt_inside_20\", \"pt_out_of_bounds\", \"pt_downed\", \"pt_touchback\", \"pt_fair_caught\", \"pt_returned\", \"pt_return_yards\", \"pt_return_tds\", \"pt_net_yards\"]\n        }\n      },\n      \"value\": [\"integer\", \"character\", \"character\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", 
\"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"character\", \"character\", \"character\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"character\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\"]\n    }\n\n---\n\n    {\n      \"type\": \"character\",\n      \"attributes\": {\n        \"names\": {\n          \"type\": \"character\",\n          \"attributes\": {},\n          \"value\": [\"season\", \"week\", \"team\", \"season_type\", \"game_id\", \"opponent_team\", \"completions\", \"attempts\", \"passing_yards\", \"passing_tds\", \"passing_interceptions\", \"sacks_suffered\", \"sack_yards_lost\", \"sack_fumbles\", \"sack_fumbles_lost\", \"passing_air_yards\", \"passing_yards_after_catch\", \"passing_first_downs\", \"passing_epa\", \"passing_cpoe\", \"passing_2pt_conversions\", \"passing_10\", \"passing_16\", \"passing_20\", \"passing_40\", \"carries\", \"rushing_yards\", \"rushing_tds\", \"rushing_fumbles\", \"rushing_fumbles_lost\", \"rushing_first_downs\", \"rushing_epa\", \"rushing_2pt_conversions\", \"rushing_10\", \"rushing_12\", \"rushing_20\", \"rushing_40\", \"receptions\", \"targets\", \"receiving_yards\", \"receiving_tds\", \"receiving_fumbles\", \"receiving_fumbles_lost\", \"receiving_air_yards\", \"receiving_yards_after_catch\", \"receiving_first_downs\", \"receiving_epa\", \"receiving_2pt_conversions\", \"receiving_10\", \"receiving_16\", \"receiving_20\", \"receiving_40\", \"special_teams_tds\", \"def_tackles_solo\", \"def_tackles_with_assist\", \"def_tackle_assists\", \"def_tackles_for_loss\", \"def_tackles_for_loss_yards\", \"def_fumbles_forced\", \"def_sacks\", \"def_sack_yards\", \"def_qb_hits\", \"def_interceptions\", \"def_interception_yards\", 
\"def_pass_defended\", \"def_tds\", \"def_fumbles\", \"def_safeties\", \"misc_yards\", \"fumble_recovery_own\", \"fumble_recovery_yards_own\", \"fumble_recovery_opp\", \"fumble_recovery_yards_opp\", \"fumble_recovery_tds\", \"penalties\", \"penalty_yards\", \"timeouts\", \"fumbles_forced_by_opp\", \"fumbles_not_forced\", \"fumbles_out_of_bounds\", \"fumbles_total\", \"fumbles_lost_total\", \"punt_returns\", \"punt_return_yards\", \"kickoff_returns\", \"kickoff_return_yards\", \"fg_made\", \"fg_att\", \"fg_missed\", \"fg_blocked\", \"fg_long\", \"fg_pct\", \"fg_made_0_19\", \"fg_made_20_29\", \"fg_made_30_39\", \"fg_made_40_49\", \"fg_made_50_59\", \"fg_made_60_\", \"fg_missed_0_19\", \"fg_missed_20_29\", \"fg_missed_30_39\", \"fg_missed_40_49\", \"fg_missed_50_59\", \"fg_missed_60_\", \"fg_made_list\", \"fg_missed_list\", \"fg_blocked_list\", \"fg_made_distance\", \"fg_missed_distance\", \"fg_blocked_distance\", \"pat_made\", \"pat_att\", \"pat_missed\", \"pat_blocked\", \"pat_pct\", \"gwfg_made\", \"gwfg_att\", \"gwfg_missed\", \"gwfg_blocked\", \"gwfg_distance\", \"pt_att\", \"pt_blocked\", \"pt_long\", \"pt_yards\", \"pt_inside_20\", \"pt_out_of_bounds\", \"pt_downed\", \"pt_touchback\", \"pt_fair_caught\", \"pt_returned\", \"pt_return_yards\", \"pt_return_tds\", \"pt_net_yards\"]\n        }\n      },\n      \"value\": [\"integer\", \"integer\", \"character\", \"character\", \"character\", \"character\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", 
\"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"character\", \"character\", \"character\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\"]\n    }\n\n---\n\n    {\n      \"type\": \"character\",\n      \"attributes\": {\n        \"names\": {\n          \"type\": \"character\",\n          \"attributes\": {},\n          \"value\": [\"player_id\", \"player_name\", \"player_display_name\", \"position\", \"position_group\", \"headshot_url\", \"season\", \"week\", \"season_type\", \"game_id\", \"team\", \"opponent_team\", \"completions\", \"attempts\", \"passing_yards\", \"passing_tds\", \"passing_interceptions\", \"sacks_suffered\", \"sack_yards_lost\", \"sack_fumbles\", \"sack_fumbles_lost\", \"passing_air_yards\", \"passing_yards_after_catch\", \"passing_first_downs\", \"passing_epa\", \"passing_cpoe\", \"passing_2pt_conversions\", \"pacr\", \"passing_10\", \"passing_16\", \"passing_20\", \"passing_40\", \"carries\", \"rushing_yards\", \"rushing_tds\", \"rushing_fumbles\", \"rushing_fumbles_lost\", \"rushing_first_downs\", 
\"rushing_epa\", \"rushing_2pt_conversions\", \"rushing_10\", \"rushing_12\", \"rushing_20\", \"rushing_40\", \"receptions\", \"targets\", \"receiving_yards\", \"receiving_tds\", \"receiving_fumbles\", \"receiving_fumbles_lost\", \"receiving_air_yards\", \"receiving_yards_after_catch\", \"receiving_first_downs\", \"receiving_epa\", \"receiving_2pt_conversions\", \"receiving_10\", \"receiving_16\", \"receiving_20\", \"receiving_40\", \"racr\", \"target_share\", \"air_yards_share\", \"wopr\", \"special_teams_tds\", \"def_tackles_solo\", \"def_tackles_with_assist\", \"def_tackle_assists\", \"def_tackles_for_loss\", \"def_tackles_for_loss_yards\", \"def_fumbles_forced\", \"def_sacks\", \"def_sack_yards\", \"def_qb_hits\", \"def_interceptions\", \"def_interception_yards\", \"def_pass_defended\", \"def_tds\", \"def_fumbles\", \"def_safeties\", \"misc_yards\", \"fumble_recovery_own\", \"fumble_recovery_yards_own\", \"fumble_recovery_opp\", \"fumble_recovery_yards_opp\", \"fumble_recovery_tds\", \"penalties\", \"penalty_yards\", \"fumbles_forced_by_opp\", \"fumbles_not_forced\", \"fumbles_out_of_bounds\", \"fumbles_total\", \"fumbles_lost_total\", \"punt_returns\", \"punt_return_yards\", \"kickoff_returns\", \"kickoff_return_yards\", \"fg_made\", \"fg_att\", \"fg_missed\", \"fg_blocked\", \"fg_long\", \"fg_pct\", \"fg_made_0_19\", \"fg_made_20_29\", \"fg_made_30_39\", \"fg_made_40_49\", \"fg_made_50_59\", \"fg_made_60_\", \"fg_missed_0_19\", \"fg_missed_20_29\", \"fg_missed_30_39\", \"fg_missed_40_49\", \"fg_missed_50_59\", \"fg_missed_60_\", \"fg_made_list\", \"fg_missed_list\", \"fg_blocked_list\", \"fg_made_distance\", \"fg_missed_distance\", \"fg_blocked_distance\", \"pat_made\", \"pat_att\", \"pat_missed\", \"pat_blocked\", \"pat_pct\", \"gwfg_made\", \"gwfg_att\", \"gwfg_missed\", \"gwfg_blocked\", \"gwfg_distance\", \"pt_att\", \"pt_blocked\", \"pt_long\", \"pt_yards\", \"pt_inside_20\", \"pt_out_of_bounds\", \"pt_downed\", \"pt_touchback\", \"pt_fair_caught\", 
\"pt_returned\", \"pt_return_yards\", \"pt_return_tds\", \"pt_net_yards\", \"fantasy_points\", \"fantasy_points_ppr\"]\n        }\n      },\n      \"value\": [\"character\", \"character\", \"character\", \"character\", \"character\", \"character\", \"integer\", \"integer\", \"character\", \"character\", \"character\", \"character\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"numeric\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"numeric\", \"numeric\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"character\", \"character\", \"character\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", \"integer\", 
\"integer\", \"integer\", \"integer\", \"integer\", \"numeric\", \"numeric\"]\n    }\n\n"
  },
  {
    "path": "tests/testthat/helpers.R",
    "content": "# sample games we'll use to check with\ngame_ids <- c(\"2025_01_KC_LAC\", \"2019_01_GB_CHI\")\n\ntest_dir <- getwd()\n\npbp_cache <- tempfile(\"pbp_cache\", fileext = \".rds\")\n\nload_test_pbp <- function(pbp = pbp_cache, dir = test_dir) {\n  if (file.exists(pbp) && !is.null(dir)) {\n    if (interactive()) {\n      cli::cli_alert_info(\"Will return pbp from cache\")\n    }\n    return(readRDS(pbp))\n  }\n\n  g <- readRDS(file.path(test_dir, paste0(\"games.rds\")))\n\n  # model output differs across machines so we round to 4 significant digits\n  # to prevent failing tests\n  pbp_data <- build_nflfastR_pbp(game_ids, dir = dir, games = g)\n  if (!is.null(dir)) {\n    saveRDS(pbp_data, pbp)\n  }\n  pbp_data\n}\n\nsave_test_object <- function(object) {\n  obj_name <- deparse(substitute(object))\n  tmp_file <- tempfile(obj_name, fileext = \".csv\")\n  modify_digits <- dplyr::mutate_if(object, is.numeric, signif, digits = 3)\n  data.table::fwrite(modify_digits, tmp_file, na = \"NA\")\n  invisible(tmp_file)\n}\n\nload_expectation <- function(\n  type = c(\"pbp\", \"sc\", \"sc_weekly\", \"ep\", \"wp\"),\n  dir = test_dir\n) {\n  type <- match.arg(type)\n  file_name <- switch(\n    type,\n    \"pbp\" = \"expected_pbp.rds\",\n    \"sc\" = \"expected_sc.rds\",\n    \"sc_weekly\" = \"expected_sc_weekly.rds\",\n    \"ep\" = \"expected_ep.rds\",\n    \"wp\" = \"expected_wp.rds\",\n  )\n  strip_nflverse_attributes(readRDS(file.path(dir, file_name))) |>\n    # we gotta round floating point numbers because of different model output\n    # across platforms\n    round_double_to_digits()\n}\n\n# strip nflverse attributes for tests because timestamp and version cause failures\n# .internal.selfref is a data.table attribute that is not necessary in this case\nstrip_nflverse_attributes <- function(df) {\n  input_attrs <- names(attributes(df))\n  input_remove <- input_attrs[grepl(\n    \"nflverse|.internal.selfref|nflfastR\",\n    input_attrs\n  )]\n  
attributes(df)[input_remove] <- NULL\n  df\n}\n\nround_double_to_digits <- function(df, digits = 3) {\n  dplyr::mutate(\n    df,\n    dplyr::across(\n      .cols = relevant_variables(),\n      .fns = function(vec) {\n        formatC(vec, digits = digits, format = \"fg\") |>\n          as.numeric() |>\n          suppressWarnings()\n      }\n    )\n  )\n}\n\nrelevant_variables <- function() {\n  c(\n    dplyr::any_of(c(\n      \"no_score_prob\",\n      \"opp_fg_prob\",\n      \"opp_safety_prob\",\n      \"opp_td_prob\",\n      \"fg_prob\",\n      \"safety_prob\",\n      \"td_prob\",\n      \"ep\",\n      \"cp\",\n      \"cpoe\",\n      \"pass_oe\",\n      \"xpass\"\n    )),\n    dplyr::ends_with(\"epa\"),\n    dplyr::ends_with(\"wp\"),\n    dplyr::ends_with(\"wp_post\"),\n    dplyr::ends_with(\"wpa\"),\n    dplyr::starts_with(\"xyac\")\n  )\n}\n"
  },
  {
    "path": "tests/testthat/test-build_nflfastR_pbp.R",
    "content": "test_that(\"build_nflfastR_pbp works (local data)\", {\n  # This test used to run on CRAN but their changes to env vars which cause\n  # check NOTES for multi-threading forced us to skip on cran. It uses locally\n  # available data so it can't break because of failed downloads\n  # UPDATE Feb 2026: we'll try testing on CRAN again\n  # skip_on_cran()\n\n  pbp <- load_test_pbp(dir = test_dir)\n  expect_s3_class(pbp, \"nflverse_data\")\n  pbp <- strip_nflverse_attributes(pbp) |>\n    # we gotta round floating point numbers because of different model output\n    # across platforms\n    round_double_to_digits()\n  exp <- load_expectation(\"pbp\")\n  expect_equal(pbp, exp)\n})\n\ntest_that(\"build_nflfastR_pbp works (outside CRAN)\", {\n  # this test is almost the same as above. However, it requires data download\n  # and will therefore not run on CRAN but everywhere else.\n  skip_on_cran()\n\n  skip_if_offline(\"github.com\")\n  pbp <- load_test_pbp(dir = NULL)\n  pbp <- strip_nflverse_attributes(pbp) |>\n    # we gotta round floating point numbers because of different model output\n    # across platforms\n    round_double_to_digits()\n  exp <- load_expectation(\"pbp\")\n  expect_equal(pbp, exp)\n})\n\ntest_that(\"default_play is synced with build_nflfastR_pbp\", {\n  # `default_play` is a table of 1 row that is supposed to match the\n  # output structure of build_nflfastR_pbp. It is used to initialize the\n  # data table in pbp DBs.\n  # This test makes sure that it is synced with build_nflfastR_pbp\n\n  exp <- load_expectation(\"pbp\")\n\n  names_and_types_exp <- vapply(exp, class, FUN.VALUE = character(1L))\n  names_and_types_def <- vapply(default_play, class, FUN.VALUE = character(1L))\n\n  expect_identical(names_and_types_def, names_and_types_exp)\n  expect_snapshot_value(names_and_types_def, style = \"json2\")\n})\n"
  },
  {
    "path": "tests/testthat/test-calculate_series_conversion_rates.R",
    "content": "test_that(\"calculate_series_conversion_rates works\", {\n  # This test used to run on CRAN but their changes to env vars which cause\n  # check NOTES for multi-threading forced us to skip on cran.\n  skip_on_cran()\n\n  pbp <- load_test_pbp()\n\n  sc <- calculate_series_conversion_rates(pbp = pbp, weekly = FALSE) |>\n    round_double_to_digits()\n  sc_weekly <- calculate_series_conversion_rates(pbp = pbp, weekly = TRUE) |>\n    round_double_to_digits()\n\n  exp_sc <- load_expectation(\"sc\")\n  exp_sc_weekly <- load_expectation(\"sc_weekly\")\n\n  expect_s3_class(sc, \"tbl_df\")\n  expect_s3_class(sc_weekly, \"tbl_df\")\n\n  expect_equal(sc, exp_sc)\n  expect_equal(sc_weekly, exp_sc_weekly)\n})\n"
  },
  {
    "path": "tests/testthat/test-calculate_stats.R",
    "content": "test_that(\"calculate_stats works\", {\n  skip_on_cran()\n  skip_if_offline(\"github.com\")\n\n  s1 <- calculate_stats(\n    seasons = 2023,\n    summary_level = \"season\",\n    stat_type = \"player\"\n  )\n  s2 <- calculate_stats(\n    seasons = 2023,\n    summary_level = \"week\",\n    stat_type = \"player\"\n  )\n  s3 <- calculate_stats(\n    seasons = 2023,\n    summary_level = \"season\",\n    stat_type = \"team\"\n  )\n  s4 <- calculate_stats(\n    seasons = 2023,\n    summary_level = \"week\",\n    stat_type = \"team\"\n  )\n  s5 <- calculate_stats(\n    seasons = 2023,\n    summary_level = \"week\",\n    stat_type = \"player\",\n    season_type = \"POST\"\n  )\n\n  names_and_types_s1 <- vapply(s1, class, FUN.VALUE = character(1L))\n  names_and_types_s2 <- vapply(s2, class, FUN.VALUE = character(1L))\n  names_and_types_s3 <- vapply(s3, class, FUN.VALUE = character(1L))\n  names_and_types_s4 <- vapply(s4, class, FUN.VALUE = character(1L))\n  names_and_types_s5 <- vapply(s5, class, FUN.VALUE = character(1L))\n\n  var_names <- nflfastR::nfl_stats_variables$variable\n\n  # Make sure variable names are listed in nflfastR::nfl_stats_variables$variable\n  expect_in(names(names_and_types_s1), var_names)\n  expect_in(names(names_and_types_s2), var_names)\n  expect_in(names(names_and_types_s3), var_names)\n  expect_in(names(names_and_types_s4), var_names)\n  expect_in(names(names_and_types_s5), var_names)\n\n  # Weak row number test\n  expect_gt(nrow(s1), 1900)\n  expect_gt(nrow(s2), 17500)\n  expect_identical(nrow(s3), 32L)\n  expect_gt(nrow(s4), 500)\n  expect_gt(nrow(s5), 800)\n\n  # Snapshot variable types and names\n  expect_snapshot_value(names_and_types_s1, style = \"json2\", variant = \"stats\")\n  expect_snapshot_value(names_and_types_s2, style = \"json2\", variant = \"stats\")\n  expect_snapshot_value(names_and_types_s3, style = \"json2\", variant = \"stats\")\n  expect_snapshot_value(names_and_types_s4, style = \"json2\", variant = 
\"stats\")\n  expect_snapshot_value(names_and_types_s5, style = \"json2\", variant = \"stats\")\n})\n\ntest_that(\"calculate_stats works with pbp subsets\", {\n  skip_on_cran()\n  skip_if_offline(\"github.com\")\n\n  pbp <- load_pbp(2024) |>\n    dplyr::filter(week <= 2, grepl(\"LAC\", game_id))\n  s <- calculate_stats(summary_level = \"week\", stat_type = \"player\", pbp = pbp)\n\n  # Weak row number test\n  expect_lt(nrow(s), 130)\n\n  # week is filtered to <= 2 so stats should return only those weeks\n  expect_in(unique(s$week), 1:2)\n\n  # drop some required columns\n  pbp_wrong <- pbp |> dplyr::mutate(qb_epa = NULL, play_type = NULL)\n  expect_error(\n    calculate_stats(pbp = pbp_wrong),\n    regexp = 'missing the following required variables: \"play_type\" and \"qb_epa\"'\n  )\n})\n"
  },
  {
    "path": "tests/testthat/test-ep_wp_calculators.R",
    "content": "test_that(\"calculate_expected_points works\", {\n  # This test used to run on CRAN but their changes to env vars which cause\n  # check NOTES for multi-threading forced us to skip on cran.\n  skip_on_cran()\n\n  data <- tibble::tibble(\n    \"season\" = 2018:2019,\n    \"home_team\" = \"SEA\",\n    \"posteam\" = \"SEA\",\n    \"roof\" = \"outdoors\",\n    \"half_seconds_remaining\" = 1800,\n    \"yardline_100\" = 75,\n    \"down\" = 1,\n    \"ydstogo\" = 10,\n    \"posteam_timeouts_remaining\" = 3,\n    \"defteam_timeouts_remaining\" = 3\n  )\n  ep <- calculate_expected_points(data) |> round_double_to_digits()\n  exp <- load_expectation(\"ep\")\n  expect_equal(ep, exp)\n})\n\ntest_that(\"calculate_expected_points works\", {\n  # This test used to run on CRAN but their changes to env vars which cause\n  # check NOTES for multi-threading forced us to skip on cran.\n  skip_on_cran()\n\n  data <- tibble::tibble(\n    \"receive_2h_ko\" = 0,\n    \"home_team\" = \"SEA\",\n    \"posteam\" = \"SEA\",\n    \"score_differential\" = 0,\n    \"half_seconds_remaining\" = 1800,\n    \"game_seconds_remaining\" = 3600,\n    \"spread_line\" = c(1, 3, 4, 7, 14),\n    \"down\" = 1,\n    \"ydstogo\" = 10,\n    \"yardline_100\" = 75,\n    \"posteam_timeouts_remaining\" = 3,\n    \"defteam_timeouts_remaining\" = 3\n  )\n  wp <- calculate_win_probability(data) |> round_double_to_digits()\n  exp <- load_expectation(\"wp\")\n  expect_equal(wp, exp)\n})\n"
  },
  {
    "path": "tests/testthat.R",
    "content": "# This file is part of the standard setup for testthat.\n# It is recommended that you do not modify it.\n#\n# Where should you do additional test configuration?\n# Learn more about the roles of various files in:\n# * https://r-pkgs.org/tests.html\n# * https://testthat.r-lib.org/reference/test_package.html#special-files\n\nlibrary(testthat)\nlibrary(nflfastR)\n\ntest_check(\"nflfastR\")\n"
  },
  {
    "path": "tools/check.env",
    "content": "# Check for usage of more than two cores. We really need to do this\n# because CRAN kept rejecting nflfastR\n# It is not supported on Windows and keeps failing on Debian, so it's\n# probably necessary to make sure it doesn't fail on Debian\n_R_CHECK_EXAMPLE_TIMING_CPU_TO_ELAPSED_THRESHOLD_=\"2.5\"\n_R_CHECK_TEST_TIMING_CPU_TO_ELAPSED_THRESHOLD_=\"2.5\"\n"
  },
  {
    "path": "vignettes/.gitignore",
    "content": "*.html\n*.R\npbp_db\n"
  },
  {
    "path": "vignettes/beginners_guide.Rmd",
    "content": "---\ntitle: \"A beginner's guide to nflfastR\"\nauthor: \"Ben Baldwin\"\n---\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#>\",\n  out.width = \"100%\"\n)\n```\n\n## Introduction\n\nThe following guide will assume you have R installed. I also highly recommend working in RStudio. If you need help getting those installed or are unfamiliar with how RStudio is laid out, [please see this section of Lee Sharpe's guide](https://github.com/leesharpe/nfldata/blob/master/RSTUDIO-INTRO.md#r-and-rstudio-introduction).\n\nA quick word if you're new to programming: all of this is happening in R. Obviously, you need to install R on your computer to do any of this. Make sure you save what you're doing in a script (in RStudio, File --> New File --> R script) so you can save your work and run multiple lines of code at once. To run code from a script, highlight what you want, and press control + enter or press the Run button in the top of the editor (see Lee's guide). If you don't highlight anything and press control + enter, the currently selected line will run. As you go through your R journey, you might get stuck and have to google a bunch of things, but that's totally okay and normal. That's how I got started!\n\n## Setup\n\nFirst, you need to install the magic packages. You only need to run this step once on a given computer. For these you can just type them into the RStudio console (look for the Console pane in RStudio) directly since you're never going to be doing this again.\n\n### Install packages\n\n``` {r eval = FALSE}\ninstall.packages(\"tidyverse\", type = \"binary\")\ninstall.packages(\"ggrepel\", type = \"binary\")\ninstall.packages(\"nflreadr\", type = \"binary\")\ninstall.packages(\"nflplotR\", type = \"binary\")\n```\n\n### Load packages\n\nOkay, now here's the stuff you're going to want to start putting into your R script. 
The following loads `tidyverse`, which contains a lot of helper functions for working with data and `ggrepel` for making figures, along with `nflreadr` (which allows one to quickly download `nflfastR` data, along with a lot of other data). Finally, `nflplotR` makes plotting easier.\n``` {r, results = 'hide', message = FALSE }\nlibrary(tidyverse)\nlibrary(ggrepel)\nlibrary(nflreadr)\nlibrary(nflplotR)\n```\n\nThis one is optional but makes R prefer not to display numbers in scientific notation, which I find very annoying:\n``` {r}\noptions(scipen = 9999)\n```\n\n### Load data\n\nThis will load the full play by play for the 2019 season (including playoffs). We'll get to how to get more seasons later. Note that this is downloading pre-cleaned data from the nflfastR data repository using the `load_pbp()` function included in `nflreadr`, which is much faster than building pbp from scratch.\n\n``` {r}\ndata <- load_pbp(2019)\n```\n\n## Basics: how to look at your data\n\n### Dimensions\n\n```{r echo=FALSE}\nrows = dim(data)[[1]]\ncols = dim(data)[[2]]\n```\n\nBefore moving forward, here are a few ways to get a sense of what's in a dataframe. We can check the **dim**ensions of the data, and this tells us that there are ```r rows``` rows (i.e., plays) in the data and ```r cols``` columns (variables):\n\n``` {r}\ndim(data)\n```\n\n`str` displays the **str**ucture of the dataframe:\n``` {r}\nstr(data[1:10])\n```\n\nIn the above, I've added in the `[1:10]`, which selects only the first 10 columns, otherwise the list is extremely long (remember from above that there are ```r cols``` columns!). Normally, you would just type `str(data)`.\n\nYou can similarly take a glimpse at your data:\n\n``` {r}\nglimpse(data[1:10])\n```\n\nWhere again I'm only showing the first 10 columns. 
The usual command would be `glimpse(data)`.\n\n### Variable names\n\nAnother very useful command is to get the `names` of the variables in the data, which you would get by entering `names(data)` (I won't show here because, again, it is  ```r cols``` columns).\n\nThat is a lot to work with!\n\n### Viewer\n\nOne more way to look at your data is with the `View()` function. If you're coming from an Excel background, this will help you feel more at home as a way to see what's in the data.\n\n``` {r eval = FALSE}\nView(data)\n```\nThis will open the viewer in RStudio in a new panel. Try it out yourself! Since there are so many columns, the Viewer won't show them all. To pick which columns to view, you can **select** some:\n\n``` {r eval = FALSE}\ndata |>\n  select(home_team, away_team, posteam, desc) |>\n  View()\n```\n\nThe `|>` thing lets you pipe together a bunch of different commands. So we're taking our data, \"`select`\"ing a few variables we want to look at, and then Viewing. Again, I can't display the results of that here, but try it out yourself!\n\n### Head + manipulation\n\nTo start, let's just look at the first few rows (the \"head\") of the data.\n\n``` {r}\ndata |> \n  select(posteam, defteam, desc, rush, pass) |> \n  head()\n```\nA couple things. \"`desc`\" is the important variable that lists the description of what happened on the play, and `head` says to show the first few rows (the \"head\" of the data). Since this is already sorted by game, these are the first 6 rows from a week 1 game, ATL @ MIN. To make code easier to read, people often put each part of a pipe on a new line, which is useful when working with more complicated functions. 
We could run:\n\n``` {r eval = FALSE}\ndata |> select(posteam, defteam, desc, rush, pass) |> head()\n```\n\nAnd it would return the exact same output as the one written out in multiple lines, but the code isn't as easy to read.\n\nWe've covered `select`, and the next important function to learn is `filter`, which lets you filter the data to what you want. The following returns only plays that are run plays and pass plays; i.e., no punts, kickoffs, field goals, or dead ball penalties (e.g. false starts) where we don't know what the attempted play was.\n\n``` {r}\ndata |> \n  filter(rush == 1 | pass == 1) |>\n  select(posteam, desc, rush, pass, name, passer, rusher, receiver) |> \n  head()\n```\n\nCompared to the first time we did this, the opening line for the start of the game, the kickoff, and the punt are now gone. Note that if you're checking whether a variable is equal to something, we need to use the double equals sign `==` like above. There's probably some technical reason for this [shrug emoji]. Also, the character `|` is used for \"or\", and `&` for \"and\". So `rush == 1 | pass == 1` means \"rush or pass\".\n\nNote that the `rush`, `pass`, `name`, `passer`, `rusher`, and `receiver` columns are all `nflfastR` creations, where we have provided these to make working with the data easier. As we can see above, `passer` is filled in for all dropbacks (including sacks and scrambles, which also have `pass` = 1), and `name` is equal to the passer on pass plays and the rusher on rush plays. Think of this as the primary player involved on a play.\n\nWhat if we wanted to view special teams plays? 
Again, we can use `filter`:\n\n``` {r}\ndata |> \n  filter(special == 1) |>\n  select(down, ydstogo, desc) |> \n  head()\n```\n\nFourth down plays?\n\n``` {r}\ndata |> \n  filter(down == 4) |>\n  select(down, ydstogo, desc) |> \n  head()\n```\n\nFourth down plays that aren't special teams plays?\n\n``` {r}\ndata |> \n  filter(down == 4 & special == 0) |>\n  select(down, ydstogo, desc) |> \n  head()\n```\n\nSo far, we've just been taking a look at the initial dataset we downloaded, but none of our results are preserved. To save a new dataframe of just the plays we want, we need to use `<-` to assign a new dataframe. Let's save a new dataframe that's just run plays and pass plays with non-missing EPA, called `pbp_rp`.\n\n``` {r}\npbp_rp <- data |>\n  filter(rush == 1 | pass == 1, !is.na(epa))\n```\n\nIn the above, `!is.na(epa)` means to exclude plays with missing (`na`) EPA. The `!` symbol is often used by computer folk to negate something, so `is.na(epa)` means \"EPA is missing\" and `!is.na(epa)` means \"EPA is not missing\", which we have used above.\n\n## Some basic stuff: Part 1\n\nOkay, we have a big dataset where we call dropbacks pass plays and non-dropbacks rush plays. Now we actually want to, like, do stuff.\n\n### Group by and Summarize\n\nLet's take a look at how various Cowboys' running backs fared on run plays in 2019:\n\n``` {r}\npbp_rp |>\n\tfilter(posteam == \"DAL\", rush == 1) |>\n\tgroup_by(rusher) |>\n\tsummarize(\n\t  mean_epa = mean(epa), success_rate = mean(success), ypc = mean(yards_gained), plays = n()\n\t  ) |>\n\tarrange(-mean_epa) |>\n\tfilter(plays > 20)\n```\n\nThere's a lot going on here. We've covered `filter` already. The `group_by` function is an *extremely* useful function that, well, groups by what you tell it -- in this case the rusher. 
`summarize` is useful for collapsing the data down to a summary of what you're looking at, and here, while grouping by player, we're summarizing the mean of EPA, success, yardage (a bad rushing stat, but since we're here), and getting the number of plays using `n()`, which returns the number in a group. Unsurprisingly, Prescott was much more effective as a rusher in 2019 than the running backs, and there was no meaningful difference between Pollard and Elliott in efficiency.\n\nIf you check the [PFR team stats page](https://www.pro-football-reference.com/teams/dal/2019.htm), you'll notice that the above doesn't match up with the official stats. This is because `nflfastR` computes EPA and provides player names on plays with penalties and on two-point conversions. So if we want to match the official stats, we need to restrict to `down <= 4` (to exclude two-point conversions, which have down listed as `NA`) and `play_type == 'run'` (to exclude penalties, which are `play_type == 'no_play'`):\n\n``` {r}\npbp_rp |>\n\tfilter(posteam == \"DAL\", down <= 4, play_type == 'run') |>\n\tgroup_by(rusher) |>\n\tsummarize(\n\t  mean_epa = mean(epa), success_rate = mean(success), ypc=mean(yards_gained), plays=n()\n\t  ) |>\n\tfilter(plays > 20)\n```\n\nNow we exactly match PFR: Zeke has 301 carries at 4.5 yards/carry, and Pollard has 86 carries for 5.3 yards/carry. Note that we still aren't matching Dak's stats to PFR because the NFL classifies scrambles as rush attempts and `nflfastR` does not.\n\n### Manipulating columns: mutate, if_else, and case_when\n\nLet's say we want to make a new column, named `home`, which is equal to 1 if the team with the ball is the home team. 
Let's introduce another extremely useful function, `if_else`:\n\n``` {r}\npbp_rp |>\n  mutate(\n    home = if_else(posteam == home_team, 1, 0)\n  ) |>\n  select(posteam, home_team, home) |>\n  head(10)\n```\n\n`mutate` is R's word for creating a new column (or overwriting an existing one); in this case, we've created a new column called `home`. The above uses `if_else`, which uses the following pattern: condition (in this case, `posteam == home_team`), value if condition is true (in this case, if `posteam == home_team`, it is 1), and value if the condition is false (0). So we could use this to, for example, look at average EPA/play by home and road teams:\n\n``` {r}\npbp_rp |>\n  mutate(\n    home = if_else(posteam == home_team, 1, 0)\n  ) |>\n  group_by(home) |>\n  summarize(epa = mean(epa))\n```\nNote that EPA/play is similar for home teams and away teams because `home` is already built into the `nflfastR` EPA model, so this result is expected. In fact, away EPA/play is somewhat higher, presumably because away teams outperformed their usual level in 2019 as homefield advantage continues to decline generally.\n\n`if_else` is nice if you're creating a new column based on a simple condition. But what if you need to do something more complicated? `case_when` is a good option. Here's how it works:\n\n``` {r}\npbp_rp |>\n  filter(!is.na(cp)) |>\n  mutate(\n    depth = case_when(\n      air_yards < 0 ~ \"Negative\",\n      air_yards >= 0 & air_yards < 10 ~ \"Short\",\n      air_yards >= 10 & air_yards < 20 ~ \"Medium\",\n      air_yards >= 20 ~ \"Deep\"\n    )\n  ) |>\n  group_by(depth) |>\n  summarize(cp = mean(cp))\n```\nNote the new syntax for `case_when`: we have condition (for the first one, air yards less than 0), followed by `~`, followed by assignment (for the first one, \"Negative\"). In the above, we created 4 bins based on air yards and got average completion probability (`cp`) based on the `nflfastR` model. 
Unsurprisingly, `cp` is lower the longer downfield a throw goes.\n\n### A basic figure\n\nNow that we've gained some skills at manipulating data, let's put it to use by making things. Which teams were the most pass-heavy in the first half on early downs with win probability between 20 and 80, excluding the final 2 minutes of the half when everyone is pass-happy?\n\n``` {r}\nschotty <- pbp_rp |>\n\tfilter(wp > .20 & wp < .80 & down <= 2 & qtr <= 2 & half_seconds_remaining > 120) |>\n\tgroup_by(posteam) |>\n\tsummarize(mean_pass = mean(pass), plays = n()) |>\n\tarrange(-mean_pass)\nschotty\n```\n\nAgain, we've already used `filter`, `group_by`, and `summarize`. The new function we are using here is `arrange`, which sorts the data by the variable(s) given. The minus sign in front of `mean_pass` means to sort in descending order.\n\nLet's make our first figure:\n\n```{r fig1, warning = FALSE, message = FALSE, results = 'hide', fig.keep = 'all', dpi = 600}\nggplot(schotty, aes(x = reorder(posteam, -mean_pass), y = mean_pass)) +\n  geom_text(aes(label = posteam))\n```\n\nThis image is kind of a mess -- we still need a title, axis labels, etc -- but gets the point across. We'll get to that other stuff later. But more importantly, we made something interesting using `nflfastR` data! The \"reorder\" sorts the teams according to pass rate, with the \"-\" again saying to do it in descending order. \"aes\" is short for \"aesthetic\", which is R's weird way of asking which variables should go on the x and y axes.\n\nLooking at the figure, the Chiefs will never have playoff success until they establish the run.\n\n## Loading multiple seasons\n\nBecause all the data is stored in the data repository, it is very fast to load data from multiple seasons.\n\n``` {r}\npbp <- load_pbp(2015:2019)\n```\n\nThis loads play-by-play data from the 2015 through 2019 seasons. \n\nLet's make sure we got it all. 
By now, you should understand what this is doing:\n\n``` {r}\npbp |>\n  group_by(season) |>\n  summarize(n = n())\n```\n\nSo each season has about 48,000 plays. Just for fun, let's look at the various play types:\n\n``` {r}\npbp |>\n  group_by(play_type) |>\n  summarize(n = n())\n```\n\n## Figures with QB stats\n\nLet's do some stuff with quarterbacks:\n\n``` {r}\nqbs <- pbp |>\n  filter(season_type == \"REG\", !is.na(epa)) |>\n  group_by(id, name) |>\n  summarize(\n    epa = mean(qb_epa),\n    cpoe = mean(cpoe, na.rm = TRUE),\n    n_dropbacks = sum(pass),\n    n_plays = n(),\n    team = last(posteam)\n  ) |>\n  ungroup() |>\n  filter(n_dropbacks > 100 & n_plays > 1000)\n```\n\nLots of new stuff here. First, we're grouping by `id` and `name` to make sure we're getting unique players; i.e., if two players have the same name (like Javorius Allen and Josh Allen both being J.Allen), we are also using their id to differentiate them. `qb_epa` is an `nflfastR` creation that is equal to EPA in all instances except for when a pass is completed and a fumble is lost, in which case a QB gets \"credit\" for the play up to the spot the fumble was lost (making EPA function like passing yards). The `last` part in the `summarize` command gets the last team that a player was observed playing with.\n\nMy way of getting a dataset with only quarterbacks without joining to external roster data is to make sure they hit some number of dropbacks. In this case, filtering with `n_dropbacks > 100` makes sure we're only including quarterbacks. The `ungroup()` near the end is good practice after grouping to make sure you don't get weird behavior with the data you created down the line.\n\nLet's make some more figures. 
The `load_teams()` function is provided in the `nflreadr` package, so since we have already loaded the package, it's ready to use.\n\n``` {r}\nload_teams()\n```\n\nLet's join this to the `qbs` dataframe we created:\n\n``` {r}\nqbs <- qbs |>\n  left_join(load_teams(), by = c('team' = 'team_abbr'))\n```\n\n`left_join` means keep all the rows from the left dataframe (the first one provided, `qbs`), and join those rows to available rows in the other dataframe. We also need to provide the joining variables, `team` from `qbs` and `team_abbr` from `load_teams()`. Why do we have to type `by = c('team' = 'team_abbr')`? Who knows, but it's what `left_join` requires as instructions for how to match.\n\n### With team color dots\n\nNow we can make a figure!\n\n```{r fig2, warning = FALSE, message = FALSE, results = 'hide', fig.keep = 'all', dpi = 600}\nqbs |>\n  ggplot(aes(x = cpoe, y = epa)) +\n  #horizontal line with mean EPA\n  geom_hline(yintercept = mean(qbs$epa), color = \"red\", linetype = \"dashed\", alpha=0.5) +\n  #vertical line with mean CPOE\n  geom_vline(xintercept =  mean(qbs$cpoe), color = \"red\", linetype = \"dashed\", alpha=0.5) +\n  #add points for the QBs with the right colors\n  #cex controls point size and alpha the transparency (alpha = 1 is normal)\n  geom_point(color = qbs$team_color, cex=qbs$n_plays / 350, alpha = .6) +\n  #add names using ggrepel, which tries to make them not overlap\n  geom_text_repel(aes(label=name)) +\n  #add a smooth line fitting cpoe + epa\n  stat_smooth(geom='line', alpha=0.5, se=FALSE, method='lm')+\n  #titles and caption\n  labs(x = \"Completion % above expected (CPOE)\",\n       y = \"EPA per play (passes, rushes, and penalties)\",\n       title = \"Quarterback Efficiency, 2015 - 2019\",\n       caption = \"Data: @nflfastR\") +\n  #uses the black and white ggplot theme\n  theme_bw() +\n  #center title with hjust = 0.5\n  theme(\n    plot.title = element_text(size = 14, hjust = 0.5, face = \"bold\")\n  ) +\n  #make ticks look 
nice\n  #if this doesn't work, `install.packages('scales')`\n  scale_y_continuous(breaks = scales::pretty_breaks(n = 10)) +\n  scale_x_continuous(breaks = scales::pretty_breaks(n = 10))\n\n```\n\nThis looks complicated, but is just a way of getting a bunch of different stuff on the same plot: we have lines for averages, dots, names, etc. I added comments above to explain what is going on, but in practice for making figures I usually just copy and paste stuff and/or google what I need.\n\n### With team logos\n\nWe could also make the same plot with team logos:\n\n```{r fig3, warning = FALSE, message = FALSE, results = 'hide', fig.keep = 'all', dpi = 600}\nqbs |>\n  ggplot(aes(x = cpoe, y = epa)) +\n  #horizontal line with mean EPA\n  geom_hline(yintercept = mean(qbs$epa), color = \"red\", linetype = \"dashed\", alpha=0.5) +\n  #vertical line with mean CPOE\n  geom_vline(xintercept =  mean(qbs$cpoe), color = \"red\", linetype = \"dashed\", alpha=0.5) +\n  #add points for the QBs with the logos (this uses nflplotR package)\n  geom_nfl_logos(aes(team_abbr = team), width = qbs$n_plays / 45000, alpha = 0.75) +\n  #add names using ggrepel, which tries to make them not overlap\n  geom_text_repel(aes(label=name)) +\n  #add a smooth line fitting cpoe + epa\n  stat_smooth(geom='line', alpha=0.5, se=FALSE, method='lm')+\n  #titles and caption\n  labs(x = \"Completion % above expected (CPOE)\",\n       y = \"EPA per play (passes, rushes, and penalties)\",\n       title = \"Quarterback Efficiency, 2015 - 2019\",\n       caption = \"Data: @nflfastR\") +\n  theme_bw() +\n  #center title\n  theme(\n    plot.title = element_text(size = 14, hjust = 0.5, face = \"bold\")\n  ) +\n  #make ticks look nice\n  scale_y_continuous(breaks = scales::pretty_breaks(n = 10)) +\n  scale_x_continuous(breaks = scales::pretty_breaks(n = 10))\n\n```\n\nThe only changes we've made are to use `geom_nfl_logos` instead of `geom_point` (how to figure out the right size for the images in the `width` part? 
Trial and error).\n\nThis figure would look better with fewer players shown, but the point of this is explaining how to do stuff, so let's call this good enough.\n\n### Team tiers plot\n\nIf it's helpful, here are a few notes about the [chart originally shown here](https://www.nflfastr.com/articles/nflfastR.html#example-5-plot-offensive-and-defensive-epa-per-play-for-a-given-season), which like the above uses `nflplotR` for team logos.\n\n```{r ex5, warning = FALSE, message = FALSE, results = 'hide', fig.keep = 'all', dpi = 600}\nlibrary(nflplotR)\n# get pbp and filter to regular season rush and pass plays\npbp <- nflreadr::load_pbp(2005) |>\n  filter(season_type == \"REG\") |>\n  filter(!is.na(posteam) & (rush == 1 | pass == 1))\n# offense epa\noffense <- pbp |>\n  group_by(team = posteam) |>\n  summarise(off_epa = mean(epa, na.rm = TRUE))\n# defense epa\ndefense <- pbp |>\n  group_by(team = defteam) |>\n  summarise(def_epa = mean(epa, na.rm = TRUE))\n# make figure\noffense |>\n  inner_join(defense, by = \"team\") |>\n  ggplot(aes(x = off_epa, y = def_epa)) +\n  # tier lines\n  geom_abline(slope = -1.5, intercept = (4:-3)/10, alpha = .2) +\n  # nflplotR magic: x0 and y0 match the variables on the x and y axes\n  nflplotR::geom_mean_lines(aes(x0 = off_epa, y0 = def_epa)) +\n  nflplotR::geom_nfl_logos(aes(team_abbr = team), width = 0.07, alpha = 0.7) +\n  labs(\n    x = \"Offense EPA/play\",\n    y = \"Defense EPA/play\",\n    caption = \"Data: @nflfastR\",\n    title = \"2005 NFL Offensive and Defensive EPA per Play\"\n  ) +\n  theme_bw() +\n  theme(\n    plot.title = element_text(size = 12, hjust = 0.5, face = \"bold\")\n  ) +\n  scale_y_reverse()\n```\n\n\n* The `geom_mean_lines()` function adds mean lines for offensive and defensive EPA per play\n* The slope lines are created using `geom_abline()`\n* `scale_y_reverse()` reverses the vertical axis so that up = better defense\n\nEverything else should be comprehensible by now!\n\n### A few more things on plotting\n\nThere are two ways to view plots. 
One is in the RStudio Viewer, which shows up in RStudio when you plot something. If plots in your RStudio viewer look ugly and pixelated, you probably need to install the `Cairo` package and then set that as the default viewer by doing Tools --> Global Options --> General --> Graphics --> Backend: Set to Cairo.\n\nThe other is to save a .png with your preferred dimensions and resolution. For example, `ggsave(\"test.png\", width = 16, height = 9, units = \"cm\")` would save the current plot as \"`test.png`\" with the units specified (you can view all the ggsave options [here](https://ggplot2.tidyverse.org/reference/ggsave.html)). \n\nOne more note: the RStudio Viewer can take a long time to preview ggplots, especially if you're doing things like adding images. If you're getting frustrated with a plot taking a long time to display, you can take advantage of [ggpreview](https://nflplotr.nflverse.com/reference/ggpreview.html) from `nflplotR`. To do this, first save the plot to an object and then run `ggpreview` on it (if this doesn't make sense, see the examples [here](https://nflplotr.nflverse.com/reference/ggpreview.html)).\n\n## Real life example: let's make a win total model\n\nI'm going to go through the process of cleaning and joining multiple data sets, step-by-step, to give a sense of how I would approach something like this.\n\n### Get team wins each season\n\nWe're going to cheat a little and take advantage of Lee Sharpe's famous `games` file. Most of this stuff has been added into `nflfastR`, but it's easier working with this file where each game is one row. If you're curious, the double colon in `nflreadr::load_schedules()` is a way to call a function from a specific package by name. 
This works even if you haven't loaded the package with `library()`.\n\n``` {r}\ngames <- nflreadr::load_schedules()\nstr(games)\n```\n\nTo start, we want to create a dataframe where each row is a team-season observation, listing how many games they won. There are multiple ways to do this, but I'm going to just take the home and away results and bind together. As an example, here's what the `home` results look like:\n\n``` {r}\nhome <- games |>\n  filter(game_type == 'REG') |>\n  select(season, week, home_team, result) |>\n  rename(team = home_team)\nhome |> head(5)\n```\nNote that we used `rename` to change `home_team` to `team`.\n\n``` {r}\naway <- games |>\n  filter(game_type == 'REG') |>\n  select(season, week, away_team, result) |>\n  rename(team = away_team) |>\n  mutate(result = -result)\naway |> head(5)\n```\nFor away teams, we need to flip the result since result is given from the perspective of the home team. Now let's make a column called `win` based on the result.\n\n``` {r}\nresults <- bind_rows(home, away) |>\n  arrange(week) |>\n  mutate(\n    win = case_when(\n      result > 0 ~ 1,\n      result < 0 ~ 0,\n      result == 0 ~ 0.5\n    )\n  )\n\nresults |> filter(season == 2019 & team == 'SEA')\n```\n\nDoing the `results |> filter(season == 2019 & team == 'SEA')` part at the end isn't actually for saving the data in a new form, but just making sure the previous step did what I wanted. 
This is a good habit to get into: frequently inspect your data and make sure it looks like you think it should.\n\nNow that we have the dataframe we wanted, we can get team wins by season easily:\n\n``` {r}\nteam_wins <- results |>\n  group_by(team, season) |>\n  summarize(\n    wins = sum(win),\n    point_diff = sum(result)) |>\n  ungroup()\n\nteam_wins |>\n  arrange(-wins) |>\n  head(5)\n```\n\nAgain, we're making sure the data looks like it \"should\" by checking the 5 seasons with the most wins, and making sure it looks right.\n\nNow that the team-season win and point differential data is ready, we need to go back to the `nflfastR` data to get EPA/play.\n\n### Get team EPA by season\n\nLet's start by getting data from every season from the `nflfastR` data repository:\n\n``` {r}\npbp <- load_pbp(1999:2019) |>\n  filter(\n    rush == 1 | pass == 1,\n    season_type == \"REG\",\n    !is.na(epa),\n    !is.na(posteam),\n    posteam != \"\"\n  ) |>\n  select(season, posteam, pass, defteam, epa)\n```\n\nI'm being pretty aggressive with dropping rows and columns (`filter` and `select`) because otherwise loading this all into memory can be painful on the computer. But this is all we need for what we're doing. Note that I'm only keeping regular season games here (`season_type == \"REG\"`) since this is how this analysis is usually done.\n\nNow we can get EPA/play on offense and defense. Let's break it out by pass and rush too. I don't remember how to do some of this so let's do it in steps. We know we need to group by team, season, and pass, so there's the beginning:\n\n``` {r}\npbp |>\n  group_by(posteam, season, pass) |> \n  summarize(epa = mean(epa)) |>\n  head(4)\n```\n\nBut this makes two rows per team-season. How to get each team-season on the same row? 
`pivot_wider` is what we need:\n``` {r}\npbp |>\n  group_by(posteam, season, pass) |> \n  summarize(epa = mean(epa)) |>\n  pivot_wider(names_from = pass, values_from = epa) |>\n  head(4)\n```\n\nThis one is hard to wrap my head around so I usually open up the [reference page](https://tidyr.tidyverse.org/reference/pivot_wider.html), read the example, and pray that what I try works. In this case it did. Hooray! This turned our two-lines-per-team dataframe into one, with the 0 column being pass == 0 (run plays) and the 1 column pass == 1. \n\nNow let's rename to something more sensible and save:\n\n``` {r}\noffense <- pbp |>\n  group_by(posteam, season, pass) |> \n  summarize(epa = mean(epa)) |>\n  pivot_wider(names_from = pass, values_from = epa) |>\n  rename(off_pass_epa = `1`, off_rush_epa = `0`)\n```\n\nNote that variable names that are numbers need to be surrounded in tick marks for this to work.\n\nNow we can repeat the same process for defense:\n\n``` {r}\ndefense <- pbp |>\n  group_by(defteam, season, pass) |> \n  summarize(epa = mean(epa)) |>\n  pivot_wider(names_from = pass, values_from = epa) |>\n  rename(def_pass_epa = `1`, def_rush_epa = `0`)\n```\n\nLet's do another sanity check looking at the top 5 pass offenses and defenses:\n``` {r}\n#top 5 offenses\noffense |>\n  arrange(-off_pass_epa) |>\n  head(5)\n\n#top 5 defenses\ndefense |>\n  arrange(def_pass_epa) |>\n  head(5)\n```\n\nThe top pass defenses (2002 TB, 2017 JAX, 2019 NE) and offenses (2007 Pats, 2004 Colts, 2011 Packers) definitely check out! \n\n### Fix team names and join\n\nNow we're ready to bind it all together. 
Actually, let's make sure all the team names are ready too.\n\n``` {r}\nteam_wins |>\n  group_by(team) |>\n  summarize(n=n()) |>\n  arrange(n)\n```\n\nNope, not yet, we need to fix the Raiders, Rams, and Chargers, which are LV, LA, and LAC in `nflfastR`.\n\n``` {r}\nteam_wins <- team_wins |>\n  mutate(\n    team = case_when(\n      team == 'OAK' ~ 'LV',\n      team == 'SD' ~ 'LAC',\n      team == 'STL' ~ 'LA',\n      TRUE ~ team\n    )\n  )\n```\n\nThe `TRUE` statement at the bottom says that if none of the above cases are found, keep team the same. Let's make sure this worked:\n\n``` {r}\nteam_wins |>\n  group_by(team) |>\n  summarize(n=n()) |>\n  arrange(n)\n```\n\nHOU has 3 fewer seasons because it didn't exist from 1999 through 2001, which is fine, and all the other team names have number of seasons that they should. Okay NOW we can join:\n\n``` {r}\ndata <- team_wins |>\n  left_join(offense, by = c('team' = 'posteam', 'season')) |>\n  left_join(defense, by = c('team' = 'defteam', 'season'))\n\ndata |>\n  filter(team == 'SEA' & season >= 2012)\n```\n\nNow we're getting really close to doing what we want! Next we need to create new columns for prior year EPA, and let's do point differential too.\n\n``` {r}\ndata <- data |> \n  arrange(team, season) |>\n  group_by(team) |> \n  mutate(\n    prior_off_rush_epa = lag(off_rush_epa),\n    prior_off_pass_epa = lag(off_pass_epa),\n    prior_def_rush_epa = lag(def_rush_epa),\n    prior_def_pass_epa = lag(def_pass_epa),\n    prior_point_diff = lag(point_diff)\n  ) |> \n  ungroup()\n\ndata |>\n  head(5)\n```\nFinally! 
Now we have the data in place and can start doing things with it.\n\n### Correlations and regressions\n\n``` {r}\ndata |> \n  select(-team, -season) |>\n  cor(use=\"complete.obs\") |>\n  round(2)\n```\n\n```{r echo=FALSE}\npp = cor(data$off_pass_epa, data$prior_off_pass_epa, use=\"complete.obs\") |>\n  round(2)\nrr = cor(data$off_rush_epa, data$prior_off_rush_epa, use=\"complete.obs\") |>\n  round(2)\npd = cor(data$def_pass_epa, data$prior_def_pass_epa, use=\"complete.obs\") |>\n  round(2)\nrd = cor(data$def_rush_epa, data$prior_def_rush_epa, use=\"complete.obs\") |>\n  round(2)\n```\n\nWe've covered `select`, but here we see a new use where a minus sign de-selects variables (we need to de-select team name for correlation to work because it doesn't work for character strings, and correlation with the season number itself is meaningless). We've run the correlation on this dataframe, removing missing values, and then rounding to 2 digits. Not surprisingly, we see that wins in the current season are more strongly related to passing offense EPA than rushing EPA or defense EPA, and prior offense carries more predictive power than prior defense. Pass offense is more stable year to year (```r pp```) than rush offense (```r rr```), pass defense (```r pd```), or rush defense (```r rd```).\n\nI'm actually surprised that the values for passing offense aren't higher relative to the others. Maybe it was because most of our prior results come from the `nflscrapR` era (2009 - 2019)? 
Let's check what this looks like since 2009 relative to earlier seasons:\n\n``` {r}\nmessage(\"2009 through 2019\")\ndata |> \n  filter(season >= 2009) |>\n  select(wins, point_diff, off_pass_epa, off_rush_epa, prior_point_diff, prior_off_pass_epa, prior_off_rush_epa) |>\n  cor(use=\"complete.obs\") |>\n  round(2)\n```\n\n``` {r}\nmessage(\"1999 through 2008\")\ndata |> \n  filter(season < 2009) |>\n  select(wins, point_diff, off_pass_epa, off_rush_epa, prior_point_diff, prior_off_pass_epa, prior_off_rush_epa) |>\n  cor(use=\"complete.obs\") |>\n  round(2)\n```\n\nYep, that seems to be the case. So in the more recent period, passing offense has become slightly more stable but more predictive of following-year success, while at the same time rushing offense has become substantially less stable and less predictive of future team success.\n\nNow let's do a basic regression of wins on prior offense and defense EPA/play. Maybe we should only look at this more recent period to fit our model since it's more relevant for 2020. In the real world, we would be more rigorous about making decisions like this, but let's proceed anyway.\n\n``` {r}\ndata <- data |> filter(season >= 2009)\n\nfit <- lm(wins ~ prior_off_pass_epa  + prior_off_rush_epa + prior_def_pass_epa + prior_def_rush_epa, data = data)\n\nsummary(fit)\n```\n\nI'm actually pretty surprised passing offense isn't higher here. How does this compare to simply using point differential?\n\n``` {r}\nfit2 <- lm(wins ~ prior_point_diff, data = data)\n\nsummary(fit2)\n```\n\nSo R2 is somewhat higher for just point differential. 
This isn't surprising as we've thrown away special teams plays and haven't attempted to make any adjustments for things like fumble luck that we know can improve EPA's predictive power.\n\n### Predictions\n\nNow let's get the predictions from the EPA model:\n\n``` {r}\npreds <- predict(fit, data |> filter(season == 2020)) |>\n  #was just a vector, need a tibble to bind\n  as_tibble() |>\n  #make the column name make sense\n  rename(prediction = value) |>\n  round(1) |>\n  #get names\n  bind_cols(\n    data |> filter(season == 2020) |> select(team)\n  )\n\npreds |>\n  arrange(-prediction) |>\n  head(5)\n```\n\nThis mostly checks out. \n\nWhat if we just used simple point differential to predict?\n\n``` {r}\npreds2 <- predict(fit2, data |> filter(season == 2020)) |>\n  #was just a vector, need a tibble to bind\n  as_tibble() |>\n  #make the column name make sense\n  rename(prediction = value) |>\n  round(1) |>\n  #get names\n  bind_cols(\n    data |> filter(season == 2020) |> select(team)\n  )\n\npreds2 |>\n  arrange(-prediction) |>\n  head(5)\n```\n\nNot surprisingly, this looks pretty similar. These are very basic models that don't incorporate schedule, roster changes, etc. For example, a better model would take into account Tom Brady no longer playing for the Patriots. But hopefully this has been useful!\n\n## Next Steps\n\nYou now should know enough to be able to tackle a great many questions using `nflfastR` data. A good way to build up skills is to take interesting things you see and try to replicate them (for making figures, this will also involve a heavy dose of googling stuff).\n\nLooking at others' code is also a good way to learn. One option is to look through the `nflfastR` code base, much of which you should now be able to follow. 
For example, [here is the function that cleans up the data and prepares it for later stages](https://github.com/mrcaseb/nflfastR/blob/master/R/helper_add_nflscrapr_mutations.R): there's a heavy dose of `mutate`, `group_by`, `arrange`, `lag`, `if_else`, and `case_when`. \n\n### Resources: The gold standards\n\nThis is an R package so this section is pretty R heavy.\n\n* [Introduction to R (**recommended**)](https://r4ds.had.co.nz/explore-intro.html)\n* [Open Source Football](https://www.opensourcefootball.com/): Mix of R and Python\n* [The Mockup Blog (Thomas Mock)](https://themockup.blog/): Invaluable resource for making cool stuff in R\n\n### Code examples: R\n\n* [Lee Sharpe: basic intro to R and RStudio](https://github.com/leesharpe/nfldata/blob/master/RSTUDIO-INTRO.md)\n* [Lee Sharpe: lots of useful NFL / nflscrapR code](https://github.com/leesharpe/nfldata)\n* [Lee Sharpe: how to update current season games](https://github.com/leesharpe/nfldata/blob/master/UPDATING-NFLSCRAPR.md)\n* [Josh Hermsmeyer: Getting Started with R for NFL Analysis](https://t.co/gxDDhOYhcI)\n* [Slavin: visualizing positional tiers in SFB9](https://slavin22.github.io/SFB9-Positional-Tiers/Guide.nb)\n* [Ron Yurko: assorted examples](https://github.com/ryurko/nflscrapR-data/tree/master/R)\n* [CowboysStats: defensive playmaking EPA](https://github.com/dhouston890/cowboys-stats/blob/master/playmaking_epa_pbp.R)\n* [Michael Lopez: function to sample plays](https://github.com/statsbylopez/BlogPosts/blob/master/scrapr-data.R)\n* [Michael Lopez: R for NFL analysis (presentation to club staffers)](https://statsbylopez.netlify.com/post/r-for-nfl-analysis/)\n* [Mitchell Wesson: QB hits investigation](https://gist.github.com/wessonmo/45781bd25a74e8097e0c8bc8fbacf796)\n* [Mitchell Wesson: Investigation of the nflscrapR EP model](https://gist.github.com/wessonmo/ef44ea9873d70f816454cb88b86dcce6)\n* [WHoffman: graphs for receivers (aDoT, success rate, and 
more)](https://github.com/whoffman21279/Steelers/blob/master/receiving_stats)\n* [ChiBearsStats: investigation of 3rd downs vs offensive efficiency](https://gist.github.com/ChiBearsStats/dac3266037797032a23f38fd9d64d6a8#file-adjustedthirddowns-txt)\n* [ChiBearsStats: the insignificance of field goal kicking](https://gist.github.com/ChiBearsStats/78e33baeed3cd6d3cac0040b47d4ec69)\n\n### More data sources\n\n* [Lee Sharpe: Draft Picks, Draft Values, Games, Logos, Rosters, Standings](https://github.com/leesharpe/nfldata/blob/master/DATASETS.md)\n* [greerre: how to get .csv file of weather & stadium data from PFR in python](https://github.com/greerre/pfr_metadata_pull)\n* [Parker Fleming: Introduction to College Football Data with R and cfbscrapR](https://gist.github.com/spfleming/2527a6ca2b940af2a8aa1fee9320171d)\n\n### Other code examples: Python\n\n* [Deryck97: nflfastR Python Guide](https://gist.github.com/Deryck97/dff8d33e9f841568201a2a0d5519ac5e)\n* [Nick Wan: nflfastR Python Colab Guide](https://colab.research.google.com/github/nickwan/colab_nflfastR/blob/master/nflfastR_starter.ipynb)\n* [Cory Jez: animated plot](https://github.com/jezlax/sports_analytics/blob/master/animated_nfl_scatter.py)\n* [903124S: Sampling EP](https://gist.github.com/903124/6693fdf6b991437a6d6ef9c5d935c83b)\n* [903124S: estimating EPA using nfldb](https://gist.github.com/903124/d304f76688b0699497a35b61b6d1e267)\n* [903124S: estimate EPA for college football](https://gist.github.com/903124/3c6f0dc0a100d78b8622573ef4c504f5)\n* Blake Atkinson: explosiveness [blog post](https://medium.com/@BlakeAtkinson/the-2018-kansas-city-chiefs-and-an-explosiveness-metric-in-football-c3b3fd447d73) and [python code](https://github.com/btatkinson/yard_value/blob/master/yard_value.ipynb)\n* Blake Atkinson: player type visualizations [blog post](https://medium.com/@BlakeAtkinson/visualizing-different-nfl-player-styles-88ef31420539) and [python 
code](https://github.com/btatkinson/player_vectors/blob/master/player_vectors.ipynb)\n"
  },
  {
    "path": "vignettes/field_descriptions.Rmd",
    "content": "---\ntitle: \"Field Descriptions\"\n---\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  echo = FALSE,\n  comment = \"#>\"\n)\n\nwith_dt <- requireNamespace(\"DT\")\n\n```\n\n```{r eval = with_dt}\nDT::datatable(\n  nflfastR::field_descriptions,\n  options = list(scrollX = TRUE, pageLength = 25),\n  filter = \"top\",\n  rownames = FALSE,\n  style = \"bootstrap4\"\n)\n```\n\n```{r eval = !with_dt}\nknitr::kable(nflfastR::field_descriptions)\n```\n"
  },
  {
    "path": "vignettes/nflfastR.Rmd",
    "content": "---\ntitle: \"Get started with nflfastR\"\nauthor: \"Ben Baldwin & Sebastian Carl\"\n---\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#>\"\n)\nfuture::plan(\"multisession\")\noptions(dplyr.summarise.inform = FALSE)\noptions(nflreadr.verbose = FALSE)\n```\n\nIf you are new to R or are having trouble understanding the code in the below sections we highly recommend the **nflfastR beginner's guide** in `vignette(\"beginners_guide\")`.\n\n# The Main Functions\n\nnflfastR comes with a set of functions to access NFL play-by-play data. This section provides a brief introduction to the essential functions.\n\nnflfastR processes and cleans up play-by-play data and adds variables through [it's models](https://www.opensourcefootball.com/posts/2020-09-28-nflfastr-ep-wp-and-cp-models/). Since some of these tasks are performed by separate functions, the easiest way to compute the complete nflfastR dataset is `build_nflfastR_pbp()`. The main input for that function is a set of game ids which can be accessed with `load_schedules()`. The following code demonstrates how to build the nflfastR dataset for the Super Bowls of the 2017 - 2019 seasons.\n\n```{r}\nids <- nflreadr::load_schedules(2017:2019) |>\n  dplyr::filter(game_type == \"SB\") |>\n  dplyr::pull(game_id)\npbp <- nflfastR::build_nflfastR_pbp(ids)\n```\n\nIn most cases, however, it is not necessary to use this function for individual games, because nflverse provides both a [data release](https://github.com/nflverse/nflverse-data/releases/tag/pbp) and two main play-by-play functions: `load_pbp()` and `update_pbp_db()`. We cover `load_pbp()` below, and please see [Example 8: Using the built-in database function] for how to work with the database function `update_pbp_db()`.\n\nThe easiest way to access the data from the release is the function `load_pbp()`. It can load multiple seasons directly into memory and supports multiple data formats. 
Loading all play-by-play data of the 2022-2024 seasons is as easy as \n\n```{r}\npbp <- nflfastR::load_pbp(2022:2024)\n```\n\nJoining roster data to the play-by-play data set is possible as well. The data can be accessed with the function `load_rosters()` and its application is demonstrated in [Example 8: Working with roster and position data].\n\n# Application Examples\n\nAll examples listed below assume that the following libraries are installed (and loaded).\n\n``` {r load, warning = FALSE, message = FALSE}\nlibrary(nflfastR)\nlibrary(nflplotR)\nlibrary(dplyr)\nlibrary(ggplot2)\n```\n\n## Example 1: Completion Percentage Over Expected (CPOE)\n\nLet's look at CPOE leaders from the 2009 regular season.\n\nAs discussed above, `nflfastR` has a data release for all available seasons, so there's no need to actually build them. Let's use that here with the convenience function `load_pbp()` which fetches data from the release (for non-R users, .csv and .parquet are also available in the [data release](https://github.com/nflverse/nflverse-data/releases/tag/pbp)).\n\n``` {r ex3-cpoe, warning = FALSE, message = FALSE}\ngames_2009 <- nflfastR::load_pbp(2009) |> dplyr::filter(season_type == \"REG\")\ngames_2009 |>\n  dplyr::filter_out(is.na(cpoe)) |>\n  dplyr::summarize(\n    passer = nflreadr::stat_mode(passer_player_name),\n    cpoe = mean(cpoe),\n    Atts = n(),\n    .by = passer_player_id\n  ) |>\n  dplyr::filter(Atts > 200) |>\n  dplyr::slice_max(cpoe, n = 5) |>\n  knitr::kable(digits = 1)\n```\n\n## Example 2: Using Drive Information\n\nWhen working with `nflfastR`, drive results are automatically included. We use `fixed_drive` and `fixed_drive_result` since the NFL-provided information is a bit wonky. 
Let's look at how much more likely teams were to score starting from 1st & 10 at their own 20 yard line in 2015 (the last year before touchbacks on kickoffs changed to the 25) than in 2003.\n\n``` {r ex4, warning = FALSE, message = FALSE}\npbp <- nflfastR::load_pbp(c(2003, 2015))\n\nout <- pbp |>\n  dplyr::filter(\n    season_type == \"REG\" & down == 1 & ydstogo == 10 & yardline_100 == 80\n  ) |>\n  dplyr::mutate(\n    drive_score = dplyr::case_when(\n      fixed_drive_result %in% c(\"Touchdown\", \"Field goal\") ~ 1L,\n      TRUE ~ 0L\n    )\n  ) |>\n  dplyr::summarize(drive_score = mean(drive_score), .by = season)\n\nout |>\n  knitr::kable(digits = 3)\n```\n\nSo `r scales::percent(out$drive_score[1], accuracy = 0.1)` of 1st & 10 plays from teams' own 20 would see the drive end up in a score in 2003, compared to `r scales::percent(out$drive_score[2], accuracy = 0.1)` in 2015. This has implications for Expected Points models (see [this article](https://www.opensourcefootball.com/posts/2020-09-28-nflfastr-ep-wp-and-cp-models/)).\n\n## Example 3: Plot offensive and defensive EPA per play for a given season\n\nLet's build the **[NFL team tiers](https://rbsdm.com/stats/stats/)** using offensive and defensive expected points added per play for the 2005 regular season. For creating data viz that include NFL team logos (or wordmarks, or headshots), we recommend the nflverse R package [nflplotR](https://nflplotr.nflverse.com).\n\nWhen using `load_pbp()`, the helper function `clean_pbp()` has already been run, which creates \"rush\" and \"pass\" columns that (a) properly count sacks and scrambles as pass plays and (b) properly include plays with penalties. 
Using this, we can keep only rush or pass plays.\n\n```{r ex5, warning = FALSE, message = FALSE, results = 'hide', fig.keep = 'all', dpi = 600}\npbp <- nflfastR::load_pbp(2005) |>\n  dplyr::filter(season_type == \"REG\") |>\n  dplyr::filter(!is.na(posteam) & (rush == 1 | pass == 1))\noffense <- pbp |>\n  dplyr::group_by(team = posteam) |>\n  dplyr::summarise(off_epa = mean(epa, na.rm = TRUE))\ndefense <- pbp |>\n  dplyr::group_by(team = defteam) |>\n  dplyr::summarise(def_epa = mean(epa, na.rm = TRUE))\noffense |>\n  dplyr::inner_join(defense, by = \"team\") |>\n  ggplot(aes(x = off_epa, y = def_epa)) +\n  geom_abline(\n    slope = -1.5,\n    intercept = c(.4, .3, .2, .1, 0, -.1, -.2, -.3),\n    alpha = .2\n  ) +\n  nflplotR::geom_mean_lines(aes(x0 = off_epa, y0 = def_epa)) +\n  nflplotR::geom_nfl_logos(aes(team_abbr = team), width = 0.07, alpha = 0.7) +\n  labs(\n    x = \"Offense EPA/play\",\n    y = \"Defense EPA/play\",\n    caption = \"Data: @nflfastR\",\n    title = \"2005 NFL Offensive and Defensive EPA per Play\"\n  ) +\n  theme_bw() +\n  theme(\n    plot.title = element_text(size = 12, hjust = 0.5, face = \"bold\")\n  ) +\n  scale_y_reverse()\n```\n\n## Example 4: Expected Points calculator\n\nWe have provided a calculator for working with the Expected Points model. 
Here is an example of how to use it, looking for how the Expected Points on a drive beginning following a touchback has changed over time.\n\nWhile I have put in `'SEA'` for `home_team` and `posteam`, this only matters for figuring out whether the team with the ball is the home team (there's no actual effect for given team; it would be the same no matter what team is supplied).\n\n``` {r ex6a}\ndata <- tibble::tibble(\n  \"season\" = 1999:2019,\n  \"home_team\" = \"SEA\",\n  \"posteam\" = \"SEA\",\n  \"roof\" = \"outdoors\",\n  \"half_seconds_remaining\" = 1800,\n  \"yardline_100\" = c(rep(80, 17), rep(75, 4)),\n  \"down\" = 1,\n  \"ydstogo\" = 10,\n  \"posteam_timeouts_remaining\" = 3,\n  \"defteam_timeouts_remaining\" = 3\n)\nnflfastR::calculate_expected_points(data) |>\n  dplyr::select(season, yardline_100, td_prob, ep) |>\n  knitr::kable(digits = 2)\n```\n\nNot surprisingly, offenses have become much more successful over time, with the kickoff touchback moving from the 20 to the 25 in 2016 providing an additional boost. 
Note that the `td_prob` in this example is the probability that the next score within the same half will be a touchdown scored by the team with the ball, **not** the probability that the current drive will end in a touchdown (this is why the numbers are different from Example 2 above).\n\nWe could compare the most recent four years to the expectation for playing in a dome by inputting all the same things and changing the `roof` input:\n\n``` {r ex6b}\ndata <- tibble::tibble(\n  \"season\" = 2016:2019,\n  \"week\" = 5,\n  \"home_team\" = \"SEA\",\n  \"posteam\" = \"SEA\",\n  \"roof\" = \"dome\",\n  \"half_seconds_remaining\" = 1800,\n  \"yardline_100\" = c(rep(75, 4)),\n  \"down\" = 1,\n  \"ydstogo\" = 10,\n  \"posteam_timeouts_remaining\" = 3,\n  \"defteam_timeouts_remaining\" = 3\n)\nnflfastR::calculate_expected_points(data) |>\n  dplyr::select(season, yardline_100, td_prob, ep) |>\n  knitr::kable(digits = 2)\n```\n\nSo for 2018 and 2019, 1st & 10 from a home team's own 25 yard line had higher EP in domes than outdoors, which is to be expected.\n\n## Example 5: Win probability calculator\n\nWe have also provided a calculator for working with the win probability models. 
Here is an example of how to use it, looking for how the win probability to begin the game depends on the pre-game spread.\n\nWhile I have put in `'SEA'` for `home_team` and `posteam`, this only matters for figuring out whether the team with the ball is the home team (there's no actual effect for given team; it would be the same no matter what team is supplied).\n\n``` {r ex7}\ndata <- tibble::tibble(\n  \"receive_2h_ko\" = 0,\n  \"home_team\" = \"SEA\",\n  \"posteam\" = \"SEA\",\n  \"score_differential\" = 0,\n  \"half_seconds_remaining\" = 1800,\n  \"game_seconds_remaining\" = 3600,\n  \"spread_line\" = c(1, 3, 4, 7, 14),\n  \"down\" = 1,\n  \"ydstogo\" = 10,\n  \"yardline_100\" = 75,\n  \"posteam_timeouts_remaining\" = 3,\n  \"defteam_timeouts_remaining\" = 3\n)\nnflfastR::calculate_win_probability(data) |>\n  dplyr::select(spread_line, wp, vegas_wp) |>\n  knitr::kable(digits = 2)\n```\n\nNot surprisingly, `vegas_wp` increases with the amount a team was coming into the game favored by.\n\n## Example 6: Using the built-in database function\n\nIf you're comfortable using `dplyr` functions to manipulate and tidy data, you're ready to use a database. Why should you use a database?\n\n* The provided function in `nflfastR` makes it extremely easy to build a database and keep it updated\n* Play-by-play data over 25+ seasons takes up a lot of memory: working with a database allows you to only bring into memory what you actually need\n* R makes it *extremely* easy to work with databases.\n\n### Start: install and load packages\n\nTo start, we need to install the two packages required for this that aren't installed automatically when `nflfastR` installs: `DBI` and `duckdb` (advanced users can use other types of databases, but this example will use duckdb). 
The `if` statements make sure the packages won't be updated if they are already installed:\n\n``` {r eval = FALSE}\nif (!require(\"DBI\")) install.packages(\"DBI\")\nif (!require(\"duckdb\")) install.packages(\"duckdb\")\n```\n\n### Overview\n\nThere's exactly one function in `nflfastR` that works with databases: `update_pbp_db()`. Some notes:\n\n* `update_pbp_db()` follows the DBI argument naming convention and order. It requires an open connection created with `DBI::dbConnect()`.\n* You can specify a different table name with `name`.\n* The `seasons` argument controls how the table in the connected database is handled. This is a hybrid argument, and its behavior is described in detail [in the function documentation](https://nflfastr.com/reference/update_pbp_db.html#the-seasons-argument).\n* If larger parts of the DB need to be updated, then you should definitely consider doing so in chunks. The `\"nflfastR.db_chunk_size\"` option is available for this purpose. Further details can also be found in the function documentation.\n\n### Connect to a database\n\nWorking with databases always requires an open connection. In this example, we will focus solely on duckdb databases, as duckdb has essentially become the state of the art for this type of data. duckdb can easily create a database in your memory. Of course, this doesn't make sense for large amounts of data, because they shouldn't be stored in memory, but the process is practically identical with a locally stored database.\n\nSo let's connect to an in-memory duckdb database:\n\n``` {r}\nconnection <- DBI::dbConnect(duckdb::duckdb())\nconnection\n```\n\n### Write data to the database\n\nLet's say I just want to dump play-by-play data of the 2021 - 2024 seasons in my database. Here we go!\n\n``` {r create-db}\nnflfastR::update_pbp_db(connection, seasons = 2021:2024)\n```\n\nThis created a table named \"nflverse_pbp\" in the connected database and appended 2021 - 2024 play-by-play data to it.\n\nWait, that's it? That's it! 
What if it's partway through the season and you want to make sure all the new games are added to the database to allow for data corrections from the NFL to propagate into your database? What do you run? `update_pbp_db()`!\n\n``` {r update-db}\nnflfastR::update_pbp_db(connection)\n```\n\n### Work with the database\n\nNow we're ready to do stuff. If you aren't familiar with databases, they're organized around tables. Here's how to see which tables are present in our database: \n\n``` {r}\nDBI::dbListTables(connection)\n```\n\nSince we went with the defaults, there's a table called `nflverse_pbp`. Another useful function is to see the fields (i.e., columns) in a table:\n\n``` {r}\nDBI::dbListFields(connection, \"nflverse_pbp\") |>\n  utils::head(10)\n```\n\nThis is the same list as the list of columns in `nflfastR` play-by-play. Notice we had to supply the name of the table above (`\"nflverse_pbp\"`). \n\nWith all that out of the way, there's only a couple more things to learn. The main driver here is `tbl`, which helps get output with a specific table in a database:\n\n``` {r}\npbp_db <- dplyr::tbl(connection, \"nflverse_pbp\")\n```\n\nAnd now, everything will magically just \"work\": you can forget you're even working with a database!\n\n``` {r}\npbp_db |>\n  dplyr::group_by(season) |>\n  dplyr::summarize(n = dplyr::n())\npbp_db |>\n  dplyr::filter(\n    rush == 1 | pass == 1,\n    down <= 2,\n    !is.na(epa),\n    !is.na(posteam)\n  ) |>\n  dplyr::group_by(pass) |>\n  dplyr::summarize(mean_epa = mean(epa, na.rm = TRUE))\n```\n\nSo far, everything has stayed in the database. 
If you want to bring a query into memory, just use `collect()` at the end:\n\n``` {r}\nruss <- pbp_db |>\n  dplyr::filter(name == \"R.Wilson\" & posteam == \"SEA\") |>\n  dplyr::select(desc, epa) |>\n  dplyr::collect()\nruss\n```\n\nSo we've searched through `r pbp_db |> dplyr::count() |> dplyr::collect() |> dplyr::pull(n) |> prettyNum(big.mark = \",\")` rows of data across 300+ columns and only brought about `r round(nrow(russ), -1)` rows and two columns into memory. Pretty neat! This is how we supply the data to the shiny apps on rbsdm.com without running out of memory on the server. Now there's only one more thing to remember. When you're finished doing what you need with the database:\n\n``` {r}\nDBI::dbDisconnect(connection)\n```\n\nFor more details on using a database with `nflfastR`, see [Thomas Mock's life-changing post here](https://themockup.blog/posts/2019-04-28-nflfastr-dbplyr-rsqlite/). More detailed information on dbplyr (the dplyr database back-end) is given in the databases chapter of [Hadley Wickham's R for Data Science (2e)](https://r4ds.hadley.nz/databases.html).\n\n## Example 7: working with the expected yards after catch model\n\nThe variables in `xyac` are as follows:\n\n* `xyac_epa`: The expected value of EPA gained after the catch, **starting from where the catch was made**.\n* `xyac_success`: The probability the play earns positive EPA (relative to where play started) based on where ball was caught.\n* `xyac_fd`: Probability play earns a first down based on where the ball was caught.\n* `xyac_mean_yardage` and `xyac_median_yardage`: Average and median expected yards after the catch based on where the ball was caught.\n\nSome other notes:\n\n* `epa` = `air_epa` + `yac_epa`, where `air_epa` is the EPA associated with a catch at the target location. 
If a receiver loses a fumble, it is removed from his `yac_epa`\n* Expected value of EPA at catch point = `air_epa` + `xyac_epa`\n* So if we want to get YAC EPA over expected, we need to compare `yac_epa` to `xyac_epa`, as in the example below\n* To get first downs over expected, we could compare `first_down` to `xyac_fd`\n* These fields are populated for all pass attempts, whether caught or not, but restrict to completed passes when measuring, for example, YAC EPA over expected\n* The expected YAC EPA model doesn't take receiver fumbles into account, so actual minus expected YAC is slightly negative due to fumbles happening\n\nLet's create measures for EPA and first downs over expected in 2015:\n\n``` {r ex9-xyac, warning = FALSE, message = FALSE}\nnflfastR::load_pbp(2015) |>\n  dplyr::group_by(receiver, receiver_id, posteam) |>\n  dplyr::mutate(tgt = sum(complete_pass + incomplete_pass)) |>\n  dplyr::filter(tgt >= 50) |>\n  dplyr::filter(\n    complete_pass == 1,\n    air_yards < yardline_100,\n    !is.na(xyac_epa)\n  ) |>\n  dplyr::summarize(\n    epa_oe = mean(yac_epa - xyac_epa),\n    actual_fd = mean(first_down),\n    expected_fd = mean(xyac_fd),\n    fd_oe = mean(first_down - xyac_fd),\n    rec = dplyr::n()\n  ) |>\n  dplyr::ungroup() |>\n  dplyr::select(\n    receiver,\n    posteam,\n    actual_fd,\n    expected_fd,\n    fd_oe,\n    epa_oe,\n    rec\n  ) |>\n  dplyr::slice_max(epa_oe, n = 10) |>\n  knitr::kable(digits = 3)\n```\n\nThe presence of so many running backs on this list suggests that even though it takes into account target depth and pass direction, the model doesn't do a great job capturing space. Alternatively, running backs might be better at generating yards after the catch since running with the football is their primary role.\n\n## Example 8: Working with roster and position data\n\nAt long last, there's a way to merge the new play-by-play data with roster information. 
Use the function to get the rosters:\n\n``` {r roster}\nroster <- nflfastR::load_rosters(2019)\n```\n\nNow let's load play-by-play data from 2019:\n``` {r roster_pbp_load}\ngames_2019 <- nflfastR::load_pbp(2019)\n```\n\nHere is what the player IDs look like because `nflfastR` now automatically decodes IDs to look like the old format with GSIS IDs:\n\n``` {r roster_pbp}\ngames_2019 |>\n  dplyr::filter(rush == 1 | pass == 1, posteam == \"SEA\") |>\n  dplyr::select(name, id)\n```\n\nNow we're ready to join to the roster data using these IDs:\n``` {r decode_join}\njoined <- games_2019 |>\n  dplyr::filter(!is.na(receiver_id)) |>\n  dplyr::select(posteam, season, desc, receiver, receiver_id, epa) |>\n  dplyr::left_join(roster, by = c(\"receiver_id\" = \"gsis_id\"))\n```\n\n``` {r decode_table}\n# the real work is done, this just makes a table and has it look nice\njoined |>\n  dplyr::filter(position %in% c(\"WR\", \"TE\", \"RB\")) |>\n  dplyr::group_by(receiver_id, receiver, position) |>\n  dplyr::summarize(tot_epa = sum(epa), n = n()) |>\n  dplyr::arrange(-tot_epa) |>\n  dplyr::ungroup() |>\n  dplyr::group_by(position) |>\n  dplyr::mutate(position_rank = 1:n()) |>\n  dplyr::filter(position_rank <= 5) |>\n  dplyr::rename(\n    Pos_Rank = position_rank,\n    Player = receiver,\n    Pos = position,\n    Tgt = n,\n    EPA = tot_epa\n  ) |>\n  dplyr::select(Player, Pos, Pos_Rank, Tgt, EPA) |>\n  knitr::kable(digits = 0)\n```\n\nNot surprisingly, all 5 of the top 5 WRs in terms of EPA added come in ahead of the top RB. Note that the number of targets won't match official stats because we're including plays with penalties.\n\n## Example 9: Replicating official stats\n\nThe columns like `name`, `passer`, `fantasy` etc are `nflfastR`-created columns that mimic \"real\" football: i.e., excluding plays with spikes, counting scrambles and sacks as pass plays, etc. 
But if you're trying to replicate official statistics -- perhaps for fantasy purposes -- use the `*_player_name` and `*_player_id` columns.\n\n[Let's try to replicate this page of passing leaders](https://www.nfl.com/stats/player-stats/).\n\n``` {r stats1}\nnflfastR::load_pbp(2020) |>\n  dplyr::filter(\n    season_type == \"REG\",\n    complete_pass == 1 | incomplete_pass == 1 | interception == 1,\n    !is.na(down)\n  ) |>\n  dplyr::group_by(passer_player_name, posteam) |>\n  dplyr::summarize(\n    yards = sum(passing_yards, na.rm = T),\n    tds = sum(touchdown == 1 & td_team == posteam),\n    ints = sum(interception),\n    att = dplyr::n()\n  ) |>\n  dplyr::arrange(-yards) |>\n  utils::head(10) |>\n  knitr::kable(digits = 0)\n```\n\nThese match the official stats on NFL.com (note the filter for `season_type == \"REG\"` since official stats only count regular season games). Note that we're using `passing_yards` here because `yards_gained` is not equal to passing yards on plays with laterals.\n\nWhile the above code works in this case, there are several special cases where it is nearly impossible to get official player stats from nflfastR play-by-play data. The reason for this is that the idea of nflfastR play-by-play data is a \"tidy\" data structure. In other words, the aim is to have one row per play in the data. This can lead to problems if, for example, there are several changes of possession per play (i.e. several fumbles) or if the ball is lateraled in a play. These are just two examples of “abnormal” plays that are not fully captured in a tidy data structure.\nWe have solved this problem with the function `calculate_stats()`. This function uses playstats of the raw play-by-play data before it is parsed into a tidy structure by nflfastR. 
\n\nThis function has the following features:\n\n- It determines stats in offense, defense, and special teams,\n- either on player level or on team level,\n- and can summarize them on season level (separately for regular season and post season) or on week level.\n\nFor more information see the function documentation of `calculate_stats()`. Again, **don't try to get an exact match with official stats based on nflfastR play-by-play data**. It usually works, but it can fail in special cases like the ones described above, which a one-row-per-play structure cannot resolve.\n\nNow let's replicate the above table using `calculate_stats()`:\n\n``` {r stats2}\ns <- nflfastR::calculate_stats(\n  seasons = 2020,\n  summary_level = \"season\",\n  stat_type = \"player\",\n  season_type = \"REG\"\n)\ns |>\n  dplyr::slice_max(passing_yards, n = 10) |>\n  dplyr::select(\n    player_name,\n    recent_team,\n    completions,\n    attempts,\n    passing_yards,\n    passing_tds,\n    passing_interceptions\n  ) |>\n  knitr::kable(digits = 0)\n```\n\nThe same applies to stats data as to pbp data. Its computation is costly, but can be automated. There is therefore rarely a reason to call `calculate_stats()` directly. Instead, nflverse offers the functions `nflfastR::load_player_stats()` and `nflfastR::load_team_stats()` to load precomputed data from data releases.\n\n# Frequent issues\n\n## The `drive` column looks wacky\n\nUse `fixed_drive` and `fixed_drive_result` instead. See [Example 2: Using Drive Information].\n\n## Why are there so many win probability columns?\n\n`vegas_wp` and `vegas_home_wp` incorporate the pregame spread and are much better models.\n\n## Need more help?\n\nPlease ask [in the nflverse Discord server](https://discord.com/invite/5Er2FBnnQa).\n"
  },
  {
    "path": "vignettes/stats_variables.Rmd",
    "content": "---\ntitle: \"NFL Stats Variables\"\n---\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  echo = FALSE,\n  comment = \"#>\"\n)\n\nwith_dt <- requireNamespace(\"DT\")\n\n```\n\nBelow you will find a table that lists and explains all the variables available in `calculate_stats()`. Compared to the old `calculate_player_stats*()` functions that have been deprecated, practically all variables (and their names) have been preserved. However, there are a few differences. These are\n\n- `recent_team`: renamed to `team` (recent team in weekly data never made sense)\n- `interceptions`: renamed to `passing_interceptions` (all passing stats have the passing prefix)\n- `sacks`: renamed to `sacks_suffered` (to make clear it's not on defensive side)\n- `sack_yards`: renamed to `sack_yards_lost` (to make clear it's not on defensive side)\n- `dakota`: not implemented at the moment\n- `def_tackles`: there is `def_tackles_solo` and `def_tackles_with_assist`\n- `def_fumble_recovery_own`: renamed to `fumble_recovery_own` (it is not exclusive to defense)\n- `def_fumble_recovery_yards_own`: renamed to `fumble_recovery_yards_own` (it is not exclusive to defense)\n- `def_fumble_recovery_opp`: renamed to `fumble_recovery_opp` (it is not exclusive to defense)\n- `def_fumble_recovery_yards_opp`: renamed to `fumble_recovery_yards_opp` (it is not exclusive to defense)\n- `def_safety`: renamed to `def_safeties` (we use plural everywhere)\n- `def_penalty`: renamed to `penalties` (it is not exclusive to defense)\n- `def_penalty_yards`: renamed to `penalty_yards` (it is not exclusive to defense)\n\n```{r eval = with_dt}\nDT::datatable(\n  nflfastR::nfl_stats_variables,\n  options = list(scrollX = TRUE, pageLength = 25),\n  filter = \"top\",\n  rownames = FALSE,\n  style = \"bootstrap4\"\n)\n```\n\n```{r eval = !with_dt}\nknitr::kable(nflfastR::nfl_stats_variables)\n```\n"
  }
]