[
  {
    "path": ".Rbuildignore",
    "content": "^ggdensity\\.Rproj$\n^\\.Rproj\\.user$\n^LICENSE\\.md$\n^README\\.Rmd$\n^README\\.md$\n^README_cache$\n^man/figures$\n^_pkgdown\\.yml$\n^docs$\n^pkgdown$\n^\\.github$\n^CRAN-SUBMISSION$\n^doc$\n^Meta$\n^revdep$\n^cran-comments\\.md$\n"
  },
  {
    "path": ".github/.gitignore",
    "content": "*.html\n"
  },
  {
    "path": ".github/workflows/R-CMD-check.yaml",
    "content": "# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples\n# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help\non:\n  push:\n    branches: [main, master]\n  pull_request:\n\nname: R-CMD-check.yaml\n\npermissions: read-all\n\njobs:\n  R-CMD-check:\n    runs-on: ${{ matrix.config.os }}\n\n    name: ${{ matrix.config.os }} (${{ matrix.config.r }})\n\n    strategy:\n      fail-fast: false\n      matrix:\n        config:\n          - {os: macos-latest,   r: 'release'}\n          - {os: windows-latest, r: 'release'}\n          - {os: ubuntu-latest,   r: 'devel', http-user-agent: 'release'}\n          - {os: ubuntu-latest,   r: 'release'}\n          - {os: ubuntu-latest,   r: 'oldrel-1'}\n\n    env:\n      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}\n      R_KEEP_PKG_SOURCE: yes\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - uses: r-lib/actions/setup-pandoc@v2\n\n      - uses: r-lib/actions/setup-r@v2\n        with:\n          r-version: ${{ matrix.config.r }}\n          http-user-agent: ${{ matrix.config.http-user-agent }}\n          use-public-rspm: true\n\n      - uses: r-lib/actions/setup-r-dependencies@v2\n        with:\n          extra-packages: any::rcmdcheck\n          needs: check\n\n      - uses: r-lib/actions/check-r-package@v2\n        with:\n          upload-snapshots: true\n          build_args: 'c(\"--no-manual\",\"--compact-vignettes=gs+qpdf\")'\n"
  },
  {
    "path": ".github/workflows/pkgdown.yaml",
    "content": "# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples\n# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help\non:\n  push:\n    branches: [main, master]\n  pull_request:\n    branches: [main, master]\n  release:\n    types: [published]\n  workflow_dispatch:\n\nname: pkgdown\n\njobs:\n  pkgdown:\n    runs-on: ubuntu-latest\n    # Only restrict concurrency for non-PR jobs\n    concurrency:\n      group: pkgdown-${{ github.event_name != 'pull_request' || github.run_id }}\n    env:\n      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}\n    steps:\n      - uses: actions/checkout@v3\n\n      - uses: r-lib/actions/setup-pandoc@v2\n\n      - uses: r-lib/actions/setup-r@v2\n        with:\n          use-public-rspm: true\n\n      - uses: r-lib/actions/setup-r-dependencies@v2\n        with:\n          extra-packages: any::pkgdown, local::.\n          needs: website\n\n      - name: Build site\n        run: pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE)\n        shell: Rscript {0}\n\n      - name: Deploy to GitHub pages 🚀\n        if: github.event_name != 'pull_request'\n        uses: JamesIves/github-pages-deploy-action@v4.4.1\n        with:\n          clean: false\n          branch: gh-pages\n          folder: docs\n"
  },
  {
    "path": ".github/workflows/test-coverage.yaml",
    "content": "# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples\n# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help\non:\n  push:\n    branches: [main, master]\n  pull_request:\n\nname: test-coverage.yaml\n\npermissions: read-all\n\njobs:\n  test-coverage:\n    runs-on: ubuntu-latest\n    env:\n      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - uses: r-lib/actions/setup-r@v2\n        with:\n          use-public-rspm: true\n\n      - uses: r-lib/actions/setup-r-dependencies@v2\n        with:\n          extra-packages: any::covr, any::xml2\n          needs: coverage\n\n      - name: Test coverage\n        run: |\n          cov <- covr::package_coverage(\n            quiet = FALSE,\n            clean = FALSE,\n            install_path = file.path(normalizePath(Sys.getenv(\"RUNNER_TEMP\"), winslash = \"/\"), \"package\")\n          )\n          print(cov)\n          covr::to_cobertura(cov)\n        shell: Rscript {0}\n\n      - uses: codecov/codecov-action@v5\n        with:\n          # Fail if error if not on PR, or if on PR and token is given\n          fail_ci_if_error: ${{ github.event_name != 'pull_request' || secrets.CODECOV_TOKEN }}\n          files: ./cobertura.xml\n          plugins: noop\n          disable_search: true\n          token: ${{ secrets.CODECOV_TOKEN }}\n\n      - name: Show testthat output\n        if: always()\n        run: |\n          ## --------------------------------------------------------------------\n          find '${{ runner.temp }}/package' -name 'testthat.Rout*' -exec cat '{}' \\; || true\n        shell: bash\n\n      - name: Upload test results\n        if: failure()\n        uses: actions/upload-artifact@v4\n        with:\n          name: coverage-test-failures\n          path: ${{ runner.temp }}/package\n"
  },
  {
    "path": ".gitignore",
    "content": ".Rproj.user\n.Rhistory\n.Rdata\n.httr-oauth\n.DS_Store\ndocs\ninst/doc\n/doc/\n/Meta/\n/README_cache/\n\nrevdep/checks\nrevdep/library\nrevdep/checks.noindex\nrevdep/library.noindex\nrevdep/data.sqlite\nrevdep/cloud.noindex\n"
  },
  {
    "path": "CRAN-SUBMISSION",
    "content": "Version: 1.0.0\nDate: 2023-02-09 22:57:39 UTC\nSHA: 54e4677246f7f7d4e50b02d4a5d61b993900c46f\n"
  },
  {
    "path": "DESCRIPTION",
    "content": "Package: ggdensity\nTitle: Interpretable Bivariate Density Visualization with 'ggplot2'\nVersion: 1.0.1\nAuthors@R: \n    c(person(given = \"James\",\n           family = \"Otto\",\n           role = c(\"aut\", \"cre\", \"cph\"),\n           email = \"jamesotto852@gmail.com\",\n           comment = c(ORCID = \"0000-0002-0665-2515\")),\n      person(given = \"David\",\n           family = \"Kahle\",\n           role = c(\"aut\"),\n           email = \"david@kahle.io\",\n           comment = c(ORCID = \"0000-0002-9999-1558\")))\nDescription: The 'ggplot2' package provides simple functions for visualizing contours\n  of 2-d kernel density estimates. 'ggdensity' implements several additional density estimators \n  as well as more interpretable visualizations based on highest density regions instead of\n  the traditional height of the estimated density surface. \nLicense: MIT + file LICENSE\nEncoding: UTF-8\nRoxygen: list(markdown = TRUE)\nRoxygenNote: 7.3.3\nDepends:\n    ggplot2\nImports:\n    isoband,\n    vctrs,\n    tibble,\n    MASS,\n    stats,\n    scales\nURL: https://jamesotto852.github.io/ggdensity/, https://github.com/jamesotto852/ggdensity/\nBugReports: https://github.com/jamesotto852/ggdensity/issues/\nSuggests: \n    vdiffr,\n    testthat (>= 3.0.0),\n    knitr,\n    rmarkdown\nConfig/testthat/edition: 3\nVignetteBuilder: knitr\n"
  },
  {
    "path": "LICENSE",
    "content": "YEAR: 2021\nCOPYRIGHT HOLDER: ggdensity authors\n"
  },
  {
    "path": "LICENSE.md",
    "content": "# MIT License\n\nCopyright (c) 2021 ggdensity authors\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "NAMESPACE",
    "content": "# Generated by roxygen2: do not edit by hand\n\nexport(GeomHdr)\nexport(GeomHdrFun)\nexport(GeomHdrLines)\nexport(GeomHdrLinesFun)\nexport(GeomHdrRug)\nexport(GeomHdrRugFun)\nexport(StatHdr)\nexport(StatHdrFun)\nexport(StatHdrLines)\nexport(StatHdrLinesFun)\nexport(StatHdrPoints)\nexport(StatHdrPointsFun)\nexport(StatHdrRug)\nexport(StatHdrRugFun)\nexport(geom_hdr)\nexport(geom_hdr_fun)\nexport(geom_hdr_lines)\nexport(geom_hdr_lines_fun)\nexport(geom_hdr_points)\nexport(geom_hdr_points_fun)\nexport(geom_hdr_rug)\nexport(geom_hdr_rug_fun)\nexport(get_hdr)\nexport(get_hdr_1d)\nexport(method_freqpoly)\nexport(method_freqpoly_1d)\nexport(method_histogram)\nexport(method_histogram_1d)\nexport(method_kde)\nexport(method_kde_1d)\nexport(method_mvnorm)\nexport(method_norm_1d)\nexport(stat_hdr)\nexport(stat_hdr_fun)\nexport(stat_hdr_lines)\nexport(stat_hdr_lines_fun)\nexport(stat_hdr_points)\nexport(stat_hdr_points_fun)\nexport(stat_hdr_rug)\nexport(stat_hdr_rug_fun)\nimport(ggplot2)\nimportFrom(MASS,bandwidth.nrd)\nimportFrom(MASS,kde2d)\nimportFrom(scales,percent)\nimportFrom(scales,percent_format)\nimportFrom(stats,cor)\nimportFrom(stats,cov)\nimportFrom(stats,dnorm)\nimportFrom(stats,pchisq)\nimportFrom(stats,sd)\nimportFrom(stats,setNames)\nimportFrom(stats,uniroot)\n"
  },
  {
    "path": "NEWS.md",
    "content": "# ggdensity 1.0.1\n\n## Fixes\n\n* Package startup message no longer effects the sessions RNG (Reported by @TimTaylor #34)\n\n* Fixed ordering of probabilities in the plot legend to be independent of order specified in `probs` argument (Reported by @z3tt #32)\n\n\n\n# ggdensity 1.0.0\n\n## Features\n\n* Added `get_hdr()` and `get_hdr_1d()` functions, \nexporting implementation of HDR computations (Suggested by @eliocamp #28)\n\n* Reworked `method` argument, allowing for either character or function call specification.\nImplemented related `method_*()` and `method_*_1d()` functions (e.g. `method_kde()` and `method_kde_1d()`).\nSee `?get_hdr` or `vignette(\"method\", \"ggdensity\")` for details (Suggested by @eliocamp #29)\n\n* Added unit tests (Suggested by @eliocamp, #30)\n\n## Breaking Changes\n\n* Removed arguments governing density estimators from `stat_hdr()` and other layer functions--these\nare now specified with `method_*()` and `method_*_1d()` functions\n\n## Fixes\n\n* [Added support](https://tidyverse.org/blog/2022/08/ggplot2-3-4-0-size-to-linewidth/) for the new `linewidth` aesthetic (Reported by @eliocamp, #23)\n\n# ggdensity 0.1.1\n\n## Fixes\n\n* Removed **ggplot2** build-time dependencies (Reported by @thomasp85, #21)\n\n* Fixed bug in `stat_hdr_lines_fun()` which drew lines between components of disconnected HDRs (Reported by @afranks86, #20)\n\n\n# ggdensity 0.1.0\n\n## Features\n\n* Added `geom`/`stat_hdr_rug()` for visualizing marginal HDRs via \"rug plot\"\nstyle graphics along plot axes (#14)\n\n* Added `geom`/`stat_hdr_points()` and `geom`/`stat_hdr_points_fun()` for \nvisualizing HDR membership of points via a colored scatterplot (#15)\n\n## Fixes\n\n* Changed name of computed variable in all stat functions from `level` to `probs`\n"
  },
  {
    "path": "R/attach.R",
    "content": ".onAttach <- function(...) {\n  random_digit <- function() {\n    time <- as.character(Sys.time())\n    digit <- substr(time, nchar(time), nchar(time))\n    as.integer(digit)\n  }\n  if(!interactive() || random_digit() != 1L) return()\n  packageStartupMessage('  Please cite ggdensity! See citation(\"ggdensity\") for details.')\n}\n"
  },
  {
    "path": "R/get_hdr.R",
    "content": "#' Computing the highest density regions of a 2D density\n#'\n#' `get_hdr` is used to estimate a 2-dimensional density and compute\n#' corresponding HDRs. The estimated density and HDRs are represented in a\n#' discrete form as a grid, defined by arguments `rangex`, `rangey`, and `n`.\n#' `get_hdr` is used internally by layer functions `stat_hdr()`,\n#' `stat_hdr_points()`, `stat_hdr_fun()`, etc.\n#'\n#' @param method Either a character (`\"kde\"`, `\"mvnorm\"`, `\"histogram\"`,\n#'   `\"freqpoly\"`, or `\"fun\"`) or `method_*()` function. See the \"The `method`\n#'   argument\" section below for details.\n#' @param data A data frame with columns `x` and `y`.\n#' @param probs Probabilities to compute HDRs for.\n#' @param rangex,rangey Range of grid representing estimated density and HDRs,\n#'   along the x- and y-axes.\n#' @param n Resolution of grid representing estimated density and HDRs.\n#' @param hdr_membership Should HDR membership of data points (`data`) be\n#'   computed? Defaults to `TRUE`, although it is computationally expensive for\n#'   large data sets.\n#' @param fun Optional, a joint probability density function, must be vectorized\n#'   in its first two arguments. See the \"The `fun` argument\" section below for\n#'   details.\n#' @param args Optional, a list of arguments to be provided to `fun`.\n#'\n#' @section The `method` argument: The density estimator used to estimate the\n#'   HDRs is specified with the `method` argument. The simplest way to specify\n#'   an estimator is to provide a character value to `method`, for example\n#'   `method = \"kde\"` specifies a kernel density estimator. However, this\n#'   specification is limited to the default behavior of the estimator.\n#'\n#'   Instead, it is possible to provide a function call, for example: `method =\n#'   method_kde()`. In many cases, these functions accept parameters governing\n#'   the density estimation procedure. Here, `method_kde()` accepts parameters\n#'   `h` and `adjust`, both related to the kernel's bandwidth. For details, see\n#'   `?method_kde`. Every method of bivariate density estimation implemented has\n#'   such corresponding `method_*()` function, each with an associated help\n#'   page.\n#'\n#'   Note: `geom_hdr()` and other layer functions also have `method` arguments\n#'   which behave in the same way. For more details on the use and\n#'   implementation of the `method_*()` functions, see `vignette(\"method\",\n#'   \"ggdensity\")`.\n#'\n#' @section The `fun` argument: If `method` is set to `\"fun\"`, `get_hdr()`\n#'   expects a bivariate probability density function to be specified with the\n#'   `fun` argument. It is required that `fun` be a function of at least two\n#'   arguments (`x` and `y`). Beyond these first two arguments, `fun` can have\n#'   arbitrarily many arguments; these can be set in `get_hdr()` as a named list\n#'   via the `args` parameter.\n#'\n#'   Note: `get_hdr()` requires that `fun` be vectorized in `x` and `y`. For an\n#'   example of an appropriate choice of `fun`, see the final example below.\n#'\n#' @returns\n#'\n#' `get_hdr` returns a list with elements `df_est` (`data.frame`), `breaks`\n#' (named `numeric`), and `data` (`data.frame`).\n#'\n#' * `df_est`: the estimated HDRs and density evaluated on the grid defined by `rangex`, `rangey`, and `n`.\n#' The column of estimated HDRs (`df_est$hdr`) is a numeric vector with values\n#' from `probs`. The columns `df_est$fhat` and `df_est$fhat_discretized`\n#' correspond to the estimated density on the original scale and rescaled to sum\n#' to 1, respectively.\n#'\n#' * `breaks`: the heights of the estimated density (`df_est$fhat`) corresponding to the HDRs specified by `probs`.\n#' Will always have additional element `Inf` representing the cutoff for the\n#' 100% HDR.\n#'\n#' * `data`: the original data provided in the `data` argument.\n#' If `hdr_membership` is set to `TRUE`, this includes a column\n#' (`data$hdr_membership`) with the HDR corresponding to each data point.\n#'\n#' @examples\n#' df <- data.frame(x = rnorm(1e3), y = rnorm(1e3))\n#'\n#' # Two ways to specify `method`\n#' get_hdr(df, method = \"kde\")\n#' get_hdr(df, method = method_kde())\n#'\n#' \\dontrun{\n#'\n#' # If parenthesis are omitted, `get_hdr()` errors\n#' get_hdr(df, method = method_kde)\n#' }\n#'\n#' # Estimate different HDRs with `probs`\n#' get_hdr(df, method = method_kde(), probs = c(.975, .6, .2))\n#'\n#' # Adjust estimator parameters with arguments to `method_kde()`\n#' get_hdr(df, method = method_kde(h = 1))\n#'\n#' # Parametric normal estimator of density\n#' get_hdr(df, method = \"mvnorm\")\n#' get_hdr(df, method = method_mvnorm())\n#'\n#' # Compute \"population\" HDRs of specified bivariate pdf with `method = \"fun\"`\n#' f <- function(x, y, sd_x = 1, sd_y = 1) dnorm(x, sd = sd_x) * dnorm(y, sd = sd_y)\n#'\n#' get_hdr(\n#'   method = \"fun\", fun = f,\n#'   rangex = c(-5, 5), rangey = c(-5, 5)\n#'  )\n#'\n#' get_hdr(\n#'   method = \"fun\", fun = f,\n#'   rangex = c(-5, 5), rangey = c(-5, 5),\n#'   args = list(sd_x = .5, sd_y = .5) # specify additional arguments w/ `args`\n#' )\n#'\n#' @export\nget_hdr <- function(data = NULL, method = \"kde\", probs = c(.99, .95, .8, .5), n = 100, rangex = NULL, rangey = NULL, hdr_membership = TRUE, fun, args = list()) {\n\n  # Deal with missing data argument\n  if (is.null(data)) {\n    if (!is.character(method) | (is.character(method) && method != \"fun\")) {\n      stop('`data` must be provided unless `method = \"fun\"`')\n    } else {\n      if (is.null(rangex) | is.null(rangey)) {\n        stop('If `data` is unspecified, `rangex` and `rangey` must be provided when `method = \"fun\"`')\n      }\n    }\n  }\n\n  rangex <- rangex %||% range(data$x)\n  rangey <- rangey %||% range(data$y)\n\n  probs <- fix_probs(probs)\n\n  # Create df_est (estimated density evaluated on a grid) depending on specified method:\n  if (is.character(method) && method == \"fun\") {\n\n    df_est <- f_est(method = NULL, n = n, rangex = rangex, rangey = rangey, fun = fun, args = args)\n\n  } else  {\n\n    if (is.character(method)) {\n\n      if (!method %in% c(\"kde\", \"mvnorm\", \"histogram\", \"freqpoly\")) stop(\"Invalid method specified\")\n\n      # If method is provided as a character, re-assign correct function output:\n      method <- switch(method,\n        \"kde\"       = method_kde(),\n        \"histogram\" = method_histogram(),\n        \"freqpoly\"  = method_freqpoly(),\n        \"mvnorm\"    = method_mvnorm()\n      )\n\n    }\n\n    # parse args of method to determine strategy of `method`\n    method_formals <- names(formals(method))\n\n    # If `data` is the only argument to `method`, we know `method`\n    # is a function factory, returning a closure of pdf in terms of x, y:\n    if (length(method_formals) == 1 && method_formals == \"data\") {\n\n      df_est <- f_est(method, data, n, rangex, rangey)\n\n    # Otherwise `method` computes a grid for us, shortcutting\n    # representing pdf in terms of x, y:\n    } else if (length(method_formals) == 4 && all(method_formals == c(\"data\", \"n\", \"rangex\", \"rangey\"))) {\n\n      df_est <- method(data, n, rangex, rangey)\n\n    } else {\n\n      stop(\"Invalid `method` argument -- did you forget the `()`?\")\n\n    }\n\n  }\n\n\n  # remove unneeded attributes\n  attr(df_est, \"out.attrs\") <- NULL\n\n  # Manipulate df_est to get information about HDRs:\n\n  # force estimate to integrate to 1\n  df_est$fhat_discretized <- normalize(df_est$fhat)\n\n  # temporarily rescale df$fhat for stability\n  fhat_max <- max(df_est$fhat)\n  df_est$fhat <- df_est$fhat / fhat_max\n\n  # find cutoffs (in terms of rescaled fhat)\n  breaks <- c(find_cutoff(df_est, probs), Inf)\n\n  # find HDRs for points in the grid\n  df_est$hdr <- vapply(df_est$fhat, get_hdr_val, numeric(1), breaks, probs)\n\n  # find hdr membership of points from data\n  if (!is.null(data) & hdr_membership) {\n    data$hdr_membership <- mapply(get_hdr_membership, data$x, data$y, MoreArgs = list(df_est, breaks, probs), SIMPLIFY = TRUE)\n  }\n\n  # transforming df_est$fhat and breaks back to original scale:\n  df_est$fhat <- df_est$fhat * fhat_max\n  breaks <- breaks * fhat_max\n\n  # Give breaks nicely formatted names, corresponding to HDRs:\n  names(breaks) <- scales::percent_format(accuracy = 1)(probs)\n\n  # bundle everything together\n  list(\n    df_est = df_est,\n    breaks = breaks,\n    data = data\n  )\n\n}\n\nfix_probs <- function(probs) {\n  stopifnot(\"Probabilities must be between 0 and 1, exclusive\" = all(probs > 0) & all(probs < 1))\n\n  sort(probs, decreasing = TRUE)\n}\n\nget_hdr_val <- function(fhat, breaks, probs) {\n  hdrs <- which(fhat >= breaks)\n  if (length(hdrs) == 0) return(1)\n  probs[max(hdrs)]\n}\n\nget_hdr_membership <- function(x, y, df_est, breaks, probs) {\n  df_est$dist <- (x - df_est$x)^2 + (y - df_est$y)^2\n  fhat <- df_est[which.min(df_est$dist), \"fhat\"]\n\n  get_hdr_val(fhat, breaks, probs)\n}\n\n\n# method is a function of data\n# fun is a function of vectors x, y\nf_est <- function(method, data, n, rangex, rangey, fun = NULL, args = list()) {\n\n  # If `fun` isn't specified, method returns a closure\n  # representing closed form of density estimate\n  fun <- fun %||% method(data)\n\n  # grid to evaluate fun\n  df <- expand.grid(\n    \"x\" = seq(rangex[1], rangex[2], length.out = n),\n    \"y\" = seq(rangey[1], rangey[2], length.out = n)\n  )\n\n  # evaluate method on the grid, f required to be vectorized in x, y:\n  # (args is only non-empty if fun was specified)\n  df$fhat <- do.call(fun, c(quote(df$x), quote(df$y), args))\n\n  df\n\n}\n\n\n"
  },
  {
    "path": "R/get_hdr_1d.R",
    "content": "#' Computing the highest density regions of a 1D density\n#'\n#' `get_hdr_1d` is used to estimate a 1-dimensional density and compute corresponding HDRs.\n#' The estimated density and HDRs are represented in a discrete form as a grid, defined by arguments `range` and `n`.\n#' `get_hdr_1d` is used internally by layer functions `stat_hdr_rug()` and `stat_hdr_rug_fun()`.\n#'\n#' @inheritParams get_hdr\n#' @param method Either a character (`\"kde\"`, `\"norm\"`, `\"histogram\"`, `\"freqpoly\"`, or `\"fun\"`) or `method_*_1d()` function.\n#'   See the \"The `method` argument\" section below for details.\n#' @param x A vector of data\n#' @param hdr_membership Should HDR membership of data points (`x`) be computed?\n#' @param range Range of grid representing estimated density and HDRs.\n#' @param n Resolution of grid representing estimated density and HDRs.\n#' @param fun Optional, a probability density function, must be vectorized in its first argument.\n#'   See the \"The `fun` argument\" section below for details.\n#'\n#' @section The `method` argument:\n#' The density estimator used to estimate the HDRs is specified with the `method` argument.\n#' The simplest way to specify an estimator is to provide a character value to `method`,\n#' for example `method = \"kde\"` specifies a kernel density estimator.\n#' However, this specification is limited to the default behavior of the estimator.\n#'\n#' Instead, it is possible to provide a function call, for example: `method = method_kde_1d()`.\n#' This is slightly different from the function calls provided in `get_hdr()`, note the `_1d` suffix.\n#' In many cases, these functions accept parameters governing the density estimation procedure.\n#' Here, `method_kde_1d()` accepts several parameters related to the choice of kernel.\n#' For details, see `?method_kde_1d`.\n#' Every method of univariate density estimation implemented has such corresponding `method_*_1d()` function,\n#' each with an associated help page.\n#'\n#' Note: `geom_hdr_rug()` and other layer functions also have `method` arguments which behave in the same way.\n#' For more details on the use and implementation of the `method_*_1d()` functions,\n#' see `vignette(\"method\", \"ggdensity\")`.\n#'\n#' @section The `fun` argument:\n#' If `method` is set to `\"fun\"`, `get_hdr_1d()` expects a univariate probability\n#' density function to be specified with the `fun` argument.\n#' It is required that `fun` be a function of at least one argument (`x`).\n#' Beyond this first argument, `fun` can have arbitrarily many arguments;\n#' these can be set in `get_hdr_1d()` as a named list via the `args` parameter.\n#'\n#' Note: `get_hdr_1d()` requires that `fun` be vectorized in `x`.\n#' For an example of an appropriate choice of `fun`, see the final example below.\n#'\n#' @returns\n#'\n#' `get_hdr_1d` returns a list with elements `df_est` (`data.frame`), `breaks` (named `numeric`), and `data` (`data.frame`).\n#'\n#' * `df_est`: the estimated HDRs and density evaluated on the grid defined by `range` and `n`.\n#' The column of estimated HDRs (`df_est$hdr`) is a numeric vector with values from `probs`.\n#' The columns `df_est$fhat` and `df_est$fhat_discretized` correspond to the estimated density\n#' on the original scale and rescaled to sum to 1, respectively.\n#'\n#' * `breaks`: the heights of the estimated density (`df_est$fhat`) corresponding to the HDRs specified by `probs`.\n#' Will always have additional element `Inf` representing the cutoff for the 100% HDR.\n#'\n#' * `data`: the original data provided in the `data` argument.\n#' If `hdr_membership` is set to `TRUE`, this includes a column (`data$hdr_membership`)\n#' with the HDR corresponding to each data point.\n#'\n#' @examples\n#' x <- rnorm(1e3)\n#'\n#' # Two ways to specify `method`\n#' get_hdr_1d(x, method = \"kde\")\n#' get_hdr_1d(x, method = method_kde_1d())\n#'\n#' \\dontrun{\n#'\n#' # If parenthesis are omitted, `get_hdr_1d()` errors\n#' get_hdr_1d(df, method = method_kde_1d)\n#'\n#' # If the `_1d` suffix is omitted, `get_hdr_1d()` errors\n#' get_hdr_1d(x, method = method_kde())\n#' }\n#'\n#' # Adjust estimator parameters with arguments to `method_kde_1d()`\n#' get_hdr_1d(x, method = method_kde_1d(kernel = \"triangular\"))\n#'\n#' # Estimate different HDRs with `probs`\n#' get_hdr_1d(x, method = method_kde_1d(), probs = c(.975, .6, .2))\n#'\n#' # Compute \"population\" HDRs of specified univariate pdf with `method = \"fun\"`\n#' f <- function(x, sd = 1) dnorm(x, sd = sd)\n#' get_hdr_1d(method = \"fun\", fun = f, range = c(-5, 5))\n#' get_hdr_1d(method = \"fun\", fun = f, range = c(-5, 5), args = list(sd = .5))\n#'\n#'\n#' @export\nget_hdr_1d <- function(x = NULL, method = \"kde\", probs = c(.99, .95, .8, .5), n = 512, range = NULL, hdr_membership = TRUE, fun, args = list()) {\n\n  # Deal with missing data argument\n  if (is.null(x)) {\n    if (!is.character(method) | (is.character(method) && method != \"fun\")) {\n      stop('`x` must be provided unless `method = \"fun\"`')\n    } else {\n      if (is.null(range)) {\n        stop('If `x` is unspecified, `range` must be provided when `method = \"fun\"`')\n      }\n    }\n  }\n\n  range <- range %||% range(x)\n\n  probs <- fix_probs(probs)\n\n  # Create df_est (estimated density evaluated on a grid) depending on specified method:\n  if (is.character(method) && method == \"fun\") {\n\n    df_est <- f_est_1d(method = NULL, x = x, n, range = range, fun = fun, args = args)\n\n  } else  {\n\n    if (is.character(method)) {\n\n      if (!method %in% c(\"kde\", \"norm\", \"histogram\", \"freqpoly\")) stop(\"Invalid method specified\")\n\n      # If method is provided as a character, re-assign correct function output:\n      method <- switch(method,\n        \"kde\"       = method_kde_1d(),\n        \"histogram\" = method_histogram_1d(),\n        \"freqpoly\"  = method_freqpoly_1d(),\n        \"norm\"      = method_norm_1d()\n      )\n\n    }\n\n    # parse args of method to determine strategy of `method`\n    method_formals <- names(formals(method))\n\n    # If `data` is the only argument to `method`, we know `method`\n    # is a function factory, returning a closure of pdf in terms of x, y:\n    if (length(method_formals) == 1 && method_formals %in% c(\"x\", \"y\")) {\n\n      df_est <- f_est_1d(method, x, n, range)\n\n    # Otherwise `method` computes a grid for us, shortcutting\n    # representing pdf in terms of x, y:\n    } else if (length(method_formals) == 3 && method_formals[1] %in% c(\"x\", \"y\") & all(method_formals[2:3] == c(\"n\", \"range\"))) {\n\n      df_est <- method(x, n, range)\n\n    } else if (\"data\" %in% method_formals) {\n\n      stop(\"Invalid `method` argument -- did you forget the `_1d()`?\")\n\n    } else {\n\n      stop(\"Invalid `method` argument -- did you forget the `()`?\")\n\n    }\n\n  }\n\n\n  # Manipulate df_est to get information about HDRs:\n\n  # force estimate to integrate to 1\n  df_est$fhat_discretized <- normalize(df_est$fhat)\n\n  # temporarily rescale df$fhat for stability\n  fhat_max <- max(df_est$fhat)\n  df_est$fhat <- df_est$fhat / fhat_max\n\n  # find cutoffs (in terms of rescaled fhat)\n  breaks <- c(find_cutoff(df_est, probs), Inf)\n\n  # find HDRs for points in the grid\n  df_est$hdr <- vapply(df_est$fhat, get_hdr_val, numeric(1), breaks, probs)\n\n  # find hdr membership of points from data\n  if (!is.null(x) & hdr_membership) {\n\n    data <- data.frame(x = x)\n\n    if (hdr_membership) {\n\n      hdr_membership <- vapply(x, get_hdr_membership_1d, numeric(1), df_est, breaks, probs)\n\n      # create data frame w/ input data (x) + HDR membership\n      data$hdr_membership <- hdr_membership\n\n    }\n\n  } else {\n\n    data <- NULL\n\n  }\n\n  # transforming df_est$fhat and breaks back to original scale:\n  df_est$fhat <- df_est$fhat * fhat_max\n  breaks <- breaks * fhat_max\n\n  # Give breaks nicely formatted names, corresponding to HDRs:\n  names(breaks) <- scales::percent_format(accuracy = 1)(probs)\n\n  # bundle everything together\n  list(\n    df_est = df_est,\n    breaks = breaks,\n    data = data\n  )\n\n}\n\nget_hdr_membership_1d <- function(x, df_est, breaks, probs) {\n  df_est$dist <- (x - df_est$x)^2\n  fhat <- df_est[which.min(df_est$dist), \"fhat\"]\n\n  get_hdr_val(fhat, breaks, probs)\n}\n\n# method is a function of data vector x\n# fun is a function of vector x -- the grid\n# Might need to be more careful w/ axis transformations here\nf_est_1d <- function(method, x, n, range, fun = NULL, args = list()) {\n\n  # If fun isn't specified, method returns a closure\n  # representing closed form of density estimate\n  fun <- fun %||% method(x)\n\n  # grid to evaluate fun\n  df <- data.frame(x = seq(range[1], range[2], length.out = n))\n\n  # evaluate method on the grid, f required to be vectorized in x, y:\n  # (args is only non-empty if fun was specified)\n  df$fhat <- do.call(fun, c(quote(df$x), args))\n\n  df\n\n}\n\n\n"
  },
  {
    "path": "R/ggdensity-package.R",
    "content": "#' ggdensity: Stats and Geoms for Density Estimation with ggplot2\n#'\n#' A package that allows more flexible computations for visualization of density\n#' estimates with ggplot2.\n#'\n#' @seealso\n#'\n#' Useful links:\n#' * \\url{https://jamesotto852.github.io/ggdensity/}\n#' * \\url{https://github.com/jamesotto852/ggdensity/}\n#'\n#' @import ggplot2\n#' @importFrom MASS bandwidth.nrd kde2d\n#' @importFrom stats uniroot cov pchisq setNames sd cor dnorm\n#' @name ggdensity\n#' @aliases ggdensity package-ggdensity\nNULL\n"
  },
  {
    "path": "R/hdr.R",
    "content": "#' Highest density regions of a 2D density estimate\n#'\n#' Perform 2D density estimation, compute and plot the resulting highest density regions.\n#' `geom_hdr()` draws filled regions and `geom_hdr_lines()` draws lines outlining the regions.\n#' Note, the plotted objects have probabilities mapped to the `alpha` aesthetic by default.\n#'\n#' @section Aesthetics: `geom_hdr()` and `geom_hdr_lines()` understand the following aesthetics (required\n#'   aesthetics are in bold):\n#'\n#'   - **x**\n#'   - **y**\n#'   - alpha\n#'   - color\n#'   - fill (only `geom_hdr`)\n#'   - group\n#'   - linetype\n#'   - linewidth\n#'   - subgroup\n#'\n#' @section Computed variables:\n#'\n#'   \\describe{ \\item{probs}{The probability associated with the highest density region, specified\n#'   by `probs` argument.} }\n#'\n#' @inheritParams ggplot2::geom_path\n#' @inheritParams ggplot2::stat_identity\n#' @inheritParams ggplot2::stat_density2d\n#' @param method Density estimator to use, accepts character vector:\n#'   `\"kde\"`,`\"histogram\"`, `\"freqpoly\"`, or `\"mvnorm\"`.\n#'   Alternatively accepts functions  which return closures corresponding to density estimates,\n#'   see `?get_hdr` or `vignette(\"method\", \"ggdensity\")`.\n#' @param probs Probabilities to compute highest density regions for.\n#' @param xlim,ylim Range to compute and draw regions. If `NULL`, defaults to\n#'   range of data.\n#' @param n Resolution of grid defined by `xlim` and `ylim`.\n#'   Ignored if `method = \"histogram\"` or `method = \"freqpoly\"`.\n#' @name geom_hdr\n#' @rdname geom_hdr\n#' @references Scott, David W. Multivariate Density Estimation (2e), Wiley.\n#'\n#' @import ggplot2\n#'\n#' @examples\n#' # Basic simulated data with bivariate normal data and various methods\n#' df <- data.frame(x = rnorm(1000), y = rnorm(1000))\n#' p <- ggplot(df, aes(x, y)) + coord_equal()\n#'\n#' p + geom_hdr()\n#' p + geom_hdr(method = \"mvnorm\")\n#' p + geom_hdr(method = \"freqpoly\")\n#' # p + geom_hdr(method = \"histogram\")\n#'\n#' # Adding point layers on top to visually assess region estimates\n#' pts <- geom_point(size = .2, color = \"red\")\n#'\n#' p + geom_hdr() + pts\n#' p + geom_hdr(method = \"mvnorm\") + pts\n#' p + geom_hdr(method = \"freqpoly\") + pts\n#' # p + geom_hdr(method = \"histogram\") + pts\n#'\n#' # Highest density region boundary lines\n#' p + geom_hdr_lines()\n#' p + geom_hdr_lines(method = \"mvnorm\")\n#' p + geom_hdr_lines(method = \"freqpoly\")\n#' # p + geom_hdr_lines(method = \"histogram\")\n#'\n#' \\dontrun{\n#'\n#' # 2+ groups - mapping other aesthetics in the geom\n#' rdata <- function(n, n_groups = 3, radius = 3) {\n#'   list_of_dfs <- lapply(0:(n_groups-1), function(k) {\n#'     mu <- c(cos(2*k*pi/n_groups), sin(2*k*pi/n_groups))\n#'     m <- MASS::mvrnorm(n, radius*mu, diag(2))\n#'     structure(data.frame(m, as.character(k)), names = c(\"x\", \"y\", \"c\"))\n#'   })\n#'   do.call(\"rbind\", list_of_dfs)\n#' }\n#'\n#' dfc <- rdata(1000, n_groups = 5)\n#' pf <- ggplot(dfc, aes(x, y, fill = c)) + coord_equal()\n#'\n#' pf + geom_hdr()\n#' pf + geom_hdr(method = \"mvnorm\")\n#' pf + geom_hdr(method = \"mvnorm\", probs = .90, alpha = .5)\n#' pf + geom_hdr(method = \"histogram\")\n#' pf + geom_hdr(method = \"freqpoly\")\n#'\n#' pc <- ggplot(dfc, aes(x, y, color = c)) +\n#'  coord_equal() +\n#'  theme_minimal() +\n#'  theme(panel.grid.minor = element_blank())\n#'\n#' pc + geom_hdr_lines()\n#' pc + geom_hdr_lines(method = \"mvnorm\")\n#'\n#'\n#' # Data with boundaries\n#' ggplot(df, aes(x^2)) + geom_histogram(bins = 30)\n#' ggplot(df, aes(x^2)) + geom_histogram(bins = 30, boundary = 0)\n#' ggplot(df, aes(x^2, y^2)) + geom_hdr(method = \"histogram\")\n#'\n#' }\n#'\nNULL\n\n\n#' @rdname geom_hdr\n#' @export\nstat_hdr <- function(mapping = NULL, data = NULL,\n                     geom = \"hdr\", position = \"identity\",\n                     ...,\n                     method = \"kde\",\n                     probs = c(.99, .95, .8, .5),\n                     n = 100,\n                     xlim = NULL,\n                     ylim = NULL,\n                     na.rm = FALSE,\n                     show.legend = NA,\n                     inherit.aes = TRUE) {\n  layer(\n    data = data,\n    mapping = mapping,\n    stat = StatHdr,\n    geom = geom,\n    position = position,\n    show.legend = show.legend,\n    inherit.aes = inherit.aes,\n    params = list(\n      method = method,\n      probs = probs,\n      n = n,\n      xlim = xlim,\n      ylim = ylim,\n      na.rm = na.rm,\n      ...\n    )\n  )\n}\n\n\n\n\n#' @rdname geom_hdr\n#' @format NULL\n#' @usage NULL\n#' @importFrom scales percent\n#' @export\nStatHdr <- ggproto(\"StatHdr\", Stat,\n\n  required_aes = c(\"x\", \"y\"),\n  default_aes = aes(order = after_stat(probs), alpha = after_stat(probs)),\n\n  output = \"bands\",\n\n  compute_group = function(self, data, scales, na.rm = FALSE,\n                           method = \"kde\", probs = c(.99, .95, .8, .5),\n                           n = 100, xlim = NULL, ylim = NULL) {\n\n    rangex <- xlim %||% scales$x$dimension()\n    rangey <- ylim %||% scales$y$dimension()\n\n    # Only calculate HDR membership if we need to\n    need_membership <- (self$output == \"points\")\n\n    res <- get_hdr(data, method, probs, n, rangex, rangey, hdr_membership = need_membership)\n\n    res_to_df(res, probs, data$group[1], self$output)\n\n  }\n)\n\n# internal helper function to convert output of `get_hdr[_1d]()` into\n# what `GeomHdr*$draw_group()` methods need\nres_to_df <- function(res, probs, group, output) {\n\n  probs <- fix_probs(probs)\n\n  # Need z for xyz_to_isobands/lines()\n  res$df_est$z <- res$df_est$fhat\n\n  if (output == \"bands\") {\n\n    isobands <- xyz_to_isobands(res$df_est, res$breaks)\n    names(isobands) <- scales::percent_format(accuracy = 1)(probs)\n    df <- iso_to_polygon(isobands, group)\n    df$probs <- ordered(df$level, levels = names(isobands))\n    df$level <- NULL\n\n  } else if (output == \"lines\") {\n\n    isolines <- xyz_to_isolines(res$df_est, res$breaks)\n    names(isolines) <- scales::percent_format(accuracy = 1)(probs)\n    df <- iso_to_path(isolines, group)\n    df$probs <- ordered(df$level, levels = names(isolines))\n    df$level <- NULL\n\n  } else if (output == \"points\") {\n\n    df <- res$data\n    df$hdr_membership <- scales::percent_format(accuracy = 1)(df$hdr_membership)\n    df$probs <- ordered(df$hdr_membership, levels = scales::percent_format(accuracy = 1)(c(1, probs)))\n    df$hdr_membership <- NULL\n\n  }\n\n  df\n\n}\n\n\n\n#' @rdname geom_hdr\n#' @export\ngeom_hdr <- function(mapping = NULL, data = NULL,\n                       stat = \"hdr\", position = \"identity\",\n                       ...,\n                       na.rm = FALSE,\n                       show.legend = NA,\n                       inherit.aes = TRUE) {\n  layer(\n    data = data,\n    mapping = mapping,\n    stat = stat,\n    geom = GeomHdr,\n    position = position,\n    show.legend = show.legend,\n    inherit.aes = inherit.aes,\n    params = list(\n      na.rm = na.rm,\n      ...\n    )\n  )\n}\n\n\n\n#' @rdname geom_hdr\n#' @format NULL\n#' @usage NULL\n#' @export\nGeomHdr <- ggproto(\"GeomHdr\", GeomPolygon)\n"
  },
  {
    "path": "R/hdr_fun.R",
    "content": "#' Highest density regions of a bivariate pdf\n#'\n#' Compute and plot the highest density regions (HDRs) of a bivariate pdf.\n#' `geom_hdr_fun()` draws filled regions, and `geom_hdr_lines_fun()` draws lines outlining the regions.\n#' Note, the plotted objects have probabilities mapped to the `alpha` aesthetic by default.\n#'\n#' @section Aesthetics: `geom_hdr_fun()` and `geom_hdr_lines_fun()` understand the following aesthetics (required\n#'   aesthetics are in bold):\n#'\n#'   - x\n#'   - y\n#'   - alpha\n#'   - color\n#'   - fill (only `geom_hdr_fun`)\n#'   - group\n#'   - linetype\n#'   - linewidth\n#'   - subgroup\n#'\n#' @section Computed variables:\n#'\n#'   \\describe{ \\item{probs}{The probability associated with the highest density region, specified\n#'   by `probs`.} }\n#'\n#' @inheritParams ggplot2::geom_path\n#' @inheritParams ggplot2::stat_identity\n#' @inheritParams ggplot2::stat_density2d\n#' @param fun A function, the joint probability density function, must be\n#' vectorized in its first two arguments; see examples.\n#' @param args Named list of additional arguments passed on to `fun`.\n#' @param probs Probabilities to compute highest density regions for.\n#' @param n Resolution of grid `fun` is evaluated on.\n#' @param xlim,ylim Range to compute and draw regions. If `NULL`, defaults to\n#'   range of data if present.\n#' @name geom_hdr_fun\n#' @rdname geom_hdr_fun\n#'\n#' @import ggplot2\n#'\n#' @examples\n#' # HDRs of the bivariate exponential\n#' f <- function(x, y) dexp(x) * dexp(y)\n#' ggplot() + geom_hdr_fun(fun = f, xlim = c(0, 10), ylim = c(0, 10))\n#'\n#'\n#' # HDRs of a custom parametric model\n#'\n#' # generate example data\n#' n <- 1000\n#' th_true <- c(3, 8)\n#'\n#' rdata <- function(n, th) {\n#'   gen_single_obs <- function(th) {\n#'     rchisq(2, df = th) # can be anything\n#'   }\n#'   df <- replicate(n, gen_single_obs(th))\n#'   setNames(as.data.frame(t(df)), c(\"x\", \"y\"))\n#' }\n#' data <- rdata(n, th_true)\n#'\n#' # estimate unknown parameters via maximum likelihood\n#' likelihood <- function(th) {\n#'   th <- abs(th) # hack to enforce parameter space boundary\n#'   log_f <- function(v) {\n#'     x <- v[1]; y <- v[2]\n#'     dchisq(x, df = th[1], log = TRUE) + dchisq(y, df = th[2], log = TRUE)\n#'   }\n#'   sum(apply(data, 1, log_f))\n#' }\n#' (th_hat <- optim(c(1, 1), likelihood, control = list(fnscale = -1))$par)\n#'\n#' # plot f for the give model\n#' f <- function(x, y, th) dchisq(x, df = th[1]) * dchisq(y, df = th[2])\n#'\n#' ggplot(data, aes(x, y)) +\n#'   geom_hdr_fun(fun = f, args = list(th = th_hat)) +\n#'   geom_point(size = .25, color = \"red\") +\n#'   xlim(0, 30) + ylim(c(0, 30))\n#'\n#' ggplot(data, aes(x, y)) +\n#'   geom_hdr_lines_fun(fun = f, args = list(th = th_hat)) +\n#'   geom_point(size = .25, color = \"red\") +\n#'   xlim(0, 30) + ylim(c(0, 30))\n#'\n#'\nNULL\n\n\n\n\n\n\n#' @rdname geom_hdr_fun\n#' @export\nstat_hdr_fun <- function(mapping = NULL, data = NULL,\n  geom = \"hdr_fun\", position = \"identity\",\n  ...,\n  fun, args = list(),\n  probs = c(.99, .95, .8, .5),\n  xlim = NULL, ylim = NULL, n = 100,\n  na.rm = FALSE,\n  show.legend = NA,\n  inherit.aes = TRUE) {\n\n  if (is.null(data)) data <- ensure_nonempty_data\n\n  layer(\n    data = data,\n    mapping = mapping,\n    stat = StatHdrFun,\n    geom = geom,\n    position = position,\n    show.legend = show.legend,\n    inherit.aes = inherit.aes,\n    params = list(\n      fun = fun,\n      args = args,\n      probs = probs,\n      xlim = xlim,\n      ylim = ylim,\n      n = n,\n      na.rm = na.rm,\n      ...\n    )\n  )\n}\n\n\n\n#' @rdname geom_hdr_fun\n#' @format NULL\n#' @usage NULL\n#' @importFrom scales percent\n#' @export\nStatHdrFun <- ggproto(\"StatHdrFun\", Stat,\n\n  default_aes = aes(order = after_stat(probs), alpha = after_stat(probs)),\n\n  output = \"bands\",\n\n  # very similar to StatHdr$compute_group(),\n  # only difference are the parameters fun + args (vs. method + parameters)\n  # -- this prevents factoring into one compute_group() method,\n  #    compute_group()'s arguments are different.\n  compute_group = function(self, data, scales, na.rm = FALSE,\n                           fun, args = list(), probs = c(.99, .95, .8, .5),\n                           n = 100, xlim = NULL, ylim = NULL) {\n\n    if ((is.null(xlim) & is.null(scales$x)) | (is.null(ylim) & is.null(scales$y))) {\n      stop(\"If no data is provided to StatHdrFun, xlim and ylim must be specified\")\n    }\n\n    rangex <- xlim %||% scales$x$dimension()\n    rangey <- ylim %||% scales$y$dimension()\n\n    # Only calculate HDR membership if we need to\n    need_membership <- (self$output == \"points\")\n\n    res <- get_hdr(data, method = \"fun\", probs, n, rangex, rangey, hdr_membership = need_membership, fun = fun, args = args)\n\n    res_to_df(res, probs, data$group[1], self$output)\n\n  }\n)\n\n\n#' @rdname geom_hdr_fun\n#' @export\ngeom_hdr_fun <- function(mapping = NULL, data = NULL,\n  stat = \"hdr_fun\", position = \"identity\",\n  ...,\n  na.rm = FALSE,\n  show.legend = NA,\n  inherit.aes = TRUE) {\n\n  if (is.null(data)) data <- ensure_nonempty_data\n\n  layer(\n    data = data,\n    mapping = mapping,\n    stat = stat,\n    geom = GeomHdrFun,\n    position = position,\n    show.legend = show.legend,\n    inherit.aes = inherit.aes,\n    params = list(\n      na.rm = na.rm,\n      ...\n    )\n  )\n}\n\n\n\n#' @rdname geom_hdr_fun\n#' @format NULL\n#' @usage NULL\n#' @export\nGeomHdrFun <- ggproto(\"GeomHdrFun\", GeomHdr)\n"
  },
  {
    "path": "R/hdr_lines.R",
    "content": "#' @rdname geom_hdr\n#' @usage NULL\n#' @export\nstat_hdr_lines <- function(mapping = NULL, data = NULL,\n                           geom = \"hdr_lines\", position = \"identity\",\n                           ...,\n                           method = \"kde\",\n                           probs = c(.99, .95, .8, .5),\n                           n = 100,\n                           xlim = NULL,\n                           ylim = NULL,\n                           na.rm = FALSE,\n                           show.legend = NA,\n                           inherit.aes = TRUE) {\n  layer(\n    data = data,\n    mapping = mapping,\n    stat = StatHdrLines,\n    geom = geom,\n    position = position,\n    show.legend = show.legend,\n    inherit.aes = inherit.aes,\n    params = list(\n      method = method,\n      probs = probs,\n      n = n,\n      xlim = xlim,\n      ylim = ylim,\n      na.rm = na.rm,\n      ...\n    )\n  )\n}\n\n\n#' @rdname geom_hdr\n#' @format NULL\n#' @usage NULL\n#' @importFrom scales percent_format\n#' @export\nStatHdrLines <- ggproto(\"StatHdrLines\", StatHdr,\n  output = \"lines\"\n)\n\n\n#' @rdname geom_hdr\n#' @usage NULL\n#' @export\ngeom_hdr_lines <- function(mapping = NULL, data = NULL,\n                           stat = \"hdr_lines\", position = \"identity\",\n                           ...,\n                           na.rm = FALSE,\n                           show.legend = NA,\n                           inherit.aes = TRUE) {\n  layer(\n    data = data,\n    mapping = mapping,\n    stat = stat,\n    geom = GeomHdrLines,\n    position = position,\n    show.legend = show.legend,\n    inherit.aes = inherit.aes,\n    params = list(\n      na.rm = na.rm,\n      ...\n    )\n  )\n}\n\n\n#' @rdname geom_hdr\n#' @format NULL\n#' @usage NULL\n#' @export\nGeomHdrLines <- ggproto(\"GeomHdrLines\", GeomPath,\n  default_aes = aes(\n    colour = \"#000000\",\n    linewidth = 1,\n    linetype = 1,\n    alpha = NA\n  ))\n"
  },
  {
    "path": "R/hdr_lines_fun.R",
    "content": "#' @rdname geom_hdr_fun\n#' @usage NULL\n#' @export\nstat_hdr_lines_fun <- function(mapping = NULL, data = NULL,\n                               geom = \"hdr_lines_fun\", position = \"identity\",\n                               ...,\n                               fun, args = list(),\n                               probs = c(.99, .95, .8, .5),\n                               xlim = NULL, ylim = NULL, n = 100,\n                               na.rm = FALSE,\n                               show.legend = NA,\n                               inherit.aes = TRUE) {\n\n  if (is.null(data)) data <- ensure_nonempty_data\n\n  layer(\n    data = data,\n    mapping = mapping,\n    stat = StatHdrLinesFun,\n    geom = geom,\n    position = position,\n    show.legend = show.legend,\n    inherit.aes = inherit.aes,\n    params = list(\n      fun = fun,\n      args = args,\n      probs = probs,\n      xlim = xlim,\n      ylim = ylim,\n      n = n,\n      na.rm = na.rm,\n      ...\n    )\n  )\n}\n\n\n\n#' @rdname geom_hdr_fun\n#' @format NULL\n#' @usage NULL\n#' @importFrom scales percent\n#' @export\nStatHdrLinesFun <- ggproto(\"StatHdrLinesFun\", StatHdrFun,\n  output = \"lines\"\n)\n\n\n#' @rdname geom_hdr_fun\n#' @usage NULL\n#' @export\ngeom_hdr_lines_fun <- function(mapping = NULL, data = NULL,\n                         stat = \"hdr_lines_fun\", position = \"identity\",\n                         ...,\n                         na.rm = FALSE,\n                         show.legend = NA,\n                         inherit.aes = TRUE) {\n\n  if (is.null(data)) data <- ensure_nonempty_data\n\n  layer(\n    data = data,\n    mapping = mapping,\n    stat = stat,\n    geom = GeomHdrLinesFun,\n    position = position,\n    show.legend = show.legend,\n    inherit.aes = inherit.aes,\n    params = list(\n      na.rm = na.rm,\n      ...\n    )\n  )\n}\n\n\n\n#' @rdname geom_hdr_fun\n#' @format NULL\n#' @usage NULL\n#' @export\nGeomHdrLinesFun <- ggproto(\"GeomHdrlinesfun\", GeomHdrLines)\n"
  },
  {
    "path": "R/hdr_points.R",
    "content": "#' Scatterplot colored by highest density regions of a 2D density estimate\n#'\n#' Perform 2D density estimation, compute the resulting highest density regions (HDRs),\n#' and plot the provided data as a scatterplot with points colored according to\n#' their corresponding HDR.\n#'\n#' @section Aesthetics: geom_hdr_points understands the following aesthetics (required\n#'   aesthetics are in bold):\n#'\n#'   - **x**\n#'   - **y**\n#'   - alpha\n#'   - color\n#'   - fill\n#'   - group\n#'   - linetype\n#'   - size\n#'   - subgroup\n#'\n#' @section Computed variables:\n#'\n#'   \\describe{ \\item{probs}{The probability associated with the highest density region, specified\n#'   by `probs`.} }\n#'\n#' @inheritParams ggplot2::stat_identity\n#' @inheritParams ggplot2::stat_density2d\n#' @inheritParams geom_hdr\n#'\n#' @name geom_hdr_points\n#' @rdname geom_hdr_points\n#'\n#' @import ggplot2\n#'\n#' @examples\n#' set.seed(1)\n#' df <- data.frame(x = rnorm(500), y = rnorm(500))\n#' p <- ggplot(df, aes(x, y)) +\n#'  coord_equal()\n#'\n#' p + geom_hdr_points()\n#'\n#' # Setting aes(fill = after_stat(probs)), color = \"black\", and\n#' # shape = 21 helps alleviate overplotting:\n#' p + geom_hdr_points(aes(fill = after_stat(probs)), color = \"black\", shape = 21, size = 2)\n#'\n#' # Also works well with geom_hdr_lines()\n#' p +\n#'  geom_hdr_lines(\n#'    aes(color = after_stat(probs)), alpha = 1,\n#'    xlim = c(-5, 5), ylim = c(-5, 5)\n#'  ) +\n#'  geom_hdr_points(\n#'    aes(fill = after_stat(probs)), color = \"black\", shape = 21, size = 2,\n#'    xlim = c(-5, 5), ylim = c(-5, 5)\n#'  )\n#'\nNULL\n\n\n\n#' @export\n#' @rdname geom_hdr_points\nstat_hdr_points <- function(mapping = NULL, data = NULL,\n                            geom = \"point\", position = \"identity\",\n                            ...,\n                            method = \"kde\",\n                            probs = c(.99, .95, .8, .5),\n                            n = 100,\n                            xlim = NULL,\n                            ylim = NULL,\n                            na.rm = FALSE,\n                            show.legend = NA,\n                            inherit.aes = TRUE) {\n  layer(\n    data = data,\n    mapping = mapping,\n    stat = StatHdrPoints,\n    geom = geom,\n    position = position,\n    show.legend = show.legend,\n    inherit.aes = inherit.aes,\n    params = list(\n      method = method,\n      probs = probs,\n      n = n,\n      xlim = xlim,\n      ylim = ylim,\n      na.rm = na.rm,\n      ...\n    )\n  )\n}\n\n\n#' @export\n#' @rdname geom_hdr_points\n#' @format NULL\n#' @usage NULL\nStatHdrPoints <- ggproto(\"StatHdrPoints\", StatHdr,\n  default_aes = aes(order = after_stat(probs), color = after_stat(probs)),\n  output = \"points\"\n)\n\n\n#' @export\n#' @rdname geom_hdr_points\ngeom_hdr_points <- function(mapping = NULL, data = NULL,\n                            stat = \"hdr_points\", position = \"identity\",\n                            ...,\n                            na.rm = FALSE,\n                            show.legend = NA,\n                            inherit.aes = TRUE) {\n\n  layer(\n    data = data,\n    mapping = mapping,\n    stat = stat,\n    geom = GeomPoint,\n    position = position,\n    show.legend = show.legend,\n    inherit.aes = inherit.aes,\n    params = list(\n      na.rm = na.rm,\n      ...\n    )\n  )\n}\n\n"
  },
  {
    "path": "R/hdr_points_fun.R",
    "content": "#' Scatterplot colored by highest density regions of a bivariate pdf\n#'\n#' Compute the highest density regions (HDRs) of a bivariate pdf and plot the provided\n#' data as a scatterplot with points colored according to their corresponding HDR.\n#'\n#' @section Aesthetics: geom_hdr_points_fun understands the following aesthetics\n#'   (required aesthetics are in bold):\n#'\n#'   - **x**\n#'   - **y**\n#'   - alpha\n#'   - color\n#'   - fill\n#'   - group\n#'   - linetype\n#'   - size\n#'   - subgroup\n#'\n#' @section Computed variables:\n#'\n#'   \\describe{ \\item{probs}{The probability associated with the highest density region, specified\n#'   by `probs`.} }\n#'\n#' @inheritParams ggplot2::stat_identity\n#' @inheritParams ggplot2::stat_density2d\n#' @inheritParams geom_hdr_fun\n#'\n#' @name geom_hdr_points_fun\n#' @rdname geom_hdr_points_fun\n#'\n#' @import ggplot2\n#'\n#' @examples\n#' # Can plot points colored according to known pdf:\n#' set.seed(1)\n#' df <- data.frame(x = rexp(1000), y = rexp(1000))\n#' f <- function(x, y) dexp(x) * dexp(y)\n#'\n#' ggplot(df, aes(x, y)) +\n#'   geom_hdr_points_fun(fun = f, xlim = c(0, 10), ylim = c(0, 10))\n#'\n#'\n#' # Also allows for hdrs of a custom parametric model\n#'\n#' # generate example data\n#' n <- 1000\n#' th_true <- c(3, 8)\n#'\n#' rdata <- function(n, th) {\n#'   gen_single_obs <- function(th) {\n#'     rchisq(2, df = th) # can be anything\n#'   }\n#'   df <- replicate(n, gen_single_obs(th))\n#'   setNames(as.data.frame(t(df)), c(\"x\", \"y\"))\n#' }\n#' data <- rdata(n, th_true)\n#'\n#' # estimate unknown parameters via maximum likelihood\n#' likelihood <- function(th) {\n#'   th <- abs(th) # hack to enforce parameter space boundary\n#'   log_f <- function(v) {\n#'     x <- v[1]; y <- v[2]\n#'     dchisq(x, df = th[1], log = TRUE) + dchisq(y, df = th[2], log = TRUE)\n#'   }\n#'   sum(apply(data, 1, log_f))\n#' }\n#' (th_hat <- optim(c(1, 1), likelihood, control = list(fnscale = -1))$par)\n#'\n#' # plot f for the give model\n#' f <- function(x, y, th) dchisq(x, df = th[1]) * dchisq(y, df = th[2])\n#'\n#' ggplot(data, aes(x, y)) +\n#'   geom_hdr_points_fun(fun = f, args = list(th = th_hat))\n#'\n#' ggplot(data, aes(x, y)) +\n#'   geom_hdr_points_fun(aes(fill = after_stat(probs)), shape = 21, color = \"black\",\n#'     fun = f, args = list(th = th_hat), na.rm = TRUE) +\n#'   geom_hdr_lines_fun(aes(color = after_stat(probs)), alpha = 1, fun = f, args = list(th = th_hat)) +\n#'   lims(x = c(0, 15), y = c(0, 25))\n#'\nNULL\n\n\n#' @export\n#' @rdname geom_hdr_points_fun\nstat_hdr_points_fun <- function(mapping = NULL, data = NULL,\n                                geom = \"point\", position = \"identity\",\n                                ...,\n                                fun, args = list(),\n                                probs = c(.99, .95, .8, .5),\n                                xlim = NULL, ylim = NULL, n = 100,\n                                na.rm = FALSE,\n                                show.legend = NA,\n                                inherit.aes = TRUE) {\n\n  layer(\n    data = data,\n    mapping = mapping,\n    stat = StatHdrPointsFun,\n    geom = geom,\n    position = position,\n    show.legend = show.legend,\n    inherit.aes = inherit.aes,\n    params = list(\n      fun = fun,\n      args = args,\n      probs = probs,\n      xlim = xlim,\n      ylim = ylim,\n      n = n,\n      na.rm = na.rm,\n      ...\n    )\n  )\n}\n\n\n#' @export\n#' @format NULL\n#' @usage NULL\n#' @rdname geom_hdr_points_fun\nStatHdrPointsFun <- ggproto(\"StatHdrPointsFun\", StatHdrFun,\n  default_aes = aes(order = after_stat(probs), color = after_stat(probs)),\n  output = \"points\"\n)\n\n#' @export\n#' @rdname geom_hdr_points_fun\ngeom_hdr_points_fun <- function(mapping = NULL, data = NULL,\n                                stat = \"hdr_points_fun\", position = \"identity\",\n                                ...,\n                                na.rm = FALSE,\n                                show.legend = NA,\n                                inherit.aes = TRUE) {\n\n  if (is.null(data)) data <- ensure_nonempty_data\n\n  layer(\n    data = data,\n    mapping = mapping,\n    stat = stat,\n    geom = GeomPoint,\n    position = position,\n    show.legend = show.legend,\n    inherit.aes = inherit.aes,\n    params = list(\n      na.rm = na.rm,\n      ...\n    )\n  )\n}\n\n"
  },
  {
    "path": "R/hdr_rug.R",
    "content": "#' Rug plots of marginal highest density region estimates\n#'\n#' Perform 1D density estimation, compute and plot the resulting highest density\n#' regions in a way similar to [ggplot2::geom_rug()].\n#' Note, the plotted objects have probabilities mapped to the `alpha` aesthetic by default.\n#'\n#' @section Aesthetics: geom_hdr_rug understands the following aesthetics (required\n#'   aesthetics are in bold):\n#'\n#'   - x\n#'   - y\n#'   - alpha\n#'   - fill\n#'   - group\n#'   - subgroup\n#'\n#' @section Computed variables:\n#'\n#'   \\describe{ \\item{probs}{The probability of the highest density region, specified\n#'   by `probs`, corresponding to each point.} }\n#'\n#' @inheritParams ggplot2::geom_rug\n#' @inheritParams stat_hdr\n#' @param method,method_y Density estimator(s) to use.\n#'   By default `method` is used for both x- and y-axis.\n#'   If specified, `method_y` will be used for y-axis.\n#'   Accepts character vector: `\"kde\"`,`\"histogram\"`, `\"freqpoly\"`, or `\"norm\"`.\n#'   Alternatively accepts functions  which return closures corresponding to density estimates,\n#'   see `?get_hdr_1d` or `vignette(\"method\", \"ggdensity\")`.\n#' @name geom_hdr_rug\n#' @rdname geom_hdr_rug\n#'\n#' @import ggplot2\n#'\n#' @examples\n#' set.seed(1)\n#' df <- data.frame(x = rnorm(100), y = rnorm(100))\n#'\n#' # Plot marginal HDRs for bivariate data\n#' ggplot(df, aes(x, y)) +\n#'   geom_point() +\n#'   geom_hdr_rug() +\n#'   coord_fixed()\n#'\n#' ggplot(df, aes(x, y)) +\n#'   geom_hdr() +\n#'   geom_hdr_rug() +\n#'   coord_fixed()\n#'\n#' # Plot HDR for univariate data\n#' ggplot(df, aes(x)) +\n#'   geom_density() +\n#'   geom_hdr_rug()\n#'\n#' ggplot(df, aes(y = y)) +\n#'   geom_density() +\n#'   geom_hdr_rug()\n#'\n#' # Specify location of marginal HDRs as in ggplot2::geom_rug()\n#' ggplot(df, aes(x, y)) +\n#'   geom_hdr() +\n#'   geom_hdr_rug(sides = \"tr\", outside = TRUE) +\n#'   coord_fixed(clip = \"off\")\n#'\n#' # Can use same methods of density estimation as geom_hdr().\n#' # For data with constrained support, we suggest setting method = \"histogram\":\n#' ggplot(df, aes(x^2)) +\n#'  geom_histogram(bins = 30, boundary = 0) +\n#'  geom_hdr_rug(method = \"histogram\")\n#'\n#' ggplot(df, aes(x^2, y^2)) +\n#'  geom_hdr(method = \"histogram\") +\n#'  geom_hdr_rug(method = \"histogram\") +\n#'  coord_fixed()\n#'\nNULL\n\n\n\n\n\n\n#' @rdname geom_hdr_rug\n#' @export\nstat_hdr_rug <- function(mapping = NULL, data = NULL,\n                         geom = \"hdr_rug\", position = \"identity\",\n                         ...,\n                         method = \"kde\",\n                         method_y = \"kde\",\n                         probs = c(.99, .95, .8, .5),\n                         xlim = NULL,\n                         ylim = NULL,\n                         n = 512,\n                         na.rm = FALSE,\n                         show.legend = NA,\n                         inherit.aes = TRUE) {\n  layer(\n    data = data,\n    mapping = mapping,\n    stat = StatHdrRug,\n    geom = geom,\n    position = position,\n    show.legend = show.legend,\n    inherit.aes = inherit.aes,\n    params = list(\n      method = method,\n      method_y = method_y,\n      probs = probs,\n      xlim = xlim,\n      ylim = ylim,\n      n = n,\n      na.rm = na.rm,\n      ...\n    )\n  )\n}\n\n\n\n#' @rdname geom_hdr_rug\n#' @format NULL\n#' @usage NULL\n#' @export\nStatHdrRug <- ggproto(\"StatHdrRug\", Stat,\n\n  required_aes = c(\"x|y\"),\n  default_aes = aes(alpha = after_stat(probs)),\n\n  compute_group = function(data, scales, na.rm = FALSE,\n                           method = \"kde\", method_y = NULL,\n                           probs = c(.99, .95, .8, .5),\n                           xlim = NULL, ylim = NULL, n = 512) {\n\n    # Recycle for both x, y\n    if (length(n) == 1) n <- rep(n, 2)\n\n    # If no alternative method_y, use method\n    if (is.null(method_y)) method_y <- method\n\n\n    # Estimate marginal densities\n\n    # Initialize dfs for x and y axes,\n    # in case only x or y are supplied:\n    df_x <- data.frame()\n    df_y <- data.frame()\n\n    if (!is.null(data$x)) {\n\n      rangex <- xlim %||% scales$x$dimension()\n\n      res_x <- get_hdr_1d(data$x, method, probs, n[1], rangex, hdr_membership = FALSE)\n\n      df_x <- res_to_df_1d(res_x, probs, data$group[1], output = \"rug\")\n\n      # Needs correct name for ggplot2 internals\n      df_x$axis <- \"x\"\n      df_x$y <- NA\n\n    }\n\n\n    if (!is.null(data$y)) {\n\n      rangey <- ylim %||% scales$y$dimension()\n\n      res_y <- get_hdr_1d(data$y, method_y, probs, n[2], rangey, hdr_membership = FALSE)\n\n      df_y <- res_to_df_1d(res_y, probs, data$group[1], output = \"rug\")\n\n      # Needs correct name for ggplot2 internals\n      df_y$axis <- \"y\"\n      df_y$y <- df_y$x\n      df_y$x <- NA\n\n    }\n\n    df <- rbind(df_x, df_y)\n\n    # Need to remove extra col if only plotting x or y rug\n    if (is.null(data$x)) df$x <- NULL\n    if (is.null(data$y)) df$y <- NULL\n\n    df\n\n    }\n  )\n\n\nres_to_df_1d <- function(res, probs, group, output) {\n\n  probs <- fix_probs(probs)\n\n  if (output == \"rug\") {\n\n    probs_formatted <- scales::percent_format(accuracy = 1)(probs)\n\n    df <- res$df_est\n\n    # alpha will be mapped to df$probs\n    df$probs <- scales::percent_format(accuracy = 1)(df$hdr)\n    df$probs <- ordered(df$probs, levels = probs_formatted)\n    df$hdr <- NULL\n\n    # Discard 100% HDR if it's not in probs:\n    df <- df[!is.na(df$probs),]\n\n  }\n\n  df\n\n}\n\n\n\n#' @rdname geom_hdr_rug\n#' @export\ngeom_hdr_rug <- function(mapping = NULL, data = NULL,\n                         stat = \"hdr_rug\", position = \"identity\",\n                         ...,\n                         outside = FALSE,\n                         sides = \"bl\",\n                         length = unit(0.03, \"npc\"),\n                         na.rm = FALSE,\n                         show.legend = NA,\n                         inherit.aes = TRUE) {\n  layer(\n    data = data,\n    mapping = mapping,\n    stat = stat,\n    geom = GeomHdrRug,\n    position = position,\n    show.legend = show.legend,\n    inherit.aes = inherit.aes,\n    params = list(\n      outside = outside,\n      sides = sides,\n      length = length,\n      na.rm = na.rm,\n      ...\n    )\n  )\n}\n\n\n#' @rdname geom_hdr_rug\n#' @format NULL\n#' @usage NULL\n#' @export\nGeomHdrRug <- ggproto(\"GeomHdrRug\", Geom,\n   optional_aes = c(\"x\", \"y\"),\n\n   draw_panel = function(data, panel_params, coord, sides = \"bl\",\n                         outside = FALSE, length = unit(0.03, \"npc\")) {\n\n     if (!inherits(length, \"unit\")) {\n       abort(\"'length' must be a 'unit' object.\")\n     }\n     rugs <- list()\n\n     # For coord_flip, coord$tranform does not flip the sides where to\n     # draw the rugs. We have to flip them.\n     if (inherits(coord, 'CoordFlip')) {\n       sides <- chartr('tblr', 'rlbt', sides)\n     }\n\n     # move the rug to outside the main plot space\n     if (outside) length <- -length\n\n     # Set up data frames for x and y:\n     data_x <- data[data$axis == \"x\",]\n     data_y <- data[data$axis == \"y\",]\n\n\n     if (nrow(data_x) > 0) {\n\n       data_x <- coord$transform(data_x, panel_params)\n       data_x$width <- resolution(data_x$x, FALSE)\n\n       gp_x <- grid::gpar(\n         col = alpha(data_x$fill, data_x$alpha),\n         fill = alpha(data_x$fill, data_x$alpha),\n         lwd = 0\n       )\n\n       # set up x axis rug rasters\n       if (grepl(\"b\", sides)) {\n         rugs$x_b <- grid::rectGrob(\n           x = unit(data_x$x, \"native\"),\n           y = unit(0, \"npc\"),\n           width = data_x$width,\n           height = length,\n           just = \"bottom\",\n           gp = gp_x\n         )\n       }\n\n       if (grepl(\"t\", sides)) {\n         rugs$x_t <- grid::rectGrob(\n           x = unit(data_x$x, \"native\"),\n           y = unit(1, \"npc\"),\n           width = data_x$width,\n           height = length,\n           just = \"top\",\n           gp = gp_x\n         )\n       }\n     }\n\n     if (nrow(data_y) > 0) {\n\n       data_y <- coord$transform(data_y, panel_params)\n       data_y$height <- resolution(data_y$y, FALSE)\n\n       gp_y <- grid::gpar(\n         col = alpha(data_y$fill, data_y$alpha),\n         fill = alpha(data_y$fill, data_y$alpha),\n         lwd = 0\n       )\n\n\n       # set up y axis rug rasters\n       if (grepl(\"l\", sides)) {\n         rugs$y_l <- grid::rectGrob(\n           x = unit(0, \"npc\"),\n           y = unit(data_y$y, \"native\"),\n           width = length,\n           height = data_y$height,\n           just = \"left\",\n           gp = gp_y\n         )\n       }\n\n       if (grepl(\"r\", sides)) {\n         rugs$y_r <- grid::rectGrob(\n           x = unit(1, \"npc\"),\n           y = unit(data_y$y, \"native\"),\n           width = length,\n           height = data_y$height,\n           just = \"right\",\n           gp = gp_y\n         )\n       }\n\n     }\n\n     grid::gTree(children = do.call(grid::gList, rugs))\n\n   },\n\n  default_aes = aes(fill = \"grey20\", alpha = NA),\n\n  draw_key = draw_key_rect\n)\n\n\n\n\n\n"
  },
  {
    "path": "R/hdr_rug_fun.R",
    "content": "#' Rug plots of highest density region estimates of univariate pdfs\n#'\n#' Compute and plot the highest density regions (HDRs) of specified univariate pdf(s).\n#' Note, the plotted objects have probabilities mapped to the `alpha` aesthetic by default.\n#'\n#' @section Aesthetics: `geom_hdr_rug_fun()` understands the following aesthetics (required\n#'   aesthetics are in bold):\n#'\n#'   - x\n#'   - y\n#'   - alpha\n#'   - fill\n#'   - group\n#'   - subgroup\n#'\n#' @section Computed variables:\n#'\n#'   \\describe{ \\item{probs}{The probability of the highest density region, specified\n#'   by `probs`, corresponding to each point.} }\n#'\n#' @inheritParams ggplot2::geom_rug\n#' @inheritParams stat_hdr_rug\n#' @param fun_x,fun_y Functions, the univariate probability density function for the x- and/or y-axis.\n#'   First argument must be vectorized.\n#' @param args_x,args_y Named list of additional arguments passed on to `fun_x` and/or `fun_y`.\n#' @name geom_hdr_rug_fun\n#' @rdname geom_hdr_rug_fun\n#'\n#' @examples\n#' # Plotting data with exponential marginals\n#' df <- data.frame(x = rexp(1e3), y = rexp(1e3))\n#'\n#' ggplot(df, aes(x, y)) +\n#'   geom_hdr_rug_fun(fun_x = dexp, fun_y = dexp) +\n#'   geom_point(size = .5) +\n#'   coord_fixed()\n#'\n#' # without data/aesthetic mappings\n#' ggplot() +\n#'   geom_hdr_rug_fun(fun_x = dexp, fun_y = dexp, xlim = c(0, 7), ylim = c(0, 7)) +\n#'   coord_fixed()\n#'\n#'\n#' # Plotting univariate normal data, estimating mean and sd\n#' df <- data.frame(x = rnorm(1e4, mean = 1, sd = 3))\n#'\n#' # estimating parameters\n#' mu_hat <- mean(df$x)\n#' sd_hat <- sd(df$x)\n#'\n#' ggplot(df, aes(x)) +\n#'   geom_hdr_rug_fun(fun_x = dnorm, args_x = list(mean = mu_hat, sd = sd_hat)) +\n#'   geom_density()\n#'\n#' # Equivalent to `method_norm_1d()` with `geom_hdr_rug()`\n#' ggplot(df, aes(x)) +\n#'   geom_hdr_rug(method = method_norm_1d()) +\n#'   geom_density()\nNULL\n\n\n#' @rdname geom_hdr_rug_fun\n#' @export\nstat_hdr_rug_fun <- function(mapping = NULL, data = NULL,\n  geom = \"hdr_rug_fun\", position = \"identity\",\n  ...,\n  fun_x = NULL, fun_y = NULL,\n  args_x = list(), args_y = list(),\n  probs = c(.99, .95, .8, .5),\n  xlim = NULL, ylim = NULL, n = 512,\n  na.rm = FALSE,\n  show.legend = NA,\n  inherit.aes = TRUE) {\n\n  if (is.null(data)) data <- ensure_nonempty_data\n\n  layer(\n    data = data,\n    mapping = mapping,\n    stat = StatHdrRugFun,\n    geom = geom,\n    position = position,\n    show.legend = show.legend,\n    inherit.aes = inherit.aes,\n    params = list(\n      fun_x = fun_x,\n      fun_y = fun_y,\n      args_x = args_x,\n      args_y = args_y,\n      probs = probs,\n      xlim = xlim,\n      ylim = ylim,\n      n = n,\n      na.rm = na.rm,\n      ...\n    )\n  )\n}\n\n\n\n#' @rdname geom_hdr_rug_fun\n#' @format NULL\n#' @usage NULL\n#' @export\nStatHdrRugFun <- ggproto(\"StatHdrRugFun\", Stat,\n\n  default_aes = aes(order = after_stat(probs), alpha = after_stat(probs)),\n  # if fun_x or fun_y are unspecified data might be dropped\n  dropped_aes = c(\"x\", \"y\"),\n\n  # very similar to StatHdrRug$compute_group(),\n  # only difference are the parameters fun + args (vs. method + parameters)\n  # -- this prevents factoring into one compute_group() method,\n  #    compute_group()'s arguments are different.\n  compute_group = function(self, data, scales, na.rm = FALSE,\n                           fun_x = NULL, fun_y = NULL, args_x = list(), args_y = list(),\n                           probs = c(.99, .95, .8, .5),\n                           n = 512, xlim = NULL, ylim = NULL) {\n\n\n    # Recycle for both x, y\n    if (length(n) == 1) n <- rep(n, 2)\n\n    # Estimate marginal densities\n\n    # Initialize dfs for x and y axes,\n    # in case only x or y are supplied:\n    df_x <- data.frame()\n    df_y <- data.frame()\n\n\n    if (!is.null(fun_x)) {\n\n      if (is.null(xlim) & is.null(scales$x)) {\n        stop(\"`xlim` must be specified if `x` aesthetic not provided to `StatHdrRugFun`\")\n      }\n\n      rangex <- xlim %||% scales$x$dimension()\n\n      res_x <- get_hdr_1d(data$x, method = \"fun\", probs, n[1], rangex, hdr_membership = FALSE, fun = fun_x, args = args_x)\n\n      df_x <- res_to_df_1d(res_x, probs, data$group[1], output = \"rug\")\n\n      # Needs correct name for ggplot2 internals\n      df_x$axis <- \"x\"\n      df_x$y <- NA\n\n    }\n\n\n    if (!is.null(fun_y)) {\n\n      if (is.null(ylim) & is.null(scales$y)) {\n        stop(\"`ylim` must be specified if `y` aesthetic not provided to `StatHdrRugFun`\")\n      }\n\n      rangey <- ylim %||% scales$y$dimension()\n\n      res_y <- get_hdr_1d(data$y, method = \"fun\", probs, n[1], rangey, hdr_membership = FALSE, fun = fun_y, args = args_y)\n\n      df_y <- res_to_df_1d(res_y, probs, data$group[1], output = \"rug\")\n\n      # Needs correct name for ggplot2 internals\n      df_y$axis <- \"y\"\n      df_y$y <- df_y$x\n      df_y$x <- NA\n\n    }\n\n    df <- rbind(df_x, df_y)\n\n    # Need to remove extra col if only plotting x or y rug\n    if (is.null(fun_x)) df$x <- NULL\n    if (is.null(fun_y)) df$y <- NULL\n\n    df\n\n\n  }\n)\n\n\n\n#' @rdname geom_hdr_rug_fun\n#' @export\ngeom_hdr_rug_fun <- function(mapping = NULL, data = NULL,\n                         stat = \"hdr_rug_fun\", position = \"identity\",\n                         ...,\n                         outside = FALSE,\n                         sides = \"bl\",\n                         length = unit(0.03, \"npc\"),\n                         na.rm = FALSE,\n                         show.legend = NA,\n                         inherit.aes = TRUE) {\n\n  if (is.null(data)) data <- ensure_nonempty_data\n\n  layer(\n    data = data,\n    mapping = mapping,\n    stat = stat,\n    geom = GeomHdrRugFun,\n    position = position,\n    show.legend = show.legend,\n    inherit.aes = inherit.aes,\n    params = list(\n      outside = outside,\n      sides = sides,\n      length = length,\n      na.rm = na.rm,\n      ...\n    )\n  )\n}\n\n\n#' @rdname geom_hdr_rug_fun\n#' @format NULL\n#' @usage NULL\n#' @export\nGeomHdrRugFun <- ggproto(\"GeomHdrRugFun\", GeomHdrRug)\n\n\n\n\n\n"
  },
  {
    "path": "R/helpers-ggplot2.R",
    "content": "# unexported functions from ggplot2\n\n`%||%` <- function(x, y) {\n  if (is.null(x)) y else x\n}\n\ntibble0 <- function(...) {\n  tibble::tibble(..., .name_repair = \"minimal\")\n}\n\nunique0 <- function(x, ...) {\n  if (is.null(x)) x else vctrs::vec_unique(x, ...)\n}\n\nisoband_z_matrix <- function(data) {\n  x_pos <- as.integer(factor(data$x, levels = sort(unique0(data$x))))\n  y_pos <- as.integer(factor(data$y, levels = sort(unique0(data$y))))\n  nrow <- max(y_pos)\n  ncol <- max(x_pos)\n  raster <- matrix(NA_real_, nrow = nrow, ncol = ncol)\n  raster[cbind(y_pos, x_pos)] <- data$z\n  raster\n}\n\nxyz_to_isobands <- function(data, breaks) {\n  isoband::isobands(x = sort(unique0(data$x)), y = sort(unique0(data$y)),\n                    z = isoband_z_matrix(data), levels_low = breaks[-length(breaks)],\n                    levels_high = breaks[-1])\n}\n\nxyz_to_isolines <- function(data, breaks) {\n  isoband::isolines(x = sort(unique0(data$x)), y = sort(unique0(data$y)),\n                    z = isoband_z_matrix(data), levels = breaks)\n}\n\niso_to_polygon <- function(iso, group = 1) {\n  lengths <- vapply(iso, function(x) length(x$x), integer(1))\n  if (all(lengths == 0)) {\n    warning(\"Zero contours were generated\")\n    return(tibble0())\n  }\n  levels <- names(iso)\n  xs <- unlist(lapply(iso, \"[[\", \"x\"), use.names = FALSE)\n  ys <- unlist(lapply(iso, \"[[\", \"y\"), use.names = FALSE)\n  ids <- unlist(lapply(iso, \"[[\", \"id\"), use.names = FALSE)\n  item_id <- rep(seq_along(iso), lengths)\n  groups <- paste(group, sprintf(\"%03d\", item_id), sep = \"-\")\n  groups <- factor(groups)\n  tibble0(level = rep(levels, lengths), x = xs, y = ys,\n              piece = as.integer(groups), group = groups, subgroup = ids,\n              .size = length(xs))\n}\n\niso_to_path <- function(iso, group = 1) {\n  lengths <- vapply(iso, function(x) length(x$x), integer(1))\n  if (all(lengths == 0)) {\n    warning(\"Zero contours were generated\")\n    return(tibble0())\n  }\n  levels <- names(iso)\n  xs <- unlist(lapply(iso, \"[[\", \"x\"), use.names = FALSE)\n  ys <- unlist(lapply(iso, \"[[\", \"y\"), use.names = FALSE)\n  ids <- unlist(lapply(iso, \"[[\", \"id\"), use.names = FALSE)\n  item_id <- rep(seq_along(iso), lengths)\n  groups <- paste(group, sprintf(\"%03d\", item_id), sprintf(\"%03d\",\n                                                           ids), sep = \"-\")\n  groups <- factor(groups)\n  tibble0(level = rep(levels, lengths), x = xs, y = ys,\n              piece = as.integer(groups), group = groups, .size = length(xs))\n}\n\nempty <- function(df) {\n  is.null(df) || nrow(df) == 0 || ncol(df) == 0 || inherits(df, \"waiver\")\n}\n\nensure_nonempty_data <- function(data) {\n  if (empty(data)) {\n    tibble0(group = 1, .size = 1)\n  }\n  else {\n    data\n  }\n}\n"
  },
  {
    "path": "R/helpers.R",
    "content": "# this script contains several unexported helper functions\n\n# normalization/scaling functions\nnormalize <- function(v) v / sum(v)\n\n# numerical approximation for finding hdr\n# if method = \"histogram\", don't want to use uniroot, runs into issue if n is small\nfind_cutoff <- function(df, conf, uniroot = TRUE) {\n\n  if (length(conf) > 1) return(vapply(conf, function(x) find_cutoff(df, x, uniroot), numeric(1)))\n\n  # sort df rows by fhat\n  df <- df[order(df$fhat, decreasing = TRUE),]\n\n  # compute cumsum of probs\n  df$cumprob <- cumsum(df$fhat_discretized)\n\n  # determine cutoff\n  max(df[df$cumprob >= conf,]$fhat)\n\n}\n"
  },
  {
    "path": "R/method.R",
    "content": "# methods that return est pdf as closure  ---------------------------------\n\n#' Bivariate parametric normal HDR estimator\n#'\n#' Function used to specify bivariate normal density estimator\n#' for `get_hdr()` and layer functions (e.g. `geom_hdr()`).\n#'\n#' For more details on the use and implementation of the `method_*()` functions,\n#' see `vignette(\"method\", \"ggdensity\")`.\n#'\n#' @examples\n#' # Normal estimator is useful when an assumption of normality is appropriate\n#' set.seed(1)\n#' df <- data.frame(x = rnorm(1e3), y = rnorm(1e3))\n#'\n#' ggplot(df, aes(x, y)) +\n#'   geom_hdr(method = method_mvnorm(), xlim = c(-4, 4), ylim = c(-4, 4)) +\n#'   geom_point(size = 1)\n#'\n#' # Can also be used with `get_hdr()` for numerical summary of HDRs\n#' res <- get_hdr(df, method = method_mvnorm())\n#' str(res)\n#'\n#' @export\nmethod_mvnorm <- function() {\n\n  function(data) {\n\n    data_matrix <- with(data, cbind(x, y))\n    mu_hat <- colMeans(data_matrix)\n    R <- chol(cov(data_matrix)) # R'R = crossprod(R) = S\n\n    function(x, y) {\n      X <- cbind(x, y)\n      tmp <- backsolve(R, t(X) - mu_hat, transpose = TRUE)\n      logretval <- -sum(log(diag(R))) - log(2 * pi) - 0.5 * colSums(tmp^2)\n      exp( logretval )\n    }\n\n  }\n\n}\n\n# methods that return closures that compute a grid ------------------------\n\n#' Bivariate kernel density HDR estimator\n#'\n#' Function used to specify bivariate kernel density estimator\n#' for `get_hdr()` and layer functions (e.g. `geom_hdr()`).\n#'\n#' For more details on the use and implementation of the `method_*()` functions,\n#' see `vignette(\"method\", \"ggdensity\")`.\n#'\n#' @inheritParams ggplot2::stat_density2d\n#'\n#' @examples\n#' set.seed(1)\n#' df <- data.frame(x = rnorm(1e3, sd = 3), y = rnorm(1e3, sd = 3))\n#'\n#' ggplot(df, aes(x, y)) +\n#'   geom_hdr(method = method_kde()) +\n#'   geom_point(size = 1)\n#'\n#' # The defaults of `method_kde()` are the same as the estimator for `ggplot2::geom_density_2d()`\n#' ggplot(df, aes(x, y)) +\n#'   geom_density_2d_filled() +\n#'   geom_hdr_lines(method = method_kde(), probs = seq(.1, .9, by = .1)) +\n#'   theme(legend.position = \"none\")\n#'\n#' # The bandwidth of the estimator can be set directly with `h` or scaled with `adjust`\n#' ggplot(df, aes(x, y)) +\n#'   geom_hdr(method = method_kde(h = 1)) +\n#'   geom_point(size = 1)\n#'\n#' ggplot(df, aes(x, y)) +\n#'   geom_hdr(method = method_kde(adjust = 1/2)) +\n#'   geom_point(size = 1)\n#'\n#' # Can also be used with `get_hdr()` for numerical summary of HDRs\n#' res <- get_hdr(df, method = method_kde())\n#' str(res)\n#'\n#' @export\nmethod_kde <- function(h = NULL, adjust = c(1, 1)) {\n\n  function(data, n, rangex, rangey) {\n\n    if (is.null(h)) {\n      h <- c(MASS::bandwidth.nrd(data$x), MASS::bandwidth.nrd(data$y))\n    }\n\n    h <- h * adjust\n\n    kdeout <- MASS::kde2d(\n      x = data$x, y = data$y, n = n, h = h,\n      lims = c(rangex, rangey)\n    )\n\n    df <- with(kdeout, expand.grid(\"x\" = x, \"y\" = y))\n    df$fhat <- as.vector(kdeout$z)\n\n    df\n\n  }\n}\n\n#' Bivariate histogram HDR estimator\n#'\n#' Function used to specify bivariate histogram density estimator\n#' for `get_hdr()` and layer functions (e.g. `geom_hdr()`).\n#'\n#' For more details on the use and implementation of the `method_*()` functions,\n#' see `vignette(\"method\", \"ggdensity\")`.\n#'\n#' @param bins Number of bins along each axis.\n#'   Either a vector of length 2 or a scalar value which is recycled for both dimensions.\n#'   Defaults to normal reference rule (Scott, pg 87).\n#' @param smooth If `TRUE`, HDRs are smoothed by the marching squares algorithm.\n#' @param nudgex,nudgey Horizontal and vertical rules for choosing witness points when `smooth == TRUE`.\n#'   Accepts character vector: `\"left\"`, `\"none\"`, `\"right\"` (`nudgex`) or  `\"down\"`, `\"none\"`, `\"up\"` (`nudgey`).\n#'\n#' @references Scott, David W. Multivariate Density Estimation (2e), Wiley.\n#'\n#' @examples\n#' \\dontrun{\n#'\n#' # Histogram estimators can be useful when data has boundary constraints\n#' set.seed(1)\n#' df <- data.frame(x = rexp(1e3), y = rexp(1e3))\n#'\n#' ggplot(df, aes(x, y)) +\n#'   geom_hdr(method = method_histogram()) +\n#'   geom_point(size = 1)\n#'\n#' # The resolution of the histogram estimator can be set via `bins`\n#' ggplot(df, aes(x, y)) +\n#'   geom_hdr(method = method_histogram(bins = c(8, 25))) +\n#'   geom_point(size = 1)\n#'\n#' # By setting `smooth = TRUE`, we can graphically smooth the \"blocky\" HDRs\n#' ggplot(df, aes(x, y)) +\n#'   geom_hdr(method = method_histogram(smooth = TRUE)) +\n#'   geom_point(size = 1)\n#'\n#' # However, we need to set `nudgex` and `nudgey` to align the HDRs correctly\n#' ggplot(df, aes(x, y)) +\n#'   geom_hdr(method = method_histogram(smooth = TRUE, nudgex = \"left\", nudgey = \"down\")) +\n#'   geom_point(size = 1)\n#'\n#' # Can also be used with `get_hdr()` for numerical summary of HDRs\n#' res <- get_hdr(df, method = method_histogram())\n#' str(res)\n#' }\n#'\n#' @export\nmethod_histogram <- function(bins = NULL, smooth = FALSE, nudgex = \"none\", nudgey = \"none\") {\n\n  # n is an argument, but it is not used\n  function(data, n, rangex, rangey) {\n\n    if (is.null(bins)) {\n      bins <- numeric(2)\n\n      # define histogram mesh according to Scott p. 87\n      rho <- cor(data$x, data$y)\n      hx <- 3.504 * sd(data$x) * (1 - rho^2)^(3/8) * nrow(data)^(-1/4)\n      hy <- 3.504 * sd(data$y) * (1 - rho^2)^(3/8) * nrow(data)^(-1/4)\n      bins[1] <- round((rangex[2] - rangex[1]) / hx)\n      bins[2] <- round((rangey[2] - rangey[1]) / hy)\n\n    } else if (length(bins == 1)) {\n      bins <- rep(bins, 2)\n    }\n\n    xvals <- data$x\n    yvals <- data$y\n\n    xbtwn <- (rangex[1] <= xvals & xvals <= rangex[2])\n    if (!all(xbtwn)) {\n      xvals <- xvals[xbtwn]\n      yvals <- yvals[xbtwn]\n    }\n\n    ybtwn <- (rangey[1] <= yvals & yvals <= rangey[2])\n    if (!all(ybtwn)) {\n      xvals <- xvals[ybtwn]\n      yvals <- yvals[ybtwn]\n    }\n\n    sx <- seq(rangex[1], rangex[2], length.out = bins[1] + 1)\n    sy <- seq(rangey[1], rangey[2], length.out = bins[2] + 1)\n    de_x <- sx[2] - sx[1]\n    de_y <- sy[2] - sy[1]\n    box_area <- de_x * de_y\n\n    xbin_mdpts <- sx[-(bins[1]+1)] + de_x/2\n    ybin_mdpts <- sy[-(bins[2]+1)] + de_y/2\n\n    xleft <- sx[-(bins[1]+1)]\n    xright <- sx[-1]\n\n    ybottom <- sy[-(bins[2]+1)]\n    ytop <- sy[-1]\n\n\n    df_cuts <- data.frame(\"xbin\" = cut(xvals, sx), \"ybin\" = cut(yvals, sy))\n\n    df <- with(df_cuts, expand.grid(\"xbin\" = levels(xbin), \"ybin\" = levels(ybin)))\n    df$n <- with(df_cuts, as.vector(table(xbin, ybin)))\n\n    df$xbin_midpt <- xbin_mdpts[as.integer(df$xbin)]\n    df$ybin_midpt <- ybin_mdpts[as.integer(df$ybin)]\n\n    df$xmin <- df$xbin_midpt - de_x/2\n    df$xmax <- df$xbin_midpt + de_x/2\n    df$de_x <- de_x\n\n    df$ymin <- df$ybin_midpt - de_y/2\n    df$ymax <- df$ybin_midpt + de_y/2\n    df$de_y <- de_y\n\n    df$fhat <- with(df, n / (sum(n) * box_area))\n\n\n    if (smooth) {\n\n      if(nudgex == \"left\") df$x <- df$xmin\n      if(nudgex == \"none\") df$x <- df$xbin_midpt\n      if(nudgex == \"right\") df$x <- df$xmax\n\n      if(nudgey == \"down\") df$y <- df$ymin\n      if(nudgey == \"none\") df$y <- df$ybin_midpt\n      if(nudgey == \"up\") df$y <- df$ymax\n\n    } else {\n\n      # No nudging if not smoothing\n      df$x <- df$xbin_midpt\n      df$y <- df$ybin_midpt\n\n      # Evaluate histogram on a grid\n      # For xyz_to_iso* funs, need tightly packed values for good isobands/lines\n      # k*k points per histogram footprint\n      # Higher values of k -> better visuals, more computationally expensive\n\n      # Currently determining k heuristically - not based on any theoretical results\n      # The necessary value of k seems to be O((bins[1]*bins[2])^(-1/3))\n      # found constant which yields k = 50 for bins[1]*bins[2] = 10^2\n      k <- if (bins[1] * bins[2] > 10^2) max(floor(225/((bins[1] * bins[2])^(1/3))), 5) else 50\n\n      bbins <- bins * k\n\n      ssx <- seq(rangex[1], rangex[2], length.out = bbins[1])\n      ssy <- seq(rangey[1], rangey[2], length.out = bbins[2])\n\n      ddf <- expand.grid(x = ssx, y = ssy)\n\n      # Need fhat repeated in very particular way for grid:\n      #   e.g.\n      #      k = 2\n      #      df$fhat = 1, 2,\n      #                3, 4\n      #     ddf$fhat = 1, 1, 2, 2,\n      #                1, 1, 2, 2,\n      #                3, 3, 4, 4,\n      #                3, 3, 4, 4\n\n      # m <- matrix(df$fhat, nrow = bins[2], byrow = TRUE)\n      # ddf$fhat <- as.vector(kronecker(m, matrix(1, k, k)))\n\n      fhat <- split(df$fhat, factor(rep(1:bins[2], each = bins[1]))) # split into rows\n      fhat <- lapply(fhat, function(x) rep(x, each = k)) # repeat within rows (horizontal)\n      fhat <- lapply(fhat, function(x) rep(x, times = k)) # repeat rows (vertical)\n      fhat <- unlist(fhat) # concatenate\n      ddf$fhat <- fhat\n\n      df <- ddf\n    }\n\n    df[c(\"x\", \"y\", \"fhat\")]\n\n  }\n}\n\n#' Bivariate frequency polygon HDR estimator\n#'\n#' Function used to specify bivariate frequency polygon density estimator\n#' for `get_hdr()` and layer functions (e.g. `geom_hdr()`).\n#'\n#' For more details on the use and implementation of the `method_*()` functions,\n#' see `vignette(\"method\", \"ggdensity\")`.\n#'\n#' @inheritParams method_histogram\n#'\n#' @references Scott, David W. Multivariate Density Estimation (2e), Wiley.\n#'\n#' @examples\n#' set.seed(1)\n#' df <- data.frame(x = rnorm(1e3), y = rnorm(1e3))\n#'\n#' ggplot(df, aes(x, y)) +\n#'   geom_hdr(method = method_freqpoly()) +\n#'   geom_point(size = 1)\n#'\n#' # The resolution of the frequency polygon estimator can be set via `bins`\n#' ggplot(df, aes(x, y)) +\n#'   geom_hdr(method = method_freqpoly(bins = c(8, 25))) +\n#'   geom_point(size = 1)\n#'\n#' # Can also be used with `get_hdr()` for numerical summary of HDRs\n#' res <- get_hdr(df, method = method_freqpoly())\n#' str(res)\n#'\n#' @export\nmethod_freqpoly <- function(bins = NULL) {\n\n  # n is an argument, but it is not used\n  function(data, n, rangex, rangey) {\n\n    if (is.null(bins)) {\n      bins <- numeric(2)\n\n      # define histogram mesh according to Scott p. 87\n      # TODO: fill in with rules for frequency polygons\n      rho <- cor(data$x, data$y)\n      hx <- 3.504 * sd(data$x) * (1 - rho^2)^(3/8) * nrow(data)^(-1/4)\n      hy <- 3.504 * sd(data$y) * (1 - rho^2)^(3/8) * nrow(data)^(-1/4)\n      bins[1] <- round((rangex[2] - rangex[1]) / hx)\n      bins[2] <- round((rangey[2] - rangey[1]) / hy)\n\n    } else {\n      if (length(bins == 1)) bins <- rep(bins, 2)\n    }\n\n    xvals <- data$x\n    yvals <- data$y\n\n    xbtwn <- (rangex[1] <= xvals & xvals <= rangex[2])\n    if (!all(xbtwn)) {\n      xvals <- xvals[xbtwn]\n      yvals <- yvals[xbtwn]\n    }\n\n    ybtwn <- (rangey[1] <= yvals & yvals <= rangey[2])\n    if (!all(ybtwn)) {\n      xvals <- xvals[ybtwn]\n      yvals <- yvals[ybtwn]\n    }\n\n\n    de_x <- (rangex[2] - rangex[1]) / bins[1]\n    de_y <- (rangey[2] - rangey[1]) / bins[2]\n    rangex[1] <- rangex[1] - de_x\n    rangex[2] <- rangex[2] + de_x\n    rangey[1] <- rangey[1] - de_y\n    rangey[2] <- rangey[2] + de_y\n    bins <- bins + 2\n    sx <- seq(rangex[1], rangex[2], length.out = bins[1] + 1)\n    sy <- seq(rangey[1], rangey[2], length.out = bins[2] + 1)\n\n\n    box_area <- de_x * de_y\n\n    xbin_mdpts <- sx[-(bins[1]+1)] + de_x/2\n    ybin_mdpts <- sy[-(bins[2]+1)] + de_y/2\n\n    xleft <- sx[-(bins[1]+1)]\n    xright <- sx[-1]\n\n    ybottom <- sy[-(bins[2]+1)]\n    ytop <- sy[-1]\n\n\n    df_cuts <- data.frame(\"xbin\" = cut(xvals, sx), \"ybin\" = cut(yvals, sy))\n\n    df <- with(df_cuts, expand.grid(\"xbin\" = levels(xbin), \"ybin\" = levels(ybin)))\n    df$n <- with(df_cuts, as.vector(table(xbin, ybin)))\n\n    df$xbin_midpt <- xbin_mdpts[as.integer(df$xbin)]\n    df$ybin_midpt <- ybin_mdpts[as.integer(df$ybin)]\n\n    df$xmin <- df$xbin_midpt - de_x/2\n    df$xmax <- df$xbin_midpt + de_x/2\n    df$de_x <- de_x\n\n    df$ymin <- df$ybin_midpt - de_y/2\n    df$ymax <- df$ybin_midpt + de_y/2\n    df$de_y <- de_y\n\n    df$fhat <- with(df, n / (sum(n) * box_area))\n    df$fhat_discretized <- normalize(df$fhat)\n\n    grid <- expand.grid(\n      x = sx[2:bins[1]],\n      y = sy[2:bins[2]]\n    )\n\n    x_midpts <- unique(df$xbin_midpt)\n    y_midpts <- unique(df$ybin_midpt)\n\n    find_A <- function(coords) {\n      x <- coords[[1]]\n      y <- coords[[2]]\n\n      row <- data.frame(\n        x1 = max(x_midpts[x_midpts - x < 0]),\n        x2 = min(x_midpts[x_midpts - x >= 0]),\n        y1 = max(y_midpts[y_midpts - y < 0]),\n        y2 = min(y_midpts[y_midpts - y >= 0])\n      )\n\n      row$fQ11 <- df[df$xbin_midpt == row$x1 & df$ybin_midpt == row$y1, \"fhat\"]\n      row$fQ21 <- df[df$xbin_midpt == row$x2 & df$ybin_midpt == row$y1, \"fhat\"]\n      row$fQ12 <- df[df$xbin_midpt == row$x1 & df$ybin_midpt == row$y2, \"fhat\"]\n      row$fQ22 <- df[df$xbin_midpt == row$x2 & df$ybin_midpt == row$y2, \"fhat\"]\n\n      xy_mat <- with(row, matrix(c(\n        x2 * y2, -x2 * y1, -x1 * y2, x1 * y1,\n        -y2, y1, y2, -y1,\n        -x2, x2, x1, -x1,\n        1, -1, -1, 1\n      ), nrow = 4, byrow = TRUE))\n\n      A <- with(row,\n        1 / ((x2 - x1) * (y2 - y1)) * xy_mat %*% c(fQ11, fQ12, fQ21, fQ22)\n      )\n\n      row$a00 <- A[1]\n      row$a10 <- A[2]\n      row$a01 <- A[3]\n      row$a11 <- A[4]\n\n      row\n    }\n\n\n    A_list <- apply(grid, 1, find_A, simplify = FALSE)\n    df_A <- do.call(rbind, A_list)\n\n    coeffs_to_surface <- function(row, k) {\n      sx <- seq(row[[\"x1\"]], row[[\"x2\"]], length.out = k)[-k]\n      sy <- seq(row[[\"y1\"]], row[[\"y2\"]], length.out = k)[-k]\n\n      fit <- function(x, y) row[[\"a00\"]] + row[[\"a10\"]] * x + row[[\"a01\"]] * y + row[[\"a11\"]] * x * y\n\n      df <- expand.grid(x = sx, y = sy)\n      df$fhat <- fit(df$x, df$y)\n\n      df\n    }\n\n\n    # Currently determining k heuristically - not based on any theoretical results\n    # The necessary value of k seems to be O((bins[1]*bins[2])^(-1/4))\n    k <- if (bins[1] * bins[2] > 10^2) max(floor(30/((bins[1] * bins[2])^(1/4))), 3) else 10\n\n    surface_list <- apply(df_A, 1, coeffs_to_surface, k, simplify = FALSE)\n    df <- do.call(rbind, surface_list)\n\n    df[c(\"x\",\"y\",\"fhat\")]\n\n  }\n}\n\n"
  },
  {
    "path": "R/method_1d.R",
    "content": "# methods that return est pdf as closure  ---------------------------------\n\n#' Univariate parametric normal HDR estimator\n#'\n#' Function used to specify univariate normal density estimator\n#' for `get_hdr_1d()` and layer functions (e.g. `geom_hdr_rug()`).\n#'\n#' For more details on the use and implementation of the `method_*_1d()` functions,\n#' see `vignette(\"method\", \"ggdensity\")`.\n#'\n#' @examples\n#' # Normal estimators are useful when an assumption of normality is appropriate\n#' df <- data.frame(x = rnorm(1e3))\n#'\n#' ggplot(df, aes(x)) +\n#'   geom_hdr_rug(method = method_norm_1d()) +\n#'   geom_density()\n#'\n#' # Can also be used with `get_hdr_1d()` for numerical summary of HDRs\n#' res <- get_hdr_1d(df$x, method = method_norm_1d())\n#' str(res)\n#'\n#' @export\nmethod_norm_1d <- function() {\n\n  function(x) {\n\n    mu_hat <- mean(x)\n    sigma_hat <- sd(x)\n\n    function(x) dnorm(x, mu_hat, sigma_hat)\n\n  }\n}\n\n# methods that return closures that compute a grid ------------------------\n\n#' Univariate kernel density HDR estimator\n#'\n#' Function used to specify univariate kernel density estimator\n#' for `get_hdr_1d()` and layer functions (e.g. `geom_hdr_rug()`).\n#'\n#' For more details on the use and implementation of the `method_*_1d()` functions,\n#' see `vignette(\"method\", \"ggdensity\")`.\n#'\n#' @inheritParams stats::density\n#'\n#' @examples\n#' df <- data.frame(x = rnorm(1e3, sd = 3))\n#'\n#' ggplot(df, aes(x)) +\n#'   geom_hdr_rug(method = method_kde_1d()) +\n#'   geom_density()\n#'\n#' # Details of the KDE can be adjusted with arguments to `method_kde_1d()`\n#' ggplot(df, aes(x)) +\n#'   geom_hdr_rug(method = method_kde_1d(adjust = 1/5)) +\n#'   geom_density(adjust = 1/5)\n#'\n#' ggplot(df, aes(x)) +\n#'   geom_hdr_rug(method = method_kde_1d(kernel = \"triangular\")) +\n#'   geom_density(kernel = \"triangular\")\n#'\n#' # Can also be used with `get_hdr_1d()` for numerical summary of HDRs\n#' res <- get_hdr_1d(df$x, method = method_kde_1d())\n#' str(res)\n#'\n#' @export\nmethod_kde_1d <- function(bw = \"nrd0\", adjust = 1, kernel = \"gaussian\", weights = NULL, window = kernel) {\n\n  function(x, n, range) {\n\n    nx <- length(x)\n\n    if (is.null(weights)) {\n      weights <- rep(1 / nx, nx)\n    } else {\n      weights <- normalize(weights)\n    }\n\n    dens <- stats::density(\n      x,\n      bw = bw,\n      adjust = adjust,\n      kernel = kernel,\n      weights = weights,\n      window = window,\n      n = n,\n      from = range[1],\n      to = range[2]\n    )\n\n    data.frame(\n      x = dens$x,\n      fhat = dens$y\n    )\n\n  }\n}\n\n#' Univariate histogram HDR estimator\n#'\n#' Function used to specify univariate histogram density estimator\n#' for `get_hdr_1d()` and layer functions (e.g. `geom_hdr_rug()`).\n#'\n#' For more details on the use and implementation of the `method_*_1d()` functions,\n#' see `vignette(\"method\", \"ggdensity\")`.\n#'\n#' @param bins Number of bins. Defaults to normal reference rule (Scott, pg 59).\n#'\n#' @references Scott, David W. Multivariate Density Estimation (2e), Wiley.\n#'\n#' @examples\n#' # Histogram estimators can be useful when data has boundary constraints\n#' df <- data.frame(x = rexp(1e3))\n#'\n#' # Strip chart to visualize 1-d data\n#' p <- ggplot(df, aes(x)) +\n#'   geom_jitter(aes(y = 0), width = 0, height = 2) +\n#'   scale_y_continuous(name = NULL, breaks = NULL) +\n#'   coord_cartesian(ylim = c(-3, 3))\n#'\n#' p\n#'\n#' p + geom_hdr_rug(method = method_histogram_1d())\n#'\n#' # The resolution of the histogram estimator can be set via `bins`\n#' p + geom_hdr_rug(method = method_histogram_1d(bins = 5))\n#'\n#' # Can also be used with `get_hdr_1d()` for numerical summary of HDRs\n#' res <- get_hdr_1d(df$x, method = method_histogram_1d())\n#' str(res)\n#'\n#' @export\nmethod_histogram_1d <- function(bins = NULL) {\n\n  function(x, n, range) {\n\n    nx <- length(x)\n\n    # Default to normal reference rule (Scott p. 59)\n    if (is.null(bins)) {\n      hx <- 3.504 * stats::sd(x) * nx^(-1/3)\n      bins <- round((range[2] - range[1]) / hx)\n    }\n\n    sx <- seq(range[1], range[2], length.out = bins + 1)\n    de_x <- sx[2] - sx[1]\n    midpts <- sx[-(bins+1)] + de_x/2\n    n <- as.numeric(table(cut(x, sx)))\n\n    data.frame(\n      x = midpts,\n      fhat = normalize(n)\n    )\n\n  }\n}\n\n#' Univariate frequency polygon HDR estimator\n#'\n#' Function used to specify univariate frequency polygon density estimator\n#' for `get_hdr_1d()` and layer functions (e.g. `geom_hdr_rug()`).\n#'\n#' For more details on the use and implementation of the `method_*_1d()` functions,\n#' see `vignette(\"method\", \"ggdensity\")`.\n#'\n#' @inheritParams method_histogram_1d\n#'\n#' @references Scott, David W. Multivariate Density Estimation (2e), Wiley.\n#'\n#' @examples\n#' df <- data.frame(x = rnorm(1e3))\n#'\n#' # Strip chart to visualize 1-d data\n#' p <- ggplot(df, aes(x)) +\n#'   geom_jitter(aes(y = 0), width = 0, height = 2) +\n#'   scale_y_continuous(name = NULL, breaks = NULL) +\n#'   coord_cartesian(ylim = c(-3, 3))\n#'\n#' p\n#'\n#' p + geom_hdr_rug(method = method_freqpoly_1d())\n#'\n#' # The resolution of the frequency polygon estimator can be set via `bins`\n#' p + geom_hdr_rug(method = method_freqpoly_1d(bins = 100))\n#'\n#' # Can also be used with `get_hdr_1d()` for numerical summary of HDRs\n#' res <- get_hdr_1d(df$x, method = method_freqpoly_1d())\n#' str(res)\n#'\n#' @export\nmethod_freqpoly_1d <- function(bins = NULL) {\n\n  function(x, n, range) {\n\n    # Start with output from method_histogram\n    df <- method_histogram_1d(bins)(x, n, range)\n\n    hx <- df$x[2] - df$x[1]\n\n    # need to pad df from hist_marginal() w/ bins that have est prob of 0\n    # so that we can interpolate\n    df <- rbind(\n\n      # add initial bin w/ est prob of 0\n      data.frame(\n        x = min(df$x) - hx,\n        fhat = 0\n      ),\n\n      # include original histogram estimator\n      df,\n\n      # add final bin w/ est prob of 0\n      data.frame(\n        x = max(df$x) + hx,\n        fhat = 0\n      )\n\n    )\n\n    sx <- seq(range[1], range[2], length.out = n)\n\n    interpolate_fhat <- function(x) {\n      lower_x <- df$x[max(which(df$x < x))]\n      upper_x <- df$x[min(which(df$x >= x))]\n\n      lower_fhat <- df$fhat[max(which(df$x < x))]\n      upper_fhat <- df$fhat[min(which(df$x >= x))]\n\n      lower_fhat + (x - lower_x) * (upper_fhat - lower_fhat) / (upper_x - lower_x)\n    }\n\n    dens <- vapply(sx, interpolate_fhat, numeric(1))\n\n    data.frame(\n      x = sx,\n      fhat = dens\n    )\n\n  }\n}\n\n\n\n"
  },
  {
    "path": "README.Rmd",
    "content": "---\noutput: github_document\n---\n\n<!-- README.md is generated from README.Rmd. Please edit that file -->\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#>\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\",\n  dev = \"png\", dpi = 300,\n  fig.height = 3.5, \n  cache = TRUE\n)\n```\n\n```{r, include = FALSE}\nset.seed(1)\n```\n\n# ggdensity <img src=\"man/figures/logo.png\"  align=\"right\"  width=\"120\" style=\"padding-left:10px;background-color:white;\" />\n\n<!-- badges: start -->\n[![R-CMD-check](https://github.com/jamesotto852/ggdensity/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/jamesotto852/ggdensity/actions/workflows/R-CMD-check.yaml)\n[![Codecov test coverage](https://codecov.io/gh/jamesotto852/ggdensity/graph/badge.svg)](https://app.codecov.io/gh/jamesotto852/ggdensity)\n[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version-ago/ggdensity)](https://cran.r-project.org/package=ggdensity)\n[![CRAN_Download_Badge](http://cranlogs.r-pkg.org/badges/ggdensity)](https://cran.r-project.org/package=ggdensity)\n<!-- badges: end -->\n\n**ggdensity** extends [**ggplot2**](https://github.com/tidyverse/ggplot2) providing more interpretable visualizations of density estimates based on highest density regions (HDRs).\n**ggdensity** offers drop-in replacements for [**ggplot2**](https://github.com/tidyverse/ggplot2) functions:\n\n- instead of `ggplot2::geom_density_2d_filled()`{.R}, use `ggdensity::geom_hdr()`{.R};\n- instead of `ggplot2::geom_density_2d()`{.R}, use `ggdensity::geom_hdr_lines()`{.R}.\n\nAlso included are the functions `geom_hdr_fun()` and `geom_hdr_lines_fun()` for plotting HDRs of user-specified bivariate probability density functions.\n\n\n\n\n## Installation\n\n**ggdensity** is available on CRAN and can be installed with:\n\n``` r \ninstall.packages(\"ggdensity\")\n```\n\nAlternatively, you can install the latest development version from [GitHub](https://github.com/) with:\n\n``` r\nif (!requireNamespace(\"remotes\")) install.packages(\"remotes\")\nremotes::install_github(\"jamesotto852/ggdensity\")\n```\n\n## `geom_density_2d_filled()`{.R} vs. `geom_hdr()`{.R}\n\nThe standard way to visualize the joint distribution of two continuous variables in **ggplot2** is to use `ggplot2::geom_density_2d()`{.R} or `geom_density_2d_filled()`{.R}. Here's an example:\n\n```{r ex0, message=FALSE}\nlibrary(\"ggplot2\"); theme_set(theme_minimal())\ntheme_update(panel.grid.minor = element_blank())\nlibrary(\"ggdensity\")\nlibrary(\"patchwork\")\n\n\ndf <- data.frame(\"x\" = rnorm(1000), \"y\" = rnorm(1000))\np <- ggplot(df, aes(x, y)) + coord_equal()\np + geom_density_2d_filled()\n```\n\nWhile it's a nice looking plot, it isn't immediately clear how we should understand it. That's because `geom_density_2d_filled()`{.R} generates its contours as equidistant level sets of the estimated bivariate density, i.e. taking horizontal slices of the 3d surface at equally-spaced heights, and projecting the intersections down into the plane. So you get a general feel of where the density is high, but not much else. To interpret a contour, you would need to multiply its height by the area it bounds, which of course is very challenging to do by just looking at it.\n\n`geom_hdr()`{.R} tries to get around this problem by presenting you with regions of the estimated distribution that are immediately interpretable:  \n\n```{r ex1}\np + geom_hdr()\n```\n\n`probs` here tells us the probability bounded by the corresponding region, and the regions are computed to be the smallest such regions that bound that level of probability; these are called highest density regions or HDRs. By default, the plotted regions show the $50\\%$, $80\\%$, $95\\%$, and $99\\%$ HDRs of the estimated density, but this can be changed with the `probs` argument to `geom_hdr()`{.R}. Notice that your take-away from the plot made with `geom_density_2d_filled()`{.R} is subtlely yet significantly different than that of the plot made by `geom_hdr()`{.R}.\n\n\n\n\n\n\n\n\n## Visualizing subpopulations and `geom_hdr_lines()`{.R}\n\n**ggdensity**'s functions were designed to be seamlessly consistent with the rest of the **ggplot2** framework. As a consequence, pretty much everything you would expect to just work does. (Well, we hope! [Let us know](https://github.com/jamesotto852/ggdensity/issues/new) if that's not true.)\n\nFor example, because `geom_hdr()` maps probability to the `alpha` aesthetic, the `fill` and `color` aesthetics are available for mapping to variables.\nYou can use them to visualize subpopulations in your data. For example, in the `penguins` data from [**palmerpenguins**](https://github.com/allisonhorst/palmerpenguins) you may want to look at how the relationship between bill length and flipper length changes across different species of penguins. Here's one way you could look at that:\n\n```{r ex_penguins, warning = FALSE}\nlibrary(\"palmerpenguins\")\n\nggplot(penguins, aes(flipper_length_mm, bill_length_mm, fill = species)) +\n  geom_hdr(xlim = c(160, 240), ylim = c(30, 70)) +\n  geom_point(shape = 21)\n```\n\n<div style = \"height:40px;\"></div>\n\nNice, but a bit overplotted. To alleviate overplotting, we can use `geom_hdr_lines()`{.R}:\n\n```{r ex_penguins_lines, warning = FALSE}\nggplot(penguins, aes(flipper_length_mm, bill_length_mm, color = species)) +\n  geom_hdr_lines(xlim = c(160, 240), ylim = c(30, 70)) +\n  geom_point(size = 1)\n```\n\nOr you could facet the plot:\n\n<div style = \"height:40px;\"></div>\n\n```{r ex_penguins_facet, warning = FALSE}\nggplot(penguins, aes(flipper_length_mm, bill_length_mm, fill = species)) +\n  geom_hdr(xlim = c(160, 240), ylim = c(30, 70)) +\n  geom_point(shape = 21) +\n  facet_wrap(vars(species))\n```\n\nThe main point here is that you should really think of `geom_hdr()`{.R} and `geom_hdr_lines()`{.R} as drop-in replacements for functions like `geom_density_2d_filled()`{.R}, `geom_density2d()`{.R}, and so on, and you can expect all of the rest of the **ggplot2** stuff to just work.\n\n\n\n\n## A deeper cut illustrating **ggplot2** integration\n\nThe underlying stat used by `geom_hdr()`{.R} creates the computed variable `probs` that can be mapped in the standard way you map computed variables in **ggplot2**, with `after_stat()`{.R}.\n\nFor example, `geom_hdr()` and `geom_hdr_lines()` map `probs` to the `alpha` aesthetic by default. But you can override it like this, just be sure to override the `alpha` aesthetic by setting `alpha = 1`.\n\n```{r ex_after_stat}\nggplot(faithful, aes(eruptions, waiting)) +\n  geom_hdr(\n    aes(fill = after_stat(probs)), \n    alpha = 1, xlim = c(0, 8), ylim = c(30, 110)\n  )\n\nggplot(faithful, aes(eruptions, waiting)) +\n  geom_hdr_lines(\n    aes(color = after_stat(probs)), \n    alpha = 1, xlim = c(0, 8), ylim = c(30, 110)\n  )\n```\n\n<!-- ```{r} -->\n<!-- ggplot(faithful, aes(eruptions, waiting)) + -->\n<!--   geom_hdr( -->\n<!--     xlim = scales::expand_range(range(faithful$eruptions), mul = .25), -->\n<!--     ylim = scales::expand_range(range(faithful$waiting),   mul = .25) -->\n<!--   ) + -->\n<!--   geom_point(color = \"red\") + -->\n<!--   scale_x_continuous(breaks = 1:6) + -->\n<!--   scale_y_continuous(breaks = (4:10)*10) -->\n<!-- ``` -->\n\n\n\n## Statistics details\n\nIn addition to trying to make the visuals clean and the functions what you would expect as a **ggplot2** user, we've spent considerable effort in trying to ensure that the graphics you're getting with **ggdensity** are statistically rigorous and provide a range of estimation options for more detailed control. \n\nTo that end, you can pass a `method` argument into `geom_hdr()`{.R} and `geom_hdr_lines()`{.R} that allows you to specify various nonparametric and parametric ways to estimate the underlying bivariate distribution, and we have plans for even more. Each of the estimators below offers advantages in certain contexts. For example, histogram estimators result in HDRs that obey constrained supports. Normal estimators can be helpful in providing simplified visuals that give the viewer a sense of where the distributions are, potentially at the expense of over-simplifying and removing important features of how the variables (co-)vary. \n\n```{r ex_methods, echo = FALSE, fig.width = 11, fig.height = 17}\nlibrary(\"purrr\")\n\ndf_norm <- data.frame(\"x\" = rnorm(5000), \"y\" = rnorm(5000))\n\ndf_norm_mix <- data.frame(\n  \"x\" = rnorm(5000) + c(-1.5, 1.5),\n  \"y\" = rnorm(5000) + c(1.5, -1.5)\n)\n\ndf_exp <- data.frame(\"x\" = rexp(5000, 1), \"y\" = rexp(5000, 1))\n\np_df <- function(df) {\n  ggplot(df, aes(x, y)) + \n    theme(\n      legend.position = \"none\",\n      axis.text.x = element_blank(), axis.ticks.x = element_blank(),\n      axis.text.y = element_blank(), axis.ticks.y = element_blank(),\n      axis.title = element_blank()\n    )\n}\n\np_row <- function(layer, title, ylabs = FALSE) {\n  p_title <- grid::textGrob(title)\n  \n  p_norm <- p_df(df_norm) + \n    layer + \n    coord_fixed(xlim = c(-3.5, 3.5), ylim = c(-3.5, 3.5))\n  \n  p_norm_mix <- p_df(df_norm_mix) + \n    layer + \n    coord_fixed(xlim = c(-4.5, 4.5), ylim = c(-4.5, 4.5))\n  \n  p_norm_exp <- p_df(df_exp) + \n    layer + coord_fixed(xlim = c(-.25, 6), ylim = c(-.25, 6))\n  \n  list(p_title, p_norm, p_norm_mix, p_norm_exp)\n}\n\n\ngeoms <- list(\n  geom_point(size = .3, alpha = .6),\n  # geom_density_2d_filled(),\n  # extreme xlim, ylim ensure that HDRs aren't clipped\n  geom_hdr(method = \"kde\", xlim = c(-10, 10), ylim = c(-10, 10)),\n  geom_hdr(method = \"mvnorm\", xlim = c(-10, 10), ylim = c(-10, 10)),\n  geom_hdr(method = \"histogram\"),\n  geom_hdr(method = \"freqpoly\", xlim = c(-10, 10), ylim = c(-10, 10))\n)\n\ntitles <- c(\n  \"\",\n  \"kde\",\n  \"mvnorm\",\n  \"histogram\",\n  \"freqpoly\"\n)\n\nmap2(geoms, titles, p_row) |>\n  unlist(recursive = FALSE) |>\n  wrap_plots(ncol = 4, widths = c(.2, 1, 1, 1), heights = 1)\n\n```\n\nThe `method` argument may be specified either as a character vector (`method = \"kde\"`) or as a function call (`method = method_kde()`).\nWhen a function call is used, it may be possible to specify parameters governing the density estimation procedure.\nFor example, `method_kde()` accepts parameters `h` and `adjust`, both related to the kernel's bandwidth.\nFor details see `?method_kde` or `vignette(\"method\", \"ggdensity\")`.\n\n\n## If you know your PDF\n\nThe above discussion has focused around densities that are estimated from data. But in some instances, you have the distribution in the form of a function that encodes the [joint PDF](https://en.wikipedia.org/wiki/Probability_density_function). In those circumstances, you can use `geom_hdr_fun()`{.R} and `geom_hdr_lines_fun()`{.R} to make the analogous plots.\nThese functions behave similarly to `geom_function()`{.R} from [**ggplot2**](https://github.com/tidyverse/ggplot2), \naccepting the argument `fun` specifying the pdf to be summarized. Here's an example:\n\n```{r ex_hdr_fun_1}\nf <- function(x, y) dnorm(x) * dgamma(y, 5, 3)\n\nggplot() +\n  geom_hdr_fun(fun = f, xlim = c(-4, 4), ylim = c(0, 5))\n```\n\n\n\n\n<!-- Discuss un-normalized densities here with example of posteriors -->\n\n<!-- In the context of a Bayesian analysis, `geom_hdr()` creates plots of highest posterior regions. -->\n<!-- All we need to do is give `geom_hdr()` a data frame with draws from a posterior, and  -->\n\n\n\n\n### Visualizing custom parametric density estimates with `geom_hdr_fun()`{.R}\n\nIn addition to all of the methods of density estimation available with `geom_hdr()`{.R}, one of the perks of having \n`geom_hdr_fun()`{.R} is that it allows you to plot parametric densities that you estimate outside the **ggdensity** framework.  The basic idea is that you fit your distribution outside **ggdensity** calls with your method of choice, say maximum likelihood, and then plug the maximum likelihood estimate into the density formula to obtain a function to plug into `geom_hdr_fun()`{.R}.\n\nHere's an example of how you can do that that assuming that the underlying data are independent and exponentially distributed with unknown rates.\n\n```{r ex_hdr_fun_2}\nset.seed(123)\nth <- c(3, 5)\ndf <- data.frame(\"x\" = rexp(1000, th[1]), \"y\" = rexp(1000, th[2]))\n\n# construct the likelihood function\nl <- function(th) {\n  log_liks <- apply(df, 1, function(xy) {\n    dexp(xy[1], rate = th[1], log = TRUE) +\n    dexp(xy[2], rate = th[2], log = TRUE)\n  })\n  sum(log_liks)\n}\n\n# compute the mle\n(th_hat <- optim(c(2, 2), l, control = list(fnscale = -1))$par)\n\n# construct the parametric density estimate\nf <- function(x, y, th) dexp(x, th[1]) * dexp(y, th[2])\n\n# pass estimated density into geom_hdr_fun()\nggplot(df, aes(x, y)) +\n  geom_hdr_fun(fun = f, args = list(th = th_hat)) +\n  geom_point(shape = 21, fill = \"lightgreen\", alpha = .25) +\n  coord_equal()\n```\n\n\n\n\n\n\n## Other perks\n\n### `geom_hdr_points()`{.R}\n\nInspired by [**ggpointdensity**](https://github.com/LKremer/ggpointdensity), **ggdensity** provides a scatterplot geom whereby the individual data points can be seen simultaneously with HDRs. This is most useful in situations with significant overplotting.\n\n\n```{r ex_geom_hdr_points, warning = FALSE, dev='png', dpi=300}\np_points <- ggplot(diamonds, aes(carat, price)) +\n  geom_point()\n\np_hdr_points <- ggplot(diamonds, aes(carat, price)) +\n  geom_hdr_points()\n\np_points + p_hdr_points\n```\n\n\n### `geom_hdr_rug()`{.R}\n\nRug plots are standard additions to plots with densities:\n\n```{r ex_geom_hdr_rug_1, dev=\"png\", dpi=300}\nggplot(cars, aes(speed, dist)) +\n  geom_density_2d() +\n  geom_point() +\n  geom_rug()\n```\n\nWith HDRs, these can be used to visualize joint and marginal HDRs simultaneously. The marginal HDRs are computed off of only the corresponding `x` and `y` aesthetic variables. Note that these can be substantially different: the joint HDR is _not_ the [product](https://en.wikipedia.org/wiki/Cartesian_product) of the marginal HDRs.\n\n```{r ex_geom_hdr_rug_2, dev=\"png\", dpi=300}\nggplot(cars, aes(speed, dist)) +\n  geom_hdr() +\n  geom_point(color = \"red\") +\n  geom_hdr_rug()\n```\n\nLike `geom_rug()`{.R}, these can be placed on different sides of the object:\n\n```{r ex_geom_hdr_rug_3, dev=\"png\", dpi=300}\nggplot(cars, aes(speed, dist)) +\n  geom_hdr() +\n  geom_point(color = \"red\") +\n  geom_hdr_rug(sides = \"tr\", outside = TRUE) +\n  coord_cartesian(clip = \"off\")\n```\n\nWe sometimes find it easier to view if the rug intervals are colored:\n\n```{r ex_geom_hdr_rug_4, dev=\"png\", dpi=300}\nggplot(cars, aes(speed, dist)) +\n  geom_hdr() +\n  geom_point(color = \"red\") +\n  geom_hdr_rug(aes(fill = after_stat(probs)), length = unit(.2, \"cm\"), alpha = 1) + \n  scale_fill_viridis_d(option = \"magma\", begin = .8, end = 0)\n```\n\n### Numerical summaries of HDRs\n\nIt is possible to access numerical summaries of the estimated densities and HDRs computed by **ggdensity** with `get_hdr()`:\n\n```{r get-hdr}\ndf <- data.frame(x = rnorm(1e3), y = rnorm(1e3))\n\nres <- get_hdr(df, method = \"kde\")\nstr(res)\n```\n\nSimilarly, there is `get_hdr_1d()` for univariate data:\n\n```{r get-hdr-1d}\nx <- rnorm(1e3)\n\nres <- get_hdr_1d(x, method = \"kde\")\nstr(res)\n```\n\nFor details on the objects returned by these functions, see `?get_hdr` and `?get_hdr_1d`.\n\n## A caveat and recommendation for cropped HDRs\n\n`geom_hdr()`{.R} and related functions were written with the intent of playing nicely with **ggplot2**, so that what the typical **ggplot2** user would expect from the rest of the **ggplot2** ecosystem would work in the same way with **ggdensity**. \n\nOne place where the effect isn't ideal is in the limits of the `x`{.R} and `y`{.R} scales. Without getting into too much detail, these key off of the observed points themselves, and not properties of the estimated density. This is consistent with `geom_density_2d()`{.R} and `stat_smooth()`{.R}, for example: computed aesthetics don't extend past the range of the data.\n\nOne potential danger here is that the estimated HDRs are computed based on not the estimated density directly, but a discretization of it. This is how all non-parametric density estimation in R works, e.g. `MASS::kde2d()`{.R}, and most parametric density estimation, too. In other words: the density estimate itself is only known at points on a grid over the `x`{.R}-`y`{.R} aesthetic space. As a consequence, if that range is too small, it's possible that a probabilistically non-trivial proportion of the density is excluded from the computations, biasing the resulting HDRs.\n\nThe punch line is that whenever you see an HDR getting truncated by the window of the plot, it's probably a good idea to manually increase the aesthetic limits and use `coord_cartesian()`{.R} to zoom in as needed. Here's an example using the previously created graphic. The limits given to `coord_cartesian()`{.R} and the call to `scale_y_continuous()`{.R} is simply an effort to make the third plot comparable to the first.\n\n_Note:_ The support of the data isn't respected here-the estimated density doesn't know speed can't go negative. That's not an artifact of the effect described above, that's just because that's what KDE's do.\n\n```{r ex-expand-lims, fig.height=8}\np1 <- ggplot(cars, aes(speed, dist)) +\n  geom_hdr() +\n  geom_point(color = \"red\") +\n  guides(alpha = \"none\") +\n  ggtitle(\"Default geom_hdr()\")\n\np2 <- ggplot(cars, aes(speed, dist)) +\n  geom_hdr(xlim = c(-20, 50), ylim = c(-40, 140)) +\n  geom_point(color = \"red\") +\n  guides(alpha = \"none\") +\n  ggtitle(\"Manually set xlim, ylim\")\n\np3 <- ggplot(cars, aes(speed, dist)) +\n  geom_hdr(xlim = c(-20, 50), ylim = c(-40, 140)) +\n  geom_point(color = \"red\") +\n  guides(alpha = \"none\") +\n  scale_y_continuous(breaks = 25*(0:5)) +\n  coord_cartesian(xlim = c(4, 25), ylim = c(-1, 120)) + \n  ggtitle(\"Zoom with coord_cartesian()\")\n\n(p1 / p2 / p3) & theme(title = element_text(size = 9))\n```\n\n\n## Related projects\n\nThere are a few other great packages out there you should know about if you're interested in **ggdensity**.  \n\nThe [**ggdist**](https://mjskay.github.io/ggdist/) package provides several flexible geoms for visualizing distributions of data, mostly univariate data.\n\nThe [**hdrcde**](https://pkg.robjhyndman.com/hdrcde/index.html) package allows you to make bivariate HDR plots as well. At the surface, the main difference is that **hdrcde** doesn't use **ggplot2** graphics; however, under the hood there are many more differences. (More coming on explaining these discrepancies.)\n\nThe code illustrating the two strategies is quite simple, but trying to make the graphics more directly comparable requires some effort. Here's a pretty good rendition on the `faithful`{.R} dataset, which has 272 observations.\n\n```{r hdrcde}\np_hdr_scale <- ggplot(faithful, aes(eruptions, waiting)) +\n  geom_hdr(\n    xlim = scales::expand_range(range(faithful$eruptions), mul = .25),\n    ylim = scales::expand_range(range(faithful$waiting),   mul = .25)\n  ) +\n  geom_point(color = \"red\") +\n  scale_x_continuous(breaks = 0:6) +\n  scale_y_continuous(breaks = (3:10)*10) +\n  guides(alpha = \"none\")\n\nden <- with(faithful,\n  MASS::kde2d(eruptions, waiting, n = 100, lims = c(0,6,30,105))\n)\n\nif (!requireNamespace(\"hdrcde\")) install.packages(\"hdrcde\")\nlibrary(\"hdrcde\")\np_den <- ~ with(faithful,\n  plot(\n    hdr.2d(eruptions, waiting, prob = c(50, 80, 95, 99), den = den),\n    pointcol = \"red\",\n    show.points = TRUE,\n    xlim = c(0, 6),\n    ylim = c(30, 105)\n  )\n)\n\npar(mar = c(0,1.75,0,0), bg = NA)\np_hdr_scale +\n  coord_cartesian(xlim = c(0, 6), ylim = c(30, 105), expand = FALSE) +\n  wrap_elements(panel = p_den, clip = FALSE)\n```\n\nThese look quite different, and they are. It's worth noting that even within **hdrcde** there is variability as well:\n\n```{r}\npar(mar = c(3, 3, 1, 1) + 0.1, mfrow = c(1, 2))\nwith(faithful,\n  plot(\n    hdr.2d(eruptions, waiting, prob = c(50, 80, 95, 99), \n      kde.package = \"ash\", xextend = .20),\n    pointcol = \"red\",\n    show.points = TRUE,\n    xlim = c(0, 6),\n    ylim = c(30, 105)\n  )\n)\nwith(faithful,\n  plot(\n    hdr.2d(eruptions, waiting, prob = c(50, 80, 95, 99), \n      kde.package = \"ks\", xextend = .20),\n    pointcol = \"red\",\n    show.points = TRUE,\n    xlim = c(0, 6),\n    ylim = c(30, 105)\n  )\n)\n```\n\n[**gghdr**](https://sayani07.github.io/gghdr/) is somewhat of a **ggplot2** port of **hdrcde**, developed by some of the same team members. In some ways, it's very similar to **ggdensity**. For example, it contains a function `gghdr::geom_hdr_rug()`{.R} that does effectively the same as `ggdensity::geom_hdr_rug()`{.R}; it implements a kind of `ggdensity::geom_hdr_pointdensity()`{.R} via a function `gghdr::hdr_bin()`{.R} plus the color aesthetic to `geom_point()`{.R}; and it provides a boxplot alternative `gghdr::geom_hdr_boxplot()`{.R}. To the extent the similarities between **ggdensity** and **hdrcde**/**gghdr** exist (and they obviously do), they are an example of [convergent evolution](https://en.wikipedia.org/wiki/Convergent_evolution). The present authors only discovered those projects after writing most of **ggdensity**, unfortunately. Interestingly, we also had designs on the CDE part as well (\"conditional density estimation\", think models); however had not implemented it before seeing **hdrcde**. You can expect those to come down the road.\n\nPerhaps the most important difference between **ggdensity** and **gghdr** is that the latter doesn't implement bivariate HDRs in the **ggplot2** framework, which was the original motivation of **ggdensity**. For that purpose, it seems the only project available is **ggdensity**.\n\n"
  },
  {
    "path": "README.md",
    "content": "\n<!-- README.md is generated from README.Rmd. Please edit that file -->\n\n# ggdensity <img src=\"man/figures/logo.png\"  align=\"right\"  width=\"120\" style=\"padding-left:10px;background-color:white;\" />\n\n<!-- badges: start -->\n\n[![R-CMD-check](https://github.com/jamesotto852/ggdensity/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/jamesotto852/ggdensity/actions/workflows/R-CMD-check.yaml)\n[![Codecov test\ncoverage](https://codecov.io/gh/jamesotto852/ggdensity/graph/badge.svg)](https://app.codecov.io/gh/jamesotto852/ggdensity)\n[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version-ago/ggdensity)](https://cran.r-project.org/package=ggdensity)\n[![CRAN_Download_Badge](http://cranlogs.r-pkg.org/badges/ggdensity)](https://cran.r-project.org/package=ggdensity)\n<!-- badges: end -->\n\n**ggdensity** extends\n[**ggplot2**](https://github.com/tidyverse/ggplot2) providing more\ninterpretable visualizations of density estimates based on highest\ndensity regions (HDRs). **ggdensity** offers drop-in replacements for\n[**ggplot2**](https://github.com/tidyverse/ggplot2) functions:\n\n- instead of `ggplot2::geom_density_2d_filled()`, use\n  `ggdensity::geom_hdr()`;\n- instead of `ggplot2::geom_density_2d()`, use\n  `ggdensity::geom_hdr_lines()`.\n\nAlso included are the functions `geom_hdr_fun()` and\n`geom_hdr_lines_fun()` for plotting HDRs of user-specified bivariate\nprobability density functions.\n\n## Installation\n\n**ggdensity** is available on CRAN and can be installed with:\n\n``` r\ninstall.packages(\"ggdensity\")\n```\n\nAlternatively, you can install the latest development version from\n[GitHub](https://github.com/) with:\n\n``` r\nif (!requireNamespace(\"remotes\")) install.packages(\"remotes\")\nremotes::install_github(\"jamesotto852/ggdensity\")\n```\n\n## `geom_density_2d_filled()` vs. `geom_hdr()`\n\nThe standard way to visualize the joint distribution of two continuous\nvariables in **ggplot2** is to use `ggplot2::geom_density_2d()` or\n`geom_density_2d_filled()`. Here’s an example:\n\n``` r\nlibrary(\"ggplot2\"); theme_set(theme_minimal())\ntheme_update(panel.grid.minor = element_blank())\nlibrary(\"ggdensity\")\nlibrary(\"patchwork\")\n\n\ndf <- data.frame(\"x\" = rnorm(1000), \"y\" = rnorm(1000))\np <- ggplot(df, aes(x, y)) + coord_equal()\np + geom_density_2d_filled()\n```\n\n<img src=\"man/figures/README-ex0-1.png\" width=\"100%\" />\n\nWhile it’s a nice looking plot, it isn’t immediately clear how we should\nunderstand it. That’s because `geom_density_2d_filled()` generates its\ncontours as equidistant level sets of the estimated bivariate density,\ni.e. taking horizontal slices of the 3d surface at equally-spaced\nheights, and projecting the intersections down into the plane. So you\nget a general feel of where the density is high, but not much else. To\ninterpret a contour, you would need to multiply its height by the area\nit bounds, which of course is very challenging to do by just looking at\nit.\n\n`geom_hdr()` tries to get around this problem by presenting you with\nregions of the estimated distribution that are immediately\ninterpretable:\n\n``` r\np + geom_hdr()\n```\n\n<img src=\"man/figures/README-ex1-1.png\" width=\"100%\" />\n\n`probs` here tells us the probability bounded by the corresponding\nregion, and the regions are computed to be the smallest such regions\nthat bound that level of probability; these are called highest density\nregions or HDRs. By default, the plotted regions show the $50\\%$,\n$80\\%$, $95\\%$, and $99\\%$ HDRs of the estimated density, but this can\nbe changed with the `probs` argument to `geom_hdr()`. Notice that your\ntake-away from the plot made with `geom_density_2d_filled()` is subtlely\nyet significantly different than that of the plot made by `geom_hdr()`.\n\n## Visualizing subpopulations and `geom_hdr_lines()`\n\n**ggdensity**’s functions were designed to be seamlessly consistent with\nthe rest of the **ggplot2** framework. As a consequence, pretty much\neverything you would expect to just work does. (Well, we hope! [Let us\nknow](https://github.com/jamesotto852/ggdensity/issues/new) if that’s\nnot true.)\n\nFor example, because `geom_hdr()` maps probability to the `alpha`\naesthetic, the `fill` and `color` aesthetics are available for mapping\nto variables. You can use them to visualize subpopulations in your data.\nFor example, in the `penguins` data from\n[**palmerpenguins**](https://github.com/allisonhorst/palmerpenguins) you\nmay want to look at how the relationship between bill length and flipper\nlength changes across different species of penguins. Here’s one way you\ncould look at that:\n\n``` r\nlibrary(\"palmerpenguins\")\n#> \n#> Attaching package: 'palmerpenguins'\n#> The following objects are masked from 'package:datasets':\n#> \n#>     penguins, penguins_raw\n\nggplot(penguins, aes(flipper_length_mm, bill_length_mm, fill = species)) +\n  geom_hdr(xlim = c(160, 240), ylim = c(30, 70)) +\n  geom_point(shape = 21)\n```\n\n<img src=\"man/figures/README-ex_penguins-1.png\" width=\"100%\" />\n\n<div style=\"height:40px;\">\n\n</div>\n\nNice, but a bit overplotted. To alleviate overplotting, we can use\n`geom_hdr_lines()`:\n\n``` r\nggplot(penguins, aes(flipper_length_mm, bill_length_mm, color = species)) +\n  geom_hdr_lines(xlim = c(160, 240), ylim = c(30, 70)) +\n  geom_point(size = 1)\n```\n\n<img src=\"man/figures/README-ex_penguins_lines-1.png\" width=\"100%\" />\n\nOr you could facet the plot:\n\n<div style=\"height:40px;\">\n\n</div>\n\n``` r\nggplot(penguins, aes(flipper_length_mm, bill_length_mm, fill = species)) +\n  geom_hdr(xlim = c(160, 240), ylim = c(30, 70)) +\n  geom_point(shape = 21) +\n  facet_wrap(vars(species))\n```\n\n<img src=\"man/figures/README-ex_penguins_facet-1.png\" width=\"100%\" />\n\nThe main point here is that you should really think of `geom_hdr()` and\n`geom_hdr_lines()` as drop-in replacements for functions like\n`geom_density_2d_filled()`, `geom_density2d()`, and so on, and you can\nexpect all of the rest of the **ggplot2** stuff to just work.\n\n## A deeper cut illustrating **ggplot2** integration\n\nThe underlying stat used by `geom_hdr()` creates the computed variable\n`probs` that can be mapped in the standard way you map computed\nvariables in **ggplot2**, with `after_stat()`.\n\nFor example, `geom_hdr()` and `geom_hdr_lines()` map `probs` to the\n`alpha` aesthetic by default. But you can override it like this, just be\nsure to override the `alpha` aesthetic by setting `alpha = 1`.\n\n``` r\nggplot(faithful, aes(eruptions, waiting)) +\n  geom_hdr(\n    aes(fill = after_stat(probs)), \n    alpha = 1, xlim = c(0, 8), ylim = c(30, 110)\n  )\n```\n\n<img src=\"man/figures/README-ex_after_stat-1.png\" width=\"100%\" />\n\n``` r\n\nggplot(faithful, aes(eruptions, waiting)) +\n  geom_hdr_lines(\n    aes(color = after_stat(probs)), \n    alpha = 1, xlim = c(0, 8), ylim = c(30, 110)\n  )\n```\n\n<img src=\"man/figures/README-ex_after_stat-2.png\" width=\"100%\" />\n\n<!-- ```{r} -->\n\n<!-- ggplot(faithful, aes(eruptions, waiting)) + -->\n\n<!--   geom_hdr( -->\n\n<!--     xlim = scales::expand_range(range(faithful$eruptions), mul = .25), -->\n\n<!--     ylim = scales::expand_range(range(faithful$waiting),   mul = .25) -->\n\n<!--   ) + -->\n\n<!--   geom_point(color = \"red\") + -->\n\n<!--   scale_x_continuous(breaks = 1:6) + -->\n\n<!--   scale_y_continuous(breaks = (4:10)*10) -->\n\n<!-- ``` -->\n\n## Statistics details\n\nIn addition to trying to make the visuals clean and the functions what\nyou would expect as a **ggplot2** user, we’ve spent considerable effort\nin trying to ensure that the graphics you’re getting with **ggdensity**\nare statistically rigorous and provide a range of estimation options for\nmore detailed control.\n\nTo that end, you can pass a `method` argument into `geom_hdr()` and\n`geom_hdr_lines()` that allows you to specify various nonparametric and\nparametric ways to estimate the underlying bivariate distribution, and\nwe have plans for even more. Each of the estimators below offers\nadvantages in certain contexts. For example, histogram estimators result\nin HDRs that obey constrained supports. Normal estimators can be helpful\nin providing simplified visuals that give the viewer a sense of where\nthe distributions are, potentially at the expense of over-simplifying\nand removing important features of how the variables (co-)vary.\n\n<img src=\"man/figures/README-ex_methods-1.png\" width=\"100%\" />\n\nThe `method` argument may be specified either as a character vector\n(`method = \"kde\"`) or as a function call (`method = method_kde()`). When\na function call is used, it may be possible to specify parameters\ngoverning the density estimation procedure. For example, `method_kde()`\naccepts parameters `h` and `adjust`, both related to the kernel’s\nbandwidth. For details see `?method_kde` or\n`vignette(\"method\", \"ggdensity\")`.\n\n## If you know your PDF\n\nThe above discussion has focused around densities that are estimated\nfrom data. But in some instances, you have the distribution in the form\nof a function that encodes the [joint\nPDF](https://en.wikipedia.org/wiki/Probability_density_function). In\nthose circumstances, you can use `geom_hdr_fun()` and\n`geom_hdr_lines_fun()` to make the analogous plots. These functions\nbehave similarly to `geom_function()` from\n[**ggplot2**](https://github.com/tidyverse/ggplot2), accepting the\nargument `fun` specifying the pdf to be summarized. Here’s an example:\n\n``` r\nf <- function(x, y) dnorm(x) * dgamma(y, 5, 3)\n\nggplot() +\n  geom_hdr_fun(fun = f, xlim = c(-4, 4), ylim = c(0, 5))\n```\n\n<img src=\"man/figures/README-ex_hdr_fun_1-1.png\" width=\"100%\" />\n\n<!-- Discuss un-normalized densities here with example of posteriors -->\n\n<!-- In the context of a Bayesian analysis, `geom_hdr()` creates plots of highest posterior regions. -->\n\n<!-- All we need to do is give `geom_hdr()` a data frame with draws from a posterior, and  -->\n\n### Visualizing custom parametric density estimates with `geom_hdr_fun()`\n\nIn addition to all of the methods of density estimation available with\n`geom_hdr()`, one of the perks of having `geom_hdr_fun()` is that it\nallows you to plot parametric densities that you estimate outside the\n**ggdensity** framework. The basic idea is that you fit your\ndistribution outside **ggdensity** calls with your method of choice, say\nmaximum likelihood, and then plug the maximum likelihood estimate into\nthe density formula to obtain a function to plug into `geom_hdr_fun()`.\n\nHere’s an example of how you can do that that assuming that the\nunderlying data are independent and exponentially distributed with\nunknown rates.\n\n``` r\nset.seed(123)\nth <- c(3, 5)\ndf <- data.frame(\"x\" = rexp(1000, th[1]), \"y\" = rexp(1000, th[2]))\n\n# construct the likelihood function\nl <- function(th) {\n  log_liks <- apply(df, 1, function(xy) {\n    dexp(xy[1], rate = th[1], log = TRUE) +\n    dexp(xy[2], rate = th[2], log = TRUE)\n  })\n  sum(log_liks)\n}\n\n# compute the mle\n(th_hat <- optim(c(2, 2), l, control = list(fnscale = -1))$par)\n#> [1] 2.912736 5.032125\n\n# construct the parametric density estimate\nf <- function(x, y, th) dexp(x, th[1]) * dexp(y, th[2])\n\n# pass estimated density into geom_hdr_fun()\nggplot(df, aes(x, y)) +\n  geom_hdr_fun(fun = f, args = list(th = th_hat)) +\n  geom_point(shape = 21, fill = \"lightgreen\", alpha = .25) +\n  coord_equal()\n```\n\n<img src=\"man/figures/README-ex_hdr_fun_2-1.png\" width=\"100%\" />\n\n## Other perks\n\n### `geom_hdr_points()`\n\nInspired by\n[**ggpointdensity**](https://github.com/LKremer/ggpointdensity),\n**ggdensity** provides a scatterplot geom whereby the individual data\npoints can be seen simultaneously with HDRs. This is most useful in\nsituations with significant overplotting.\n\n``` r\np_points <- ggplot(diamonds, aes(carat, price)) +\n  geom_point()\n\np_hdr_points <- ggplot(diamonds, aes(carat, price)) +\n  geom_hdr_points()\n\np_points + p_hdr_points\n```\n\n<img src=\"man/figures/README-ex_geom_hdr_points-1.png\" width=\"100%\" />\n\n### `geom_hdr_rug()`\n\nRug plots are standard additions to plots with densities:\n\n``` r\nggplot(cars, aes(speed, dist)) +\n  geom_density_2d() +\n  geom_point() +\n  geom_rug()\n```\n\n<img src=\"man/figures/README-ex_geom_hdr_rug_1-1.png\" width=\"100%\" />\n\nWith HDRs, these can be used to visualize joint and marginal HDRs\nsimultaneously. The marginal HDRs are computed off of only the\ncorresponding `x` and `y` aesthetic variables. Note that these can be\nsubstantially different: the joint HDR is *not* the\n[product](https://en.wikipedia.org/wiki/Cartesian_product) of the\nmarginal HDRs.\n\n``` r\nggplot(cars, aes(speed, dist)) +\n  geom_hdr() +\n  geom_point(color = \"red\") +\n  geom_hdr_rug()\n```\n\n<img src=\"man/figures/README-ex_geom_hdr_rug_2-1.png\" width=\"100%\" />\n\nLike `geom_rug()`, these can be placed on different sides of the object:\n\n``` r\nggplot(cars, aes(speed, dist)) +\n  geom_hdr() +\n  geom_point(color = \"red\") +\n  geom_hdr_rug(sides = \"tr\", outside = TRUE) +\n  coord_cartesian(clip = \"off\")\n```\n\n<img src=\"man/figures/README-ex_geom_hdr_rug_3-1.png\" width=\"100%\" />\n\nWe sometimes find it easier to view if the rug intervals are colored:\n\n``` r\nggplot(cars, aes(speed, dist)) +\n  geom_hdr() +\n  geom_point(color = \"red\") +\n  geom_hdr_rug(aes(fill = after_stat(probs)), length = unit(.2, \"cm\"), alpha = 1) + \n  scale_fill_viridis_d(option = \"magma\", begin = .8, end = 0)\n```\n\n<img src=\"man/figures/README-ex_geom_hdr_rug_4-1.png\" width=\"100%\" />\n\n### Numerical summaries of HDRs\n\nIt is possible to access numerical summaries of the estimated densities\nand HDRs computed by **ggdensity** with `get_hdr()`:\n\n``` r\ndf <- data.frame(x = rnorm(1e3), y = rnorm(1e3))\n\nres <- get_hdr(df, method = \"kde\")\nstr(res)\n#> List of 3\n#>  $ df_est:'data.frame':  10000 obs. of  5 variables:\n#>   ..$ x               : num [1:10000] -3.05 -2.99 -2.93 -2.86 -2.8 ...\n#>   ..$ y               : num [1:10000] -3.13 -3.13 -3.13 -3.13 -3.13 ...\n#>   ..$ fhat            : num [1:10000] 1.58e-09 4.49e-09 1.30e-08 3.66e-08 9.83e-08 ...\n#>   ..$ fhat_discretized: num [1:10000] 6.43e-12 1.83e-11 5.29e-11 1.49e-10 4.00e-10 ...\n#>   ..$ hdr             : num [1:10000] 1 1 1 1 1 1 1 1 1 1 ...\n#>  $ breaks: Named num [1:5] 0.00257 0.00887 0.02929 0.07574 Inf\n#>   ..- attr(*, \"names\")= chr [1:5] \"99%\" \"95%\" \"80%\" \"50%\" ...\n#>  $ data  :'data.frame':  1000 obs. of  3 variables:\n#>   ..$ x             : num [1:1000] -0.817 -2.463 -1.343 0.136 0.883 ...\n#>   ..$ y             : num [1:1000] -0.5277 -1.4411 -1.9568 0.0287 1.5382 ...\n#>   ..$ hdr_membership: num [1:1000] 0.5 0.99 0.95 0.5 0.8 0.99 0.8 0.95 0.5 0.5 ...\n```\n\nSimilarly, there is `get_hdr_1d()` for univariate data:\n\n``` r\nx <- rnorm(1e3)\n\nres <- get_hdr_1d(x, method = \"kde\")\nstr(res)\n#> List of 3\n#>  $ df_est:'data.frame':  512 obs. of  4 variables:\n#>   ..$ x               : num [1:512] -2.89 -2.88 -2.86 -2.85 -2.84 ...\n#>   ..$ fhat            : num [1:512] 0.0044 0.00459 0.00478 0.00499 0.00519 ...\n#>   ..$ fhat_discretized: num [1:512] 5.46e-05 5.70e-05 5.94e-05 6.19e-05 6.45e-05 ...\n#>   ..$ hdr             : num [1:512] 1 1 1 1 1 1 1 1 1 1 ...\n#>  $ breaks: Named num [1:5] 0.0141 0.0562 0.1756 0.3167 Inf\n#>   ..- attr(*, \"names\")= chr [1:5] \"99%\" \"95%\" \"80%\" \"50%\" ...\n#>  $ data  :'data.frame':  1000 obs. of  2 variables:\n#>   ..$ x             : num [1:1000] -0.4301 -1.5792 0.1929 -0.4973 -0.0859 ...\n#>   ..$ hdr_membership: num [1:1000] 0.5 0.95 0.5 0.5 0.5 0.5 0.8 0.5 0.5 0.99 ...\n```\n\nFor details on the objects returned by these functions, see `?get_hdr`\nand `?get_hdr_1d`.\n\n## A caveat and recommendation for cropped HDRs\n\n`geom_hdr()` and related functions were written with the intent of\nplaying nicely with **ggplot2**, so that what the typical **ggplot2**\nuser would expect from the rest of the **ggplot2** ecosystem would work\nin the same way with **ggdensity**.\n\nOne place where the effect isn’t ideal is in the limits of the `x` and\n`y` scales. Without getting into too much detail, these key off of the\nobserved points themselves, and not properties of the estimated density.\nThis is consistent with `geom_density_2d()` and `stat_smooth()`, for\nexample: computed aesthetics don’t extend past the range of the data.\n\nOne potential danger here is that the estimated HDRs are computed based\non not the estimated density directly, but a discretization of it. This\nis how all non-parametric density estimation in R works,\ne.g. `MASS::kde2d()`, and most parametric density estimation, too. In\nother words: the density estimate itself is only known at points on a\ngrid over the `x`-`y` aesthetic space. As a consequence, if that range\nis too small, it’s possible that a probabilistically non-trivial\nproportion of the density is excluded from the computations, biasing the\nresulting HDRs.\n\nThe punch line is that whenever you see an HDR getting truncated by the\nwindow of the plot, it’s probably a good idea to manually increase the\naesthetic limits and use `coord_cartesian()` to zoom in as needed.\nHere’s an example using the previously created graphic. The limits given\nto `coord_cartesian()` and the call to `scale_y_continuous()` is simply\nan effort to make the third plot comparable to the first.\n\n*Note:* The support of the data isn’t respected here-the estimated\ndensity doesn’t know speed can’t go negative. That’s not an artifact of\nthe effect described above, that’s just because that’s what KDE’s do.\n\n``` r\np1 <- ggplot(cars, aes(speed, dist)) +\n  geom_hdr() +\n  geom_point(color = \"red\") +\n  guides(alpha = \"none\") +\n  ggtitle(\"Default geom_hdr()\")\n\np2 <- ggplot(cars, aes(speed, dist)) +\n  geom_hdr(xlim = c(-20, 50), ylim = c(-40, 140)) +\n  geom_point(color = \"red\") +\n  guides(alpha = \"none\") +\n  ggtitle(\"Manually set xlim, ylim\")\n\np3 <- ggplot(cars, aes(speed, dist)) +\n  geom_hdr(xlim = c(-20, 50), ylim = c(-40, 140)) +\n  geom_point(color = \"red\") +\n  guides(alpha = \"none\") +\n  scale_y_continuous(breaks = 25*(0:5)) +\n  coord_cartesian(xlim = c(4, 25), ylim = c(-1, 120)) + \n  ggtitle(\"Zoom with coord_cartesian()\")\n\n(p1 / p2 / p3) & theme(title = element_text(size = 9))\n```\n\n<img src=\"man/figures/README-ex-expand-lims-1.png\" width=\"100%\" />\n\n## Related projects\n\nThere are a few other great packages out there you should know about if\nyou’re interested in **ggdensity**.\n\nThe [**ggdist**](https://mjskay.github.io/ggdist/) package provides\nseveral flexible geoms for visualizing distributions of data, mostly\nunivariate data.\n\nThe [**hdrcde**](https://pkg.robjhyndman.com/hdrcde/index.html) package\nallows you to make bivariate HDR plots as well. At the surface, the main\ndifference is that **hdrcde** doesn’t use **ggplot2** graphics; however,\nunder the hood there are many more differences. (More coming on\nexplaining these discrepancies.)\n\nThe code illustrating the two strategies is quite simple, but trying to\nmake the graphics more directly comparable requires some effort. Here’s\na pretty good rendition on the `faithful` dataset, which has 272\nobservations.\n\n``` r\np_hdr_scale <- ggplot(faithful, aes(eruptions, waiting)) +\n  geom_hdr(\n    xlim = scales::expand_range(range(faithful$eruptions), mul = .25),\n    ylim = scales::expand_range(range(faithful$waiting),   mul = .25)\n  ) +\n  geom_point(color = \"red\") +\n  scale_x_continuous(breaks = 0:6) +\n  scale_y_continuous(breaks = (3:10)*10) +\n  guides(alpha = \"none\")\n\nden <- with(faithful,\n  MASS::kde2d(eruptions, waiting, n = 100, lims = c(0,6,30,105))\n)\n\nif (!requireNamespace(\"hdrcde\")) install.packages(\"hdrcde\")\n#> Loading required namespace: hdrcde\nlibrary(\"hdrcde\")\n#> This is hdrcde 3.5.0\np_den <- ~ with(faithful,\n  plot(\n    hdr.2d(eruptions, waiting, prob = c(50, 80, 95, 99), den = den),\n    pointcol = \"red\",\n    show.points = TRUE,\n    xlim = c(0, 6),\n    ylim = c(30, 105)\n  )\n)\n\npar(mar = c(0,1.75,0,0), bg = NA)\np_hdr_scale +\n  coord_cartesian(xlim = c(0, 6), ylim = c(30, 105), expand = FALSE) +\n  wrap_elements(panel = p_den, clip = FALSE)\n```\n\n<img src=\"man/figures/README-hdrcde-1.png\" width=\"100%\" />\n\nThese look quite different, and they are. It’s worth noting that even\nwithin **hdrcde** there is variability as well:\n\n``` r\npar(mar = c(3, 3, 1, 1) + 0.1, mfrow = c(1, 2))\nwith(faithful,\n  plot(\n    hdr.2d(eruptions, waiting, prob = c(50, 80, 95, 99), \n      kde.package = \"ash\", xextend = .20),\n    pointcol = \"red\",\n    show.points = TRUE,\n    xlim = c(0, 6),\n    ylim = c(30, 105)\n  )\n)\nwith(faithful,\n  plot(\n    hdr.2d(eruptions, waiting, prob = c(50, 80, 95, 99), \n      kde.package = \"ks\", xextend = .20),\n    pointcol = \"red\",\n    show.points = TRUE,\n    xlim = c(0, 6),\n    ylim = c(30, 105)\n  )\n)\n```\n\n<img src=\"man/figures/README-unnamed-chunk-3-1.png\" width=\"100%\" />\n\n[**gghdr**](https://sayani07.github.io/gghdr/) is somewhat of a\n**ggplot2** port of **hdrcde**, developed by some of the same team\nmembers. In some ways, it’s very similar to **ggdensity**. For example,\nit contains a function `gghdr::geom_hdr_rug()` that does effectively the\nsame as `ggdensity::geom_hdr_rug()`; it implements a kind of\n`ggdensity::geom_hdr_pointdensity()` via a function `gghdr::hdr_bin()`\nplus the color aesthetic to `geom_point()`; and it provides a boxplot\nalternative `gghdr::geom_hdr_boxplot()`. To the extent the similarities\nbetween **ggdensity** and **hdrcde**/**gghdr** exist (and they obviously\ndo), they are an example of [convergent\nevolution](https://en.wikipedia.org/wiki/Convergent_evolution). The\npresent authors only discovered those projects after writing most of\n**ggdensity**, unfortunately. Interestingly, we also had designs on the\nCDE part as well (“conditional density estimation”, think models);\nhowever had not implemented it before seeing **hdrcde**. You can expect\nthose to come down the road.\n\nPerhaps the most important difference between **ggdensity** and\n**gghdr** is that the latter doesn’t implement bivariate HDRs in the\n**ggplot2** framework, which was the original motivation of\n**ggdensity**. For that purpose, it seems the only project available is\n**ggdensity**.\n"
  },
  {
    "path": "_pkgdown.yml",
    "content": "url: https://jamesotto852.github.io/ggdensity/\ntemplate:\n  bootstrap: 5\n\nreference:\n- title: \"2-dimensional highest density regions\"\n\n- subtitle: \"Plotting functions\"\n  contents:\n  - geom_hdr\n  - geom_hdr_fun\n  - geom_hdr_points\n  - geom_hdr_points_fun\n\n- subtitle: \"Bivariate density estimation methods\"\n  contents:\n  - method_freqpoly\n  - method_histogram\n  - method_kde\n  - method_mvnorm\n\n- subtitle: \"Accessing estimated bivariate densities and regions\"\n  desc: \"All 2D plotting functions use this function to determine HDRs\"\n  contents:\n  - get_hdr\n\n- title: \"1-dimensional highest density regions\"\n\n- subtitle: \"Plotting functions\"\n  contents:\n  - geom_hdr_rug\n  - geom_hdr_rug_fun\n\n- subtitle: \"Univariate density estimation methods\"\n  contents:\n  - method_kde_1d\n  - method_norm_1d\n  - method_freqpoly_1d\n  - method_histogram_1d\n\n- subtitle: \"Accessing estimated univariate densities and regions\"\n  desc: \"All 1D plotting functions use this function to determine HDRs\"\n  contents:\n  - get_hdr_1d\n\n- title: \"Package-level documentation\"\n  contents:\n  - ggdensity\n  - package-ggdensity\n"
  },
  {
    "path": "cran-comments.md",
    "content": "## R CMD check results\n\n0 errors | 0 warnings | 0 notes\n\n## revdepcheck results\n\nWe checked 6 reverse dependencies (5 from CRAN + 1 from Bioconductor), comparing R CMD check results across CRAN and dev versions of this package.\n\n * We saw 0 new problems\n * We failed to check 0 packages\n\n"
  },
  {
    "path": "ggdensity.Rproj",
    "content": "Version: 1.0\n\nRestoreWorkspace: No\nSaveWorkspace: No\nAlwaysSaveHistory: Default\n\nEnableCodeIndexing: Yes\nUseSpacesForTab: Yes\nNumSpacesForTab: 2\nEncoding: UTF-8\n\nRnwWeave: knitr\nLaTeX: pdfLaTeX\n\nAutoAppendNewline: Yes\nStripTrailingWhitespace: Yes\nLineEndingConversion: Posix\n\nBuildType: Package\nPackageUseDevtools: Yes\nPackageInstallArgs: --no-multiarch --with-keep.source\nPackageRoxygenize: rd,collate,namespace\n"
  },
  {
    "path": "man/geom_hdr.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/hdr.R, R/hdr_lines.R\n\\docType{data}\n\\name{geom_hdr}\n\\alias{geom_hdr}\n\\alias{stat_hdr}\n\\alias{StatHdr}\n\\alias{GeomHdr}\n\\alias{stat_hdr_lines}\n\\alias{StatHdrLines}\n\\alias{geom_hdr_lines}\n\\alias{GeomHdrLines}\n\\title{Highest density regions of a 2D density estimate}\n\\usage{\nstat_hdr(\n  mapping = NULL,\n  data = NULL,\n  geom = \"hdr\",\n  position = \"identity\",\n  ...,\n  method = \"kde\",\n  probs = c(0.99, 0.95, 0.8, 0.5),\n  n = 100,\n  xlim = NULL,\n  ylim = NULL,\n  na.rm = FALSE,\n  show.legend = NA,\n  inherit.aes = TRUE\n)\n\ngeom_hdr(\n  mapping = NULL,\n  data = NULL,\n  stat = \"hdr\",\n  position = \"identity\",\n  ...,\n  na.rm = FALSE,\n  show.legend = NA,\n  inherit.aes = TRUE\n)\n}\n\\arguments{\n\\item{mapping}{Set of aesthetic mappings created by \\code{\\link[ggplot2:aes]{aes()}}. If specified and\n\\code{inherit.aes = TRUE} (the default), it is combined with the default mapping\nat the top level of the plot. You must supply \\code{mapping} if there is no plot\nmapping.}\n\n\\item{data}{The data to be displayed in this layer. There are three\noptions:\n\nIf \\code{NULL}, the default, the data is inherited from the plot\ndata as specified in the call to \\code{\\link[ggplot2:ggplot]{ggplot()}}.\n\nA \\code{data.frame}, or other object, will override the plot\ndata. All objects will be fortified to produce a data frame. See\n\\code{\\link[ggplot2:fortify]{fortify()}} for which variables will be created.\n\nA \\code{function} will be called with a single argument,\nthe plot data. The return value must be a \\code{data.frame}, and\nwill be used as the layer data. A \\code{function} can be created\nfrom a \\code{formula} (e.g. \\code{~ head(.x, 10)}).}\n\n\\item{geom}{The geometric object to use to display the data for this layer.\nWhen using a \\verb{stat_*()} function to construct a layer, the \\code{geom} argument\ncan be used to override the default coupling between stats and geoms. The\n\\code{geom} argument accepts the following:\n\\itemize{\n\\item A \\code{Geom} ggproto subclass, for example \\code{GeomPoint}.\n\\item A string naming the geom. To give the geom as a string, strip the\nfunction name of the \\code{geom_} prefix. For example, to use \\code{geom_point()},\ngive the geom as \\code{\"point\"}.\n\\item For more information and other ways to specify the geom, see the\n\\link[ggplot2:layer_geoms]{layer geom} documentation.\n}}\n\n\\item{position}{A position adjustment to use on the data for this layer. This\ncan be used in various ways, including to prevent overplotting and\nimproving the display. The \\code{position} argument accepts the following:\n\\itemize{\n\\item The result of calling a position function, such as \\code{position_jitter()}.\nThis method allows for passing extra arguments to the position.\n\\item A string naming the position adjustment. To give the position as a\nstring, strip the function name of the \\code{position_} prefix. For example,\nto use \\code{position_jitter()}, give the position as \\code{\"jitter\"}.\n\\item For more information and other ways to specify the position, see the\n\\link[ggplot2:layer_positions]{layer position} documentation.\n}}\n\n\\item{...}{Other arguments passed on to \\code{\\link[ggplot2:layer]{layer()}}'s \\code{params} argument. These\narguments broadly fall into one of 4 categories below. Notably, further\narguments to the \\code{position} argument, or aesthetics that are required\ncan \\emph{not} be passed through \\code{...}. Unknown arguments that are not part\nof the 4 categories below are ignored.\n\\itemize{\n\\item Static aesthetics that are not mapped to a scale, but are at a fixed\nvalue and apply to the layer as a whole. For example, \\code{colour = \"red\"}\nor \\code{linewidth = 3}. The geom's documentation has an \\strong{Aesthetics}\nsection that lists the available options. The 'required' aesthetics\ncannot be passed on to the \\code{params}. Please note that while passing\nunmapped aesthetics as vectors is technically possible, the order and\nrequired length is not guaranteed to be parallel to the input data.\n\\item When constructing a layer using\na \\verb{stat_*()} function, the \\code{...} argument can be used to pass on\nparameters to the \\code{geom} part of the layer. An example of this is\n\\code{stat_density(geom = \"area\", outline.type = \"both\")}. The geom's\ndocumentation lists which parameters it can accept.\n\\item Inversely, when constructing a layer using a\n\\verb{geom_*()} function, the \\code{...} argument can be used to pass on parameters\nto the \\code{stat} part of the layer. An example of this is\n\\code{geom_area(stat = \"density\", adjust = 0.5)}. The stat's documentation\nlists which parameters it can accept.\n\\item The \\code{key_glyph} argument of \\code{\\link[ggplot2:layer]{layer()}} may also be passed on through\n\\code{...}. This can be one of the functions described as\n\\link[ggplot2:draw_key]{key glyphs}, to change the display of the layer in the legend.\n}}\n\n\\item{method}{Density estimator to use, accepts character vector:\n\\code{\"kde\"},\\code{\"histogram\"}, \\code{\"freqpoly\"}, or \\code{\"mvnorm\"}.\nAlternatively accepts functions  which return closures corresponding to density estimates,\nsee \\code{?get_hdr} or \\code{vignette(\"method\", \"ggdensity\")}.}\n\n\\item{probs}{Probabilities to compute highest density regions for.}\n\n\\item{n}{Resolution of grid defined by \\code{xlim} and \\code{ylim}.\nIgnored if \\code{method = \"histogram\"} or \\code{method = \"freqpoly\"}.}\n\n\\item{xlim, ylim}{Range to compute and draw regions. If \\code{NULL}, defaults to\nrange of data.}\n\n\\item{na.rm}{If \\code{FALSE}, the default, missing values are removed with\na warning. If \\code{TRUE}, missing values are silently removed.}\n\n\\item{show.legend}{logical. Should this layer be included in the legends?\n\\code{NA}, the default, includes if any aesthetics are mapped.\n\\code{FALSE} never includes, and \\code{TRUE} always includes.\nIt can also be a named logical vector to finely select the aesthetics to\ndisplay. To include legend keys for all levels, even\nwhen no data exists, use \\code{TRUE}.  If \\code{NA}, all levels are shown in legend,\nbut unobserved levels are omitted.}\n\n\\item{inherit.aes}{If \\code{FALSE}, overrides the default aesthetics,\nrather than combining with them. This is most useful for helper functions\nthat define both data and aesthetics and shouldn't inherit behaviour from\nthe default plot specification, e.g. \\code{\\link[ggplot2:annotation_borders]{annotation_borders()}}.}\n\n\\item{stat}{The statistical transformation to use on the data for this layer.\nWhen using a \\verb{geom_*()} function to construct a layer, the \\code{stat}\nargument can be used to override the default coupling between geoms and\nstats. The \\code{stat} argument accepts the following:\n\\itemize{\n\\item A \\code{Stat} ggproto subclass, for example \\code{StatCount}.\n\\item A string naming the stat. To give the stat as a string, strip the\nfunction name of the \\code{stat_} prefix. For example, to use \\code{stat_count()},\ngive the stat as \\code{\"count\"}.\n\\item For more information and other ways to specify the stat, see the\n\\link[ggplot2:layer_stats]{layer stat} documentation.\n}}\n}\n\\description{\nPerform 2D density estimation, compute and plot the resulting highest density regions.\n\\code{geom_hdr()} draws filled regions and \\code{geom_hdr_lines()} draws lines outlining the regions.\nNote, the plotted objects have probabilities mapped to the \\code{alpha} aesthetic by default.\n}\n\\section{Aesthetics}{\n \\code{geom_hdr()} and \\code{geom_hdr_lines()} understand the following aesthetics (required\naesthetics are in bold):\n\\itemize{\n\\item \\strong{x}\n\\item \\strong{y}\n\\item alpha\n\\item color\n\\item fill (only \\code{geom_hdr})\n\\item group\n\\item linetype\n\\item linewidth\n\\item subgroup\n}\n}\n\n\\section{Computed variables}{\n\n\n\\describe{ \\item{probs}{The probability associated with the highest density region, specified\nby \\code{probs} argument.} }\n}\n\n\\examples{\n# Basic simulated data with bivariate normal data and various methods\ndf <- data.frame(x = rnorm(1000), y = rnorm(1000))\np <- ggplot(df, aes(x, y)) + coord_equal()\n\np + geom_hdr()\np + geom_hdr(method = \"mvnorm\")\np + geom_hdr(method = \"freqpoly\")\n# p + geom_hdr(method = \"histogram\")\n\n# Adding point layers on top to visually assess region estimates\npts <- geom_point(size = .2, color = \"red\")\n\np + geom_hdr() + pts\np + geom_hdr(method = \"mvnorm\") + pts\np + geom_hdr(method = \"freqpoly\") + pts\n# p + geom_hdr(method = \"histogram\") + pts\n\n# Highest density region boundary lines\np + geom_hdr_lines()\np + geom_hdr_lines(method = \"mvnorm\")\np + geom_hdr_lines(method = \"freqpoly\")\n# p + geom_hdr_lines(method = \"histogram\")\n\n\\dontrun{\n\n# 2+ groups - mapping other aesthetics in the geom\nrdata <- function(n, n_groups = 3, radius = 3) {\n  list_of_dfs <- lapply(0:(n_groups-1), function(k) {\n    mu <- c(cos(2*k*pi/n_groups), sin(2*k*pi/n_groups))\n    m <- MASS::mvrnorm(n, radius*mu, diag(2))\n    structure(data.frame(m, as.character(k)), names = c(\"x\", \"y\", \"c\"))\n  })\n  do.call(\"rbind\", list_of_dfs)\n}\n\ndfc <- rdata(1000, n_groups = 5)\npf <- ggplot(dfc, aes(x, y, fill = c)) + coord_equal()\n\npf + geom_hdr()\npf + geom_hdr(method = \"mvnorm\")\npf + geom_hdr(method = \"mvnorm\", probs = .90, alpha = .5)\npf + geom_hdr(method = \"histogram\")\npf + geom_hdr(method = \"freqpoly\")\n\npc <- ggplot(dfc, aes(x, y, color = c)) +\n coord_equal() +\n theme_minimal() +\n theme(panel.grid.minor = element_blank())\n\npc + geom_hdr_lines()\npc + geom_hdr_lines(method = \"mvnorm\")\n\n\n# Data with boundaries\nggplot(df, aes(x^2)) + geom_histogram(bins = 30)\nggplot(df, aes(x^2)) + geom_histogram(bins = 30, boundary = 0)\nggplot(df, aes(x^2, y^2)) + geom_hdr(method = \"histogram\")\n\n}\n\n}\n\\references{\nScott, David W. Multivariate Density Estimation (2e), Wiley.\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/geom_hdr_fun.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/hdr_fun.R, R/hdr_lines_fun.R\n\\docType{data}\n\\name{geom_hdr_fun}\n\\alias{geom_hdr_fun}\n\\alias{stat_hdr_fun}\n\\alias{StatHdrFun}\n\\alias{GeomHdrFun}\n\\alias{stat_hdr_lines_fun}\n\\alias{StatHdrLinesFun}\n\\alias{geom_hdr_lines_fun}\n\\alias{GeomHdrLinesFun}\n\\title{Highest density regions of a bivariate pdf}\n\\usage{\nstat_hdr_fun(\n  mapping = NULL,\n  data = NULL,\n  geom = \"hdr_fun\",\n  position = \"identity\",\n  ...,\n  fun,\n  args = list(),\n  probs = c(0.99, 0.95, 0.8, 0.5),\n  xlim = NULL,\n  ylim = NULL,\n  n = 100,\n  na.rm = FALSE,\n  show.legend = NA,\n  inherit.aes = TRUE\n)\n\ngeom_hdr_fun(\n  mapping = NULL,\n  data = NULL,\n  stat = \"hdr_fun\",\n  position = \"identity\",\n  ...,\n  na.rm = FALSE,\n  show.legend = NA,\n  inherit.aes = TRUE\n)\n}\n\\arguments{\n\\item{mapping}{Set of aesthetic mappings created by \\code{\\link[ggplot2:aes]{aes()}}. If specified and\n\\code{inherit.aes = TRUE} (the default), it is combined with the default mapping\nat the top level of the plot. You must supply \\code{mapping} if there is no plot\nmapping.}\n\n\\item{data}{The data to be displayed in this layer. There are three\noptions:\n\nIf \\code{NULL}, the default, the data is inherited from the plot\ndata as specified in the call to \\code{\\link[ggplot2:ggplot]{ggplot()}}.\n\nA \\code{data.frame}, or other object, will override the plot\ndata. All objects will be fortified to produce a data frame. See\n\\code{\\link[ggplot2:fortify]{fortify()}} for which variables will be created.\n\nA \\code{function} will be called with a single argument,\nthe plot data. The return value must be a \\code{data.frame}, and\nwill be used as the layer data. A \\code{function} can be created\nfrom a \\code{formula} (e.g. \\code{~ head(.x, 10)}).}\n\n\\item{geom}{The geometric object to use to display the data for this layer.\nWhen using a \\verb{stat_*()} function to construct a layer, the \\code{geom} argument\ncan be used to override the default coupling between stats and geoms. The\n\\code{geom} argument accepts the following:\n\\itemize{\n\\item A \\code{Geom} ggproto subclass, for example \\code{GeomPoint}.\n\\item A string naming the geom. To give the geom as a string, strip the\nfunction name of the \\code{geom_} prefix. For example, to use \\code{geom_point()},\ngive the geom as \\code{\"point\"}.\n\\item For more information and other ways to specify the geom, see the\n\\link[ggplot2:layer_geoms]{layer geom} documentation.\n}}\n\n\\item{position}{A position adjustment to use on the data for this layer. This\ncan be used in various ways, including to prevent overplotting and\nimproving the display. The \\code{position} argument accepts the following:\n\\itemize{\n\\item The result of calling a position function, such as \\code{position_jitter()}.\nThis method allows for passing extra arguments to the position.\n\\item A string naming the position adjustment. To give the position as a\nstring, strip the function name of the \\code{position_} prefix. For example,\nto use \\code{position_jitter()}, give the position as \\code{\"jitter\"}.\n\\item For more information and other ways to specify the position, see the\n\\link[ggplot2:layer_positions]{layer position} documentation.\n}}\n\n\\item{...}{Other arguments passed on to \\code{\\link[ggplot2:layer]{layer()}}'s \\code{params} argument. These\narguments broadly fall into one of 4 categories below. Notably, further\narguments to the \\code{position} argument, or aesthetics that are required\ncan \\emph{not} be passed through \\code{...}. Unknown arguments that are not part\nof the 4 categories below are ignored.\n\\itemize{\n\\item Static aesthetics that are not mapped to a scale, but are at a fixed\nvalue and apply to the layer as a whole. For example, \\code{colour = \"red\"}\nor \\code{linewidth = 3}. The geom's documentation has an \\strong{Aesthetics}\nsection that lists the available options. The 'required' aesthetics\ncannot be passed on to the \\code{params}. Please note that while passing\nunmapped aesthetics as vectors is technically possible, the order and\nrequired length is not guaranteed to be parallel to the input data.\n\\item When constructing a layer using\na \\verb{stat_*()} function, the \\code{...} argument can be used to pass on\nparameters to the \\code{geom} part of the layer. An example of this is\n\\code{stat_density(geom = \"area\", outline.type = \"both\")}. The geom's\ndocumentation lists which parameters it can accept.\n\\item Inversely, when constructing a layer using a\n\\verb{geom_*()} function, the \\code{...} argument can be used to pass on parameters\nto the \\code{stat} part of the layer. An example of this is\n\\code{geom_area(stat = \"density\", adjust = 0.5)}. The stat's documentation\nlists which parameters it can accept.\n\\item The \\code{key_glyph} argument of \\code{\\link[ggplot2:layer]{layer()}} may also be passed on through\n\\code{...}. This can be one of the functions described as\n\\link[ggplot2:draw_key]{key glyphs}, to change the display of the layer in the legend.\n}}\n\n\\item{fun}{A function, the joint probability density function, must be\nvectorized in its first two arguments; see examples.}\n\n\\item{args}{Named list of additional arguments passed on to \\code{fun}.}\n\n\\item{probs}{Probabilities to compute highest density regions for.}\n\n\\item{xlim, ylim}{Range to compute and draw regions. If \\code{NULL}, defaults to\nrange of data if present.}\n\n\\item{n}{Resolution of grid \\code{fun} is evaluated on.}\n\n\\item{na.rm}{If \\code{FALSE}, the default, missing values are removed with\na warning. If \\code{TRUE}, missing values are silently removed.}\n\n\\item{show.legend}{logical. Should this layer be included in the legends?\n\\code{NA}, the default, includes if any aesthetics are mapped.\n\\code{FALSE} never includes, and \\code{TRUE} always includes.\nIt can also be a named logical vector to finely select the aesthetics to\ndisplay. To include legend keys for all levels, even\nwhen no data exists, use \\code{TRUE}.  If \\code{NA}, all levels are shown in legend,\nbut unobserved levels are omitted.}\n\n\\item{inherit.aes}{If \\code{FALSE}, overrides the default aesthetics,\nrather than combining with them. This is most useful for helper functions\nthat define both data and aesthetics and shouldn't inherit behaviour from\nthe default plot specification, e.g. \\code{\\link[ggplot2:annotation_borders]{annotation_borders()}}.}\n\n\\item{stat}{The statistical transformation to use on the data for this layer.\nWhen using a \\verb{geom_*()} function to construct a layer, the \\code{stat}\nargument can be used to override the default coupling between geoms and\nstats. The \\code{stat} argument accepts the following:\n\\itemize{\n\\item A \\code{Stat} ggproto subclass, for example \\code{StatCount}.\n\\item A string naming the stat. To give the stat as a string, strip the\nfunction name of the \\code{stat_} prefix. For example, to use \\code{stat_count()},\ngive the stat as \\code{\"count\"}.\n\\item For more information and other ways to specify the stat, see the\n\\link[ggplot2:layer_stats]{layer stat} documentation.\n}}\n}\n\\description{\nCompute and plot the highest density regions (HDRs) of a bivariate pdf.\n\\code{geom_hdr_fun()} draws filled regions, and \\code{geom_hdr_lines_fun()} draws lines outlining the regions.\nNote, the plotted objects have probabilities mapped to the \\code{alpha} aesthetic by default.\n}\n\\section{Aesthetics}{\n \\code{geom_hdr_fun()} and \\code{geom_hdr_lines_fun()} understand the following aesthetics (required\naesthetics are in bold):\n\\itemize{\n\\item x\n\\item y\n\\item alpha\n\\item color\n\\item fill (only \\code{geom_hdr_fun})\n\\item group\n\\item linetype\n\\item linewidth\n\\item subgroup\n}\n}\n\n\\section{Computed variables}{\n\n\n\\describe{ \\item{probs}{The probability associated with the highest density region, specified\nby \\code{probs}.} }\n}\n\n\\examples{\n# HDRs of the bivariate exponential\nf <- function(x, y) dexp(x) * dexp(y)\nggplot() + geom_hdr_fun(fun = f, xlim = c(0, 10), ylim = c(0, 10))\n\n\n# HDRs of a custom parametric model\n\n# generate example data\nn <- 1000\nth_true <- c(3, 8)\n\nrdata <- function(n, th) {\n  gen_single_obs <- function(th) {\n    rchisq(2, df = th) # can be anything\n  }\n  df <- replicate(n, gen_single_obs(th))\n  setNames(as.data.frame(t(df)), c(\"x\", \"y\"))\n}\ndata <- rdata(n, th_true)\n\n# estimate unknown parameters via maximum likelihood\nlikelihood <- function(th) {\n  th <- abs(th) # hack to enforce parameter space boundary\n  log_f <- function(v) {\n    x <- v[1]; y <- v[2]\n    dchisq(x, df = th[1], log = TRUE) + dchisq(y, df = th[2], log = TRUE)\n  }\n  sum(apply(data, 1, log_f))\n}\n(th_hat <- optim(c(1, 1), likelihood, control = list(fnscale = -1))$par)\n\n# plot f for the give model\nf <- function(x, y, th) dchisq(x, df = th[1]) * dchisq(y, df = th[2])\n\nggplot(data, aes(x, y)) +\n  geom_hdr_fun(fun = f, args = list(th = th_hat)) +\n  geom_point(size = .25, color = \"red\") +\n  xlim(0, 30) + ylim(c(0, 30))\n\nggplot(data, aes(x, y)) +\n  geom_hdr_lines_fun(fun = f, args = list(th = th_hat)) +\n  geom_point(size = .25, color = \"red\") +\n  xlim(0, 30) + ylim(c(0, 30))\n\n\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/geom_hdr_points.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/hdr_points.R\n\\docType{data}\n\\name{geom_hdr_points}\n\\alias{geom_hdr_points}\n\\alias{stat_hdr_points}\n\\alias{StatHdrPoints}\n\\title{Scatterplot colored by highest density regions of a 2D density estimate}\n\\usage{\nstat_hdr_points(\n  mapping = NULL,\n  data = NULL,\n  geom = \"point\",\n  position = \"identity\",\n  ...,\n  method = \"kde\",\n  probs = c(0.99, 0.95, 0.8, 0.5),\n  n = 100,\n  xlim = NULL,\n  ylim = NULL,\n  na.rm = FALSE,\n  show.legend = NA,\n  inherit.aes = TRUE\n)\n\ngeom_hdr_points(\n  mapping = NULL,\n  data = NULL,\n  stat = \"hdr_points\",\n  position = \"identity\",\n  ...,\n  na.rm = FALSE,\n  show.legend = NA,\n  inherit.aes = TRUE\n)\n}\n\\arguments{\n\\item{mapping}{Set of aesthetic mappings created by \\code{\\link[ggplot2:aes]{aes()}}. If specified and\n\\code{inherit.aes = TRUE} (the default), it is combined with the default mapping\nat the top level of the plot. You must supply \\code{mapping} if there is no plot\nmapping.}\n\n\\item{data}{The data to be displayed in this layer. There are three\noptions:\n\nIf \\code{NULL}, the default, the data is inherited from the plot\ndata as specified in the call to \\code{\\link[ggplot2:ggplot]{ggplot()}}.\n\nA \\code{data.frame}, or other object, will override the plot\ndata. All objects will be fortified to produce a data frame. See\n\\code{\\link[ggplot2:fortify]{fortify()}} for which variables will be created.\n\nA \\code{function} will be called with a single argument,\nthe plot data. The return value must be a \\code{data.frame}, and\nwill be used as the layer data. A \\code{function} can be created\nfrom a \\code{formula} (e.g. \\code{~ head(.x, 10)}).}\n\n\\item{geom}{The geometric object to use to display the data for this layer.\nWhen using a \\verb{stat_*()} function to construct a layer, the \\code{geom} argument\ncan be used to override the default coupling between stats and geoms. The\n\\code{geom} argument accepts the following:\n\\itemize{\n\\item A \\code{Geom} ggproto subclass, for example \\code{GeomPoint}.\n\\item A string naming the geom. To give the geom as a string, strip the\nfunction name of the \\code{geom_} prefix. For example, to use \\code{geom_point()},\ngive the geom as \\code{\"point\"}.\n\\item For more information and other ways to specify the geom, see the\n\\link[ggplot2:layer_geoms]{layer geom} documentation.\n}}\n\n\\item{position}{A position adjustment to use on the data for this layer. This\ncan be used in various ways, including to prevent overplotting and\nimproving the display. The \\code{position} argument accepts the following:\n\\itemize{\n\\item The result of calling a position function, such as \\code{position_jitter()}.\nThis method allows for passing extra arguments to the position.\n\\item A string naming the position adjustment. To give the position as a\nstring, strip the function name of the \\code{position_} prefix. For example,\nto use \\code{position_jitter()}, give the position as \\code{\"jitter\"}.\n\\item For more information and other ways to specify the position, see the\n\\link[ggplot2:layer_positions]{layer position} documentation.\n}}\n\n\\item{...}{Other arguments passed on to \\code{\\link[ggplot2:layer]{layer()}}'s \\code{params} argument. These\narguments broadly fall into one of 4 categories below. Notably, further\narguments to the \\code{position} argument, or aesthetics that are required\ncan \\emph{not} be passed through \\code{...}. Unknown arguments that are not part\nof the 4 categories below are ignored.\n\\itemize{\n\\item Static aesthetics that are not mapped to a scale, but are at a fixed\nvalue and apply to the layer as a whole. For example, \\code{colour = \"red\"}\nor \\code{linewidth = 3}. The geom's documentation has an \\strong{Aesthetics}\nsection that lists the available options. The 'required' aesthetics\ncannot be passed on to the \\code{params}. Please note that while passing\nunmapped aesthetics as vectors is technically possible, the order and\nrequired length is not guaranteed to be parallel to the input data.\n\\item When constructing a layer using\na \\verb{stat_*()} function, the \\code{...} argument can be used to pass on\nparameters to the \\code{geom} part of the layer. An example of this is\n\\code{stat_density(geom = \"area\", outline.type = \"both\")}. The geom's\ndocumentation lists which parameters it can accept.\n\\item Inversely, when constructing a layer using a\n\\verb{geom_*()} function, the \\code{...} argument can be used to pass on parameters\nto the \\code{stat} part of the layer. An example of this is\n\\code{geom_area(stat = \"density\", adjust = 0.5)}. The stat's documentation\nlists which parameters it can accept.\n\\item The \\code{key_glyph} argument of \\code{\\link[ggplot2:layer]{layer()}} may also be passed on through\n\\code{...}. This can be one of the functions described as\n\\link[ggplot2:draw_key]{key glyphs}, to change the display of the layer in the legend.\n}}\n\n\\item{method}{Density estimator to use, accepts character vector:\n\\code{\"kde\"},\\code{\"histogram\"}, \\code{\"freqpoly\"}, or \\code{\"mvnorm\"}.\nAlternatively accepts functions  which return closures corresponding to density estimates,\nsee \\code{?get_hdr} or \\code{vignette(\"method\", \"ggdensity\")}.}\n\n\\item{probs}{Probabilities to compute highest density regions for.}\n\n\\item{n}{Number of grid points in each direction.}\n\n\\item{xlim, ylim}{Range to compute and draw regions. If \\code{NULL}, defaults to\nrange of data.}\n\n\\item{na.rm}{If \\code{FALSE}, the default, missing values are removed with\na warning. If \\code{TRUE}, missing values are silently removed.}\n\n\\item{show.legend}{logical. Should this layer be included in the legends?\n\\code{NA}, the default, includes if any aesthetics are mapped.\n\\code{FALSE} never includes, and \\code{TRUE} always includes.\nIt can also be a named logical vector to finely select the aesthetics to\ndisplay. To include legend keys for all levels, even\nwhen no data exists, use \\code{TRUE}.  If \\code{NA}, all levels are shown in legend,\nbut unobserved levels are omitted.}\n\n\\item{inherit.aes}{If \\code{FALSE}, overrides the default aesthetics,\nrather than combining with them. This is most useful for helper functions\nthat define both data and aesthetics and shouldn't inherit behaviour from\nthe default plot specification, e.g. \\code{\\link[ggplot2:annotation_borders]{annotation_borders()}}.}\n\n\\item{stat}{The statistical transformation to use on the data for this layer.\nWhen using a \\verb{geom_*()} function to construct a layer, the \\code{stat}\nargument can be used to override the default coupling between geoms and\nstats. The \\code{stat} argument accepts the following:\n\\itemize{\n\\item A \\code{Stat} ggproto subclass, for example \\code{StatCount}.\n\\item A string naming the stat. To give the stat as a string, strip the\nfunction name of the \\code{stat_} prefix. For example, to use \\code{stat_count()},\ngive the stat as \\code{\"count\"}.\n\\item For more information and other ways to specify the stat, see the\n\\link[ggplot2:layer_stats]{layer stat} documentation.\n}}\n}\n\\description{\nPerform 2D density estimation, compute the resulting highest density regions (HDRs),\nand plot the provided data as a scatterplot with points colored according to\ntheir corresponding HDR.\n}\n\\section{Aesthetics}{\n geom_hdr_points understands the following aesthetics (required\naesthetics are in bold):\n\\itemize{\n\\item \\strong{x}\n\\item \\strong{y}\n\\item alpha\n\\item color\n\\item fill\n\\item group\n\\item linetype\n\\item size\n\\item subgroup\n}\n}\n\n\\section{Computed variables}{\n\n\n\\describe{ \\item{probs}{The probability associated with the highest density region, specified\nby \\code{probs}.} }\n}\n\n\\examples{\nset.seed(1)\ndf <- data.frame(x = rnorm(500), y = rnorm(500))\np <- ggplot(df, aes(x, y)) +\n coord_equal()\n\np + geom_hdr_points()\n\n# Setting aes(fill = after_stat(probs)), color = \"black\", and\n# shape = 21 helps alleviate overplotting:\np + geom_hdr_points(aes(fill = after_stat(probs)), color = \"black\", shape = 21, size = 2)\n\n# Also works well with geom_hdr_lines()\np +\n geom_hdr_lines(\n   aes(color = after_stat(probs)), alpha = 1,\n   xlim = c(-5, 5), ylim = c(-5, 5)\n ) +\n geom_hdr_points(\n   aes(fill = after_stat(probs)), color = \"black\", shape = 21, size = 2,\n   xlim = c(-5, 5), ylim = c(-5, 5)\n )\n\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/geom_hdr_points_fun.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/hdr_points_fun.R\n\\docType{data}\n\\name{geom_hdr_points_fun}\n\\alias{geom_hdr_points_fun}\n\\alias{stat_hdr_points_fun}\n\\alias{StatHdrPointsFun}\n\\title{Scatterplot colored by highest density regions of a bivariate pdf}\n\\usage{\nstat_hdr_points_fun(\n  mapping = NULL,\n  data = NULL,\n  geom = \"point\",\n  position = \"identity\",\n  ...,\n  fun,\n  args = list(),\n  probs = c(0.99, 0.95, 0.8, 0.5),\n  xlim = NULL,\n  ylim = NULL,\n  n = 100,\n  na.rm = FALSE,\n  show.legend = NA,\n  inherit.aes = TRUE\n)\n\ngeom_hdr_points_fun(\n  mapping = NULL,\n  data = NULL,\n  stat = \"hdr_points_fun\",\n  position = \"identity\",\n  ...,\n  na.rm = FALSE,\n  show.legend = NA,\n  inherit.aes = TRUE\n)\n}\n\\arguments{\n\\item{mapping}{Set of aesthetic mappings created by \\code{\\link[ggplot2:aes]{aes()}}. If specified and\n\\code{inherit.aes = TRUE} (the default), it is combined with the default mapping\nat the top level of the plot. You must supply \\code{mapping} if there is no plot\nmapping.}\n\n\\item{data}{The data to be displayed in this layer. There are three\noptions:\n\nIf \\code{NULL}, the default, the data is inherited from the plot\ndata as specified in the call to \\code{\\link[ggplot2:ggplot]{ggplot()}}.\n\nA \\code{data.frame}, or other object, will override the plot\ndata. All objects will be fortified to produce a data frame. See\n\\code{\\link[ggplot2:fortify]{fortify()}} for which variables will be created.\n\nA \\code{function} will be called with a single argument,\nthe plot data. The return value must be a \\code{data.frame}, and\nwill be used as the layer data. A \\code{function} can be created\nfrom a \\code{formula} (e.g. \\code{~ head(.x, 10)}).}\n\n\\item{geom}{The geometric object to use to display the data for this layer.\nWhen using a \\verb{stat_*()} function to construct a layer, the \\code{geom} argument\ncan be used to override the default coupling between stats and geoms. The\n\\code{geom} argument accepts the following:\n\\itemize{\n\\item A \\code{Geom} ggproto subclass, for example \\code{GeomPoint}.\n\\item A string naming the geom. To give the geom as a string, strip the\nfunction name of the \\code{geom_} prefix. For example, to use \\code{geom_point()},\ngive the geom as \\code{\"point\"}.\n\\item For more information and other ways to specify the geom, see the\n\\link[ggplot2:layer_geoms]{layer geom} documentation.\n}}\n\n\\item{position}{A position adjustment to use on the data for this layer. This\ncan be used in various ways, including to prevent overplotting and\nimproving the display. The \\code{position} argument accepts the following:\n\\itemize{\n\\item The result of calling a position function, such as \\code{position_jitter()}.\nThis method allows for passing extra arguments to the position.\n\\item A string naming the position adjustment. To give the position as a\nstring, strip the function name of the \\code{position_} prefix. For example,\nto use \\code{position_jitter()}, give the position as \\code{\"jitter\"}.\n\\item For more information and other ways to specify the position, see the\n\\link[ggplot2:layer_positions]{layer position} documentation.\n}}\n\n\\item{...}{Other arguments passed on to \\code{\\link[ggplot2:layer]{layer()}}'s \\code{params} argument. These\narguments broadly fall into one of 4 categories below. Notably, further\narguments to the \\code{position} argument, or aesthetics that are required\ncan \\emph{not} be passed through \\code{...}. Unknown arguments that are not part\nof the 4 categories below are ignored.\n\\itemize{\n\\item Static aesthetics that are not mapped to a scale, but are at a fixed\nvalue and apply to the layer as a whole. For example, \\code{colour = \"red\"}\nor \\code{linewidth = 3}. The geom's documentation has an \\strong{Aesthetics}\nsection that lists the available options. The 'required' aesthetics\ncannot be passed on to the \\code{params}. Please note that while passing\nunmapped aesthetics as vectors is technically possible, the order and\nrequired length is not guaranteed to be parallel to the input data.\n\\item When constructing a layer using\na \\verb{stat_*()} function, the \\code{...} argument can be used to pass on\nparameters to the \\code{geom} part of the layer. An example of this is\n\\code{stat_density(geom = \"area\", outline.type = \"both\")}. The geom's\ndocumentation lists which parameters it can accept.\n\\item Inversely, when constructing a layer using a\n\\verb{geom_*()} function, the \\code{...} argument can be used to pass on parameters\nto the \\code{stat} part of the layer. An example of this is\n\\code{geom_area(stat = \"density\", adjust = 0.5)}. The stat's documentation\nlists which parameters it can accept.\n\\item The \\code{key_glyph} argument of \\code{\\link[ggplot2:layer]{layer()}} may also be passed on through\n\\code{...}. This can be one of the functions described as\n\\link[ggplot2:draw_key]{key glyphs}, to change the display of the layer in the legend.\n}}\n\n\\item{fun}{A function, the joint probability density function, must be\nvectorized in its first two arguments; see examples.}\n\n\\item{args}{Named list of additional arguments passed on to \\code{fun}.}\n\n\\item{probs}{Probabilities to compute highest density regions for.}\n\n\\item{xlim, ylim}{Range to compute and draw regions. If \\code{NULL}, defaults to\nrange of data if present.}\n\n\\item{n}{Number of grid points in each direction.}\n\n\\item{na.rm}{If \\code{FALSE}, the default, missing values are removed with\na warning. If \\code{TRUE}, missing values are silently removed.}\n\n\\item{show.legend}{logical. Should this layer be included in the legends?\n\\code{NA}, the default, includes if any aesthetics are mapped.\n\\code{FALSE} never includes, and \\code{TRUE} always includes.\nIt can also be a named logical vector to finely select the aesthetics to\ndisplay. To include legend keys for all levels, even\nwhen no data exists, use \\code{TRUE}.  If \\code{NA}, all levels are shown in legend,\nbut unobserved levels are omitted.}\n\n\\item{inherit.aes}{If \\code{FALSE}, overrides the default aesthetics,\nrather than combining with them. This is most useful for helper functions\nthat define both data and aesthetics and shouldn't inherit behaviour from\nthe default plot specification, e.g. \\code{\\link[ggplot2:annotation_borders]{annotation_borders()}}.}\n\n\\item{stat}{The statistical transformation to use on the data for this layer.\nWhen using a \\verb{geom_*()} function to construct a layer, the \\code{stat}\nargument can be used to override the default coupling between geoms and\nstats. The \\code{stat} argument accepts the following:\n\\itemize{\n\\item A \\code{Stat} ggproto subclass, for example \\code{StatCount}.\n\\item A string naming the stat. To give the stat as a string, strip the\nfunction name of the \\code{stat_} prefix. For example, to use \\code{stat_count()},\ngive the stat as \\code{\"count\"}.\n\\item For more information and other ways to specify the stat, see the\n\\link[ggplot2:layer_stats]{layer stat} documentation.\n}}\n}\n\\description{\nCompute the highest density regions (HDRs) of a bivariate pdf and plot the provided\ndata as a scatterplot with points colored according to their corresponding HDR.\n}\n\\section{Aesthetics}{\n geom_hdr_points_fun understands the following aesthetics\n(required aesthetics are in bold):\n\\itemize{\n\\item \\strong{x}\n\\item \\strong{y}\n\\item alpha\n\\item color\n\\item fill\n\\item group\n\\item linetype\n\\item size\n\\item subgroup\n}\n}\n\n\\section{Computed variables}{\n\n\n\\describe{ \\item{probs}{The probability associated with the highest density region, specified\nby \\code{probs}.} }\n}\n\n\\examples{\n# Can plot points colored according to known pdf:\nset.seed(1)\ndf <- data.frame(x = rexp(1000), y = rexp(1000))\nf <- function(x, y) dexp(x) * dexp(y)\n\nggplot(df, aes(x, y)) +\n  geom_hdr_points_fun(fun = f, xlim = c(0, 10), ylim = c(0, 10))\n\n\n# Also allows for hdrs of a custom parametric model\n\n# generate example data\nn <- 1000\nth_true <- c(3, 8)\n\nrdata <- function(n, th) {\n  gen_single_obs <- function(th) {\n    rchisq(2, df = th) # can be anything\n  }\n  df <- replicate(n, gen_single_obs(th))\n  setNames(as.data.frame(t(df)), c(\"x\", \"y\"))\n}\ndata <- rdata(n, th_true)\n\n# estimate unknown parameters via maximum likelihood\nlikelihood <- function(th) {\n  th <- abs(th) # hack to enforce parameter space boundary\n  log_f <- function(v) {\n    x <- v[1]; y <- v[2]\n    dchisq(x, df = th[1], log = TRUE) + dchisq(y, df = th[2], log = TRUE)\n  }\n  sum(apply(data, 1, log_f))\n}\n(th_hat <- optim(c(1, 1), likelihood, control = list(fnscale = -1))$par)\n\n# plot f for the give model\nf <- function(x, y, th) dchisq(x, df = th[1]) * dchisq(y, df = th[2])\n\nggplot(data, aes(x, y)) +\n  geom_hdr_points_fun(fun = f, args = list(th = th_hat))\n\nggplot(data, aes(x, y)) +\n  geom_hdr_points_fun(aes(fill = after_stat(probs)), shape = 21, color = \"black\",\n    fun = f, args = list(th = th_hat), na.rm = TRUE) +\n  geom_hdr_lines_fun(aes(color = after_stat(probs)), alpha = 1, fun = f, args = list(th = th_hat)) +\n  lims(x = c(0, 15), y = c(0, 25))\n\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/geom_hdr_rug.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/hdr_rug.R\n\\docType{data}\n\\name{geom_hdr_rug}\n\\alias{geom_hdr_rug}\n\\alias{stat_hdr_rug}\n\\alias{StatHdrRug}\n\\alias{GeomHdrRug}\n\\title{Rug plots of marginal highest density region estimates}\n\\usage{\nstat_hdr_rug(\n  mapping = NULL,\n  data = NULL,\n  geom = \"hdr_rug\",\n  position = \"identity\",\n  ...,\n  method = \"kde\",\n  method_y = \"kde\",\n  probs = c(0.99, 0.95, 0.8, 0.5),\n  xlim = NULL,\n  ylim = NULL,\n  n = 512,\n  na.rm = FALSE,\n  show.legend = NA,\n  inherit.aes = TRUE\n)\n\ngeom_hdr_rug(\n  mapping = NULL,\n  data = NULL,\n  stat = \"hdr_rug\",\n  position = \"identity\",\n  ...,\n  outside = FALSE,\n  sides = \"bl\",\n  length = unit(0.03, \"npc\"),\n  na.rm = FALSE,\n  show.legend = NA,\n  inherit.aes = TRUE\n)\n}\n\\arguments{\n\\item{mapping}{Set of aesthetic mappings created by \\code{\\link[ggplot2:aes]{aes()}}. If specified and\n\\code{inherit.aes = TRUE} (the default), it is combined with the default mapping\nat the top level of the plot. You must supply \\code{mapping} if there is no plot\nmapping.}\n\n\\item{data}{The data to be displayed in this layer. There are three\noptions:\n\nIf \\code{NULL}, the default, the data is inherited from the plot\ndata as specified in the call to \\code{\\link[ggplot2:ggplot]{ggplot()}}.\n\nA \\code{data.frame}, or other object, will override the plot\ndata. All objects will be fortified to produce a data frame. See\n\\code{\\link[ggplot2:fortify]{fortify()}} for which variables will be created.\n\nA \\code{function} will be called with a single argument,\nthe plot data. The return value must be a \\code{data.frame}, and\nwill be used as the layer data. A \\code{function} can be created\nfrom a \\code{formula} (e.g. \\code{~ head(.x, 10)}).}\n\n\\item{geom}{The geometric object to use to display the data for this layer.\nWhen using a \\verb{stat_*()} function to construct a layer, the \\code{geom} argument\ncan be used to override the default coupling between stats and geoms. The\n\\code{geom} argument accepts the following:\n\\itemize{\n\\item A \\code{Geom} ggproto subclass, for example \\code{GeomPoint}.\n\\item A string naming the geom. To give the geom as a string, strip the\nfunction name of the \\code{geom_} prefix. For example, to use \\code{geom_point()},\ngive the geom as \\code{\"point\"}.\n\\item For more information and other ways to specify the geom, see the\n\\link[ggplot2:layer_geoms]{layer geom} documentation.\n}}\n\n\\item{position}{A position adjustment to use on the data for this layer. This\ncan be used in various ways, including to prevent overplotting and\nimproving the display. The \\code{position} argument accepts the following:\n\\itemize{\n\\item The result of calling a position function, such as \\code{position_jitter()}.\nThis method allows for passing extra arguments to the position.\n\\item A string naming the position adjustment. To give the position as a\nstring, strip the function name of the \\code{position_} prefix. For example,\nto use \\code{position_jitter()}, give the position as \\code{\"jitter\"}.\n\\item For more information and other ways to specify the position, see the\n\\link[ggplot2:layer_positions]{layer position} documentation.\n}}\n\n\\item{...}{Other arguments passed on to \\code{\\link[ggplot2:layer]{layer()}}'s \\code{params} argument. These\narguments broadly fall into one of 4 categories below. Notably, further\narguments to the \\code{position} argument, or aesthetics that are required\ncan \\emph{not} be passed through \\code{...}. Unknown arguments that are not part\nof the 4 categories below are ignored.\n\\itemize{\n\\item Static aesthetics that are not mapped to a scale, but are at a fixed\nvalue and apply to the layer as a whole. For example, \\code{colour = \"red\"}\nor \\code{linewidth = 3}. The geom's documentation has an \\strong{Aesthetics}\nsection that lists the available options. The 'required' aesthetics\ncannot be passed on to the \\code{params}. Please note that while passing\nunmapped aesthetics as vectors is technically possible, the order and\nrequired length is not guaranteed to be parallel to the input data.\n\\item When constructing a layer using\na \\verb{stat_*()} function, the \\code{...} argument can be used to pass on\nparameters to the \\code{geom} part of the layer. An example of this is\n\\code{stat_density(geom = \"area\", outline.type = \"both\")}. The geom's\ndocumentation lists which parameters it can accept.\n\\item Inversely, when constructing a layer using a\n\\verb{geom_*()} function, the \\code{...} argument can be used to pass on parameters\nto the \\code{stat} part of the layer. An example of this is\n\\code{geom_area(stat = \"density\", adjust = 0.5)}. The stat's documentation\nlists which parameters it can accept.\n\\item The \\code{key_glyph} argument of \\code{\\link[ggplot2:layer]{layer()}} may also be passed on through\n\\code{...}. This can be one of the functions described as\n\\link[ggplot2:draw_key]{key glyphs}, to change the display of the layer in the legend.\n}}\n\n\\item{method, method_y}{Density estimator(s) to use.\nBy default \\code{method} is used for both x- and y-axis.\nIf specified, \\code{method_y} will be used for y-axis.\nAccepts character vector: \\code{\"kde\"},\\code{\"histogram\"}, \\code{\"freqpoly\"}, or \\code{\"norm\"}.\nAlternatively accepts functions  which return closures corresponding to density estimates,\nsee \\code{?get_hdr_1d} or \\code{vignette(\"method\", \"ggdensity\")}.}\n\n\\item{probs}{Probabilities to compute highest density regions for.}\n\n\\item{xlim, ylim}{Range to compute and draw regions. If \\code{NULL}, defaults to\nrange of data.}\n\n\\item{n}{Resolution of grid defined by \\code{xlim} and \\code{ylim}.\nIgnored if \\code{method = \"histogram\"} or \\code{method = \"freqpoly\"}.}\n\n\\item{na.rm}{If \\code{FALSE}, the default, missing values are removed with\na warning. If \\code{TRUE}, missing values are silently removed.}\n\n\\item{show.legend}{logical. Should this layer be included in the legends?\n\\code{NA}, the default, includes if any aesthetics are mapped.\n\\code{FALSE} never includes, and \\code{TRUE} always includes.\nIt can also be a named logical vector to finely select the aesthetics to\ndisplay. To include legend keys for all levels, even\nwhen no data exists, use \\code{TRUE}.  If \\code{NA}, all levels are shown in legend,\nbut unobserved levels are omitted.}\n\n\\item{inherit.aes}{If \\code{FALSE}, overrides the default aesthetics,\nrather than combining with them. This is most useful for helper functions\nthat define both data and aesthetics and shouldn't inherit behaviour from\nthe default plot specification, e.g. \\code{\\link[ggplot2:annotation_borders]{annotation_borders()}}.}\n\n\\item{stat}{The statistical transformation to use on the data for this layer.\nWhen using a \\verb{geom_*()} function to construct a layer, the \\code{stat}\nargument can be used to override the default coupling between geoms and\nstats. The \\code{stat} argument accepts the following:\n\\itemize{\n\\item A \\code{Stat} ggproto subclass, for example \\code{StatCount}.\n\\item A string naming the stat. To give the stat as a string, strip the\nfunction name of the \\code{stat_} prefix. For example, to use \\code{stat_count()},\ngive the stat as \\code{\"count\"}.\n\\item For more information and other ways to specify the stat, see the\n\\link[ggplot2:layer_stats]{layer stat} documentation.\n}}\n\n\\item{outside}{logical that controls whether to move the rug tassels outside of the plot area. Default is off (FALSE). You will also need to use \\code{coord_cartesian(clip = \"off\")}. When set to TRUE, also consider changing the sides argument to \"tr\". See examples.}\n\n\\item{sides}{A string that controls which sides of the plot the rugs appear on.\nIt can be set to a string containing any of \\code{\"trbl\"}, for top, right,\nbottom, and left.}\n\n\\item{length}{A \\code{\\link[grid:unit]{grid::unit()}} object that sets the length of the rug lines. Use scale expansion to avoid overplotting of data.}\n}\n\\description{\nPerform 1D density estimation, compute and plot the resulting highest density\nregions in a way similar to \\code{\\link[ggplot2:geom_rug]{ggplot2::geom_rug()}}.\nNote, the plotted objects have probabilities mapped to the \\code{alpha} aesthetic by default.\n}\n\\section{Aesthetics}{\n geom_hdr_rug understands the following aesthetics (required\naesthetics are in bold):\n\\itemize{\n\\item x\n\\item y\n\\item alpha\n\\item fill\n\\item group\n\\item subgroup\n}\n}\n\n\\section{Computed variables}{\n\n\n\\describe{ \\item{probs}{The probability of the highest density region, specified\nby \\code{probs}, corresponding to each point.} }\n}\n\n\\examples{\nset.seed(1)\ndf <- data.frame(x = rnorm(100), y = rnorm(100))\n\n# Plot marginal HDRs for bivariate data\nggplot(df, aes(x, y)) +\n  geom_point() +\n  geom_hdr_rug() +\n  coord_fixed()\n\nggplot(df, aes(x, y)) +\n  geom_hdr() +\n  geom_hdr_rug() +\n  coord_fixed()\n\n# Plot HDR for univariate data\nggplot(df, aes(x)) +\n  geom_density() +\n  geom_hdr_rug()\n\nggplot(df, aes(y = y)) +\n  geom_density() +\n  geom_hdr_rug()\n\n# Specify location of marginal HDRs as in ggplot2::geom_rug()\nggplot(df, aes(x, y)) +\n  geom_hdr() +\n  geom_hdr_rug(sides = \"tr\", outside = TRUE) +\n  coord_fixed(clip = \"off\")\n\n# Can use same methods of density estimation as geom_hdr().\n# For data with constrained support, we suggest setting method = \"histogram\":\nggplot(df, aes(x^2)) +\n geom_histogram(bins = 30, boundary = 0) +\n geom_hdr_rug(method = \"histogram\")\n\nggplot(df, aes(x^2, y^2)) +\n geom_hdr(method = \"histogram\") +\n geom_hdr_rug(method = \"histogram\") +\n coord_fixed()\n\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/geom_hdr_rug_fun.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/hdr_rug_fun.R\n\\docType{data}\n\\name{geom_hdr_rug_fun}\n\\alias{geom_hdr_rug_fun}\n\\alias{stat_hdr_rug_fun}\n\\alias{StatHdrRugFun}\n\\alias{GeomHdrRugFun}\n\\title{Rug plots of highest density region estimates of univariate pdfs}\n\\usage{\nstat_hdr_rug_fun(\n  mapping = NULL,\n  data = NULL,\n  geom = \"hdr_rug_fun\",\n  position = \"identity\",\n  ...,\n  fun_x = NULL,\n  fun_y = NULL,\n  args_x = list(),\n  args_y = list(),\n  probs = c(0.99, 0.95, 0.8, 0.5),\n  xlim = NULL,\n  ylim = NULL,\n  n = 512,\n  na.rm = FALSE,\n  show.legend = NA,\n  inherit.aes = TRUE\n)\n\ngeom_hdr_rug_fun(\n  mapping = NULL,\n  data = NULL,\n  stat = \"hdr_rug_fun\",\n  position = \"identity\",\n  ...,\n  outside = FALSE,\n  sides = \"bl\",\n  length = unit(0.03, \"npc\"),\n  na.rm = FALSE,\n  show.legend = NA,\n  inherit.aes = TRUE\n)\n}\n\\arguments{\n\\item{mapping}{Set of aesthetic mappings created by \\code{\\link[ggplot2:aes]{aes()}}. If specified and\n\\code{inherit.aes = TRUE} (the default), it is combined with the default mapping\nat the top level of the plot. You must supply \\code{mapping} if there is no plot\nmapping.}\n\n\\item{data}{The data to be displayed in this layer. There are three\noptions:\n\nIf \\code{NULL}, the default, the data is inherited from the plot\ndata as specified in the call to \\code{\\link[ggplot2:ggplot]{ggplot()}}.\n\nA \\code{data.frame}, or other object, will override the plot\ndata. All objects will be fortified to produce a data frame. See\n\\code{\\link[ggplot2:fortify]{fortify()}} for which variables will be created.\n\nA \\code{function} will be called with a single argument,\nthe plot data. The return value must be a \\code{data.frame}, and\nwill be used as the layer data. A \\code{function} can be created\nfrom a \\code{formula} (e.g. \\code{~ head(.x, 10)}).}\n\n\\item{geom}{The geometric object to use to display the data for this layer.\nWhen using a \\verb{stat_*()} function to construct a layer, the \\code{geom} argument\ncan be used to override the default coupling between stats and geoms. The\n\\code{geom} argument accepts the following:\n\\itemize{\n\\item A \\code{Geom} ggproto subclass, for example \\code{GeomPoint}.\n\\item A string naming the geom. To give the geom as a string, strip the\nfunction name of the \\code{geom_} prefix. For example, to use \\code{geom_point()},\ngive the geom as \\code{\"point\"}.\n\\item For more information and other ways to specify the geom, see the\n\\link[ggplot2:layer_geoms]{layer geom} documentation.\n}}\n\n\\item{position}{A position adjustment to use on the data for this layer. This\ncan be used in various ways, including to prevent overplotting and\nimproving the display. The \\code{position} argument accepts the following:\n\\itemize{\n\\item The result of calling a position function, such as \\code{position_jitter()}.\nThis method allows for passing extra arguments to the position.\n\\item A string naming the position adjustment. To give the position as a\nstring, strip the function name of the \\code{position_} prefix. For example,\nto use \\code{position_jitter()}, give the position as \\code{\"jitter\"}.\n\\item For more information and other ways to specify the position, see the\n\\link[ggplot2:layer_positions]{layer position} documentation.\n}}\n\n\\item{...}{Other arguments passed on to \\code{\\link[ggplot2:layer]{layer()}}'s \\code{params} argument. These\narguments broadly fall into one of 4 categories below. Notably, further\narguments to the \\code{position} argument, or aesthetics that are required\ncan \\emph{not} be passed through \\code{...}. Unknown arguments that are not part\nof the 4 categories below are ignored.\n\\itemize{\n\\item Static aesthetics that are not mapped to a scale, but are at a fixed\nvalue and apply to the layer as a whole. For example, \\code{colour = \"red\"}\nor \\code{linewidth = 3}. The geom's documentation has an \\strong{Aesthetics}\nsection that lists the available options. The 'required' aesthetics\ncannot be passed on to the \\code{params}. Please note that while passing\nunmapped aesthetics as vectors is technically possible, the order and\nrequired length is not guaranteed to be parallel to the input data.\n\\item When constructing a layer using\na \\verb{stat_*()} function, the \\code{...} argument can be used to pass on\nparameters to the \\code{geom} part of the layer. An example of this is\n\\code{stat_density(geom = \"area\", outline.type = \"both\")}. The geom's\ndocumentation lists which parameters it can accept.\n\\item Inversely, when constructing a layer using a\n\\verb{geom_*()} function, the \\code{...} argument can be used to pass on parameters\nto the \\code{stat} part of the layer. An example of this is\n\\code{geom_area(stat = \"density\", adjust = 0.5)}. The stat's documentation\nlists which parameters it can accept.\n\\item The \\code{key_glyph} argument of \\code{\\link[ggplot2:layer]{layer()}} may also be passed on through\n\\code{...}. This can be one of the functions described as\n\\link[ggplot2:draw_key]{key glyphs}, to change the display of the layer in the legend.\n}}\n\n\\item{fun_x, fun_y}{Functions, the univariate probability density function for the x- and/or y-axis.\nFirst argument must be vectorized.}\n\n\\item{args_x, args_y}{Named list of additional arguments passed on to \\code{fun_x} and/or \\code{fun_y}.}\n\n\\item{probs}{Probabilities to compute highest density regions for.}\n\n\\item{xlim, ylim}{Range to compute and draw regions. If \\code{NULL}, defaults to\nrange of data.}\n\n\\item{n}{Resolution of grid defined by \\code{xlim} and \\code{ylim}.\nIgnored if \\code{method = \"histogram\"} or \\code{method = \"freqpoly\"}.}\n\n\\item{na.rm}{If \\code{FALSE}, the default, missing values are removed with\na warning. If \\code{TRUE}, missing values are silently removed.}\n\n\\item{show.legend}{logical. Should this layer be included in the legends?\n\\code{NA}, the default, includes if any aesthetics are mapped.\n\\code{FALSE} never includes, and \\code{TRUE} always includes.\nIt can also be a named logical vector to finely select the aesthetics to\ndisplay. To include legend keys for all levels, even\nwhen no data exists, use \\code{TRUE}.  If \\code{NA}, all levels are shown in legend,\nbut unobserved levels are omitted.}\n\n\\item{inherit.aes}{If \\code{FALSE}, overrides the default aesthetics,\nrather than combining with them. This is most useful for helper functions\nthat define both data and aesthetics and shouldn't inherit behaviour from\nthe default plot specification, e.g. \\code{\\link[ggplot2:annotation_borders]{annotation_borders()}}.}\n\n\\item{stat}{The statistical transformation to use on the data for this layer.\nWhen using a \\verb{geom_*()} function to construct a layer, the \\code{stat}\nargument can be used to override the default coupling between geoms and\nstats. The \\code{stat} argument accepts the following:\n\\itemize{\n\\item A \\code{Stat} ggproto subclass, for example \\code{StatCount}.\n\\item A string naming the stat. To give the stat as a string, strip the\nfunction name of the \\code{stat_} prefix. For example, to use \\code{stat_count()},\ngive the stat as \\code{\"count\"}.\n\\item For more information and other ways to specify the stat, see the\n\\link[ggplot2:layer_stats]{layer stat} documentation.\n}}\n\n\\item{outside}{logical that controls whether to move the rug tassels outside of the plot area. Default is off (FALSE). You will also need to use \\code{coord_cartesian(clip = \"off\")}. When set to TRUE, also consider changing the sides argument to \"tr\". See examples.}\n\n\\item{sides}{A string that controls which sides of the plot the rugs appear on.\nIt can be set to a string containing any of \\code{\"trbl\"}, for top, right,\nbottom, and left.}\n\n\\item{length}{A \\code{\\link[grid:unit]{grid::unit()}} object that sets the length of the rug lines. Use scale expansion to avoid overplotting of data.}\n}\n\\description{\nCompute and plot the highest density regions (HDRs) of specified univariate pdf(s).\nNote, the plotted objects have probabilities mapped to the \\code{alpha} aesthetic by default.\n}\n\\section{Aesthetics}{\n \\code{geom_hdr_rug_fun()} understands the following aesthetics (required\naesthetics are in bold):\n\\itemize{\n\\item x\n\\item y\n\\item alpha\n\\item fill\n\\item group\n\\item subgroup\n}\n}\n\n\\section{Computed variables}{\n\n\n\\describe{ \\item{probs}{The probability of the highest density region, specified\nby \\code{probs}, corresponding to each point.} }\n}\n\n\\examples{\n# Plotting data with exponential marginals\ndf <- data.frame(x = rexp(1e3), y = rexp(1e3))\n\nggplot(df, aes(x, y)) +\n  geom_hdr_rug_fun(fun_x = dexp, fun_y = dexp) +\n  geom_point(size = .5) +\n  coord_fixed()\n\n# without data/aesthetic mappings\nggplot() +\n  geom_hdr_rug_fun(fun_x = dexp, fun_y = dexp, xlim = c(0, 7), ylim = c(0, 7)) +\n  coord_fixed()\n\n\n# Plotting univariate normal data, estimating mean and sd\ndf <- data.frame(x = rnorm(1e4, mean = 1, sd = 3))\n\n# estimating parameters\nmu_hat <- mean(df$x)\nsd_hat <- sd(df$x)\n\nggplot(df, aes(x)) +\n  geom_hdr_rug_fun(fun_x = dnorm, args_x = list(mean = mu_hat, sd = sd_hat)) +\n  geom_density()\n\n# Equivalent to `method_norm_1d()` with `geom_hdr_rug()`\nggplot(df, aes(x)) +\n  geom_hdr_rug(method = method_norm_1d()) +\n  geom_density()\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/get_hdr.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/get_hdr.R\n\\name{get_hdr}\n\\alias{get_hdr}\n\\title{Computing the highest density regions of a 2D density}\n\\usage{\nget_hdr(\n  data = NULL,\n  method = \"kde\",\n  probs = c(0.99, 0.95, 0.8, 0.5),\n  n = 100,\n  rangex = NULL,\n  rangey = NULL,\n  hdr_membership = TRUE,\n  fun,\n  args = list()\n)\n}\n\\arguments{\n\\item{data}{A data frame with columns \\code{x} and \\code{y}.}\n\n\\item{method}{Either a character (\\code{\"kde\"}, \\code{\"mvnorm\"}, \\code{\"histogram\"},\n\\code{\"freqpoly\"}, or \\code{\"fun\"}) or \\verb{method_*()} function. See the \"The \\code{method}\nargument\" section below for details.}\n\n\\item{probs}{Probabilities to compute HDRs for.}\n\n\\item{n}{Resolution of grid representing estimated density and HDRs.}\n\n\\item{rangex, rangey}{Range of grid representing estimated density and HDRs,\nalong the x- and y-axes.}\n\n\\item{hdr_membership}{Should HDR membership of data points (\\code{data}) be\ncomputed? Defaults to \\code{TRUE}, although it is computationally expensive for\nlarge data sets.}\n\n\\item{fun}{Optional, a joint probability density function, must be vectorized\nin its first two arguments. See the \"The \\code{fun} argument\" section below for\ndetails.}\n\n\\item{args}{Optional, a list of arguments to be provided to \\code{fun}.}\n}\n\\value{\n\\code{get_hdr} returns a list with elements \\code{df_est} (\\code{data.frame}), \\code{breaks}\n(named \\code{numeric}), and \\code{data} (\\code{data.frame}).\n\\itemize{\n\\item \\code{df_est}: the estimated HDRs and density evaluated on the grid defined by \\code{rangex}, \\code{rangey}, and \\code{n}.\nThe column of estimated HDRs (\\code{df_est$hdr}) is a numeric vector with values\nfrom \\code{probs}. The columns \\code{df_est$fhat} and \\code{df_est$fhat_discretized}\ncorrespond to the estimated density on the original scale and rescaled to sum\nto 1, respectively.\n\\item \\code{breaks}: the heights of the estimated density (\\code{df_est$fhat}) corresponding to the HDRs specified by \\code{probs}.\nWill always have additional element \\code{Inf} representing the cutoff for the\n100\\% HDR.\n\\item \\code{data}: the original data provided in the \\code{data} argument.\nIf \\code{hdr_membership} is set to \\code{TRUE}, this includes a column\n(\\code{data$hdr_membership}) with the HDR corresponding to each data point.\n}\n}\n\\description{\n\\code{get_hdr} is used to estimate a 2-dimensional density and compute\ncorresponding HDRs. The estimated density and HDRs are represented in a\ndiscrete form as a grid, defined by arguments \\code{rangex}, \\code{rangey}, and \\code{n}.\n\\code{get_hdr} is used internally by layer functions \\code{stat_hdr()},\n\\code{stat_hdr_points()}, \\code{stat_hdr_fun()}, etc.\n}\n\\section{The \\code{method} argument}{\n The density estimator used to estimate the\nHDRs is specified with the \\code{method} argument. The simplest way to specify\nan estimator is to provide a character value to \\code{method}, for example\n\\code{method = \"kde\"} specifies a kernel density estimator. However, this\nspecification is limited to the default behavior of the estimator.\n\nInstead, it is possible to provide a function call, for example: \\code{method = method_kde()}. In many cases, these functions accept parameters governing\nthe density estimation procedure. Here, \\code{method_kde()} accepts parameters\n\\code{h} and \\code{adjust}, both related to the kernel's bandwidth. For details, see\n\\code{?method_kde}. Every method of bivariate density estimation implemented has\nsuch corresponding \\verb{method_*()} function, each with an associated help\npage.\n\nNote: \\code{geom_hdr()} and other layer functions also have \\code{method} arguments\nwhich behave in the same way. For more details on the use and\nimplementation of the \\verb{method_*()} functions, see \\code{vignette(\"method\", \"ggdensity\")}.\n}\n\n\\section{The \\code{fun} argument}{\n If \\code{method} is set to \\code{\"fun\"}, \\code{get_hdr()}\nexpects a bivariate probability density function to be specified with the\n\\code{fun} argument. It is required that \\code{fun} be a function of at least two\narguments (\\code{x} and \\code{y}). Beyond these first two arguments, \\code{fun} can have\narbitrarily many arguments; these can be set in \\code{get_hdr()} as a named list\nvia the \\code{args} parameter.\n\nNote: \\code{get_hdr()} requires that \\code{fun} be vectorized in \\code{x} and \\code{y}. For an\nexample of an appropriate choice of \\code{fun}, see the final example below.\n}\n\n\\examples{\ndf <- data.frame(x = rnorm(1e3), y = rnorm(1e3))\n\n# Two ways to specify `method`\nget_hdr(df, method = \"kde\")\nget_hdr(df, method = method_kde())\n\n\\dontrun{\n\n# If parenthesis are omitted, `get_hdr()` errors\nget_hdr(df, method = method_kde)\n}\n\n# Estimate different HDRs with `probs`\nget_hdr(df, method = method_kde(), probs = c(.975, .6, .2))\n\n# Adjust estimator parameters with arguments to `method_kde()`\nget_hdr(df, method = method_kde(h = 1))\n\n# Parametric normal estimator of density\nget_hdr(df, method = \"mvnorm\")\nget_hdr(df, method = method_mvnorm())\n\n# Compute \"population\" HDRs of specified bivariate pdf with `method = \"fun\"`\nf <- function(x, y, sd_x = 1, sd_y = 1) dnorm(x, sd = sd_x) * dnorm(y, sd = sd_y)\n\nget_hdr(\n  method = \"fun\", fun = f,\n  rangex = c(-5, 5), rangey = c(-5, 5)\n )\n\nget_hdr(\n  method = \"fun\", fun = f,\n  rangex = c(-5, 5), rangey = c(-5, 5),\n  args = list(sd_x = .5, sd_y = .5) # specify additional arguments w/ `args`\n)\n\n}\n"
  },
  {
    "path": "man/get_hdr_1d.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/get_hdr_1d.R\n\\name{get_hdr_1d}\n\\alias{get_hdr_1d}\n\\title{Computing the highest density regions of a 1D density}\n\\usage{\nget_hdr_1d(\n  x = NULL,\n  method = \"kde\",\n  probs = c(0.99, 0.95, 0.8, 0.5),\n  n = 512,\n  range = NULL,\n  hdr_membership = TRUE,\n  fun,\n  args = list()\n)\n}\n\\arguments{\n\\item{x}{A vector of data}\n\n\\item{method}{Either a character (\\code{\"kde\"}, \\code{\"norm\"}, \\code{\"histogram\"}, \\code{\"freqpoly\"}, or \\code{\"fun\"}) or \\verb{method_*_1d()} function.\nSee the \"The \\code{method} argument\" section below for details.}\n\n\\item{probs}{Probabilities to compute HDRs for.}\n\n\\item{n}{Resolution of grid representing estimated density and HDRs.}\n\n\\item{range}{Range of grid representing estimated density and HDRs.}\n\n\\item{hdr_membership}{Should HDR membership of data points (\\code{x}) be computed?}\n\n\\item{fun}{Optional, a probability density function, must be vectorized in its first argument.\nSee the \"The \\code{fun} argument\" section below for details.}\n\n\\item{args}{Optional, a list of arguments to be provided to \\code{fun}.}\n}\n\\value{\n\\code{get_hdr_1d} returns a list with elements \\code{df_est} (\\code{data.frame}), \\code{breaks} (named \\code{numeric}), and \\code{data} (\\code{data.frame}).\n\\itemize{\n\\item \\code{df_est}: the estimated HDRs and density evaluated on the grid defined by \\code{range} and \\code{n}.\nThe column of estimated HDRs (\\code{df_est$hdr}) is a numeric vector with values from \\code{probs}.\nThe columns \\code{df_est$fhat} and \\code{df_est$fhat_discretized} correspond to the estimated density\non the original scale and rescaled to sum to 1, respectively.\n\\item \\code{breaks}: the heights of the estimated density (\\code{df_est$fhat}) corresponding to the HDRs specified by \\code{probs}.\nWill always have additional element \\code{Inf} representing the cutoff for the 100\\% HDR.\n\\item \\code{data}: the original data provided in the \\code{data} argument.\nIf \\code{hdr_membership} is set to \\code{TRUE}, this includes a column (\\code{data$hdr_membership})\nwith the HDR corresponding to each data point.\n}\n}\n\\description{\n\\code{get_hdr_1d} is used to estimate a 1-dimensional density and compute corresponding HDRs.\nThe estimated density and HDRs are represented in a discrete form as a grid, defined by arguments \\code{range} and \\code{n}.\n\\code{get_hdr_1d} is used internally by layer functions \\code{stat_hdr_rug()} and \\code{stat_hdr_rug_fun()}.\n}\n\\section{The \\code{method} argument}{\n\nThe density estimator used to estimate the HDRs is specified with the \\code{method} argument.\nThe simplest way to specify an estimator is to provide a character value to \\code{method},\nfor example \\code{method = \"kde\"} specifies a kernel density estimator.\nHowever, this specification is limited to the default behavior of the estimator.\n\nInstead, it is possible to provide a function call, for example: \\code{method = method_kde_1d()}.\nThis is slightly different from the function calls provided in \\code{get_hdr()}, note the \\verb{_1d} suffix.\nIn many cases, these functions accept parameters governing the density estimation procedure.\nHere, \\code{method_kde_1d()} accepts several parameters related to the choice of kernel.\nFor details, see \\code{?method_kde_1d}.\nEvery method of univariate density estimation implemented has such corresponding \\verb{method_*_1d()} function,\neach with an associated help page.\n\nNote: \\code{geom_hdr_rug()} and other layer functions also have \\code{method} arguments which behave in the same way.\nFor more details on the use and implementation of the \\verb{method_*_1d()} functions,\nsee \\code{vignette(\"method\", \"ggdensity\")}.\n}\n\n\\section{The \\code{fun} argument}{\n\nIf \\code{method} is set to \\code{\"fun\"}, \\code{get_hdr_1d()} expects a univariate probability\ndensity function to be specified with the \\code{fun} argument.\nIt is required that \\code{fun} be a function of at least one argument (\\code{x}).\nBeyond this first argument, \\code{fun} can have arbitrarily many arguments;\nthese can be set in \\code{get_hdr_1d()} as a named list via the \\code{args} parameter.\n\nNote: \\code{get_hdr_1d()} requires that \\code{fun} be vectorized in \\code{x}.\nFor an example of an appropriate choice of \\code{fun}, see the final example below.\n}\n\n\\examples{\nx <- rnorm(1e3)\n\n# Two ways to specify `method`\nget_hdr_1d(x, method = \"kde\")\nget_hdr_1d(x, method = method_kde_1d())\n\n\\dontrun{\n\n# If parenthesis are omitted, `get_hdr_1d()` errors\nget_hdr_1d(df, method = method_kde_1d)\n\n# If the `_1d` suffix is omitted, `get_hdr_1d()` errors\nget_hdr_1d(x, method = method_kde())\n}\n\n# Adjust estimator parameters with arguments to `method_kde_1d()`\nget_hdr_1d(x, method = method_kde_1d(kernel = \"triangular\"))\n\n# Estimate different HDRs with `probs`\nget_hdr_1d(x, method = method_kde_1d(), probs = c(.975, .6, .2))\n\n# Compute \"population\" HDRs of specified univariate pdf with `method = \"fun\"`\nf <- function(x, sd = 1) dnorm(x, sd = sd)\nget_hdr_1d(method = \"fun\", fun = f, range = c(-5, 5))\nget_hdr_1d(method = \"fun\", fun = f, range = c(-5, 5), args = list(sd = .5))\n\n\n}\n"
  },
  {
    "path": "man/ggdensity.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/ggdensity-package.R\n\\name{ggdensity}\n\\alias{ggdensity}\n\\alias{package-ggdensity}\n\\title{ggdensity: Stats and Geoms for Density Estimation with ggplot2}\n\\description{\nA package that allows more flexible computations for visualization of density\nestimates with ggplot2.\n}\n\\seealso{\nUseful links:\n\\itemize{\n\\item \\url{https://jamesotto852.github.io/ggdensity/}\n\\item \\url{https://github.com/jamesotto852/ggdensity/}\n}\n}\n"
  },
  {
    "path": "man/method_freqpoly.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/method.R\n\\name{method_freqpoly}\n\\alias{method_freqpoly}\n\\title{Bivariate frequency polygon HDR estimator}\n\\usage{\nmethod_freqpoly(bins = NULL)\n}\n\\arguments{\n\\item{bins}{Number of bins along each axis.\nEither a vector of length 2 or a scalar value which is recycled for both dimensions.\nDefaults to normal reference rule (Scott, pg 87).}\n}\n\\description{\nFunction used to specify bivariate frequency polygon density estimator\nfor \\code{get_hdr()} and layer functions (e.g. \\code{geom_hdr()}).\n}\n\\details{\nFor more details on the use and implementation of the \\verb{method_*()} functions,\nsee \\code{vignette(\"method\", \"ggdensity\")}.\n}\n\\examples{\nset.seed(1)\ndf <- data.frame(x = rnorm(1e3), y = rnorm(1e3))\n\nggplot(df, aes(x, y)) +\n  geom_hdr(method = method_freqpoly()) +\n  geom_point(size = 1)\n\n# The resolution of the frequency polygon estimator can be set via `bins`\nggplot(df, aes(x, y)) +\n  geom_hdr(method = method_freqpoly(bins = c(8, 25))) +\n  geom_point(size = 1)\n\n# Can also be used with `get_hdr()` for numerical summary of HDRs\nres <- get_hdr(df, method = method_freqpoly())\nstr(res)\n\n}\n\\references{\nScott, David W. Multivariate Density Estimation (2e), Wiley.\n}\n"
  },
  {
    "path": "man/method_freqpoly_1d.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/method_1d.R\n\\name{method_freqpoly_1d}\n\\alias{method_freqpoly_1d}\n\\title{Univariate frequency polygon HDR estimator}\n\\usage{\nmethod_freqpoly_1d(bins = NULL)\n}\n\\arguments{\n\\item{bins}{Number of bins. Defaults to normal reference rule (Scott, pg 59).}\n}\n\\description{\nFunction used to specify univariate frequency polygon density estimator\nfor \\code{get_hdr_1d()} and layer functions (e.g. \\code{geom_hdr_rug()}).\n}\n\\details{\nFor more details on the use and implementation of the \\verb{method_*_1d()} functions,\nsee \\code{vignette(\"method\", \"ggdensity\")}.\n}\n\\examples{\ndf <- data.frame(x = rnorm(1e3))\n\n# Strip chart to visualize 1-d data\np <- ggplot(df, aes(x)) +\n  geom_jitter(aes(y = 0), width = 0, height = 2) +\n  scale_y_continuous(name = NULL, breaks = NULL) +\n  coord_cartesian(ylim = c(-3, 3))\n\np\n\np + geom_hdr_rug(method = method_freqpoly_1d())\n\n# The resolution of the frequency polygon estimator can be set via `bins`\np + geom_hdr_rug(method = method_freqpoly_1d(bins = 100))\n\n# Can also be used with `get_hdr_1d()` for numerical summary of HDRs\nres <- get_hdr_1d(df$x, method = method_freqpoly_1d())\nstr(res)\n\n}\n\\references{\nScott, David W. Multivariate Density Estimation (2e), Wiley.\n}\n"
  },
  {
    "path": "man/method_histogram.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/method.R\n\\name{method_histogram}\n\\alias{method_histogram}\n\\title{Bivariate histogram HDR estimator}\n\\usage{\nmethod_histogram(bins = NULL, smooth = FALSE, nudgex = \"none\", nudgey = \"none\")\n}\n\\arguments{\n\\item{bins}{Number of bins along each axis.\nEither a vector of length 2 or a scalar value which is recycled for both dimensions.\nDefaults to normal reference rule (Scott, pg 87).}\n\n\\item{smooth}{If \\code{TRUE}, HDRs are smoothed by the marching squares algorithm.}\n\n\\item{nudgex, nudgey}{Horizontal and vertical rules for choosing witness points when \\code{smooth == TRUE}.\nAccepts character vector: \\code{\"left\"}, \\code{\"none\"}, \\code{\"right\"} (\\code{nudgex}) or  \\code{\"down\"}, \\code{\"none\"}, \\code{\"up\"} (\\code{nudgey}).}\n}\n\\description{\nFunction used to specify bivariate histogram density estimator\nfor \\code{get_hdr()} and layer functions (e.g. \\code{geom_hdr()}).\n}\n\\details{\nFor more details on the use and implementation of the \\verb{method_*()} functions,\nsee \\code{vignette(\"method\", \"ggdensity\")}.\n}\n\\examples{\n\\dontrun{\n\n# Histogram estimators can be useful when data has boundary constraints\nset.seed(1)\ndf <- data.frame(x = rexp(1e3), y = rexp(1e3))\n\nggplot(df, aes(x, y)) +\n  geom_hdr(method = method_histogram()) +\n  geom_point(size = 1)\n\n# The resolution of the histogram estimator can be set via `bins`\nggplot(df, aes(x, y)) +\n  geom_hdr(method = method_histogram(bins = c(8, 25))) +\n  geom_point(size = 1)\n\n# By setting `smooth = TRUE`, we can graphically smooth the \"blocky\" HDRs\nggplot(df, aes(x, y)) +\n  geom_hdr(method = method_histogram(smooth = TRUE)) +\n  geom_point(size = 1)\n\n# However, we need to set `nudgex` and `nudgey` to align the HDRs correctly\nggplot(df, aes(x, y)) +\n  geom_hdr(method = method_histogram(smooth = TRUE, nudgex = \"left\", nudgey = \"down\")) +\n  geom_point(size = 1)\n\n# Can also be used with `get_hdr()` for numerical summary of HDRs\nres <- get_hdr(df, method = method_histogram())\nstr(res)\n}\n\n}\n\\references{\nScott, David W. Multivariate Density Estimation (2e), Wiley.\n}\n"
  },
  {
    "path": "man/method_histogram_1d.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/method_1d.R\n\\name{method_histogram_1d}\n\\alias{method_histogram_1d}\n\\title{Univariate histogram HDR estimator}\n\\usage{\nmethod_histogram_1d(bins = NULL)\n}\n\\arguments{\n\\item{bins}{Number of bins. Defaults to normal reference rule (Scott, pg 59).}\n}\n\\description{\nFunction used to specify univariate histogram density estimator\nfor \\code{get_hdr_1d()} and layer functions (e.g. \\code{geom_hdr_rug()}).\n}\n\\details{\nFor more details on the use and implementation of the \\verb{method_*_1d()} functions,\nsee \\code{vignette(\"method\", \"ggdensity\")}.\n}\n\\examples{\n# Histogram estimators can be useful when data has boundary constraints\ndf <- data.frame(x = rexp(1e3))\n\n# Strip chart to visualize 1-d data\np <- ggplot(df, aes(x)) +\n  geom_jitter(aes(y = 0), width = 0, height = 2) +\n  scale_y_continuous(name = NULL, breaks = NULL) +\n  coord_cartesian(ylim = c(-3, 3))\n\np\n\np + geom_hdr_rug(method = method_histogram_1d())\n\n# The resolution of the histogram estimator can be set via `bins`\np + geom_hdr_rug(method = method_histogram_1d(bins = 5))\n\n# Can also be used with `get_hdr_1d()` for numerical summary of HDRs\nres <- get_hdr_1d(df$x, method = method_histogram_1d())\nstr(res)\n\n}\n\\references{\nScott, David W. Multivariate Density Estimation (2e), Wiley.\n}\n"
  },
  {
    "path": "man/method_kde.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/method.R\n\\name{method_kde}\n\\alias{method_kde}\n\\title{Bivariate kernel density HDR estimator}\n\\usage{\nmethod_kde(h = NULL, adjust = c(1, 1))\n}\n\\arguments{\n\\item{h}{Bandwidth (vector of length two). If \\code{NULL}, estimated\nusing \\code{\\link[MASS:bandwidth.nrd]{MASS::bandwidth.nrd()}}.}\n\n\\item{adjust}{A multiplicative bandwidth adjustment to be used if 'h' is\n'NULL'. This makes it possible to adjust the bandwidth while still\nusing the a bandwidth estimator. For example, \\code{adjust = 1/2} means\nuse half of the default bandwidth.}\n}\n\\description{\nFunction used to specify bivariate kernel density estimator\nfor \\code{get_hdr()} and layer functions (e.g. \\code{geom_hdr()}).\n}\n\\details{\nFor more details on the use and implementation of the \\verb{method_*()} functions,\nsee \\code{vignette(\"method\", \"ggdensity\")}.\n}\n\\examples{\nset.seed(1)\ndf <- data.frame(x = rnorm(1e3, sd = 3), y = rnorm(1e3, sd = 3))\n\nggplot(df, aes(x, y)) +\n  geom_hdr(method = method_kde()) +\n  geom_point(size = 1)\n\n# The defaults of `method_kde()` are the same as the estimator for `ggplot2::geom_density_2d()`\nggplot(df, aes(x, y)) +\n  geom_density_2d_filled() +\n  geom_hdr_lines(method = method_kde(), probs = seq(.1, .9, by = .1)) +\n  theme(legend.position = \"none\")\n\n# The bandwidth of the estimator can be set directly with `h` or scaled with `adjust`\nggplot(df, aes(x, y)) +\n  geom_hdr(method = method_kde(h = 1)) +\n  geom_point(size = 1)\n\nggplot(df, aes(x, y)) +\n  geom_hdr(method = method_kde(adjust = 1/2)) +\n  geom_point(size = 1)\n\n# Can also be used with `get_hdr()` for numerical summary of HDRs\nres <- get_hdr(df, method = method_kde())\nstr(res)\n\n}\n"
  },
  {
    "path": "man/method_kde_1d.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/method_1d.R\n\\name{method_kde_1d}\n\\alias{method_kde_1d}\n\\title{Univariate kernel density HDR estimator}\n\\usage{\nmethod_kde_1d(\n  bw = \"nrd0\",\n  adjust = 1,\n  kernel = \"gaussian\",\n  weights = NULL,\n  window = kernel\n)\n}\n\\arguments{\n\\item{bw}{the smoothing bandwidth to be used.  The kernels are scaled\n    such that this is the standard deviation of the smoothing kernel.\n    (Note this differs from the reference books cited below.)\n\n    \\code{bw} can also be a character string giving a rule to choose the\n    bandwidth.  See \\code{\\link[stats]{bw.nrd}}. \\cr The default,\n    \\code{\"nrd0\"}, has remained the default for historical and\n    compatibility reasons, rather than as a general recommendation,\n    where e.g., \\code{\"SJ\"} would rather fit, see also\n    Venables and Ripley (2002).\n\n    The specified (or computed) value of \\code{bw} is multiplied by\n    \\code{adjust}.\n  }\n\n\\item{adjust}{the bandwidth used is actually \\code{adjust*bw}.\n    This makes it easy to specify values like \\sQuote{half the default}\n    bandwidth.}\n\n\\item{kernel, window}{a character string giving the smoothing kernel\n    to be used. This must partially match one of \\code{\"gaussian\"},\n    \\code{\"rectangular\"}, \\code{\"triangular\"}, \\code{\"epanechnikov\"},\n    \\code{\"biweight\"}, \\code{\"cosine\"} or \\code{\"optcosine\"}, with default\n    \\code{\"gaussian\"}, and may be abbreviated to a unique prefix (single\n    letter).\n\n    \\code{\"cosine\"} is smoother than \\code{\"optcosine\"}, which is the\n    usual \\sQuote{cosine} kernel in the literature and almost MSE-efficient.\n    However, \\code{\"cosine\"} is the version used by S.\n  }\n\n\\item{weights}{numeric vector of non-negative observation weights,\n    hence of same length as \\code{x}. The default \\code{NULL} is\n    equivalent to \\code{weights = rep(1/nx, nx)} where \\code{nx} is the\n    length of (the finite entries of) \\code{x[]}.  If \\code{na.rm = TRUE}\n    and there are \\code{NA}'s in \\code{x}, they \\emph{and} the\n    corresponding weights are removed before computations.  In that case,\n    when the original weights have summed to one, they are re-scaled to\n    keep doing so.\n\n    Note that weights are \\emph{not} taken into account for automatic\n    bandwidth rules, i.e., when \\code{bw} is a string.  When the weights\n    are proportional to true counts \\code{cn}, \\code{density(x = rep(x, cn))}\n    may be used instead of \\code{weights}.\n  }\n}\n\\description{\nFunction used to specify univariate kernel density estimator\nfor \\code{get_hdr_1d()} and layer functions (e.g. \\code{geom_hdr_rug()}).\n}\n\\details{\nFor more details on the use and implementation of the \\verb{method_*_1d()} functions,\nsee \\code{vignette(\"method\", \"ggdensity\")}.\n}\n\\examples{\ndf <- data.frame(x = rnorm(1e3, sd = 3))\n\nggplot(df, aes(x)) +\n  geom_hdr_rug(method = method_kde_1d()) +\n  geom_density()\n\n# Details of the KDE can be adjusted with arguments to `method_kde_1d()`\nggplot(df, aes(x)) +\n  geom_hdr_rug(method = method_kde_1d(adjust = 1/5)) +\n  geom_density(adjust = 1/5)\n\nggplot(df, aes(x)) +\n  geom_hdr_rug(method = method_kde_1d(kernel = \"triangular\")) +\n  geom_density(kernel = \"triangular\")\n\n# Can also be used with `get_hdr_1d()` for numerical summary of HDRs\nres <- get_hdr_1d(df$x, method = method_kde_1d())\nstr(res)\n\n}\n"
  },
  {
    "path": "man/method_mvnorm.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/method.R\n\\name{method_mvnorm}\n\\alias{method_mvnorm}\n\\title{Bivariate parametric normal HDR estimator}\n\\usage{\nmethod_mvnorm()\n}\n\\description{\nFunction used to specify bivariate normal density estimator\nfor \\code{get_hdr()} and layer functions (e.g. \\code{geom_hdr()}).\n}\n\\details{\nFor more details on the use and implementation of the \\verb{method_*()} functions,\nsee \\code{vignette(\"method\", \"ggdensity\")}.\n}\n\\examples{\n# Normal estimator is useful when an assumption of normality is appropriate\nset.seed(1)\ndf <- data.frame(x = rnorm(1e3), y = rnorm(1e3))\n\nggplot(df, aes(x, y)) +\n  geom_hdr(method = method_mvnorm(), xlim = c(-4, 4), ylim = c(-4, 4)) +\n  geom_point(size = 1)\n\n# Can also be used with `get_hdr()` for numerical summary of HDRs\nres <- get_hdr(df, method = method_mvnorm())\nstr(res)\n\n}\n"
  },
  {
    "path": "man/method_norm_1d.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/method_1d.R\n\\name{method_norm_1d}\n\\alias{method_norm_1d}\n\\title{Univariate parametric normal HDR estimator}\n\\usage{\nmethod_norm_1d()\n}\n\\description{\nFunction used to specify univariate normal density estimator\nfor \\code{get_hdr_1d()} and layer functions (e.g. \\code{geom_hdr_rug()}).\n}\n\\details{\nFor more details on the use and implementation of the \\verb{method_*_1d()} functions,\nsee \\code{vignette(\"method\", \"ggdensity\")}.\n}\n\\examples{\n# Normal estimators are useful when an assumption of normality is appropriate\ndf <- data.frame(x = rnorm(1e3))\n\nggplot(df, aes(x)) +\n  geom_hdr_rug(method = method_norm_1d()) +\n  geom_density()\n\n# Can also be used with `get_hdr_1d()` for numerical summary of HDRs\nres <- get_hdr_1d(df$x, method = method_norm_1d())\nstr(res)\n\n}\n"
  },
  {
    "path": "revdep/.gitignore",
    "content": "checks\nlibrary\nchecks.noindex\nlibrary.noindex\ncloud.noindex\ndata.sqlite\n*.html\n"
  },
  {
    "path": "revdep/README.md",
    "content": "# Platform\n\n|field    |value                                                                                                                                                                                     |\n|:--------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n|version  |R version 4.5.2 (2025-10-31 ucrt)                                                                                                                                                         |\n|os       |Windows 11 x64 (build 26200)                                                                                                                                                              |\n|system   |x86_64, mingw32                                                                                                                                                                           |\n|ui       |RStudio                                                                                                                                                                                   |\n|language |(EN)                                                                                                                                                                                      |\n|collate  |English_United States.utf8                                                                                                                                                                |\n|ctype    |English_United States.utf8                                                                                                                                                                |\n|tz       |America/Chicago                                                                                                                                                                           |\n|date     |2026-02-24                                                                                                                                                                                |\n|rstudio  |2026.01.1+403 Apple Blossom (desktop)                                                                                                                                                     |\n|pandoc   |NA                                                                                                                                                                                        |\n|quarto   |ERROR: Unknown command \"TMPDIR=C:/Users/james/AppData/Local/Temp/Rtmpon9F2w/file48108fa5255\". Did you mean command \"create\"? @ C:\\PROGRA~1\\RStudio\\RESOUR~1\\app\\bin\\quarto\\bin\\quarto.exe |\n\n# Dependencies\n\n|package      |old    |new    |Δ  |\n|:------------|:------|:------|:--|\n|ggdensity    |1.0.0  |1.0.1  |*  |\n|cli          |3.6.5  |3.6.5  |   |\n|cpp11        |0.5.3  |0.5.3  |   |\n|farver       |2.1.2  |2.1.2  |   |\n|ggplot2      |4.0.2  |4.0.2  |   |\n|glue         |1.8.0  |1.8.0  |   |\n|gtable       |0.3.6  |0.3.6  |   |\n|isoband      |0.3.0  |0.3.0  |   |\n|labeling     |0.4.3  |0.4.3  |   |\n|lifecycle    |1.0.5  |1.0.5  |   |\n|magrittr     |2.0.4  |2.0.4  |   |\n|pillar       |1.11.1 |1.11.1 |   |\n|pkgconfig    |2.0.3  |2.0.3  |   |\n|R6           |2.6.1  |2.6.1  |   |\n|RColorBrewer |1.1-3  |1.1-3  |   |\n|rlang        |1.1.7  |1.1.7  |   |\n|S7           |0.2.1  |0.2.1  |   |\n|scales       |1.4.0  |1.4.0  |   |\n|tibble       |3.3.1  |3.3.1  |   |\n|utf8         |1.2.6  |1.2.6  |   |\n|vctrs        |0.7.1  |0.7.1  |   |\n|viridisLite  |0.4.3  |0.4.3  |   |\n|withr        |3.0.2  |3.0.2  |   |\n\n# Revdeps\n\n"
  },
  {
    "path": "revdep/cran.md",
    "content": "## revdepcheck results\n\nWe checked 6 reverse dependencies (5 from CRAN + 1 from Bioconductor), comparing R CMD check results across CRAN and dev versions of this package.\n\n * We saw 0 new problems\n * We failed to check 0 packages\n\n"
  },
  {
    "path": "revdep/email.yml",
    "content": "release_date: ???\nrel_release_date: ???\nmy_news_url: ???\nrelease_version: ???\nrelease_details: ???\n"
  },
  {
    "path": "revdep/failures.md",
    "content": "*Wow, no problems at all. :)*"
  },
  {
    "path": "revdep/problems.md",
    "content": "*Wow, no problems at all. :)*"
  },
  {
    "path": "tests/testthat/test-fix_probs.R",
    "content": "test_that(\"fix_probs() works as intended\", {\n\n  # Check defaults\n  expect_equal(fix_probs(c(.99, .95, .80, .50)), c(.99, .95, .80, .50))\n\n  # Reorders probabilities correctly\n  expect_equal(fix_probs(c(.80, .50, .99, .95)), c(.99, .95, .80, .50))\n\n  # Works with vectors of length 1\n  expect_equal(fix_probs(.50), .5)\n\n  # Issues error if any probabilites are outside (0, 1)\n  expect_error(fix_probs(c(1.1, .80, .5)), regexp = \"must be between\")\n  expect_error(fix_probs(c(.80, .5, -1)), regexp = \"must be between\")\n  expect_error(fix_probs(c(1)), regexp = \"must be between\")\n  expect_error(fix_probs(c(0)), regexp = \"must be between\")\n\n})\n"
  },
  {
    "path": "tests/testthat/test-get_hdr.R",
    "content": "test_that(\"structure of get_hdr() return value is as expected\", {\n\n  data <- data.frame(\n    x = 1:10,\n    y = rep(1:5, each = 2)\n  )\n\n  res <- get_hdr(data)\n\n  # Checking the top level of res\n  expect_type(res, \"list\")\n  expect_equal(length(res), 3)\n  expect_equal(names(res), c(\"df_est\", \"breaks\", \"data\"))\n\n  # Checking res$df_est:\n  expect_type(res$df_est, \"list\")\n  expect_equal(ncol(res$df_est), 5)\n  expect_equal(colnames(res$df_est), c(\"x\", \"y\", \"fhat\", \"fhat_discretized\", \"hdr\"))\n\n  # Checking res$data\n  expect_type(res$data, \"list\")\n  expect_equal(ncol(res$data), 3)\n  expect_equal(nrow(res$data), 10)\n  expect_equal(colnames(res$data), c(\"x\", \"y\", \"hdr_membership\"))\n\n  # Checking res$breaks\n  expect_type(res$breaks, \"double\")\n  expect_equal(length(res$breaks), 5)\n  expect_equal(names(res$breaks), c(\"99%\", \"95%\", \"80%\", \"50%\", NA))\n\n\n  # Now with non-default args -----------------------------------------\n  res <- get_hdr(data, probs = c(.989, .878, .67, .43, .21), hdr_membership = FALSE)\n\n  # Checking res$data\n  expect_equal(ncol(res$data), 2)\n\n  # Checking res$breaks\n  expect_type(res$breaks, \"double\")\n  expect_equal(length(res$breaks), 6)\n  expect_equal(names(res$breaks), c(\"99%\", \"88%\", \"67%\", \"43%\", \"21%\", NA))\n\n})\n\ntest_that(\"`method` can be provided as a character vector or function\", {\n\n  data <- data.frame(\n    x = 1:10,\n    y = rep(1:5, each = 2)\n  )\n\n  expect_equal(get_hdr(data, \"kde\"), get_hdr(data, method_kde()))\n  expect_equal(get_hdr(data, \"mvnorm\"), get_hdr(data, method_mvnorm()))\n  expect_equal(get_hdr(data, \"freqpoly\"), get_hdr(data, method_freqpoly()))\n  expect_equal(get_hdr(data, \"histogram\"), get_hdr(data, method_histogram()))\n\n})\n\ntest_that(\"get_hdr() errors informatively if bad `method` argument\", {\n\n  data <- data.frame(\n    x = 1:10,\n    y = rep(1:5, each = 2)\n  )\n\n  expect_error(get_hdr(data, method = \"not-a-method\"), regexp = \"Invalid method specified\")\n  expect_error(get_hdr(data, method = method_kde), regexp = \"did you forget\")\n\n})\n\n\n# # The data used for tests:\n#\n# set.seed(1)\n# df <- data.frame(\n#   x = rnorm(5e3),\n#   y = rnorm(5e3)\n# )\n#\n# write_rds(df, here::here(\"tests/testthat/fixtures/df_norm.rds\"))\n\ntest_that(\"get_hdr(method = method_kde()) calculations are consistent\", {\n\n  data <- readRDS(test_path(\"fixtures\", \"df_norm.rds\"))\n\n  res <- get_hdr(data, method_kde())\n\n  # fhat_discretized should be normalized to sum to 1\n  expect_equal(sum(res$df_est$fhat_discretized), 1)\n\n  # By default, estimate is evaluated on the same range as original data\n  expect_equal(range(res$df_est$x), range(data$x))\n  expect_equal(range(res$df_est$y), range(data$y))\n\n  # default grid is 100 x 100:\n  expect_equal(nrow(res$df_est), 100 * 100)\n\n  # Checksums:\n  expect_equal(round(sum(res$df_est$fhat)), 185)\n  expect_equal(as.numeric(round(res$breaks, 4)), c(0.0017, 0.0083, 0.0303, 0.0731, Inf))\n  expect_equal(sort(unique(res$df_est$hdr)), c(.5, .8, .95, .99, 1))\n  expect_equal(as.numeric(table(res$df_est$hdr)), c(858, 1149, 1597, 1771, 4625))\n\n\n  # Checking non-default args ------------------------\n\n  res <- get_hdr(data, method_kde(adjust = .4), probs = c(.97, .85, .4, .1), n = c(100, 200), rangex = c(-3, 2), rangey = c(-1, 3))\n  # fhat_discretized should be normalized to sum to 1\n  expect_equal(sum(res$df_est$fhat_discretized), 1)\n\n  # Was the custom range used\n  expect_equal(range(res$df_est$x), c(-3, 2))\n  expect_equal(range(res$df_est$y), c(-1, 3))\n\n  # default grid is 100 x 100:\n  expect_equal(nrow(res$df_est), 100 * 200)\n\n  # Checksums:\n  expect_equal(round(sum(res$df_est$fhat)), 808)\n  expect_equal(as.numeric(round(res$breaks, 4)), c(0.0105, 0.0352, 0.1036, 0.1522, Inf))\n  expect_equal(sort(unique(res$df_est$hdr)), c(.1, .4, .85, .97, 1))\n  expect_equal(as.numeric(table(res$df_est$hdr)), c(495, 1923, 5584, 4524, 7474))\n\n})\n\n# TODO: above, for other methods\n\n\n\n\ntest_that(\"get_hdr() works with custom function factory supplied to `method`\", {\n\n  data <- readRDS(test_path(\"fixtures\", \"df_norm.rds\"))\n\n  method_mvnorm_ind <- function() {\n\n    function(data) {\n\n      mean_x <- mean(data$x); s_x <- sd(data$x)\n      mean_y <- mean(data$y); s_y <- sd(data$y)\n\n      function(x, y) dnorm(x, mean = mean_x, sd = s_x) * dnorm(y, mean = mean_y, sd = s_y)\n\n    }\n\n  }\n\n  res <- get_hdr(data, method = method_mvnorm_ind())\n\n  # fhat_discretized should be normalized to sum to 1\n  expect_equal(sum(res$df_est$fhat_discretized), 1)\n\n  # By default, estimate is evaluated on the same range as original data\n  expect_equal(range(res$df_est$x), range(data$x))\n  expect_equal(range(res$df_est$y), range(data$y))\n\n  # default grid is 100 x 100:\n  expect_equal(nrow(res$df_est), 100 * 100)\n\n  # Checksums:\n  expect_equal(round(sum(res$df_est$fhat)), 185)\n  expect_equal(as.numeric(round(res$breaks, 4)), c(0.0017, 0.0078, 0.031, 0.078, Inf))\n  expect_equal(sort(unique(res$df_est$hdr)), c(.5, .8, .95, .99, 1))\n  expect_equal(as.numeric(table(res$df_est$hdr)), c(826, 1090, 1642, 1863, 4579))\n\n})\n\n\n\ntest_that(\"get_hdr() works with custom function factory supplied to `method`\", {\n\n  data <- readRDS(test_path(\"fixtures\", \"df_norm.rds\"))\n\n  method_fixed_grid <- function() {\n\n    function(data, n, rangex, rangey) {\n\n      df_grid <- expand.grid(\n        x = seq(rangex[1], rangex[2], length.out = n),\n        y = seq(rangey[1], rangey[2], length.out = n)\n      )\n\n      df_grid$fhat <- dnorm(df_grid$x) * dnorm(df_grid$y)\n\n      df_grid\n\n    }\n\n  }\n\n  res <- get_hdr(data, method = method_fixed_grid())\n\n  # fhat_discretized should be normalized to sum to 1\n  expect_equal(sum(res$df_est$fhat_discretized), 1)\n\n  # By default, estimate is evaluated on the same range as original data\n  expect_equal(range(res$df_est$x), range(data$x))\n  expect_equal(range(res$df_est$y), range(data$y))\n\n  # default grid is 100 x 100:\n  expect_equal(nrow(res$df_est), 100 * 100)\n\n  # Checksums:\n  expect_equal(round(sum(res$df_est$fhat)), 185)\n  expect_equal(as.numeric(round(res$breaks, 4)), c(0.0017, 0.008, 0.0321, 0.0796, Inf))\n  expect_equal(sort(unique(res$df_est$hdr)), c(.5, .8, .95, .99, 1))\n  expect_equal(as.numeric(table(res$df_est$hdr)), c(806, 1065, 1603, 1824, 4702))\n\n})\n\n\ntest_that(\"get_hdr() fails if `method != 'fun' and `data` isn't provided\", {\n\n  expect_error(get_hdr(method = method_kde()), regexp = \".data. must be provided\")\n\n})\n\ntest_that(\"fun argument of get_hdr() requires rangex/y\", {\n\n  expect_error(get_hdr(method = \"fun\", fun = function(x, y) dexp(x) * dexp(y)), regexp = \".rangey. must be provided\")\n\n})\n\n\ntest_that(\"fun argument of get_hdr() works\", {\n\n  res <- get_hdr(method = \"fun\", fun = function(x, y) dexp(x) * dexp(y), rangex = c(0, 10), rangey = c(0, 10))\n\n  # Structure of res is as expected\n  expect_type(res, \"list\")\n  expect_equal(length(res), 3)\n  expect_equal(names(res), c(\"df_est\", \"breaks\", \"data\"))\n\n  expect_null(res$data)\n\n  # fhat_discretized should be normalized to sum to 1\n  expect_equal(sum(res$df_est$fhat_discretized), 1)\n\n  expect_equal(range(res$df_est$x), c(0, 10))\n  expect_equal(range(res$df_est$y), c(0, 10))\n\n  # default grid is 100 x 100:\n  expect_equal(nrow(res$df_est), 100 * 100)\n\n  # Checksums:\n  expect_equal(round(sum(res$df_est$fhat)), 108)\n  expect_equal(as.numeric(round(res$breaks, 4)), c(0.0014, 0.0096, 0.0534, 0.1987, Inf))\n  expect_equal(sort(unique(res$df_est$hdr)), c(.5, .8, .95, .99, 1))\n  expect_equal(as.numeric(table(res$df_est$hdr)), c(145, 306, 669, 1045, 7835))\n})\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n"
  },
  {
    "path": "tests/testthat/test-get_hdr_1d.R",
    "content": "test_that(\"structure of get_hdr_1d() return value is as expected\", {\n\n  x <- 1:10\n\n  res <- get_hdr_1d(x)\n\n  # Checking the top level of res\n  expect_type(res, \"list\")\n  expect_equal(length(res), 3)\n  expect_equal(names(res), c(\"df_est\", \"breaks\", \"data\"))\n\n  # Checking res$df_est:\n  expect_type(res$df_est, \"list\")\n  expect_equal(ncol(res$df_est), 4)\n  expect_equal(colnames(res$df_est), c(\"x\", \"fhat\", \"fhat_discretized\", \"hdr\"))\n\n  # Checking res$data\n  expect_type(res$data, \"list\")\n  expect_equal(ncol(res$data), 2)\n  expect_equal(nrow(res$data), 10)\n  expect_equal(colnames(res$data), c(\"x\", \"hdr_membership\"))\n\n  # Checking res$breaks\n  expect_type(res$breaks, \"double\")\n  expect_equal(length(res$breaks), 5)\n  expect_equal(names(res$breaks), c(\"99%\", \"95%\", \"80%\", \"50%\", NA))\n\n})\n\ntest_that(\"`method` can be provided as a character vector or function\", {\n\n  x <- 1:10\n\n  expect_equal(get_hdr_1d(x, \"kde\"), get_hdr_1d(x, method_kde_1d()))\n  expect_equal(get_hdr_1d(x, \"norm\"), get_hdr_1d(x, method_norm_1d()))\n  expect_equal(get_hdr_1d(x, \"freqpoly\"), get_hdr_1d(x, method_freqpoly_1d()))\n  expect_equal(get_hdr_1d(x, \"histogram\"), get_hdr_1d(x, method_histogram_1d()))\n\n})\n\ntest_that(\"get_hdr() errors informatively if bad `method` argument\", {\n\n  x <- 1:10\n\n  expect_error(get_hdr_1d(x, method = \"not-a-method\"), regexp = \"Invalid method specified\")\n  expect_error(get_hdr_1d(x, method = method_kde_1d), regexp = \"did you forget\")\n  expect_error(get_hdr_1d(x, method = method_kde()), regexp = \"1d\")\n\n})\n\ntest_that(\"get_hdr_1d() fails if `method != 'fun' and `x` isn't provided\", {\n\n  expect_error(get_hdr_1d(method = method_kde_1d()), regexp = \".x. must be provided\")\n\n})\n\ntest_that(\"fun argument of get_hdr_1d() requires range\", {\n\n  expect_error(get_hdr_1d(method = \"fun\", fun = dexp), regexp = \".range. must be provided\")\n\n})\n\n\ntest_that(\"fun argument of get_hdr_1d() works\", {\n\n  res <- get_hdr_1d(method = \"fun\", fun = dexp, range = c(0, 10))\n\n  # Structure of res is as expected\n  expect_type(res, \"list\")\n  expect_equal(length(res), 3)\n  expect_equal(names(res), c(\"df_est\", \"breaks\", \"data\"))\n\n  expect_null(res$data)\n\n  # fhat_discretized should be normalized to sum to 1\n  expect_equal(sum(res$df_est$fhat_discretized), 1)\n\n  expect_equal(range(res$df_est$x), c(0, 10))\n\n  # default grid is 512:\n  expect_equal(nrow(res$df_est), 512)\n\n  # Checksums:\n  expect_equal(round(sum(res$df_est$fhat)), 52)\n  expect_equal(as.numeric(round(res$breaks, 4)), c(0.0101, 0.0501, 0.201, 0.5041, Inf))\n  expect_equal(sort(unique(res$df_est$hdr)), c(.5, .8, .95, .99, 1))\n  expect_equal(as.numeric(table(res$df_est$hdr)), c(36, 47, 71, 82, 276))\n})\n\n"
  },
  {
    "path": "tests/testthat/test-layer-wrappers.R",
    "content": "test_that(\"wrapper functions for `layer()` are passing arguments on as expected\", {\n\n  df <- readRDS(test_path(\"fixtures\", \"df_norm.rds\"))\n\n  check_layer <- function(layer_fun, Geom, Stat, mapping = aes(x, y), data = df, ...) {\n\n    hdr_layer <- layer_fun(data = data, mapping = mapping, ...)\n\n    expect_type(hdr_layer, \"environment\")\n    expect_identical(hdr_layer$geom, Geom)\n    expect_identical(hdr_layer$stat, Stat)\n    expect_identical(hdr_layer$mapping, mapping)\n\n  }\n\n  # 2-d layer functions -----------------------------------------------------\n\n  # geom/stat_hdr()\n  check_layer(geom_hdr, GeomHdr, StatHdr)\n  check_layer(stat_hdr, GeomHdr, StatHdr)\n\n  # geom/stat_hdr_lines()\n  check_layer(geom_hdr_lines, GeomHdrLines, StatHdrLines)\n  check_layer(stat_hdr_lines, GeomHdrLines, StatHdrLines)\n\n  # geom/stat_hdr_points()\n  check_layer(geom_hdr_points, GeomPoint, StatHdrPoints)\n  check_layer(stat_hdr_points, GeomPoint, StatHdrPoints)\n\n  # geom/stat_hdr_lines_fun()\n  # -- stat_hdr_points_fun needs to have a `fun` arg provided\n  check_layer(geom_hdr_points_fun, GeomPoint, StatHdrPointsFun)\n  check_layer(stat_hdr_points_fun, GeomPoint, StatHdrPointsFun, fun = function(x, y) dnorm(x) * dnorm(y))\n\n  # geom/stat_hdr_fun()\n  # (stat_hdr_fun needs to have a `fun` arg provided)\n  check_layer(geom_hdr_fun, GeomHdrFun, StatHdrFun)\n  check_layer(stat_hdr_fun, GeomHdrFun, StatHdrFun, fun = function(x, y) dnorm(x) * dnorm(y))\n\n  # -- checking that data doesn't need to be provided\n  check_layer(geom_hdr_fun, GeomHdrFun, StatHdrFun, data = NULL, mapping = NULL)\n  check_layer(stat_hdr_fun, GeomHdrFun, StatHdrFun, data = NULL, mapping = NULL, fun = function(x, y) dnorm(x) * dnorm(y))\n\n  # geom/stat_hdr_lines_fun()\n  # -- stat_hdr_lines_fun needs to have a `fun` arg provided\n  check_layer(geom_hdr_lines_fun, GeomHdrLinesFun, StatHdrLinesFun)\n  check_layer(stat_hdr_lines_fun, GeomHdrLinesFun, StatHdrLinesFun, fun = function(x, y) dnorm(x) * dnorm(y))\n\n  # -- checking that data doesn't need to be provided\n  check_layer(geom_hdr_lines_fun, GeomHdrLinesFun, StatHdrLinesFun, data = NULL, mapping = NULL)\n  check_layer(stat_hdr_lines_fun, GeomHdrLinesFun, StatHdrLinesFun, data = NULL, mapping = NULL, fun = function(x, y) dnorm(x) * dnorm(y))\n\n  # 1-d layer functions -----------------------------------------------------\n\n  # geom/stat_hdr_rug()\n  check_layer(geom_hdr_rug, GeomHdrRug, StatHdrRug)\n  check_layer(stat_hdr_rug, GeomHdrRug, StatHdrRug)\n\n  # -- checking that single x/y aesthetics are allowed:\n  check_layer(geom_hdr_rug, GeomHdrRug, StatHdrRug, mapping  = aes(x))\n  check_layer(stat_hdr_rug, GeomHdrRug, StatHdrRug, mapping  = aes(x))\n  check_layer(geom_hdr_rug, GeomHdrRug, StatHdrRug, mapping  = aes(y))\n  check_layer(stat_hdr_rug, GeomHdrRug, StatHdrRug, mapping  = aes(y))\n\n  # geom/stat_hdr_rug_fun()\n  check_layer(geom_hdr_rug_fun, GeomHdrRugFun, StatHdrRugFun)\n  check_layer(stat_hdr_rug_fun, GeomHdrRugFun, StatHdrRugFun)\n\n  # -- checking that single x/y aesthetics are allowed:\n  check_layer(geom_hdr_rug_fun, GeomHdrRugFun, StatHdrRugFun, mapping  = aes(x))\n  check_layer(stat_hdr_rug_fun, GeomHdrRugFun, StatHdrRugFun, mapping  = aes(x))\n  check_layer(geom_hdr_rug_fun, GeomHdrRugFun, StatHdrRugFun, mapping  = aes(y))\n  check_layer(stat_hdr_rug_fun, GeomHdrRugFun, StatHdrRugFun, mapping  = aes(y))\n\n})\n\n\n"
  },
  {
    "path": "tests/testthat/test-res_to_df.R",
    "content": "test_that(\"res_to_df returns correct structure for each value of output\", {\n\n  data <- readRDS(test_path(\"fixtures\", \"df_norm.rds\"))\n  probs <- c(.99, .95, .80, .50)\n\n  res <- get_hdr(data, method_kde(), probs)\n\n  # Checking output == \"bands\"\n  df_bands <- res_to_df(res, probs, group = 1, output = \"bands\")\n  expect_type(df_bands, \"list\")\n  expect_equal(colnames(df_bands), c(\"x\", \"y\", \"piece\", \"group\", \"subgroup\", \".size\", \"probs\"))\n  expect(is.ordered(df_bands$probs), \"probs is an ordered object\")\n  expect_equal(levels(df_bands$probs), scales::percent_format(accuracy = 1)(probs))\n\n  # Checking output == \"lines\"\n  df_lines <- res_to_df(res, probs, group = 1, output = \"lines\")\n  expect_type(df_lines, \"list\")\n  expect_equal(colnames(df_lines), c(\"x\", \"y\", \"piece\", \"group\", \".size\", \"probs\"))\n  expect(is.ordered(df_lines$probs), \"probs is an ordered object\")\n  expect_equal(levels(df_lines$probs), scales::percent_format(accuracy = 1)(probs))\n\n  # Checking output == \"points\"\n  df_points <- res_to_df(res, probs, group = 1, output = \"points\")\n  expect_type(df_points, \"list\")\n  expect_equal(colnames(df_points), c(\"x\", \"y\", \"probs\"))\n  expect(is.ordered(df_points$probs), \"probs is an ordered object\")\n  expect_equal(levels(df_points$probs), scales::percent_format(accuracy = 1)(c(1, probs)))\n\n})\n"
  },
  {
    "path": "tests/testthat/test-res_to_df_1d.R",
    "content": "test_that(\"res_to_df_1d returns correct structure for each value of output\", {\n\n  data <- readRDS(test_path(\"fixtures\", \"df_norm.rds\"))\n  probs <- c(.99, .95, .80, .50)\n\n  res <- get_hdr_1d(data$x, method_kde_1d(), probs)\n\n  # Checking output == \"rug\"\n  df_rug <- res_to_df_1d(res, probs, group = 1, output = \"rug\")\n  expect_type(df_rug, \"list\")\n  expect_equal(colnames(df_rug), c(\"x\", \"fhat\", \"fhat_discretized\", \"probs\"))\n  expect(is.ordered(df_rug$probs), \"probs is an ordered object\")\n  expect_equal(levels(df_rug$probs), scales::percent_format(accuracy = 1)(probs))\n\n})\n"
  },
  {
    "path": "tests/testthat/test-visual-tests.R",
    "content": "## Checking basic plots with vdiffr::expect_doppelganger()\n\ntest_that(\"Basic 2d HDRs render consistently\", {\n\n  # platform-dependent snapshots for GitHub actions\n  # in non-CI context, use default directory for snapshots\n  if (Sys.getenv(\"CI\") == \"true\") {\n    snapshot_variant <- Sys.getenv(\"RUNNER_OS\")\n  } else {\n    snapshot_variant <- NULL\n  }\n\n\n  data <- readRDS(test_path(\"fixtures\", \"df_norm.rds\"))\n\n  # geom/stat_hdr\n  geom_hdr_ggplot <- ggplot(data, aes(x, y)) + geom_hdr()\n  stat_hdr_ggplot <- ggplot(data, aes(x, y)) + stat_hdr()\n  vdiffr::expect_doppelganger(\"geom-hdr-ggplot\", geom_hdr_ggplot, variant = snapshot_variant)\n  vdiffr::expect_doppelganger(\"stat-hdr-ggplot\", stat_hdr_ggplot, variant = snapshot_variant)\n\n  # geom/stat_hdr_lines\n  geom_hdr_lines_ggplot <- ggplot(data, aes(x, y)) + geom_hdr_lines()\n  stat_hdr_lines_ggplot <- ggplot(data, aes(x, y)) + stat_hdr_lines()\n  vdiffr::expect_doppelganger(\"geom-hdr_lines-ggplot\", geom_hdr_lines_ggplot, variant = snapshot_variant)\n  vdiffr::expect_doppelganger(\"stat-hdr_lines-ggplot\", stat_hdr_lines_ggplot, variant = snapshot_variant)\n\n  # geom/stat_hdr_points\n  geom_hdr_points_ggplot <- ggplot(data, aes(x, y)) + geom_hdr_points()\n  stat_hdr_points_ggplot <- ggplot(data, aes(x, y)) + stat_hdr_points()\n  vdiffr::expect_doppelganger(\"geom-hdr-points-ggplot\", geom_hdr_points_ggplot, variant = snapshot_variant)\n  vdiffr::expect_doppelganger(\"stat-hdr-points-ggplot\", stat_hdr_points_ggplot, variant = snapshot_variant)\n\n  # geom/stat_hdr_points_fun\n  geom_hdr_points_fun_ggplot <- ggplot(data, aes(x, y)) + geom_hdr_points_fun(fun = function(x, y) dnorm(x) * dnorm(y))\n  stat_hdr_points_fun_ggplot <- ggplot(data, aes(x, y)) + stat_hdr_points_fun(fun = function(x, y) dnorm(x) * dnorm(y))\n  vdiffr::expect_doppelganger(\"geom-hdr-points-fun-ggplot\", geom_hdr_points_fun_ggplot, variant = snapshot_variant)\n  vdiffr::expect_doppelganger(\"stat-hdr-points-fun-ggplot\", stat_hdr_points_fun_ggplot, variant = snapshot_variant)\n\n  # geom/stat_hdr_fun\n  geom_hdr_fun_ggplot <- ggplot() +\n    geom_hdr_fun(fun = function(x, y) dnorm(x) * dnorm(y), xlim = c(-5, 5), ylim = c(-5, 5))\n  stat_hdr_fun_ggplot <- ggplot() +\n    stat_hdr_fun(fun = function(x, y) dnorm(x) * dnorm(y), xlim = c(-5, 5), ylim = c(-5, 5))\n  vdiffr::expect_doppelganger(\"geom-hdr-fun-ggplot\", geom_hdr_fun_ggplot, variant = snapshot_variant)\n  vdiffr::expect_doppelganger(\"stat-hdr-fun-ggplot\", stat_hdr_fun_ggplot, variant = snapshot_variant)\n\n})\n\ntest_that(\"Basic 1d HDRs render consistently\", {\n\n  # platform-dependent snapshots for GitHub actions\n  # in non-CI context, use default directory for snapshots\n  if (Sys.getenv(\"CI\") == \"true\") {\n    snapshot_variant <- Sys.getenv(\"RUNNER_OS\")\n  } else {\n    snapshot_variant <- NULL\n  }\n\n  data <- readRDS(test_path(\"fixtures\", \"df_norm.rds\"))\n\n  # geom/stat_hdr_rug\n  geom_hdr_rug_ggplot <- ggplot(data, aes(x, y)) + geom_hdr_rug()\n  stat_hdr_rug_ggplot <- ggplot(data, aes(x, y)) + stat_hdr_rug()\n  vdiffr::expect_doppelganger(\"geom-hdr-rug-ggplot\", geom_hdr_rug_ggplot, variant = snapshot_variant)\n  vdiffr::expect_doppelganger(\"stat-hdr-rug-ggplot\", stat_hdr_rug_ggplot, variant = snapshot_variant)\n\n  # geom/stat_hdr_rug_fun\n  geom_hdr_rug_fun_ggplot <- ggplot() +\n    geom_hdr_rug_fun(fun_x = dnorm, fun_y = dexp, xlim = c(-5, 5), ylim = c(0, 10))\n  stat_hdr_rug_fun_ggplot <- ggplot() +\n    stat_hdr_rug_fun(fun_x = dnorm, fun_y = dexp, xlim = c(-5, 5), ylim = c(0, 10))\n\n  vdiffr::expect_doppelganger(\"geom-hdr-rug-fun-ggplot\", geom_hdr_rug_fun_ggplot, variant = snapshot_variant)\n  vdiffr::expect_doppelganger(\"stat-hdr-rug-fun-ggplot\", stat_hdr_rug_fun_ggplot, variant = snapshot_variant)\n\n})\n\ntest_that(\"Specified order of probabilities doesn't impact legend ordering\", {\n\n  # platform-dependent snapshots for GitHub actions\n  # in non-CI context, use default directory for snapshots\n  if (Sys.getenv(\"CI\") == \"true\") {\n    snapshot_variant <- Sys.getenv(\"RUNNER_OS\")\n  } else {\n    snapshot_variant <- NULL\n  }\n\n  data <- readRDS(test_path(\"fixtures\", \"df_norm.rds\"))\n\n  geom_hdr_prob_order_ggplot <- ggplot(data, aes(x, y)) +\n    geom_hdr(probs = c(.25, .5, .75, .95))\n\n  geom_hdr_rug_prob_order_ggplot <- ggplot(data, aes(x, y)) +\n    geom_hdr_rug(probs = c(.25, .5, .75, .95))\n\n  vdiffr::expect_doppelganger(\"geom_hdr_prob_order_ggplot\", geom_hdr_prob_order_ggplot, variant = snapshot_variant)\n  vdiffr::expect_doppelganger(\"geom_hdr_rug_prob_order_ggplot\", geom_hdr_rug_prob_order_ggplot, variant = snapshot_variant)\n})\n\n\n\n"
  },
  {
    "path": "tests/testthat.R",
    "content": "# This file is part of the standard setup for testthat.\n# It is recommended that you do not modify it.\n#\n# Where should you do additional test configuration?\n# Learn more about the roles of various files in:\n# * https://r-pkgs.org/tests.html\n# * https://testthat.r-lib.org/reference/test_package.html#special-files\n\nlibrary(testthat)\nlibrary(ggdensity)\n\ntest_check(\"ggdensity\")\n"
  },
  {
    "path": "vignettes/.gitignore",
    "content": "*.html\n*.R\n"
  },
  {
    "path": "vignettes/method.Rmd",
    "content": "---\ntitle: \"The method argument\"\noutput: rmarkdown::html_vignette\nvignette: >\n  %\\VignetteIndexEntry{The method argument}\n  %\\VignetteEngine{knitr::rmarkdown}\n  %\\VignetteEncoding{UTF-8}\n---\n\n<style>\n\nimg {\n  border: 0px;\n}\n\n</style>\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  fig.align = \"center\",\n  dpi = 160,\n  out.width = \"80%\",\n  collapse = TRUE,\n  comment = \"#>\"\n)\n\n```\n\nAlmost every function in **ggdensity** accepts a `method` argument---this is true for `geom_hdr()` and other layer functions\n(`geom_hdr_lines()`, `geom_hdr_points()`, ...),\nas well as `get_hdr()` and `get_hdr_1d()`.\nThis vignette summarizes the many ways in which the `method` argument can be specified; \nfirst looking at it from a more basic perspective,\nthen from the perspective of a developer wanting to implement additional estimators.\n\n## Using **ggdensity**'s `method_*()` functions\n\nFirst, let's load the necessary packages and generate some sample data.\n\n```{r setup}\nlibrary(\"ggdensity\"); theme_set(theme_minimal(8))\ntheme_update(legend.position = \"none\") # Suppressing legends for readability\n```\n\n```{r}\nset.seed(1) \ndf <- data.frame(x = rnorm(500), y = rnorm(500))\np <- ggplot(df, aes(x, y))\np + geom_point()\n```\n\nThe easiest way to plot HDRs with `geom_hdr()` (or any other layer function from **ggdensity**) with a specified density estimator \nis to provide a character object to the `method` argument:\n```{r, fig.show=\"hold\", out.width=\"45%\", fig.align = \"default\"}\np + geom_hdr(method = \"kde\")\n\np + geom_hdr(method = \"mvnorm\")\n\np + geom_hdr(method = \"histogram\")\n\np + geom_hdr(method = \"freqpoly\")\n```\n\nHowever, as of **ggdensity** v1.0.0 there is an alternative approach---providing a `method_*()` function call:\n```{r, fig.show=\"hold\", out.width=\"45%\", fig.align = \"default\"}\np + geom_hdr(method = method_kde())\n\np + geom_hdr(method = method_mvnorm())\n\np + geom_hdr(method = method_histogram())\n\np + geom_hdr(method = method_freqpoly())\n```\n\nThe default behaviors of these two approaches are the same and always will be---in this way, they are completely interchangeable. \nHowever, the `method_*()` function call is required to estimate HDRs with non-default estimator parameters. \nFor example, we can set the `adjust` parameter to apply a multiplicative adjustment to the heuristically determined bandwidth in `method_kde()` (which itself uses the one computed by `MASS::bandwidth.nrd()`):\n\n```{r}\np + geom_hdr(method = method_kde(adjust = 1/2))\n```\n\nThe relevant parameters for each method are documented in their respective `?method_*` help pages.\nNote that these parameters can not be provided to `geom_hdr()` or `stat_hdr()` and thus are not accessible if \na character value is provided to `method`.\n\nThe `method` argument of `get_hdr()` functions in the same way:\n\n```{r}\nres <- get_hdr(df, method = method_kde(adjust = 1/2))\n\nstr(res)\n```\n\nFor details on the output of `get_hdr()`, see `?get_hdr`.\n\n\n### `method_*_1d()` functions\n\nIn **ggdensity**, it is possible to estimate and plot 1-dimensional HDRs with `geom_hdr_rug()` and `get_hdr_1d()`.\nThese functions also accept a `method` argument, but they do not accept the previously discussed `method_*()` functions.\nInstead they accept the 1-dimensional analogues: `method_*_1d()`.\n\n```{r, fig.show=\"hold\", out.width=\"45%\", fig.align = \"default\"}\np + \n  geom_point() +\n  geom_hdr_rug(method = method_kde_1d())\n\np + \n  geom_point() +\n  geom_hdr_rug(method = method_norm_1d())\n\np + \n  geom_point() +\n  geom_hdr_rug(method = method_histogram_1d())\n\np + \n  geom_point() +\n  geom_hdr_rug(method = method_freqpoly_1d())\n```\n\nJust like we saw with `geom_hdr()`, `geom_hdr_rug()` also accepts character values for `method`:\n\n```{r, fig.show=\"hold\", out.width=\"45%\", fig.align = \"default\"}\np + \n  geom_point() +\n  geom_hdr_rug(method = \"kde\")\n\np + \n  geom_point() +\n  geom_hdr_rug(method = \"norm\")\n\np + \n  geom_point() +\n  geom_hdr_rug(method = \"histogram\")\n\np + \n  geom_point() +\n  geom_hdr_rug(method = \"freqpoly\")\n```\n\nBecause the return values of the `method_*()` functions are incompatible with the 1-dimensional\nHDR estimation procedure, if a 2-dimensional method is specified the following error message is issued:\n\n```{r, fig.show = \"hide\"}\np + \n  geom_point() +\n  geom_hdr_rug(method = method_kde()) \n```\n\nLastly, we see that the `method` argument of `get_hdr_1d()` behaves similarly.\n\n```{r}\nres <- get_hdr_1d(df$x, method = method_kde_1d())\n\nstr(res)\n```\n\nAgain, for details on the above output of `get_hdr_1d()`, see `?get_hdr_1d`.\n\n\n## A detailed look at `method_*()` functions\n\nNow that we understand the ways in which `method` can be specified\nlet's look at the internals of the `method_*()` functions.\nNote: the implementations discussed in this section depend heavily on topics in functional programming,\nespecially \n[closures](https://adv-r.hadley.nz/environments.html?q=closures#function-environments) and\n[function factories](https://adv-r.hadley.nz/function-factories.html).\nWhile not necessary, a good understanding of these ideas is helpful---the \nlinked chapters from Hadley Wickham's *Advanced R* are a great place to start.\n\nLooking at the definition of `method_kde()`, we see that it is a function of `h` and `adjust`,\nreturning a closure with arguments `data`, `n`, `rangex`, and `rangey`.\nThe closure passes the `x` and `y` columns of `data` to `MASS::kde2d()`, \nreturning the estimated density evaluated on a grid with columns `x`, `y`, and `fhat`.\nThis closure is what `geom_hdr()` expects as its `method` argument,\nand is how the HDRs are estimated (via `get_hdr()`).\n\n```{r, collapse = TRUE, comment = \"\"}\nmethod_kde\n```\n\nBoth `method_histogram()` and `method_freqpoly()` behave similarly, \naccepting parameters governing the density estimation procedure and returning a closure with arguments\n`data`, `n`, `rangex`, and `rangey`.\nHowever, these functions are significantly more complicated as\nthe density estimation procedures are implemented entirely in **ggdensity**.\n\n`method_mvnorm()` is different in a few ways.\nThe closure it returns is a function of just one argument: `data`.\nThis is because it does not return the estimated density evaluated on a grid.\nInstead, it returns yet another closure with (vectorized) arguments `x` and `y`.\nAs in `method_kde()`, the return value of the closure is a representation of the estimated pdf.\nThe difference is the manner in which the pdf is represented.\nWhereas before we had a pdf defined by a discrete approximation on a grid,\nwe now have an explicit definition of the pdf in terms of `x` and `y`.\n\n```{r, collapse = TRUE, comment = \"\"}\nmethod_mvnorm\n```\n\nTo summarize each of the above cases:\nin the first example, the `method_*()` function returned a closure with arguments \n`data`, `n`, `rangex`, and `rangey` which itself returned the estimated density evaluated on a grid;\nin the second, the `method_*()` function returned a closure with a single argument, `data`, \nwhich itself returned a closure with arguments `x` and `y`, representing the estimated density explicitly.\nIn both cases, the `method_*()` function can have any number of parameters governing the density estimation procedure.\n\nThese are the two ways the `method` argument may be specified.\nThe first is necessary for cases in which an explicit definition of the estimated density is not computationally feasible \n(for example, KDEs).\nThe second is an easier option for the cases in which a closed form of the estimated density is available \n(for example, parametric estimators).\n\nLet's look at how we might define our own `method_*()` functions in each case,\nbeginning with a simple parametric estimator.\n\n### Implementing a method returning a PDF\n\nIn **ggdensity**, `method_mvnorm()` estimates HDRs based on the parametric multivariate normal model.\nIf we wanted to fit a simpler model in which the data is further assumed to be independent,\nwe could implement `method_mvnorm_ind()`.\n\n```{r}\nmethod_mvnorm_ind <- function() {\n  \n  function(data) {\n    \n    xbar <- mean(data$x)\n    ybar <- mean(data$y)\n    \n    sx <- sd(data$x)\n    sy <- sd(data$y)\n    \n    # joint pdf is simply the product of the marginals\n    function(x, y) dnorm(x, xbar, sx) * dnorm(y, ybar, sy)\n  }\n  \n}\n```\n\nTo use our `method_mvnorm_ind()`, we just need to supply it to `geom_hdr()`'s `method` argument.\n```{r}\nggplot(df, aes(x, y)) +\n  geom_hdr(method = method_mvnorm_ind())\n```\n\nIf we transform our data to have non-zero covariance\nwe still see the major and minor axes of the contours coincide with the plot axes---exactly\nwhat we would expect with this (incorrectly) constrained model.\n\n```{r}\nA <- matrix(c(\n  2*cos(pi/6), -2*sin(pi/6),\n  1*sin(pi/6),  1*cos(pi/6)\n), byrow = TRUE, ncol = 2)\n\ndf_rot <- as.data.frame(as.matrix(df) %*% A)\ncolnames(df_rot) <- c(\"x\", \"y\")\n\nggplot(df_rot, aes(x, y)) +\n  geom_hdr(method = method_mvnorm_ind()) +\n  geom_point(size = .4) +\n  coord_fixed(xlim = c(-6, 6), ylim = c(-6, 6))\n```\n\nNotice, `method_mvnorm_ind()` accepts no arguments.\nThe density estimation procedure is so simple that there are no parameters to govern it.\nTo allow for circular models in which the fitted variances are required to be equal,\nwe can implement a `circular` argument.\n```{r}\nmethod_mvnorm_ind <- function(circular = FALSE) {\n  \n  function(data) {\n    \n    xbar <- mean(data$x)\n    ybar <- mean(data$y)\n    \n    if (circular) {\n      sx <- sd(c(data$x - xbar, data$y - ybar))\n      sy <- sx\n    } else {\n      sx <- sd(data$x)\n      sy <- sd(data$y)\n    }\n    \n    function(x, y) dnorm(x, xbar, sx) * dnorm(y, ybar, sy)\n  }\n  \n}\n```\n\nNow, the contours are perfectly circular.\n\n```{r}\nggplot(df_rot, aes(x, y)) +\n  geom_hdr(method = method_mvnorm_ind(circular = TRUE)) +\n  geom_point(size = .4) +\n  coord_fixed(xlim = c(-6, 6), ylim = c(-6, 6))\n```\n\nIn the above plot, the upper and lower portions of the HDRs are cut off.\nThis is because the default behavior of **ggdensity** is to not draw HDRs outside of the \"bounding box\" around observed data.\nThis is *not* because we are using a custom `method_*()` function.\nTo fix this, we need to either set a better `ylim` value for `geom_hdr()` or specify a larger range in `scale_y_continuous()`.\n\n```{r, fig.show=\"hold\", out.width=\"45%\", fig.align = \"default\"}\nggplot(df_rot, aes(x, y)) +\n  geom_hdr(method = method_mvnorm_ind(circular = TRUE), ylim = c(-6, 6)) +\n  geom_point(size = .4) +\n  coord_fixed(xlim = c(-6, 6), ylim = c(-6, 6))\n\nggplot(df_rot, aes(x, y)) +\n  geom_hdr(method = method_mvnorm_ind(circular = TRUE)) +\n  geom_point(size = .4) +\n  scale_y_continuous(limits = c(-6, 6)) +\n  coord_fixed(xlim = c(-6, 6), ylim = c(-6, 6))\n```\n\nNotice, neither of these approaches involve arguments to `method_mvnorm_ind()`.\nInternally, the closure returned by `method_mvnorm_ind()` is used by `get_hdr()`,\nalong with information from the `scales` associated with the `ggplot` object.\nIt is the `scales` that need adjusting, not anything related to the `method` argument.\n\n\n### Implementing a method returning an evaluated PDF\n\nTo illustrate the other case, in which the object returned by the closure is the estimated density evaluated on a grid, we implement `method_mvnorm_ind_grid()`.\nThis estimates the same independent normal density as `method_mvnorm_ind()`,\nthe only difference is the behavior of the returned closure.\n\n```{r}\nmethod_mvnorm_ind_grid <- function() {\n  \n  function(data, n, rangex, rangey) {\n    \n    # First, we estimate the density -----------------------------\n    \n    xbar <- mean(data$x)\n    ybar <- mean(data$y)\n    \n    sx <- sd(data$x)\n    sy <- sd(data$y)\n    \n    f_est <- function(x, y) dnorm(x, xbar, sx) * dnorm(y, ybar, sy)\n    \n    \n    # Return the density evaluated on a grid ---------------------\n    \n    # df_grid defined by rangex, rangey, and n\n    df_grid <- expand.grid(\n      x = seq(rangex[1], rangex[2], length.out = n),\n      y = seq(rangey[1], rangey[2], length.out = n)\n    )\n    \n    df_grid$fhat <- f_est(df_grid$x, df_grid$y)\n    \n    df_grid\n  }\n  \n}\n```\n\nSee that returned closure has additional arguments `n`, `rangex`, and `rangey` which define the grid.\nAlso, the grid is represented  a `data.frame` with columns `x`, `y`, and `fhat`, where `fhat` is the (potentially unnormalized) density estimate.\n\nAgain, to use our `method_mvnorm_ind_grid()` we provide it to `geom_hdr()`’s method argument.\n\n```{r}\nggplot(df, aes(x, y)) +\n  geom_hdr(method = method_mvnorm_ind_grid())\n```\n\nLike we saw in the previous example, we could prevent the HDRs from being \"cut off\" by specifying either the `x/ylim` arguments in `geom_hdr()` or\nby setting a larger range in `scale_x/y_continuous()`.\n\n<!-- While we can represent this simple density estimator as either a function or grid, -->\n<!-- there are many cases where it is not possible to represent estimators as functions in R. -->\n<!-- For example, the previously discussed `method_kde()` demands a grid representation--the -->\n<!-- underlying tool (`MASS::kde2d()`) represents the estimated density as a grid. -->\n\n\n## The `method_*_1d()` functions\n\nWe saw before that **ggdensity** uses `method_*_1d()` functions for the estimation of 1-dimensional densities.\nThe internals of these functions are very similar to the `method_*()` functions,\nthe only differences are slight changes to the arguments and return values of the returned closures.\n\nLooking at the definition of `method_kde_1d()`, we see the returned closure has arguments `x`, `n`, and `range`.\nThis is very similar to `method_kde()`, the only difference is we are now dealing with univariate data:\nthe vector argument `x` is used instead of `data`, and we have a single `range` parameter instead of `rangex` and `rangey`.\nSimilarly, the closure now returns the estimated density evaluated on a univariate grid, with columns `x` and `fhat` instead of the bivariate grid with columns `x`, `y`, and `fhat`.\nFinally, see that `method_kde_1d()` accepts several arguments governing the density estimation procedure just like `method_kde()`.\n\n```{r, collapse = TRUE, comment = \"\"}\nmethod_kde_1d\n```\n\nEstimated univariate densities can also be represented explicitly, as illustrated by `method_norm_1d()`.\nComparing this to the previously discussed `method_mvnorm()` we see that little has changed:\nthe closure is now a function of a vector `x` instead of `data` and returns a function of one variable (`x`) instead of two (`x` and `y`).\n```{r, collapse = TRUE, comment = \"\"}\nmethod_norm_1d\n```\n\nAdditional `method_*_1d()` functions can be implemented in the same way as the 2-dimensional `method_*()` functions,\nso long as the returned closure is structured in one of the two ways we have seen here.\n\n\n\n\n\n"
  }
]