Repository: tpapp/DynamicHMC.jl Branch: master Commit: 021ffacc569e Files: 39 Total size: 142.3 KB Directory structure: gitextract_tz2mhnh1/ ├── .codecov.yml ├── .github/ │ ├── ISSUE_TEMPLATE/ │ │ └── generic-issue-template.md │ ├── dependabot.yml │ └── workflows/ │ ├── CI.yml │ ├── CompatHelper.yml │ ├── TagBot.yml │ └── documentation.yml ├── .gitignore ├── CHANGELOG.md ├── LICENSE.md ├── Project.toml ├── README.md ├── appveyor.yml ├── docs/ │ ├── Project.toml │ ├── make.jl │ └── src/ │ ├── index.md │ ├── interface.md │ └── worked_example.md ├── src/ │ ├── DynamicHMC.jl │ ├── NUTS.jl │ ├── diagnostics.jl │ ├── hamiltonian.jl │ ├── mcmc.jl │ ├── reporting.jl │ ├── stepsize.jl │ ├── trees.jl │ └── utilities.jl └── test/ ├── Project.toml ├── runtests.jl ├── sample-correctness_tests.jl ├── sample-correctness_utilities.jl ├── test_NUTS.jl ├── test_diagnostics.jl ├── test_hamiltonian.jl ├── test_logging.jl ├── test_mcmc.jl ├── test_stepsize.jl ├── test_trees.jl └── utilities.jl ================================================ FILE CONTENTS ================================================ ================================================ FILE: .codecov.yml ================================================ comment: false ================================================ FILE: .github/ISSUE_TEMPLATE/generic-issue-template.md ================================================ --- name: generic issue template about: question, feature request, or bug report --- *Please make sure you are using **the latest tagged versions** and **the last stable release of Julia** before proceeding.* Support for other versions is very limited. If you need *help* using this package for Bayesian inference, please provide a self-contained description of your inference problem (simplifying if possible) and preferably the code you have written so far to code the log likelihood. If you found a *bug*, please provide a self-contained working example, complete with (simulated) data. 
Please make sure you set a random seed (`Random.seed!`) at the beginning to make your example reproducible. If you are requesting a new *feature*, please provide a description and a rationale. ## Self-contained example that demonstrates the problem ```julia using DynamicHMC ``` ## Output, expected outcome, comparison to other samplers Did the sampler fail to run, produce incorrect results, …? ## Contributing code for tests You can contribute to the development of this package by allowing your example to be used as a test. Please indicate whether your code can be incorporated into this package under the MIT "Expat" license, found in the root directory of this package. ================================================ FILE: .github/dependabot.yml ================================================ # see https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot version: 2 updates: - package-ecosystem: "github-actions" directory: "/" schedule: # Check for updates to GitHub Actions every month interval: "monthly" ================================================ FILE: .github/workflows/CI.yml ================================================ # from https://discourse.julialang.org/t/easy-workflow-file-for-setting-up-github-actions-ci-for-your-julia-package/49765 name: CI on: pull_request: branches: - master push: branches: - master tags: '*' jobs: test: name: Julia ${{ matrix.version }} - ${{ matrix.os }} - ${{ matrix.arch }} - ${{ github.event_name }} runs-on: ${{ matrix.os }} strategy: fail-fast: false matrix: version: - '1.10' # Replace this with the minimum Julia version that your package supports. E.g. if your package requires Julia 1.9 or higher, change this to '1.9'. - '1' # Leave this line unchanged. '1' will automatically expand to the latest stable 1.x release of Julia. # NOTE: CI on nightly is not enabled by default, uncomment if you need it.
Cf the discussion at # https://discourse.julialang.org/t/why-do-packages-run-continuous-integration-tests-on-julia-nightly/101208/3 # - 'nightly' os: - ubuntu-latest arch: - x64 steps: - uses: actions/checkout@v6 - uses: julia-actions/setup-julia@v2 with: version: ${{ matrix.version }} arch: ${{ matrix.arch }} - uses: actions/cache@v5 env: cache-name: cache-artifacts with: path: ~/.julia/artifacts key: ${{ runner.os }}-test-${{ env.cache-name }}-${{ hashFiles('**/Project.toml') }} restore-keys: | ${{ runner.os }}-test-${{ env.cache-name }}- ${{ runner.os }}-test- ${{ runner.os }}- - uses: julia-actions/julia-buildpkg@v1 - uses: julia-actions/julia-runtest@v1 - uses: julia-actions/julia-processcoverage@v1 - uses: codecov/codecov-action@v5 with: file: lcov.info # NOTE: you need to add the token to Github secrets, see # https://docs.codecov.com/docs/adding-the-codecov-token token: ${{ secrets.CODECOV_TOKEN }} ================================================ FILE: .github/workflows/CompatHelper.yml ================================================ # see the docs at https://github.com/JuliaRegistries/CompatHelper.jl name: CompatHelper on: schedule: - cron: 0 0 * * * workflow_dispatch: permissions: contents: write pull-requests: write jobs: CompatHelper: runs-on: ubuntu-latest steps: - name: Check if Julia is already available in the PATH id: julia_in_path run: which julia continue-on-error: true - name: Install Julia, but only if it is not already available in the PATH uses: julia-actions/setup-julia@v2 with: version: '1' arch: ${{ runner.arch }} if: steps.julia_in_path.outcome != 'success' - name: "Add the General registry via Git" run: | import Pkg ENV["JULIA_PKG_SERVER"] = "" Pkg.Registry.add("General") shell: julia --color=yes {0} - name: "Install CompatHelper" run: | import Pkg name = "CompatHelper" uuid = "aa819f21-2bde-4658-8897-bab36330d9b7" version = "3" Pkg.add(; name, uuid, version) shell: julia --color=yes {0} - name: "Run CompatHelper" run: | import 
CompatHelper CompatHelper.main() shell: julia --color=yes {0} env: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} COMPATHELPER_PRIV: ${{ secrets.DOCUMENTER_KEY }} # COMPATHELPER_PRIV: ${{ secrets.COMPATHELPER_PRIV }} ================================================ FILE: .github/workflows/TagBot.yml ================================================ # see the docs at https://github.com/JuliaRegistries/TagBot name: TagBot on: issue_comment: types: - created workflow_dispatch: inputs: lookback: default: 3 permissions: actions: read checks: read contents: write deployments: read issues: read discussions: read packages: read pages: read pull-requests: read repository-projects: read security-events: read statuses: read jobs: TagBot: if: github.event_name == 'workflow_dispatch' || github.actor == 'JuliaTagBot' runs-on: ubuntu-latest steps: - uses: JuliaRegistries/TagBot@v1 with: token: ${{ secrets.GITHUB_TOKEN }} # Edit the following line to reflect the actual name of the GitHub Secret containing your private key ssh: ${{ secrets.DOCUMENTER_KEY }} # ssh: ${{ secrets.NAME_OF_MY_SSH_PRIVATE_KEY_SECRET }} ================================================ FILE: .github/workflows/documentation.yml ================================================ # see https://juliadocs.github.io/Documenter.jl/dev/man/hosting/#GitHub-Actions # add this file to the repository once you set up authentication as described in the Documenter manual name: Documentation on: push: branches: - master tags: '*' pull_request: jobs: build: runs-on: ubuntu-latest steps: - uses: actions/checkout@v6 - uses: julia-actions/setup-julia@latest with: version: '1.10' - name: Install dependencies run: julia --project=docs/ -e 'using Pkg; Pkg.develop(PackageSpec(path=pwd())); Pkg.instantiate()' - name: Build and deploy env: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # If authenticating with GitHub Actions token DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }} # If authenticating with SSH deploy key run: julia --project=docs/ 
docs/make.jl ================================================ FILE: .gitignore ================================================ *.jl.cov *.jl.*.cov *.jl.mem /notes.org /Manifest.toml /deps/deps.jl /docs/build /docs/Manifest.toml /test/coverage/Manifest.toml ================================================ FILE: CHANGELOG.md ================================================ # Changelog All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). ## Unreleased ### Added ### Changed ### Deprecated ### Removed ### Fixed ### Security ## v3.6.0 - add `logdensities` to `mcmc` results (#194) - minor cleanup of `Project.toml` ## v3.5.1 - Cleanup tests, tighten tolerances. (#190) - Use OhMyThreads for threaded example. (#191) ## v3.5.0 - remove `@unpack`, use `(;)` - require Julia 1.10 - minor test fixes ## v3.4.8 - refresh Github actions - loosen test tolerances a bit to test cleanly on Julia 1.11 - remove vestigial code from tests ## v3.4.7 - fix compat entries ## v3.4.6 - test with Aqua, minor fixes ## v3.4.5 - Provide more details when initial stepsize search fails. (#180) - Simplify stepsize search. 
(#181) - replace `Parameters: @unpack` with `SimpleUnPack` (#182) - replace `GLOBAL_RNG` with `default_rng()` (#183) ## v3.4.4 - compat bumps ## v3.4.3 - remove workaround for JET.jl ## v3.4.2 - change stacked ordering ## v3.4.1 - Support Julia LTS and update CompatHelper ## v3.4.0 - transition to MCMCDiagnosticTools for diagnostics - minor fixes - JET tests up front so that method extensions do not confuse it ## v3.3.0 - remove `position_matrix` ([#165](https://github.com/tpapp/DynamicHMC.jl/pull/165)), add utility functions for posterior matrices ## v3.2.1 - compat version bumps ## v3.2.0 - compat version bumps ## v3.1.2 - compat version bumps ## v3.1.1 - minor doc and export list fixes (follow-up to [#145](https://github.com/tpapp/DynamicHMC.jl/pull/145)) ## v3.1.0 - more robust U-turn checking ([#145](https://github.com/tpapp/DynamicHMC.jl/pull/145)) ## v3.0.0 - get rid of `local_optimization` in warmup ([#146](https://github.com/tpapp/DynamicHMC.jl/pull/146)) ## v2.2.0 - add a progress bar ([#136](https://github.com/tpapp/DynamicHMC.jl/pull/136)) - compat bounds, minor changes ## v2.1.4 - compat bumps extension ## v2.1.3 - relax test bounds a bit ([#116](https://github.com/tpapp/DynamicHMC.jl/pull/116)) ## v2.1.2 Technical release (compat version bounds extended). ## v2.1.1 - re-enable support for Julia 1.0 ([#107](https://github.com/tpapp/DynamicHMC.jl/pull/107)) - fix penalty sign in initial optimization ([#97](https://github.com/tpapp/DynamicHMC.jl/pull/97)) - add example for skipping stepsize search ([#104](https://github.com/tpapp/DynamicHMC.jl/pull/104)) ## v2.1.0 - add experimental “iterator” interface ([#94](https://github.com/tpapp/DynamicHMC.jl/pull/94)) - use `randexp` for Metropolis acceptance draws - remove dependence on StatsFuns.jl ## v2.0.2 Default keyword arguments for LogProgressReport. ## v2.0.1 Don't print `chain_id` when it is `nothing`. ## v2.0.0 Note: the interface was redesigned.
You probably want to review the docs, especially the worked example. ### API changes - major API change: entry point is now `mcmc_with_warmup` - refactor warmup code, add initial optimizer - use the LogDensityProblems v0.9.0 API - use Julia's Logging module for progress messages - diagnostics moved to `DynamicHMC.Diagnostics` - report turning and divergence positions - add `leapfrog_trajectory` for exploration ### Implementation changes - factor out the tree traversal code - abstract trajectory interface - separate random and non-random parts - stricter and more exact unit tests - refactor Hamiltonian code slightly - caching is now in EvaluatedLogDensity - functions renamed - misc - remove dependency on DataStructures, Suppressor - cosmetic changes to dual averaging code - large test cleanup ## v1.0.6 - fix LogDensityProblems version bounds ## v1.0.5 - fix tuning with singular covariance matrices ## v1.0.4 - minor fixes in tests and coverage ## v1.0.3 and prior No CHANGELOG available. ================================================ FILE: LICENSE.md ================================================ The DynamicHMC.jl package is licensed under the MIT "Expat" License: > Copyright (c) 2020: Tamas K. Papp. > > Permission is hereby granted, free of charge, to any person obtaining a copy > of this software and associated documentation files (the "Software"), to deal > in the Software without restriction, including without limitation the rights > to use, copy, modify, merge, publish, distribute, sublicense, and/or sell > copies of the Software, and to permit persons to whom the Software is > furnished to do so, subject to the following conditions: > > The above copyright notice and this permission notice shall be included in all > copies or substantial portions of the Software. 
> > THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE > AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, > OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE > SOFTWARE. > ================================================ FILE: Project.toml ================================================ name = "DynamicHMC" uuid = "bbc10e6e-7c05-544b-b16e-64fede858acb" version = "3.6.0" authors = ["Tamas K. Papp "] [deps] ArgCheck = "dce04be8-c92d-5529-be00-80e4d2c0e197" DocStringExtensions = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae" FillArrays = "1a297f60-69ca-5386-bcde-b61e274b549b" LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" LogDensityProblems = "6fdf6af0-433a-55f7-b3ed-c6c6e0b8df7c" LogExpFunctions = "2ab3a3ac-af41-5b50-aa03-7779005ae688" ProgressMeter = "92933f4c-e287-5a05-a399-4b506db050ca" Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2" TensorCast = "02d47bb6-7ce6-556a-be16-bb1710789e2b" [compat] ArgCheck = "1, 2" DocStringExtensions = "0.8, 0.9" FillArrays = "0.13, 1" LinearAlgebra = "1.6" LogDensityProblems = "1, 2" LogExpFunctions = "0.3" ProgressMeter = "1" Random = "1.6" Statistics = "1" TensorCast = "0.4" julia = "1.10" [workspace] projects = ["test", "docs"] ================================================ FILE: README.md ================================================ # DynamicHMC Implementation of robust dynamic Hamiltonian Monte Carlo methods in Julia. 
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active) [![build](https://github.com/tpapp/DynamicHMC.jl/workflows/CI/badge.svg)](https://github.com/tpapp/DynamicHMC.jl/actions?query=workflow%3ACI) [![codecov](https://codecov.io/github/tpapp/DynamicHMC.jl/graph/badge.svg?token=idfCiPzxjL)](https://codecov.io/github/tpapp/DynamicHMC.jl) [![Documentation](https://img.shields.io/badge/docs-stable-blue.svg)](https://tpapp.github.io/DynamicHMC.jl/stable) [![Documentation](https://img.shields.io/badge/docs-master-blue.svg)](https://tpapp.github.io/DynamicHMC.jl/dev) [![DOI](https://zenodo.org/badge/93741413.svg)](https://zenodo.org/badge/latestdoi/93741413) [![Aqua QA](https://raw.githubusercontent.com/JuliaTesting/Aqua.jl/master/badge.svg)](https://github.com/JuliaTesting/Aqua.jl) ## Overview This package implements a modern version of the “No-U-turn sampler” in the Julia language, mostly as described in [Betancourt (2017)](https://arxiv.org/abs/1701.02434), with some tweaks. In contrast to frameworks which utilize a directed acyclic graph to build a posterior for a Bayesian model from small components, this package requires that you code a *log-density function* of the posterior in Julia. Derivatives can be provided manually, or using [automatic differentiation](http://www.juliadiff.org/). Consequently, this package requires that the user is comfortable with the basics of the theory of Bayesian inference, to the extent of coding a (log) posterior density in Julia. This approach allows the use of standard tools like [profiling](https://docs.julialang.org/en/v1/manual/profile/) and [benchmarking](https://github.com/JuliaCI/BenchmarkTools.jl) to optimize its [performance](https://docs.julialang.org/en/v1/manual/performance-tips/). 
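As a concrete illustration of the kind of object the package consumes, here is a minimal sketch of a hand-coded log density with a manually derived gradient. This example is not part of this repository — `NormalMeanProblem` and `log_posterior_and_gradient` are hypothetical names chosen for illustration:

```julia
# Hypothetical example (not from this package): the log posterior, up to an
# additive constant, for y_i ~ Normal(μ, 1) with a flat prior on μ, together
# with its manually derived gradient with respect to μ.
struct NormalMeanProblem
    y::Vector{Float64}
end

function log_posterior_and_gradient(p::NormalMeanProblem, μ::Real)
    r = p.y .- μ                 # residuals
    ℓ = -sum(abs2, r) / 2        # log density, constant terms dropped
    ∇ℓ = sum(r)                  # dℓ/dμ
    ℓ, ∇ℓ
end
```

In practice one would wrap such a function in the LogDensityProblems.jl interface (or obtain the gradient via automatic differentiation) before handing it to the sampler; see the documentation linked below.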
The building blocks of the algorithm are implemented using a *functional* (non-modifying) approach whenever possible, allowing extensive unit testing of components, while also serving as a transparent, pedagogical introduction to the low-level mechanics of current Hamiltonian Monte Carlo samplers, and as a platform for research into MCMC methods. Please start with the [documentation](https://tamaspapp.eu/DynamicHMC.jl/dev/). ## Examples - Some basic examples are available in [DynamicHMCExamples.jl](https://github.com/tpapp/DynamicHMCExamples.jl). - [DynamicHMCModels.jl](https://github.com/StatisticalRethinkingJulia/DynamicHMCModels.jl) contains worked examples from the [Statistical Rethinking](https://xcelab.net/rm/statistical-rethinking/) book. ## Support and participation For general questions, open an issue or ask on [the Discourse forum](https://discourse.julialang.org/). I am happy to help with models. Users who rely on this package and want to participate in discussions are encouraged to subscribe to GitHub notifications (“watching” the package). Also, I will do my best to accommodate feature requests; just open an issue. ## Bibliography Betancourt, M. J., Byrne, S., & Girolami, M. (2014). Optimizing the integrator step size for Hamiltonian Monte Carlo. [arXiv preprint arXiv:1411.6669](https://arxiv.org/pdf/1411.6669). Betancourt, M. (2016). Diagnosing suboptimal cotangent disintegrations in Hamiltonian Monte Carlo. [arXiv preprint arXiv:1604.00695](https://arxiv.org/abs/1604.00695). Betancourt, M. (2017). A Conceptual Introduction to Hamiltonian Monte Carlo. [arXiv preprint arXiv:1701.02434](https://arxiv.org/abs/1701.02434). Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). Bayesian data analysis. CRC Press. Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Hoffman, M. D., & Gelman, A. (2014).
The No-U-turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623. McElreath, R. (2018). Statistical rethinking: A Bayesian course with examples in R and Stan. Chapman and Hall/CRC. ================================================ FILE: appveyor.yml ================================================ environment: matrix: - julia_version: 1 - julia_version: nightly platform: - x86 # 32-bit - x64 # 64-bit # # Uncomment the following lines to allow failures on nightly julia # # (tests will run but not make your overall status red) # matrix: # allow_failures: # - julia_version: nightly branches: only: - master - /release-.*/ notifications: - provider: Email on_build_success: false on_build_failure: false on_build_status_changed: false install: - ps: iex ((new-object net.webclient).DownloadString("https://raw.githubusercontent.com/JuliaCI/Appveyor.jl/version-1/bin/install.ps1")) build_script: - echo "%JL_BUILD_SCRIPT%" - C:\julia\bin\julia -e "%JL_BUILD_SCRIPT%" test_script: - echo "%JL_TEST_SCRIPT%" - C:\julia\bin\julia -e "%JL_TEST_SCRIPT%" # # Uncomment to support code coverage upload. 
Should only be enabled for packages # # which would have coverage gaps without running on Windows # on_success: # - echo "%JL_CODECOV_SCRIPT%" # - C:\julia\bin\julia -e "%JL_CODECOV_SCRIPT%" ================================================ FILE: docs/Project.toml ================================================ [deps] Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4" LogDensityProblems = "6fdf6af0-433a-55f7-b3ed-c6c6e0b8df7c" LogDensityProblemsAD = "996a588d-648d-4e1f-a8f0-a84b347e47b1" MCMCDiagnosticTools = "be115224-59cd-429b-ad48-344e309966f0" OhMyThreads = "67456a42-1dca-4109-a031-0a68de7e3ad5" Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2" TransformVariables = "84d833dd-6860-57f9-a1a7-6da5db126cff" TransformedLogDensities = "f9bc47f6-f3f8-4f3b-ab21-f8bc73906f26" [compat] Documenter = "1" LogDensityProblems = "2" ================================================ FILE: docs/make.jl ================================================ using Documenter, DynamicHMC makedocs(modules = [DynamicHMC], format = Documenter.HTML(; prettyurls = get(ENV, "CI", nothing) == "true"), clean = true, sitename = "DynamicHMC.jl", authors = "Tamás K. Papp", checkdocs = :exports, linkcheck = true, linkcheck_ignore = [r"^.*xcelab\.net.*$", r"^.*stat\.columbia\.edu.*$"], pages = ["Introduction" => "index.md", "A worked example" => "worked_example.md", "Documentation" => "interface.md"]) deploydocs(repo = "github.com/tpapp/DynamicHMC.jl.git", push_preview = true) ================================================ FILE: docs/src/index.md ================================================ # Introduction [DynamicHMC.jl](https://github.com/tpapp/DynamicHMC.jl/) implements a variant of the “No-U-Turn Sampler” of [Hoffman and Gelman (2014)](https://arxiv.org/abs/1111.4246), as described in [Betancourt (2017)](https://arxiv.org/abs/1701.02434).[^1] This package is mainly useful for Bayesian inference.
[^1]: In order to make the best use of this package, you should read at least the latter paper thoroughly. In order to use it, you should be familiar with the conceptual building blocks of Bayesian inference, most importantly, you should be able to code a (log) posterior as a function in Julia.[^2] The package aims to “[do one thing and do it well](https://en.wikipedia.org/wiki/Unix_philosophy#Do_One_Thing_and_Do_It_Well)”: given a log density function ```math \ell: \mathbb{R}^k \to \mathbb{R} ``` for which you have values ``\ell(x)`` and the gradient ``\nabla \ell(x)``, it samples values from a density ```math p(x) \propto \exp(\ell(x)) ``` using the algorithm above. [^2]: For various techniques and a discussion of MCMC methods (eg domain transformations, or integrating out discrete parameters), you may find the [Stan Modeling Language manual](https://mc-stan.org/users/documentation/index.html) helpful. If you are unfamiliar with Bayesian methods, I would recommend [Bayesian Data Analysis](http://www.stat.columbia.edu/~gelman/book/) and [Statistical Rethinking](https://xcelab.net/rm/statistical-rethinking/). The interface of DynamicHMC.jl expects that you code ``\ell(x), \nabla\ell(x)`` using the interface of the [LogDensityProblems.jl](https://github.com/tpapp/LogDensityProblems.jl) package. The latter package also allows you to just code ``\ell(x)`` and obtain ``\nabla\ell(x)`` via [automatic differentiation](https://en.wikipedia.org/wiki/Automatic_differentiation). While the NUTS algorithm operates on an *unrestricted* domain ``\mathbb{R}^k``, some parameters have natural restrictions: for example, standard deviations are positive, valid correlation matrices are a subset of all matrices, and structural econometric models can have parameter restrictions for stability. 
In order to sample from posteriors with parameters like these, *domain transformations* are required.[^3] Also, it is advantageous to decompose a flat vector `x` into a collection of parameters in a disciplined manner. It is recommended that you use [TransformVariables.jl](https://github.com/tpapp/TransformVariables.jl) in combination with [TransformedLogDensities.jl](https://github.com/tpapp/TransformedLogDensities.jl) for this purpose: it has built-in transformations for common cases, and also allows decomposing vectors into tuples, named tuples, and arrays of parameters, combined with these transformations. [^3]: For nonlinear transformations, correcting with the logarithm of the determinant of the Jacobian is required. ## Use cases This package has the following intended use cases: 1. A robust and simple engine for MCMC. The intended audience is users who like to code their (log)posteriors directly, optimize and benchmark them as Julia code, and at the same time want to have access to detailed diagnostic information from the NUTS sampler. 2. A *backend* for another interface that needs a NUTS implementation. 3. A *research platform* for advances in MCMC methods. The code of this package is extensively documented, and should allow easy extensions and experiments using multiple dispatch. Contributions are always welcome. ## Support If you have questions, feature requests, or bug reports, please [open an issue](https://github.com/tpapp/DynamicHMC.jl/issues/new). I would like to emphasize that it is perfectly fine to open issues just to ask questions. You can also address questions to [`@Tamas_Papp`](https://discourse.julialang.org/u/Tamas_Papp) on the Julia discourse forum. ## Versioning and interface changes Package versioning follows [Semantic Versioning 2.0](https://semver.org/). Only major version increments change the API in a breaking manner, but there is no deprecation cycle.
You are strongly advised to add a [compatibility section](https://pkgdocs.julialang.org/dev/compatibility/) to your `Project.toml`, eg ```toml [compat] DynamicHMC = "3" ``` Only symbols (functions and types) exported directly or indirectly from the `DynamicHMC` module are considered part of the interface. Importantly, the [`DynamicHMC.Diagnostics`](@ref Diagnostics) submodule is not considered part of the interface with respect to semantic versioning, and may be changed with just a minor version increment. The rationale for this is that a good generic diagnostics interface is much harder to get right, so some experimental improvements, occasionally reverted or redesigned, will be normal for this package in the medium run. If you depend on this explicitly in non-interactive code, use ```toml [compat] DynamicHMC = "~3.3.0" ``` There is an actively maintained [CHANGELOG](https://github.com/tpapp/DynamicHMC.jl/blob/master/CHANGELOG.md) which is worth reading after every release, especially major ones. ================================================ FILE: docs/src/interface.md ================================================ # User interface ```@docs DynamicHMC ``` ## Sampling The main entry point for sampling is ```@docs mcmc_with_warmup ``` ## Utilities for working with the posterior ```@docs stack_posterior_matrices pool_posterior_matrices ``` ## Warmup Warmup can be customized. ### Default warmup sequences A warmup sequence is just a tuple of [warmup building blocks](@ref wbb). Two commonly used sequences are predefined. ```@docs default_warmup_stages fixed_stepsize_warmup_stages ``` ### [Warmup building blocks](@id wbb) ```@docs InitialStepsizeSearch DualAveraging TuningNUTS GaussianKineticEnergy ``` ## Progress report Progress reports can be explicit or silent. 
```@docs NoProgressReport LogProgressReport ProgressMeterReport ``` ## Algorithm and parameters You probably won't need to change these options with normal usage, except possibly increasing the maximum tree depth. ```@docs DynamicHMC.NUTS ``` ## Inspecting warmup !!! note The warmup interface below is not considered part of the exposed API, and may change with just a minor version bump. It is intended for interactive use; the docstrings and the field names of results should be informative. ```@docs DynamicHMC.mcmc_keep_warmup ``` ## Stepwise sampling !!! note The stepwise sampling interface below is not considered part of the exposed API, and may change with just a minor version bump. An experimental interface is available to users who wish to do MCMC one step at a time, eg until some desired criterion about effective sample size or mixing is satisfied. See the docstrings below for an example. ```@docs DynamicHMC.mcmc_steps DynamicHMC.mcmc_next_step ``` # Diagnostics !!! note Strictly speaking, the `Diagnostics` submodule API is not considered part of the exposed interface, and may change with just a minor version bump. It is intended for interactive use. ```@docs DynamicHMC.Diagnostics.explore_log_acceptance_ratios DynamicHMC.Diagnostics.summarize_tree_statistics DynamicHMC.Diagnostics.leapfrog_trajectory DynamicHMC.Diagnostics.EBFMI DynamicHMC.PhasePoint ``` ================================================ FILE: docs/src/worked_example.md ================================================ # A worked example !!! note An extended version of this example can be found [in the DynamicHMCExamples.jl package](https://github.com/tpapp/DynamicHMCExamples.jl/blob/master/src/example_independent_bernoulli.jl). ## Problem statement Consider estimating the parameter ``0 \le \alpha \le 1`` from ``n`` IID observations[^4] [^4]: Note that NUTS is not especially suitable for low-dimensional parameter spaces, but this example works fine.
```math y_i \sim \mathrm{Bernoulli}(\alpha) ``` We will code this with the help of TransformVariables.jl, and obtain the gradient with ForwardDiff.jl (in practice, at the moment I would recommend [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl) for small models, and [Flux.jl](https://github.com/FluxML/Flux.jl) for larger ones — consider benchmarking a single evaluation of the log density with gradient).[^5] [^5]: An example of how you can benchmark a log density with gradient `∇P`, obtained as described below: ```julia using BenchmarkTools, LogDensityProblems x = randn(LogDensityProblems.dimension(∇P)) @benchmark LogDensityProblems.logdensity_and_gradient($∇P, $x) ``` ## Coding up the log density First, we load the packages we use. ```@example bernoulli using TransformVariables, TransformedLogDensities, LogDensityProblems, LogDensityProblemsAD, DynamicHMC, DynamicHMC.Diagnostics, Statistics, Random nothing # hide ``` Generally, I would recommend defining an immutable composite type (ie `struct`) to hold the data and all parameters relevant for the log density (eg the prior). This allows you to test your code in a modular way before sampling. For this model, the number of draws equal to `1` is a sufficient statistic. ```@example bernoulli struct BernoulliProblem n::Int # Total number of draws in the data s::Int # Number of draws `==1` in the data end ``` Then we make this problem *callable* with the parameters. Here, we have a single parameter `α`, but pass this in a `NamedTuple` to demonstrate a generally useful pattern. Then, we define an instance of this problem with the data, called `p`.[^6] [^6]: Note that here we used a *flat prior*. This is generally not a good idea for variables with non-finite support: one would usually make priors parameters of the `struct` above, and add the log prior to the log likelihood above. 
```@example bernoulli function (problem::BernoulliProblem)(θ) (; α) = θ # extract the parameters (; n, s) = problem # extract the data # log likelihood, with constant terms dropped s * log(α) + (n-s) * log(1-α) end ``` It is generally a good idea to test that your code works by calling it with the parameters; it should return a log likelihood. For more complex models, you should benchmark and [optimize](https://docs.julialang.org/en/v1/manual/performance-tips/) this callable directly. ```@example bernoulli p = BernoulliProblem(20, 10) p((α = 0.5, )) # make sure that it works ``` With [TransformVariables.jl](https://github.com/tpapp/TransformVariables.jl), we set up a *transformation* ``\mathbb{R} \to [0,1]`` for ``\alpha``, and use the convenience function `TransformedLogDensity` to obtain a log density in ``\mathbb{R}^1``. Then, we obtain a log density that supports gradients using automatic differentiation, with [LogDensityProblemsAD.jl](https://github.com/tpapp/LogDensityProblemsAD.jl). ```@example bernoulli trans = as((α = as𝕀,)) P = TransformedLogDensity(trans, p) ∇P = ADgradient(:ForwardDiff, P) ``` Finally, we run MCMC with warmup. Note that you have to specify the *random number generator* explicitly — this is good practice for parallel code. The last parameter is the number of samples. ```@example bernoulli results = mcmc_with_warmup(Random.default_rng(), ∇P, 1000; reporter = NoProgressReport()) nothing # hide ``` The returned value is a `NamedTuple`. Most importantly, it contains the field `posterior_matrix`. You should use the transformation you defined above to retrieve the parameters (here, only `α`). We display the mean here to check that it was recovered correctly. ```@example bernoulli posterior = transform.(trans, eachcol(results.posterior_matrix)) posterior_α = first.(posterior) mean(posterior_α) ``` Using the [`DynamicHMC.Diagnostics`](@ref Diagnostics) submodule, you can obtain various useful diagnostics.
The *tree statistics* in particular contain a lot of useful information about turning, divergence, acceptance rates, and tree depths for each step of the chain. Here we just obtain a summary. ```@example bernoulli using DynamicHMC.Diagnostics summarize_tree_statistics(results.tree_statistics) ``` ## Parallel chains and diagnostics Usually one would run multiple chains and check convergence and mixing using generic MCMC diagnostics not specific to NUTS. The specifics of running multiple chains are up to the user: various forms of [parallel computing](https://docs.julialang.org/en/v1/manual/parallel-computing/) can be utilized depending on the problem scale and the hardware available. In the example below we use [multi-threading](https://docs.julialang.org/en/v1/manual/multi-threading/), using [OhMyThreads.jl](https://github.com/JuliaFolds2/OhMyThreads.jl); other excellent packages are available for threading. It is easy to obtain posterior results for use with [MCMCDiagnosticTools.jl](https://github.com/TuringLang/MCMCDiagnosticTools.jl/) using [`stack_posterior_matrices`](@ref): ```@example bernoulli using OhMyThreads, MCMCDiagnosticTools results5 = tcollect(mcmc_with_warmup(Random.default_rng(), ∇P, 1000; reporter = NoProgressReport()) for _ in 1:5) ess_rhat(stack_posterior_matrices(results5)) ``` Use [`pool_posterior_matrices`](@ref) for a pooled sample: ```@example bernoulli posterior5 = transform.(trans, eachcol(pool_posterior_matrices(results5))) posterior5_α = first.(posterior5) mean(posterior5_α) ``` ================================================ FILE: src/DynamicHMC.jl ================================================ """ Implementation of the No U-Turn Sampler for MCMC. Please [read the documentation](https://tamaspapp.eu/DynamicHMC.jl/dev/). For the impatient: you probably want to 1. define a log density problem (eg for Bayesian inference) using the `LogDensityProblems` package, then 2. use it with [`mcmc_with_warmup`](@ref).
""" module DynamicHMC using ArgCheck: @argcheck using DocStringExtensions: FIELDS, FUNCTIONNAME, SIGNATURES, TYPEDEF using FillArrays: Fill using LinearAlgebra: checksquare, cholesky, diag, dot, Diagonal, Symmetric, UniformScaling using LogDensityProblems: capabilities, LogDensityOrder, dimension, logdensity_and_gradient using LogExpFunctions: logaddexp using Random: AbstractRNG, randn, Random, randexp using Statistics: cov, mean, median, middle, quantile, var using TensorCast: @cast, TensorCast include("utilities.jl") include("trees.jl") include("hamiltonian.jl") include("stepsize.jl") include("NUTS.jl") include("reporting.jl") include("mcmc.jl") include("diagnostics.jl") end # module ================================================ FILE: src/NUTS.jl ================================================ ##### ##### NUTS tree sampler implementation. ##### export NUTS #### #### Trajectory and implementation #### """ Representation of a trajectory, ie a Hamiltonian with a discrete integrator that also checks for divergence. """ struct TrajectoryNUTS{TH,Tf,S} "Hamiltonian." H::TH "Log density of z (negative log energy) at initial point." π₀::Tf "Stepsize for leapfrog." ϵ::Tf "Smallest decrease allowed in the log density." min_Δ::Tf "Turn statistic configuration." turn_statistic_configuration::S end function move(trajectory::TrajectoryNUTS, z, fwd) (; H, ϵ) = trajectory leapfrog(H, z, fwd ? ϵ : -ϵ) end ### ### proposals ### """ $(SIGNATURES) Random boolean which is `true` with the given probability `exp(logprob)`, which can be `≥ 1` in which case no random value is drawn. """ function rand_bool_logprob(rng::AbstractRNG, logprob) logprob ≥ 0 || (randexp(rng, Float64) > -logprob) end function calculate_logprob2(::TrajectoryNUTS, is_doubling, ω₁, ω₂, ω) biased_progressive_logprob2(is_doubling, ω₁, ω₂, ω) end function combine_proposals(rng, ::TrajectoryNUTS, z₁, z₂, logprob2::Real, is_forward) rand_bool_logprob(rng, logprob2) ? 
z₂ : z₁ end ### ### statistics for visited nodes ### struct AcceptanceStatistic{T} """ Logarithm of the sum of Metropolis acceptance probabilities over the whole trajectory (including invalid parts). """ log_sum_α::T "Total number of leapfrog steps." steps::Int end function combine_acceptance_statistics(A::AcceptanceStatistic, B::AcceptanceStatistic) AcceptanceStatistic(logaddexp(A.log_sum_α, B.log_sum_α), A.steps + B.steps) end """ $(SIGNATURES) Acceptance statistic for a leaf. The initial leaf is considered not to be visited. """ function leaf_acceptance_statistic(Δ, is_initial) is_initial ? AcceptanceStatistic(oftype(Δ, -Inf), 0) : AcceptanceStatistic(min(Δ, 0), 1) end """ $(SIGNATURES) Return the acceptance rate (a `Real` between `0` and `1`). """ acceptance_rate(A::AcceptanceStatistic) = min(exp(A.log_sum_α) / A.steps, 1) combine_visited_statistics(::TrajectoryNUTS, v, w) = combine_acceptance_statistics(v, w) ### ### turn analysis ### """ Statistics for the identification of turning points. See Betancourt (2017, appendix), and subsequent discussion of improvements. Momenta `p₋` and `p₊` are kept so that they can be added to `ρ` when combining turn statistics. Turn detection is always done by [`combine_turn_statistics`](@ref), which returns `nothing` in case of turning. A `GeneralizedTurnStatistic` should always correspond to a trajectory that is *not* turning (or a leaf node, where the concept does not apply). """ struct GeneralizedTurnStatistic{T} "momentum at the left edge of the trajectory" p₋::T "p♯ at the left edge of the trajectory" p♯₋::T "momentum at the right edge of the trajectory" p₊::T "p♯ at the right edge of the trajectory" p♯₊::T "sum of momenta along trajectory" ρ::T end function leaf_turn_statistic(::Val{:generalized}, H, z) p♯ = calculate_p♯(H, z) GeneralizedTurnStatistic(z.p, p♯, z.p, p♯, z.p) end """ $(SIGNATURES) Internal test for turning. See Betancourt (2017, appendix).
""" _is_turning(p♯₋, p♯₊, ρ) = dot(p♯₋, ρ) < 0 || dot(p♯₊, ρ) < 0 function combine_turn_statistics(::TrajectoryNUTS, x::GeneralizedTurnStatistic, y::GeneralizedTurnStatistic) _is_turning(x.p♯₋, y.p♯₋, x.ρ + y.p₋) && return nothing _is_turning(x.p♯₊, y.p♯₊, x.p₊ + y.ρ) && return nothing ρ = x.ρ + y.ρ _is_turning(x.p♯₋, y.p♯₊, ρ) && return nothing GeneralizedTurnStatistic(x.p₋, x.p♯₋, y.p₊, y.p♯₊, ρ) end is_turning(::TrajectoryNUTS, ::GeneralizedTurnStatistic) = false is_turning(::TrajectoryNUTS, ::Nothing) = true ### ### leafs ### function leaf(trajectory::TrajectoryNUTS, z, is_initial) (; H, π₀, min_Δ, turn_statistic_configuration) = trajectory Δ = is_initial ? zero(π₀) : logdensity(H, z) - π₀ isdiv = Δ < min_Δ v = leaf_acceptance_statistic(Δ, is_initial) if isdiv nothing, v else τ = leaf_turn_statistic(turn_statistic_configuration, H, z) (z, Δ, τ), v end end #### #### NUTS interface #### "Default maximum depth for trees." const DEFAULT_MAX_TREE_DEPTH = 10 """ $(TYPEDEF) Implementation of the “No-U-Turn Sampler” of Hoffman and Gelman (2014), including subsequent developments, as described in Betancourt (2017). # Fields $(FIELDS) """ struct NUTS{S} "Maximum tree depth." max_depth::Int "Threshold for negative energy relative to starting point that indicates divergence." min_Δ::Float64 """ Turn statistic configuration. Currently only `Val(:generalized)` (the default) is supported. """ turn_statistic_configuration::S function NUTS(; max_depth = DEFAULT_MAX_TREE_DEPTH, min_Δ = -1000.0, turn_statistic_configuration = Val{:generalized}()) @argcheck 0 < max_depth ≤ MAX_DIRECTIONS_DEPTH @argcheck min_Δ < 0 S = typeof(turn_statistic_configuration) new{S}(Int(max_depth), Float64(min_Δ), turn_statistic_configuration) end end """ $(TYPEDEF) Diagnostic information for a single tree built with the No-U-turn sampler. # Fields Accessing fields directly is part of the API. $(FIELDS) """ struct TreeStatisticsNUTS "Log density of the Hamiltonian (negative energy)." 
π::Float64 "Depth of the tree." depth::Int "Reason for termination. See [`InvalidTree`](@ref) and [`REACHED_MAX_DEPTH`](@ref)." termination::InvalidTree "Acceptance rate statistic." acceptance_rate::Float64 "Number of leapfrog steps evaluated." steps::Int "Directions for tree doubling (useful for debugging)." directions::Directions end """ $(SIGNATURES) No-U-turn Hamiltonian Monte Carlo transition, using Hamiltonian `H`, starting at evaluated log density position `Q`, using stepsize `ϵ`. Parameters of `algorithm` govern the details of tree construction. Return two values, the new evaluated log density position, and tree statistics. """ function sample_tree(rng, algorithm::NUTS, H::Hamiltonian, Q::EvaluatedLogDensity, ϵ; p = rand_p(rng, H.κ), directions = rand(rng, Directions)) (; max_depth, min_Δ, turn_statistic_configuration) = algorithm z = PhasePoint(Q, p) trajectory = TrajectoryNUTS(H, logdensity(H, z), ϵ, min_Δ, turn_statistic_configuration) ζ, v, termination, depth = sample_trajectory(rng, trajectory, z, max_depth, directions) tree_statistics = TreeStatisticsNUTS(logdensity(H, ζ), depth, termination, acceptance_rate(v), v.steps, directions) ζ.Q, tree_statistics end ================================================ FILE: src/diagnostics.jl ================================================ ##### ##### statistics and diagnostics for the NUTS-specific tree statistics. ##### ##### For general posterior diagnostics, see `stack_posterior_matrices`. 
module Diagnostics export EBFMI, summarize_tree_statistics, explore_log_acceptance_ratios, leapfrog_trajectory, InvalidTree, REACHED_MAX_DEPTH, is_divergent using DynamicHMC: GaussianKineticEnergy, Hamiltonian, evaluate_ℓ, InvalidTree, REACHED_MAX_DEPTH, is_divergent, local_log_acceptance_ratio, PhasePoint, rand_p, leapfrog, logdensity, MAX_DIRECTIONS_DEPTH using ArgCheck: @argcheck using DocStringExtensions: FIELDS, SIGNATURES, TYPEDEF using LogDensityProblems: dimension import Random using Statistics: mean, quantile, var """ $(SIGNATURES) Energy Bayesian fraction of missing information. Useful for diagnosing poorly chosen kinetic energies. Low values (`≤ 0.3`) are considered problematic. See Betancourt (2016). """ function EBFMI(tree_statistics) πs = map(x -> x.π, tree_statistics) mean(abs2, diff(πs)) / var(πs) end "Acceptance quantiles for [`TreeStatisticsSummary`](@ref) diagnostic summary." const ACCEPTANCE_QUANTILES = [0.05, 0.25, 0.5, 0.75, 0.95] """ $(TYPEDEF) Storing the output of [`summarize_tree_statistics`](@ref) in a structured way, for pretty printing. Currently for internal use. # Fields $(FIELDS) """ struct TreeStatisticsSummary{T <: Real, C <: NamedTuple} "Sample length." N::Int "average acceptance rate" a_mean::T "acceptance quantiles" a_quantiles::Vector{T} "termination counts" termination_counts::C "depth counts (first element is for `0`)" depth_counts::Vector{Int} end """ $(SIGNATURES) Count termination reasons in `tree_statistics`, return as a `NamedTuple`. """ function count_terminations(tree_statistics) max_depth = 0 divergence = 0 turning = 0 for tree_statistic in tree_statistics it = tree_statistic.termination if it == REACHED_MAX_DEPTH max_depth += 1 elseif is_divergent(it) divergence += 1 else turning += 1 end end (; max_depth, divergence, turning) end """ $(SIGNATURES) Count depths in tree statistics.
""" function count_depths(tree_statistics) c = zeros(Int, MAX_DIRECTIONS_DEPTH + 1) for tree_statistic in tree_statistics c[tree_statistic.depth + 1] += 1 end c[1:something(findlast(!iszero, c), 0)] end """ $(SIGNATURES) Summarize tree statistics. Mostly useful for NUTS diagnostics. """ function summarize_tree_statistics(tree_statistics) As = map(x -> x.acceptance_rate, tree_statistics) TreeStatisticsSummary(length(tree_statistics), mean(As), quantile(As, ACCEPTANCE_QUANTILES), count_terminations(tree_statistics), count_depths(tree_statistics)) end function Base.show(io::IO, stats::TreeStatisticsSummary) (; N, a_mean, a_quantiles, termination_counts, depth_counts) = stats println(io, "Hamiltonian Monte Carlo sample of length $(N)") print(io, " acceptance rate mean: $(round(a_mean; digits = 2)), 5/25/50/75/95%:") for aq in a_quantiles print(io, " ", round(aq; digits = 2)) end println(io) function print_percentages(pairs) is_first = true for (key, value) in sort(collect(pairs), by = first) if is_first is_first = false else print(io, ",") end print(io, " $(key) => $(round(Int, 100*value/N))%") end end print(io, " termination:") print_percentages(pairs(termination_counts)) println(io) print(io, " depth:") print_percentages(zip(axes(depth_counts, 1) .- 1, depth_counts)) end #### #### Acceptance ratio diagnostics #### """ $(SIGNATURES) From an initial position, calculate the uncapped log acceptance ratio for the given log2 stepsizes and momentums `ps`, `N` of which are generated randomly by default. """ function explore_log_acceptance_ratios(ℓ, q, log2ϵs; rng = Random.default_rng(), κ = GaussianKineticEnergy(dimension(ℓ)), N = 20, ps = [rand_p(rng, κ) for _ in 1:N]) H = Hamiltonian(κ, ℓ) Q = evaluate_ℓ(ℓ, q) As = [local_log_acceptance_ratio(H, PhasePoint(Q, p)) for p in ps] [A(2.0^log2ϵ) for log2ϵ in log2ϵs, A in As] end """ $(TYPEDEF) Implements an iterator on a leapfrog trajectory until the first non-finite log density. 
# Fields $(FIELDS) """ struct LeapfrogTrajectory{TH,TZ,TF,Tϵ} "Hamiltonian" H::TH "Initial position" z₀::TZ "Negative energy at initial position." π₀::TF "Stepsize (negative: move backward)." ϵ::Tϵ end Base.IteratorSize(::Type{<:LeapfrogTrajectory}) = Base.SizeUnknown() function Base.iterate(lft::LeapfrogTrajectory, zi = (lft.z₀, 0)) (; H, ϵ, π₀) = lft z, i = zi if isfinite(z.Q.ℓq) z′ = leapfrog(H, z, ϵ) i′ = i + sign(ϵ) _position_information(lft, z′, i′), (z′, i′) else nothing end end """ $(SIGNATURES) Position information returned by [`leapfrog_trajectory`](@ref), see documentation there. Internal function. """ function _position_information(lft::LeapfrogTrajectory, z, i) (; H, π₀) = lft (z = z, position = i, Δ = logdensity(H, z) - π₀) end """ $(SIGNATURES) Calculate a leapfrog trajectory visiting `positions` (specified as a `UnitRange`, eg `-5:5`) relative to the starting point `q`, with stepsize `ϵ`. `positions` has to contain `0`, and the trajectories are only tracked up to the first non-finite log density in each direction. Returns a vector of `NamedTuple`s, each containing - `z`, a [`PhasePoint`](@ref) object, - `position`, the corresponding position, - `Δ`, the log density of the Hamiltonian (including the kinetic energy term) relative to position `0`. """ function leapfrog_trajectory(ℓ, q, ϵ, positions::UnitRange{<:Integer}; rng = Random.default_rng(), κ = GaussianKineticEnergy(dimension(ℓ)), p = rand_p(rng, κ)) A, B = first(positions), last(positions) @argcheck A ≤ 0 ≤ B "`positions` must contain `0`."
Q = evaluate_ℓ(ℓ, q) H = Hamiltonian(κ, ℓ) z₀ = PhasePoint(Q, p) π₀ = logdensity(H, z₀) lft_fwd = LeapfrogTrajectory(H, z₀, π₀, ϵ) fwd_part = collect(Iterators.take(lft_fwd, B)) bwd_part = collect(Iterators.take(LeapfrogTrajectory(H, z₀, π₀, -ϵ), -A)) vcat(reverse!(bwd_part), _position_information(lft_fwd, z₀, 0), fwd_part) end end ================================================ FILE: src/hamiltonian.jl ================================================ ##### ##### Building blocks for traversing a Hamiltonian deterministically, using the leapfrog ##### integrator. ##### export GaussianKineticEnergy #### #### kinetic energy #### """ $(TYPEDEF) Kinetic energy specifications. Implements the methods - `Base.size` - [`kinetic_energy`](@ref) - [`calculate_p♯`](@ref) - [`∇kinetic_energy`](@ref) - [`rand_p`](@ref) For all subtypes, it is implicitly assumed that kinetic energy is symmetric in the momentum `p`, ```julia kinetic_energy(κ, p, q) == kinetic_energy(κ, .-p, q) ``` When the above is violated, the consequences are undefined. """ abstract type KineticEnergy end """ $(TYPEDEF) Euclidean kinetic energies (position independent). """ abstract type EuclideanKineticEnergy <: KineticEnergy end """ $(TYPEDEF) Gaussian kinetic energy, with ``K(q,p) = p ∣ q ∼ 1/2 pᵀ⋅M⁻¹⋅p + log|M|`` (without constant), which is independent of ``q``. The inverse covariance ``M⁻¹`` is stored. !!! note Making ``M⁻¹`` approximate the posterior variance is a reasonable starting point. """ struct GaussianKineticEnergy{T <: AbstractMatrix, S <: AbstractMatrix} <: EuclideanKineticEnergy "M⁻¹" M⁻¹::T "W such that W*W'=M. Used for generating random draws." W::S function GaussianKineticEnergy(M⁻¹::T, W::S) where {T, S} @argcheck checksquare(M⁻¹) == checksquare(W) new{T,S}(M⁻¹, W) end end """ $(SIGNATURES) Gaussian kinetic energy with the given inverse covariance matrix `M⁻¹`. 
""" GaussianKineticEnergy(M⁻¹::AbstractMatrix) = GaussianKineticEnergy(M⁻¹, cholesky(inv(M⁻¹)).L) """ $(SIGNATURES) Gaussian kinetic energy with the given inverse covariance matrix `M⁻¹`. """ GaussianKineticEnergy(M⁻¹::Diagonal) = GaussianKineticEnergy(M⁻¹, Diagonal(.√inv.(diag(M⁻¹)))) """ $(SIGNATURES) Gaussian kinetic energy with a diagonal inverse covariance matrix `M⁻¹=m⁻¹*I`. """ GaussianKineticEnergy(N::Integer, m⁻¹ = 1.0) = GaussianKineticEnergy(Diagonal(Fill(m⁻¹, N))) function Base.show(io::IO, κ::GaussianKineticEnergy{T}) where {T} print(io::IO, "Gaussian kinetic energy ($(nameof(T))), √diag(M⁻¹): $(.√(diag(κ.M⁻¹)))") end ## NOTE about implementation: the 3 methods are callable without a third argument (`q`) ## because they are defined for Gaussian (Euclidean) kinetic energies. Base.size(κ::GaussianKineticEnergy, args...) = size(κ.M⁻¹, args...) """ $(SIGNATURES) Return kinetic energy `κ`, at momentum `p`. """ kinetic_energy(κ::GaussianKineticEnergy, p, q = nothing) = dot(p, κ.M⁻¹ * p) / 2 """ $(SIGNATURES) Return ``p♯ = M⁻¹⋅p``, used for turn diagnostics. """ calculate_p♯(κ::GaussianKineticEnergy, p, q = nothing) = κ.M⁻¹ * p """ $(SIGNATURES) Calculate the gradient of the logarithm of kinetic energy in momentum `p`. """ ∇kinetic_energy(κ::GaussianKineticEnergy, p, q = nothing) = calculate_p♯(κ, p) """ $(SIGNATURES) Generate a random momentum from a kinetic energy at position `q`. """ rand_p(rng::AbstractRNG, κ::GaussianKineticEnergy, q = nothing) = κ.W * randn(rng, size(κ.W, 1)) #### #### Hamiltonian #### struct Hamiltonian{K,L} "The kinetic energy specification." κ::K """ The (log) density we are sampling from. Supports the `LogDensityProblem` API. Technically, it is the negative of the potential energy. """ ℓ::L """ $(SIGNATURES) Construct a Hamiltonian from the log density `ℓ`, and the kinetic energy specification `κ`. `ℓ` with a vector are expected to support the `LogDensityProblems` API, with gradients. 
""" function Hamiltonian(κ::K, ℓ::L) where {K <: KineticEnergy,L} @argcheck capabilities(ℓ) ≥ LogDensityOrder(1) @argcheck dimension(ℓ) == size(κ, 1) new{K,L}(κ, ℓ) end end Base.show(io::IO, H::Hamiltonian) = print(io, "Hamiltonian with $(H.κ)") """ $(TYPEDEF) A log density evaluated at position `q`. The log densities and gradient are saved, so that they are not calculated twice for every leapfrog step (both as start- and endpoints). Because of caching, a `EvaluatedLogDensity` should only be used with a specific Hamiltonian, preferably constructed with the `evaluate_ℓ` constructor. In composite types and arguments, `Q` is usually used for this type. """ struct EvaluatedLogDensity{T,S} "Position." q::T "ℓ(q). Saved for reuse in sampling." ℓq::S "∇ℓ(q). Cached for reuse in sampling." ∇ℓq::T function EvaluatedLogDensity(q::T, ℓq::S, ∇ℓq::T) where {T <: AbstractVector,S <: Real} @argcheck length(q) == length(∇ℓq) new{T,S}(q, ℓq, ∇ℓq) end end # general constructors below are necessary to sanitize input from eg Diagnostics, or an # initial position given as integers, etc function EvaluatedLogDensity(q::AbstractVector, ℓq::Real, ∇ℓq::AbstractVector) q, ∇ℓq = promote(q, ∇ℓq) EvaluatedLogDensity(q, ℓq, ∇ℓq) end EvaluatedLogDensity(q, ℓq::Real, ∇ℓq) = EvaluatedLogDensity(collect(q), ℓq, collect(∇ℓq)) """ $(SIGNATURES) Evaluate log density and gradient and save with the position. Preferred interface for creating `EvaluatedLogDensity` instances. Non-finite elements in `q` always throw an error. Non-finite and not `-Inf` elements in the log density throw an error if `strict`, otherwise replace the log density with `-Inf`. Non-finite elements in the gradient throw an error if `strict`, otherwise replace the log density with `-Inf`. 
""" function evaluate_ℓ(ℓ, q; strict::Bool = false) all(isfinite, q) || _error("Position vector has non-finite elements."; q) ℓq, ∇ℓq = logdensity_and_gradient(ℓ, q) if (isfinite(ℓq) && all(isfinite, ∇ℓq)) || ℓq == -Inf # everything is finite, or log density is -Inf, which will be rejected EvaluatedLogDensity(q, ℓq, ∇ℓq) elseif !strict # something went wrong, but proceed and replace log density with -Inf, so it is # rejected. EvaluatedLogDensity(q, oftype(ℓq, -Inf), ∇ℓq) # somew elseif isfinite(ℓq) _error("Gradient has non-finite elements."; q, ∇ℓq) else _error("Invalid log posterior."; q, ℓq) end end """ $(TYPEDEF) A point in phase space, consists of a position (in the form of an evaluated log density `ℓ` at `q`) and a momentum. """ struct PhasePoint{T <: EvaluatedLogDensity,S} "Evaluated log density." Q::T "Momentum." p::S function PhasePoint(Q::T, p::S) where {T,S} @argcheck length(p) == length(Q.q) new{T,S}(Q, p) end end """ $(SIGNATURES) Log density for Hamiltonian `H` at point `z`. If `ℓ(q) == -Inf` (rejected), skips the kinetic energy calculation. Non-finite values (incl `NaN`, `Inf`) are automatically converted to `-Inf`. This can happen if 1. the log density is not a finite value, 2. the kinetic energy is not a finite value (which usually happens when `NaN` or `Inf` got mixed in the leapfrog step, leading to an invalid position). """ function logdensity(H::Hamiltonian{<:EuclideanKineticEnergy}, z::PhasePoint) (; ℓq) = z.Q isfinite(ℓq) || return oftype(ℓq, -Inf) K = kinetic_energy(H.κ, z.p) ℓq - (isfinite(K) ? K : oftype(K, Inf)) end function calculate_p♯(H::Hamiltonian{<:EuclideanKineticEnergy}, z::PhasePoint) calculate_p♯(H.κ, z.p) end """ leapfrog(H, z, ϵ) Take a leapfrog step of length `ϵ` from `z` along the Hamiltonian `H`. Return the new phase point. The leapfrog algorithm uses the gradient of the next position to evolve the momentum. 
If this is not finite, the momentum won't be either, `logdensity` above will catch this and return an `-Inf`, making the point divergent. """ function leapfrog(H::Hamiltonian{<: EuclideanKineticEnergy}, z::PhasePoint, ϵ) (; ℓ, κ) = H (; p, Q) = z @argcheck isfinite(Q.ℓq) "Internal error: leapfrog called from non-finite log density" pₘ = p + ϵ/2 * Q.∇ℓq q′ = Q.q + ϵ * ∇kinetic_energy(κ, pₘ) Q′ = evaluate_ℓ(H.ℓ, q′) p′ = pₘ + ϵ/2 * Q′.∇ℓq PhasePoint(Q′, p′) end ================================================ FILE: src/mcmc.jl ================================================ ##### ##### Sampling: high-level interface and building blocks ##### export InitialStepsizeSearch, DualAveraging, TuningNUTS, mcmc_with_warmup, default_warmup_stages, fixed_stepsize_warmup_stages, stack_posterior_matrices, pool_posterior_matrices "Significant digits to display for reporting." const REPORT_SIGDIGITS = 3 #### #### docstrings for reuse (not exported, internal) #### const _DOC_POSTERIOR_MATRIX = "`posterior_matrix`, a matrix of position vectors, indexes by `[parameter_index, draw_index]`" const _DOC_TREE_STATISTICS = "`tree_statistics`, a vector of tree statistics for each sample" const _DOC_EPSILONS = "`ϵs`, a vector of step sizes for each sample" const _DOC_LOGDENSITIES = "`logdensities`, a vector of log densities, each associated with a column of the posterior matrix" #### #### parts unaffected by warmup #### """ $(TYPEDEF) A log density bundled with an RNG and options for sampling. Contains the parts of the problem which are **not changed during warmup** (and thus the whole sampling). # Fields $(FIELDS) """ struct SamplingLogDensity{R,L,O,S} "Random number generator." rng::R "Log density." ℓ::L """ Algorithm used for sampling, also contains the relevant parameters that are not affected by adaptation. See eg [`NUTS`](@ref). """ algorithm::O "Reporting warmup information and chain progress." 
reporter::S end #### #### warmup building blocks #### ### ### warmup state ### """ $(TYPEDEF) Representation of a warmup state. Not part of the API. # Fields $(FIELDS) """ struct WarmupState{TQ <: EvaluatedLogDensity,Tκ <: KineticEnergy, Tϵ <: Union{Real,Nothing}} "phasepoint" Q::TQ "kinetic energy" κ::Tκ "stepsize" ϵ::Tϵ end function Base.show(io::IO, warmup_state::WarmupState) (; κ, ϵ) = warmup_state ϵ_display = ϵ ≡ nothing ? "unspecified" : "≈ $(round(ϵ; sigdigits = REPORT_SIGDIGITS))" print(io, "adapted sampling parameters: stepsize (ϵ) $(ϵ_display), $(κ)") end ### ### warmup interface and stages ### """ $(SIGNATURES) Return the *results* and the *next warmup state* after warming up/adapting according to `warmup_stage`, starting from `warmup_state`. Use `nothing` for a no-op. """ function warmup(sampling_logdensity::SamplingLogDensity, warmup_stage::Nothing, warmup_state) nothing, warmup_state end """ $(SIGNATURES) Helper function to create random starting positions in the `[-2,2]ⁿ` box. """ random_position(rng, N) = rand(rng, N) .* 4 .- 2 "Docstring for initial warmup arguments." const DOC_INITIAL_WARMUP_ARGS = """ - `q`: initial position. *Default*: random (uniform [-2,2] for each coordinate). - `κ`: kinetic energy specification. *Default*: Gaussian with identity matrix. - `ϵ`: a scalar for initial stepsize, or `nothing` for heuristic finders. """ """ $(SIGNATURES) Create an initial warmup state from a random position. 
# Keyword arguments $(DOC_INITIAL_WARMUP_ARGS) """ function initialize_warmup_state(rng, ℓ; q = random_position(rng, dimension(ℓ)), κ = GaussianKineticEnergy(dimension(ℓ)), ϵ = nothing) WarmupState(evaluate_ℓ(ℓ, q; strict = true), κ, ϵ) end function warmup(sampling_logdensity, stepsize_search::InitialStepsizeSearch, warmup_state) (; rng, ℓ, reporter) = sampling_logdensity (; Q, κ, ϵ) = warmup_state @argcheck ϵ ≡ nothing "stepsize ϵ manually specified, won't perform initial search" z = PhasePoint(Q, rand_p(rng, κ)) try ϵ = find_initial_stepsize(stepsize_search, local_log_acceptance_ratio(Hamiltonian(κ, ℓ), z)) report(reporter, "found initial stepsize", ϵ = round(ϵ; sigdigits = REPORT_SIGDIGITS)) nothing, WarmupState(Q, κ, ϵ) catch e @info "failed to find initial stepsize" q = z.Q.q p = z.p κ rethrow(e) end end """ $(TYPEDEF) Tune the step size `ϵ` during sampling, and the metric of the kinetic energy at the end of the block. The method for the latter is determined by the type parameter `M`, which can be 1. `Diagonal` for diagonal metric (the default), 2. `Symmetric` for a dense metric, 3. `Nothing` for an unchanged metric. # Results A `NamedTuple` with the following fields: - $(_DOC_POSTERIOR_MATRIX) - $(_DOC_TREE_STATISTICS) - $(_DOC_EPSILONS) - $(_DOC_LOGDENSITIES) # Fields $(FIELDS) """ struct TuningNUTS{M,D} "Number of samples." N::Int "Dual averaging parameters." stepsize_adaptation::D """ Regularization factor for normalizing variance. An estimated covariance matrix `Σ` is rescaled by `λ` towards ``σ²I``, where ``σ²`` is the median of the diagonal. The constructor has a reasonable default. 
""" λ::Float64 function TuningNUTS{M}(N::Integer, stepsize_adaptation::D, λ = 5.0/N) where {M <: Union{Nothing,Diagonal,Symmetric},D} @argcheck N ≥ 20 # variance estimator is kind of meaningless for few samples @argcheck λ ≥ 0 new{M,D}(N, stepsize_adaptation, λ) end end function Base.show(io::IO, tuning::TuningNUTS{M}) where {M} (; N, stepsize_adaptation, λ) = tuning print(io, "Stepsize and metric tuner, $(N) samples, $(M) metric, regularization $(λ)") end """ $(SIGNATURES) Estimate the inverse metric from the chain. In most cases, this should be regularized, see [`regularize_M⁻¹`](@ref). """ sample_M⁻¹(::Type{Diagonal}, posterior_matrix) = Diagonal(vec(var(posterior_matrix; dims = 2))) sample_M⁻¹(::Type{Symmetric}, posterior_matrix) = Symmetric(cov(posterior_matrix; dims = 2)) """ $(SIGNATURES) Adjust the inverse metric estimated from the sample, using an *ad-hoc* shrinkage method. """ function regularize_M⁻¹(Σ::Symmetric, λ::Real) d = diag(Σ) (1 - λ) * Σ + λ * Diagonal(d) # ad-hoc “shrinkage” end regularize_M⁻¹(Σ::Union{Diagonal}, λ::Real) = Σ """ $(SIGNATURES) Create an empty posterior matrix, based on `Q` (a logdensity evaluated at a position). """ _empty_posterior_matrix(Q, N) = Matrix{eltype(Q.q)}(undef, length(Q.q), N) """ $(SIGNATURES) Create an empty vector for log densities, based on `Q` (a logdensity evaluated at a position). """ _empty_logdensity_vector(Q, N) = Vector{typeof(Q.ℓq)}(undef, N) """ $(SIGNATURES) Perform a warmup on a given `sampling_logdensity`, using the specified `tuning`, starting from `warmup_state`. Return two values. The first is either `nothing`, or a `NamedTuple` of - $(_DOC_POSTERIOR_MATRIX) - $(_DOC_TREE_STATISTICS) - $(_DOC_EPSILONS) - $(_DOC_LOGDENSITIES) The second is the warmup state. 
""" function warmup(sampling_logdensity, tuning::TuningNUTS{M}, warmup_state) where {M} (; rng, ℓ, algorithm, reporter) = sampling_logdensity (; Q, κ, ϵ) = warmup_state (; N, stepsize_adaptation, λ) = tuning posterior_matrix = _empty_posterior_matrix(Q, N) logdensities = _empty_logdensity_vector(Q, N) tree_statistics = Vector{TreeStatisticsNUTS}(undef, N) H = Hamiltonian(κ, ℓ) ϵ_state = initial_adaptation_state(stepsize_adaptation, ϵ) ϵs = Vector{Float64}(undef, N) mcmc_reporter = make_mcmc_reporter(reporter, N; currently_warmup = true, tuning = M ≡ Nothing ? "stepsize" : "stepsize and $(M) metric") for i in 1:N ϵ = current_ϵ(ϵ_state) ϵs[i] = ϵ Q, stats = sample_tree(rng, algorithm, H, Q, ϵ) posterior_matrix[:, i] = Q.q logdensities[i] = Q.ℓq tree_statistics[i] = stats ϵ_state = adapt_stepsize(stepsize_adaptation, ϵ_state, stats.acceptance_rate) report(mcmc_reporter, i; ϵ = round(ϵ; sigdigits = REPORT_SIGDIGITS)) end if M ≢ Nothing κ = GaussianKineticEnergy(regularize_M⁻¹(sample_M⁻¹(M, posterior_matrix), λ)) report(mcmc_reporter, "adaptation finished", adapted_kinetic_energy = κ) end ((; posterior_matrix, tree_statistics, ϵs, logdensities), WarmupState(Q, κ, final_ϵ(ϵ_state))) end """ $(TYPEDEF) A composite type for performing MCMC stepwise after warmup. The type is *not* part of the API, see [`mcmc_steps`](@ref) and [`mcmc_next_step`](@ref). """ struct MCMCSteps{TR,TA,TH,TE} rng::TR algorithm::TA H::TH ϵ::TE end """ $(SIGNATURES) Return a value which can be used to perform MCMC stepwise, eg until some criterion is satisfied about the sample. See [`mcmc_next_step`](@ref). Two constructors are available: 1. Explicitly providing - `rng`, the random number generator, - `algorithm`, see [`mcmc_with_warmup`](@ref), - `κ`, the (adapted) metric, - `ℓ`, the log density callable (see [`mcmc_with_warmup`](@ref), - `ϵ`, the stepsize. 2. Using the fields `sampling_logdensity` and `warmup_state`, eg from [`mcmc_keep_warmup`](@ref) (make sure you use eg `final_warmup_state`). 
# Example ```julia # initialization results = DynamicHMC.mcmc_keep_warmup(RNG, ℓ, 0; reporter = NoProgressReport()) steps = mcmc_steps(results.sampling_logdensity, results.final_warmup_state) Q = results.final_warmup_state.Q # a single update step Q, tree_stats = mcmc_next_step(steps, Q) # extract the position Q.q ``` """ mcmc_steps(rng, algorithm, κ, ℓ, ϵ) = MCMCSteps(rng, algorithm, Hamiltonian(κ, ℓ), ϵ) function mcmc_steps(sampling_logdensity::SamplingLogDensity, warmup_state) (; rng, ℓ, algorithm) = sampling_logdensity (; κ, ϵ) = warmup_state mcmc_steps(rng, algorithm, κ, ℓ, ϵ) end """ $(SIGNATURES) Given `Q` (an evaluated log density at a position), return the next `Q` and tree statistics. """ function mcmc_next_step(mcmc_steps::MCMCSteps, Q::EvaluatedLogDensity) (; rng, algorithm, H, ϵ) = mcmc_steps sample_tree(rng, algorithm, H, Q, ϵ) end """ $(SIGNATURES) Markov Chain Monte Carlo for `sampling_logdensity`, with the adapted `warmup_state`. Return a `NamedTuple` of - $(_DOC_POSTERIOR_MATRIX) - $(_DOC_TREE_STATISTICS) - $(_DOC_LOGDENSITIES) """ function mcmc(sampling_logdensity, N, warmup_state) (; reporter) = sampling_logdensity (; Q) = warmup_state posterior_matrix = _empty_posterior_matrix(Q, N) logdensities = _empty_logdensity_vector(Q, N) tree_statistics = Vector{TreeStatisticsNUTS}(undef, N) mcmc_reporter = make_mcmc_reporter(reporter, N; currently_warmup = false) steps = mcmc_steps(sampling_logdensity, warmup_state) for i in 1:N Q, tree_statistics[i] = mcmc_next_step(steps, Q) posterior_matrix[:, i] = Q.q logdensities[i] = Q.ℓq report(mcmc_reporter, i) end (; posterior_matrix, tree_statistics, logdensities) end """ $(SIGNATURES) Helper function for constructing the “middle” doubling warmup stages in [`default_warmup_stages`](@ref). 
""" function _doubling_warmup_stages(M, stepsize_adaptation, middle_steps, doubling_stages::Val{D}) where {D} ntuple(i -> TuningNUTS{M}(middle_steps * 2^(i - 1), stepsize_adaptation), D) end """ $(SIGNATURES) A sequence of warmup stages: 1. select an initial stepsize using `stepsize_search` (default: based on a heuristic), 2. tuning stepsize with `init_steps` steps 3. tuning stepsize and covariance: first with `middle_steps` steps, then repeat with twice the steps `doubling_stages` times 4. tuning stepsize with `terminating_steps` steps. `M` (`Diagonal`, the default or `Symmetric`) determines the type of the metric adapted from the sample. This is the suggested tuner of most applications. Use `nothing` for `stepsize_adaptation` to skip the corresponding step. """ function default_warmup_stages(; stepsize_search = InitialStepsizeSearch(), M::Type{<:Union{Diagonal,Symmetric}} = Diagonal, stepsize_adaptation = DualAveraging(), init_steps = 75, middle_steps = 25, doubling_stages = 5, terminating_steps = 50) (stepsize_search, TuningNUTS{Nothing}(init_steps, stepsize_adaptation), _doubling_warmup_stages(M, stepsize_adaptation, middle_steps, Val(doubling_stages))..., TuningNUTS{Nothing}(terminating_steps, stepsize_adaptation)) end """ $(SIGNATURES) A sequence of warmup stages for fixed stepsize, only tuning covariance: first with `middle_steps` steps, then repeat with twice the steps `doubling_stages` times Very similar to [`default_warmup_stages`](@ref), but omits the warmup stages with just stepsize tuning. """ function fixed_stepsize_warmup_stages(; M::Type{<:Union{Diagonal,Symmetric}} = Diagonal, middle_steps = 25, doubling_stages = 5) _doubling_warmup_stages(M, FixedStepsize(), middle_steps, Val(doubling_stages)) end """ $(SIGNATURES) Helper function for implementing warmup. !!! note Changes may imply documentation updates in [`mcmc_keep_warmup`](@ref). 
""" function _warmup(sampling_logdensity, stages, initial_warmup_state) foldl(stages; init = ((), initial_warmup_state)) do acc, stage stages_and_results, warmup_state = acc results, warmup_state′ = warmup(sampling_logdensity, stage, warmup_state) stage_information = (stage, results, warmup_state = warmup_state′) (stages_and_results..., stage_information), warmup_state′ end end "Shared docstring part for the MCMC API." const DOC_MCMC_ARGS = """ # Arguments - `rng`: the random number generator, eg `Random.default_rng()`. - `ℓ`: the log density, supporting the API of the `LogDensityProblems` package - `N`: the number of samples for inference, after the warmup. # Keyword arguments - `initialization`: see below. - `warmup_stages`: a sequence of warmup stages. See [`default_warmup_stages`](@ref) and [`fixed_stepsize_warmup_stages`](@ref); the latter requires an `ϵ` in initialization. - `algorithm`: see [`NUTS`](@ref). It is very unlikely you need to modify this, except perhaps for the maximum depth. - `reporter`: how progress is reported. By default, verbosely for interactive sessions using the log message mechanism (see [`LogProgressReport`](@ref), and no reporting for non-interactive sessions (see [`NoProgressReport`](@ref)). # Initialization The `initialization` keyword argument should be a `NamedTuple` which can contain the following fields (all of them optional and provided with reasonable defaults): $(DOC_INITIAL_WARMUP_ARGS) """ """ $(SIGNATURES) Perform MCMC with NUTS, keeping the warmup results. Returns a `NamedTuple` of - `initial_warmup_state`, which contains the initial warmup state - `warmup`, an iterable of `NamedTuple`s each containing fields - `stage`: the relevant warmup stage - `results`: results returned by that warmup stage (may be `nothing` if not applicable, or a chain, with tree statistics, etc; see the documentation of stages) - `warmup_state`: the warmup state *after* the corresponding stage. 
- `final_warmup_state`, which contains the final adaptation after all the warmup

- `inference`, which has `posterior_matrix` and `tree_statistics`, see
  [`mcmc_with_warmup`](@ref).

- `sampling_logdensity`, which contains information that is invariant to warmup

!!! warning
    This function is not (yet) exported because the warmup interface may change with
    minor versions without being considered breaking. Recommended for interactive use.

$(DOC_MCMC_ARGS)
"""
function mcmc_keep_warmup(rng::AbstractRNG, ℓ, N::Integer;
                          initialization = (),
                          warmup_stages = default_warmup_stages(),
                          algorithm = NUTS(), reporter = default_reporter())
    sampling_logdensity = SamplingLogDensity(rng, ℓ, algorithm, reporter)
    initial_warmup_state = initialize_warmup_state(rng, ℓ; initialization...)
    warmup, warmup_state = _warmup(sampling_logdensity, warmup_stages, initial_warmup_state)
    inference = mcmc(sampling_logdensity, N, warmup_state)
    (; initial_warmup_state, warmup, final_warmup_state = warmup_state, inference,
     sampling_logdensity)
end

"""
$(SIGNATURES)

Perform MCMC with NUTS, including warmup, which is not returned. Return a `NamedTuple` of

- $(_DOC_POSTERIOR_MATRIX)

- $(_DOC_TREE_STATISTICS)

- `κ` and `ϵ`, the adapted metric and stepsize.
$(DOC_MCMC_ARGS)

# Usage examples

Using a fixed stepsize:
```julia
mcmc_with_warmup(rng, ℓ, N;
                 initialization = (ϵ = 0.1, ),
                 warmup_stages = fixed_stepsize_warmup_stages())
```

Starting from a given position `q₀` and kinetic energy scaled down (will still be
adapted):
```julia
mcmc_with_warmup(rng, ℓ, N;
                 initialization = (q = q₀, κ = GaussianKineticEnergy(5, 0.1)))
```

Using a dense metric:
```julia
mcmc_with_warmup(rng, ℓ, N;
                 warmup_stages = default_warmup_stages(; M = Symmetric))
```

Disabling the initial stepsize search (provided explicitly, still adapted):
```julia
mcmc_with_warmup(rng, ℓ, N;
                 initialization = (ϵ = 1.0, ),
                 warmup_stages = default_warmup_stages(; stepsize_search = nothing))
```
"""
function mcmc_with_warmup(rng, ℓ, N; initialization = (),
                          warmup_stages = default_warmup_stages(), algorithm = NUTS(),
                          reporter = default_reporter())
    (; final_warmup_state, inference) =
        mcmc_keep_warmup(rng, ℓ, N; initialization = initialization,
                         warmup_stages = warmup_stages, algorithm = algorithm,
                         reporter = reporter)
    (; κ, ϵ) = final_warmup_state
    (; inference..., κ, ϵ)
end

####
#### utilities
####

"""
$(SIGNATURES)

Given a vector of `results`, each containing a property `posterior_matrix` (eg obtained
from [`mcmc_with_warmup`](@ref) with the same sample length), return a lazy view as an
array indexed by `[draw_index, chain_index, parameter_index]`.

This is useful as an input for eg `MCMCDiagnosticTools.ess_rhat`.

!!! note
    The ordering is not compatible with MCMCDiagnosticTools version < 0.2.
"""
function stack_posterior_matrices(results)
    @cast _[i, k, j] := results[k].posterior_matrix[j, i]
end

"""
$(SIGNATURES)

Given a vector of `results`, each containing a property `posterior_matrix` (eg obtained
from [`mcmc_with_warmup`](@ref) with the same sample length), return a lazy view as an
array indexed by `[parameter_index, pooled_draw_index]`.

This is useful for posterior analysis after diagnostics (see eg `Base.eachcol`).
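For two chains this is equivalent to `hcat` of the individual posterior matrices (a
plain-array sketch, assuming the `j ⊗ k` index combination runs the draw index fastest;
it does not use the package):

```julia
# two "chains", each a 2-parameter × 3-draw posterior matrix
results = [(posterior_matrix = [1 2 3; 4 5 6],),
           (posterior_matrix = [7 8 9; 10 11 12],)]
pooled = hcat((r.posterior_matrix for r in results)...)
size(pooled)  # (2, 6): 2 parameters, 3 draws × 2 chains
```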
""" function pool_posterior_matrices(results) @cast _[i, j ⊗ k] := results[k].posterior_matrix[i, j] end ================================================ FILE: src/reporting.jl ================================================ ##### ##### Reporting progress. ##### import ProgressMeter export NoProgressReport, LogProgressReport, ProgressMeterReport """ $(TYPEDEF) A placeholder type for not reporting any information. """ struct NoProgressReport end """ $(SIGNATURES) Report to the given `reporter`. The second argument can be 1. a string, which is displayed as is (this is supported by all reporters). 2. or a step in an MCMC chain with a known number of steps for progress reporters (see [`make_mcmc_reporter`](@ref)). `meta` arguments are key-value pairs. In this context, a *step* is a NUTS transition, not a leapfrog step. """ report(reporter::NoProgressReport, step::Union{AbstractString,Integer}; meta...) = nothing """ $(SIGNATURES) Return a reporter which can be used for progress reports with a known number of `total_steps`. May return the same reporter, or a related object. Will display `meta` as key-value pairs. ## Arguments: - `reporter::NoProgressReport`: the original reporter - `total_steps`: total number of steps ## Keyword arguments: - `currently_warmup::Bool`: `true` if we are currently doing warmup; `false` if we are currently doing MCMC - `meta`: key-value pairs that will be displayed by the reporter """ make_mcmc_reporter(reporter::NoProgressReport, total_steps; currently_warmup::Bool = false, meta...) = reporter """ $(TYPEDEF) Report progress into the `Logging` framework, using `@info`. For the information reported, a *step* is a NUTS transition, not a leapfrog step. # Fields $(FIELDS) """ Base.@kwdef struct LogProgressReport{T} "ID of chain. Can be an arbitrary object, eg `nothing`." chain_id::T = nothing "Always report progress past `step_interval` of the last report." 
step_interval::Int = 100 "Always report progress past this much time (in seconds) after the last report." time_interval_s::Float64 = 1000.0 end """ $(SIGNATURES) Assemble log message metadata. Currently, it adds `chain_id` *iff* it is not `nothing`. """ _log_meta(chain_id::Nothing, meta) = meta _log_meta(chain_id, meta) = (; chain_id, meta...) function report(reporter::LogProgressReport, message::AbstractString; meta...) @info message _log_meta(reporter.chain_id, meta)... nothing end """ $(TYPEDEF) A composite type for tracking the state for which the last log message was emitted, for MCMC reporting with a given total number of steps (see [`make_mcmc_reporter`](@ref). # Fields $(FIELDS) """ mutable struct LogMCMCReport{T} "The progress report sink." log_progress_report::T "Total steps for this stage." total_steps::Int "Index of the last reported step." last_reported_step::Int "The last time a report was logged (determined using `time_ns`)." last_reported_time_ns::UInt64 end function report(reporter::LogMCMCReport, message::AbstractString; meta...) @info message _log_meta(reporter.log_progress_report.chain_id, meta)... nothing end function make_mcmc_reporter(reporter::LogProgressReport, total_steps::Integer; currently_warmup::Bool = false, meta...) @info "Starting MCMC" total_steps = total_steps meta... LogMCMCReport(reporter, total_steps, -1, time_ns()) end function report(reporter::LogMCMCReport, step::Integer; meta...) 
(; log_progress_report, total_steps, last_reported_step, last_reported_time_ns) = reporter (; chain_id, step_interval, time_interval_s) = log_progress_report @argcheck 1 ≤ step ≤ total_steps Δ_steps = step - last_reported_step t_ns = time_ns() Δ_time_s = (t_ns - last_reported_time_ns) / 1_000_000_000 if last_reported_step < 0 || Δ_steps ≥ step_interval || Δ_time_s ≥ time_interval_s seconds_per_step = Δ_time_s / Δ_steps meta_progress = (step, seconds_per_step = round(seconds_per_step; sigdigits = 2), estimated_seconds_left = round((total_steps - step) * seconds_per_step; sigdigits = 2)) @info "MCMC progress" merge(_log_meta(chain_id, meta_progress), meta)... reporter.last_reported_step = step reporter.last_reported_time_ns = t_ns end nothing end """ $(TYPEDEF) Report progress via a progress bar, using `ProgressMeter.jl`. Example usage: ```julia julia> ProgressMeterReport() ``` """ struct ProgressMeterReport end struct ProgressMeterReportMCMC{T} currently_warmup::Bool progress_meter::T end function make_mcmc_reporter(reporter::ProgressMeterReport, total_steps::Integer; currently_warmup::Bool=false, meta...) description = currently_warmup ? "Warmup: " : "MCMC: " return ProgressMeterReportMCMC(currently_warmup, ProgressMeter.Progress(total_steps, 1, description)) end function report(reporter::ProgressMeterReport, message::AbstractString; meta...) return nothing end function report(reporter::ProgressMeterReportMCMC, message::AbstractString; meta...) return nothing end function report(reporter::ProgressMeterReport, step::Integer; meta...) return nothing end function report(reporter::ProgressMeterReportMCMC, step::Integer; meta...) ProgressMeter.next!(reporter.progress_meter) return nothing end """ $(SIGNATURES) Return a default reporter, taking the environment into account. Keyword arguments are passed to constructors when applicable. """ function default_reporter(; kwargs...) if isinteractive() LogProgressReport(; kwargs...) 
else NoProgressReport() end end ================================================ FILE: src/stepsize.jl ================================================ ##### ##### stepsize heuristics and adaptation ##### #### #### initial stepsize #### """ $(TYPEDEF) Parameters for the search algorithm for the initial stepsize. The algorithm finds an initial stepsize ``ϵ`` so that the local log acceptance ratio ``A(ϵ)`` is near `params.log_threshold`. $FIELDS !!! NOTE The algorithm is from Hoffman and Gelman (2014), default threshold modified to `0.8` following later practice in Stan. """ struct InitialStepsizeSearch "The stepsize where the search is started." initial_ϵ::Float64 "Log of the threshold that needs to be crossed." log_threshold::Float64 "Maximum number of iterations for crossing the threshold." maxiter_crossing::Int function InitialStepsizeSearch(; log_threshold::Float64 = log(0.8), initial_ϵ = 0.1, maxiter_crossing = 400) @argcheck isfinite(log_threshold) && log_threshold < 0 @argcheck isfinite(initial_ϵ) && 0 < initial_ϵ @argcheck maxiter_crossing ≥ 50 new(initial_ϵ, log_threshold, maxiter_crossing) end end """ $(SIGNATURES) Find an initial stepsize that matches the conditions of `parameters` (see [`InitialStepsizeSearch`](@ref)). `A` is the local log acceptance ratio (uncapped). Cf [`local_log_acceptance_ratio`](@ref). """ function find_initial_stepsize(parameters::InitialStepsizeSearch, A) (; initial_ϵ, log_threshold, maxiter_crossing) = parameters ϵ = initial_ϵ Aϵ = A(ϵ) double = Aϵ > log_threshold # do we double? for _ in 1:maxiter_crossing ϵ′ = double ? 2 * ϵ : ϵ / 2 Aϵ′ = A(ϵ′) (double ? Aϵ′ < log_threshold : Aϵ′ > log_threshold) && return ϵ′ ϵ = ϵ′ end dir = double ? 
"below" : "above" _error("Initial stepsize search reached maximum number of iterations from $(dir) without crossing."; maxiter_crossing, initial_ϵ, ϵ) end """ $(SIGNATURES) Return a function of the stepsize (``ϵ``) that calculates the local log acceptance ratio for a single leapfrog step around `z` along the Hamiltonian `H`. Formally, let ```julia A(ϵ) = logdensity(H, leapfrog(H, z, ϵ)) - logdensity(H, z) ``` Note that the ratio is not capped by `0`, so it is not a valid (log) probability *per se*. """ function local_log_acceptance_ratio(H, z) ℓ0 = logdensity(H, z) isfinite(ℓ0) || _error("Starting point has non-finite density."; hamiltonian_logdensity = ℓ0, logdensity = z.Q.ℓq, position = z.Q.q) function(ϵ) z1 = leapfrog(H, z, ϵ) ℓ1 = logdensity(H, z1) ℓ1 - ℓ0 end end """ $(TYPEDEF) Parameters for the dual averaging algorithm of Gelman and Hoffman (2014, Algorithm 6). To get reasonable defaults, initialize with `DualAveraging()`. # Fields $(FIELDS) """ struct DualAveraging{T} "target acceptance rate" δ::T "regularization scale" γ::T "relaxation exponent" κ::T "offset" t₀::Int function DualAveraging(δ::T, γ::T, κ::T, t₀::Int) where {T <: Real} @argcheck 0 < δ < 1 @argcheck γ > 0 @argcheck 0.5 < κ ≤ 1 @argcheck t₀ ≥ 0 new{T}(δ, γ, κ, t₀) end end function DualAveraging(; δ = 0.8, γ = 0.05, κ = 0.75, t₀ = 10) DualAveraging(promote(δ, γ, κ)..., t₀) end "Current state of adaptation for `ϵ`." Base.@kwdef struct DualAveragingState{T <: AbstractFloat} μ::T m::Int H̄::T logϵ::T logϵ̄::T end """ $(SIGNATURES) Return an initial adaptation state for the adaptation method and a stepsize `ϵ`. 
""" function initial_adaptation_state(::DualAveraging, ϵ) @argcheck ϵ > 0 logϵ = log(ϵ) DualAveragingState(; μ = log(10) + logϵ, m = 1, H̄ = zero(logϵ), logϵ, logϵ̄ = zero(logϵ)) end """ $(SIGNATURES) Update the adaptation `A` of log stepsize `logϵ` with average Metropolis acceptance rate `a` over the whole visited trajectory, using the dual averaging algorithm of Gelman and Hoffman (2014, Algorithm 6). Return the new adaptation state. """ function adapt_stepsize(parameters::DualAveraging, A::DualAveragingState, a) @argcheck 0 ≤ a ≤ 1 (; δ, γ, κ, t₀) = parameters (; μ, m, H̄, logϵ, logϵ̄) = A m += 1 H̄ += (δ - a - H̄) / (m + t₀) logϵ = μ - √m/γ * H̄ logϵ̄ += m^(-κ)*(logϵ - logϵ̄) DualAveragingState(; μ, m, H̄, logϵ, logϵ̄) end """ $(SIGNATURES) Return the stepsize `ϵ` for the next HMC step while adapting. """ current_ϵ(A::DualAveragingState, tuning = true) = exp(A.logϵ) """ $(SIGNATURES) Return the final stepsize `ϵ` after adaptation. """ final_ϵ(A::DualAveragingState, tuning = true) = exp(A.logϵ̄) ### ### fixed stepsize adaptation placeholder ### """ $(SIGNATURES) Adaptation with fixed stepsize. Leaves `ϵ` unchanged. """ struct FixedStepsize end initial_adaptation_state(::FixedStepsize, ϵ) = ϵ adapt_stepsize(::FixedStepsize, ϵ, a) = ϵ current_ϵ(ϵ::Real) = ϵ final_ϵ(ϵ::Real) = ϵ ================================================ FILE: src/trees.jl ================================================ ##### ##### Abstract tree/trajectory interface ##### #### #### Directions #### "Maximum number of iterations [`next_direction`](@ref) supports." const MAX_DIRECTIONS_DEPTH = 32 """ Internal type implementing random directions. Draw a new value with `rand`, see [`next_direction`](@ref). Serves two purposes: a fixed value of `Directions` is useful for unit testing, and drawing a single bit flag collection economizes on the RNG cost. 
""" struct Directions flags::UInt32 end Base.rand(rng::AbstractRNG, ::Type{Directions}) = Directions(rand(rng, UInt32)) """ $(SIGNATURES) Return the next direction flag and the new state of directions. Results are undefined for more than [`MAX_DIRECTIONS_DEPTH`](@ref) updates. """ function next_direction(directions::Directions) (; flags) = directions Bool(flags & 0x01), Directions(flags >>> 1) end #### #### Trajectory interface #### """ $(FUNCTIONNAME)(trajectory, z, is_forward) Move along the trajectory in the specified direction. Return the new position. """ function move end """ $(FUNCTIONNAME)(trajectory, τ) Test if the turn statistics `τ` indicate that the corresponding tree is turning. Will only be called on nontrivial trees (at least two nodes). """ function is_turning end """ $(FUNCTIONNAME)(trajectory, τ₁, τ₂) Combine turn statistics on trajectory. Implementation can assume that the trees that correspond to the turn statistics have the same ordering. When ```julia τ = combine_turn_statistics(trajectory, τ₁, τ₂) is_turning(trajectory, τ) ``` the combined turn statistic `τ` is guaranteed not to escape the caller, so it can eg change type. """ function combine_turn_statistics end """ $(FUNCTIONNAME)(trajectory, v₁, v₂) Combine visited node statistics for adjacent trees trajectory. Implementation should be invariant to the ordering of `v₁` and `v₂` (ie the operation is commutative). """ function combine_visited_statistics end """ $(FUNCTIONNAME)(trajectory, is_doubling::Bool, ω₁, ω₂, ω) Calculate the log probability if selecting the subtree corresponding to `ω₂`. Being the log of a probability, it is always `≤ 0`, but implementations are allowed to return and accept values `> 0` and treat them as `0`. When `is_doubling`, the tree corresponding to `ω₂` was obtained from a doubling step (this can be relevant eg for biased progressive sampling). The value `ω = logaddexp(ω₁, ω₂)` is provided for avoiding redundant calculations. 
See [`biased_progressive_logprob2`](@ref) for an implementation.
"""
function calculate_logprob2 end

"""
    $(FUNCTIONNAME)(rng, trajectory, ζ₁, ζ₂, logprob2::Real, is_forward::Bool)

Combine two proposals `ζ₁, ζ₂` on `trajectory`, with log probability `logprob2` for
selecting `ζ₂`.

`ζ₁` is before `ζ₂` iff `is_forward`.
"""
function combine_proposals end

"""
    ζωτ_or_nothing, v = $(FUNCTIONNAME)(trajectory, z, is_initial)

Information for a tree made of a single node. When `is_initial == true`, this is the
first node.

The first value is either

1. `nothing` for a divergent node,

2. a tuple containing the proposal `ζ`, the log weight (probability) of the node `ω`, the
   turn statistics `τ` (never tested with `is_turning` for leaf nodes).

The second value is the visited node information.
"""
function leaf end

####
#### utilities
####

"""
$(SIGNATURES)

Combine turn statistics with the given direction. When `is_forward`, `τ₁` is before `τ₂`,
otherwise after.

Internal helper function.
"""
@inline function combine_turn_statistics_in_direction(trajectory, τ₁, τ₂, is_forward::Bool)
    if is_forward
        combine_turn_statistics(trajectory, τ₁, τ₂)
    else
        combine_turn_statistics(trajectory, τ₂, τ₁)
    end
end

function combine_proposals_and_logweights(rng, trajectory, ζ₁, ζ₂, ω₁::Real, ω₂::Real,
                                          is_forward::Bool, is_doubling::Bool)
    ω = logaddexp(ω₁, ω₂)
    logprob2 = calculate_logprob2(trajectory, is_doubling, ω₁, ω₂, ω)
    ζ = combine_proposals(rng, trajectory, ζ₁, ζ₂, logprob2, is_forward)
    ζ, ω
end

"""
$(SIGNATURES)

Given (relative) log probabilities `ω₁` and `ω₂`, return the log probability of drawing a
sample from the second (`logprob2`).

When `bias`, biases towards the second argument, introducing anti-correlations.
"""
function biased_progressive_logprob2(bias::Bool, ω₁::Real, ω₂::Real, ω = logaddexp(ω₁, ω₂))
    ω₂ - (bias ? ω₁ : ω)
end

####
#### abstract trajectory interface
####

"""
$(SIGNATURES)

Information about an invalid (sub)tree, using positions relative to the starting node.

1.
When `left < right`, this tree was *turning*.

2. When `left == right`, this is a *divergent* node.

3. `left == 1 && right == 0` is used as a sentinel value for reaching maximum depth
   without encountering any invalid trees (see [`REACHED_MAX_DEPTH`](@ref)).

All other `left > right` values are disallowed.
"""
struct InvalidTree
    left::Int
    right::Int
end

InvalidTree(i::Integer) = InvalidTree(i, i)

is_divergent(invalid_tree::InvalidTree) = invalid_tree.left == invalid_tree.right

function Base.show(io::IO, invalid_tree::InvalidTree)
    msg = if is_divergent(invalid_tree)
        "divergence at position $(invalid_tree.left)"
    elseif invalid_tree == REACHED_MAX_DEPTH
        "reached maximum depth without divergence or turning"
    else
        (; left, right) = invalid_tree
        "turning at positions $(left):$(right)"
    end
    print(io, msg)
end

"Sentinel value for reaching maximum depth."
const REACHED_MAX_DEPTH = InvalidTree(1, 0)

"""
    result, v = adjacent_tree(rng, trajectory, z, i, depth, is_forward)

Traverse the tree of given `depth` adjacent to point `z` in `trajectory`.

`is_forward` specifies the direction, `rng` is used for random numbers in
[`combine_proposals`](@ref). `i` is an integer position relative to the initial node
(`0`).

The *first value* is either

1. an `InvalidTree`, indicating the first divergent node or turning subtree that was
   encountered and invalidated this tree,

2. a tuple of `(ζ, ω, τ, z′, i′)`, with

    - `ζ`: the proposal from the tree

    - `ω`: the log weight of the subtree that corresponds to the proposal

    - `τ`: turn statistics

    - `z′`: the last node of the tree

    - `i′`: the position of the last node relative to the initial node.

The *second value* is always the visited node statistic.
"""
function adjacent_tree(rng, trajectory, z, i, depth, is_forward)
    i′ = i + (is_forward ?
1 : -1) if depth == 0 z′ = move(trajectory, z, is_forward) ζωτ, v = leaf(trajectory, z′, false) if ζωτ ≡ nothing InvalidTree(i′), v else (ζωτ..., z′, i′), v end else # “left” tree t₋, v₋ = adjacent_tree(rng, trajectory, z, i, depth - 1, is_forward) t₋ isa InvalidTree && return t₋, v₋ ζ₋, ω₋, τ₋, z₋, i₋ = t₋ # “right” tree — visited information from left is kept even if invalid t₊, v₊ = adjacent_tree(rng, trajectory, z₋, i₋, depth - 1, is_forward) v = combine_visited_statistics(trajectory, v₋, v₊) t₊ isa InvalidTree && return t₊, v ζ₊, ω₊, τ₊, z₊, i₊ = t₊ # turning invalidates τ = combine_turn_statistics_in_direction(trajectory, τ₋, τ₊, is_forward) is_turning(trajectory, τ) && return InvalidTree(i′, i₊), v # valid subtree, combine proposals ζ, ω = combine_proposals_and_logweights(rng, trajectory, ζ₋, ζ₊, ω₋, ω₊, is_forward, false) (ζ, ω, τ, z₊, i₊), v end end """ $(SIGNATURES) Sample a `trajectory` starting at `z`, up to `max_depth`. `directions` determines the tree expansion directions. Return the following values - `ζ`: proposal from the tree - `v`: visited node statistics - `termination`: an `InvalidTree` (this includes the last doubling step turning, which is technically a valid tree) or `REACHED_MAX_DEPTH` when all subtrees were valid and no turning happens. - `depth`: the depth of the tree that was sampled from. Doubling steps that lead to an invalid adjacent tree do not contribute to `depth`. """ function sample_trajectory(rng, trajectory, z, max_depth::Integer, directions::Directions) @argcheck max_depth ≤ MAX_DIRECTIONS_DEPTH (ζ, ω, τ), v = leaf(trajectory, z, true) z₋ = z₊ = z depth = 0 termination = REACHED_MAX_DEPTH i₋ = i₊ = 0 while depth < max_depth is_forward, directions = next_direction(directions) t′, v′ = adjacent_tree(rng, trajectory, is_forward ? z₊ : z₋, is_forward ? 
i₊ : i₋, depth, is_forward) v = combine_visited_statistics(trajectory, v, v′) # invalid adjacent tree: stop t′ isa InvalidTree && (termination = t′; break) # extract information from adjacent tree ζ′, ω′, τ′, z′, i′ = t′ # update edges and combine proposals if is_forward z₊, i₊ = z′, i′ else z₋, i₋ = z′, i′ end # tree has doubled successfully ζ, ω = combine_proposals_and_logweights(rng, trajectory, ζ, ζ′, ω, ω′, is_forward, true) depth += 1 # when the combined tree is turning, stop τ = combine_turn_statistics_in_direction(trajectory, τ, τ′, is_forward) is_turning(trajectory, τ) && (termination = InvalidTree(i₋, i₊); break) end ζ, v, termination, depth end ================================================ FILE: src/utilities.jl ================================================ ##### ##### utilities ##### #### #### error messages #### """ $(TYPEDEF) The error type used by this package. Debug information should be printed without truncation, with full precision. $(FIELDS) """ struct DynamicHMCError <: Exception message::String debug_information::NamedTuple end """ $(SIGNATURES) Throw a `DynamicHMCError` with given message, keyword arguments used for debug information. """ _error(message::AbstractString; kwargs...) 
= throw(DynamicHMCError(message, NamedTuple(kwargs)))

function Base.showerror(io::IO, error::DynamicHMCError)
    (; message, debug_information) = error
    printstyled(io, "DynamicHMC error: ", message; color = :red)
    for (key, value) in pairs(debug_information)
        print(io, "\n ")
        printstyled(io, string(key); color = :blue, bold = true)
        printstyled(io, " = ", value)
    end
    nothing
end

================================================
FILE: test/Project.toml
================================================
[deps]
Aqua = "4c88cf16-eb10-579e-8560-4a9242c79595"
ArgCheck = "dce04be8-c92d-5529-be00-80e4d2c0e197"
DocStringExtensions = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae"
DynamicHMC = "bbc10e6e-7c05-544b-b16e-64fede858acb"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
HypothesisTests = "09f84164-cd44-5f33-b23f-e6b0d136a0d5"
JET = "c3a54625-cd67-489e-a8e7-0a5a0ff4e31b"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
LogDensityProblems = "6fdf6af0-433a-55f7-b3ed-c6c6e0b8df7c"
LogDensityTestSuite = "feb245ec-c857-584e-a66a-22324acf10c6"
LogExpFunctions = "2ab3a3ac-af41-5b50-aa03-7779005ae688"
Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
MCMCDiagnosticTools = "be115224-59cd-429b-ad48-344e309966f0"
OhMyThreads = "67456a42-1dca-4109-a031-0a68de7e3ad5"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
TransformVariables = "84d833dd-6860-57f9-a1a7-6da5db126cff"

[sources]
DynamicHMC = {path = ".."}

[compat]
LogDensityTestSuite = "0.7"

================================================
FILE: test/runtests.jl
================================================
using DynamicHMC, Test, ArgCheck, DocStringExtensions, HypothesisTests, LinearAlgebra,
    Random, Statistics
import OhMyThreads
using LogExpFunctions: logaddexp, log1mexp
using StatsBase: mean_and_cov
using Logging: with_logger, NullLogger
import ForwardDiff, Random, TransformVariables
using DynamicHMC.Diagnostics using DynamicHMC.Diagnostics: ACCEPTANCE_QUANTILES using LogDensityProblems: logdensity_and_gradient, dimension, LogDensityProblems using LogDensityTestSuite #### #### static analysis and QA; before everything else as tests extend methods #### @testset "static analysis with JET.jl" begin using JET @test isempty(JET.get_reports(report_package(DynamicHMC, target_modules=(DynamicHMC,)))) end @testset "Aqua" begin import Aqua Aqua.test_all(DynamicHMC; ambiguities = false) # testing separately, cf https://github.com/JuliaTesting/Aqua.jl/issues/77 Aqua.test_ambiguities(DynamicHMC) end ### ### general test environment ### const RNG = copy(Random.default_rng()) # shorthand Random.seed!(RNG, UInt32[0x23ef614d, 0x8332e05c, 0x3c574111, 0x121aa2f4]) "Tolerant testing in a CI environment." const RELAX = (k = "CONTINUOUS_INTEGRATION"; haskey(ENV, k) && ENV[k] == "true") include("utilities.jl") #### #### unit tests #### include("test_trees.jl") include("test_hamiltonian.jl") include("test_NUTS.jl") include("test_stepsize.jl") include("test_mcmc.jl") include("test_diagnostics.jl") include("test_logging.jl") #### #### sample correctness tests #### include("sample-correctness_tests.jl") ================================================ FILE: test/sample-correctness_tests.jl ================================================ include("sample-correctness_utilities.jl") ##### ##### sample correctness tests ##### ##### Sample from well-characterized distributions using LogDensityTestSuite, check ##### convergence and mixing, and compare. 
"Adaptation with dense matrix" const MCMC_ARGS2 = (warmup_stages = default_warmup_stages(; M = Symmetric),) @testset "NUTS tests with random normal" begin for _ in 1:10 K = rand(RNG, 3:10) μ = randn(RNG, K) d = abs.(randn(RNG, K)) C = rand_C(K) ℓ = multivariate_normal(μ, Diagonal(d) * C) title = "multivariate normal μ = $(μ) d = $(d) C = $(C)" NUTS_tests(RNG, ℓ, title, 1000; mcmc_args = MCMC_ARGS2, R̂_alert = 1.02, τ_alert = 0.7) end end @testset "ill-conditioned multivariate normal" begin # this test case was isolated using random tests μ = [-1.729922440774685, -0.011762500688978205, 0.11423091067230899, 0.05085717388622323, 0.09102774773399233, -0.3769237300508154, -1.1645971596831883, -1.4196407006756644, 0.07406060991401947] d = [0.31285715405356296, 1.6321047397137334, 1.9304214045496948, 0.9408515651923572, 0.632832415315841, 0.3994529605030148, 0.9479547802750243, 0.000686699019868418, 0.14074551354895906] C = [1.0 -0.625893845478092 -0.8607538232958145 0.4906036948283603 -0.045129301268019346 -0.9798256449980116 -0.09448716779625055 0.1972478332046149 -0.38125524332165456; 0.0 0.7799082601131022 0.22963314745353192 -0.8390321758549951 -0.2940681265758735 0.05788305453491861 -0.30348581879657555 -0.3395815944065493 0.40817023926937634; 0.0 0.0 0.45428127109998945 0.07704183020878513 0.5013749270904165 0.09940288184055725 -0.4898077520422466 -0.04390387380845317 -0.39358273046921877; 0.0 0.0 0.0 0.22225566111771966 -0.5034002085122711 0.1540822287067389 -0.52831870161212 -0.20197326086456527 -0.4230725997740589; 0.0 0.0 0.0 0.0 0.6377293278924043 0.002108173376346147 -0.563819920556515 0.07024142256309863 0.20409522211102057; 0.0 0.0 0.0 0.0 0.0 0.05444765270890811 0.21770654511030652 0.4167989822452558 0.4096707796964533; 0.0 0.0 0.0 0.0 0.0 0.0 0.12102564140379203 0.6237333486866049 -0.1142510107612157; 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4851374500990013 -0.2027266958462243; 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.30084429646746724]' ℓ = multivariate_normal(μ, 
Diagonal(d) * C) title = "ill-conditioned multivariate normal (isolated test case 1)" NUTS_tests(RNG, ℓ, title, 1000; mcmc_args = MCMC_ARGS2) d = [0.44940324099952655, 1.2470316880832284, 1.4254609657195896, 0.47414925026956667, 0.7208717869588667, 0.9012540329863461, 0.259210347514327, 0.48018821609980755, 0.036285320442367444] C = [1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0; 0.007468818792116497 0.999972107983943 0.0 0.0 0.0 0.0 0.0 0.0 0.0; 0.9511843069109334 0.06094826193577815 0.30254540758929904 0.0 0.0 0.0 0.0 0.0 0.0; 0.5836451073483746 0.5224198876250752 -0.1567642318026896 0.6015486890596806 0.0 0.0 0.0 0.0 0.0; -0.04549583361258265 0.16604582867077644 -0.6573154635023393 0.5230837360874556 0.5144693366823966 0.0 0.0 0.0 0.0; 0.3090114014598978 0.21784144366429148 0.09455066936309542 0.7472520532986878 0.3661721405808872 0.39452447632098014 0.0 0.0 0.0; 0.27849576428755396 0.008203485989481384 -0.6289527864239539 0.5299626182310367 -0.18989119185086065 0.3458859908657774 0.30039148523055575 0.0 0.0; -0.7595504281026706 -0.6109486667620377 0.08322674440383553 -0.12441158714041263 -0.15879164203513468 -0.0032350588677425886 0.027740844099589795 0.03775094878848311 0.0; 0.8843786481850745 0.4137017432529274 0.19839646818921372 -0.07842556868606812 0.03458430271168502 0.0036393230648423818 0.0006870732712296159 -0.0015642900624311437 0.0011437266452138846] ℓ = multivariate_normal(μ, Diagonal(d) * C) title = "ill-conditioned multivariate normal (isolated test case 2)" NUTS_tests(RNG, ℓ, title, 1000; mcmc_args = MCMC_ARGS2) μ = [0.21062974278940136, -1.218937450424899, 0.06421875640449011, -0.8234583898758592, -2.31397504655407, -0.4751175796619936, -1.2623323961397874, 0.2150945580900463, 1.0797988499707567, 0.6923991470384713] d = [1.235510286986013, 0.25725289997297635, 0.39737933906879164, 1.2464348820193416, 0.3082850398698708, 0.9563709407505254, 1.6547932918031834, 1.9782388109071316, 0.38580150239677885, 0.45488559976648274] C = [1.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0; 0.5858606519975413 0.8104118067013929 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0; -0.3184163160259112 0.8041538301838452 0.501943888387077 0.0 0.0 0.0 0.0 0.0 0.0 0.0; 0.3173460682399272 0.6771172525630316 -0.41159671670836784 0.520952821327462 0.0 0.0 0.0 0.0 0.0 0.0; -0.987376065017123 -0.0893955251935478 -0.1251983682331955 0.015871075518314355 0.03421145802664587 0.0 0.0 0.0 0.0 0.0; 0.37469357703269496 -0.8443427667670257 0.32370544135718116 -0.052396077029688945 -0.14292183643709977 0.13686782878290468 0.0 0.0 0.0 0.0; -0.6171193584146126 -0.6578898907477293 -0.39307408945037237 -0.1518878423897761 -0.04583110799414341 0.024372352823947997 0.0779290101096559 0.0 0.0 0.0; 0.5435692867326045 -0.6050903050824995 0.08910494475273394 -0.3209596162864902 0.39975938033524144 0.07516818530300905 -0.06448639900775556 0.24047260310743332 0.0 0.0; -0.06388905564192496 0.9843759627707926 -0.12367139895609519 -0.02886519073736079 0.08699952332803386 -0.020427021493780943 0.0227516163109634 0.010263085877575476 0.04674602752418515 0.0; -0.05914353971342278 0.5051281727293001 -0.0853459337837312 0.7320866937322082 0.42886052044809864 0.011574865047660135 0.10703394808902246 0.045502786672532804 -0.01539436089666275 0.017135804222740844] ℓ = multivariate_normal(μ, Diagonal(d) * C) title = "ill-conditioned multivariate normal (isolated test case 2)" NUTS_tests(RNG, ℓ, title, 1000; mcmc_args = MCMC_ARGS2) end @testset "NUTS tests with specific normal distributions" begin ℓ = multivariate_normal([0.0], 5e8) NUTS_tests(RNG, ℓ, "univariate huge variance", 1000) ℓ = multivariate_normal([1.0], 5e8) NUTS_tests(RNG, ℓ, "univariate huge variance, offset", 1000) ℓ = multivariate_normal([1.0], 5e-8) NUTS_tests(RNG, ℓ, "univariate tiny variance, offset", 1000) ℓ = multivariate_normal([1.0, 2.0, 3.0], Diagonal([1.0, 2.0, 3.0])) NUTS_tests(RNG, ℓ, "mildly scaled diagonal", 1000) # these tests are kept because they did produce errors for some code that turned out to # be buggy in the 
early development version; this does not meant that they are # particularly powerful or sensitive ones ℓ = multivariate_normal([-0.37833073009094703, -0.3973395239297558], cholesky([0.08108928067723374 -0.19742780267879112; -0.19742780267879112 1.2886298811010262]).L) NUTS_tests(RNG, ℓ, "kept 2 dim", 1000) ℓ = multivariate_normal( [-1.0960316317778482, -0.2779143641884689, -0.4566289703243874], cholesky([2.2367476976202463 1.4710084974801891 2.41285525745893; 1.4710084974801891 1.1684361535929932 0.9632367554302268; 2.41285525745893 0.9632367554302268 4.5595606374865785]).L) NUTS_tests(RNG, ℓ, "kept 3 dim", 1000) ℓ = multivariate_normal( [-1.42646, 0.94423, 0.852379, -1.12906, 0.0868619, 0.948781, -0.875067, 1.07243], cholesky([14.8357 2.42526 -2.97011 2.08363 -1.67358 4.02846 5.57947 7.28634; 2.42526 10.8874 -1.08992 1.99358 1.85011 -2.29754 -0.0540131 1.79718; -2.97011 -1.08992 3.05794 0.0321187 1.8052 -1.5309 1.78163 -0.0821483; 2.08363 1.99358 0.0321187 2.38112 -0.252784 0.666474 1.73862 2.55874; -1.67358 1.85011 1.8052 -0.252784 12.3109 -2.3913 -2.99741 -1.95031; 4.02846 -2.29754 -1.5309 0.666474 -2.3913 4.89957 3.6118 5.22626; 5.57947 -0.0540131 1.78163 1.73862 -2.99741 3.6118 10.215 9.60671; 7.28634 1.79718 -0.0821483 2.55874 -1.95031 5.22626 9.60671 11.5554]).L) NUTS_tests(RNG, ℓ, "kept 8 dim", 1000) end @testset "NUTS tests with mixtures" begin ℓ1 = multivariate_normal(zeros(3), 1.0) D2 = I * 0.4 C2 = [1.0 -0.48058358598852935 0.39971148270854306; 0.0 0.876948924897229 -0.5361348433365906; 0.0 0.0 0.7434985947205197] ℓ2 = multivariate_normal(ones(3), D2 * C2) ℓ = mix(0.2, ℓ1, ℓ2) NUTS_tests(RNG, ℓ, "mixture of two normals", 1000; τ_alert = 0.15, p_alert = 0.005) end @testset "NUTS tests with heavier tails and skewness" begin K = 5 𝒩 = StandardMultivariateNormal(K) # somewhat nasty, relaxed requirements ℓ = elongate(1.1)(𝒩) NUTS_tests(RNG, ℓ, "elongate(1.1, 𝑁)", 10000; p_alert = 0.05, EBFMI_alert = 0.2, R̂_fail = 1.05, τ_fail = 0.3) # this has very nasty 
tails so we relax requirements a bit
    ℓ = (elongate(1.1) ∘ shift(ones(K)))(𝒩)
    NUTS_tests(RNG, ℓ, "skew elongate(1.1, 𝑁)", 10000;
               τ_alert = 0.1, EBFMI_alert = 0.2, R̂_fail = 1.05, p_fail = 0.001)
    # funnel, mixed with a normal
    ℓ = mix(0.8, funnel()(𝒩), 𝒩)
    NUTS_tests(RNG, ℓ, "funnel", 10000;
               EBFMI_alert = 0.2, τ_alert = 0.1, p_fail = 5e-3, R̂_fail = 1.05)
end

================================================
FILE: test/sample-correctness_utilities.jl
================================================
#####
##### utilities for testing sample correctness
#####

using MCMCDiagnosticTools: ess_rhat

"""
$(SIGNATURES)

Run `K` chains of MCMC on `ℓ`, each for `N` samples. Return the posterior matrices,
stacked (indexed by `[draw, parameter, chain]`) and concatenated (indexed by
`[draw, parameter]`), along with EBFMI statistics, as fields of a `NamedTuple`.

Keyword arguments are passed to `mcmc_with_warmup`.
"""
function run_chains(rng, ℓ, N, K; mcmc_args...)
    results = OhMyThreads.tcollect(mcmc_with_warmup(rng, ℓ, N;
                                                    reporter = NoProgressReport(),
                                                    mcmc_args...)
                                   for _ in 1:K)
    (stacked_posterior_matrices = stack_posterior_matrices(results),
     concat_posterior_matrices = pool_posterior_matrices(results),
     EBFMIs = map(r -> EBFMI(r.tree_statistics), results))
end

###
### Multivariate normal ℓ for testing
###

"Random Cholesky factor for a correlation matrix."
function rand_C(K)
    t = TransformVariables.CorrCholeskyFactor(K)
    TransformVariables.transform(t, randn(RNG, TransformVariables.dimension(t)) ./ 4)'
end

"""
$(SIGNATURES)

`R̂` (within/between variance) and `τ` (effective sample size coefficient) statistics
for posterior matrices, e.g. the output of `run_chains`.
"""
function mcmc_statistics(stacked_posterior_matrices)
    (; ess, rhat) = ess_rhat(stacked_posterior_matrices)
    (R̂ = rhat, τ = ess ./ size(stacked_posterior_matrices, 1))
end

"""
$(SIGNATURES)

Jitter in `(-ϵ, +ϵ)`. Useful for tiebreaking in the Kolmogorov-Smirnov tests.
""" jitter(rng, len, ϵ = 64*eps()) = (2 * ϵ) .* (rand(rng, len) .- 0.5) """ $(SIGNATURES) Run MCMC on `ℓ`, obtaining `N` samples from `K` independently adapted chains. `R̂`, `τ`, Kolmogorov-Smirnov and Anderson-Darling `p`, and EBFMIs are obtained and compared to thresholds that either *alert* or *fail*. The latter should be lax because of false positives, the tests can be rather hair-trigger. Output is sent to `io`. Specifically, `title` is printed for the first alert. `mcmc_args` are passed down to `mcmc_with_warmup`. """ function NUTS_tests(rng, ℓ, title, N; K = 5, io = stdout, mcmc_args = NamedTuple(), R̂_alert = 1.01, R̂_fail = 2 * (R̂_alert - 1) + 1, τ_alert = 1.0, τ_fail = τ_alert * 0.5, p_alert = 0.1, p_fail = p_alert * 0.1, EBFMI_alert = 0.5, EBFMI_fail = EBFMI_alert / 2) @argcheck 1 < R̂_alert ≤ R̂_fail @argcheck 0 < τ_fail ≤ τ_alert @argcheck 0 < p_fail ≤ p_alert @argcheck 0 < EBFMI_fail < EBFMI_alert d = dimension(ℓ) _round(x) = round(x; sigdigits = 3) # for printing title_printed = false function _print_diagnostics(label, is_min, value, alert_threshold, error_threshold) if !title_printed printstyled(io, "INFO while testing: $(title), dimension $(d)\n"; color = :blue, bold = true) title_printed = true end if is_min vm = minimum(value) if vm ≥ alert_threshold mark, rel, vt, col = '✓', '≥', alert_threshold, :green elseif vm ≥ error_threshold mark, rel, vt, col = '!', '≱', alert_threshold, :yellow else mark, rel, vt, col = '✘', '≱', error_threshold, :red end else vm = maximum(value) if vm ≤ alert_threshold mark, rel, vt, col = '✓', '≤', alert_threshold, :green elseif vm ≤ error_threshold mark, rel, vt, col = '!', '≰', alert_threshold, :yellow else mark, rel, vt, col = '✘', '≰', error_threshold, :red end end printstyled(io, "$(mark) $(label) = $(_round.(vm)) $(rel) $(_round(vt))\n"; color = col) end (; stacked_posterior_matrices, concat_posterior_matrices, EBFMIs) = run_chains(RNG, ℓ, N, K; mcmc_args...) 
    # mixing and autocorrelation diagnostics
    (; R̂, τ) = mcmc_statistics(stacked_posterior_matrices)
    _print_diagnostics("R̂", false, R̂, R̂_alert, R̂_fail)
    @test all(maximum(R̂) ≤ R̂_fail)
    _print_diagnostics("τ", true, τ, τ_alert, τ_fail)
    @test all(minimum(τ) ≥ τ_fail)
    _print_diagnostics("EBFMI", true, EBFMIs, EBFMI_alert, EBFMI_fail)
    @test all(minimum(EBFMIs) ≥ EBFMI_fail)
    # distribution comparison tests
    Z = concat_posterior_matrices
    Z′ = samples(ℓ, 1000)
    pd_alert = p_alert / d # a simple Bonferroni correction
    pd_fail = p_fail / d
    ps = map((a, b) -> pvalue(KSampleADTest(a, b)), eachrow(Z), eachrow(Z′))
    _print_diagnostics("p", true, ps, pd_alert, pd_fail)
    @test all(minimum(ps) ≥ pd_fail)
end

================================================
FILE: test/test_NUTS.jl
================================================
using DynamicHMC: TrajectoryNUTS, rand_bool_logprob, GeneralizedTurnStatistic,
    AcceptanceStatistic, leaf_acceptance_statistic, acceptance_rate, TreeStatisticsNUTS,
    NUTS, sample_tree, combine_turn_statistics, combine_visited_statistics,
    evaluate_ℓ, Hamiltonian

###
### random booleans
###

@testset "random booleans" begin
    for prob in (1:9) ./ 10
        logprob = log(prob)
        @test abs(mean(rand_bool_logprob(RNG, logprob) for _ in 1:10000) - prob) ≤ 0.02
    end
    # these operations don't advance the RNG; this is checked below
    RNG′ = copy(RNG)
    @test all(rand_bool_logprob(RNG, 0) for _ in 1:10000)
    @test all(rand_bool_logprob(RNG, 10) for _ in 1:10000)
    @test rand(RNG′) == rand(RNG)
end

###
### test turn statistics
###

@testset "low-level turn statistics" begin
    trajectory = TrajectoryNUTS(nothing, 0, 1, -1000, Val{:generalized})
    p = ones(3) # unit vector
    c = 0.1     # a constant, just for consistency checking of combination
    # turn statistics constructed so that τ₁ + τ₂ won't be turning, τ₁ + τ₃ will be
    τ₁ = GeneralizedTurnStatistic(p, p .- c, p, p .- c, p)
    τ₂ = GeneralizedTurnStatistic(3 .* p, 3 .* p .+ c, 3 .* p, 3 .* p .+ c, 3 .* p)
    τ₃ = GeneralizedTurnStatistic(2 .* p, 2 .* p
.+ c, -2 .* p) τ = combine_turn_statistics(trajectory, τ₁, τ₂) # test mechanics of combination @test τ.ρ == τ₁.ρ .+ τ₂.ρ # test non-turning @test !is_turning(trajectory, τ) # test turning @test is_turning(trajectory, combine_turn_statistics(trajectory, τ₁, τ₃)) end @testset "low-level visited statistics" begin trajectory = TrajectoryNUTS(nothing, 0, 1, -1000, Val{:generalized}) vs(p, is_initial = false) = leaf_acceptance_statistic(log(p), is_initial) x = vs(0.3) @test acceptance_rate(x) ≈ 0.3 y = vs(0.6) @test acceptance_rate(y) ≈ 0.6 x0 = vs(10, true) # initial node, does not count z = reduce((x, y) -> combine_visited_statistics(trajectory, x, y), [x, x, y, x0]) @test acceptance_rate(z) ≈ 0.4 end # define a distribution which is divergent everywhere except at 0 struct AlwaysDivergentTest K::Int end function LogDensityProblems.capabilities(::Type{AlwaysDivergentTest}) LogDensityProblems.LogDensityOrder{1}() end LogDensityProblems.dimension(d::AlwaysDivergentTest) = d.K function LogDensityProblems.logdensity_and_gradient(d::AlwaysDivergentTest, x) ∇ = ones(length(x)) if all(iszero.(x)) 0.0, ∇ else -Inf, ∇ end end @testset "unconditional divergence" begin # test NUTS sampler where all movements are divergent K = 3 ℓ = AlwaysDivergentTest(K) Q, tree_statistics = sample_tree(RNG, NUTS(), Hamiltonian(GaussianKineticEnergy(K), ℓ), evaluate_ℓ(ℓ, zeros(K)), 1.0) @test is_divergent(tree_statistics.termination) @test iszero(tree_statistics.acceptance_rate) @test iszero(tree_statistics.depth) @test tree_statistics.steps == 1 end @testset "normal NUTS HMC transition mean and cov" begin # A test for sample_tree with a fixed ϵ and κ, which is perfectly adapted and should # provide excellent mixing for _ in 1:10 K = rand(RNG, 2:8) N = 10000 μ = randn(RNG, K) Σ = rand_Σ(K) L = cholesky(Σ).L ℓ = multivariate_normal(μ, L) Q = evaluate_ℓ(ℓ, randn(RNG, K)) H = Hamiltonian(GaussianKineticEnergy(Σ), ℓ) qs = Array{Float64}(undef, N, K) ϵ = 0.5 algorithm = NUTS() for i in 1:N Q = 
first(sample_tree(RNG, algorithm, H, Q, ϵ)) qs[i, :] = Q.q end m, C = mean_and_cov(qs, 1) tol = maximum(diag(C)) / 50 @test vec(m) ≈ μ atol = tol rtol = tol norm = x -> norm(x,1) @test cov(qs, dims = 1) ≈ L*L' atol = 0.1 rtol = 0.1 end end ================================================ FILE: test/test_diagnostics.jl ================================================ ##### ##### test diagnostics ##### @testset "summarize tree statistics" begin N = 1000 directions = Directions(UInt32(0)) function rand_invalidtree() if rand(RNG) < 0.1 REACHED_MAX_DEPTH else left = rand(RNG, -5:5) right = left + rand(RNG, 0:5) InvalidTree(left, right) end end tree_statistics = [TreeStatisticsNUTS(randn(RNG), rand(RNG, 0:5), rand_invalidtree(), rand(RNG), rand(RNG, 1:30), directions) for _ in 1:N] stats = summarize_tree_statistics(tree_statistics) # acceptance rates @test stats.N == N @test stats.a_mean ≈ mean(x -> x.acceptance_rate, tree_statistics) @test stats.a_quantiles == quantile((x -> x.acceptance_rate).(tree_statistics), ACCEPTANCE_QUANTILES) # termination counts @test stats.termination_counts.divergence == count(x -> is_divergent(x.termination), tree_statistics) @test stats.termination_counts.max_depth == count(x -> x.termination == REACHED_MAX_DEPTH, tree_statistics) @test stats.termination_counts.turning == (N - stats.termination_counts.max_depth - stats.termination_counts.divergence) # depth counts for (i, c) in enumerate(stats.depth_counts) @test count(x -> x.depth == i - 1, tree_statistics) == c end @test sum(stats.depth_counts) == N # misc @test 1.8 ≤ EBFMI(tree_statistics) ≤ 2.2 # nonsensical value, just checking calculation @test repr(stats) isa AbstractString # just test that it prints w/o error end @testset "log acceptance ratios" begin ℓ = multivariate_normal(ones(5)) log2ϵs = -5:5 N = 13 logA = explore_log_acceptance_ratios(ℓ, zeros(5), log2ϵs; N = N) @test all(isfinite.(logA)) @test size(logA) == (length(log2ϵs), N) end @testset "leapfrog trajectory" begin # 
problem setup K = 2 ℓ = multivariate_normal(ones(K)) κ = GaussianKineticEnergy(K) q = zeros(K) Q = evaluate_ℓ(ℓ, q) p = ones(K) .* 0.98 H = Hamiltonian(κ, ℓ) ϵ = 0.1 ixs = 1:15 ix0 = 5 # calculate trajectory manually zs1 = let z = PhasePoint(Q, p) [(z1 = z; z = leapfrog(H, z, ϵ); z1) for _ in ixs] end πs1 = logdensity.(Ref(H), zs1) Δs1 = πs1 .- πs1[ix0] # calculate using function traj = leapfrog_trajectory(ℓ, zs1[ix0].Q.q, ϵ, ixs .- ix0; κ = κ, p = zs1[ix0].p) @test all(isapprox.(map(t -> t.Δ, traj), Δs1; atol = 1e-5)) @test all(map((t, y) -> t.z.Q.q ≈ y.Q.q, traj, zs1)) @test all(map((t, y) -> t.z.p ≈ y.p, traj, zs1)) end ================================================ FILE: test/test_hamiltonian.jl ================================================ using DynamicHMC: GaussianKineticEnergy, kinetic_energy, ∇kinetic_energy, rand_p, Hamiltonian, EvaluatedLogDensity, evaluate_ℓ, PhasePoint, logdensity, leapfrog, calculate_p♯, logdensity, find_initial_stepsize, DynamicHMCError, local_log_acceptance_ratio #### #### utility functions #### "Test kinetic energy gradient by automatic differentiation." 
function test_KE_gradient(κ::DynamicHMC.EuclideanKineticEnergy, p) ∇ = ∇kinetic_energy(κ, p) ∇_AD = ForwardDiff.gradient(p -> kinetic_energy(κ, p), p) @test ∇ ≈ ∇_AD end #### #### testsets #### @testset "Gaussian KE full" begin for _ in 1:100 K = rand(RNG, 2:10) Σ = rand_Σ(Symmetric, K) κ = GaussianKineticEnergy(inv(Σ)) (; M⁻¹, W) = κ @test W isa LowerTriangular @test M⁻¹ * W * W' ≈ Diagonal(ones(K)) m, C = simulated_meancov(()->rand_p(RNG, κ), 10000) @test Matrix(Σ) ≈ C rtol = 0.1 test_KE_gradient(κ, randn(RNG, K)) end end @testset "Gaussian KE diagonal" begin for _ in 1:100 K = rand(RNG, 2:10) Σ = rand_Σ(Diagonal, K) κ = GaussianKineticEnergy(inv(Σ)) (; M⁻¹, W) = κ @test W isa Diagonal # FIXME workaround for https://github.com/JuliaLang/julia/issues/28869 @test M⁻¹ * (W * W') ≈ Diagonal(ones(K)) m, C = simulated_meancov(()->rand_p(RNG, κ), 10000) @test Matrix(Σ) ≈ C rtol = 0.1 test_KE_gradient(κ, randn(RNG, K)) end end @testset "phasepoint internal consistency" begin # when this breaks, interface was modified, rewrite tests @test fieldnames(PhasePoint) == (:Q, :p) "Test the consistency of cached values." function test_consistency(H, z) (; q, ℓq, ∇ℓq) = z.Q (; ℓ) = H ℓ2, ∇ℓ2 = logdensity_and_gradient(ℓ, q) @test ℓ2 == ℓq @test ∇ℓ2 == ∇ℓq end (; H, z, Σ ) = rand_Hz(rand(RNG, 3:10)) test_consistency(H, z) ϵ = find_stable_ϵ(H.κ, Σ) for _ in 1:10 z = leapfrog(H, z, ϵ) test_consistency(H, z) end end @testset "leapfrog calculation" begin # Simple leapfrog implementation. `q`: position, `p`: momentum, `ℓ`: neg_energy, `ϵ`: # stepsize. `m` is the diagonal of the kinetic energy ``K(p)=p'M⁻¹p``, defaults to `I`. 
function leapfrog_Gaussian(q, p, ℓ, ϵ, m = ones(length(p))) u = .√(1 ./ m) pₕ = p .+ ϵ/2 .* last(logdensity_and_gradient(ℓ, q)) q′ = q .+ ϵ * u .* (u .* pₕ) # mimic numerical calculation leapfrog performs p′ = pₕ .+ ϵ/2 .* last(logdensity_and_gradient(ℓ, q′)) q′, p′ end n = 3 M = rand_Σ(Diagonal, n) m = diag(M) κ = GaussianKineticEnergy(inv(M)) q = randn(RNG, n) p = randn(RNG, n) Σ = rand_Σ(n) ℓ = multivariate_normal(randn(RNG, n), cholesky(Σ).L) H = Hamiltonian(κ, ℓ) ϵ = find_stable_ϵ(H.κ, Σ) z = PhasePoint(evaluate_ℓ(ℓ, q), p) @testset "arguments not modified" begin q₂, p₂ = copy(q), copy(p) q′, p′ = leapfrog_Gaussian(q, p, ℓ, ϵ, m) z′ = leapfrog(H, z, ϵ) @test p == p₂ # arguments not modified @test q == q₂ @test z′.Q.q ≈ q′ @test z′.p ≈ p′ end @testset "leapfrog steps" begin for i in 1:100 q, p = leapfrog_Gaussian(q, p, ℓ, ϵ, m) z = leapfrog(H, z, ϵ) @test z.Q.q ≈ q @test z.p ≈ p end end @testset "invalid values" begin n = 3 ℓ = multivariate_normal(randn(RNG, n), I(n)) @test_throws DynamicHMCError evaluate_ℓ(ℓ, fill(NaN, n)) end end @testset "leapfrog Hamiltonian invariance" begin "Test that the Hamiltonian is invariant using the leapfrog integrator." 
function test_hamiltonian_invariance(H, z, L, ϵ; atol) π₀ = logdensity(H, z) warned = false for i in 1:L z = leapfrog(H, z, ϵ) Δ = logdensity(H, z) - π₀ if abs(Δ) ≥ atol && !warned @warn "Hamiltonian invariance violated" step = i L Δ show(H) show(z) warned = true end @test Δ ≈ 0 atol = atol end end for _ in 1:100 (; H, z) = rand_Hz(rand(RNG, 2:5)) ϵ = find_initial_stepsize(InitialStepsizeSearch(), local_log_acceptance_ratio(H, z)) test_hamiltonian_invariance(H, z, 10, ϵ/100; atol = 0.5) end end @testset "leapfrog back and forth" begin for _ in 1:1000 (; H, z) = rand_Hz(5) z1 = z N = 5 ϵ = 0.1 z1 = leapfrog(H, z1, ϵ) z1 = leapfrog(H, z1, -ϵ) @test z.p ≈ z1.p norm = x -> norm(x, Inf) atol = 1e-5 @test z.Q.q ≈ z1.Q.q norm = x -> norm(x, Inf) atol = 1e-6 end for _ in 1:100 (; H, z, Σ) = rand_Hz(2) z1 = z N = 3 # use something near the stable stepsize to avoid numerical issue, but perturb it a # bit for testing ϵ = find_stable_ϵ(H.κ, Σ) * (0.5 + rand(RNG)) # forward for _ in 1:N z1 = leapfrog(H, z1, ϵ) end # backward for _ in 1:N z1 = leapfrog(H, z1, -ϵ) end @test z.p ≈ z1.p norm = x -> norm(x, Inf) rtol = 0.001 @test z.Q.q ≈ z1.Q.q norm = x -> norm(x, Inf) rtol = 0.001 end end @testset "PhasePoint building blocks and infinite values" begin # wrong gradient length @test_throws ArgumentError EvaluatedLogDensity([1.0, 2.0], 1.0, [1.0]) # wrong p length Q = EvaluatedLogDensity([1.0, 2.0], 1.0, [1.0, 2.0]) @test_throws ArgumentError PhasePoint(Q, [1.0]) @test PhasePoint(Q, [1.0, 2.0]) isa PhasePoint # fallback constructors Q1 = EvaluatedLogDensity([1.0, 2.0], -2.0, [3.0, 3.0]) # standard Q2 = EvaluatedLogDensity([1, 2], -2.0, [3.0, 3.0]) # promote Q3 = EvaluatedLogDensity((i for i in 1:2), -2.0, [3.0, 3.0]) # generator @test Q1.q == Q2.q == Q3.q @test Q1.ℓq == Q2.ℓq == Q3.ℓq @test Q1.∇ℓq == Q2.∇ℓq == Q3.∇ℓq # infinity fallbacks h = Hamiltonian(GaussianKineticEnergy(1), multivariate_normal(zeros(1))) @test logdensity(h, PhasePoint(EvaluatedLogDensity([1.0], -Inf, [1.0]), 
[1.0])) == -Inf @test logdensity(h, PhasePoint(EvaluatedLogDensity([1.0], NaN, [1.0]), [1.0])) == -Inf @test logdensity(h, PhasePoint(EvaluatedLogDensity([1.0], 9.0, [1.0]), [NaN])) == -Inf end @testset "Hamiltonian and KE printing" begin κ = GaussianKineticEnergy(Diagonal([1.0, 4.0])) @test repr(κ) == "Gaussian kinetic energy (Diagonal), √diag(M⁻¹): [1.0, 2.0]" H = Hamiltonian(κ, multivariate_normal(zeros(2))) @test repr(H) == "Hamiltonian with Gaussian kinetic energy (Diagonal), √diag(M⁻¹): [1.0, 2.0]" @test_throws ArgumentError Hamiltonian(κ, multivariate_normal(zeros(1))) end #### #### test Hamiltonian/leapfrog using HMC #### """ $(SIGNATURES) Simple Hamiltonian Monte Carlo transition, for testing. """ function HMC_transition(H, z::PhasePoint, ϵ, L) π₀ = logdensity(H, z) z′ = z for _ in 1:L z′ = leapfrog(H, z′, ϵ) end Δ = logdensity(H, z′) - π₀ accept = Δ > 0 || (rand(RNG) < exp(Δ)) accept ? z′ : z end """ $(SIGNATURES) Simple Hamiltonian Monte Carlo sample, for testing. """ function HMC_sample(H, q, N, ϵ; L = 10) qs = similar(q, N, length(q)) for i in 1:N z = PhasePoint(evaluate_ℓ(H.ℓ, q), rand_p(RNG, H.κ)) q = HMC_transition( H, z, ϵ, L).Q.q qs[i, :] = q end qs end @testset "unit normal simple HMC" begin # Tests the leapfrog and Hamiltonian code with HMC. 
K = 2 ℓ = multivariate_normal(zeros(K), Diagonal(ones(K))) q = randn(RNG, K) H = Hamiltonian(GaussianKineticEnergy(Diagonal(ones(K))), ℓ) qs = HMC_sample(H, q, 10000, find_stable_ϵ(H.κ, Diagonal(ones(K))) / 5) m, C = mean_and_cov(qs, 1) @test vec(m) ≈ zeros(K) atol = 0.1 @test C ≈ Matrix(Diagonal(ones(K))) atol = 0.1 end ================================================ FILE: test/test_logging.jl ================================================ # NOTE currently we just check that logging does not error, more explicit testing might make # sense ℓ = multivariate_normal(ones(1)) κ = GaussianKineticEnergy(1) Q = evaluate_ℓ(ℓ, [1.0]) reporters_1 = [ NoProgressReport(), ProgressMeterReport(), ] reporters_2 = [ LogProgressReport(), ] with_logger(NullLogger()) do # suppress logging in CI for reporter in deepcopy(vcat(reporters_1, reporters_2)) results = mcmc_with_warmup(RNG, ℓ, 10_000; reporter = reporter) end for reporter in deepcopy(vcat(reporters_1, reporters_2)) DynamicHMC.report(reporter, "") mcmc_reporter_1 = DynamicHMC.make_mcmc_reporter(reporter, 1_000; currently_warmup = true) DynamicHMC.report(mcmc_reporter_1, "") DynamicHMC.report(mcmc_reporter_1, 1) mcmc_reporter_2 = DynamicHMC.make_mcmc_reporter(reporter, 1_000; currently_warmup = false) DynamicHMC.report(mcmc_reporter_2, "") DynamicHMC.report(mcmc_reporter_2, 1) end for reporter in deepcopy(reporters_1) DynamicHMC.report(reporter, "") DynamicHMC.report(reporter, 1) mcmc_reporter_1 = DynamicHMC.make_mcmc_reporter(reporter, 1_000; currently_warmup = true) DynamicHMC.report(mcmc_reporter_1, "") DynamicHMC.report(mcmc_reporter_1, 1) mcmc_reporter_2 = DynamicHMC.make_mcmc_reporter(reporter, 1_000; currently_warmup = false) DynamicHMC.report(mcmc_reporter_2, "") DynamicHMC.report(mcmc_reporter_2, 1) end end ================================================ FILE: test/test_mcmc.jl ================================================ using DynamicHMC: mcmc_steps, mcmc_next_step, mcmc_keep_warmup, WarmupState ##### ##### 
Test building blocks of MCMC ##### @testset "printing" begin ℓ = multivariate_normal(ones(1)) κ = GaussianKineticEnergy(1) Q = evaluate_ℓ(ℓ, [1.0]) @test repr(WarmupState(Q, κ, 1.0)) isa String @test repr(WarmupState(Q, κ, nothing)) isa String end @testset "mcmc" begin ℓ = multivariate_normal(ones(5)) @testset "default warmup" begin results = mcmc_with_warmup(RNG, ℓ, 10000; reporter = NoProgressReport()) Z = results.posterior_matrix @test map(z -> LogDensityProblems.logdensity(ℓ, z), eachcol(Z)) ≈ results.logdensities @test norm(mean(Z; dims = 2) .- ones(5), Inf) < 0.04 @test norm(std(Z; dims = 2) .- ones(5), Inf) < 0.04 @test mean(x -> x.acceptance_rate, results.tree_statistics) ≥ 0.8 @test 0.5 ≤ results.ϵ ≤ 2 end @testset "fixed stepsize" begin results = mcmc_with_warmup(RNG, ℓ, 10000; initialization = (ϵ = 1.0, ), reporter = NoProgressReport(), warmup_stages = fixed_stepsize_warmup_stages()) Z = results.posterior_matrix @test norm(mean(Z; dims = 2) .- ones(5), Inf) < 0.04 @test norm(std(Z; dims = 2) .- ones(5), Inf) < 0.04 @test mean(x -> x.acceptance_rate, results.tree_statistics) ≥ 0.7 end @testset "explicitly provided initial stepsize" begin results = mcmc_with_warmup(RNG, ℓ, 10000; initialization = (ϵ = 1.0, ), reporter = NoProgressReport(), warmup_stages = default_warmup_stages(; stepsize_search = nothing)) Z = results.posterior_matrix @test norm(mean(Z; dims = 2) .- ones(5), Inf) < 0.03 @test norm(std(Z; dims = 2) .- ones(5), Inf) < 0.04 @test mean(x -> x.acceptance_rate, results.tree_statistics) ≥ 0.7 end @testset "stepwise" begin results = mcmc_keep_warmup(RNG, ℓ, 0; reporter = NoProgressReport()) steps = mcmc_steps(results.sampling_logdensity, results.final_warmup_state) qs = let Q = results.final_warmup_state.Q [(Q = first(mcmc_next_step(steps, Q)); Q.q) for _ in 1:1000] end @test norm(mean(reduce(hcat, qs); dims = 2) .- ones(5), Inf) ≤ 0.1 end end @testset "robust U-turn tests" begin # Cf https://github.com/tpapp/DynamicHMC.jl/issues/115 function 
count_max_depth(rng, ℓ, max_depth; N = 1000) results = mcmc_with_warmup(rng, ℓ, N; algorithm = DynamicHMC.NUTS(max_depth = max_depth), reporter = NoProgressReport()) sum(getfield.(results.tree_statistics, :depth) .≥ max_depth) end ℓ = multivariate_normal(zeros(200)) max_depth = 12 M = sum([count_max_depth(RNG, ℓ, max_depth) for _ in 1:20]) @test M == 0 end @testset "posterior accessors sanity checks" begin D, N, K = 5, 100, 7 ℓ = multivariate_normal(ones(5)) results = fill(mcmc_with_warmup(RNG, ℓ, N; reporter = NoProgressReport()), K) @test size(stack_posterior_matrices(results)) == (N, K, D) @test size(pool_posterior_matrices(results)) == (D, N * K) end # @testset "tuner framework" begin # s = StepsizeTuner(10) # @test length(s) == 10 # @test repr(s) == "Stepsize tuner, 10 samples" # c = StepsizeCovTuner(19, 7.0) # @test length(c) == 19 # @test repr(c) == # "Stepsize and covariance tuner, 19 samples, regularization 7.0" # b = bracketed_doubling_tuner() # testing the defaults # @test b isa TunerSequence # @test b.tuners == (StepsizeTuner(75), # init # StepsizeCovTuner(25, 5.0), # doubling each step # StepsizeCovTuner(50, 5.0), # StepsizeCovTuner(100, 5.0), # StepsizeCovTuner(200, 5.0), # StepsizeCovTuner(400, 5.0), # StepsizeTuner(50)) # terminate # @test repr(b) == # """ # Sequence of 7 tuners, 900 total samples # Stepsize tuner, 75 samples # Stepsize and covariance tuner, 25 samples, regularization 5.0 # Stepsize and covariance tuner, 50 samples, regularization 5.0 # Stepsize and covariance tuner, 100 samples, regularization 5.0 # Stepsize and covariance tuner, 200 samples, regularization 5.0 # Stepsize and covariance tuner, 400 samples, regularization 5.0 # Stepsize tuner, 50 samples""" # end ================================================ FILE: test/test_stepsize.jl ================================================ ##### ##### Stepsize and adaptation tests ##### using DynamicHMC: find_initial_stepsize, InitialStepsizeSearch, DualAveraging, 
initial_adaptation_state, adapt_stepsize, current_ϵ, final_ϵ, FixedStepsize, local_log_acceptance_ratio @testset "stepsize general rootfinding" begin A = ϵ -> -ϵ*3.0 params = InitialStepsizeSearch() # parameters consistency @test_throws ArgumentError InitialStepsizeSearch(; log_threshold = NaN) # not finite @test_throws ArgumentError InitialStepsizeSearch(; log_threshold = 1.0) # too large @test_throws ArgumentError InitialStepsizeSearch(; initial_ϵ = -0.5) # not > 0 @test_throws ArgumentError InitialStepsizeSearch(; maxiter_crossing = 2) # too small # crossing ϵ = find_initial_stepsize(params, A) @test A(ϵ) > params.log_threshold > A(params.initial_ϵ) let params = InitialStepsizeSearch(; initial_ϵ = 0.01) ϵ = find_initial_stepsize(params, A) @test A(ϵ) < params.log_threshold < A(params.initial_ϵ) end @test_throws DynamicHMCError find_initial_stepsize(params, ϵ -> 1) # constant end """ $(SIGNATURES) A parametric random acceptance rate that depends on the stepsize. For unit testing acceptance rate tuning. 
""" dummy_acceptance_rate(ϵ, σ = 0.05) = min(1/ϵ * exp(randn(RNG)*σ - σ^2/2), 1) mean_dummy_acceptance_rate(ϵ, σ = 0.05) = mean(dummy_acceptance_rate(ϵ, σ) for _ in 1:10000) @testset "dual averaging far" begin ϵ₀ = 100.0 # way off δ = 0.65 dual_averaging = DualAveraging(; δ = δ) A = initial_adaptation_state(dual_averaging, ϵ₀) @test A.logϵ̄ == 0 # ϵ̄₀ = 1 in Gelman and Hoffman (2014) @test A.m == 1 @test A.H̄ == 0 for _ in 1:500 A = adapt_stepsize(dual_averaging, A, dummy_acceptance_rate(current_ϵ(A))) end @test mean_dummy_acceptance_rate(final_ϵ(A)) ≈ δ atol = 0.02 end @testset "dual averaging close" begin ϵ₀ = 2.0 δ = 0.65 dual_averaging = DualAveraging(; δ = δ) A = initial_adaptation_state(dual_averaging, ϵ₀) for _ in 1:2000 A = adapt_stepsize(dual_averaging, A, dummy_acceptance_rate(current_ϵ(A))) end @test mean_dummy_acceptance_rate(final_ϵ(A)) ≈ δ atol = 0.01 end @testset "dual averaging far and noisy" begin ϵ₀ = 20.0 δ = 0.65 dual_averaging = DualAveraging(; δ = δ) A = initial_adaptation_state(dual_averaging, ϵ₀) for _ in 1:10000 A = adapt_stepsize(dual_averaging, A, dummy_acceptance_rate(current_ϵ(A), 2.0)) end @test mean_dummy_acceptance_rate(final_ϵ(A), 2.0) ≈ δ atol = 0.04 end @testset "fixed stepsize sanity checks" begin fs = FixedStepsize() ϵ = 1.0 A = initial_adaptation_state(fs, ϵ) @test A == adapt_stepsize(fs, A, ϵ) @test current_ϵ(A) == ϵ @test final_ϵ(A) == ϵ end @testset "find reasonable stepsize - random H, z" begin p = InitialStepsizeSearch() _bkt(A, ϵ, C) = (A(ϵ) - p.log_threshold) * (A(ϵ * C) - p.log_threshold) ≤ 0 for _ in 1:100 (; H, z) = rand_Hz(rand(RNG, 3:5)) A = local_log_acceptance_ratio(H, z) ϵ = find_initial_stepsize(p, A) @test _bkt(A, ϵ, 0.5) || _bkt(A, ϵ, 2.0) end end @testset "error for non-finite initial density" begin p = InitialStepsizeSearch() (; H, z) = rand_Hz(2) z = DynamicHMC.PhasePoint(z.Q, [NaN, NaN]) @test_throws DynamicHMCError find_initial_stepsize(p, local_log_acceptance_ratio(H, z)) end 
================================================
FILE: test/test_trees.jl
================================================
using DynamicHMC: Directions, next_direction, biased_progressive_logprob2,
    adjacent_tree, sample_trajectory, InvalidTree, is_turning

####
#### test directions mechanism
####

@testset "directions" begin
    directions = Directions(0b110101)
    is_forwards = Vector{Bool}()
    for i in 1:6
        is_forward, directions = next_direction(directions)
        push!(is_forwards, is_forward)
    end
    @test collect(is_forwards) == [true, false, true, false, true, true]
    @test rand(RNG, Directions).flags isa UInt32
end

####
#### dummy trajectory for unit testing
####

"""
Trajectory type that is easy to inspect and reason about, for unit testing. The field
`visited` keeps track of visited nodes and can be reset with `empty!`.
"""
struct DummyTrajectory{L,T,D}
    "Positions that trigger turning."
    turning::T
    "Positions that trigger/mimic divergence."
    divergent::D
    "Log density."
    ℓ::L
    "Visited positions."
    visited::Vector{Int}
end

function DummyTrajectory(ℓ; turning = Set(Int[]), divergent = Set(Int[]))
    DummyTrajectory(turning, divergent, ℓ, Int[])
end

Base.empty!(trajectory::DummyTrajectory) = empty!(trajectory.visited)

DynamicHMC.move(::DummyTrajectory, z, is_forward) = z + (is_forward ?
1 : -1) const DUMMY_TURN_STATISTICS = Tuple{Bool,UnitRange{Int64}} function DynamicHMC.is_turning(::DummyTrajectory, τ::DUMMY_TURN_STATISTICS) turn_flag, positions = τ @test length(positions) > 1 # not called on a leaf turn_flag end function DynamicHMC.combine_turn_statistics(::DummyTrajectory, τ₁::DUMMY_TURN_STATISTICS, τ₂::DUMMY_TURN_STATISTICS) turn_flag1, positions1 = τ₁ turn_flag2, positions2 = τ₂ @test last(positions1) + 1 == first(positions2) # adjacency and order (turn_flag1 && turn_flag2, first(positions1):last(positions2)) end function DynamicHMC.combine_visited_statistics(::DummyTrajectory, v₁, v₂) a1, s1 = v₁ a2, s2 = v₂ (a1 + a2, s1 + s2) end function DynamicHMC.combine_proposals(_, ::DummyTrajectory, zeta1, zeta2, logprob2, is_forward) lp2 = logprob2 > 0 ? 0.0 : logprob2 lp1 = logprob2 > 0 ? oftype(lp2, -Inf) : log1mexp(lp2) if !is_forward # exchange so that we can test for adjacency, and join as a UnitRange zeta1, zeta2 = zeta2, zeta1 lp1, lp2 = lp2, lp1 end z1, p1 = zeta1 z2, p2 = zeta2 @test last(z1) + 1 == first(z2) # adjacency and order (first(z1):last(z2), vcat(p1 .+ lp1, p2 .+ lp2)) end function DynamicHMC.calculate_logprob2(::DummyTrajectory, is_doubling, ω₁, ω₂, ω) biased_progressive_logprob2(is_doubling, ω₁, ω₂, ω) end function DynamicHMC.leaf(trajectory::DummyTrajectory, z, is_initial) (; turning, divergent, ℓ, visited) = trajectory d = z ∈ divergent is_initial && @argcheck !d # don't start with divergent Δ = ℓ(z) v = is_initial ? (0.0, 0) : (min(exp(Δ), 1), 1) !is_initial && push!(visited, z) # save position if d nothing, v else (((z:z, [0.0]), # ζ = nodes, log prob. within tree Δ, # ω = Δ for leaf (z ∈ turning, z:z)), # τ v) end end "A log density for testing." testℓ(z) = -abs2(z - 3) * 0.1 "Total acceptance rate of `ℓ` over `z`" testA(ℓ, z) = sum(min.(exp.(ℓ.(z)), 1)) "sum of acceptance rates for trajectory." 
testA(trajectory::DummyTrajectory) = testA(trajectory.ℓ, trajectory.visited)

@testset "dummy adjacent tree full" begin
    trajectory = DummyTrajectory(testℓ)
    (ζ, ω, τ, z′, i′), v = adjacent_tree(nothing, trajectory, 0, 0, 2, true)
    @test first(ζ) == 1:4
    @test sum(exp, last(ζ)) ≈ 1
    @test trajectory.visited == 1:4
    @test !is_turning(trajectory, τ)
    @test v[1] ≈ testA(trajectory)
    @test v[2] == 4
    @test z′ == i′ == 4
end

@testset "dummy adjacent tree turning" begin
    trajectory = DummyTrajectory(testℓ; turning = 5:7)
    t, v = adjacent_tree(nothing, trajectory, 0, 0, 3, true)
    @test trajectory.visited == 1:6 # [5,6] is turning
    @test t == InvalidTree(5, 6)
    @test v[1] ≈ testA(trajectory)
    @test v[2] == 6
end

@testset "dummy adjacent tree divergent" begin
    trajectory = DummyTrajectory(testℓ; divergent = 5:7)
    t, v = adjacent_tree(nothing, trajectory, 0, 0, 3, true)
    @test trajectory.visited == 1:5 # 5 is divergent
    @test t == InvalidTree(5)
    @test v[1] ≈ testA(testℓ, 1:5)
    @test v[2] == 5
end

@testset "dummy adjacent tree full backward" begin
    trajectory = DummyTrajectory(testℓ)
    (ζ, ω, τ, z′, i′), v = adjacent_tree(nothing, trajectory, 0, 0, 3, false)
    @test first(ζ) == -8:-1
    @test sum(exp, last(ζ)) ≈ 1
    @test trajectory.visited == -(1:8)
    @test !is_turning(trajectory, τ)
    @test v[1] ≈ testA(testℓ, -(1:8))
    @test v[2] == 8
    @test z′ == i′ == -8
end

@testset "dummy sampled tree" begin
    trajectory = DummyTrajectory(testℓ)
    ζ, v, termination, depth = sample_trajectory(nothing, trajectory, 0, 3,
                                                 Directions(0b101))
    @test trajectory.visited == [1, -1, -2, 2, 3, 4, 5]
    @test first(ζ) == -2:5
    @test sum(exp, last(ζ)) ≈ 1
    @test termination == DynamicHMC.REACHED_MAX_DEPTH
    @test v[1] ≈ testA(trajectory)
    @test v[2] == 7 # initial node does not participate in acceptance rate
end

####
#### Detailed balance tests
####

"An accumulator for log probabilities associated with a position on a trajectory."
empty_accumulator() = Dict{Int,Float64}()

"Add log probabilities `πs` at positions `zs` into `accumulator`."
function add_log_probabilities!(accumulator, zs, πs)
    for (z, π) in zip(zs, πs)
        accumulator[z] = haskey(accumulator, z) ? logaddexp(accumulator[z], π) : π
    end
    accumulator
end

"Normalize an accumulator by depth."
function normalize_accumulator(accumulator, depth)
    D = log(0.5) * depth
    Dict((k => v + D) for (k, v) in pairs(accumulator))
end

"""
An accumulator with the probability of visiting nodes for all trees with `depth`, starting
from `z`, on `trajectory`.
"""
function visited_log_probabilities(trajectory, z, depth)
    accumulator = empty_accumulator()
    for flags in 0:(2^depth - 1)
        ζ = first(sample_trajectory(nothing, trajectory, z, depth,
                                    Directions(UInt32(flags))))
        add_log_probabilities!(accumulator, ζ...)
    end
    normalize_accumulator(accumulator, depth)
end

"""
The probability of visiting node `z′` for all trees with `depth`, starting from `z`, on
`trajectory`.
"""
function transition_log_probability(trajectory, z, z′, depth)
    p = -Inf
    for flags in 0:(2^depth - 1)
        zs, πs = first(sample_trajectory(nothing, trajectory, z, depth,
                                         Directions(UInt32(flags))))
        ix = findfirst(isequal(z′), zs)
        if ix ≢ nothing
            p = logaddexp(p, πs[ix])
        end
    end
    p + depth * log(0.5)
end

@testset "transition calculations consistency check" begin
    trajectory = DummyTrajectory(testℓ)
    depth = 5
    z = 9
    for (z′, π) in pairs(visited_log_probabilities(trajectory, z, depth))
        @test π ≈ transition_log_probability(trajectory, z, z′, depth)
    end
end

"""
$(SIGNATURES)

For all transitions from `z`, test the detailed balance condition, ie

``ℙ(z) ℙ(z′ ∣ z) = ℙ(z′) ℙ(z ∣ z′)``

where ``ℙ(z) = exp(ℓ(z))`` and the transition probabilities ``ℙ(⋅ ∣ ⋅)`` are calculated
using `visited_log_probabilities` and `transition_log_probability`. (We use logs for more
accurate calculation.)
""" function test_detailed_balance(trajectory, z, depth; atol = √eps()) (; ℓ) = trajectory ℓz = ℓ(z) for (z′, π) in pairs(visited_log_probabilities(trajectory, z, depth)) π′ = transition_log_probability(trajectory, z′, z, depth) @test (π + ℓz) ≈ (π′ + ℓ(z′)) atol = atol end end @testset "detailed balance" begin for max_depth in 1:5 test_detailed_balance(DummyTrajectory(testℓ), 0, max_depth) end for max_depth in 1:5 test_detailed_balance(DummyTrajectory(testℓ; turning = 1:2), 3, max_depth) end for max_depth in 1:6 test_detailed_balance(DummyTrajectory(testℓ; divergent = 10:11), 3, max_depth) end for max_depth in 1:6 test_detailed_balance(DummyTrajectory(testℓ; divergent = 10:12, turning = -3:-2), 3, max_depth) end end ================================================ FILE: test/utilities.jl ================================================ """ $(SIGNATURES) Random positive definite matrix of size `n` x `n` (for testing). """ function rand_Σ(::Type{Symmetric}, n) A = randn(RNG, n, n) Symmetric(A'*A .+ 0.01) end rand_Σ(::Type{Diagonal}, n) = Diagonal(randn(RNG, n).^2 .+ 0.01) rand_Σ(n::Int) = rand_Σ(Symmetric, n) """ $(SIGNATURES) Simulated mean and covariance of `N` values from `f()`. """ function simulated_meancov(f, N) s = f() K = length(s) x = similar(s, (N, K)) for i in 1:N x[i, :] = f() end m, C = mean_and_cov(x, 1) vec(m), C end @testset "simulated meancov" begin μ = [2, 1.2] D = [2.0, 0.7] m, C = simulated_meancov(()-> randn(RNG, 2) .* D .+ μ, 10000) @test m ≈ μ atol = 0.05 rtol = 0.1 @test C ≈ Diagonal(abs2.(D)) atol = 0.05 rtol = 0.1 end ### ### Hamiltonian test helper functions ### """ $(SIGNATURES) Return a reasonable estimate for the largest stable stepsize (which may not be stable, but is a good starting point for finding that). `q` is assumed to be normal with variance `Σ`. `κ` is the kinetic energy. 
Using the transformation ``p̃ = W⁻¹ p``, the kinetic energy is

``p'M⁻¹p/2 = p'W'⁻¹W⁻¹p/2 = p̃'p̃/2``

Transforming to ``q̃ = W'q``, the variance of which becomes ``W' Σ W``. Return the square
root of its smallest eigenvalue, following Neal (2011, p 136).

When ``Σ⁻¹ = M = WW'``, the variance of `q̃` is ``W' Σ W = W' W'⁻¹ W⁻¹ W = I``, and thus
the transformation decorrelates the density perfectly.
"""
find_stable_ϵ(κ::GaussianKineticEnergy, Σ) = √eigmin(κ.W'*Σ*κ.W)

"Multivariate normal with `Σ = LL'`."
multivariate_normal(μ, L) = (shift(μ) ∘ linear(L))(StandardMultivariateNormal(length(μ)))

"Multivariate normal with diagonal `Σ` (constant variance `v`)."
multivariate_normal(μ, v::Real = 1) = multivariate_normal(μ, I(length(μ)) * √v)

"""
$(SIGNATURES)

A `NamedTuple` that contains

- a random `K`-element vector `μ`,

- a random `K × K` covariance matrix `Σ`,

- a random Hamiltonian `H` with `ℓ` corresponding to a multivariate normal with `μ`, `Σ`,
  and a random Gaussian kinetic energy (unrelated to `ℓ`),

- a random phasepoint `z`.

Useful for testing.
"""
function rand_Hz(K)
    μ = randn(K)
    Σ = rand_Σ(K)
    L = cholesky(Σ).L
    κ = GaussianKineticEnergy(inv(rand_Σ(Diagonal, K)))
    ℓ = multivariate_normal(μ, L)
    H = Hamiltonian(κ, ℓ)
    q = rand(RNG, ℓ)
    p = rand_p(RNG, κ)
    z = PhasePoint(evaluate_ℓ(H.ℓ, q), p)
    (μ = μ, Σ = Σ, H = H, z = z)
end
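The eigenvalue heuristic behind `find_stable_ϵ` can be checked without any of the package machinery. A minimal sketch, using only `LinearAlgebra`; the helper name `stable_ϵ` and the example matrices `Σ` and `W` are made up for illustration, not values used by the tests above:

```julia
using LinearAlgebra

# For a Gaussian target with covariance Σ and kinetic energy metric M = W W',
# the transformed position q̃ = W'q has covariance W'ΣW; the square root of its
# smallest eigenvalue estimates the largest stable leapfrog stepsize along the
# stiffest direction (cf. Neal 2011, p 136).
stable_ϵ(W, Σ) = √eigmin(Symmetric(W' * Σ * W))

Σ = [2.0 0.5; 0.5 1.0]

# When M = Σ⁻¹ (a perfectly adapted metric), W'ΣW = I and the estimate is 1.
W = cholesky(inv(Σ)).L           # M = W W' = Σ⁻¹
ϵ_adapted = stable_ϵ(W, Σ)

# With the identity metric, the estimate shrinks to the smallest scale of Σ.
ϵ_identity = stable_ϵ(Matrix(1.0I, 2, 2), Σ)
```

The second case shows why metric adaptation matters: with `W = I` the stepsize is limited by the most constrained direction of `Σ`, while a well-adapted `M` makes all directions equally easy.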