[
  {
    "path": ".editorconfig",
    "content": "root = true\n\n[*]\ncharset = utf-8\nindent_size = 4\nindent_style = space\nend_of_line = lf\ninsert_final_newline = true\ntrim_trailing_whitespace = true\n\n[*.md]\ntrim_trailing_whitespace = false\n\n[*.{yml,yaml}]\nindent_size = 2\n"
  },
  {
    "path": ".gitattributes",
    "content": "# Path-based git attributes\n# https://www.kernel.org/pub/software/scm/git/docs/gitattributes.html\n\n# Ignore all test and documentation with \"export-ignore\".\n/.github                 export-ignore\n/.gitattributes          export-ignore\n/.gitignore              export-ignore\n/phpunit.xml.dist        export-ignore\n/psalm.xml.dist          export-ignore\n/tests                   export-ignore\n/.editorconfig           export-ignore\n/.php-cs-fixer.dist.php  export-ignore\n/art                     export-ignore\n/docs                    export-ignore\n/UPGRADING.md            export-ignore\n"
  },
  {
    "path": ".github/CONTRIBUTING.md",
    "content": "# Contributing\n\nContributions are **welcome** and will be fully **credited**.\n\nPlease read and understand the contribution guide before creating an issue or pull request.\n\n## Etiquette\n\nThis project is open source, and as such, the maintainers give their free time to build and maintain the source code\nheld within. They make the code freely available in the hope that it will be of use to other developers. It would be\nextremely unfair for them to suffer abuse or anger for their hard work.\n\nPlease be considerate towards maintainers when raising issues or presenting pull requests. Let's show the\nworld that developers are civilized and selfless people.\n\nIt's the duty of the maintainer to ensure that all submissions to the project are of sufficient\nquality to benefit the project. Many developers have different skillsets, strengths, and weaknesses. Respect the maintainer's decision, and do not be upset or abusive if your submission is not used.\n\n## Viability\n\nWhen requesting or submitting new features, first consider whether it might be useful to others. Open\nsource projects are used by many developers, who may have entirely different needs to your own. 
Think about\nwhether or not your feature is likely to be used by other users of the project.\n\n## Procedure\n\nBefore filing an issue:\n\n- Attempt to replicate the problem, to ensure that it wasn't a coincidental incident.\n- Check to make sure your feature suggestion isn't already present within the project.\n- Check the pull requests tab to ensure that the bug doesn't have a fix in progress.\n- Check the pull requests tab to ensure that the feature isn't already in progress.\n\nBefore submitting a pull request:\n\n- Check the codebase to ensure that your feature doesn't already exist.\n- Check the pull requests to ensure that another person hasn't already submitted the feature or fix.\n\n## Requirements\n\nIf the project maintainer has any additional requirements, you will find them listed here.\n\n- **[PSR-2 Coding Standard](https://github.com/php-fig/fig-standards/blob/master/accepted/PSR-2-coding-style-guide.md)** - The easiest way to apply the conventions is to install [PHP Code Sniffer](https://pear.php.net/package/PHP_CodeSniffer).\n\n- **Add tests!** - Your patch won't be accepted if it doesn't have tests.\n\n- **Document any change in behaviour** - Make sure the `README.md` and any other relevant documentation are kept up-to-date.\n\n- **Consider our release cycle** - We try to follow [SemVer v2.0.0](https://semver.org/). Randomly breaking public APIs is not an option.\n\n- **One pull request per feature** - If you want to do more than one thing, send multiple pull requests.\n\n- **Send coherent history** - Make sure each individual commit in your pull request is meaningful. If you had to make multiple intermediate commits while developing, please [squash them](https://www.git-scm.com/book/en/v2/Git-Tools-Rewriting-History#Changing-Multiple-Commit-Messages) before submitting.\n\n**Happy coding**!\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/config.yml",
    "content": "blank_issues_enabled: false\ncontact_links:\n    - name: Request a new feature\n      url: https://github.com/hi-folks/statistics/issues/new?labels=enhancement\n      about: Share ideas for new features / functions\n    - name: Report a bug\n      url: https://github.com/hi-folks/statistics/issues/new?labels=bug\n      about: Report a reproducable bug\n    - name: Documentation\n      url: https://github.com/hi-folks/statistics/issues/new?labels=documentation\n      about: Improvements or additions to documentation\n"
  },
  {
    "path": ".github/SECURITY.md",
    "content": "# Package Security Policy\n\n## Reporting Security Issues\n\nIf you discover any security-related issues within our package, we take these matters seriously and encourage you to report them to us promptly. Your assistance in disclosing potential security vulnerabilities is highly appreciated.\n\nTo report a security issue, please send an email to us at [roberto.butti@gmail.com](mailto:roberto.butti@gmail.com). We request that you do not use public issue trackers or other public communication channels to report security concerns related to this package. This helps us maintain the confidentiality and integrity of the issue while we investigate and address it.\n\n## Responsible Disclosure\n\nWe follow a responsible disclosure policy, and we kindly ask you to:\n\n1. **Provide Sufficient Details**: When reporting a security issue, please include as much information as possible so that we can reproduce and understand the problem. This may include steps to reproduce, the affected component, and any proof-of-concept code if available.\n\n2. **Allow Time for Resolution**: We will acknowledge the receipt of your report promptly and work diligently to assess and resolve the issue. We appreciate your patience and understanding during this process.\n\n3. **Keep Information Confidential**: Please do not disclose or share the details of the security issue with others until we have addressed and resolved it. This helps protect our users and the security of our package.\n\n4. **Do Not Impact Other Users**: Please refrain from taking any actions that may negatively impact the availability or integrity of our package or the data of other users.\n\nIf you are unsure whether a specific issue qualifies, please report it, and we will assess its validity.\n\nThank you for your cooperation in helping us maintain the security of our package and protecting our users. We value your contributions to our security efforts and we deeply appreciate your valuable contributions.\n"
  },
  {
    "path": ".github/dependabot.yml",
    "content": "# Please see the documentation for all configuration options:\n# https://help.github.com/github/administering-a-repository/configuration-options-for-dependency-updates\n\nversion: 2\nupdates:\n\n  - package-ecosystem: \"github-actions\"\n    directory: \"/\"\n    schedule:\n      interval: \"weekly\"\n    labels:\n      - \"dependencies\""
  },
  {
    "path": ".github/workflows/dependabot-auto-merge.yml",
    "content": "name: dependabot-auto-merge\non: pull_request_target\n\npermissions:\n  pull-requests: write\n  contents: write\n\njobs:\n  dependabot:\n    runs-on: ubuntu-latest\n    if: ${{ github.actor == 'dependabot[bot]' }}\n    steps:\n      - name: Dependabot metadata\n        id: metadata\n        uses: dependabot/fetch-metadata@v2.5.0\n        with:\n          github-token: \"${{ secrets.GITHUB_TOKEN }}\"\n\n      - name: Auto-merge Dependabot PRs for semver-minor updates\n        if: ${{steps.metadata.outputs.update-type == 'version-update:semver-minor'}}\n        run: gh pr merge --auto --merge \"$PR_URL\"\n        env:\n          PR_URL: ${{github.event.pull_request.html_url}}\n          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}\n\n      - name: Auto-merge Dependabot PRs for semver-patch updates\n        if: ${{steps.metadata.outputs.update-type == 'version-update:semver-patch'}}\n        run: gh pr merge --auto --merge \"$PR_URL\"\n        env:\n          PR_URL: ${{github.event.pull_request.html_url}}\n          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}\n"
  },
  {
    "path": ".github/workflows/run-tests.yml",
    "content": "name: Tests\n\non: [push, pull_request]\n\njobs:\n  test:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: true\n      matrix:\n        os: [ubuntu-latest, windows-latest]\n        php: [8.2, 8.3, 8.4, 8.5]\n        exclude:\n          - os: windows-latest\n            php: [8.2, 8.3, 8.5]\n\n    name: P${{ matrix.php }} - ${{ matrix.os }}\n\n    steps:\n      - name: Checkout code\n        uses: actions/checkout@v6\n\n      - name: Setup PHP\n        uses: shivammathur/setup-php@v2\n        with:\n          php-version: ${{ matrix.php }}\n          extensions: dom, curl, libxml, mbstring, zip, pcntl, bcmath, soap, intl, iconv, fileinfo\n          coverage: xdebug\n\n      - name: Setup problem matchers\n        run: |\n          echo \"::add-matcher::${{ runner.tool_cache }}/php.json\"\n          echo \"::add-matcher::${{ runner.tool_cache }}/phpunit.json\"\n\n      - name: Install dependencies\n        run: composer install --prefer-dist --no-interaction\n\n      - name: Execute tests\n        run: composer test\n"
  },
  {
    "path": ".github/workflows/static-code-analysis.yml",
    "content": "name: Static Code Analysis\n\non: [push, pull_request]\n\njobs:\n  test:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: true\n      matrix:\n        os: [ubuntu-latest]\n        php: [8.4]\n        stability: [prefer-stable]\n\n    name: P${{ matrix.php }} - ${{ matrix.stability }} - ${{ matrix.os }}\n\n    steps:\n      - name: Checkout code\n        uses: actions/checkout@v6\n\n      - name: Setup PHP\n        uses: shivammathur/setup-php@v2\n        with:\n          php-version: ${{ matrix.php }}\n          extensions: dom, curl, libxml, mbstring, zip, pcntl, bcmath, intl, iconv, fileinfo\n          coverage: none\n\n      - name: Setup problem matchers\n        run: |\n          echo \"::add-matcher::${{ runner.tool_cache }}/php.json\"\n          echo \"::add-matcher::${{ runner.tool_cache }}/phpunit.json\"\n\n      - name: Install dependencies\n        run: composer update --${{ matrix.stability }} --prefer-dist --no-interaction\n\n      - name: Execute static code analysis\n        run: vendor/bin/phpstan analyse src --level 8 --error-format=github --no-progress --no-ansi\n"
  },
  {
    "path": ".gitignore",
    "content": ".idea\n.php_cs\n.php_cs.cache\n.phpunit.result.cache\nbuild\nbin\ncomposer.lock\ncoverage\ndocs\nphpunit.xml\npsalm.xml\nvendor\n.php-cs-fixer.cache\n/.phpunit.cache\n"
  },
  {
    "path": ".php-cs-fixer.dist.php",
    "content": "<?php\n\n$finder = new PhpCsFixer\\Finder()->in([\n    __DIR__ . \"/src\",\n    __DIR__ . \"/tests\",\n    __DIR__ . \"/examples\",\n]);\n\nreturn new PhpCsFixer\\Config()\n    ->setRules([\n        \"@PER-CS\" => true,\n        \"@PHP82Migration\" => true,\n        \"class_attributes_separation\" => [\n            \"elements\" => [\n                \"const\" => \"one\",\n                \"method\" => \"one\",\n                \"property\" => \"one\",\n                \"trait_import\" => \"none\",\n            ],\n        ],\n        \"no_extra_blank_lines\" => [\n            \"tokens\" => [\"extra\", \"throw\", \"use\"],\n        ],\n        \"no_blank_lines_after_class_opening\" => true,\n        \"no_blank_lines_after_phpdoc\" => true,\n        \"no_closing_tag\" => true,\n        \"no_empty_phpdoc\" => true,\n        \"no_empty_statement\" => true,\n\n        //'strict_param' => true,\n        \"array_indentation\" => true,\n        \"array_syntax\" => [\"syntax\" => \"short\"],\n        \"binary_operator_spaces\" => [\n            \"default\" => \"single_space\",\n            \"operators\" => [\"=>\" => null],\n        ],\n        \"whitespace_after_comma_in_array\" => true,\n    ])\n    ->setFinder($finder);\n"
  },
  {
    "path": "CHANGELOG.md",
    "content": "# Changelog\n\n## 1.5.0 - 2026-03-07\n- Adding `logarithmicRegression()`, `powerRegression()`, and `exponentialRegression()` methods for non-linear regression models\n\n## 1.4.0 - 2026-03-03\n- Adding `Utils\\Arr` class with `extract()` method for multi-column extraction from arrays of associative arrays, and `partition()` method for splitting arrays into matching/non-matching groups by field condition (supports ==, !=, >, <, >=, <= operators)\n- Adding `Utils\\Format` class with `secondsToTime()`, `timeToSeconds()`, `secondsToHms()`, and `hmsToSeconds()` methods for time formatting and parsing\n- Adding `Utils\\Math` class — reorganized from root namespace for consistency with `Enums/` and `Exception/` sub-directories\n- Reorganized `ArrUtil` and `Math` into `Utils\\Arr` and `Utils\\Math` sub-namespace; original classes remain as deprecated proxies for backward compatibility\n- Updated internal references in `Stat`, `Freq`, `StreamingStat`, and `Statistics` to use the new `Utils` namespace\n\n## 1.3.1 - 2026-02-23\n- Adding `tTestTwoSample()` method for two-sample independent t-test (Welch's t-test) — compares the means of two independent groups without assuming equal variances\n- Adding `tTestPaired()` method for paired t-test — tests whether the mean difference between paired observations (e.g. 
before/after) is significantly different from zero\n- Adding `StudentT` class for the Student's t-distribution (pdf, cdf, invCdf) — building block for t-tests and confidence intervals with small samples\n- Adding `tTest()` method for one-sample t-test — like z-test but appropriate for small samples where the population standard deviation is unknown\n- Adding `zTest()` method for one-sample Z-test — tests whether the sample mean differs significantly from a hypothesized population mean (includes p-value calculation)\n- Adding `Alternative` enum (`TwoSided`, `Greater`, `Less`) for hypothesis testing\n- Adding `confidenceInterval()` method for computing confidence intervals for the mean using the normal (z) distribution\n- Adding `rSquared()` method for R² (coefficient of determination) — proportion of variance explained by linear regression\n\n## 1.3.0 - 2026-02-22\n- Adding `StreamingStat` class (experimental) for streaming/online computation of mean, variance, stdev, skewness, kurtosis, sum, min, and max with O(1) memory\n- Adding `percentile()` method for computing the value at any percentile (0–100) with linear interpolation\n- Adding `coefficientOfVariation()` method for relative dispersion (CV%), supporting both sample and population modes\n- Adding `trimmedMean()` method for robust central tendency — computes the mean after removing outliers from each side\n- Adding `weightedMedian()` method for computing the median with weighted observations\n- Adding `sem()` method for standard error of the mean\n- Adding `meanAbsoluteDeviation()` method for mean absolute deviation — average distance from the mean\n- Adding `medianAbsoluteDeviation()` method for median absolute deviation — robust dispersion measure resistant to outliers\n- Adding `zscores()` method for computing z-scores of each value in a dataset\n- Adding `outliers()` method for z-score based outlier detection with configurable threshold\n- Adding `iqrOutliers()` method for IQR-based outlier detection (box 
plot whiskers), robust for skewed data\n- Adding `rSquared()` method for R² (coefficient of determination) — proportion of variance explained by linear regression\n\n## 1.2.5 - 2026-02-22\n- Adding `kurtosis()` method for excess kurtosis\n\n## 1.2.4 - 2026-02-21\n- Adding `skewness()` method for adjusted Fisher-Pearson sample skewness\n- Adding `pskewness()` method for population (biased) skewness\n- Full Coverage Tests (adding some edge cases)\n- Create KDE example\n\n## 1.2.3 - 2026-02-21\n- Adding `kde()` method for Kernel Density Estimation — returns a closure that estimates PDF or CDF from sample data, supporting 9 kernel functions with aliases\n- Adding `kdeRandom()` method for random sampling from a Kernel Density Estimate — returns a closure that generates random floats from the KDE distribution\n- Introducing `KdeKernel` backed string enum — `kde()` and `kdeRandom()`. It accepts `KdeKernel` enum cases\n- Adding Kernel Density Estimation (KDE) examples\n\n## 1.2.2 - 2026-02-21\n- Adding `method` parameter to `quantiles()` supporting `'exclusive'` (default) and `'inclusive'` interpolation methods\n- Adding `medianGrouped()` method for estimating the median of grouped/binned continuous data using interpolation\n- Adding Spearman rank correlation via `method` parameter in `correlation()` (`method='ranked'`)\n- Adding proportional linear regression via `proportional` parameter in `linearRegression()` for regression through the origin\n- Adding optional pre-computed mean parameter to `variance()` (`xbar`) and `pvariance()` (`mu`)\n\n\n## 1.2.1 - 2026-02-20\n- Adding `invCdf()` method to normal distribution\n- Adding `getVariance()` method to normal distribution (sigma squared)\n- Adding `getMedian()` method to normal distribution (equals mean)\n- Adding `getMode()` method to normal distribution (equals mean)\n- Adding `quantiles()` method to normal distribution (divide into n equal-probability intervals)\n- Adding `overlap()` method to normal distribution 
(overlapping coefficient between two distributions)\n- Adding `zscore()` method to normal distribution (standard score)\n- Adding `samples()` method to normal distribution (generate random samples with optional seed)\n- Adding `subtract()` method to normal distribution (counterpart to add)\n- Adding `divide()` method to normal distribution (counterpart to multiply)\n\n## 1.2.0 - 2026-02-19\n- Welcome to PHP 8.5\n- Upgrading to PHPstan new rules (offsetAccess)\n- Tests migrated from PestPHP 2 to PHPUnit 11\n- Code Syntax checker from Pint to PHP CS Fixer\n\n## 1.1.4 - 2025-04-25\n- Adding `fmean()` method for computing the arithmetic mean with float numbers.\n\n## 1.1.3 - 2024-12-14\n- Adding `multiply()` method to scale NormalDist by a constant\n\n## 1.1.2 - 2024-12-14\n- Implementing `add()` method for NormalDist\n\n## 1.1.1 - 2024-12-13\n- Implementing fromSample method for NormalDist\n\n## 1.1.0 - 2024-12-13\n- Upgrading RectorPHP v 2\n- Upgrading PHPStan v 2\n\n## 1.0.2 - 2024-12-10\n- NormalDist class, with `cdf()` and `pdf()`\n- Fix deprecations for PHP 8.4\n\n## 1.0.1 - 2024-11-21\n\n- Welcome PHP 8.4\n- Upgrading to Rector 1\n\n## 1.0.0 - 2023-12-26\n\n- Fixed `median()` function to handle unsorted data by @keatis\n- Rector refactor\n- PHPstan level 8\n- Support for PHP 8.1 and above\n- Add support for PHP 8.2 by @AmooAti\n- Update to PestPHP v2 by @AmooAti\n- Improving documentation (readme, contributing, code of conduct, security policies) by @AbhineshJha, @Arcturus22, @tvermaashutosh, @Abhishekgupta204, @Aryan4884\n- Rector v0.18.5 by @sukuasoft\n- Introducing Pint by @sukuasoft\n- GitHub Actions: Updating actions/checkout v4\n\n\n## 0.2.1 - 2022-02-22\n- Linear regression\n\n## 0.2.0 - 2022-02-21\n- Raise Exception instead of returning null if there is no valid input. 
By Artem Trokhymchuk @trokhymchuk [thanks for the PR #15](https://github.com/Hi-Folks/statistics/pull/15);\n- PHPStan, level 9\n\n## 0.1.7 - 2022-02-19\n- Code refactoring by @trokhymchuk\n- Clean phpdoc blocks by @trokhymchuk\n- Stat::correlation()\n- PHPStan, level 8\n\n## 0.1.6 - 2022-02-17\n- Stat::covariance()\n\n## 0.1.5 - 2022-02-05\n- frequencyTable()\n- frequencyTableBySize()\n- code refactoring and documenting some functions by Artem Trokhymchuk @trokhymchuk [thanks for the PR #2](https://github.com/Hi-Folks/statistics/pull/2)\n- add tests for Math class\n\n## 0.1.4 - 2022-01-30\n- quantiles()\n- firstQuartile()\n- thirdQuartile()\n\n## 0.1.3 - 2022-01-29\n- geometricMean(): geometric mean\n- harmonicMean(): harmonic mean and weighted harmonic mean\n\n\n## 0.1.2 - 2022-01-28\n\n- pstdev(): Population standard deviation\n- stdev(): Sample standard deviation\n- pvariance(): variance for a population\n- variance(): variance for a sample\n\n## 0.1.1 - 2022-01-27\n\n- Create Freq class with static method for managing frequencies table\n- Create Stat class with static methods for basic statistical functions like: mean, mode, median, multimode...\n- Refactor Statistics class in order to use logic provided by Freq and Stat class\n- Create ArrUtil with some helpers/functions to manage arrays\n- Add CICD test for PHP 8.1\n\n## Initial release - 2022-01-08\n\nInitial release with:\n\n- getMean()\n- count()\n- median()\n- firstQuartile()\n- thirdQuartile()\n- mode()\n- frequencies(): a frequency is the number of times a value of the data occurs;\n- relativeFrequencies(): a relative frequency is the ratio (fraction or proportion) of the number of times a value of the data occurs in the set of all outcomes to the total number of outcomes;\n- cumulativeFrequencies(): is the accumulation of the previous relative frequencies;\n- cumulativeRelativeFrequencies(): is the accumulation of the previous relative ratio.\n\n## 0.1.0 - 2022-01-08\n\nInitial release with:\n\n- 
getMean()\n- count()\n- median()\n- firstQuartile()\n- thirdQuartile()\n- mode()\n- frequencies(): a frequency is the number of times a value of the data occurs;\n- relativeFrequencies(): a relative frequency is the ratio (fraction or proportion) of the number of times a value of the data occurs in the set of all outcomes to the total number of outcomes;\n- cumulativeFrequencies(): is the accumulation of the previous relative frequencies;\n- cumulativeRelativeFrequencies(): is the accumulation of the previous relative ratio.\n"
  },
  {
    "path": "CODE_OF_CONDUCT.md",
    "content": "# Contributor Covenant Code of Conduct\n\n## Our Commitment\n\nWe, as members, contributors, and leaders, are committed to ensuring that participation in our community is a positive and respectful experience for everyone, regardless of their age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.\n\nOur goal is to create an open, welcoming, diverse, inclusive, and healthy community where everyone feels valued and respected.\n\n## Expectations for Behavior\n\nIn order to maintain a positive community environment, we expect all members, contributors, and leaders to adhere to the following guidelines:\n\n* **Empathy and Kindness**: Treat others with empathy and kindness. Show understanding and consideration towards fellow community members.\n\n* **Respect for Diverse Perspectives**: Be respectful of differing opinions, viewpoints, and experiences. Acknowledge that diversity of thought enriches our community.\n\n* **Constructive Feedback**: Give feedback in a constructive manner and be open to receiving it. Take responsibility for your actions and apologize when necessary, using the experience as an opportunity to learn.\n\n* **Community-Centered Focus**: Prioritize the well-being of the entire community, not just individual interests. Strive for what benefits the community as a whole.\n\n## Unacceptable Behavior\n\nThe following behaviors are not tolerated within our community:\n\n* **Sexualized Language or Imagery**: Avoid using sexualized language or imagery and refrain from making sexual advances.\n\n* **Trolling and Insults**: Do not engage in trolling, insulting or derogatory comments, or personal or political attacks.\n\n* **Harassment**: Harassment, whether public or private, is not acceptable. 
Respect personal boundaries and avoid intrusive behavior.\n\n* **Sharing Private Information**: Do not publish others' private information, such as physical or email addresses, without their explicit permission.\n\n* **Inappropriate Conduct**: Refrain from any conduct that could be considered unprofessional in a professional setting.\n\n## Responsibilities for Enforcement\n\nCommunity leaders are responsible for upholding and enforcing these standards of behavior. They will take appropriate and fair corrective action in response to any behavior that is deemed inappropriate, threatening, offensive, or harmful.\n\nCommunity leaders have the authority to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that do not align with this Code of Conduct. They will communicate the reasons for moderation decisions when necessary.\n\n## Scope\n\nThis Code of Conduct applies in all community spaces. It also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official email address, posting via an official social media account, or acting as an appointed representative at an online or offline event.\n\n## Reporting and Enforcement\n\nIf you encounter abusive, harassing, or otherwise unacceptable behavior, please report it to the community leaders responsible for enforcement via email. All complaints will be promptly and fairly reviewed and investigated.\n\nCommunity leaders are obligated to respect the privacy and security of the reporter of any incident.\n\n## Enforcement Guidelines\n\nCommunity leaders will follow these guidelines to determine the consequences for any actions that violate this Code of Conduct:\n\n### 1. 
Correction\n\n**Community Impact**: Inappropriate language or other unprofessional behavior.\n\n**Consequence**: A private, written warning from community leaders, providing clarity about the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.\n\n### 2. Warning\n\n**Community Impact**: A violation through a single incident or series of actions.\n\n**Consequence**: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.\n\n### 3. Temporary Ban\n\n**Community Impact**: A serious violation of community standards, including sustained inappropriate behavior.\n\n**Consequence**: A temporary ban from any form of interaction or public communication with the community for a specified period. No public or private interaction with the people involved, including unsolicited interactions with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.\n\n### 4. 
Permanent Ban\n\n**Community Impact**: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.\n\n**Consequence**: A permanent ban from any form of public interaction within the community.\n\n## Attribution\n\nThis Code of Conduct is adapted from the [Contributor Covenant][homepage], version 2.0, available at https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.\n\nThe Community Impact Guidelines were inspired by [Mozilla's code of conduct enforcement ladder](https://github.com/mozilla/diversity).\n\n[homepage]: https://www.contributor-covenant.org\n\nFor answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations.\n"
  },
  {
    "path": "CONTRIBUTING.md",
    "content": "# Contributing\n\nYour contributions are highly appreciated, and they will be duly recognized.\n\nBefore you proceed to create an issue or a pull request, please take a moment to familiarize yourself with our contribution guide.\n\n## Etiquette\n\nThis project thrives on the spirit of open source collaboration. Our maintainers dedicate their precious time to create and uphold the source code, and they share it with the hope that it will benefit fellow developers. Let's ensure they don't bear the brunt of abuse or anger for their hard work.\n\nWhen raising issues or submitting pull requests, let's maintain a considerate and respectful tone. Our goal is to exemplify that developers are a courteous and collaborative community.\n\nThe maintainers have the responsibility to evaluate the quality and compatibility of all contributions with the project. Every developer brings unique skills, strengths, and perspectives to the table. Please respect their decisions, even if your submission isn't integrated.\n\n## Relevance\n\nBefore proposing or submitting new features, consider whether they are genuinely beneficial to the broader user base. Open source projects serve a diverse group of developers with varying needs. 
It's important to assess whether your feature is likely to be widely useful.\n\n## Procedure\n\n### Preliminary Steps Before Filing an Issue\n\n- Try to replicate the problem to ensure it's not an isolated occurrence.\n- Verify if your feature suggestion has already been addressed within the project.\n- Review the pull requests to make sure a solution for the bug isn't already underway.\n- Check the pull requests to confirm that the feature isn't already under development.\n\n### Preparing Your Pull Request\n\n- Examine the codebase to prevent duplication of your proposed feature.\n- Check the pull requests to verify that another contributor hasn't already submitted the same feature or fix.\n\n## Opening a Pull Request\n\nTo maintain coding consistency, we adhere to the PSR-12 coding standard and use PHPStan for static code analysis. You can run all checks with the following command:\n\n```bash\ncomposer all-check\n```\nThis command runs:\n\n- PSR-12 Coding Standard checks employing PHP_CodeSniffer.\n- PHPStan analysis at level 8.\n- Execution of all tests from the `./tests/*` directory using PHPUnit.\n\nWe recommend running `composer all-check` before committing and creating a pull request.\n\nWhen working on a pull request, it is advisable to create a new branch that originates from the main branch. This branch can serve as the target branch when you submit your pull request to the original repository.\n\nFor a high-quality pull request, please ensure that you:\n\n- Include tests as part of your patch. We cannot accept submissions lacking tests.\n- Document changes in behavior, keeping the README.md and other pertinent documentation up-to-date.\n- Respect our release cycle. We follow SemVer v2.0.0, and we cannot afford to randomly break public APIs.\n- Stick to one pull request per feature. Multiple changes should be presented through separate pull requests.\n- Provide a cohesive history. 
Each individual commit within your pull request should serve a meaningful purpose. If you have made several intermediary commits during development, please consolidate them before submission.\n\nHappy coding! 🚀\n"
  },
  {
    "path": "LICENSE.md",
    "content": "The MIT License (MIT)\n\nCopyright (c) hi-folks <roberto.butti@gmail.com>\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "<p align=\"center\">\n    <img src=\"https://repository-images.githubusercontent.com/445609326/e2539776-0f8f-4556-be1d-887ea2368813\" alt=\"PHP package for Statistics\">\n</p>\n\n<h1 align=\"center\">\n    Statistics PHP package\n</h1>\n\n<p align=center>\n    <a href=\"https://packagist.org/packages/hi-folks/statistics\">\n        <img src=\"https://img.shields.io/packagist/v/hi-folks/statistics.svg?style=for-the-badge\" alt=\"Latest Version on Packagist\">\n    </a>\n    <a href=\"https://packagist.org/packages/hi-folks/statistics\">\n        <img src=\"https://img.shields.io/packagist/dt/hi-folks/statistics.svg?style=for-the-badge\" alt=\"Total Downloads\">\n    </a>\n    <br>\n    <a href=\"https://github.com/Hi-Folks/statistics/blob/main/.github/workflows/static-code-analysis.yml\">\n        <img src=\"https://img.shields.io/badge/PHPStan-level%208-brightgreen.svg?style=for-the-badge\" alt=\"Static Code analysis\">\n    </a>\n    <img src=\"https://img.shields.io/packagist/l/hi-folks/statistics?style=for-the-badge\" alt=\"Packagist License\">\n    <br>\n    <img src=\"https://img.shields.io/packagist/php-v/hi-folks/statistics?style=for-the-badge\" alt=\"Packagist PHP Version Support\">\n    <img src=\"https://img.shields.io/github/last-commit/hi-folks/statistics?style=for-the-badge\" alt=\"GitHub last commit\">\n</p>\n\n<p align=center>\n    <a href=\"https://github.com/hi-folks/statistics/actions/workflows/run-tests.yml\">\n        <img src=\"https://github.com/hi-folks/statistics/actions/workflows/run-tests.yml/badge.svg?branch=main&style=for-the-badge\" alt=\"Tests\">\n    </a>\n</p>\n\n<p align=center>\n    <i>\n        A PHP package for descriptive statistics, normal distribution, outlier detection, and streaming analytics on numeric data.\n    </i>\n</p>\n\nThis package provides a comprehensive set of statistical functions for PHP: descriptive statistics (mean, median, mode, standard deviation, variance, quantiles), robust measures 
(trimmed mean, weighted median, median absolute deviation), distribution modelling (normal distribution with PDF, CDF, and inverse CDF), outlier detection (z-score and IQR-based), z-scores, percentiles, coefficient of variation, frequency tables, correlation, regression (linear, logarithmic, power, and exponential), kernel density estimation, and O(1) memory streaming statistics.\n\nIt works with any numeric dataset — from sports telemetry and sensor data to race results, survey responses, and financial time series.\n\n**Articles and resources:**\n- [Exploring Olympic Downhill Results with PHP Statistics](https://dev.to/robertobutti/exploring-olympic-downhill-results-with-php-statistics-3eo1) — a step-by-step analysis of 2026 Olympic downhill race data\n- [Statistics with PHP](https://dev.to/robertobutti/statistics-with-php-4pfp) — introduction to the package and its core functions\n- [PHP Statistics on Laravel News](https://laravel-news.com/php-statistics)\n\n> This package is inspired by the [Python statistics module](https://docs.python.org/3/library/statistics.html)\n\n## Installation\n\nYou can install the package via composer:\n\n```bash\ncomposer require hi-folks/statistics\n```\n\n## Usage\n\n### Stat class\n\nStat class has methods to calculate an average or typical value from a population or sample.\nThis class provides methods for calculating mathematical statistics of numeric data.\nThe various mathematical statistics are listed below:\n\n\n| Mathematical Statistic | Description |\n| ---------------------- | ----------- |\n| `mean()` | arithmetic mean or \"average\" of data |\n| `fmean()` | floating-point arithmetic mean, with optional weighting and precision |\n| `trimmedMean()` | trimmed (truncated) mean — mean after removing outliers from each side |\n| `median()` | median or \"middle value\" of data |\n| `weightedMedian()` | weighted median — median with weights, where each value has a different importance |\n| `medianLow()` | low median of data 
|\n| `medianHigh()` | high median of data |\n| `medianGrouped()` | median of grouped data, using interpolation |\n| `mode()` | single mode (most common value) of discrete or nominal data |\n| `multimode()` | list of modes (most common values) of discrete or nominal data |\n| `quantiles()` | cut points dividing the range of a probability distribution into continuous intervals with equal probabilities (supports `exclusive` and `inclusive` methods) |\n| `thirdQuartile()` | 3rd quartile, is the value at which 75 percent of the data is below it |\n| `firstQuartile()` | first quartile, is the value at which 25 percent of the data is below it |\n| `percentile()` | value at any percentile (0–100) with linear interpolation |\n| `pstdev()` | Population standard deviation |\n| `stdev()` | Sample standard deviation |\n| `sem()` | Standard error of the mean (SEM) — measures precision of the sample mean |\n| `meanAbsoluteDeviation()` | mean absolute deviation (MAD) — average distance from the mean |\n| `medianAbsoluteDeviation()` | median absolute deviation — median distance from the median, robust to outliers |\n| `pvariance()` | variance for a population (supports pre-computed mean via `mu`) |\n| `variance()` | variance for a sample (supports pre-computed mean via `xbar`) |\n| `skewness()` | adjusted Fisher-Pearson sample skewness |\n| `pskewness()` | population (biased) skewness |\n| `kurtosis()` | excess kurtosis (sample formula, 0 for normal distribution) |\n| `coefficientOfVariation()` | coefficient of variation (CV%), relative dispersion as percentage |\n| `zscores()` | z-scores for each value — how many standard deviations from the mean |\n| `outliers()` | outlier detection based on z-score threshold |\n| `iqrOutliers()` | outlier detection based on IQR method (box plot whiskers), robust for skewed data |\n| `geometricMean()` | geometric mean |\n| `harmonicMean()` | harmonic mean |\n| `correlation()` | Pearson’s or Spearman’s rank correlation coefficient for two inputs 
|\n| `covariance()` | the sample covariance of two inputs |\n| `linearRegression()` | return the slope and intercept of simple linear regression parameters estimated using ordinary least squares (supports `proportional: true` for regression through the origin) |\n| `logarithmicRegression()` | logarithmic regression — fits `y = a × ln(x) + b`, ideal for diminishing returns patterns (e.g., athletic improvement, learning curves) |\n| `powerRegression()` | power regression — fits `y = a × x^b`, useful for power law relationships |\n| `exponentialRegression()` | exponential regression — fits `y = a × e^(b×x)`, useful for exponential growth or decay |\n| `rSquared()` | coefficient of determination (R²) — proportion of variance explained by linear regression |\n| `confidenceInterval()` | confidence interval for the mean using the normal (z) distribution |\n| `zTest()` | one-sample Z-test — tests whether the sample mean differs significantly from a hypothesized population mean |\n| `tTest()` | one-sample t-test — like z-test but appropriate for small samples where the population standard deviation is unknown |\n| `tTestTwoSample()` | two-sample independent t-test (Welch's) — compares the means of two independent groups without assuming equal variances |\n| `tTestPaired()` | paired t-test — tests whether the mean difference between paired observations is significantly different from zero |\n| `kde()` | kernel density estimation — returns a closure that estimates the probability density (or CDF) at any point |\n| `kdeRandom()` | random sampling from a kernel density estimate — returns a closure that generates random floats from the KDE distribution |\n\n#### Stat::mean( array $data )\nReturn the sample arithmetic mean of the array _$data_.\nThe arithmetic mean is the sum of the data divided by the number of data points. It is commonly called “the average”, although it is only one of many mathematical averages. 
It is a measure of the central location of the data.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$mean = Stat::mean([1, 2, 3, 4, 4]);\n// 2.8\n$mean = Stat::mean([-1.0, 2.5, 3.25, 5.75]);\n// 2.625\n```\n\n#### Stat::fmean( array $data, array|null $weights = null, int|null $precision = null )\nReturn the arithmetic mean of the array `$data`, as a float, with optional weights and precision control.\nThis function behaves like `mean()` but ensures a floating-point result and supports weighted datasets.\nIf `$weights` is provided, it computes the weighted average.\nIf `$precision` is provided, the result is rounded to `$precision` decimal places.\nIf `$precision` is null, no rounding is applied — this may lead to results with long or unexpected decimal expansions due to the nature of floating-point arithmetic in PHP. Using rounding helps ensure cleaner, more predictable output.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n\n// Unweighted mean (same as mean but always float)\n$fmean = Stat::fmean([3.5, 4.0, 5.25]);\n// 4.25\n\n// Weighted mean\n$fmean = Stat::fmean([3.5, 4.0, 5.25], [1, 2, 1]);\n// 4.1875\n\n// Custom precision\n$fmean = Stat::fmean([3.5, 4.0, 5.25], null, 2);\n// 4.25\n\n$fmean = Stat::fmean([3.5, 4.0, 5.25], [1, 2, 1], 3);\n// 4.188\n```\n\nIf the input is empty, or weights are invalid (e.g., length mismatch or sum is zero), an exception is thrown.\nUse this function when you need floating-point accuracy or to apply custom weighting and rounding to your average.\n\n#### Stat::trimmedMean( array $data, float $proportionToCut = 0.1, ?int $round = null )\nReturn the trimmed (truncated) mean of the data. Computes the mean after removing the lowest and highest fraction of values. This is a robust measure of central tendency, less sensitive to outliers than the regular mean.\n\nThe `$proportionToCut` parameter specifies the fraction to trim from **each** side (must be in the range `[0, 0.5)`). 
For example, `0.1` removes the bottom 10% and top 10%.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$mean = Stat::trimmedMean([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], 0.1);\n// 5.5 (outlier 100 and lowest value 1 removed)\n\n$mean = Stat::trimmedMean([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 0.2);\n// 5.5 (removes 2 values from each side)\n\n$mean = Stat::trimmedMean([1, 2, 3, 4, 5], 0.0);\n// 3.0 (no trimming, same as regular mean)\n```\n\n#### Stat::geometricMean( array $data )\nThe geometric mean indicates the central tendency or typical value of the data using the product of the values (as opposed to the arithmetic mean which uses their sum).\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$mean = Stat::geometricMean([54, 24, 36], 1);\n// 36.0\n```\n#### Stat::harmonicMean( array $data )\nThe harmonic mean is the reciprocal of the arithmetic mean() of the reciprocals of the data. For example, the harmonic mean of three values a, b, and c will be equivalent to 3/(1/a + 1/b + 1/c). If one of the values is zero, the result will be zero.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$mean = Stat::harmonicMean([40, 60], null, 1);\n// 48.0\n```\n\nYou can also calculate the harmonic weighted mean.\nSuppose a car travels 40 km/hr for 5 km, and when traffic clears, speeds up to 60 km/hr for the remaining 30 km of the journey. 
What is the average speed?\n\n```php\nuse HiFolks\\Statistics\\Stat;\nStat::harmonicMean([40, 60], [5, 30], 1);\n// 56.0\n```\nwhere:\n- `[40, 60]` are the values (the two speeds)\n- `[5, 30]` are the weights, one per value in the same order (here, the distance covered at each speed)\n- `1` is the number of decimal places to round the result to\n\n\n#### Stat::median( array $data )\nReturn the median (middle value) of numeric data, using the common “mean of middle two” method.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$median = Stat::median([1, 3, 5]);\n// 3\n$median = Stat::median([1, 3, 5, 7]);\n// 4\n```\n\n#### Stat::weightedMedian( array $data, array $weights, ?int $round = null )\nReturn the weighted median of the data. The weighted median is the value where the cumulative weight reaches 50% of the total weight. This is useful for survey data, financial analysis, or any dataset where observations have different importance.\n\nAll weights must be positive numbers and the weights array must have the same length as the data array.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$median = Stat::weightedMedian([1, 2, 3], [1, 1, 1]);\n// 2.0 (equal weights, same as regular median)\n\n$median = Stat::weightedMedian([1, 2, 3], [1, 1, 10]);\n// 3.0 (heavy weight on 3 pulls the median)\n\n$median = Stat::weightedMedian([1, 2, 3, 4], [1, 1, 1, 1]);\n// 2.5 (equal weights, even count — averages the two middle values)\n```\n\n#### Stat::medianLow( array $data )\nReturn the low median of numeric data.\nThe low median is always a member of the data set. When the number of data points is odd, the middle value is returned. When it is even, the smaller of the two middle values is returned.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$median = Stat::medianLow([1, 3, 5]);\n// 3\n$median = Stat::medianLow([1, 3, 5, 7]);\n// 3\n```\n\n\n\n#### Stat::medianHigh( array $data )\nReturn the high median of data.\nThe high median is always a member of the data set. 
When the number of data points is odd, the middle value is returned. When it is even, the larger of the two middle values is returned.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$median = Stat::medianHigh([1, 3, 5]);\n// 3\n$median = Stat::medianHigh([1, 3, 5, 7]);\n// 5\n```\n\n#### Stat::medianGrouped( array $data, float $interval = 1.0 )\nEstimate the median for numeric data that has been grouped or binned around the midpoints of consecutive, fixed-width intervals.\nThe `$interval` parameter specifies the width of each bin (default `1.0`). This function uses interpolation within the median interval, assuming values are evenly distributed across each bin.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$median = Stat::medianGrouped([1, 2, 2, 3, 4, 4, 4, 4, 4, 5]);\n// 3.7\n$median = Stat::medianGrouped([1, 3, 3, 5, 7]);\n// 3.25\n$median = Stat::medianGrouped([1, 3, 3, 5, 7], 2);\n// 3.5\n```\n\nFor example, demographic data summarized into ten-year age groups:\n```php\nuse HiFolks\\Statistics\\Stat;\n// 172 people aged 20-30, 484 aged 30-40, 387 aged 40-50, etc.\n$data = array_merge(\n    array_fill(0, 172, 25),\n    array_fill(0, 484, 35),\n    array_fill(0, 387, 45),\n    array_fill(0, 22, 55),\n    array_fill(0, 6, 65),\n);\nround(Stat::medianGrouped($data, 10), 1);\n// 37.5\n```\n\n#### Stat::quantiles( array $data, $n=4, $round=null, $method='exclusive'  )\nDivide data into n continuous intervals with equal probability. Returns a list of n - 1 cut points separating the intervals.\nSet n to 4 for quartiles (the default). Set n to 10 for deciles. Set n to 100 for percentiles which gives the 99 cut points that separate data into 100 equal-sized groups.\n\nThe `$method` parameter controls the interpolation method:\n- `'exclusive'` (default): uses `m = count + 1`. Suitable for sampled data that may have more extreme values beyond the sample.\n- `'inclusive'`: uses `m = count - 1`. Suitable for population data or samples known to include the most extreme values. 
The minimum value is treated as the 0th percentile and the maximum as the 100th percentile.\n\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$quantiles = Stat::quantiles([98, 90, 70,18,92,92,55,83,45,95,88]);\n// [ 55.0, 88.0, 92.0 ]\n$quantiles = Stat::quantiles([105, 129, 87, 86, 111, 111, 89, 81, 108, 92, 110,100, 75, 105, 103, 109, 76, 119, 99, 91, 103, 129,106, 101, 84, 111, 74, 87, 86, 103, 103, 106, 86,111, 75, 87, 102, 121, 111, 88, 89, 101, 106, 95,103, 107, 101, 81, 109, 104], 10);\n// [81.0, 86.2, 89.0, 99.4, 102.5, 103.6, 106.0, 109.8, 111.0]\n\n// Inclusive method\n$quantiles = Stat::quantiles([1, 2, 3, 4, 5], method: 'inclusive');\n// [2.0, 3.0, 4.0]\n```\n#### Stat::firstQuartile( array $data, $round=null  )\nThe lower quartile, or first quartile (Q1), is the value under which 25% of data points are found when they are arranged in increasing order.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$percentile = Stat::firstQuartile([98, 90, 70,18,92,92,55,83,45,95,88]);\n// 55.0\n```\n\n#### Stat::thirdQuartile( array $data, $round=null  )\nThe upper quartile, or third quartile (Q3), is the value under which 75% of data points are found when arranged in increasing order.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$percentile = Stat::thirdQuartile([98, 90, 70,18,92,92,55,83,45,95,88]);\n// 92.0\n```\n\n#### Stat::percentile( array $data, float $p, ?int $round = null )\nReturn the value at the given percentile of the data, using linear interpolation between adjacent data points (exclusive method, consistent with `quantiles()`).\n\nThe percentile `$p` must be between 0 and 100. 
Requires at least 2 data points.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$value = Stat::percentile([10, 20, 30, 40, 50, 60, 70, 80, 90, 100], 50);\n// 55.0 (median)\n\n$value = Stat::percentile([10, 20, 30, 40, 50, 60, 70, 80, 90, 100], 90);\n// 91.0\n```\n\n#### Stat::pstdev( array $data )\nReturn the **Population** Standard Deviation, a measure of the amount of variation or dispersion of a set of values.\nA low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$stdev = Stat::pstdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75]);\n// 0.986893273527251\n$stdev = Stat::pstdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75], 4);\n// 0.9869\n```\n\n#### Stat::stdev( array $data )\nReturn the **Sample** Standard Deviation, a measure of the amount of variation or dispersion of a set of values.\nA low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$stdev = Stat::stdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75]);\n// 1.0810874155219827\n$stdev = Stat::stdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75], 4);\n// 1.0811\n```\n\n#### Stat::sem( array $data, ?int $round = null )\nReturn the standard error of the mean (SEM). SEM measures how precisely the sample mean estimates the population mean. 
It decreases as the sample size grows.\n\nFormula: `stdev / sqrt(n)`\n\nRequires at least 2 data points.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$sem = Stat::sem([2, 4, 4, 4, 5, 5, 7, 9]);\n// 0.7559...\n\n$sem = Stat::sem([2, 4, 4, 4, 5, 5, 7, 9], 4);\n// 0.7559\n```\n\n#### Stat::meanAbsoluteDeviation( array $data, ?int $round = null )\nReturn the mean absolute deviation (MAD) — the average of the absolute deviations from the mean.\n\nMAD is a simple, intuitive measure of dispersion: it tells you \"on average, how far values are from the mean\". Unlike standard deviation, it does not square the differences, making it easier to interpret and somewhat less sensitive to outliers.\n\nUse MAD when you want a straightforward, interpretable measure of spread, especially for reporting to non-technical audiences.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$mad = Stat::meanAbsoluteDeviation([1, 2, 3, 4, 5]);\n// 1.2\n\n$mad = Stat::meanAbsoluteDeviation([1, 2, 3, 4, 5], 1);\n// 1.2\n```\n\n#### Stat::medianAbsoluteDeviation( array $data, ?int $round = null )\nReturn the median absolute deviation — the median of the absolute deviations from the median.\n\nThis is one of the most **robust measures of dispersion** available. Because it uses the median (not the mean) as the center and takes the median (not the mean) of deviations, it is highly resistant to outliers. 
Even if up to half the data points are extreme, the median absolute deviation remains stable.\n\nUse it when your data may contain outliers, when you need a robust alternative to standard deviation, or for outlier detection (values far from the median, measured in units of the median absolute deviation, are likely outliers).\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$mad = Stat::medianAbsoluteDeviation([1, 2, 3, 4, 5]);\n// 1.0\n\n// Robust to outliers — the outlier 1000 does not affect the result:\n$mad = Stat::medianAbsoluteDeviation([1, 2, 3, 4, 1000]);\n// 1.0\n```\n\n#### Stat::variance ( array $data, ?int $round = null, int|float|null $xbar = null)\nVariance is a measure of dispersion of data points from the mean.\nLow variance indicates that data points are generally similar and do not vary widely from the mean.\nHigh variance indicates that data values have greater variability and are more widely dispersed from the mean.\n\nTo calculate the variance from a *sample*:\n```php\nuse HiFolks\\Statistics\\Stat;\n$variance = Stat::variance([2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]);\n// 1.3720238095238095\n```\n\nIf you have already computed the mean, you can pass it via `xbar` to avoid recalculation:\n```php\nuse HiFolks\\Statistics\\Stat;\n$data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5];\n$mean = Stat::mean($data);\n$variance = Stat::variance($data, xbar: $mean);\n```\n\nTo calculate the variance of a whole population rather than a sample, use the `pvariance()` method. You can optionally pass the population mean via `mu`:\n```php\nuse HiFolks\\Statistics\\Stat;\n$variance = Stat::pvariance([0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25]);\n// 1.25\n\n// With pre-computed mean:\n$data = [0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25];\n$mu = Stat::mean($data);\n$variance = Stat::pvariance($data, mu: $mu);\n```\n\n\n#### Stat::skewness ( array $data, ?int $round = null )\nSkewness is a measure of the asymmetry of a distribution. 
The adjusted Fisher-Pearson formula is used, which is the same as Excel's `SKEW()` and Python's `scipy.stats.skew(bias=False)`.\n\nA positive skewness indicates a right-skewed distribution (tail extends to the right), while a negative skewness indicates a left-skewed distribution. A symmetric distribution has a skewness of 0.\n\nRequires at least 3 data points.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$skewness = Stat::skewness([1, 2, 3, 4, 5]);\n// 0.0 (symmetric)\n\n$skewness = Stat::skewness([1, 1, 1, 1, 1, 10]);\n// positive (right-skewed)\n```\n\nIf you need the population (biased) skewness instead of the sample skewness, use `pskewness()`. This is equivalent to `scipy.stats.skew(bias=True)`:\n```php\nuse HiFolks\\Statistics\\Stat;\n$pskewness = Stat::pskewness([1, 1, 1, 1, 1, 10]);\n```\n\n#### Stat::kurtosis ( array $data, ?int $round = null )\nKurtosis measures the \"tailedness\" of a distribution — how much data lives in the extreme tails compared to a normal distribution. This method returns the **excess kurtosis** using the sample formula, which is the same as Excel's `KURT()` and Python's `scipy.stats.kurtosis(bias=False)`.\n\nA normal distribution has excess kurtosis of 0. Positive values (leptokurtic) indicate heavier tails and more outliers. Negative values (platykurtic) indicate lighter tails and fewer outliers.\n\nRequires at least 4 data points.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$kurtosis = Stat::kurtosis([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);\n// negative (platykurtic, lighter tails than normal)\n\n$kurtosis = Stat::kurtosis([1, 2, 2, 2, 2, 2, 2, 2, 2, 50]);\n// positive (leptokurtic, heavier tails due to outlier)\n```\n\n#### Stat::coefficientOfVariation( array $data, ?int $round = null, bool $population = false )\nThe coefficient of variation (CV) is the ratio of the standard deviation to the mean, expressed as a percentage. 
It measures relative variability and is useful for comparing dispersion across datasets with different units or scales.\n\nBy default it uses the sample standard deviation. Pass `population: true` to use the population standard deviation instead.\n\nRequires at least 2 data points (sample) or 1 (population). Throws if the mean is zero.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$cv = Stat::coefficientOfVariation([10, 20, 30, 40, 50]);\n// ~52.70 (sample)\n\n$cv = Stat::coefficientOfVariation([10, 20, 30, 40, 50], round: 2);\n// 52.7\n\n$cv = Stat::coefficientOfVariation([10, 20, 30, 40, 50], population: true);\n// ~47.14 (population)\n```\n\n#### Stat::zscores( array $data, ?int $round = null )\nReturn the z-score for each value in the dataset. A z-score indicates how many standard deviations a value is from the mean. Z-scores are useful for standardizing data, comparing values from different distributions, and identifying outliers.\n\nThe z-scores of any dataset always sum to zero, and values beyond ±2 or ±3 are typically considered unusual or outliers.\n\nRequires at least 2 data points and non-zero standard deviation.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$zscores = Stat::zscores([2, 4, 4, 4, 5, 5, 7, 9]);\n// array of z-scores, one per value\n\n$zscores = Stat::zscores([2, 4, 4, 4, 5, 5, 7, 9], 2);\n// z-scores rounded to 2 decimal places\n```\n\n#### Stat::outliers( array $data, float $threshold = 3.0 )\nReturn values from the dataset that are outliers based on z-score threshold. A value is considered an outlier if its absolute z-score exceeds the threshold.\n\nThe default threshold of 3.0 is a widely used convention — in a normal distribution, about 99.7% of values fall within 3 standard deviations of the mean, so values beyond that are rare. Use a lower threshold (e.g. 
2.0) to flag more values, or a higher one to flag only the most extreme.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$outliers = Stat::outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], 2.0);\n// [100] (its z-score is about 2.8; with only 10 values no z-score can exceed 3.0)\n\n$outliers = Stat::outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 1.0);\n// values more than 1 stdev from the mean\n```\n\n#### Stat::iqrOutliers( array $data, float $factor = 1.5 )\nReturn values that are outliers based on the Interquartile Range (IQR) method. A value is an outlier if it falls below `Q1 - factor * IQR` or above `Q3 + factor * IQR`. This is the same method used for box plot whiskers.\n\nUnlike z-score based detection, the IQR method is **robust** — it does not assume a normal distribution and is not influenced by extreme values themselves. This makes it the preferred choice for skewed data or when the dataset may already contain outliers that would distort the mean and standard deviation.\n\nUse `factor: 1.5` (default) for mild outliers, or `factor: 3.0` for extreme outliers only.\n\n**Example: Ski downhill race times**\n\nIn a ski downhill race, most athletes finish between 108–116 seconds. A time of 200s (e.g. a crash/DNF) or 50s (e.g. 
a timing error) would be flagged as outliers:\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$times = [110.2, 112.5, 108.9, 115.3, 111.7, 114.0, 109.8, 113.6, 200.0, 50.0];\n$outliers = Stat::iqrOutliers($times);\n// [200.0, 50.0] — the crash and the timing error are detected\n\n$extremeOnly = Stat::iqrOutliers($times, 3.0);\n// only the most extreme values\n```\n\n#### Stat::covariance ( array $x , array $y )\nReturn the sample covariance of two inputs *$x* and *$y*.\nCovariance is a measure of the joint variability of two inputs.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$covariance = Stat::covariance(\n    [1, 2, 3, 4, 5, 6, 7, 8, 9],\n    [1, 2, 3, 1, 2, 3, 1, 2, 3]\n);\n// 0.75\n```\n\n```php\n$covariance = Stat::covariance(\n    [1, 2, 3, 4, 5, 6, 7, 8, 9],\n    [9, 8, 7, 6, 5, 4, 3, 2, 1]\n);\n// -7.5\n```\n\n#### Stat::correlation ( array $x , array $y, string $method = 'linear' )\nReturn Pearson’s correlation coefficient for two inputs. Pearson’s correlation coefficient r takes values between -1 and +1. It measures the strength and direction of the linear relationship, where +1 indicates a perfect positive linear relationship, -1 a perfect negative linear relationship, and 0 no linear relationship.\n\nUse `$method = 'ranked'` for Spearman’s rank correlation, which measures monotonic relationships (not just linear). 
Spearman’s correlation is computed by applying Pearson’s formula to the ranks of the data.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$correlation = Stat::correlation(\n    [1, 2, 3, 4, 5, 6, 7, 8, 9],\n    [1, 2, 3, 4, 5, 6, 7, 8, 9]\n);\n// 1.0\n```\n\n```php\n$correlation = Stat::correlation(\n    [1, 2, 3, 4, 5, 6, 7, 8, 9],\n    [9, 8, 7, 6, 5, 4, 3, 2, 1]\n);\n// -1.0\n```\n\nSpearman’s rank correlation (non-linear but monotonic relationship):\n```php\n$correlation = Stat::correlation(\n    [1, 2, 3, 4, 5],\n    [1, 4, 9, 16, 25],\n    'ranked'\n);\n// 1.0\n```\n\n#### Stat::linearRegression ( array $x , array $y , bool $proportional = false )\nReturn the slope and intercept of a simple linear regression, estimated using ordinary least squares.\nSimple linear regression describes the relationship between an independent variable *$x* and a dependent variable *$y* in terms of a linear function.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$years = [1971, 1975, 1979, 1982, 1983];\n$films_total = [1, 2, 3, 4, 5];\nlist($slope, $intercept) = Stat::linearRegression(\n    $years,\n    $films_total\n);\n// $slope ≈ 0.31\n// $intercept ≈ -610.18\n```\nWhat does the fitted line predict for 2022, according to the samples above?\n\n```php\nround($slope * 2022 + $intercept);\n// 17.0\n```\n\nWhen `proportional` is `true`, the regression line is forced through the origin (intercept = 0). This is useful when the relationship between *$x* and *$y* is known to be proportional:\n\n```php\nlist($slope, $intercept) = Stat::linearRegression(\n    [1, 2, 3, 4, 5],\n    [2, 4, 6, 8, 10],\n    proportional: true,\n);\n// $slope = 2.0\n// $intercept = 0.0\n```\n\n#### Stat::logarithmicRegression( array $x, array $y )\nFit a logarithmic model **y = a × ln(x) + b**. Returns `[a, b]`.\n\nThis model naturally captures diminishing returns — fast initial change that gradually flattens. 
It is useful for data where early gains are large but improvement slows over time, such as athletic performance trends, learning curves, or market saturation.\n\nAll x values must be positive (you cannot take the logarithm of zero or negative numbers).\n\nInternally, this transforms x to ln(x) and applies linear regression, so it leverages the same robust ordinary least squares implementation.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n\n// Simulated weekly running paces (seconds/km) — diminishing improvement\n$weeks = [1, 2, 3, 4, 5, 6, 7, 8];\n$paces = [350, 342, 337, 333, 330, 328, 326, 325];\n\n[$a, $b] = Stat::logarithmicRegression($weeks, $paces);\n// $a = -12.33 (pace drops by 12.33 sec per unit of ln(week))\n// $b = 350.2\n\n// Predict pace at week 12:\n$predicted = $a * log(12) + $b;\n// ~320 seconds = 5:20/km\n```\n\nCompare with linear regression to see which fits better:\n\n```php\n// R² for logarithmic model (transform x first)\n$logWeeks = array_map(fn($v) => log($v), $weeks);\n$r2Log = Stat::rSquared($logWeeks, $paces);\n// 0.9987\n\n// R² for linear model\n$r2Linear = Stat::rSquared($weeks, $paces);\n// 0.9176\n\n// Logarithmic wins — the data has diminishing returns\n```\n\n#### Stat::powerRegression( array $x, array $y )\nFit a power model **y = a × x^b**. Returns `[a, b]`.\n\nPower regression is useful for data following power law relationships (e.g., scaling laws, allometric relationships). Both x and y values must be positive.\n\nInternally, this linearizes as ln(y) = ln(a) + b × ln(x) and applies linear regression.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n\n// Data following y = 3 * x^2\n$x = [1, 2, 3, 4, 5];\n$y = [3, 12, 27, 48, 75];\n\n[$a, $b] = Stat::powerRegression($x, $y);\n// $a = 3.0\n// $b = 2.0 (the exponent)\n```\n\n#### Stat::exponentialRegression( array $x, array $y )\nFit an exponential model **y = a × e^(b×x)**. 
Returns `[a, b]`.\n\nExponential regression is useful for data with exponential growth (positive b) or decay (negative b), such as population growth, compound interest, or radioactive decay. All y values must be positive.\n\nInternally, this linearizes as ln(y) = ln(a) + b × x and applies linear regression.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n\n// Data following y = 2 * e^(0.5*x)\n$x = [1, 2, 3, 4, 5];\n$y = [3.30, 5.44, 8.96, 14.78, 24.36];\n\n[$a, $b] = Stat::exponentialRegression($x, $y);\n// $a ≈ 2.0\n// $b ≈ 0.5\n```\n\n#### Stat::rSquared( array $x, array $y, bool $proportional = false, ?int $round = null )\nReturn the coefficient of determination (R²) — the proportion of variance in the dependent variable explained by the linear regression model. Values range from 0 (no explanatory power) to 1 (perfect fit).\n\nRequires at least 2 data points and arrays of the same length.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n$r2 = Stat::rSquared([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]);\n// 1.0 (perfect linear relationship)\n\n$r2 = Stat::rSquared(\n    [1971, 1975, 1979, 1982, 1983],\n    [1, 2, 3, 4, 5],\n    round: 2,\n);\n// 0.96\n```\n\nWith proportional regression (through the origin):\n\n```php\n$r2 = Stat::rSquared(\n    [1, 2, 3, 4, 5],\n    [2, 4, 6, 8, 10],\n    proportional: true,\n);\n// 1.0\n```\n\nTo compute R² for non-linear models, transform the data the same way the regression method does:\n\n```php\n// R² for logarithmic regression\n$logX = array_map(fn($v) => log($v), $x);\n$r2 = Stat::rSquared($logX, $y);\n\n// R² for power regression\n$logX = array_map(fn($v) => log($v), $x);\n$logY = array_map(fn($v) => log($v), $y);\n$r2 = Stat::rSquared($logX, $logY);\n\n// R² for exponential regression\n$logY = array_map(fn($v) => log($v), $y);\n$r2 = Stat::rSquared($x, $logY);\n```\n\n#### Stat::confidenceInterval( array $data, float $confidenceLevel = 0.95, ?int $round = null )\nReturn the confidence interval for the mean using the normal (z) 
distribution.\n\nComputes: `mean ± z * (stdev / √n)`, where the z-critical value is derived from the inverse normal CDF.\n\nRequires at least 2 data points. The confidence level must be between 0 and 1 exclusive.\n\n```php\nuse HiFolks\\Statistics\\Stat;\n[$lower, $upper] = Stat::confidenceInterval([2, 4, 4, 4, 5, 5, 7, 9]);\n// 95% CI: [3.52, 6.48] (approximately)\n\n[$lower, $upper] = Stat::confidenceInterval([2, 4, 4, 4, 5, 5, 7, 9], confidenceLevel: 0.99);\n// 99% CI: wider interval\n\n[$lower, $upper] = Stat::confidenceInterval([2, 4, 4, 4, 5, 5, 7, 9], round: 2);\n// [3.52, 6.48]\n```\n\n#### Stat::zTest( array $data, float $populationMean, Alternative $alternative = Alternative::TwoSided, ?int $round = null )\nPerform a one-sample Z-test for the mean. Tests whether the sample mean differs significantly from a hypothesized population mean using the normal distribution.\n\nReturns an associative array with `zScore` and `pValue`. The alternative hypothesis can be `TwoSided` (default), `Greater`, or `Less`.\n\nRequires at least 2 data points.\n\n```php\nuse HiFolks\\Statistics\\Stat;\nuse HiFolks\\Statistics\\Enums\\Alternative;\n\n$result = Stat::zTest([2, 4, 4, 4, 5, 5, 7, 9], populationMean: 3.0);\n// ['zScore' => 2.6457..., 'pValue' => 0.0081...]\n\n$result = Stat::zTest([2, 4, 4, 4, 5, 5, 7, 9], populationMean: 3.0, alternative: Alternative::Greater);\n// one-tailed test: is the sample mean greater than 3?\n\n$result = Stat::zTest([2, 4, 4, 4, 5, 5, 7, 9], populationMean: 3.0, round: 4);\n// ['zScore' => 2.6458, 'pValue' => 0.0081]\n```\n\n#### Stat::tTest( array $data, float $populationMean, Alternative $alternative = Alternative::TwoSided, ?int $round = null )\nPerform a one-sample t-test for the mean. Tests whether the sample mean differs significantly from a hypothesized population mean using the Student's t-distribution. 
Unlike the z-test, the t-test is appropriate for small samples where the population standard deviation is unknown.\n\nReturns an associative array with `tStatistic`, `pValue`, and `degreesOfFreedom`. The alternative hypothesis can be `TwoSided` (default), `Greater`, or `Less`.\n\nRequires at least 2 data points.\n\n```php\nuse HiFolks\\Statistics\\Stat;\nuse HiFolks\\Statistics\\Enums\\Alternative;\n\n$result = Stat::tTest([2, 4, 4, 4, 5, 5, 7, 9], populationMean: 3.0);\n// ['tStatistic' => 2.6457..., 'pValue' => 0.0331..., 'degreesOfFreedom' => 7]\n\n$result = Stat::tTest([2, 4, 4, 4, 5, 5, 7, 9], populationMean: 3.0, alternative: Alternative::Greater);\n// one-tailed test: is the sample mean greater than 3?\n\n$result = Stat::tTest([2, 4, 4, 4, 5, 5, 7, 9], populationMean: 3.0, round: 4);\n// ['tStatistic' => 2.6458, 'pValue' => 0.0331, 'degreesOfFreedom' => 7]\n```\n\n#### Stat::tTestTwoSample( array $data1, array $data2, Alternative $alternative = Alternative::TwoSided, ?int $round = null )\nPerform a two-sample independent t-test (Welch's t-test). Compares the means of two independent groups without assuming equal variances. Uses the Welch–Satterthwaite approximation for degrees of freedom.\n\nReturns an associative array with `tStatistic`, `pValue`, and `degreesOfFreedom`. 
The alternative hypothesis can be `TwoSided` (default), `Greater`, or `Less`.\n\nRequires at least 2 data points in each sample.\n\n```php\nuse HiFolks\\Statistics\\Stat;\nuse HiFolks\\Statistics\\Enums\\Alternative;\n\n// Compare two groups\n$group1 = [30.02, 29.99, 30.11, 29.97, 30.01, 29.99];\n$group2 = [29.89, 29.93, 29.72, 29.98, 30.02, 29.98];\n$result = Stat::tTestTwoSample($group1, $group2);\n// ['tStatistic' => 1.6245..., 'pValue' => 0.1444..., 'degreesOfFreedom' => 6.84...]\n\n// One-tailed test: is group1 mean greater than group2 mean?\n$result = Stat::tTestTwoSample($group1, $group2, alternative: Alternative::Greater);\n\n// Groups can have different sizes\n$result = Stat::tTestTwoSample([1, 2, 3, 4, 5, 6, 7, 8], [3, 4, 5], round: 4);\n```\n\n#### Stat::tTestPaired( array $data1, array $data2, Alternative $alternative = Alternative::TwoSided, ?int $round = null )\nPerform a paired t-test. Tests whether the mean difference between paired observations (e.g. before/after measurements on the same subjects) is significantly different from zero.\n\nReturns an associative array with `tStatistic`, `pValue`, and `degreesOfFreedom`. 
Both arrays must have the same length.\n\nRequires at least 2 paired observations.\n\n```php\nuse HiFolks\\Statistics\\Stat;\nuse HiFolks\\Statistics\\Enums\\Alternative;\n\n// Before and after treatment measurements\n$before = [200, 190, 210, 220, 215, 205, 195, 225];\n$after  = [192, 186, 198, 212, 208, 198, 188, 215];\n$result = Stat::tTestPaired($before, $after);\n// ['tStatistic' => 5.715..., 'pValue' => 0.0007..., 'degreesOfFreedom' => 7]\n\n// One-tailed: did the treatment decrease the values?\n$result = Stat::tTestPaired($before, $after, alternative: Alternative::Greater);\n\n$result = Stat::tTestPaired($before, $after, round: 4);\n```\n\n#### Stat::kde ( array $data , float $h , KdeKernel $kernel = KdeKernel::Normal , bool $cumulative = false )\nCreate a continuous probability density function (or cumulative distribution function) from discrete sample data using Kernel Density Estimation.\nReturns a `Closure` that can be called with any point to estimate the density (or CDF value).\n\nSupported kernels: `KdeKernel::Normal` (alias `KdeKernel::Gauss`), `KdeKernel::Logistic`, `KdeKernel::Sigmoid`, `KdeKernel::Rectangular` (alias `KdeKernel::Uniform`), `KdeKernel::Triangular`, `KdeKernel::Parabolic` (alias `KdeKernel::Epanechnikov`), `KdeKernel::Quartic` (alias `KdeKernel::Biweight`), `KdeKernel::Triweight`, `KdeKernel::Cosine`.\n\n```php\nuse HiFolks\\Statistics\\Stat;\nuse HiFolks\\Statistics\\Enums\\KdeKernel;\n\n$data = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2];\n$f = Stat::kde($data, h: 1.5);\n$f(2.5);\n// estimated density at x = 2.5\n```\n\nUsing a different kernel:\n\n```php\n$f = Stat::kde($data, h: 1.5, kernel: KdeKernel::Triangular);\n$f(2.5);\n```\n\nCumulative distribution function:\n\n```php\n$F = Stat::kde($data, h: 1.5, cumulative: true);\n$F(2.5);\n// estimated CDF at x = 2.5 (probability that a value is <= 2.5)\n```\n\n#### Stat::kdeRandom ( array $data , float $h , KdeKernel $kernel = KdeKernel::Normal , ?int $seed = null )\nGenerate random samples 
from a Kernel Density Estimate.\nReturns a `Closure` that, when called, produces a random float drawn from the KDE distribution defined by the data and bandwidth.\n\nSupported kernels: `KdeKernel::Normal` (alias `KdeKernel::Gauss`), `KdeKernel::Logistic`, `KdeKernel::Sigmoid`, `KdeKernel::Rectangular` (alias `KdeKernel::Uniform`), `KdeKernel::Triangular`, `KdeKernel::Parabolic` (alias `KdeKernel::Epanechnikov`), `KdeKernel::Quartic` (alias `KdeKernel::Biweight`), `KdeKernel::Triweight`, `KdeKernel::Cosine`.\n\n```php\nuse HiFolks\\Statistics\\Stat;\nuse HiFolks\\Statistics\\Enums\\KdeKernel;\n\n$data = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2];\n$rand = Stat::kdeRandom($data, h: 1.5, seed: 8675309);\n$samples = [];\nfor ($i = 0; $i < 10; $i++) {\n    $samples[] = round($rand(), 1);\n}\n// [2.5, 3.3, -1.8, 7.3, -2.1, 4.6, 4.4, 5.9, -3.2, -1.6]\n```\n\nUsing a different kernel:\n\n```php\n$rand = Stat::kdeRandom($data, h: 1.5, kernel: KdeKernel::Triangular, seed: 42);\n$rand();\n```\n\n### Freq class\nWith the *Statistics* package you can calculate a frequency table.\nA frequency table lists the frequency of various outcomes in a sample.\nEach entry in the table contains the frequency or count of the occurrences of values within a particular group or interval.\n\n\n#### Freq::frequencies( array $data )\n```php\nuse HiFolks\\Statistics\\Freq;\n\n$fruits = ['🍈', '🍈', '🍈', '🍉','🍉','🍉','🍉','🍉','🍌'];\n$freqTable = Freq::frequencies($fruits);\nprint_r($freqTable);\n```\nYou can see the frequency table as an array:\n```\nArray\n(\n    [🍈] => 3\n    [🍉] => 5\n    [🍌] => 1\n)\n```\n#### Freq::relativeFrequencies( array $data )\nYou can retrieve the frequency table in relative format (percentage):\n```php\n$freqTable = Freq::relativeFrequencies($fruits, 2);\nprint_r($freqTable);\n```\nYou can see the frequency table as an array with the percentage of occurrences for each value:\n```\nArray\n(\n    [🍈] => 33.33\n    [🍉] => 55.56\n    [🍌] => 11.11\n)\n```\n\n#### Freq::frequencyTableBySize( array $data , 
$size)\n\nIf you want to create a frequency table based on class (ranges of values) you can use frequencyTableBySize.\nThe first parameter is the array, and the second one is the size of classes.\n\nCalculate the frequency table with classes. Each group size is 4\n```php\n$data = [1,1,1,4,4,5,5,5,6,7,8,8,8,9,9,9,9,9,9,10,10,11,12,12,\n    13,14,14,15,15,16,16,16,16,17,17,17,18,18, ];\n$result = \\HiFolks\\Statistics\\Freq::frequencyTableBySize($data, 4);\nprint_r($result);\n/*\nArray\n(\n    [1] => 5\n    [5] => 8\n    [9] => 11\n    [13] => 9\n    [17] => 5\n)\n */\n```\n\n#### Freq::frequencyTable()\n\nIf you want to create a frequency table based on class (ranges of values) you can use frequencyTable.\nThe first parameter is the array, and the second one is the number of classes.\n\nCalculate the frequency table with 5 classes.\n```php\n$data = [1,1,1,4,4,5,5,5,6,7,8,8,8,9,9,9,9,9,9,10,10,11,12,12,\n    13,14,14,15,15,16,16,16,16,17,17,17,18,18, ];\n$result = \\HiFolks\\Statistics\\Freq::frequencyTable($data, 5);\nprint_r($result);\n/*\nArray\n(\n    [1] => 5\n    [5] => 8\n    [9] => 11\n    [13] => 9\n    [17] => 5\n)\n */\n```\n\n\n### Statistics class\n\nThe methods provided by the `Freq` and the `Stat` classes are mainly **static** methods.\nIf you prefer to use an object instance for calculating statistics you can choose to use an instance of the `Statistics` class.\nSo for calling the statistics methods, you can use your object instance of the `Statistics` class.\n\nFor example for calculating the mean, you can obtain the `Statistics` object via the `make()` static method, and then use the new object `$stat` like in the following example:\n\n```php\n$stat = HiFolks\\Statistics\\Statistics::make(\n    [3,5,4,7,5,2]\n);\necho $stat->valuesToString(5) . PHP_EOL;\n// 2,3,4,5,5\necho \"Mean              : \" . $stat->mean() . PHP_EOL;\n// Mean              : 4.3333333333333\necho \"Count             : \" . $stat->count() . 
PHP_EOL;\n// Count             : 6\necho \"Median            : \" . $stat->median() . PHP_EOL;\n// Median            : 4.5\necho \"First Quartile  : \" . $stat->firstQuartile() . PHP_EOL;\n// First Quartile  : 2.5\necho \"Third Quartile : \" . $stat->thirdQuartile() . PHP_EOL;\n// Third Quartile : 5\necho \"Mode              : \" . $stat->mode() . PHP_EOL;\n// Mode              : 5\n```\n\n#### Calculate Frequency Table\n\nThe `Statistics` class has several methods for generating a frequency table:\n- `frequencies()`: a frequency is the number of times a value of the data occurs;\n- `relativeFrequencies()`: a relative frequency is the ratio (fraction or proportion) of the number of times a value of the data occurs in the set of all outcomes to the total number of outcomes;\n- `cumulativeFrequencies()`: a cumulative frequency is the running total of the frequencies up to and including each value;\n- `cumulativeRelativeFrequencies()`: a cumulative relative frequency is the running total of the relative frequencies up to and including each value.\n\n```php\nuse HiFolks\\Statistics\\Statistics;\n\n$s = Statistics::make(\n    [98, 90, 70,18,92,92,55,83,45,95,88,76]\n);\n$a = $s->frequencies();\nprint_r($a);\n/*\nArray\n(\n    [18] => 1\n    [45] => 1\n    [55] => 1\n    [70] => 1\n    [76] => 1\n    [83] => 1\n    [88] => 1\n    [90] => 1\n    [92] => 2\n    [95] => 1\n    [98] => 1\n)\n */\n\n$a = $s->relativeFrequencies();\nprint_r($a);\n/*\nArray\n(\n    [18] => 8.3333333333333\n    [45] => 8.3333333333333\n    [55] => 8.3333333333333\n    [70] => 8.3333333333333\n    [76] => 8.3333333333333\n    [83] => 8.3333333333333\n    [88] => 8.3333333333333\n    [90] => 8.3333333333333\n    [92] => 16.666666666667\n    [95] => 8.3333333333333\n    [98] => 8.3333333333333\n)\n */\n\n```\n## `NormalDist` class\n\nThe `NormalDist` class provides an easy way to work with normal distributions in PHP. 
It allows you to calculate probabilities and densities for a given mean (μ) and standard deviation (σ).\n\n### Key features\n\n- Define a normal distribution with mean (μ) and standard deviation (σ).\n- Calculate the **Probability Density Function (PDF)** to evaluate the relative likelihood of a value.\n- Calculate the **Cumulative Distribution Function (CDF)** to determine the probability of a value or lower.\n- Calculate the **Inverse Cumulative Distribution Function (inv_cdf)** to find the value for a given probability.\n\n------\n\n### Class constructor\n\n```php\n$normalDist = new NormalDist(float $mu = 0.0, float $sigma = 1.0);\n```\n\n- `$mu`: The mean (default = `0.0`).\n- `$sigma`: The standard deviation (default = `1.0`).\n- Throws an exception if `$sigma` is non-positive.\n\n------\n\n### Methods\n\n#### Properties: mean, sigma, and variance\n\nYou can access the distribution parameters via getter methods:\n\n```php\n$normalDist = new NormalDist(100, 15);\n$normalDist->getMean();             // 100.0\n$normalDist->getSigma();            // 15.0\n$normalDist->getMedian();           // 100.0 (equals mean for normal dist)\n$normalDist->getMode();             // 100.0 (equals mean for normal dist)\n$normalDist->getVariance();         // 225.0 (sigma squared)\n$normalDist->getVarianceRounded(2); // 225.0\n```\n\nFrom samples:\n\n```php\n$normalDist = NormalDist::fromSamples([2.5, 3.1, 2.1, 2.4, 2.7, 3.5]);\n$normalDist->getVarianceRounded(5); // 0.25767\n```\n\n------\n\n#### Creating a normal distribution instance from sample data\n\nThe `fromSamples()` static method creates a normal distribution instance with mu and sigma parameters estimated from the sample data.\n\nExample:\n\n```php\n$samples = [2.5, 3.1, 2.1, 2.4, 2.7, 3.5];\n$normalDist = NormalDist::fromSamples($samples);\n$normalDist->getMeanRounded(5); // 2.71667\n$normalDist->getSigmaRounded(5); // 0.50761\n```\n\n#### Generate random samples `samples($n, 
$seed)`\n\nGenerates `$n` random samples from the normal distribution using the Box-Muller transform. An optional `$seed` parameter allows reproducible results.\n\n```php\n$normalDist = new NormalDist(100, 15);\n\n// Generate 5 random samples\n$samples = $normalDist->samples(5);\n// e.g. [98.3, 112.7, 89.1, 105.4, 101.2]\n\n// Reproducible results with a seed\n$samples = $normalDist->samples(1000, seed: 42);\n```\n\n------\n\n#### Z-score `zscore($x)`\n\nComputes the standard score describing `$x` in terms of the number of standard deviations above or below the mean: `(x - mu) / sigma`.\n\n```php\n$normalDist = new NormalDist(100, 15);\necho $normalDist->zscore(130);          // 2.0 (two std devs above mean)\necho $normalDist->zscore(85);           // -1.0 (one std dev below mean)\necho $normalDist->zscoreRounded(114, 3); // 0.933\n```\n\n------\n\n#### Probability Density Function `pdf($x)`\n\nCalculates the **Probability Density Function** at a given value `$x`:\n\n```php\n$normalDist->pdf(float $x): float\n```\n\n- Input: the value `$x` at which to evaluate the PDF.\n- Output: the relative likelihood of `$x` in the distribution.\n\nExample:\n\n```php\n$normalDist = new NormalDist(10.0, 2.0);\necho $normalDist->pdf(12.0); // Output: 0.12098536225957168\n```\n\n------\n\n#### Cumulative Distribution Function `cdf($x)`\n\nCalculates the **Cumulative Distribution Function** at a given value `$x`:\n\n```php\n$normalDist->cdf(float $x): float\n```\n- Input: the value `$x` at which to evaluate the CDF.\n- Output: the probability that a random variable drawn from the distribution is less than or equal to `$x`.\n\nExample:\n\n```php\n$normalDist = new NormalDist(10.0, 2.0);\necho $normalDist->cdf(12.0); // Output: 0.8413447460685429\n```\n\nCalculating both the CDF and the PDF:\n\n```php\n$normalDist = new NormalDist(10.0, 2.0);\n\n// Calculate PDF at x = 12\n$pdf = $normalDist->pdf(12.0);\necho \"PDF at x = 12: $pdf\\n\"; // Output: 0.12098536225957168\n\n// Calculate CDF at x = 12\n$cdf = 
$normalDist->cdf(12.0);\necho \"CDF at x = 12: $cdf\\n\"; // Output: 0.8413447460685429\n```\n\n------\n\n#### Inverse Cumulative Distribution Function `invCdf($p)`\n\nComputes the **Inverse Cumulative Distribution Function** (also known as the quantile function or percent-point function). Given a probability `$p`, it finds the value `$x` such that `cdf($x) = $p`.\n\n```php\n$normalDist->invCdf(float $p): float\n```\n\n- Input: a probability `$p` in the range (0, 1) exclusive.\n- Output: the value `$x` where `cdf($x) = $p`.\n- Throws an exception if `$p` is not in (0, 1).\n\nExample:\n\n```php\n$normalDist = new NormalDist(0.0, 1.0);\n\n// Find the value at the 95th percentile of a standard normal distribution\necho $normalDist->invCdfRounded(0.95, 5); // Output: 1.64485\n\n// The median of a standard normal distribution\necho $normalDist->invCdf(0.5); // Output: 0.0\n```\n\nThe `invCdf()` method is useful for:\n- **Confidence intervals**: find critical values for a given confidence level.\n- **Hypothesis testing**: determine thresholds for statistical significance.\n- **Percentile calculations**: find the value corresponding to a specific percentile.\n\nRound-trip example with `cdf()`:\n\n```php\n$normalDist = new NormalDist(100, 15);\n\n// inv_cdf(0.5) equals the mean\necho $normalDist->invCdf(0.5); // Output: 100.0\n\n// Round-trip: cdf(invCdf(p)) ≈ p\necho $normalDist->cdfRounded($normalDist->invCdf(0.25), 2); // Output: 0.25\n```\n\n------\n\n#### Quantiles `quantiles($n)`\n\nDivides the normal distribution into `$n` continuous intervals with equal probability. 
Returns a list of `$n - 1` cut points separating the intervals.\nSet `$n` to 4 for quartiles (the default), `$n` to 10 for deciles, or `$n` to 100 for percentiles.\n\n```php\n$normalDist = new NormalDist(0.0, 1.0);\n\n// Quartiles (default)\n$normalDist->quantiles();    // [-0.6745, 0.0, 0.6745]\n\n// Deciles\n$normalDist->quantiles(10);  // 9 cut points\n\n// Percentiles\n$normalDist->quantiles(100); // 99 cut points\n```\n\n------\n\n#### Overlapping coefficient `overlap($other)`\n\nComputes the overlapping coefficient (OVL) between two normal distributions. Measures the agreement between two normal probability distributions. Returns a value between 0.0 and 1.0 giving the overlapping area in the two underlying probability density functions.\n\n```php\n$n1 = new NormalDist(2.4, 1.6);\n$n2 = new NormalDist(3.2, 2.0);\necho $n1->overlapRounded($n2, 4); // 0.8035\n\n// Identical distributions overlap completely\n$n3 = new NormalDist(0, 1);\necho $n3->overlap($n3); // 1.0\n```\n\n------\n\n#### Combining a normal distribution via `add()` method\n\nThe `add()` method allows you to combine a NormalDist instance with either a constant or another NormalDist object.\nThis operation supports mathematical transformations and the combination of distributions.\n\nThe use cases are:\n- Shifting a distribution: add a constant to shift the mean, useful in translating data.\n- Combining distributions: combine independent or jointly normally distributed variables, commonly used in statistics and probability.\n\n```php\n$birth_weights = NormalDist::fromSamples([2.5, 3.1, 2.1, 2.4, 2.7, 3.5]);\n$drug_effects = new NormalDist(0.4, 0.15);\n$combined = $birth_weights->add($drug_effects);\n\n$combined->getMeanRounded(1); // 3.1\n$combined->getSigmaRounded(1); // 0.5\n\n$birth_weights->getMeanRounded(5); // 2.71667\n$birth_weights->getSigmaRounded(5); // 0.50761\n```\n\n#### Scaling a normal distribution by a constant via `multiply()` method\n\nThe `multiply()` method for NormalDist 
multiplies both the mean (mu) and standard deviation (sigma) by a constant.\nThis method is useful for rescaling distributions, such as when changing measurement units.\nThe standard deviation is scaled by the absolute value of the constant to ensure it remains non-negative.\n\nThe method does not modify the existing object but instead returns a new NormalDist instance with the updated values.\n\nUse Cases:\n- Rescaling distributions: useful when changing units (e.g., from meters to kilometers, or Celsius to Fahrenheit).\n- Transforming data: apply proportional scaling to statistical data.\n\n```php\n$tempFebruaryCelsius = new NormalDist(5, 2.5); # Celsius\n$tempFebFahrenheit = $tempFebruaryCelsius->multiply(9 / 5)->add(32); # Fahrenheit\n$tempFebFahrenheit->getMeanRounded(1); // 41.0\n$tempFebFahrenheit->getSigmaRounded(1); // 4.5\n```\n\n\n#### Subtracting from a normal distribution via `subtract()` method\n\nThe `subtract()` method is the counterpart to `add()`. It subtracts a constant or another NormalDist instance from this distribution.\n\n- A constant (float): shifts the mean down, leaving sigma unchanged.\n- A NormalDist instance: subtracts the means and combines the variances.\n\n```php\n$nd = new NormalDist(100, 15);\n$shifted = $nd->subtract(32);\n$shifted->getMean();  // 68.0\n$shifted->getSigma(); // 15.0 (unchanged)\n```\n\n#### Dividing a normal distribution by a constant via `divide()` method\n\nThe `divide()` method is the counterpart to `multiply()`. It divides both the mean (mu) and standard deviation (sigma) by a constant.\n\n```php\n// Convert Fahrenheit back to Celsius: (F - 32) / (9/5)\n$tempFahrenheit = new NormalDist(41, 4.5);\n$tempCelsius = $tempFahrenheit->subtract(32)->divide(9 / 5);\n$tempCelsius->getMeanRounded(1);  // 5.0\n$tempCelsius->getSigmaRounded(1); // 2.5\n```\n\n------\n\n### References for NormalDist\n\nThis class is inspired by Python’s `statistics.NormalDist` and aims to provide similar functionality for PHP users. 
(Work in Progress)\n\n## `StudentT` class\n\nThe `StudentT` class represents the Student’s t-distribution, which is used for hypothesis testing and confidence intervals when the population standard deviation is unknown, especially with small sample sizes. As the degrees of freedom increase, the t-distribution approaches the standard normal distribution.\n\n### Creating a StudentT instance\n\n```php\nuse HiFolks\\Statistics\\StudentT;\n\n$t = new StudentT(df: 10); // 10 degrees of freedom\n```\n\n### Probability Density Function (PDF)\n\n```php\n$t = new StudentT(5);\n$t->pdf(0);        // ≈ 0.37961 (peak of the distribution)\n$t->pdf(2.0);      // density at t=2\n$t->pdfRounded(0); // 0.38\n```\n\n### Cumulative Distribution Function (CDF)\n\n```php\n$t = new StudentT(5);\n$t->cdf(0);    // 0.5 (symmetric around zero)\n$t->cdf(2.0);  // ≈ 0.94874\n$t->cdfRounded(2.0); // 0.949\n```\n\n### Inverse CDF (Quantile Function)\n\n```php\n$t = new StudentT(10);\n$t->invCdf(0.975);  // ≈ 2.228 (critical value for 95% two-sided test)\n$t->invCdf(0.5);    // 0.0 (median)\n$t->invCdfRounded(0.975, 3); // 2.228\n```\n\n## StreamingStat (Experimental)\n\n> **Note**: `StreamingStat` is experimental in version 1.x. It will be released as stable in version 2. 
If you want to provide feedback, we are happy to hear from you — please open an issue at https://github.com/Hi-Folks/statistics/issues.\n\n`StreamingStat` computes descriptive statistics in a single pass with O(1) memory, ideal for large datasets or generator-based streams.\n\n```php\nuse HiFolks\\Statistics\\StreamingStat;\n\n$s = new StreamingStat();\n$s->add(1)->add(2)->add(3)->add(4)->add(5);\n\n$s->count();     // 5\n$s->sum();       // 15.0\n$s->min();       // 1.0\n$s->max();       // 5.0\n$s->mean();      // 3.0\n$s->variance();  // 2.5\n$s->stdev();     // 1.5811...\n$s->skewness();  // 0.0\n$s->kurtosis();  // -1.2\n```\n\n| Method | Description | Min n |\n|---|---|---|\n| `count()` | Number of values added | 0 |\n| `sum()` | Sum of all values | 1 |\n| `min()` | Minimum value | 1 |\n| `max()` | Maximum value | 1 |\n| `mean(?int $round = null)` | Arithmetic mean | 1 |\n| `variance(?int $round = null)` | Sample variance | 2 |\n| `pvariance(?int $round = null)` | Population variance | 1 |\n| `stdev(?int $round = null)` | Sample standard deviation | 2 |\n| `pstdev(?int $round = null)` | Population standard deviation | 1 |\n| `skewness(?int $round = null)` | Sample skewness (adjusted Fisher-Pearson) | 3 |\n| `pskewness(?int $round = null)` | Population skewness | 3 |\n| `kurtosis(?int $round = null)` | Excess kurtosis (sample) | 4 |\n\nAll methods throw `InvalidDataInputException` when insufficient data is available.\n\n## Utility classes\n\nThe package includes utility classes under `HiFolks\\Statistics\\Utils` for common array and formatting operations.\n\n### `Arr` — array helpers\n\n```php\nuse HiFolks\\Statistics\\Utils\\Arr;\n```\n\n#### Arr::extract( array $data, array $columns )\n\nExtract one or more columns from an array of associative arrays. 
Returns one array per requested column.\n\n```php\n$runners = [\n    ['name' => 'Alice', 'age' => 30, 'score' => 95],\n    ['name' => 'Bob',   'age' => 25, 'score' => 87],\n];\n\n[$ages, $scores] = Arr::extract($runners, ['age', 'score']);\n// $ages = [30, 25], $scores = [95, 87]\n```\n\n#### Arr::partition( array $data, string $field, string $operator, mixed $value )\n\nSplit an array of associative arrays into `[$matching, $nonMatching]` groups based on a condition. Supported operators: `==`, `!=`, `>`, `<`, `>=`, `<=`.\n\n```php\n[$men, $women] = Arr::partition($runners, 'gender', '==', 'M');\n[$seniors, $others] = Arr::partition($runners, 'age', '>=', 40);\n```\n\n#### Arr::toString( array $data, bool|int $sample = false )\n\nJoin array values into a comma-separated string. Pass an integer to limit to the first N values.\n\n#### Arr::stripZeroes( array $data )\n\nRemove zero values from the array.\n\n### `Format` — time formatting\n\n```php\nuse HiFolks\\Statistics\\Utils\\Format;\n```\n\n#### Format::secondsToTime( int|float $seconds )\n\nConvert seconds to a human-readable time string.\n\n```php\nFormat::secondsToTime(4845);  // \"1:20:45\"\n```\n\n#### Format::timeToSeconds( string $time )\n\nParse a time string back to total seconds.\n\n```php\nFormat::timeToSeconds('1:20:45');  // 4845\n```\n\n#### Format::secondsToHms( int|float $seconds )\n\nConvert seconds to an associative array with `hours`, `minutes`, `seconds` keys.\n\n```php\nFormat::secondsToHms(4845);  // ['hours' => 1, 'minutes' => 20, 'seconds' => 45]\n```\n\n#### Format::hmsToSeconds( int $hours, int $minutes, int $seconds )\n\nConvert hours, minutes, and seconds to total seconds.\n\n```php\nFormat::hmsToSeconds(1, 20, 45);  // 4845\n```\n\n## Testing\n\n```bash\ncomposer run test           Runs the test script\ncomposer run test-coverage  Runs the test-coverage script\ncomposer run format         Runs the format script\ncomposer run static-code    Runs the static-code script\ncomposer run 
all-check      Runs the all-check script\n```\n\n\n## Changelog\n\nPlease see [CHANGELOG](CHANGELOG.md) for more information on what has changed recently.\n\n## Contributing\n\nPlease see [CONTRIBUTING](.github/CONTRIBUTING.md) for details.\n\n## Security Vulnerabilities\n\nPlease review [our security policy](../../security/policy) on how to report security vulnerabilities.\n\n## Credits\n\n- [Roberto B.](https://github.com/roberto-butti)\n- [All Contributors](../../contributors)\n\n## License\n\nThe MIT License (MIT). Please see [License File](LICENSE.md) for more information.\n"
  },
  {
    "path": "TODO.md",
    "content": "## Missing Functions\n\n\n\n\n### Correlation & Regression\n\n\n- Kendall tau correlation - another rank-based correlation\n- Multiple/polynomial regression\n\n### Hypothesis Testing\n\n- ~~T-test (two-sample, paired) — one-sample is done~~ DONE: `tTestTwoSample()` (Welch's) and `tTestPaired()`\n- Chi-squared test\n\n### Other Distributions (beyond Normal)\n\n- Chi-squared distribution\n- Binomial distribution\n- Poisson distribution\n- Uniform distribution\n- Exponential distribution\n\n\n\n\n### Ranking & Order Statistics\n\n- Rank - assign ranks to data points\n- Percentile rank - what percentile a given value falls at\n"
  },
  {
    "path": "composer.json",
    "content": "{\n    \"name\": \"hi-folks/statistics\",\n    \"description\": \"PHP package that provides functions for calculating mathematical statistics of numeric data.\",\n    \"keywords\": [\n        \"hi-folks\",\n        \"statistics\"\n    ],\n    \"homepage\": \"https://github.com/hi-folks/statistics\",\n    \"license\": \"MIT\",\n    \"authors\": [\n        {\n            \"name\": \"Roberto B.\",\n            \"email\": \"roberto.butti@gmail.com\",\n            \"role\": \"Developer\"\n        }\n    ],\n    \"require\": {\n        \"php\": \"^8.2|^8.3|^8.4|8.5\"\n    },\n    \"require-dev\": {\n        \"friendsofphp/php-cs-fixer\": \"^3.65\",\n        \"phpstan/phpstan\": \"^2\",\n        \"phpstan/phpstan-phpunit\": \"^2.0\",\n        \"phpunit/phpunit\": \"^11.0\",\n        \"rector/rector\": \"^2\"\n    },\n    \"autoload\": {\n        \"psr-4\": {\n            \"HiFolks\\\\Statistics\\\\\": \"src\"\n        }\n    },\n    \"autoload-dev\": {\n        \"psr-4\": {\n            \"HiFolks\\\\Statistics\\\\Tests\\\\\": \"tests\"\n        }\n    },\n    \"scripts\": {\n        \"format\": \"vendor/bin/php-cs-fixer fix\",\n        \"test\": \"vendor/bin/phpunit\",\n        \"test-coverage\": \"vendor/bin/phpunit --coverage-text\",\n        \"static-code\": \"vendor/bin/phpstan analyse -c phpstan.neon\",\n        \"rector-dry-run\": \"rector process --dry-run\",\n        \"rector\": \"rector process\",\n        \"all-check\": [\n            \"@format\",\n            \"@rector-dry-run\",\n            \"@static-code\",\n            \"@test\"\n        ]\n    },\n    \"config\": {\n        \"sort-packages\": true,\n        \"allow-plugins\": {}\n    },\n    \"minimum-stability\": \"dev\",\n    \"prefer-stable\": true\n}\n"
  },
  {
    "path": "examples/article-boston-marathon-analysis.php",
    "content": "<?php\n\n/**\n * Analyzing 75,000 Boston Marathon Runners with PHP Statistics\n *\n * This script accompanies the article that uses a representative sample\n * from the Boston Marathon 2015–2017 Kaggle dataset to showcase the\n * statistics library's capabilities — especially tTestTwoSample() and\n * tTestPaired().\n *\n * Dataset: https://www.kaggle.com/datasets/rojour/boston-results\n * Run it with: php examples/article-boston-marathon-analysis.php\n */\n\nrequire __DIR__ . \"/../vendor/autoload.php\";\n\nuse HiFolks\\Statistics\\NormalDist;\nuse HiFolks\\Statistics\\Stat;\nuse HiFolks\\Statistics\\Utils\\Arr;\nuse HiFolks\\Statistics\\Utils\\Format;\n\n// === The Data ===\n// Representative sample of 60 finishers from the 2017 Boston Marathon.\n// Times are stored in seconds for easy arithmetic.\n// 'half' = cumulative time at the half-marathon mark (21.1 km)\n// 'finish' = gun-to-finish time\n// 'splits' = 8 individual 5K segment times (5K through 40K)\n\n$runners = [\n    // --- Fast men ---\n    ['name' => 'James Karanja',      'age' => 28, 'gender' => 'M', 'country' => 'KEN', 'half' => 4520, 'finish' => 9280,  'splits' => [1100, 1105, 1110, 1115, 1120, 1125, 1140, 1165]],\n    ['name' => 'Michael Kiprop',     'age' => 31, 'gender' => 'M', 'country' => 'KEN', 'half' => 4600, 'finish' => 9450,  'splits' => [1115, 1120, 1125, 1130, 1135, 1145, 1160, 1200]],\n    ['name' => 'David Chen',         'age' => 26, 'gender' => 'M', 'country' => 'USA', 'half' => 4680, 'finish' => 9600,  'splits' => [1130, 1135, 1140, 1145, 1155, 1170, 1190, 1225]],\n    ['name' => 'Ryan O\\'Brien',      'age' => 29, 'gender' => 'M', 'country' => 'USA', 'half' => 4750, 'finish' => 9780,  'splits' => [1150, 1155, 1160, 1165, 1175, 1195, 1220, 1260]],\n    ['name' => 'Tadesse Bekele',     'age' => 33, 'gender' => 'M', 'country' => 'ETH', 'half' => 4820, 'finish' => 9920,  'splits' => [1165, 1170, 1180, 1185, 1195, 1215, 1245, 1280]],\n    ['name' => 'Carlos Gutierrez',   
'age' => 27, 'gender' => 'M', 'country' => 'MEX', 'half' => 4900, 'finish' => 10100, 'splits' => [1190, 1195, 1200, 1210, 1225, 1240, 1265, 1300]],\n    ['name' => 'Thomas Mueller',     'age' => 30, 'gender' => 'M', 'country' => 'GER', 'half' => 5020, 'finish' => 10380, 'splits' => [1220, 1225, 1230, 1240, 1260, 1280, 1310, 1350]],\n    ['name' => 'Hiroshi Tanaka',     'age' => 34, 'gender' => 'M', 'country' => 'JPN', 'half' => 5100, 'finish' => 10560, 'splits' => [1240, 1245, 1250, 1260, 1280, 1300, 1330, 1370]],\n    // --- Mid-pack men ---\n    ['name' => 'John Smith',         'age' => 34, 'gender' => 'M', 'country' => 'USA', 'half' => 5400, 'finish' => 11200, 'splits' => [1310, 1320, 1330, 1340, 1370, 1400, 1440, 1490]],\n    ['name' => 'Patrick Sullivan',   'age' => 38, 'gender' => 'M', 'country' => 'USA', 'half' => 5550, 'finish' => 11520, 'splits' => [1350, 1360, 1370, 1380, 1410, 1440, 1480, 1530]],\n    ['name' => 'Marco Rossi',        'age' => 36, 'gender' => 'M', 'country' => 'ITA', 'half' => 5620, 'finish' => 11700, 'splits' => [1370, 1375, 1385, 1395, 1425, 1460, 1500, 1550]],\n    ['name' => 'Daniel Park',        'age' => 32, 'gender' => 'M', 'country' => 'KOR', 'half' => 5700, 'finish' => 11880, 'splits' => [1390, 1395, 1405, 1415, 1450, 1485, 1525, 1575]],\n    ['name' => 'Andrew Taylor',      'age' => 41, 'gender' => 'M', 'country' => 'USA', 'half' => 5800, 'finish' => 12100, 'splits' => [1410, 1420, 1430, 1445, 1480, 1520, 1570, 1625]],\n    ['name' => 'Pierre Dubois',      'age' => 37, 'gender' => 'M', 'country' => 'FRA', 'half' => 5850, 'finish' => 12240, 'splits' => [1425, 1435, 1445, 1460, 1500, 1540, 1590, 1650]],\n    ['name' => 'Robert Johnson',     'age' => 44, 'gender' => 'M', 'country' => 'USA', 'half' => 5950, 'finish' => 12480, 'splits' => [1450, 1460, 1470, 1490, 1530, 1570, 1620, 1690]],\n    ['name' => 'William Davis',      'age' => 39, 'gender' => 'M', 'country' => 'USA', 'half' => 6020, 'finish' => 12660, 'splits' => [1470, 1480, 
1490, 1510, 1555, 1600, 1660, 1730]],\n    ['name' => 'Kevin Brown',        'age' => 42, 'gender' => 'M', 'country' => 'CAN', 'half' => 6100, 'finish' => 12840, 'splits' => [1490, 1500, 1515, 1535, 1580, 1630, 1690, 1760]],\n    ['name' => 'Liam Walsh',         'age' => 35, 'gender' => 'M', 'country' => 'IRL', 'half' => 6180, 'finish' => 13020, 'splits' => [1510, 1520, 1535, 1555, 1605, 1660, 1720, 1795]],\n    ['name' => 'Matt Henderson',     'age' => 46, 'gender' => 'M', 'country' => 'USA', 'half' => 6250, 'finish' => 13200, 'splits' => [1530, 1540, 1555, 1575, 1630, 1685, 1750, 1830]],\n    ['name' => 'José Fernandez',     'age' => 40, 'gender' => 'M', 'country' => 'ESP', 'half' => 6320, 'finish' => 13380, 'splits' => [1545, 1560, 1575, 1600, 1655, 1715, 1780, 1860]],\n    ['name' => 'Brian Miller',       'age' => 48, 'gender' => 'M', 'country' => 'USA', 'half' => 6400, 'finish' => 13560, 'splits' => [1565, 1580, 1595, 1620, 1680, 1740, 1810, 1900]],\n    ['name' => 'Chris Anderson',     'age' => 43, 'gender' => 'M', 'country' => 'USA', 'half' => 6480, 'finish' => 13740, 'splits' => [1585, 1600, 1620, 1645, 1710, 1775, 1850, 1940]],\n    ['name' => 'Sean O\\'Connor',     'age' => 45, 'gender' => 'M', 'country' => 'USA', 'half' => 6550, 'finish' => 13920, 'splits' => [1600, 1620, 1640, 1670, 1735, 1805, 1885, 1980]],\n    // --- Slow men ---\n    ['name' => 'Greg Thompson',      'age' => 52, 'gender' => 'M', 'country' => 'USA', 'half' => 6700, 'finish' => 14280, 'splits' => [1630, 1650, 1675, 1710, 1780, 1860, 1950, 2060]],\n    ['name' => 'Tom Williams',       'age' => 55, 'gender' => 'M', 'country' => 'USA', 'half' => 6850, 'finish' => 14640, 'splits' => [1665, 1690, 1720, 1760, 1840, 1930, 2030, 2150]],\n    ['name' => 'Richard Clark',      'age' => 50, 'gender' => 'M', 'country' => 'GBR', 'half' => 6950, 'finish' => 14940, 'splits' => [1695, 1720, 1750, 1795, 1880, 1975, 2085, 2210]],\n    ['name' => 'Hans Weber',         'age' => 58, 'gender' => 'M', 
'country' => 'GER', 'half' => 7100, 'finish' => 15300, 'splits' => [1730, 1760, 1795, 1845, 1940, 2045, 2165, 2300]],\n    ['name' => 'James Wilson',       'age' => 53, 'gender' => 'M', 'country' => 'USA', 'half' => 7200, 'finish' => 15540, 'splits' => [1755, 1785, 1825, 1880, 1980, 2090, 2215, 2360]],\n    ['name' => 'Paul Martin',        'age' => 60, 'gender' => 'M', 'country' => 'USA', 'half' => 7400, 'finish' => 16020, 'splits' => [1800, 1840, 1885, 1945, 2055, 2175, 2310, 2470]],\n    ['name' => 'George Baker',       'age' => 62, 'gender' => 'M', 'country' => 'USA', 'half' => 7600, 'finish' => 16500, 'splits' => [1850, 1895, 1945, 2010, 2130, 2260, 2410, 2590]],\n    ['name' => 'Frank Harris',       'age' => 64, 'gender' => 'M', 'country' => 'CAN', 'half' => 7900, 'finish' => 17280, 'splits' => [1920, 1975, 2035, 2115, 2250, 2400, 2570, 2770]],\n    // --- Fast women ---\n    ['name' => 'Sarah Kimutai',      'age' => 27, 'gender' => 'F', 'country' => 'KEN', 'half' => 5250, 'finish' => 10800, 'splits' => [1280, 1285, 1290, 1300, 1320, 1345, 1375, 1410]],\n    ['name' => 'Emma Johansson',     'age' => 30, 'gender' => 'F', 'country' => 'SWE', 'half' => 5380, 'finish' => 11100, 'splits' => [1310, 1320, 1330, 1340, 1365, 1390, 1425, 1465]],\n    ['name' => 'Lisa Zhang',         'age' => 25, 'gender' => 'F', 'country' => 'CHN', 'half' => 5480, 'finish' => 11340, 'splits' => [1335, 1345, 1355, 1370, 1395, 1425, 1465, 1510]],\n    ['name' => 'Anna Petrov',        'age' => 29, 'gender' => 'F', 'country' => 'RUS', 'half' => 5560, 'finish' => 11520, 'splits' => [1355, 1365, 1375, 1390, 1420, 1455, 1495, 1545]],\n    ['name' => 'Maria Santos',       'age' => 32, 'gender' => 'F', 'country' => 'BRA', 'half' => 5650, 'finish' => 11700, 'splits' => [1375, 1385, 1395, 1415, 1445, 1480, 1525, 1580]],\n    // --- Mid-pack women ---\n    ['name' => 'Jennifer Adams',     'age' => 35, 'gender' => 'F', 'country' => 'USA', 'half' => 5850, 'finish' => 12180, 'splits' => [1425, 1435, 
1450, 1470, 1510, 1555, 1610, 1675]],\n    ['name' => 'Rachel Green',       'age' => 38, 'gender' => 'F', 'country' => 'USA', 'half' => 6050, 'finish' => 12660, 'splits' => [1475, 1490, 1510, 1535, 1585, 1640, 1710, 1790]],\n    ['name' => 'Sophie Laurent',     'age' => 33, 'gender' => 'F', 'country' => 'FRA', 'half' => 6200, 'finish' => 13020, 'splits' => [1515, 1530, 1550, 1580, 1635, 1700, 1775, 1865]],\n    ['name' => 'Emily Watson',       'age' => 40, 'gender' => 'F', 'country' => 'USA', 'half' => 6350, 'finish' => 13380, 'splits' => [1550, 1570, 1590, 1625, 1685, 1755, 1840, 1940]],\n    ['name' => 'Amy Chen',           'age' => 36, 'gender' => 'F', 'country' => 'USA', 'half' => 6480, 'finish' => 13680, 'splits' => [1585, 1605, 1625, 1665, 1730, 1805, 1895, 2000]],\n    ['name' => 'Kate Murphy',        'age' => 42, 'gender' => 'F', 'country' => 'IRL', 'half' => 6600, 'finish' => 13980, 'splits' => [1615, 1635, 1660, 1700, 1775, 1860, 1955, 2070]],\n    ['name' => 'Michelle Lee',       'age' => 37, 'gender' => 'F', 'country' => 'USA', 'half' => 6720, 'finish' => 14280, 'splits' => [1645, 1665, 1695, 1740, 1820, 1910, 2015, 2140]],\n    ['name' => 'Olivia Garcia',      'age' => 44, 'gender' => 'F', 'country' => 'USA', 'half' => 6850, 'finish' => 14580, 'splits' => [1675, 1700, 1730, 1780, 1870, 1965, 2080, 2210]],\n    ['name' => 'Laura Schmidt',      'age' => 41, 'gender' => 'F', 'country' => 'GER', 'half' => 6950, 'finish' => 14820, 'splits' => [1700, 1725, 1760, 1810, 1910, 2015, 2135, 2275]],\n    ['name' => 'Hannah Kim',         'age' => 39, 'gender' => 'F', 'country' => 'USA', 'half' => 7050, 'finish' => 15060, 'splits' => [1725, 1750, 1790, 1845, 1950, 2060, 2190, 2340]],\n    // --- Slow women ---\n    ['name' => 'Diane Cooper',       'age' => 50, 'gender' => 'F', 'country' => 'USA', 'half' => 7250, 'finish' => 15480, 'splits' => [1770, 1800, 1845, 1905, 2015, 2140, 2280, 2440]],\n    ['name' => 'Nancy Taylor',       'age' => 53, 'gender' => 'F', 
'country' => 'USA', 'half' => 7450, 'finish' => 15960, 'splits' => [1820, 1855, 1905, 1970, 2095, 2230, 2385, 2560]],\n    ['name' => 'Barbara White',      'age' => 48, 'gender' => 'F', 'country' => 'USA', 'half' => 7600, 'finish' => 16320, 'splits' => [1860, 1900, 1955, 2030, 2160, 2310, 2475, 2670]],\n    ['name' => 'Susan Hall',         'age' => 56, 'gender' => 'F', 'country' => 'CAN', 'half' => 7850, 'finish' => 16860, 'splits' => [1915, 1960, 2020, 2105, 2250, 2410, 2595, 2810]],\n    ['name' => 'Patricia Evans',     'age' => 58, 'gender' => 'F', 'country' => 'USA', 'half' => 8050, 'finish' => 17340, 'splits' => [1965, 2015, 2085, 2175, 2340, 2520, 2720, 2950]],\n    ['name' => 'Carol Robinson',     'age' => 61, 'gender' => 'F', 'country' => 'USA', 'half' => 8300, 'finish' => 17940, 'splits' => [2025, 2085, 2160, 2260, 2445, 2645, 2865, 3120]],\n    // --- Additional men for sample size ---\n    ['name' => 'Steve Campbell',     'age' => 47, 'gender' => 'M', 'country' => 'USA', 'half' => 6650, 'finish' => 14100, 'splits' => [1620, 1640, 1665, 1705, 1775, 1855, 1950, 2065]],\n    ['name' => 'Mark Phillips',      'age' => 36, 'gender' => 'M', 'country' => 'USA', 'half' => 5480, 'finish' => 11380, 'splits' => [1335, 1345, 1355, 1370, 1400, 1430, 1470, 1520]],\n    ['name' => 'Jason Reed',         'age' => 33, 'gender' => 'M', 'country' => 'USA', 'half' => 5250, 'finish' => 10860, 'splits' => [1280, 1290, 1300, 1315, 1340, 1370, 1405, 1450]],\n    ['name' => 'Alex Turner',        'age' => 28, 'gender' => 'M', 'country' => 'GBR', 'half' => 5150, 'finish' => 10620, 'splits' => [1255, 1260, 1270, 1285, 1310, 1340, 1375, 1415]],\n    ['name' => 'Nick Peterson',      'age' => 50, 'gender' => 'M', 'country' => 'USA', 'half' => 6900, 'finish' => 14760, 'splits' => [1685, 1710, 1740, 1790, 1880, 1975, 2085, 2215]],\n    ['name' => 'Derek Hughes',       'age' => 42, 'gender' => 'M', 'country' => 'AUS', 'half' => 6250, 'finish' => 13140, 'splits' => [1525, 1540, 1560, 1590, 
1650, 1720, 1800, 1890]],\n    ['name' => 'Tim Wright',         'age' => 56, 'gender' => 'M', 'country' => 'USA', 'half' => 7350, 'finish' => 15900, 'splits' => [1795, 1830, 1875, 1935, 2055, 2185, 2335, 2510]],\n    ['name' => 'Scott Mitchell',     'age' => 39, 'gender' => 'M', 'country' => 'USA', 'half' => 5700, 'finish' => 11820, 'splits' => [1390, 1400, 1410, 1425, 1460, 1500, 1545, 1600]],\n];\n\n// =====================================================================\n// Extract common arrays using Arr utility\n// =====================================================================\n[$finishTimes, $ages] = Arr::extract($runners, ['finish', 'age']);\n\n[$menRunners, $womenRunners] = Arr::partition($runners, 'gender', '==', 'M');\n[$menTimes] = Arr::extract($menRunners, ['finish']);\n[$womenTimes] = Arr::extract($womenRunners, ['finish']);\n\n// =====================================================================\n// Step 1: The Data & Descriptive Statistics\n// =====================================================================\necho \"=== Step 1: The Data & Descriptive Statistics ===\" . PHP_EOL;\necho \"\\\"What does a typical Boston Marathon finish look like?\\\"\" . PHP_EOL . PHP_EOL;\n\n$mean = Stat::mean($finishTimes);\n$median = Stat::median($finishTimes);\n$stdev = Stat::stdev($finishTimes);\n$quartiles = Stat::quantiles($finishTimes);\n\necho \"Sample size:  \" . count($runners) . \" runners (\" . count($menTimes) . \" men, \" . count($womenTimes) . \" women)\" . PHP_EOL;\necho \"Mean finish:  \" . Format::secondsToTime($mean) . \" (\" . round($mean) . \"s)\" . PHP_EOL;\necho 'Median finish: ' . Format::secondsToTime($median) . \" (\" . round($median) . \"s)\" . PHP_EOL;\necho 'Std deviation: ' . Format::secondsToTime($stdev) . \" (\" . round($stdev) . \"s)\" . PHP_EOL;\necho \"Min:          \" . Format::secondsToTime(min($finishTimes)) . \" | Max: \" . Format::secondsToTime(max($finishTimes)) . PHP_EOL;\necho \"Quartiles:    Q1=\" . 
Format::secondsToTime($quartiles[0])\n    . \"  Q2=\" . Format::secondsToTime($quartiles[1])\n    . \"  Q3=\" . Format::secondsToTime($quartiles[2]) . PHP_EOL;\necho PHP_EOL;\necho \"How to interpret:\" . PHP_EOL;\necho \"- If the mean is higher than the median, the distribution is right-skewed.\" . PHP_EOL;\necho \"- Compare the full range (min-max) to the interquartile range (Q1-Q3 = \"\n    . Format::secondsToTime($quartiles[2] - $quartiles[0]) . \") to see how spread the middle 50% is.\" . PHP_EOL;\necho \"- A large standard deviation relative to the mean reflects wide diversity in the field.\" . PHP_EOL;\n\n// =====================================================================\n// Step 2: Men vs Women — Two-Sample T-Test\n// =====================================================================\necho PHP_EOL . \"=== Step 2: Men vs Women — Two-Sample T-Test ===\" . PHP_EOL;\necho \"\\\"Are men statistically faster, or could the difference be random?\\\"\" . PHP_EOL . PHP_EOL;\n\necho \"Men:   n=\" . count($menTimes) . \", mean=\" . Format::secondsToTime(Stat::mean($menTimes))\n    . \" (\" . round(Stat::mean($menTimes)) . \"s)\" . PHP_EOL;\necho \"Women: n=\" . count($womenTimes) . \", mean=\" . Format::secondsToTime(Stat::mean($womenTimes))\n    . \" (\" . round(Stat::mean($womenTimes)) . \"s)\" . PHP_EOL;\necho \"Difference: \" . Format::secondsToTime(Stat::mean($womenTimes) - Stat::mean($menTimes))\n    . \" (\" . round(Stat::mean($womenTimes) - Stat::mean($menTimes)) . \"s)\" . PHP_EOL;\necho PHP_EOL;\n\n$tTest2 = Stat::tTestTwoSample($menTimes, $womenTimes);\necho \"Two-sample t-test results:\" . PHP_EOL;\necho \"  t-statistic:       \" . round($tTest2['tStatistic'], 4) . PHP_EOL;\necho '  Degrees of freedom: ' . round($tTest2['degreesOfFreedom'], 1) . PHP_EOL;\necho \"  p-value:           \" . round($tTest2['pValue'], 6) . PHP_EOL;\necho PHP_EOL;\n\necho \"How to interpret:\" . 
PHP_EOL;\necho \"- If p-value < 0.05, the difference is statistically significant (unlikely due to chance).\" . PHP_EOL;\necho \"- The t-statistic measures the gap relative to within-group variation; further from zero = stronger evidence.\" . PHP_EOL;\necho \"- Degrees of freedom are adjusted for unequal variances and sample sizes (Welch-Satterthwaite approximation).\" . PHP_EOL;\n\n// =====================================================================\n// Step 3: Pacing Strategy — Paired T-Test\n// =====================================================================\necho PHP_EOL . \"=== Step 3: Pacing Strategy — Paired T-Test ===\" . PHP_EOL;\necho \"\\\"Do runners slow down in the second half? (positive split analysis)\\\"\" . PHP_EOL . PHP_EOL;\n\n$firstHalf = array_column($runners, 'half');\n$secondHalf = [];\nforeach ($runners as $r) {\n    $secondHalf[] = $r['finish'] - $r['half'];\n}\n\n$meanFirst = Stat::mean($firstHalf);\n$meanSecond = Stat::mean($secondHalf);\n\necho \"Mean first half:  \" . Format::secondsToTime($meanFirst) . \" (\" . round($meanFirst) . \"s)\" . PHP_EOL;\necho \"Mean second half: \" . Format::secondsToTime($meanSecond) . \" (\" . round($meanSecond) . \"s)\" . PHP_EOL;\necho \"Avg slowdown:     \" . Format::secondsToTime($meanSecond - $meanFirst)\n    . \" (\" . round($meanSecond - $meanFirst) . \"s)\" . PHP_EOL;\necho PHP_EOL;\n\n$tTestPaired = Stat::tTestPaired($firstHalf, $secondHalf);\necho \"Paired t-test results:\" . PHP_EOL;\necho \"  t-statistic:       \" . round($tTestPaired['tStatistic'], 4) . PHP_EOL;\necho '  Degrees of freedom: ' . $tTestPaired['degreesOfFreedom'] . PHP_EOL;\necho \"  p-value:           \" . round($tTestPaired['pValue'], 6) . PHP_EOL;\necho PHP_EOL;\n\necho \"How to interpret:\" . PHP_EOL;\necho \"- If the mean second half > mean first half, runners slow down on average.\" . PHP_EOL;\necho \"- A negative t-statistic confirms the first half is faster. The more negative, the stronger the evidence.\" . 
PHP_EOL;\necho \"- If p-value is near zero, the slowdown is overwhelmingly significant.\" . PHP_EOL;\necho \"- The paired test removes between-runner variability, making it very sensitive to systematic differences.\" . PHP_EOL;\n\n// =====================================================================\n// Step 4: Does Age Affect Finish Time?\n// =====================================================================\necho PHP_EOL . \"=== Step 4: Does Age Affect Finish Time? ===\" . PHP_EOL;\necho \"\\\"How many minutes per year of age does the marathon cost you?\\\"\" . PHP_EOL . PHP_EOL;\n\n$pearson = Stat::correlation($ages, $finishTimes);\n$spearman = Stat::correlation($ages, $finishTimes, 'ranked');\n$regression = Stat::linearRegression($ages, $finishTimes);\n$r2 = Stat::rSquared($ages, $finishTimes, false, 4);\n\necho \"Pearson correlation:  \" . round($pearson, 4) . PHP_EOL;\necho \"Spearman correlation: \" . round($spearman, 4) . PHP_EOL;\necho PHP_EOL;\necho \"Linear regression:    finish = \" . round($regression[0], 1) . \" × age + \" . round($regression[1]) . PHP_EOL;\necho \"R-squared:            \" . $r2 . PHP_EOL;\necho PHP_EOL;\n\necho \"How to interpret:\" . PHP_EOL;\necho \"- Pearson and Spearman close to +1 = strong positive relationship (older = slower).\" . PHP_EOL;\necho \"- If both correlations are similar, the relationship is linear, not just monotonic.\" . PHP_EOL;\necho \"- The slope tells you seconds added per year of age. Divide by 60 for minutes.\" . PHP_EOL;\necho \"- R-squared tells you what fraction of variation age explains (0 = none, 1 = all).\" . PHP_EOL;\n\n// =====================================================================\n// Step 5: Consistency — Who Paces Best?\n// =====================================================================\necho PHP_EOL . \"=== Step 5: Consistency — Who Paces Best? ===\" . PHP_EOL;\necho \"\\\"Do fast runners pace more evenly than slow runners?\\\"\" . PHP_EOL . 
PHP_EOL;\n\n$medianFinish = Stat::median($finishTimes);\n$fastCV = [];\n$slowCV = [];\n\nforeach ($runners as $r) {\n    $cv = Stat::coefficientOfVariation($r['splits'], 2);\n    if ($r['finish'] <= $medianFinish) {\n        $fastCV[] = $cv;\n    } else {\n        $slowCV[] = $cv;\n    }\n}\n\necho \"Pacing consistency (CV of 5K splits):\" . PHP_EOL;\necho \"  Fast group (below median): mean CV = \" . round(Stat::mean($fastCV), 2) . \"%\" . PHP_EOL;\necho \"  Slow group (above median): mean CV = \" . round(Stat::mean($slowCV), 2) . \"%\" . PHP_EOL;\necho PHP_EOL;\n\n$tTestCV = Stat::tTestTwoSample($fastCV, $slowCV);\necho \"Two-sample t-test on CV:\" . PHP_EOL;\necho \"  t-statistic: \" . round($tTestCV['tStatistic'], 4) . PHP_EOL;\necho \"  p-value:     \" . round($tTestCV['pValue'], 6) . PHP_EOL;\necho PHP_EOL;\n\necho \"How to interpret:\" . PHP_EOL;\necho \"- If the slow group's mean CV is higher, slower runners pace less consistently.\" . PHP_EOL;\necho \"- If p-value < 0.05, the difference in pacing consistency is statistically significant.\" . PHP_EOL;\necho \"- A low CV = even pacing; a high CV = the runner faded or surged during the race.\" . PHP_EOL;\n\n// =====================================================================\n// Step 6: The Finish Time Distribution\n// =====================================================================\necho PHP_EOL . \"=== Step 6: The Finish Time Distribution ===\" . PHP_EOL;\necho \"\\\"Is marathon finish time normally distributed?\\\"\" . PHP_EOL . PHP_EOL;\n\n$skewness = Stat::skewness($finishTimes, 4);\n$kurtosis = Stat::kurtosis($finishTimes, 4);\necho \"Skewness: \" . $skewness . PHP_EOL;\necho \"  (positive = right-skewed, a long tail of slower finishers)\" . PHP_EOL;\necho \"Kurtosis: \" . $kurtosis . PHP_EOL;\necho \"  (excess kurtosis — 0 is normal; positive = heavier tails)\" . PHP_EOL;\necho PHP_EOL;\n\n$normal = NormalDist::fromSamples($finishTimes);\necho \"Normal model: mu = \" . 
Format::secondsToTime($normal->getMeanRounded(0))\n    . \", sigma = \" . Format::secondsToTime((int) round($normal->getSigmaRounded(0))) . PHP_EOL;\necho PHP_EOL;\n\n// Compare model vs actual in ranges\n$ranges = [\n    ['label' => 'Under 3:00:00', 'max' => 10800],\n    ['label' => '3:00-3:30',     'max' => 12600],\n    ['label' => '3:30-4:00',     'max' => 14400],\n    ['label' => '4:00-4:30',     'max' => 16200],\n    ['label' => 'Over 4:30',     'max' => PHP_INT_MAX],\n];\n\necho str_pad(\"Range\", 16) . str_pad(\"Actual\", 10) . \"Model\" . PHP_EOL;\necho str_repeat(\"-\", 36) . PHP_EOL;\n\n$prevMax = 0;\nforeach ($ranges as $range) {\n    $actualCount = count(array_filter($finishTimes, fn($t): bool => $t > $prevMax && $t <= $range['max']));\n    $modelProb = $normal->cdf(min($range['max'], 20000)) - $normal->cdf($prevMax);\n    $modelCount = round($modelProb * count($finishTimes), 1);\n    echo str_pad($range['label'], 16)\n        . str_pad((string) $actualCount, 10)\n        . round($modelCount, 1)\n        . PHP_EOL;\n    $prevMax = $range['max'];\n}\necho PHP_EOL;\necho \"How to interpret:\" . PHP_EOL;\necho \"- Positive skewness = right-skewed (long tail of slower finishers).\" . PHP_EOL;\necho \"- Negative excess kurtosis = lighter tails than a normal distribution.\" . PHP_EOL;\necho \"- Compare Actual vs Model columns: where they diverge, the normal assumption breaks down.\" . PHP_EOL;\n\n// =====================================================================\n// Step 7: Finding the Outliers\n// =====================================================================\necho PHP_EOL . \"=== Step 7: Finding the Outliers ===\" . PHP_EOL;\necho \"\\\"Who had an unusually fast (or slow) day?\\\"\" . PHP_EOL . PHP_EOL;\n\n// Z-score method\necho \"Method 1: Z-score based (threshold = 2.0)\" . PHP_EOL;\n$zscoreOutliers = Stat::outliers($finishTimes, 2.0);\nif ($zscoreOutliers === []) {\n    echo \"  No outliers detected.\" . 
PHP_EOL;\n} else {\n    foreach ($zscoreOutliers as $time) {\n        $name = '';\n        foreach ($runners as $r) {\n            if ($r['finish'] === $time) {\n                $name = $r['name'];\n                break;\n            }\n        }\n        echo \"  \" . Format::secondsToTime($time) . \" — \" . $name . PHP_EOL;\n    }\n}\n\n// IQR method\necho PHP_EOL . \"Method 2: IQR based (factor = 1.5)\" . PHP_EOL;\n$iqrOutliers = Stat::iqrOutliers($finishTimes);\nif ($iqrOutliers === []) {\n    echo \"  No outliers detected.\" . PHP_EOL;\n} else {\n    foreach ($iqrOutliers as $time) {\n        $name = '';\n        foreach ($runners as $r) {\n            if ($r['finish'] === $time) {\n                $name = $r['name'];\n                break;\n            }\n        }\n        echo \"  \" . Format::secondsToTime($time) . \" — \" . $name . PHP_EOL;\n    }\n}\n\n// Individual z-scores for notable runners\necho PHP_EOL . \"Z-scores for selected runners:\" . PHP_EOL;\n$zscores = Stat::zscores($finishTimes, 2);\n\n// Pair each runner with their z-score and sort by finish time\n$runnerZscores = [];\nforeach ($runners as $i => $r) {\n    $runnerZscores[] = ['name' => $r['name'], 'finish' => $r['finish'], 'z' => $zscores[$i]];\n}\nusort($runnerZscores, fn(array $a, array $b): int => $a['finish'] <=> $b['finish']);\n\n// Show 3 fastest + 3 slowest\n$notableRunners = array_merge(\n    array_slice($runnerZscores, 0, 3),\n    array_slice($runnerZscores, -3),\n);\n\necho str_pad(\"Runner\", 22) . str_pad(\"Time\", 12) . \"Z-score\" . PHP_EOL;\necho str_repeat(\"-\", 45) . PHP_EOL;\nforeach ($notableRunners as $rz) {\n    $zFormatted = ($rz['z'] >= 0 ? \"+\" : \"\") . number_format($rz['z'], 2);\n    echo str_pad($rz['name'], 22)\n        . str_pad(Format::secondsToTime($rz['finish']), 12)\n        . $zFormatted\n        . PHP_EOL;\n}\necho PHP_EOL;\necho \"How to interpret:\" . PHP_EOL;\necho \"- Negative z-scores = faster than average; positive = slower.\" . 
PHP_EOL;\necho \"- Z-scores beyond +/-2 are unusual; beyond +/-3 are very rare.\" . PHP_EOL;\necho \"- The IQR method is more robust for skewed data (doesn't assume symmetry).\" . PHP_EOL;\necho \"- The z-score method can miss outliers because outliers inflate the standard deviation.\" . PHP_EOL;\n\n// =====================================================================\n// Step 8: Confidence Intervals\n// =====================================================================\necho PHP_EOL . \"=== Step 8: Confidence Intervals ===\" . PHP_EOL;\necho \"\\\"How precisely do we know the average finish time?\\\"\" . PHP_EOL . PHP_EOL;\n\n$ciAll = Stat::confidenceInterval($finishTimes, 0.95, 0);\n$ciMen = Stat::confidenceInterval($menTimes, 0.95, 0);\n$ciWomen = Stat::confidenceInterval($womenTimes, 0.95, 0);\n\n$semAll = Stat::sem($finishTimes, 0);\n$semMen = Stat::sem($menTimes, 0);\n$semWomen = Stat::sem($womenTimes, 0);\n\necho \"95% Confidence Intervals:\" . PHP_EOL;\necho \"  All runners: \" . Format::secondsToTime($ciAll[0]) . \" to \" . Format::secondsToTime($ciAll[1])\n    . \"  (SEM: \" . $semAll . \"s)\" . PHP_EOL;\necho \"  Men:         \" . Format::secondsToTime($ciMen[0]) . \" to \" . Format::secondsToTime($ciMen[1])\n    . \"  (SEM: \" . $semMen . \"s)\" . PHP_EOL;\necho \"  Women:       \" . Format::secondsToTime($ciWomen[0]) . \" to \" . Format::secondsToTime($ciWomen[1])\n    . \"  (SEM: \" . $semWomen . \"s)\" . PHP_EOL;\necho PHP_EOL;\necho \"How to interpret:\" . PHP_EOL;\necho \"- The interval gives you a range: we are 95% confident the true mean falls within it.\" . PHP_EOL;\necho \"- Smaller samples produce wider intervals (more uncertainty).\" . PHP_EOL;\necho \"- SEM = stdev / sqrt(n) — as sample size grows, SEM shrinks and the interval tightens.\" . 
PHP_EOL;\n\n// =====================================================================\n// Step 9: Percentile Benchmarks\n// =====================================================================\necho PHP_EOL . \"=== Step 9: Percentile Benchmarks ===\" . PHP_EOL;\necho \"\\\"What time do you need to beat 75% of the field?\\\"\" . PHP_EOL . PHP_EOL;\n\necho \"Percentile benchmarks:\" . PHP_EOL;\n$percentiles = [10, 25, 50, 75, 90];\nforeach ($percentiles as $p) {\n    $val = Stat::percentile($finishTimes, $p, 0);\n    echo \"  P\" . str_pad((string) $p, 3) . \": \" . Format::secondsToTime($val) . \" (\" . $val . \"s)\" . PHP_EOL;\n}\n\necho PHP_EOL;\n$trimmed10 = Stat::trimmedMean($finishTimes, 0.1, 0);\n$trimmed20 = Stat::trimmedMean($finishTimes, 0.2, 0);\necho \"Trimmed means (removing extreme runners):\" . PHP_EOL;\necho \"  Regular mean:       \" . Format::secondsToTime(round($mean)) . PHP_EOL;\necho \"  Trimmed mean (10%): \" . Format::secondsToTime($trimmed10) . PHP_EOL;\necho \"  Trimmed mean (20%): \" . Format::secondsToTime($trimmed20) . PHP_EOL;\n\necho PHP_EOL;\n\n// Weighted median — weight by inverse placement (top finishers weighted more)\n// $runners is grouped by gender and pace, not ordered by finish time, so sort a copy first.\n$sortedTimes = $finishTimes;\nsort($sortedTimes);\n$weights = [];\n$n = count($sortedTimes);\nforeach (array_keys($sortedTimes) as $i) {\n    // Index 0 is the fastest finisher, so faster runners get higher weights\n    $weights[] = $n - $i;\n}\n$wMedian = Stat::weightedMedian($sortedTimes, $weights, 0);\necho \"Weighted median (top finishers weighted more): \" . Format::secondsToTime($wMedian) . PHP_EOL;\necho \"Regular median:                                \" . Format::secondsToTime(round(Stat::median($finishTimes))) . PHP_EOL;\necho PHP_EOL;\necho \"How to interpret:\" . PHP_EOL;\necho \"- P25 is the cutoff to beat 75% of the field.\" . PHP_EOL;\necho \"- If trimmed means get closer to the median, it confirms right skew (slow outliers pull the mean up).\" . 
PHP_EOL;\necho \"- If the weighted median is faster than the regular median, the weighting emphasizes the competitive core.\" . PHP_EOL;\n\n// =====================================================================\n// Step 10: Summary & Functions Used\n// =====================================================================\necho PHP_EOL . str_repeat(\"=\", 60) . PHP_EOL;\necho \"SUMMARY: FUNCTIONS DEMONSTRATED\" . PHP_EOL;\necho str_repeat(\"=\", 60) . PHP_EOL . PHP_EOL;\n\necho \"Functions demonstrated (20+):\" . PHP_EOL;\necho str_pad(\"  Function\", 38) . \"Step\" . PHP_EOL;\necho \"  \" . str_repeat(\"-\", 40) . PHP_EOL;\n$functions = [\n    ['Stat::mean()', '1,2,3,5'],\n    ['Stat::median()', '1,5,9'],\n    ['Stat::stdev()', '1'],\n    ['Stat::quantiles()', '1'],\n    ['Stat::tTestTwoSample()', '2,5'],\n    ['Stat::tTestPaired()', '3'],\n    ['Stat::correlation() — Pearson', '4'],\n    ['Stat::correlation() — Spearman', '4'],\n    ['Stat::linearRegression()', '4'],\n    ['Stat::rSquared()', '4'],\n    ['Stat::coefficientOfVariation()', '5'],\n    ['Stat::skewness()', '6'],\n    ['Stat::kurtosis()', '6'],\n    ['NormalDist::fromSamples()', '6'],\n    ['NormalDist::cdf()', '6'],\n    ['Stat::outliers()', '7'],\n    ['Stat::iqrOutliers()', '7'],\n    ['Stat::zscores()', '7'],\n    ['Stat::confidenceInterval()', '8'],\n    ['Stat::sem()', '8'],\n    ['Stat::percentile()', '9'],\n    ['Stat::trimmedMean()', '9'],\n    ['Stat::weightedMedian()', '9'],\n];\nforeach ($functions as $f) {\n    echo \"  \" . str_pad($f[0], 36) . $f[1] . PHP_EOL;\n}\n"
  },
  {
    "path": "examples/article-downhill-ski-analysis.php",
    "content": "<?php\n\n/**\n * Exploring Olympic Downhill Results with PHP Statistics\n *\n * This script accompanies the article:\n * https://dev.to/robertobutti/exploring-olympic-downhill-results-with-php-statistics-3eo1\n *\n * Each section below corresponds to a step in the article.\n * Run it with: php examples/article-downhill-ski-analysis.php\n */\n\nrequire __DIR__ . \"/../vendor/autoload.php\";\n\nuse HiFolks\\Statistics\\Freq;\nuse HiFolks\\Statistics\\NormalDist;\nuse HiFolks\\Statistics\\Stat;\nuse HiFolks\\Statistics\\StreamingStat;\n\n// === The Data ===\n// 2026 Olympic Men's Downhill — 34 athletes, times in seconds.\n\n$results = [\n    [\"name\" => \"Franjo von ALLMEN\", \"time\" => 111.61],\n    [\"name\" => \"Giovanni FRANZONI\", \"time\" => 111.81],\n    [\"name\" => \"Dominik PARIS\", \"time\" => 112.11],\n    [\"name\" => \"Marco ODERMATT\", \"time\" => 112.31],\n    [\"name\" => \"Alexis MONNEY\", \"time\" => 112.36],\n    [\"name\" => \"Vincent KRIECHMAYR\", \"time\" => 112.38],\n    [\"name\" => \"Daniel HEMETSBERGER\", \"time\" => 112.58],\n    [\"name\" => \"Nils ALLEGRE\", \"time\" => 112.8],\n    [\"name\" => \"James CRAWFORD\", \"time\" => 113.0],\n    [\"name\" => \"Kyle NEGOMIR\", \"time\" => 113.2],\n    [\"name\" => \"Mattia CASSE\", \"time\" => 113.28],\n    [\"name\" => \"Miha HROBAT\", \"time\" => 113.3],\n    [\"name\" => \"Bryce BENNETT\", \"time\" => 113.45],\n    [\"name\" => \"Cameron ALEXANDER\", \"time\" => 113.49],\n    [\"name\" => \"Raphael HAASER\", \"time\" => 113.5],\n    [\"name\" => \"Martin CATER\", \"time\" => 113.51],\n    [\"name\" => \"Florian SCHIEDER\", \"time\" => 113.57],\n    [\"name\" => \"Ryan COCHRAN-SIEGLE\", \"time\" => 113.63],\n    [\"name\" => \"Sam MORSE\", \"time\" => 113.68],\n    [\"name\" => \"Elian LEHTO\", \"time\" => 113.83],\n    [\"name\" => \"Simon JOCHER\", \"time\" => 114.01],\n    [\"name\" => \"Nils ALPHAND\", \"time\" => 114.06],\n    [\"name\" => \"Stefan ROGENTIN\", \"time\" 
=> 114.18],\n    [\"name\" => \"Jan ZABYSTRAN\", \"time\" => 114.39],\n    [\"name\" => \"Jeffrey READ\", \"time\" => 114.56],\n    [\"name\" => \"Stefan BABINSKY\", \"time\" => 114.73],\n    [\"name\" => \"Alban ELEZI CANNAFERINA\", \"time\" => 114.9],\n    [\"name\" => \"Brodie SEGER\", \"time\" => 114.96],\n    [\"name\" => \"Marco PFIFFNER\", \"time\" => 115.66],\n    [\"name\" => \"Barnabas SZOLLOS\", \"time\" => 117.03],\n    [\"name\" => \"Arnaud ALESSANDRIA\", \"time\" => 117.15],\n    [\"name\" => \"Elvis OPMANIS\", \"time\" => 119.24],\n    [\"name\" => \"Dmytro SHEPIUK\", \"time\" => 120.11],\n    [\"name\" => \"Cormac COMERFORD\", \"time\" => 124.4],\n];\n\n$times = array_column($results, \"time\");\n\n// =====================================================================\n// Step 1: Descriptive Statistics\n// =====================================================================\necho \"=== Step 1: Descriptive Statistics ===\" . PHP_EOL . PHP_EOL;\n\n$mean = Stat::mean($times);\n$median = Stat::median($times);\n$std = Stat::stdev($times);\n$min = min($times);\n$max = max($times);\n$range = $max - $min;\n$quartiles = Stat::quantiles($times);\n\necho \"Sample size: \" . count($times) . PHP_EOL;\necho \"Mean time:   \" . round($mean, 2) . \" seconds\" . PHP_EOL;\necho \"Median time: \" . round($median, 2) . \" seconds\" . PHP_EOL;\necho \"Std dev:     \" . round($std, 2) . \" seconds\" . PHP_EOL;\necho \"Min: \" . $min . \"s | Max: \" . $max . \"s | Range: \" . round($range, 2) . \"s\" . PHP_EOL;\necho \"Quartiles (Q1, Q2, Q3): \"\n    . round($quartiles[0], 2) . \"s, \"\n    . round($quartiles[1], 2) . \"s, \"\n    . round($quartiles[2], 2) . \"s\"\n    . PHP_EOL;\necho PHP_EOL;\n\necho \"Observations:\" . PHP_EOL;\necho \"- The mean (114.38) is higher than the median (113.60) — right skew.\" . PHP_EOL;\necho \"- The range (12.79s) is large relative to the std dev (2.60s).\" . 
PHP_EOL;\necho \"- Q1 to Q3 spans only ~1.82s, so the middle 50% is tightly packed.\" . PHP_EOL;\n\n// =====================================================================\n// Step 1b: Robust Central Tendency\n// =====================================================================\necho PHP_EOL . \"=== Step 1b: Robust Central Tendency ===\" . PHP_EOL . PHP_EOL;\n\n$trimmedMean10 = Stat::trimmedMean($times, 0.1, 2);\n$trimmedMean20 = Stat::trimmedMean($times, 0.2, 2);\n\necho \"Regular mean:       \" . round(Stat::mean($times), 2) . \"s\" . PHP_EOL;\necho \"Trimmed mean (10%): \" . $trimmedMean10 . \"s\" . PHP_EOL;\necho \"Trimmed mean (20%): \" . $trimmedMean20 . \"s\" . PHP_EOL;\necho PHP_EOL;\necho \"The trimmed mean removes extreme values from each end.\" . PHP_EOL;\necho \"With 10% cut, the 3 fastest and 3 slowest are excluded.\" . PHP_EOL;\necho \"Result: the 'typical' time drops from 114.38s to 113.91s.\" . PHP_EOL;\n\n// =====================================================================\n// Step 1c: Percentile Analysis\n// =====================================================================\necho PHP_EOL . \"=== Step 1c: Percentile Analysis ===\" . PHP_EOL . PHP_EOL;\n\necho \"P10: \" . Stat::percentile($times, 10, 2) . \"s — elite threshold\" . PHP_EOL;\necho \"P25: \" . Stat::percentile($times, 25, 2) . \"s — top quarter\" . PHP_EOL;\necho \"P50: \" . Stat::percentile($times, 50, 2) . \"s — median\" . PHP_EOL;\necho \"P75: \" . Stat::percentile($times, 75, 2) . \"s — bottom quarter\" . PHP_EOL;\necho \"P90: \" . Stat::percentile($times, 90, 2) . \"s — struggling\" . PHP_EOL;\necho PHP_EOL;\necho \"Notice: P75-P90 gap (3.4s) is much larger than P10-P25 gap (0.7s).\" . PHP_EOL;\necho \"This asymmetry IS the right skew, quantified.\" . PHP_EOL;\n\n// =====================================================================\n// Step 1d: Precision of the Mean\n// =====================================================================\necho PHP_EOL . 
\"=== Step 1d: Precision of the Mean (SEM) ===\" . PHP_EOL . PHP_EOL;\n\n$sem = Stat::sem($times, 2);\necho \"SEM: \" . $sem . \"s\" . PHP_EOL;\necho \"95% confidence interval: \"\n    . round(Stat::mean($times) - 1.96 * $sem, 2) . \"s to \"\n    . round(Stat::mean($times) + 1.96 * $sem, 2) . \"s\"\n    . PHP_EOL;\necho PHP_EOL;\necho \"With 34 athletes, we estimate the true mean within ~\"\n    . round($sem * 1.96, 2) . \"s at 95% confidence.\" . PHP_EOL;\n\n// =====================================================================\n// Step 2: Fitting a Normal Distribution\n// =====================================================================\necho PHP_EOL . \"=== Step 2: Fitting a Normal Distribution ===\" . PHP_EOL . PHP_EOL;\n\n$normal = NormalDist::fromSamples($times);\necho \"Estimated mu (mean):     \" . $normal->getMeanRounded(2) . \" seconds\" . PHP_EOL;\necho \"Estimated sigma (std):   \" . $normal->getSigmaRounded(2) . \" seconds\" . PHP_EOL;\necho PHP_EOL;\necho \"Model median: \" . $normal->getMedianRounded(2) . \"s\" . PHP_EOL;\necho \"Actual median: \" . round($median, 2) . \"s\" . PHP_EOL;\necho \"Difference: \" . round($normal->getMedianRounded(2) - $median, 2) . \"s\" . PHP_EOL;\necho \"(the right skew pulls the model median = mean upward)\" . PHP_EOL;\n\n// =====================================================================\n// Step 3: Asking Probabilistic Questions\n// =====================================================================\necho PHP_EOL . \"=== Step 3: Probabilistic Questions ===\" . PHP_EOL . PHP_EOL;\n\n$target = 113.0;\n$probUnder = $normal->cdfRounded($target, 4);\n$actualUnder = count(array_filter($times, fn(float $t): bool => $t <= $target));\necho \"Q: What is the probability of finishing in \" . $target . \"s or less?\" . PHP_EOL;\necho \"Model:  P(time <= \" . $target . \"s) = \"\n    . round($probUnder * 100, 1) . \"%\" . PHP_EOL;\necho \"Actual: \" . $actualUnder . \"/\" . count($times)\n    . \" = \" . 
round(($actualUnder / count($times)) * 100, 1) . \"%\" . PHP_EOL;\necho \"(the gap shows the effect of skewness on the normal model)\" . PHP_EOL;\necho PHP_EOL;\necho \"PDF at \" . $target . \"s = \" . $normal->pdfRounded($target, 6) . PHP_EOL;\n\n// =====================================================================\n// Step 4: Performance Thresholds (Inverse CDF)\n// =====================================================================\necho PHP_EOL . \"=== Step 4: Performance Thresholds ===\" . PHP_EOL . PHP_EOL;\n\n$eliteThreshold = $normal->invCdfRounded(0.2, 2);\n$slowThreshold = $normal->invCdfRounded(0.8, 2);\necho \"Top 20% fastest (below):  \" . $eliteThreshold . \" seconds\" . PHP_EOL;\necho \"Slowest 20% (above):      \" . $slowThreshold . \" seconds\" . PHP_EOL;\n\n// =====================================================================\n// Step 5: Z-scores\n// =====================================================================\necho PHP_EOL . \"=== Step 5: Z-scores ===\" . PHP_EOL . PHP_EOL;\n\necho str_pad(\"Athlete\", 30)\n    . str_pad(\"Time\", 10)\n    . str_pad(\"Z-score\", 10)\n    . \"Tier\"\n    . PHP_EOL;\necho str_repeat(\"-\", 65) . PHP_EOL;\n\n$tierDefinitions = [\n    [\"max\" => 0.20, \"label\" => \"Elite\"],\n    [\"max\" => 0.50, \"label\" => \"Strong\"],\n    [\"max\" => 0.80, \"label\" => \"Average\"],\n    [\"max\" => 1.00, \"label\" => \"Below avg\"],\n];\n\nforeach ($results as $r) {\n    $time = $r[\"time\"];\n    $percentile = $normal->cdf($time);\n\n    $tier = \"Below avg\";\n    foreach ($tierDefinitions as $def) {\n        if ($percentile <= $def[\"max\"]) {\n            $tier = $def[\"label\"];\n            break;\n        }\n    }\n\n    $z = $normal->zscoreRounded($time, 2);\n    $zFormatted = ($z >= 0 ? \"+\" : \"\") . number_format($z, 2);\n\n    echo str_pad($r[\"name\"], 30)\n        . str_pad(number_format($time, 2) . \"s\", 10)\n        . str_pad($zFormatted, 10)\n        . $tier\n        . 
PHP_EOL;\n}\n\n// =====================================================================\n// Step 5b: Outlier Detection\n// =====================================================================\necho PHP_EOL . \"=== Step 5b: Outlier Detection ===\" . PHP_EOL . PHP_EOL;\n\n// Method 1: Z-score\necho \"Method 1: Z-score based (threshold = 2.5)\" . PHP_EOL;\n$zscoreOutliers = Stat::outliers($times, 2.5);\nif ($zscoreOutliers === []) {\n    echo \"  No outliers detected.\" . PHP_EOL;\n} else {\n    foreach ($zscoreOutliers as $time) {\n        $name = \"\";\n        foreach ($results as $r) {\n            if ($r[\"time\"] === $time) {\n                $name = $r[\"name\"];\n                break;\n            }\n        }\n        echo \"  \" . $time . \"s — \" . $name . PHP_EOL;\n    }\n}\n\n// Method 2: IQR\necho PHP_EOL . \"Method 2: IQR based (factor = 1.5, box plot whiskers)\" . PHP_EOL;\n$iqrOutliers = Stat::iqrOutliers($times);\nif ($iqrOutliers === []) {\n    echo \"  No outliers detected.\" . PHP_EOL;\n} else {\n    foreach ($iqrOutliers as $time) {\n        $name = \"\";\n        foreach ($results as $r) {\n            if ($r[\"time\"] === $time) {\n                $name = $r[\"name\"];\n                break;\n            }\n        }\n        echo \"  \" . $time . \"s — \" . $name . PHP_EOL;\n    }\n}\n\necho PHP_EOL;\necho \"Z-score detected 1 outlier; IQR detected 3.\" . PHP_EOL;\necho \"IQR is more robust for skewed data — outliers don't inflate\" . PHP_EOL;\necho \"the detection threshold (unlike z-score, where they inflate stdev).\" . PHP_EOL;\n\n// =====================================================================\n// Step 6: Classifying Athletes into Tiers\n// =====================================================================\necho PHP_EOL . \"=== Step 6: Athlete Tier Classification ===\" . PHP_EOL . PHP_EOL;\n\necho \"Using the normal model's CDF to assign tiers:\" . PHP_EOL;\necho \"  Elite:     bottom 20% of the CDF (fastest)\" . 
PHP_EOL;\necho \"  Strong:    20%–50%\" . PHP_EOL;\necho \"  Average:   50%–80%\" . PHP_EOL;\necho \"  Below avg: 80%–100% (slowest)\" . PHP_EOL;\necho PHP_EOL;\n\n$tierCounts = [\"Elite\" => 0, \"Strong\" => 0, \"Average\" => 0, \"Below avg\" => 0];\nforeach ($results as $r) {\n    $percentile = $normal->cdf($r[\"time\"]);\n    foreach ($tierDefinitions as $def) {\n        if ($percentile <= $def[\"max\"]) {\n            $tierCounts[$def[\"label\"]]++;\n            break;\n        }\n    }\n}\nforeach ($tierCounts as $tier => $count) {\n    echo str_pad($tier, 12) . str_repeat(\"*\", $count) . \" (\" . $count . \")\" . PHP_EOL;\n}\n\n// =====================================================================\n// Step 7: Frequency Table\n// =====================================================================\necho PHP_EOL . \"=== Step 7: Frequency Table (1-second bins) ===\" . PHP_EOL . PHP_EOL;\n\n$freqTable = Freq::frequencyTableBySize($times, 1);\nforeach ($freqTable as $class => $count) {\n    echo str_pad($class . \"s\", 8)\n        . str_repeat(\"*\", $count)\n        . \" (\" . $count . \")\"\n        . PHP_EOL;\n}\n\n// =====================================================================\n// Step 8: Skewness and Kurtosis\n// =====================================================================\necho PHP_EOL . \"=== Step 8: Skewness and Kurtosis ===\" . PHP_EOL . PHP_EOL;\n\necho \"Skewness: \" . Stat::skewness($times, 4) . PHP_EOL;\necho \"  (positive = right-skewed, a few slow finishers pull the tail)\" . PHP_EOL;\necho \"Kurtosis: \" . Stat::kurtosis($times, 4) . PHP_EOL;\necho \"  (positive = heavy tails, outliers present)\" . PHP_EOL;\n\n// =====================================================================\n// Step 9: Dispersion Beyond Standard Deviation\n// =====================================================================\necho PHP_EOL . \"=== Step 9: Dispersion Measures Compared ===\" . PHP_EOL . 
PHP_EOL;\n\n$stdev = Stat::stdev($times, 4);\n$mad = Stat::meanAbsoluteDeviation($times, 4);\n$medianAD = Stat::medianAbsoluteDeviation($times, 4);\n\necho \"Standard deviation:        \" . $stdev . \"s\" . PHP_EOL;\necho \"Mean Absolute Deviation:   \" . $mad . \"s\" . PHP_EOL;\necho \"Median Absolute Deviation: \" . $medianAD . \"s\" . PHP_EOL;\necho PHP_EOL;\necho \"The median absolute deviation (0.88s) is much smaller than\" . PHP_EOL;\necho \"the stdev (2.60s). This reveals two groups: a tight core pack\" . PHP_EOL;\necho \"(within ~1 second of each other) and a few stragglers.\" . PHP_EOL;\n\n// =====================================================================\n// Step 10: Coefficient of Variation\n// =====================================================================\necho PHP_EOL . \"=== Step 10: Coefficient of Variation ===\" . PHP_EOL . PHP_EOL;\n\n$cvFull = Stat::coefficientOfVariation($times, 2);\n$top10 = array_slice($times, 0, 10);\n$cvTop10 = Stat::coefficientOfVariation($top10, 2);\n\necho \"Full field CV: \" . $cvFull . \"%\" . PHP_EOL;\necho \"Top 10 CV:     \" . $cvTop10 . \"%\" . PHP_EOL;\necho PHP_EOL;\necho \"The top 10 is 5x tighter than the full field.\" . PHP_EOL;\necho \"CV lets you compare tightness across different events or years.\" . PHP_EOL;\n\n// =====================================================================\n// Step 11: Weighted Median\n// =====================================================================\necho PHP_EOL . \"=== Step 11: Weighted Median ===\" . PHP_EOL . PHP_EOL;\n\n$weights = [];\nforeach ($results as $i => $r) {\n    $weights[] = $i < 15 ? 3.0 : 1.0;\n}\n$wMedian = Stat::weightedMedian($times, $weights, 2);\n\necho \"Regular median:  \" . round(Stat::median($times), 2) . \"s\" . PHP_EOL;\necho \"Weighted median: \" . $wMedian . \"s  (top-15 seeded athletes weighted 3x)\" . PHP_EOL;\necho PHP_EOL;\necho \"The weighted median answers: 'What does a competitive time look like?'\" . 
PHP_EOL;\necho \"rather than 'What does the typical time look like?'\" . PHP_EOL;\n\n// =====================================================================\n// Step 12: StreamingStat — Real-Time Processing\n// =====================================================================\necho PHP_EOL . \"=== Step 12: StreamingStat (O(1) Memory) ===\" . PHP_EOL . PHP_EOL;\n\n$stream = new StreamingStat();\n\nforeach ($results as $i => $r) {\n    $stream->add($r[\"time\"]);\n\n    if (in_array($i + 1, [5, 10, 20, 34])) {\n        echo \"After \" . str_pad($stream->count(), 2) . \" athletes: \"\n            . \"mean=\" . $stream->mean(2) . \"s, \"\n            . \"stdev=\" . $stream->stdev(2) . \"s, \"\n            . \"min=\" . $stream->min() . \"s, \"\n            . \"max=\" . $stream->max() . \"s\"\n            . PHP_EOL;\n    }\n}\n\necho PHP_EOL;\necho \"Final streaming results match Stat:\" . PHP_EOL;\necho \"  Streaming mean:  \" . $stream->mean(2) . \"s  vs  Stat::mean: \" . round($mean, 2) . \"s\" . PHP_EOL;\necho \"  Streaming stdev: \" . $stream->stdev(2) . \"s  vs  Stat::stdev: \" . round($std, 2) . \"s\" . PHP_EOL;\n\n// =====================================================================\n// When the Normal Distribution Works (and When It Doesn't)\n// =====================================================================\necho PHP_EOL . \"=== Model Limitations ===\" . PHP_EOL . PHP_EOL;\n\necho \"The normal model is a useful approximation, but this data is\" . PHP_EOL;\necho \"right-skewed (skewness: \" . Stat::skewness($times, 2) . \"). Signs of misfit:\" . PHP_EOL;\necho \"- Model median (\" . $normal->getMedianRounded(2)\n    . \"s) differs from actual median (\" . round($median, 2) . \"s)\" . PHP_EOL;\necho \"- Model P(time <= 113s) = \" . round($normal->cdf(113.0) * 100, 1)\n    . \"%, actual = \" . round((count(array_filter($times, fn(float $t): bool => $t <= 113.0)) / count($times)) * 100, 1) . \"%\" . PHP_EOL;\necho \"- Kurtosis (\" . 
Stat::kurtosis($times, 2) . \") >> 0 — heavier tails than normal\" . PHP_EOL;\necho PHP_EOL;\necho \"For this dataset, robust measures (trimmed mean, IQR outliers,\" . PHP_EOL;\necho \"median absolute deviation) give more reliable insights than\" . PHP_EOL;\necho \"methods that assume normality.\" . PHP_EOL;\n\n// =====================================================================\n// Summary\n// =====================================================================\necho PHP_EOL . str_repeat(\"=\", 55) . PHP_EOL;\necho \"SUMMARY\" . PHP_EOL;\necho str_repeat(\"=\", 55) . PHP_EOL;\necho \"Winner:              \" . $results[0][\"name\"] . \" (\" . $results[0][\"time\"] . \"s)\" . PHP_EOL;\necho \"Mean / Median:       \" . round($mean, 2) . \"s / \" . round($median, 2) . \"s\" . PHP_EOL;\necho \"Trimmed mean (10%):  \" . $trimmedMean10 . \"s\" . PHP_EOL;\necho \"Core pack spread:    \" . $medianAD . \"s (median abs deviation)\" . PHP_EOL;\necho \"Race tightness (CV): \" . $cvFull . \"% (full field), \" . $cvTop10 . \"% (top 10)\" . PHP_EOL;\necho \"Outliers (IQR):      \" . count($iqrOutliers) . \" athletes flagged\" . PHP_EOL;\necho \"Distribution:        right-skewed (skewness \" . Stat::skewness($times, 2) . \")\" . PHP_EOL;\n"
  },
  {
    "path": "examples/article-gpx-running-analysis.php",
    "content": "<?php\n\n/**\n * Analyze Your Running Performance with GPX Data and PHP Statistics\n *\n * This script shows how to parse a GPX file from your sport watch\n * and analyze your running performance using the hi-folks/statistics package.\n *\n * It includes helper functions for GPX parsing, plus simulated data\n * so you can run it immediately without a GPX file.\n *\n * Run it with: php examples/article-gpx-running-analysis.php\n */\n\nrequire __DIR__ . \"/../vendor/autoload.php\";\n\nuse HiFolks\\Statistics\\Freq;\nuse HiFolks\\Statistics\\Stat;\nuse HiFolks\\Statistics\\Utils\\Arr;\nuse HiFolks\\Statistics\\Utils\\Format;\n\n// ============================================================\n// HELPER FUNCTIONS — GPX parsing and distance calculation\n// ============================================================\n\n/**\n * Parse a GPX file and return an array of trackpoints.\n * Each trackpoint: ['lat' => float, 'lon' => float, 'ele' => float,\n *                   'time' => int (unix timestamp), 'hr' => int|null]\n */\nfunction parseGpx(string $filePath): array\n{\n    $xml = simplexml_load_file($filePath);\n    if ($xml === false) {\n        throw new RuntimeException(\"Cannot parse GPX file: {$filePath}\");\n    }\n\n    $namespaces = $xml->getNamespaces(true);\n\n    $points = [];\n    foreach ($xml->trk->trkseg->trkpt as $trkpt) {\n        $point = [\n            \"lat\" => (float) $trkpt[\"lat\"],\n            \"lon\" => (float) $trkpt[\"lon\"],\n            \"ele\" => property_exists($trkpt, 'ele') && $trkpt->ele !== null ? (float) $trkpt->ele : 0.0,\n            \"time\" => property_exists($trkpt, 'time') && $trkpt->time !== null\n                ? 
strtotime((string) $trkpt->time)\n                : 0,\n            \"hr\" => null,\n        ];\n\n        // Try to extract heart rate from Garmin TrackPointExtension\n        if (isset($namespaces[\"gpxtpx\"])) {\n            $extensions = $trkpt->extensions;\n            if ($extensions) {\n                $gpxtpx = $extensions->children($namespaces[\"gpxtpx\"]);\n                if (property_exists($gpxtpx->TrackPointExtension, 'hr') && $gpxtpx->TrackPointExtension->hr !== null) {\n                    $point[\"hr\"] = (int) $gpxtpx->TrackPointExtension->hr;\n                }\n            }\n        }\n\n        $points[] = $point;\n    }\n\n    return $points;\n}\n\n/**\n * Haversine distance between two GPS coordinates in meters.\n */\nfunction haversineDistance(\n    float $lat1,\n    float $lon1,\n    float $lat2,\n    float $lon2,\n): float {\n    $R = 6371000; // Earth radius in meters\n    $dLat = deg2rad($lat2 - $lat1);\n    $dLon = deg2rad($lon2 - $lon1);\n    $a\n        = sin($dLat / 2) ** 2\n        + cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * sin($dLon / 2) ** 2;\n\n    return $R * 2 * atan2(sqrt($a), sqrt(1 - $a));\n}\n\n/**\n * Build per-kilometer splits from trackpoints.\n * Returns array of ['km' => int, 'time' => int (seconds), 'pace' => int (sec/km),\n *                    'eleGain' => float, 'eleLoss' => float, 'avgHr' => int|null]\n */\nfunction buildKmSplits(array $trackpoints): array\n{\n    $splits = [];\n    $currentKm = 1;\n    $kmDistance = 0;\n    $kmStartTime = $trackpoints[0][\"time\"];\n    $kmEleGain = 0;\n    $kmEleLoss = 0;\n    $kmHrValues = [];\n    $counter = count($trackpoints);\n\n    for ($i = 1; $i < $counter; $i++) {\n        $prev = $trackpoints[$i - 1];\n        $curr = $trackpoints[$i];\n\n        $segDist = haversineDistance(\n            $prev[\"lat\"],\n            $prev[\"lon\"],\n            $curr[\"lat\"],\n            $curr[\"lon\"],\n        );\n        $kmDistance += $segDist;\n\n        $eleDiff = 
$curr[\"ele\"] - $prev[\"ele\"];\n        if ($eleDiff > 0) {\n            $kmEleGain += $eleDiff;\n        } else {\n            $kmEleLoss += abs($eleDiff);\n        }\n\n        if ($curr[\"hr\"] !== null) {\n            $kmHrValues[] = $curr[\"hr\"];\n        }\n\n        if ($kmDistance >= 1000) {\n            $kmTime = $curr[\"time\"] - $kmStartTime;\n            $splits[] = [\n                \"km\" => $currentKm,\n                \"time\" => $kmTime,\n                \"pace\" => $kmTime,\n                \"eleGain\" => round($kmEleGain, 1),\n                \"eleLoss\" => round($kmEleLoss, 1),\n                \"avgHr\"\n                    => count($kmHrValues) > 0\n                        ? (int) round(Stat::mean($kmHrValues))\n                        : null,\n            ];\n\n            $currentKm++;\n            $kmDistance -= 1000;\n            $kmStartTime = $curr[\"time\"];\n            $kmEleGain = 0;\n            $kmEleLoss = 0;\n            $kmHrValues = [];\n        }\n    }\n\n    return $splits;\n}\n\n/**\n * Format a pace in seconds as \"M:SS/km\".\n */\nfunction formatPace(int|float $seconds): string\n{\n    return Format::secondsToTime((int) round($seconds)) . 
\"/km\";\n}\n\n// ============================================================\n// THE DATA\n// ============================================================\n\n// === Option 1: Parse a real GPX file ===\n// Uncomment these lines if you have a GPX file from your sport watch:\n//\n// $trackpoints = parseGpx('your-run.gpx');\n// $splits = buildKmSplits($trackpoints);\n\n// === Option 2: Simulated 10K run ===\n// A realistic 10K with a hilly middle section, slight positive split,\n// and heart rate drifting upward as fatigue accumulates.\n$splits = [\n    [\n        \"km\" => 1,\n        \"time\" => 322,\n        \"pace\" => 322,\n        \"eleGain\" => 5,\n        \"eleLoss\" => 2,\n        \"avgHr\" => 145,\n    ],\n    [\n        \"km\" => 2,\n        \"time\" => 318,\n        \"pace\" => 318,\n        \"eleGain\" => 8,\n        \"eleLoss\" => 3,\n        \"avgHr\" => 150,\n    ],\n    [\n        \"km\" => 3,\n        \"time\" => 335,\n        \"pace\" => 335,\n        \"eleGain\" => 22,\n        \"eleLoss\" => 4,\n        \"avgHr\" => 158,\n    ],\n    [\n        \"km\" => 4,\n        \"time\" => 348,\n        \"pace\" => 348,\n        \"eleGain\" => 28,\n        \"eleLoss\" => 5,\n        \"avgHr\" => 164,\n    ],\n    [\n        \"km\" => 5,\n        \"time\" => 340,\n        \"pace\" => 340,\n        \"eleGain\" => 15,\n        \"eleLoss\" => 18,\n        \"avgHr\" => 162,\n    ],\n    [\n        \"km\" => 6,\n        \"time\" => 312,\n        \"pace\" => 312,\n        \"eleGain\" => 2,\n        \"eleLoss\" => 30,\n        \"avgHr\" => 155,\n    ],\n    [\n        \"km\" => 7,\n        \"time\" => 325,\n        \"pace\" => 325,\n        \"eleGain\" => 3,\n        \"eleLoss\" => 8,\n        \"avgHr\" => 158,\n    ],\n    [\n        \"km\" => 8,\n        \"time\" => 338,\n        \"pace\" => 338,\n        \"eleGain\" => 12,\n        \"eleLoss\" => 5,\n        \"avgHr\" => 165,\n    ],\n    [\n        \"km\" => 9,\n        \"time\" => 352,\n        \"pace\" => 
352,\n        \"eleGain\" => 18,\n        \"eleLoss\" => 3,\n        \"avgHr\" => 170,\n    ],\n    [\n        \"km\" => 10,\n        \"time\" => 330,\n        \"pace\" => 330,\n        \"eleGain\" => 4,\n        \"eleLoss\" => 15,\n        \"avgHr\" => 172,\n    ],\n];\n\n// Extract column arrays we will reuse throughout\n[$paces, $eleGains, $hrValues, $kmNumbers] = Arr::extract($splits, [\n    \"pace\",\n    \"eleGain\",\n    \"avgHr\",\n    \"km\",\n]);\n\n// ============================================================\n// STEP 1: Run Overview\n// ============================================================\n\n$totalDistance = count($splits);\n$totalTime = array_sum(array_column($splits, \"time\"));\n$totalEleGain = array_sum(array_column($splits, \"eleGain\"));\n$totalEleLoss = array_sum(array_column($splits, \"eleLoss\"));\n\necho \"=== STEP 1: Run Overview ===\" . PHP_EOL;\necho \"Distance:        \" . $totalDistance . \" km\" . PHP_EOL;\necho \"Total time:      \" . Format::secondsToTime($totalTime) . PHP_EOL;\necho \"Average pace:    \" . formatPace(Stat::mean($paces)) . PHP_EOL;\necho \"Elevation gain:  +\" . $totalEleGain . \" m\" . PHP_EOL;\necho \"Elevation loss:  -\" . $totalEleLoss . \" m\" . PHP_EOL;\necho \"Average HR:      \" . round(Stat::mean($hrValues)) . \" bpm\" . PHP_EOL;\necho PHP_EOL;\n\n// ============================================================\n// STEP 2: Pace Descriptive Statistics\n// ============================================================\n\n$meanPace = Stat::mean($paces);\n$medianPace = Stat::median($paces);\n$stdevPace = Stat::stdev($paces);\n$quartiles = Stat::quantiles($paces);\n\necho \"=== STEP 2: Pace Descriptive Statistics ===\" . PHP_EOL;\necho \"Mean pace:       \" . formatPace($meanPace) . PHP_EOL;\necho \"Median pace:     \" . formatPace($medianPace) . PHP_EOL;\necho \"Std deviation:   \" . round($stdevPace, 1) . \" sec\" . PHP_EOL;\necho \"Fastest km:      \"\n    . formatPace(min($paces))\n    . \" (km \"\n    . 
$splits[array_search(min($paces), $paces)][\"km\"]\n    . \")\"\n    . PHP_EOL;\necho \"Slowest km:      \"\n    . formatPace(max($paces))\n    . \" (km \"\n    . $splits[array_search(max($paces), $paces)][\"km\"]\n    . \")\"\n    . PHP_EOL;\necho \"Quartiles:       Q1=\"\n    . formatPace($quartiles[0])\n    . \"  Q2=\"\n    . formatPace($quartiles[1])\n    . \"  Q3=\"\n    . formatPace($quartiles[2])\n    . PHP_EOL;\necho PHP_EOL;\n\n// ============================================================\n// STEP 3: Pacing Consistency\n// ============================================================\n\n$cv = Stat::coefficientOfVariation($paces, 2);\n$halfPoint = intdiv(count($splits), 2);\n$firstHalfPaces = array_slice($paces, 0, $halfPoint);\n$secondHalfPaces = array_slice($paces, $halfPoint);\n$meanFirst = Stat::mean($firstHalfPaces);\n$meanSecond = Stat::mean($secondHalfPaces);\n$splitDiff = $meanSecond - $meanFirst;\n$splitPct = round(($splitDiff / $meanFirst) * 100, 1);\n\necho \"=== STEP 3: Pacing Consistency ===\" . PHP_EOL;\necho \"Coefficient of Variation: \" . $cv . \"%\" . PHP_EOL;\necho \"First half avg pace:  \"\n    . formatPace($meanFirst)\n    . \" (km 1-\"\n    . $halfPoint\n    . \")\"\n    . PHP_EOL;\necho \"Second half avg pace: \"\n    . formatPace($meanSecond)\n    . \" (km \"\n    . ($halfPoint + 1)\n    . \"-\"\n    . $totalDistance\n    . \")\"\n    . PHP_EOL;\nif ($splitDiff > 0) {\n    echo \"Positive split: +\"\n        . round($splitDiff, 1)\n        . \" sec/km slower (\"\n        . $splitPct\n        . \"% fade)\"\n        . PHP_EOL;\n} elseif ($splitDiff < 0) {\n    echo \"Negative split: \"\n        . round(abs($splitDiff), 1)\n        . \" sec/km faster (\"\n        . abs($splitPct)\n        . \"% improvement)\"\n        . PHP_EOL;\n} else {\n    echo \"Even split: perfectly consistent pacing\" . 
PHP_EOL;\n}\necho PHP_EOL;\n\n// ============================================================\n// STEP 4: Elevation Impact on Pace\n// ============================================================\n\n$corrEle = Stat::correlation($eleGains, $paces);\n$regEle = Stat::linearRegression($eleGains, $paces);\n$r2Ele = Stat::rSquared($eleGains, $paces, false, 4);\n\necho \"=== STEP 4: Elevation Impact on Pace ===\" . PHP_EOL;\necho \"Correlation (elevation gain vs pace): \" . round($corrEle, 4) . PHP_EOL;\necho \"Linear regression: pace = \"\n    . round($regEle[0], 2)\n    . \" x eleGain + \"\n    . round($regEle[1], 1)\n    . PHP_EOL;\necho \"R-squared: \" . $r2Ele . PHP_EOL;\necho \"Interpretation: each meter of elevation gain costs ~\"\n    . round($regEle[0], 1)\n    . \" seconds per km\"\n    . PHP_EOL;\necho PHP_EOL;\n\n// ============================================================\n// STEP 5: Heart Rate Analysis\n// ============================================================\n\n$meanHr = Stat::mean($hrValues);\n$medianHr = Stat::median($hrValues);\n$stdevHr = Stat::stdev($hrValues);\n\n// Cardiac drift: does HR rise over the course of the run?\n$corrHrKm = Stat::correlation($kmNumbers, $hrValues);\n$regHrKm = Stat::linearRegression($kmNumbers, $hrValues);\n$r2HrKm = Stat::rSquared($kmNumbers, $hrValues, false, 4);\n\n// HR vs pace correlation\n$corrHrPace = Stat::correlation($hrValues, $paces);\n\necho \"=== STEP 5: Heart Rate Analysis ===\" . PHP_EOL;\necho \"Mean HR:    \" . round($meanHr) . \" bpm\" . PHP_EOL;\necho \"Median HR:  \" . round($medianHr) . \" bpm\" . PHP_EOL;\necho \"Std dev:    \" . round($stdevHr, 1) . \" bpm\" . PHP_EOL;\necho \"Min HR:     \"\n    . min($hrValues)\n    . \" bpm | Max HR: \"\n    . max($hrValues)\n    . \" bpm\"\n    . PHP_EOL;\necho PHP_EOL;\n\necho \"Cardiac drift (HR vs km):\" . PHP_EOL;\necho \"  Correlation:      \" . round($corrHrKm, 4) . PHP_EOL;\necho \"  Regression:       HR = \"\n    . round($regHrKm[0], 2)\n    . 
\" x km + \"\n    . round($regHrKm[1], 1)\n    . PHP_EOL;\necho \"  R-squared:        \" . $r2HrKm . PHP_EOL;\necho \"  HR drift per km:  +\" . round($regHrKm[0], 1) . \" bpm/km\" . PHP_EOL;\necho PHP_EOL;\n\necho \"HR vs pace correlation: \" . round($corrHrPace, 4) . PHP_EOL;\necho PHP_EOL;\n\n// Heart rate zone distribution\n$hrZones = Freq::frequencyTableBySize($hrValues, 10);\necho \"Heart Rate Zone Distribution:\" . PHP_EOL;\nforeach ($hrZones as $range => $count) {\n    echo \"  \"\n        . $range\n        . \" bpm: \"\n        . str_repeat(\"#\", $count)\n        . \" (\"\n        . $count\n        . \" km)\"\n        . PHP_EOL;\n}\necho PHP_EOL;\n\n// ============================================================\n// STEP 6: Outlier Detection\n// ============================================================\n\n$zscores = Stat::zscores($paces, 2);\n$zOutliers = Stat::outliers($paces, 2.0);\n$iqrOutliers = Stat::iqrOutliers($paces);\n\necho \"=== STEP 6: Outlier Detection ===\" . PHP_EOL;\necho \"Per-km z-scores (negative = faster than average):\" . PHP_EOL;\nforeach ($splits as $i => $split) {\n    $z = $zscores[$i];\n    $bar\n        = $z < 0\n            ? str_repeat(\"<\", (int) abs(round($z * 5)))\n            : str_repeat(\">\", (int) round($z * 5));\n    echo \"  km \"\n        . str_pad((string) $split[\"km\"], 2, \" \", STR_PAD_LEFT)\n        . \": \"\n        . formatPace($split[\"pace\"])\n        . \"  z=\"\n        . sprintf(\"%+.2f\", $z)\n        . \"  \"\n        . $bar\n        . PHP_EOL;\n}\necho PHP_EOL;\necho \"Z-score outliers (|z| > 2.0): \"\n    . (count($zOutliers) > 0\n        ? implode(\", \", array_map(formatPace(...), $zOutliers))\n        : \"none\")\n    . PHP_EOL;\necho \"IQR outliers:                 \"\n    . (count($iqrOutliers) > 0\n        ? implode(\", \", array_map(formatPace(...), $iqrOutliers))\n        : \"none\")\n    . 
PHP_EOL;\necho PHP_EOL;\n\n// ============================================================\n// STEP 7: Percentile Benchmarks\n// ============================================================\n\necho \"=== STEP 7: Percentile Benchmarks ===\" . PHP_EOL;\necho \"Your pace distribution across this run:\" . PHP_EOL;\n$percentiles = [10, 25, 50, 75, 90];\nforeach ($percentiles as $p) {\n    $val = Stat::percentile($paces, $p, 0);\n    echo \"  P\"\n        . str_pad((string) $p, 2, \" \", STR_PAD_LEFT)\n        . \": \"\n        . formatPace($val)\n        . PHP_EOL;\n}\necho PHP_EOL;\necho \"P10 = your fastest 10% of km were at this pace or faster\" . PHP_EOL;\necho \"P90 = your slowest 10% of km were at this pace or slower\" . PHP_EOL;\necho PHP_EOL;\n\n// ============================================================\n// STEP 8: Distribution Shape\n// ============================================================\n\n$skewness = Stat::skewness($paces, 4);\n$kurtosis = Stat::kurtosis($paces, 4);\n\necho \"=== STEP 8: Distribution Shape ===\" . PHP_EOL;\necho \"Skewness: \" . $skewness . PHP_EOL;\necho \"Kurtosis: \" . $kurtosis . PHP_EOL;\nif ($skewness > 0.2) {\n    echo \"Right-skewed: you have a tail of slower km (hills? fatigue?)\"\n        . PHP_EOL;\n} elseif ($skewness < -0.2) {\n    echo \"Left-skewed: you have a tail of faster km (downhills? strong start?)\"\n        . PHP_EOL;\n} else {\n    echo \"Approximately symmetric pacing\" . PHP_EOL;\n}\necho PHP_EOL;\n\n// ============================================================\n// STEP 9: Confidence Interval on True Pace\n// ============================================================\n\n$ci = Stat::confidenceInterval($paces, 0.95, 0);\n$sem = Stat::sem($paces, 1);\n\necho \"=== STEP 9: Confidence Interval ===\" . PHP_EOL;\necho \"95% CI for your true pace: \"\n    . formatPace($ci[0])\n    . \" to \"\n    . formatPace($ci[1])\n    . PHP_EOL;\necho \"Standard Error of the Mean: \" . $sem . \" sec\" . 
PHP_EOL;\necho \"With more km (longer runs), this interval would narrow.\" . PHP_EOL;\necho PHP_EOL;\n\n// ============================================================\n// STEP 10: Multi-Run Trend Analysis (Simulated)\n// ============================================================\n\n// Simulated: 8 weeks of average 10K paces showing diminishing improvement\n// Early weeks show big gains; later weeks show smaller improvements (plateau effect)\n$weeks = [1, 2, 3, 4, 5, 6, 7, 8];\n$weeklyPaces = [350, 342, 337, 333, 330, 328, 326, 325];\n\n$trendReg = Stat::linearRegression($weeks, $weeklyPaces);\n$trendR2 = Stat::rSquared($weeks, $weeklyPaces, false, 4);\n$trendCorr = Stat::correlation($weeks, $weeklyPaces);\n\necho \"=== STEP 10: Multi-Run Trend (8-Week Simulation) ===\" . PHP_EOL;\necho \"Weekly average paces:\" . PHP_EOL;\nforeach ($weeks as $i => $w) {\n    echo \"  Week \" . $w . \": \" . formatPace($weeklyPaces[$i]) . PHP_EOL;\n}\necho PHP_EOL;\necho \"Trend regression: pace = \"\n    . round($trendReg[0], 2)\n    . \" x week + \"\n    . round($trendReg[1], 1)\n    . PHP_EOL;\necho \"R-squared:        \" . $trendR2 . PHP_EOL;\necho \"Correlation:      \" . round($trendCorr, 4) . PHP_EOL;\necho \"Improvement rate: \"\n    . round(abs($trendReg[0]), 1)\n    . \" seconds/km per week\"\n    . PHP_EOL;\necho PHP_EOL;\n\n// Linear prediction for week 12\n$linearPrediction12 = $trendReg[0] * 12 + $trendReg[1];\necho \"Linear prediction at week 12: \"\n    . formatPace(max(0, $linearPrediction12))\n    . PHP_EOL;\necho \"(Extrapolation — use with caution!)\" . PHP_EOL;\necho PHP_EOL;\n\n// ============================================================\n// STEP 10b: Logarithmic Regression — Modeling the Plateau\n// ============================================================\n\necho \"=== STEP 10b: Logarithmic Regression ===\" . 
PHP_EOL;\necho PHP_EOL;\n\n// Logarithmic model: pace = a * ln(week) + b\n$logReg = Stat::logarithmicRegression($weeks, $weeklyPaces);\n$logWeeks = array_map(log(...), $weeks);\n$logR2 = Stat::rSquared($logWeeks, $weeklyPaces, false, 4);\n\necho \"Logarithmic regression: pace = \"\n    . round($logReg[0], 2)\n    . \" x ln(week) + \"\n    . round($logReg[1], 1)\n    . PHP_EOL;\necho \"R-squared:              \" . $logR2 . PHP_EOL;\necho PHP_EOL;\n\n// Compare models\necho \"Model comparison:\" . PHP_EOL;\necho \"  Linear R²:      \" . $trendR2 . PHP_EOL;\necho \"  Logarithmic R²: \" . $logR2 . PHP_EOL;\necho \"  Better fit:     \"\n    . ($logR2 > $trendR2 ? \"Logarithmic\" : \"Linear\")\n    . PHP_EOL;\necho PHP_EOL;\n\n// Compare predictions\n$logPrediction12 = $logReg[0] * log(12) + $logReg[1];\n$logPrediction20 = $logReg[0] * log(20) + $logReg[1];\n$linearPrediction20 = $trendReg[0] * 20 + $trendReg[1];\n\necho \"Predictions:\" . PHP_EOL;\necho \"  Week 12 — Linear: \"\n    . formatPace(max(0, $linearPrediction12))\n    . \"  |  Logarithmic: \"\n    . formatPace(max(0, $logPrediction12))\n    . PHP_EOL;\necho \"  Week 20 — Linear: \"\n    . formatPace(max(0, $linearPrediction20))\n    . \"  |  Logarithmic: \"\n    . formatPace(max(0, $logPrediction20))\n    . PHP_EOL;\necho PHP_EOL;\necho \"The logarithmic model predicts more conservative (realistic) paces\" . PHP_EOL;\necho \"because it accounts for the natural plateau in athletic improvement.\" . PHP_EOL;\necho PHP_EOL;\n\n// ============================================================\n// STEP 10c: All Four Models Compared\n// ============================================================\n\necho \"=== STEP 10c: All Four Models Compared ===\" . 
PHP_EOL;\necho PHP_EOL;\n\n// Power: pace = a * week^b\n[$aPow, $bPow] = Stat::powerRegression($weeks, $weeklyPaces);\n$logPaces = array_map(log(...), $weeklyPaces);\n$r2Pow = Stat::rSquared($logWeeks, $logPaces, false, 4);\n\n// Exponential: pace = a * e^(b * week)\n[$aExp, $bExp] = Stat::exponentialRegression($weeks, $weeklyPaces);\n$r2Exp = Stat::rSquared($weeks, $logPaces, false, 4);\n\n// Predictions for week 12, 20, 52\n$predWeeks = [12, 20, 52];\n$models = [\n    'Linear' => [\n        'r2' => $trendR2,\n        'predict' => fn($w): int|float => $trendReg[0] * $w + $trendReg[1],\n    ],\n    'Logarithmic' => [\n        'r2' => $logR2,\n        'predict' => fn($w): float => $logReg[0] * log($w) + $logReg[1],\n    ],\n    'Power' => [\n        'r2' => $r2Pow,\n        'predict' => fn($w): float|int => $aPow * $w ** $bPow,\n    ],\n    'Exponential' => [\n        'r2' => $r2Exp,\n        'predict' => fn($w): float => $aExp * exp($bExp * $w),\n    ],\n];\n\necho str_pad(\"Model\", 18)\n    . str_pad(\"R²\", 11)\n    . str_pad(\"Week 12\", 11)\n    . str_pad(\"Week 20\", 11)\n    . \"Week 52\"\n    . PHP_EOL;\necho str_repeat(\"-\", 58) . PHP_EOL;\n\nforeach ($models as $name => $model) {\n    echo str_pad($name, 18)\n        . str_pad((string) $model['r2'], 11)\n        . str_pad(formatPace(max(0, $model['predict'](12))), 11)\n        . str_pad(formatPace(max(0, $model['predict'](20))), 11)\n        . formatPace(max(0, $model['predict'](52)))\n        . PHP_EOL;\n}\necho PHP_EOL;\n\n// Find the best model by R²\n$bestModel = '';\n$bestR2 = 0;\nforeach ($models as $name => $model) {\n    if ($model['r2'] > $bestR2) {\n        $bestR2 = $model['r2'];\n        $bestModel = $name;\n    }\n}\necho \"Best fit by R²: \" . $bestModel . \" (R² = \" . $bestR2 . \")\" . PHP_EOL;\necho \"The data tells us the improvement pattern follows a curve, not a straight line.\" . PHP_EOL;\n"
  },
  {
    "path": "examples/freq_methods.php",
    "content": "<?php\n\nrequire __DIR__ . '/../vendor/autoload.php';\n\n$data = [55, 70, 57, 73, 55, 59, 64, 72,\n    60, 48, 58, 54, 69, 51, 63, 78,\n    75, 64, 65, 57, 71, 78, 76, 62,\n    49, 66, 62, 76, 61, 63, 63, 76,\n    52, 76, 71, 61, 53, 56, 67, 71, ];\n$result = \\HiFolks\\Statistics\\Freq::frequencyTable($data, 7);\necho min($data) . PHP_EOL;\necho max($data) . PHP_EOL;\nprint_r($result);\n\n$data = [1, 1, 1, 4, 4, 5, 5, 5, 6, 7, 8, 8, 8, 9, 9, 9, 9, 9, 9, 10, 10, 11, 12, 12,\n    13, 14, 14, 15, 15, 16, 16, 16, 16, 17, 17, 17, 18, 18, ];\n$result = \\HiFolks\\Statistics\\Freq::frequencyTableBySize($data, 4);\nprint_r($result);\n$result = \\HiFolks\\Statistics\\Freq::frequencyTable($data, 5);\necho count($data) . PHP_EOL;\necho min($data) . PHP_EOL;\necho max($data) . PHP_EOL;\nprint_r($result);\n"
  },
  {
    "path": "examples/frequencies.php",
    "content": "<?php\n\nrequire __DIR__ . '/../vendor/autoload.php';\n\nuse HiFolks\\Statistics\\Freq;\nuse HiFolks\\Statistics\\Statistics;\n\n$fruits = ['🍈', '🍈', '🍈', '🍉', '🍉', '🍉', '🍉', '🍉', '🍌'];\n$freqTable = Freq::frequencies($fruits);\nprint_r($freqTable);\n/*\nArray\n(\n    [🍈] => 3\n    [🍉] => 5\n    [🍌] => 1\n)\n */\n\n$freqTable = Freq::relativeFrequencies($fruits, 2);\nprint_r($freqTable);\n/*\nArray\n(\n    [🍈] => 33.33\n    [🍉] => 55.56\n    [🍌] => 11.11\n)\n */\n\n$s = Statistics::make(\n    [98, 90, 70, 18, 92, 92, 55, 83, 45, 95, 88, 76],\n);\n$a = $s->frequencies();\nprint_r($a);\n/*\nArray\n(\n    [18] => 1\n    [45] => 1\n    [55] => 1\n    [70] => 1\n    [76] => 1\n    [83] => 1\n    [88] => 1\n    [90] => 1\n    [92] => 2\n    [95] => 1\n    [98] => 1\n)\n */\n\n$a = $s->relativeFrequencies();\nprint_r($a);\n/*\nArray\n(\n    [18] => 8.3333333333333\n    [45] => 8.3333333333333\n    [55] => 8.3333333333333\n    [70] => 8.3333333333333\n    [76] => 8.3333333333333\n    [83] => 8.3333333333333\n    [88] => 8.3333333333333\n    [90] => 8.3333333333333\n    [92] => 16.666666666667\n    [95] => 8.3333333333333\n    [98] => 8.3333333333333\n)\n */\n"
  },
  {
    "path": "examples/kde.php",
    "content": "<?php\n\nrequire __DIR__ . \"/../vendor/autoload.php\";\n\nuse HiFolks\\Statistics\\Enums\\KdeKernel;\nuse HiFolks\\Statistics\\Stat;\n\n/**\n * Kernel Density Estimation (KDE) examples.\n *\n * KDE builds a smooth, continuous probability density function from\n * discrete sample data.  Think of it as a \"smoothed histogram\" that\n * lets you estimate the likelihood of any value — not just the ones\n * you observed.\n *\n * Inspired by the Python statistics module:\n * https://docs.python.org/3/library/statistics.html#statistics.kde\n */\n\n// ---------------------------------------------------------------\n// 1.  Basic PDF estimation (Wikipedia example)\n// ---------------------------------------------------------------\necho \"=== 1. Basic PDF estimation ===\" . PHP_EOL . PHP_EOL;\n\n$sample = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2];\n$h = 1.5;\n\n$f = Stat::kde($sample, h: $h);\n\n// Evaluate the estimated density at a few points\n$points = [-4.0, -2.0, 0.0, 2.0, 4.0, 6.0, 8.0];\necho \"Sample : \" . implode(\", \", $sample) . PHP_EOL;\necho \"Bandwidth h = $h\" . PHP_EOL . PHP_EOL;\n\necho str_pad(\"x\", 8) . \"f(x)\" . PHP_EOL;\necho str_repeat(\"-\", 24) . PHP_EOL;\nforeach ($points as $x) {\n    $density = $f($x);\n    echo str_pad(number_format($x, 1), 8) . number_format($density, 6) . PHP_EOL;\n}\n\n// ---------------------------------------------------------------\n// 2.  ASCII density plot\n// ---------------------------------------------------------------\necho PHP_EOL . \"=== 2. ASCII density plot ===\" . PHP_EOL . 
PHP_EOL;\n\n$xMin = -6.0;\n$xMax = 10.0;\n$steps = 60;\n$maxBarWidth = 50;\n\n// Compute densities across the range\n$densities = [];\n$maxDensity = 0.0;\nfor ($i = 0; $i <= $steps; $i++) {\n    $x = $xMin + ($xMax - $xMin) * $i / $steps;\n    $d = $f($x);\n    $densities[] = [$x, $d];\n    if ($d > $maxDensity) {\n        $maxDensity = $d;\n    }\n}\n\nforeach ($densities as [$x, $d]) {\n    $barLen = (int) round($d / $maxDensity * $maxBarWidth);\n    echo str_pad(number_format($x, 1), 7)\n        . \" |\"\n        . str_repeat(\"*\", $barLen)\n        . PHP_EOL;\n}\n\n// ---------------------------------------------------------------\n// 3.  Comparing kernels\n// ---------------------------------------------------------------\necho PHP_EOL . \"=== 3. Comparing kernels ===\" . PHP_EOL . PHP_EOL;\n\n$data = [1.0, 2.0, 3.0, 4.0, 5.0];\n$evalAt = 3.0;\n\n$kernelsToCompare = [\n    KdeKernel::Normal,\n    KdeKernel::Triangular,\n    KdeKernel::Rectangular,\n    KdeKernel::Parabolic,\n    KdeKernel::Cosine,\n];\n\necho \"Data: \" . implode(\", \", $data) . PHP_EOL;\necho \"Evaluating density at x = $evalAt  (h = 1.0)\" . PHP_EOL . PHP_EOL;\n\necho str_pad(\"Kernel\", 16) . \"f($evalAt)\" . PHP_EOL;\necho str_repeat(\"-\", 30) . PHP_EOL;\nforeach ($kernelsToCompare as $kernel) {\n    $fk = Stat::kde($data, 1.0, $kernel);\n    echo str_pad($kernel->value, 16)\n        . number_format($fk($evalAt), 6)\n        . PHP_EOL;\n}\n\n// ---------------------------------------------------------------\n// 4.  Cumulative Distribution Function (CDF)\n// ---------------------------------------------------------------\necho PHP_EOL . \"=== 4. Cumulative Distribution Function ===\" . PHP_EOL . PHP_EOL;\n\n$F = Stat::kde($sample, h: $h, cumulative: true);\n\necho \"Sample : \" . implode(\", \", $sample) . PHP_EOL;\necho \"Bandwidth h = $h\" . PHP_EOL . PHP_EOL;\n\necho str_pad(\"x\", 8) . \"F(x)\" . PHP_EOL;\necho str_repeat(\"-\", 24) . 
PHP_EOL;\nforeach ([-6.0, -4.0, -2.0, 0.0, 2.0, 4.0, 6.0, 8.0, 10.0] as $x) {\n    echo str_pad(number_format($x, 1), 8)\n        . number_format($F($x), 6)\n        . PHP_EOL;\n}\n\n// P(X <= 2.5)\n$p = $F(2.5);\necho PHP_EOL . \"P(X <= 2.5) = \" . round($p * 100, 1) . \"%\" . PHP_EOL;\n\n// ---------------------------------------------------------------\n// 5.  Alias equivalence\n// ---------------------------------------------------------------\necho PHP_EOL . \"=== 5. Alias equivalence ===\" . PHP_EOL . PHP_EOL;\n\n$aliasPairs = [\n    [KdeKernel::Gauss, KdeKernel::Normal],\n    [KdeKernel::Uniform, KdeKernel::Rectangular],\n    [KdeKernel::Epanechnikov, KdeKernel::Parabolic],\n    [KdeKernel::Biweight, KdeKernel::Quartic],\n];\n\necho \"Aliases resolve to their canonical kernel:\" . PHP_EOL;\nforeach ($aliasPairs as [$alias, $canonical]) {\n    $f1 = Stat::kde($data, 1.0, $alias);\n    $f2 = Stat::kde($data, 1.0, $canonical);\n    $match = abs($f1(3.0) - $f2(3.0)) < 1e-15 ? \"OK\" : \"MISMATCH\";\n    echo \"  \" . str_pad($alias->value, 14) . \" => \"\n        . str_pad($canonical->value, 14)\n        . $match . PHP_EOL;\n}\n\n// ---------------------------------------------------------------\n// 6.  Random sampling with kdeRandom()\n// ---------------------------------------------------------------\necho PHP_EOL . \"=== 6. Random sampling with kdeRandom() ===\" . PHP_EOL . PHP_EOL;\n\n$rand = Stat::kdeRandom($sample, h: $h, seed: 8675309);\n\n$nSamples = 10;\n$samples = [];\nfor ($i = 0; $i < $nSamples; $i++) {\n    $samples[] = round($rand(), 1);\n}\necho \"Original data : \" . implode(\", \", $sample) . PHP_EOL;\necho \"10 KDE samples: \" . implode(\", \", $samples) . PHP_EOL;\n\n// ---------------------------------------------------------------\n// 7.  Verifying statistical properties of random samples\n// ---------------------------------------------------------------\necho PHP_EOL . \"=== 7. Statistical properties of KDE samples ===\" . PHP_EOL . 
PHP_EOL;\n\n$dataMean = Stat::mean($sample);\n$n = 50000;\n$sampler = Stat::kdeRandom($sample, h: $h, seed: 42);\n\n$sum = 0.0;\nfor ($i = 0; $i < $n; $i++) {\n    $sum += $sampler();\n}\n$sampleMean = $sum / $n;\n\necho \"Original data mean : \" . round($dataMean, 4) . PHP_EOL;\necho \"KDE sample mean (n=$n): \" . round($sampleMean, 4) . PHP_EOL;\necho \"Difference           : \" . round(abs($dataMean - $sampleMean), 4) . PHP_EOL;\n\n// ---------------------------------------------------------------\n// 8.  Sampling with different kernels\n// ---------------------------------------------------------------\necho PHP_EOL . \"=== 8. Sampling with different kernels ===\" . PHP_EOL . PHP_EOL;\n\necho \"5 random draws per kernel (seed=42):\" . PHP_EOL . PHP_EOL;\nforeach ($kernelsToCompare as $kernel) {\n    $sampler = Stat::kdeRandom($sample, h: $h, kernel: $kernel, seed: 42);\n    $draws = [];\n    for ($i = 0; $i < 5; $i++) {\n        $draws[] = round($sampler(), 2);\n    }\n    echo str_pad($kernel->value, 16) . implode(\", \", $draws) . PHP_EOL;\n}\n"
  },
  {
    "path": "examples/kde_downhill.php",
    "content": "<?php\n\nrequire __DIR__ . \"/../vendor/autoload.php\";\n\nuse HiFolks\\Statistics\\Enums\\KdeKernel;\nuse HiFolks\\Statistics\\Stat;\n\n/**\n * Kernel Density Estimation applied to real sports data.\n *\n * Dataset: Men's Downhill results — Winter Olympic Games 2026.\n *\n * KDE lets us move beyond simple averages and histograms to answer\n * richer questions: Where do finishing times cluster?  What is the\n * probability of finishing under a given threshold?  How would\n * simulated future races look?\n */\n$results = [\n    [\"name\" => \"Franjo von ALLMEN\", \"time\" => 111.61],\n    [\"name\" => \"Giovanni FRANZONI\", \"time\" => 111.81],\n    [\"name\" => \"Dominik PARIS\", \"time\" => 112.11],\n    [\"name\" => \"Marco ODERMATT\", \"time\" => 112.31],\n    [\"name\" => \"Alexis MONNEY\", \"time\" => 112.36],\n    [\"name\" => \"Vincent KRIECHMAYR\", \"time\" => 112.38],\n    [\"name\" => \"Daniel HEMETSBERGER\", \"time\" => 112.58],\n    [\"name\" => \"Nils ALLEGRE\", \"time\" => 112.8],\n    [\"name\" => \"James CRAWFORD\", \"time\" => 113.0],\n    [\"name\" => \"Kyle NEGOMIR\", \"time\" => 113.2],\n    [\"name\" => \"Mattia CASSE\", \"time\" => 113.28],\n    [\"name\" => \"Miha HROBAT\", \"time\" => 113.3],\n    [\"name\" => \"Bryce BENNETT\", \"time\" => 113.45],\n    [\"name\" => \"Cameron ALEXANDER\", \"time\" => 113.49],\n    [\"name\" => \"Raphael HAASER\", \"time\" => 113.5],\n    [\"name\" => \"Martin CATER\", \"time\" => 113.51],\n    [\"name\" => \"Florian SCHIEDER\", \"time\" => 113.57],\n    [\"name\" => \"Ryan COCHRAN-SIEGLE\", \"time\" => 113.63],\n    [\"name\" => \"Sam MORSE\", \"time\" => 113.68],\n    [\"name\" => \"Elian LEHTO\", \"time\" => 113.83],\n    [\"name\" => \"Simon JOCHER\", \"time\" => 114.01],\n    [\"name\" => \"Nils ALPHAND\", \"time\" => 114.06],\n    [\"name\" => \"Stefan ROGENTIN\", \"time\" => 114.18],\n    [\"name\" => \"Jan ZABYSTRAN\", \"time\" => 114.39],\n    [\"name\" => \"Jeffrey READ\", \"time\" 
=> 114.56],\n    [\"name\" => \"Stefan BABINSKY\", \"time\" => 114.73],\n    [\"name\" => \"Alban ELEZI CANNAFERINA\", \"time\" => 114.9],\n    [\"name\" => \"Brodie SEGER\", \"time\" => 114.96],\n    [\"name\" => \"Marco PFIFFNER\", \"time\" => 115.66],\n    [\"name\" => \"Barnabas SZOLLOS\", \"time\" => 117.03],\n    [\"name\" => \"Arnaud ALESSANDRIA\", \"time\" => 117.15],\n    [\"name\" => \"Elvis OPMANIS\", \"time\" => 119.24],\n    [\"name\" => \"Dmytro SHEPIUK\", \"time\" => 120.11],\n    [\"name\" => \"Cormac COMERFORD\", \"time\" => 124.4],\n];\n\n$times = array_column($results, \"time\");\n\necho \"=== Men's Downhill — Olympic Winter Games 2026 ===\" . PHP_EOL;\necho \"Athletes: \" . count($times) . PHP_EOL;\necho \"Winner : \" . $results[0][\"name\"] . \" (\" . $results[0][\"time\"] . \"s)\" . PHP_EOL;\necho \"Mean   : \" . round(Stat::mean($times), 2) . \"s\" . PHP_EOL;\necho \"Median : \" . round(Stat::median($times), 2) . \"s\" . PHP_EOL;\n\n// ---------------------------------------------------------------\n// 1.  Density profile — where do finishing times cluster?\n// ---------------------------------------------------------------\necho PHP_EOL . \"=== 1. Density profile ===\" . PHP_EOL . PHP_EOL;\n\n// A bandwidth of 0.8s is a good fit: wide enough to smooth out\n// individual gaps, narrow enough to reveal the shape of the\n// distribution.  With 34 athletes, a smaller h would produce\n// spiky noise; a larger h would wash out the interesting\n// right-skewed tail.\n$h = 0.8;\n$f = Stat::kde($times, h: $h, kernel: KdeKernel::Normal);\n\necho \"Bandwidth h = {$h}s\" . PHP_EOL . PHP_EOL;\n\n// Scan for the peak (mode of the continuous distribution)\n$peakX = 0.0;\n$peakD = 0.0;\n$maxBarWidth = 50;\n$densities = [];\n\nfor ($x = 110.0; $x <= 126.0; $x += 0.2) {\n    $d = $f($x);\n    $densities[] = [$x, $d];\n    if ($d > $peakD) {\n        $peakD = $d;\n        $peakX = $x;\n    }\n}\n\necho \"Density plot (each * ~ \"\n    . 
round($peakD / $maxBarWidth, 5)\n    . \" density units):\" . PHP_EOL . PHP_EOL;\n\nforeach ($densities as [$x, $d]) {\n    // Only print every 0.6s to keep it readable\n    if (round(($x - 110.0) * 10) % 6 !== 0) {\n        continue;\n    }\n    $barLen = (int) round($d / $peakD * $maxBarWidth);\n    echo str_pad(number_format($x, 1) . \"s\", 8)\n        . \"|\"\n        . str_repeat(\"*\", $barLen)\n        . PHP_EOL;\n}\n\necho PHP_EOL;\necho \"Peak density at \"\n    . number_format($peakX, 1) . \"s\"\n    . \" — this is the KDE mode, the most likely finishing time.\"\n    . PHP_EOL;\necho \"Compare with the arithmetic mean (\"\n    . round(Stat::mean($times), 2)\n    . \"s): the mean is pulled right\" . PHP_EOL;\necho \"by slow outliers, but KDE reveals the true concentration point.\"\n    . PHP_EOL;\n\n// ---------------------------------------------------------------\n// 2.  Probability thresholds via CDF\n// ---------------------------------------------------------------\necho PHP_EOL . \"=== 2. Probability thresholds (CDF) ===\" . PHP_EOL . PHP_EOL;\n\n$F = Stat::kde($times, h: $h, cumulative: true);\n\n$thresholds = [\n    [112.0, \"podium contender\"],\n    [113.0, \"top-10 territory\"],\n    [113.5, \"solid mid-pack\"],\n    [114.0, \"~top 20\"],\n    [115.0, \"lower pack\"],\n    [117.0, \"off the pace\"],\n    [120.0, \"struggling finisher\"],\n];\n\necho str_pad(\"Threshold\", 12)\n    . str_pad(\"P(time <= t)\", 15)\n    . \"Interpretation\" . PHP_EOL;\necho str_repeat(\"-\", 65) . PHP_EOL;\n\nforeach ($thresholds as [$t, $label]) {\n    $prob = $F($t);\n    echo str_pad(number_format($t, 1) . \"s\", 12)\n        . str_pad(round($prob * 100, 1) . \"%\", 15)\n        . $label\n        . PHP_EOL;\n}\n\n// ---------------------------------------------------------------\n// 3.  Classifying each athlete by density region\n// ---------------------------------------------------------------\necho PHP_EOL . \"=== 3. Athlete classification by density ===\" . 
PHP_EOL . PHP_EOL;\n\n// Use the CDF to assign a percentile to each athlete.\n// KDE percentiles reflect the actual shape of the distribution,\n// unlike assuming a normal distribution.\necho str_pad(\"Rank\", 5)\n    . str_pad(\"Athlete\", 30)\n    . str_pad(\"Time\", 9)\n    . str_pad(\"Pctile\", 9)\n    . \"Tier\" . PHP_EOL;\necho str_repeat(\"-\", 65) . PHP_EOL;\n\nforeach ($results as $rank => $r) {\n    $pctile = $F($r[\"time\"]) * 100;\n\n    if ($pctile <= 15) {\n        $tier = \"Elite\";\n    } elseif ($pctile <= 40) {\n        $tier = \"Strong\";\n    } elseif ($pctile <= 70) {\n        $tier = \"Mid-pack\";\n    } elseif ($pctile <= 90) {\n        $tier = \"Back\";\n    } else {\n        $tier = \"Outlier\";\n    }\n\n    echo str_pad((string) ($rank + 1), 5)\n        . str_pad($r[\"name\"], 30)\n        . str_pad(number_format($r[\"time\"], 2) . \"s\", 9)\n        . str_pad(round($pctile, 1) . \"%\", 9)\n        . $tier\n        . PHP_EOL;\n}\n\n// ---------------------------------------------------------------\n// 4.  Comparing kernels — does the choice matter here?\n// ---------------------------------------------------------------\necho PHP_EOL . \"=== 4. Kernel comparison ===\" . PHP_EOL . PHP_EOL;\n\n$kernels = [\n    KdeKernel::Normal,\n    KdeKernel::Triangular,\n    KdeKernel::Parabolic,\n    KdeKernel::Cosine,\n];\n\n$evalPoints = [112.0, 113.5, 115.0, 120.0];\n\necho str_pad(\"Kernel\", 14);\nforeach ($evalPoints as $ep) {\n    echo str_pad(number_format($ep, 1) . \"s\", 10);\n}\necho PHP_EOL . str_repeat(\"-\", 54) . PHP_EOL;\n\nforeach ($kernels as $kernel) {\n    $fk = Stat::kde($times, $h, $kernel);\n    echo str_pad($kernel->value, 14);\n    foreach ($evalPoints as $ep) {\n        echo str_pad(number_format($fk($ep), 5), 10);\n    }\n    echo PHP_EOL;\n}\n\necho PHP_EOL\n    . \"With enough data (34 athletes) the kernel choice has minimal\"\n    . PHP_EOL\n    . \"impact — the bandwidth h matters far more.\"\n    . 
PHP_EOL;\n\n// ---------------------------------------------------------------\n// 5.  Simulating future races with kdeRandom()\n// ---------------------------------------------------------------\necho PHP_EOL . \"=== 5. Simulating future races with kdeRandom() ===\" . PHP_EOL . PHP_EOL;\n\n// kdeRandom() draws random values from the estimated density.\n// This is useful for \"what-if\" analysis: if the same field raced\n// again under similar conditions, what might the results look like?\n\n$nRaces = 10000;\n$raceSize = count($times);\n$rand = Stat::kdeRandom($times, h: $h, seed: 2026);\n\necho \"Simulating $nRaces races of $raceSize athletes...\" . PHP_EOL . PHP_EOL;\n\n$winningTimes = [];\n$podiumCuts = [];\nfor ($race = 0; $race < $nRaces; $race++) {\n    $simTimes = [];\n    for ($a = 0; $a < $raceSize; $a++) {\n        $simTimes[] = $rand();\n    }\n    sort($simTimes);\n    $winningTimes[] = $simTimes[0];\n    $podiumCuts[] = $simTimes[2]; // 3rd place\n}\n\nsort($winningTimes);\nsort($podiumCuts);\n\necho \"Winning time distribution (from $nRaces simulations):\" . PHP_EOL;\necho \"  Fastest simulated winner : \" . round(min($winningTimes), 2) . \"s\" . PHP_EOL;\necho \"  Median winning time      : \" . round(Stat::median($winningTimes), 2) . \"s\" . PHP_EOL;\necho \"  Slowest simulated winner : \" . round(max($winningTimes), 2) . \"s\" . PHP_EOL;\necho \"  Actual winner            : \" . $results[0][\"time\"] . \"s (\"\n    . $results[0][\"name\"] . \")\" . PHP_EOL;\n\necho PHP_EOL . \"Podium threshold (3rd-place time):\" . PHP_EOL;\necho \"  Median podium cut-off    : \" . round(Stat::median($podiumCuts), 2) . \"s\" . PHP_EOL;\necho \"  Actual 3rd place         : \" . $results[2][\"time\"] . \"s (\"\n    . $results[2][\"name\"] . \")\" . PHP_EOL;\n\n// ---------------------------------------------------------------\n// 6.  Podium probability per athlete\n// ---------------------------------------------------------------\necho PHP_EOL . \"=== 6. 
Podium probability per athlete ===\" . PHP_EOL . PHP_EOL;\n\n// For each athlete, we simulate many individual runs drawn from\n// a personal KDE centered on their actual time.  We then count\n// how often each athlete's simulated time would beat the simulated\n// podium cut-off.\n//\n// This captures two sources of uncertainty:\n// - race-to-race variation across the whole field (podium cut-off)\n// - each athlete's own run-to-run variation\n\n$nSim = 50000;\n$personalH = 0.5; // personal run-to-run variation (narrower than field)\n\necho \"Estimating podium probability ($nSim simulations per athlete)...\" . PHP_EOL;\necho \"Personal bandwidth h = {$personalH}s\" . PHP_EOL . PHP_EOL;\n\n// Podium cuts were already sorted in section 5; sorting again is harmless\nsort($podiumCuts);\n$nPodium = count($podiumCuts);\n\necho str_pad(\"Athlete\", 30)\n    . str_pad(\"Actual\", 9)\n    . \"P(podium)\" . PHP_EOL;\necho str_repeat(\"-\", 52) . PHP_EOL;\n\n// Show top-15 athletes (the realistic podium contenders)\nfor ($idx = 0; $idx < min(15, count($results)); $idx++) {\n    $r = $results[$idx];\n    $athleteSampler = Stat::kdeRandom([$r[\"time\"]], h: $personalH, seed: $idx);\n    $podiumCount = 0;\n    for ($s = 0; $s < $nSim; $s++) {\n        $simTime = $athleteSampler();\n        // Compare against a podium cut-off from our race simulations,\n        // cycling through the simulated cut-offs in turn\n        $cutIdx = $s % $nPodium;\n        if ($simTime <= $podiumCuts[$cutIdx]) {\n            $podiumCount++;\n        }\n    }\n    $prob = $podiumCount / $nSim * 100;\n    echo str_pad($r[\"name\"], 30)\n        . str_pad(number_format($r[\"time\"], 2) . \"s\", 9)\n        . round($prob, 1) . \"%\"\n        . PHP_EOL;\n}\n\necho PHP_EOL\n    . \"These probabilities reflect both the athlete's expected pace\" . PHP_EOL\n    . \"and the random variation inherent in downhill racing.\" . PHP_EOL;\n"
  },
  {
    "path": "examples/norm_dist.php",
    "content": "<?php\n\nrequire __DIR__ . \"/../vendor/autoload.php\";\n\nuse HiFolks\\Statistics\\Freq;\nuse HiFolks\\Statistics\\NormalDist;\nuse HiFolks\\Statistics\\Stat;\n\n/**\n * This is the result of the Downhill race at Olympic Games 2026.\n * The results are stored in an array with name and the time in\n * seconds.\n */\n$results = [\n    [\"name\" => \"Franjo von ALLMEN\", \"time\" => 111.61],\n    [\"name\" => \"Giovanni FRANZONI\", \"time\" => 111.81],\n    [\"name\" => \"Dominik PARIS\", \"time\" => 112.11],\n    [\"name\" => \"Marco ODERMATT\", \"time\" => 112.31],\n    [\"name\" => \"Alexis MONNEY\", \"time\" => 112.36],\n    [\"name\" => \"Vincent KRIECHMAYR\", \"time\" => 112.38],\n    [\"name\" => \"Daniel HEMETSBERGER\", \"time\" => 112.58],\n    [\"name\" => \"Nils ALLEGRE\", \"time\" => 112.8],\n    [\"name\" => \"James CRAWFORD\", \"time\" => 113.0],\n    [\"name\" => \"Kyle NEGOMIR\", \"time\" => 113.2],\n    [\"name\" => \"Mattia CASSE\", \"time\" => 113.28],\n    [\"name\" => \"Miha HROBAT\", \"time\" => 113.3],\n    [\"name\" => \"Bryce BENNETT\", \"time\" => 113.45],\n    [\"name\" => \"Cameron ALEXANDER\", \"time\" => 113.49],\n    [\"name\" => \"Raphael HAASER\", \"time\" => 113.5],\n    [\"name\" => \"Martin CATER\", \"time\" => 113.51],\n    [\"name\" => \"Florian SCHIEDER\", \"time\" => 113.57],\n    [\"name\" => \"Ryan COCHRAN-SIEGLE\", \"time\" => 113.63],\n    [\"name\" => \"Sam MORSE\", \"time\" => 113.68],\n    [\"name\" => \"Elian LEHTO\", \"time\" => 113.83],\n    [\"name\" => \"Simon JOCHER\", \"time\" => 114.01],\n    [\"name\" => \"Nils ALPHAND\", \"time\" => 114.06],\n    [\"name\" => \"Stefan ROGENTIN\", \"time\" => 114.18],\n    [\"name\" => \"Jan ZABYSTRAN\", \"time\" => 114.39],\n    [\"name\" => \"Jeffrey READ\", \"time\" => 114.56],\n    [\"name\" => \"Stefan BABINSKY\", \"time\" => 114.73],\n    [\"name\" => \"Alban ELEZI CANNAFERINA\", \"time\" => 114.9],\n    [\"name\" => \"Brodie SEGER\", \"time\" => 114.96],\n 
   [\"name\" => \"Marco PFIFFNER\", \"time\" => 115.66],\n    [\"name\" => \"Barnabas SZOLLOS\", \"time\" => 117.03],\n    [\"name\" => \"Arnaud ALESSANDRIA\", \"time\" => 117.15],\n    [\"name\" => \"Elvis OPMANIS\", \"time\" => 119.24],\n    [\"name\" => \"Dmytro SHEPIUK\", \"time\" => 120.11],\n    [\"name\" => \"Cormac COMERFORD\", \"time\" => 124.4],\n];\n\n$times = array_column($results, \"time\");\n\n// --- Descriptive Statistics ---\necho \"=== Downhill Race Analysis - Olympic Games 2026 ===\" . PHP_EOL . PHP_EOL;\n\n$mean = Stat::mean($times);\n$median = Stat::median($times);\n$std = Stat::stdev($times);\n$min = min($times);\n$max = max($times);\n$range = $max - $min;\n$quartiles = Stat::quantiles($times);\n\necho \"Sample size: \" . count($times) . PHP_EOL;\necho \"Mean time: \" . round($mean, 2) . \" seconds\" . PHP_EOL;\necho \"Median time: \" . round($median, 2) . \" seconds\" . PHP_EOL;\necho \"Standard deviation: \" . round($std, 2) . \" seconds\" . PHP_EOL;\necho \"Min: \"\n    . $min\n    . \"s | Max: \"\n    . $max\n    . \"s | Range: \"\n    . round($range, 2)\n    . \"s\"\n    . PHP_EOL;\necho \"Quartiles (Q1, Q2, Q3): \"\n    . round($quartiles[0], 2)\n    . \"s, \"\n    . round($quartiles[1], 2)\n    . \"s, \"\n    . round($quartiles[2], 2)\n    . \"s\"\n    . PHP_EOL;\necho \"Skewness: \" . Stat::skewness($times, 4)\n    . \" (positive = right-skewed, a few slow finishers pull the tail right)\"\n    . PHP_EOL;\necho \"Kurtosis: \" . Stat::kurtosis($times, 4)\n    . \" (positive = leptokurtic, heavy tails with outliers)\"\n    . PHP_EOL;\n\n// --- Normal Distribution Model ---\necho PHP_EOL . \"=== Normal Distribution Model ===\" . PHP_EOL . PHP_EOL;\n\n$normal = NormalDist::fromSamples($times);\necho \"Estimated mu (mean): \"\n    . $normal->getMeanRounded(2)\n    . \" seconds\"\n    . PHP_EOL;\necho \"Estimated sigma (std dev): \"\n    . $normal->getSigmaRounded(2)\n    . \" seconds\"\n    . 
PHP_EOL;\n\n// Compare model median vs actual median\n// For a normal distribution, median = mean, so getMedian() returns mu directly.\necho \"Model median: \" . $normal->getMedianRounded(2) . \" seconds\" . PHP_EOL;\necho \"Actual median: \" . round($median, 2) . \" seconds\" . PHP_EOL;\n\n// Note: the model median equals the mean (as expected for a normal\n// distribution), but it differs from the actual median by\n// ~0.78 seconds. This gap tells us the data is right-skewed:\n// a few very slow finishers (119s, 120s, 124s) pull the mean up.\n// A normal distribution assumes symmetry, so it is not a perfect\n// fit for this dataset.\n\n// --- Thresholds from the model ---\necho PHP_EOL . \"=== Performance Thresholds ===\" . PHP_EOL . PHP_EOL;\n\n$eliteThreshold = $normal->invCdfRounded(0.2, 2);\n$slowThreshold = $normal->invCdfRounded(0.8, 2);\necho \"Top 20% fastest (below): \" . $eliteThreshold . \" seconds\" . PHP_EOL;\necho \"Slowest 20% (above): \" . $slowThreshold . \" seconds\" . PHP_EOL;\n\n// --- Probability questions ---\necho PHP_EOL . \"=== Probability Questions ===\" . PHP_EOL . PHP_EOL;\n\n$target = 113.0;\n$probUnder = $normal->cdfRounded($target, 4);\n$actualUnder = count(array_filter($times, fn(float $t): bool => $t <= $target));\necho \"Model: P(time <= \"\n    . $target\n    . \"s) = \"\n    . round($probUnder * 100, 1)\n    . \"%\"\n    . PHP_EOL;\necho \"Actual: \"\n    . $actualUnder\n    . \"/\"\n    . count($times)\n    . \" = \"\n    . round(($actualUnder / count($times)) * 100, 1)\n    . \"%\"\n    . PHP_EOL;\necho \"(The gap shows the effect of skewness on the normal model)\" . PHP_EOL;\n\n$pdfAt = $normal->pdfRounded($target, 6);\necho \"PDF at \" . $target . \"s = \" . $pdfAt . PHP_EOL;\n\n// --- Athlete Tier Classification ---\necho PHP_EOL . \"=== Athlete Tier Classification ===\" . PHP_EOL . 
PHP_EOL;\n\n// We use percentile ranks based on the normal model.\n// Lower time = better performance = lower percentile.\n$tierDefinitions = [\n    [\"max\" => 0.2, \"label\" => \"Elite\"],\n    [\"max\" => 0.5, \"label\" => \"Strong\"],\n    [\"max\" => 0.8, \"label\" => \"Average\"],\n    [\"max\" => 1.0, \"label\" => \"Below avg\"],\n];\n\nforeach ($results as $r) {\n    $time = $r[\"time\"];\n    $percentile = $normal->cdf($time);\n\n    $tier = \"Below avg\";\n    foreach ($tierDefinitions as $def) {\n        if ($percentile <= $def[\"max\"]) {\n            $tier = $def[\"label\"];\n            break;\n        }\n    }\n\n    $z = $normal->zscoreRounded($time, 2);\n    $zFormatted = ($z >= 0 ? \"+\" : \"\") . number_format($z, 2);\n\n    echo str_pad($r[\"name\"], 30)\n        . str_pad(number_format($time, 2) . \"s\", 10)\n        . str_pad($tier, 12)\n        . \"z: \"\n        . str_pad($zFormatted, 7)\n        . \"(percentile: \"\n        . min(round($percentile * 100, 1), 99.9)\n        . \"%)\"\n        . PHP_EOL;\n}\n\n// --- Frequency Table ---\n// The heading matches the class size passed to frequencyTableBySize() below.\necho PHP_EOL . \"=== Frequency Table (1-second classes) ===\" . PHP_EOL . PHP_EOL;\n\n$freqTable = Freq::frequencyTableBySize($times, 1);\nforeach ($freqTable as $class => $count) {\n    echo str_pad($class . \"s\", 8)\n        . str_repeat(\"*\", $count)\n        . \" (\"\n        . $count\n        . \")\"\n        . PHP_EOL;\n}\n\n// --- Distribution Shape ---\necho PHP_EOL . \"=== Distribution Shape ===\" . PHP_EOL . PHP_EOL;\n\necho \"Skewness: \" . Stat::skewness($times, 4) . PHP_EOL;\necho \"Kurtosis: \" . Stat::kurtosis($times, 4) . PHP_EOL;\n"
  },
  {
    "path": "examples/recipes_binomial_approximation.php",
    "content": "<?php\n\nrequire __DIR__ . '/../vendor/autoload.php';\n\nuse HiFolks\\Statistics\\NormalDist;\n\n/**\n * Recipe: Approximating Binomial Distributions\n *\n * Adapted from the Python statistics module \"Examples and Recipes\":\n * https://docs.python.org/3/library/statistics.html#examples-and-recipes\n *\n * NormalDist can be used to approximate binomial distributions\n * when the sample size is large (via the Central Limit Theorem).\n *\n * Scenario: a]conference has 750 attendees. 65% prefer Python\n * and 35% prefer Ruby. The \"Python\" room holds 500 people.\n * What is the probability that the room will stay within capacity?\n */\necho \"=== Approximating Binomial Distributions ===\" . PHP_EOL . PHP_EOL;\n\n$n = 750;            // Sample size (attendees)\n$p = 0.65;           // Probability of preferring Python\n$q = 1.0 - $p;       // Probability of preferring Ruby\n$k = 500;            // Room capacity\n\n// For a binomial distribution B(n, p):\n//   mean  = n * p\n//   sigma = sqrt(n * p * q)\n$mu = $n * $p;\n$sigma = sqrt($n * $p * $q);\n\necho \"Binomial parameters:\" . PHP_EOL;\necho \"  n = \" . $n . \" (attendees)\" . PHP_EOL;\necho \"  p = \" . $p . \" (Python preference)\" . PHP_EOL;\necho \"  Expected Python fans: \" . $mu . PHP_EOL;\necho \"  Standard deviation: \" . round($sigma, 2) . PHP_EOL;\necho PHP_EOL;\n\n// Normal approximation with continuity correction\n$normal = new NormalDist($mu, $sigma);\n$probNormal = $normal->cdf($k + 0.5);\necho \"Normal approximation: P(X <= \" . $k . \") = \"\n    . round($probNormal, 4) . PHP_EOL;\n\n// Exact binomial calculation using log-space arithmetic.\n// P(X <= k) = sum from r=0 to k of C(n,r) * p^r * q^(n-r)\n// We use Stirling's log-gamma via log() of factorials to avoid overflow.\n\n// Build log-factorial lookup table\n$logFact = [0.0]; // log(0!) 
= 0\nfor ($i = 1; $i <= $n; $i++) {\n    $logFact[$i] = $logFact[$i - 1] + log($i);\n}\n\n$logTerms = [];\nfor ($r = 0; $r <= $k; $r++) {\n    // log(C(n,r)) = log(n!) - log(r!) - log((n-r)!)\n    $logBinom = $logFact[$n] - $logFact[$r] - $logFact[$n - $r];\n    $logTerms[] = $logBinom + $r * log($p) + ($n - $r) * log($q);\n}\n// Log-sum-exp for numerical stability\n$maxLog = max($logTerms);\n$sum = 0.0;\nforeach ($logTerms as $logTerm) {\n    $sum += exp($logTerm - $maxLog);\n}\n$probExact = exp($maxLog + log($sum));\n\necho \"Exact binomial:        P(X <= \" . $k . \") = \"\n    . round($probExact, 4) . PHP_EOL;\n\n// Monte Carlo simulation approximation\n$seed = 8675309;\nmt_srand($seed);\n$trials = 10_000;\n$successes = 0;\nfor ($i = 0; $i < $trials; $i++) {\n    $count = 0;\n    for ($j = 0; $j < $n; $j++) {\n        if (mt_rand() / mt_getrandmax() < $p) {\n            $count++;\n        }\n    }\n    if ($count <= $k) {\n        $successes++;\n    }\n}\n$probSimulation = $successes / $trials;\necho \"Simulation (\" . $trials . \" trials): P(X <= \" . $k . \") = \"\n    . round($probSimulation, 4) . PHP_EOL;\n\necho PHP_EOL . \"All three methods should give approximately the same result (~0.84).\"\n    . PHP_EOL;\n\n// --- Additional: What capacity is needed for 99% confidence? ---\necho PHP_EOL . \"--- Capacity Planning ---\" . PHP_EOL;\n$needed = $normal->invCdfRounded(0.99, 0);\necho \"For 99% confidence, room capacity should be: \"\n    . $needed . \" seats\" . PHP_EOL;\n"
  },
  {
    "path": "examples/recipes_classic_probability.php",
    "content": "<?php\n\nrequire __DIR__ . '/../vendor/autoload.php';\n\nuse HiFolks\\Statistics\\NormalDist;\n\n/**\n * Recipe: Classic Probability Problems\n *\n * Adapted from the Python statistics module \"Examples and Recipes\":\n * https://docs.python.org/3/library/statistics.html#examples-and-recipes\n *\n * Using NormalDist to solve classic probability problems.\n */\necho \"=== Classic Probability Problems ===\" . PHP_EOL . PHP_EOL;\n\n// --- SAT scores are normally distributed with mean 1060 and std dev 195 ---\n$sat = new NormalDist(1060, 195);\n\n// What percentage of students score between 1100 and 1200?\n// Adding 0.5 applies a continuity correction for discrete scores.\n$fraction = $sat->cdf(1200 + 0.5) - $sat->cdf(1100 - 0.5);\necho \"Percentage of students scoring between 1100 and 1200: \"\n    . round($fraction * 100, 1) . \"%\" . PHP_EOL;\n\n// Quartiles: divide SAT scores into 4 equal-probability groups\necho PHP_EOL . \"--- SAT Score Quartiles ---\" . PHP_EOL;\n$quartiles = $sat->quantiles(4);\necho \"Quartiles (Q1, Q2, Q3): \"\n    . implode(', ', array_map(round(...), $quartiles))\n    . PHP_EOL;\n\n// Deciles: divide SAT scores into 10 equal-probability groups\necho PHP_EOL . \"--- SAT Score Deciles ---\" . PHP_EOL;\n$deciles = $sat->quantiles(10);\necho \"Deciles: \"\n    . implode(', ', array_map(round(...), $deciles))\n    . PHP_EOL;\n\n// --- What SAT score is needed to be in the top 10%? ---\necho PHP_EOL . \"--- SAT Score Thresholds ---\" . PHP_EOL;\n$top10 = $sat->invCdfRounded(0.90, 0);\necho \"SAT score needed for top 10%: \" . $top10 . PHP_EOL;\n\n$top1 = $sat->invCdfRounded(0.99, 0);\necho \"SAT score needed for top 1%: \" . $top1 . PHP_EOL;\n\n// --- Probability of scoring above a threshold ---\n$threshold = 1300;\n$probAbove = 1 - $sat->cdf($threshold);\necho PHP_EOL . \"Probability of scoring above \" . $threshold . \": \"\n    . round($probAbove * 100, 1) . \"%\" . 
PHP_EOL;\n\n// --- Z-score for a specific SAT score ---\n$score = 1250;\n$z = $sat->zscoreRounded($score, 2);\necho \"Z-score for SAT score of \" . $score . \": \" . $z . PHP_EOL;\n"
  },
  {
    "path": "examples/recipes_monte_carlo.php",
    "content": "<?php\n\nrequire __DIR__ . '/../vendor/autoload.php';\n\nuse HiFolks\\Statistics\\NormalDist;\nuse HiFolks\\Statistics\\Stat;\n\n/**\n * Recipe: Monte Carlo Inputs for Simulations\n *\n * Adapted from the Python statistics module \"Examples and Recipes\":\n * https://docs.python.org/3/library/statistics.html#examples-and-recipes\n *\n * NormalDist can generate random samples to use as inputs for\n * Monte Carlo simulations.\n */\necho \"=== Monte Carlo Simulation ===\" . PHP_EOL . PHP_EOL;\n\n/**\n * A simple model function that combines three uncertain variables.\n */\nfunction model(float $x, float $y, float $z): float\n{\n    return (3 * $x + 7 * $x * $y - 5 * $y) / (11 * $z);\n}\n\n$n = 100_000;\n\n// Generate random samples from three independent normal distributions\n$X = (new NormalDist(10, 2.5))->samples($n, seed: 3652260728);\n$Y = (new NormalDist(15, 1.75))->samples($n, seed: 4582495471);\n$Z = (new NormalDist(50, 1.25))->samples($n, seed: 6582483453);\n\n// Compute the model output for each set of inputs\n$results = [];\nfor ($i = 0; $i < $n; $i++) {\n    $results[] = model($X[$i], $Y[$i], $Z[$i]);\n}\n\n// Find the quartiles of the model output distribution\n$quantiles = Stat::quantiles($results);\necho \"Model output quartiles (Q1, Q2, Q3):\" . PHP_EOL;\necho \"  Q1: \" . round($quantiles[0], 4) . PHP_EOL;\necho \"  Q2: \" . round($quantiles[1], 4) . PHP_EOL;\necho \"  Q3: \" . round($quantiles[2], 4) . PHP_EOL;\n\n// Basic descriptive statistics of the simulation\necho PHP_EOL . \"--- Simulation Summary ---\" . PHP_EOL;\necho \"Mean:   \" . round(Stat::mean($results), 4) . PHP_EOL;\necho \"Stdev:  \" . round(Stat::stdev($results), 4) . PHP_EOL;\necho \"Min:    \" . round(min($results), 4) . PHP_EOL;\necho \"Max:    \" . round(max($results), 4) . PHP_EOL;\n\n// Fit a normal distribution to the simulation results\n$fitted = NormalDist::fromSamples($results);\necho PHP_EOL . \"--- Fitted Normal Distribution ---\" . 
PHP_EOL;\necho \"Estimated mu:    \" . $fitted->getMeanRounded(4) . PHP_EOL;\necho \"Estimated sigma: \" . $fitted->getSigmaRounded(4) . PHP_EOL;\n\n// Use the fitted distribution to answer probability questions\n$threshold = 2.0;\n$probAbove = 1 - $fitted->cdf($threshold);\necho PHP_EOL . \"P(result > \" . $threshold . \"): \"\n    . round($probAbove * 100, 1) . \"%\" . PHP_EOL;\n"
  },
  {
    "path": "examples/recipes_naive_bayes.php",
    "content": "<?php\n\nrequire __DIR__ . '/../vendor/autoload.php';\n\nuse HiFolks\\Statistics\\NormalDist;\n\n/**\n * Recipe: Naive Bayesian Classifier\n *\n * Adapted from the Python statistics module \"Examples and Recipes\":\n * https://docs.python.org/3/library/statistics.html#examples-and-recipes\n *\n * A simple Naive Bayes classifier using NormalDist.\n * Given training data for height, weight, and foot size of males\n * and females, classify a new person based on their measurements.\n */\necho \"=== Naive Bayesian Classifier ===\" . PHP_EOL . PHP_EOL;\n\n// --- Training data ---\n// Fit normal distributions to each feature for each class\n\necho \"--- Training Phase ---\" . PHP_EOL;\n\n$heightMale = NormalDist::fromSamples([6, 5.92, 5.58, 5.92]);\n$heightFemale = NormalDist::fromSamples([5, 5.5, 5.42, 5.75]);\n\n$weightMale = NormalDist::fromSamples([180, 190, 170, 165]);\n$weightFemale = NormalDist::fromSamples([100, 150, 130, 150]);\n\n$footSizeMale = NormalDist::fromSamples([12, 11, 12, 10]);\n$footSizeFemale = NormalDist::fromSamples([6, 8, 7, 9]);\n\necho \"Height (male):    mu=\" . $heightMale->getMeanRounded(2)\n    . \", sigma=\" . $heightMale->getSigmaRounded(2) . PHP_EOL;\necho \"Height (female):  mu=\" . $heightFemale->getMeanRounded(2)\n    . \", sigma=\" . $heightFemale->getSigmaRounded(2) . PHP_EOL;\necho \"Weight (male):    mu=\" . $weightMale->getMeanRounded(2)\n    . \", sigma=\" . $weightMale->getSigmaRounded(2) . PHP_EOL;\necho \"Weight (female):  mu=\" . $weightFemale->getMeanRounded(2)\n    . \", sigma=\" . $weightFemale->getSigmaRounded(2) . PHP_EOL;\necho \"Foot size (male): mu=\" . $footSizeMale->getMeanRounded(2)\n    . \", sigma=\" . $footSizeMale->getSigmaRounded(2) . PHP_EOL;\necho \"Foot size (female): mu=\" . $footSizeFemale->getMeanRounded(2)\n    . \", sigma=\" . $footSizeFemale->getSigmaRounded(2) . PHP_EOL;\n\n// --- Classification ---\necho PHP_EOL . \"--- Classification Phase ---\" . PHP_EOL . 
PHP_EOL;\n\n// Person to classify\n$ht = 6.0;    // height in feet\n$wt = 130;    // weight in pounds\n$fs = 8;      // foot size\n\necho \"New person: height=\" . $ht . \"ft, weight=\" . $wt\n    . \"lbs, foot size=\" . $fs . PHP_EOL . PHP_EOL;\n\n// Equal prior probabilities\n$priorMale = 0.5;\n$priorFemale = 0.5;\n\n// Posterior ∝ prior × P(height|class) × P(weight|class) × P(foot_size|class)\n// Naive Bayes assumes features are conditionally independent.\n$posteriorMale = $priorMale\n    * $heightMale->pdf($ht)\n    * $weightMale->pdf($wt)\n    * $footSizeMale->pdf($fs);\n\n$posteriorFemale = $priorFemale\n    * $heightFemale->pdf($ht)\n    * $weightFemale->pdf($wt)\n    * $footSizeFemale->pdf($fs);\n\necho \"Posterior (male):   \" . sprintf(\"%.4e\", $posteriorMale) . PHP_EOL;\necho \"Posterior (female): \" . sprintf(\"%.4e\", $posteriorFemale) . PHP_EOL;\necho PHP_EOL;\n\n$classification = $posteriorMale > $posteriorFemale ? 'male' : 'female';\necho \"Classification: \" . $classification . PHP_EOL;\n\n// Show confidence as normalized probability\n$total = $posteriorMale + $posteriorFemale;\n$confidenceMale = $posteriorMale / $total;\n$confidenceFemale = $posteriorFemale / $total;\necho \"Confidence (male):   \" . round($confidenceMale * 100, 1) . \"%\" . PHP_EOL;\necho \"Confidence (female): \" . round($confidenceFemale * 100, 1) . \"%\" . PHP_EOL;\n\n// --- Classify a second person ---\necho PHP_EOL . \"--- Classify another person ---\" . PHP_EOL . PHP_EOL;\n\n$ht2 = 5.5;\n$wt2 = 175;\n$fs2 = 11;\n\necho \"New person: height=\" . $ht2 . \"ft, weight=\" . $wt2\n    . \"lbs, foot size=\" . $fs2 . PHP_EOL . PHP_EOL;\n\n$posteriorMale2 = $priorMale\n    * $heightMale->pdf($ht2)\n    * $weightMale->pdf($wt2)\n    * $footSizeMale->pdf($fs2);\n\n$posteriorFemale2 = $priorFemale\n    * $heightFemale->pdf($ht2)\n    * $weightFemale->pdf($wt2)\n    * $footSizeFemale->pdf($fs2);\n\n$classification2 = $posteriorMale2 > $posteriorFemale2 ? 
'male' : 'female';\n$total2 = $posteriorMale2 + $posteriorFemale2;\n\necho \"Posterior (male):   \" . sprintf(\"%.4e\", $posteriorMale2) . PHP_EOL;\necho \"Posterior (female): \" . sprintf(\"%.4e\", $posteriorFemale2) . PHP_EOL;\necho \"Classification: \" . $classification2 . PHP_EOL;\necho \"Confidence: \" . round(max($posteriorMale2, $posteriorFemale2) / $total2 * 100, 1)\n    . \"%\" . PHP_EOL;\n"
  },
  {
    "path": "examples/stat.php",
    "content": "<?php\n\nrequire __DIR__ . '/../vendor/autoload.php';\n\nuse HiFolks\\Statistics\\Freq;\nuse HiFolks\\Statistics\\Stat;\n\n$freq = Freq::frequencies(\n    ['red', 'blue', 'blue', 'red', 'green', 'red', 'red'],\n);\nvar_dump($freq);\n$mode = Stat::mode(\n    ['red', 'blue', 'blue', 'red', 'green', 'red', 'red'],\n);\n\nvar_dump($mode);\n"
  },
  {
    "path": "examples/stat_methods.php",
    "content": "<?php\n\nrequire __DIR__ . '/../vendor/autoload.php';\n\nuse HiFolks\\Statistics\\Stat;\n\n$mean = Stat::mean([1, 2, 3, 4, 4]);\n// 2.8\n$mean = Stat::mean([-1.0, 2.5, 3.25, 5.75]);\n// 2.625\n$mean = Stat::geometricMean([54, 24, 36], 1);\n// 36.0\n$mean = Stat::harmonicMean([40, 60], null, 1);\n// 48.0\n$mean = Stat::harmonicMean([40, 60], [5, 30], 1);\n// 56.0\n$median = Stat::median([1, 3, 5, 7]);\n// 4\n$median = Stat::medianLow([1, 3, 5, 7]);\n// 3\n$median = Stat::medianHigh([1, 3, 5, 7]);\n// 5\n$percentile = Stat::firstQuartile([98, 90, 70, 18, 92, 92, 55, 83, 45, 95, 88]);\n// 55.0\n$percentile = Stat::thirdQuartile([98, 90, 70, 18, 92, 92, 55, 83, 45, 95, 88]);\n// 92.0\n$quantiles = Stat::quantiles([98, 90, 70, 18, 92, 92, 55, 83, 45, 95, 88]);\n// [ 55.0, 88.0, 92.0 ]\n$quantiles = Stat::quantiles([105, 129, 87, 86, 111, 111, 89, 81, 108, 92, 110, 100, 75, 105, 103, 109, 76, 119, 99, 91, 103, 129, 106, 101, 84, 111, 74, 87, 86, 103, 103, 106, 86, 111, 75, 87, 102, 121, 111, 88, 89, 101, 106, 95, 103, 107, 101, 81, 109, 104], 10);\n// [81.0, 86.2, 89.0, 99.4, 102.5, 103.6, 106.0, 109.8, 111.0]\n$stdev = Stat::pstdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75], 4);\n// 0.9869\n$stdev = Stat::stdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75], 4);\n// 1.0811\n$variance = Stat::pvariance([0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25]);\n// 1.25\n$variance = Stat::variance([2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]);\n// 1.3720238095238095\n[$slope, $intercept] = Stat::linearRegression(\n    [1971, 1975, 1979, 1982, 1983],\n    [1, 2, 3, 4, 5],\n);\n// 0.31\n// -610.18\n\ntry {\n    $mean = Stat::mean([]);\n} catch (\\HiFolks\\Statistics\\Exception\\InvalidDataInputException $e) {\n    echo $e->getMessage();\n}\n\n// Exception\n"
  },
  {
    "path": "phpstan.neon",
    "content": "includes:\n    - vendor/phpstan/phpstan-phpunit/extension.neon\n\nparameters:\n\tlevel: 8\n\ttreatPhpDocTypesAsCertain: false\n\tpaths:\n\t\t- src\n\t\t- tests\n"
  },
  {
    "path": "phpunit.xml.dist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<phpunit\n    xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"\n    xsi:noNamespaceSchemaLocation=\"https://schema.phpunit.de/11.0/phpunit.xsd\"\n    bootstrap=\"vendor/autoload.php\"\n    colors=\"true\"\n    executionOrder=\"random\"\n    failOnWarning=\"true\"\n    failOnRisky=\"true\"\n    failOnEmptyTestSuite=\"true\"\n    beStrictAboutOutputDuringTests=\"true\"\n    cacheDirectory=\".phpunit.cache\"\n>\n    <testsuites>\n        <testsuite name=\"HiFolks Test Suite\">\n            <directory suffix=\".php\">tests</directory>\n        </testsuite>\n    </testsuites>\n    <coverage>\n        <report>\n            <html outputDirectory=\"build/coverage\"/>\n            <text outputFile=\"build/coverage.txt\"/>\n            <clover outputFile=\"build/logs/clover.xml\"/>\n        </report>\n    </coverage>\n    <logging>\n        <junit outputFile=\"build/report.junit.xml\"/>\n    </logging>\n    <source>\n        <include>\n            <directory suffix=\".php\">./src</directory>\n        </include>\n    </source>\n</phpunit>\n"
  },
  {
    "path": "rector.php",
    "content": "<?php\n\ndeclare(strict_types=1);\n\nuse Rector\\CodeQuality\\Rector\\Class_\\InlineConstructorDefaultToPropertyRector;\nuse Rector\\Config\\RectorConfig;\nuse Rector\\Set\\ValueObject\\LevelSetList;\nuse Rector\\Set\\ValueObject\\SetList;\n\nreturn static function (RectorConfig $rectorConfig): void {\n    $rectorConfig->paths([\n        __DIR__ . \"/examples\",\n        __DIR__ . \"/src\",\n        __DIR__ . \"/tests\",\n    ]);\n\n    // register a single rule\n    $rectorConfig->rule(InlineConstructorDefaultToPropertyRector::class);\n\n    // define sets of rules\n    $rectorConfig->sets([\n        LevelSetList::UP_TO_PHP_82,\n        SetList::DEAD_CODE,\n        SetList::CODE_QUALITY,\n        SetList::EARLY_RETURN,\n        SetList::TYPE_DECLARATION,\n        // SetList::PRIVATIZATION\n    ]);\n};\n"
  },
  {
    "path": "src/ArrUtil.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics;\n\nuse HiFolks\\Statistics\\Utils\\Arr;\n\n/**\n * @deprecated Use \\HiFolks\\Statistics\\Utils\\Arr instead.\n */\nclass ArrUtil extends Arr {}\n"
  },
  {
    "path": "src/Enums/Alternative.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Enums;\n\nenum Alternative: string\n{\n    case TwoSided = 'two-sided';\n    case Greater = 'greater';\n    case Less = 'less';\n}\n"
  },
  {
    "path": "src/Enums/KdeKernel.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Enums;\n\nenum KdeKernel: string\n{\n    case Normal = 'normal';\n    case Gauss = 'gauss';\n    case Logistic = 'logistic';\n    case Sigmoid = 'sigmoid';\n    case Rectangular = 'rectangular';\n    case Uniform = 'uniform';\n    case Triangular = 'triangular';\n    case Parabolic = 'parabolic';\n    case Epanechnikov = 'epanechnikov';\n    case Quartic = 'quartic';\n    case Biweight = 'biweight';\n    case Triweight = 'triweight';\n    case Cosine = 'cosine';\n\n    public function resolve(): self\n    {\n        return match ($this) {\n            self::Gauss => self::Normal,\n            self::Uniform => self::Rectangular,\n            self::Epanechnikov => self::Parabolic,\n            self::Biweight => self::Quartic,\n            default => $this,\n        };\n    }\n}\n"
  },
  {
    "path": "src/Exception/InvalidDataInputException.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Exception;\n\nuse InvalidArgumentException;\n\nclass InvalidDataInputException extends InvalidArgumentException {}\n"
  },
  {
    "path": "src/Freq.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics;\n\nuse HiFolks\\Statistics\\Utils\\Math;\n\nclass Freq\n{\n    /**\n     * Return true is the type of the variable is integer, boolean or string\n     */\n    private static function isDiscreteType(mixed $value): bool\n    {\n        $type = gettype($value);\n\n        return in_array($type, ['string', 'boolean', 'integer']);\n    }\n\n    /**\n     * Return an array with the number of occurrences of each element.\n     * Useful for the frequencies table.\n     *\n     * @param  mixed[]  $data\n     * @param  bool  $transformToInteger whether data should be transformed to integer\n     * @return array<mixed, int>\n     */\n    public static function frequencies(array $data, bool $transformToInteger = false): array\n    {\n        if (Stat::count($data) === 0) {\n            return [];\n        }\n        if ($transformToInteger || !self::isDiscreteType($data[0])) {\n            foreach ($data as $key => $value) {\n                $data[$key] = (int) $value;\n            }\n        }\n        $frequencies = array_count_values($data);\n        ksort($frequencies);\n\n        return $frequencies;\n    }\n\n    /**\n     * Calculate cumulative (number of occurrences of element + sum of the numbers of occurrences of the elements,\n     * that come before that element)frequency of elements.\n     * For the array like ['A', 'A', 'B', 'C'] it would be ['A' => 2, 'B' => 3, 'C' => 4]\n     *\n     * @param  mixed[]  $data\n     * @return array<mixed, int>\n     */\n    public static function cumulativeFrequencies(array $data): array\n    {\n        $freqCumul = [];\n        $cumul = 0;\n        $freqs = self::frequencies($data);\n        foreach ($freqs as $key => $value) {\n            $cumul += $value;\n            $freqCumul[$key] = $cumul;\n        }\n\n        return $freqCumul;\n    }\n\n    /**\n     * Calculate relative frequencies. 
Basically it is the percentage of occurrences of each element in the array.\n     * For the array like ['A', 'A', 'B', 'C'] it would be ['A' => 50, 'B' => 25, 'C' => 25]\n     *\n     * @param  mixed[]  $data\n     * @param  ?int  $round whether to round values or not\n     * @return array<mixed, float>\n     */\n    public static function relativeFrequencies(array $data, ?int $round = null): array\n    {\n        $returnArray = [];\n        $n = Stat::count($data);\n        $freq = self::frequencies($data);\n        foreach ($freq as $key => $value) {\n            $relValue = $value * 100 / $n;\n            $returnArray[$key] = Math::round($relValue, $round);\n        }\n\n        return $returnArray;\n    }\n\n    /**\n     * Calculate cumulative relative frequencies.\n     * For the array like ['A', 'A', 'B', 'C'] it would be ['A' => 50, 'B' => 75, 'C' => 100]\n     *\n     * @param  mixed[]  $data\n     * @return array<mixed, float>\n     */\n    public static function cumulativeRelativeFrequencies(array $data): array\n    {\n        $freqCumul = [];\n        $cumul = 0;\n        $relFreqs = self::relativeFrequencies($data);\n        foreach ($relFreqs as $key => $value) {\n            $cumul += $value;\n            $freqCumul[$key] = $cumul;\n        }\n\n        return $freqCumul;\n    }\n\n    /**\n     * @param  mixed[]  $data\n     * @return int[]\n     */\n    public static function frequencyTableBySize(array $data, int $chunkSize = 1): array\n    {\n        $result = [];\n        if ($data === []) {\n            return $result;\n        }\n        $min = floor((float) min($data));\n        $max = ceil((float) max($data));\n        //$limit = ceil(($max - $min) / $category);\n\n        sort($data);\n        $rangeLow = $min;\n        $rangeHigh = $rangeLow;\n        while ($rangeHigh < $max) {\n            $count = 0;\n            $rangeHigh = ($rangeLow + $chunkSize);\n            foreach ($data as $number) {\n                if (\n                    
($number >= $rangeLow)\n                    && ($number < $rangeHigh)\n                ) {\n                    $count++;\n                    //unset($data[$key]);\n                }\n            }\n            $result[(string) $rangeLow] = $count;\n            $rangeLow = $rangeHigh;\n        }\n\n        return $result;\n    }\n\n    /**\n     * Returns the frequency table grouped by class.\n     * The parameter $category sets the number of classes.\n     * If $category is null (default value for the optional parameter),\n     * each class is a single value rather than a range.\n     *\n     * @param  mixed[]  $data\n     * @return int[]\n     */\n    public static function frequencyTable(array $data, ?int $category = null): array\n    {\n        $result = [];\n        if ($data === []) {\n            return $result;\n        }\n        $min = floor((float) min($data));\n        $max = ceil((float) max($data));\n        if (is_null($category)) {\n            $category = ($max - $min) + 1;\n        }\n\n        $limit = ceil(($max - $min) / $category);\n        sort($data);\n        $rangeLow = $min;\n        for ($i = 0; $i < $category; $i++) {\n            $count = 0;\n            $rangeHigh = $rangeLow + $limit;\n            foreach ($data as $number) {\n                if (\n                    ($number >= $rangeLow)\n                    && ($number < $rangeHigh)\n                ) {\n                    $count++;\n                    //unset($data[$key]);\n                }\n            }\n            $result[(string) $rangeLow] = $count;\n            $rangeLow = $rangeHigh;\n        }\n\n        // Drop classes whose lower bound is greater than the largest value\n        foreach (array_keys($result) as $key) {\n            if ($key > max($data)) {\n                unset($result[$key]);\n            }\n        }\n\n        return $result;\n    }\n}\n"
  },
  {
    "path": "src/Math.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics;\n\nuse HiFolks\\Statistics\\Utils\\Math as UtilsMath;\n\n/**\n * @deprecated Use \\HiFolks\\Statistics\\Utils\\Math instead.\n */\nclass Math extends UtilsMath {}\n"
  },
  {
    "path": "src/NormalDist.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics;\n\nuse HiFolks\\Statistics\\Exception\\InvalidDataInputException;\n\nclass NormalDist\n{\n    // Mean\n\n    private readonly float $sigma; // Standard deviation\n\n    // Constructor to initialize mu and sigma\n    public function __construct(private readonly float $mu = 0.0, float $sigma = 1.0)\n    {\n        if ($sigma < 0) {\n            throw new InvalidDataInputException('Standard deviation (sigma) cannot be negative.');\n        }\n        $this->sigma = $sigma;\n    }\n\n    // Getter for mean (read-only)\n    public function getMean(): float\n    {\n        return $this->mu;\n    }\n\n    public function getMeanRounded(int $precision = 3): float\n    {\n        return round($this->getMean(), $precision);\n    }\n\n    // Getter for standard deviation (read-only)\n    public function getSigma(): float\n    {\n        return $this->sigma;\n    }\n\n    public function getSigmaRounded(int $precision = 3): float\n    {\n        return round($this->getSigma(), $precision);\n    }\n\n    // Getter for median (equals mu for a normal distribution, read-only)\n    public function getMedian(): float\n    {\n        return $this->mu;\n    }\n\n    public function getMedianRounded(int $precision = 3): float\n    {\n        return round($this->getMedian(), $precision);\n    }\n\n    // Getter for mode (equals mu for a normal distribution, read-only)\n    public function getMode(): float\n    {\n        return $this->mu;\n    }\n\n    public function getModeRounded(int $precision = 3): float\n    {\n        return round($this->getMode(), $precision);\n    }\n\n    // Getter for variance (sigma squared, read-only)\n    public function getVariance(): float\n    {\n        return $this->sigma ** 2;\n    }\n\n    public function getVarianceRounded(int $precision = 3): float\n    {\n        return round($this->getVariance(), $precision);\n    }\n\n    /**\n     * Creates a NormalDist instance from a set of data 
samples.\n     *\n     * This static method calculates the mean (μ) and standard deviation (σ)\n     * from the provided array of numeric samples and initializes a new\n     * NormalDist object with these values.\n     *\n     * @param float[] $samples An array of numeric samples to calculate the distribution.\n     *                         The array must contain at least one element.\n     *\n     * @return NormalDist Returns a new NormalDist object with the calculated mean and standard deviation.\n     *\n     * @throws InvalidDataInputException If the samples array is empty or contains non-numeric values.\n     *\n     */\n    public static function fromSamples(array $samples): self\n    {\n        if ($samples === []) {\n            throw new InvalidDataInputException(\"Samples array must not be empty.\");\n        }\n        $mean = Stat::mean($samples);\n        $std_dev = Stat::stdev($samples);\n        return new self((float) $mean, $std_dev);\n    }\n\n    /**\n     * Computes the standard score (z-score) describing x in terms of\n     * the number of standard deviations above or below the mean.\n     *\n     * @param float $x The value to compute the z-score for.\n     * @return float The z-score: (x - mu) / sigma.\n     * @throws InvalidDataInputException If sigma is zero.\n     */\n    public function zscore(float $x): float\n    {\n        if ($this->sigma === 0.0) {\n            throw new InvalidDataInputException('zscore() not defined when sigma is zero.');\n        }\n\n        return ($x - $this->mu) / $this->sigma;\n    }\n\n    public function zscoreRounded(float $x, int $precision = 3): float\n    {\n        return round($this->zscore($x), $precision);\n    }\n\n    /**\n     * Generates n random samples from the normal distribution.\n     *\n     * Uses the Box-Muller transform to generate normally distributed values.\n     * An optional seed can be provided for reproducible results.\n     *\n     * @param int $n The number of samples to 
generate (must be >= 1).\n     * @param int|null $seed Optional seed for the random number generator.\n     * @return float[] An array of n random samples.\n     * @throws InvalidDataInputException If n is less than 1.\n     */\n    public function samples(int $n, ?int $seed = null): array\n    {\n        if ($n < 1) {\n            throw new InvalidDataInputException('n must be at least 1.');\n        }\n\n        if ($seed !== null) {\n            mt_srand($seed);\n        }\n\n        $result = [];\n        for ($i = 0; $i < $n; $i++) {\n            // Box-Muller transform\n            $u1 = mt_rand(1, mt_getrandmax()) / mt_getrandmax();\n            $u2 = mt_rand(1, mt_getrandmax()) / mt_getrandmax();\n            $z = sqrt(-2.0 * log($u1)) * cos(2.0 * M_PI * $u2);\n            $result[] = $this->mu + $z * $this->sigma;\n        }\n\n        return $result;\n    }\n\n    // A utility function to calculate the probability density function (PDF)\n    public function pdf(float $x): float\n    {\n        $coeff = 1 / (sqrt(2 * M_PI) * $this->sigma);\n        $exponent = -($x - $this->mu) ** 2 / (2 * $this->sigma ** 2);\n\n        return $coeff * exp($exponent);\n    }\n\n    public function pdfRounded(float $x, int $precision = 3): float\n    {\n        return round($this->pdf($x), $precision);\n    }\n\n    // Approximate the complementary error function (erfc)\n    private function erfc(float $z): float\n    {\n        return 1.0 - $this->erf($z);\n    }\n\n    // Approximate the error function (erf)\n    private function erf(float $z): float\n    {\n        $t = 1 / (1 + 0.5 * abs($z));\n        $tau = $t * exp(-$z * $z\n                - 1.26551223\n                + 1.00002368 * $t\n                + 0.37409196 * $t ** 2\n                + 0.09678418 * $t ** 3\n                - 0.18628806 * $t ** 4\n                + 0.27886807 * $t ** 5\n                - 1.13520398 * $t ** 6\n                + 1.48851587 * $t ** 7\n                - 0.82215223 * $t ** 8\n    
            + 0.17087277 * $t ** 9);\n        return $z >= 0 ? 1 - $tau : $tau - 1;\n    }\n\n    // A utility function to calculate the cumulative distribution function (CDF)\n    public function cdf(float $x): float\n    {\n        $z = ($x - $this->mu) / ($this->sigma * sqrt(2));\n\n        return 0.5 * (1 + $this->erf($z));\n    }\n\n    public function cdfRounded(float $x, int $precision = 3): float\n    {\n        return round($this->cdf($x), $precision);\n    }\n\n    /**\n     * Computes the inverse cumulative distribution function (quantile function).\n     *\n     * Given a probability p, finds the value x such that P(X <= x) = p.\n     *\n     * Uses the rational approximation algorithm by Peter Acklam\n     * for the standard normal inverse CDF, then scales to (mu, sigma).\n     *\n     * @param float $p A probability value in the range (0, 1) exclusive.\n     * @return float The value x where cdf(x) = p.\n     * @throws InvalidDataInputException If p is not in (0, 1).\n     */\n    public function invCdf(float $p): float\n    {\n        if ($p <= 0.0 || $p >= 1.0) {\n            throw new InvalidDataInputException('p must be in the range (0, 1) exclusive.');\n        }\n\n        // Rational approximation for the standard normal inverse CDF\n        // Coefficients from Peter Acklam's algorithm\n        $a = [\n            -3.969683028665376e+01,\n            2.209460984245205e+02,\n            -2.759285104469687e+02,\n            1.383577518672690e+02,\n            -3.066479806614716e+01,\n            2.506628277459239e+00,\n        ];\n\n        $b = [\n            -5.447609879822406e+01,\n            1.615858368580409e+02,\n            -1.556989798598866e+02,\n            6.680131188771972e+01,\n            -1.328068155288572e+01,\n        ];\n\n        $c = [\n            -7.784894002430293e-03,\n            -3.223964580411365e-01,\n            -2.400758277161838e+00,\n            -2.549732539343734e+00,\n            4.374664141464968e+00,\n            
2.938163982698783e+00,\n        ];\n\n        $d = [\n            7.784695709041462e-03,\n            3.224671290700398e-01,\n            2.445134137142996e+00,\n            3.754408661907416e+00,\n        ];\n\n        $pLow = 0.02425;\n        $pHigh = 1.0 - $pLow;\n\n        if ($p < $pLow) {\n            // Rational approximation for lower region\n            $q = sqrt(-2.0 * log($p));\n            $x = ((((($c[0] * $q + $c[1]) * $q + $c[2]) * $q + $c[3]) * $q + $c[4]) * $q + $c[5])\n                / (((($d[0] * $q + $d[1]) * $q + $d[2]) * $q + $d[3]) * $q + 1.0);\n        } elseif ($p <= $pHigh) {\n            // Rational approximation for central region\n            $q = $p - 0.5;\n            $r = $q * $q;\n            $x = ((((($a[0] * $r + $a[1]) * $r + $a[2]) * $r + $a[3]) * $r + $a[4]) * $r + $a[5]) * $q\n                / ((((($b[0] * $r + $b[1]) * $r + $b[2]) * $r + $b[3]) * $r + $b[4]) * $r + 1.0);\n        } else {\n            // Rational approximation for upper region\n            $q = sqrt(-2.0 * log(1.0 - $p));\n            $x = -((((($c[0] * $q + $c[1]) * $q + $c[2]) * $q + $c[3]) * $q + $c[4]) * $q + $c[5])\n                / (((($d[0] * $q + $d[1]) * $q + $d[2]) * $q + $d[3]) * $q + 1.0);\n        }\n\n        // Scale from standard normal to (mu, sigma)\n        return $this->mu + $x * $this->sigma;\n    }\n\n    public function invCdfRounded(float $p, int $precision = 3): float\n    {\n        return round($this->invCdf($p), $precision);\n    }\n\n    /**\n     * Divides the normal distribution into n continuous intervals\n     * with equal probability.\n     *\n     * Returns an array of (n - 1) cut points separating the intervals.\n     * Set n to 4 for quartiles (the default), n to 10 for deciles,\n     * or n to 100 for percentiles.\n     *\n     * @param int $n The number of equal-probability intervals (must be >= 1).\n     * @return float[] An array of (n - 1) cut points.\n     * @throws InvalidDataInputException If n is less than 
1.\n     */\n    public function quantiles(int $n = 4): array\n    {\n        if ($n < 1) {\n            throw new InvalidDataInputException('n must be at least 1.');\n        }\n\n        $points = [];\n        for ($i = 1; $i < $n; $i++) {\n            $points[] = $this->invCdf($i / $n);\n        }\n\n        return $points;\n    }\n\n    /**\n     * Computes the overlapping coefficient (OVL) between two normal distributions.\n     *\n     * Measures the agreement between two normal probability distributions.\n     * Returns a value between 0.0 and 1.0 giving the overlapping area in\n     * the two underlying probability density functions.\n     *\n     * @param NormalDist $other The other normal distribution to compare with.\n     * @return float The overlapping coefficient between 0.0 and 1.0.\n     * @throws InvalidDataInputException If either distribution has sigma equal to zero.\n     */\n    public function overlap(NormalDist $other): float\n    {\n        $x = $this;\n        $y = $other;\n\n        // Order so that X has the smaller (sigma, mu).\n        // The outer parentheses matter: `<` binds tighter than `?:`.\n        if ((($y->sigma <=> $x->sigma) ?: ($y->mu <=> $x->mu)) < 0) {\n            [$x, $y] = [$y, $x];\n        }\n\n        $xVar = $x->sigma ** 2;\n        $yVar = $y->sigma ** 2;\n\n        if ($xVar === 0.0 || $yVar === 0.0) {\n            throw new InvalidDataInputException('overlap() not defined when sigma is zero.');\n        }\n\n        $dv = $yVar - $xVar;\n        $dm = abs($y->mu - $x->mu);\n\n        // Equal variances: simplified formula using erfc\n        if ($dv === 0.0) {\n            return $this->erfc($dm / (2.0 * $x->sigma * M_SQRT2));\n        }\n\n        // Unequal variances: find intersection points of the two PDFs\n        $a = $x->mu * $yVar - $y->mu * $xVar;\n        $b = $x->sigma * $y->sigma * sqrt($dm * $dm + $dv * log($yVar / $xVar));\n        $x1 = ($a + $b) / $dv;\n        $x2 = ($a - $b) / $dv;\n\n        return 1.0 - (abs($y->cdf($x1) - $x->cdf($x1)) + abs($y->cdf($x2) - 
$x->cdf($x2)));\n    }\n\n    public function overlapRounded(NormalDist $other, int $precision = 3): float\n    {\n        return round($this->overlap($other), $precision);\n    }\n\n    /**\n     * Adds a constant or another NormalDist instance to this distribution.\n     *\n     * If the argument is:\n     * - A constant (float): Adjusts the mean (mu), leaving sigma unchanged.\n     * - A NormalDist instance: Combines the means and variances.\n     *\n     * @param float|NormalDist $x2 The value or NormalDist to add.\n     * @return NormalDist A new NormalDist instance with the updated parameters.\n     * @throws InvalidDataInputException If the argument is not a float or NormalDist.\n     */\n    public function add(float|NormalDist $x2): NormalDist\n    {\n        if ($x2 instanceof NormalDist) {\n            // Add the means and combine the variances (using the Pythagorean theorem)\n            $newMu = $this->mu + $x2->mu;\n            $newSigma = hypot($this->sigma, $x2->sigma);\n            // sqrt(sigma1^2 + sigma2^2)\n            return new NormalDist($newMu, $newSigma);\n        }\n        // Add a constant to the mean, sigma remains unchanged\n        return new NormalDist($this->mu + $x2, $this->sigma);\n    }\n\n    /**\n     * Subtracts a constant or another NormalDist instance from this distribution.\n     *\n     * If the argument is:\n     * - A constant (float): Shifts the mean (mu) down, leaving sigma unchanged.\n     * - A NormalDist instance: Subtracts the means and combines the variances.\n     *\n     * @param float|NormalDist $x2 The value or NormalDist to subtract.\n     * @return NormalDist A new NormalDist instance with the updated parameters.\n     */\n    public function subtract(float|NormalDist $x2): NormalDist\n    {\n        if ($x2 instanceof NormalDist) {\n            $newMu = $this->mu - $x2->mu;\n            $newSigma = hypot($this->sigma, $x2->sigma);\n\n            return new NormalDist($newMu, $newSigma);\n        }\n\n      
  return new NormalDist($this->mu - $x2, $this->sigma);\n    }\n\n    /**\n     * Multiplies both the mean (mu) and standard deviation (sigma) by a constant.\n     *\n     * This method is useful for rescaling distributions, such as when changing\n     * measurement units. The standard deviation is scaled by the absolute value\n     * of the constant to ensure it remains non-negative.\n     *\n     * @param float $constant The constant by which to scale mu and sigma.\n     * @return NormalDist A new NormalDist instance with scaled mu and sigma.\n     */\n    public function multiply(float $constant): NormalDist\n    {\n        return new self(\n            $this->mu * $constant,                  // Scale the mean\n            $this->sigma * abs($constant),          // Scale the standard deviation by the absolute value of the constant\n        );\n    }\n\n    /**\n     * Divides both the mean (mu) and standard deviation (sigma) by a constant.\n     *\n     * This method is useful for rescaling distributions, such as when changing\n     * measurement units. The standard deviation is scaled by the absolute value\n     * of the constant to ensure it remains non-negative.\n     *\n     * @param float $constant The constant by which to divide mu and sigma (must not be zero).\n     * @return NormalDist A new NormalDist instance with scaled mu and sigma.\n     * @throws InvalidDataInputException If the constant is zero.\n     */\n    public function divide(float $constant): NormalDist\n    {\n        if ($constant === 0.0) {\n            throw new InvalidDataInputException('Cannot divide by zero.');\n        }\n\n        return new self(\n            $this->mu / $constant,\n            $this->sigma / abs($constant),\n        );\n    }\n}\n"
  },
  {
    "path": "src/Stat.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics;\n\nuse HiFolks\\Statistics\\Enums\\Alternative;\nuse HiFolks\\Statistics\\Enums\\KdeKernel;\nuse HiFolks\\Statistics\\Exception\\InvalidDataInputException;\nuse HiFolks\\Statistics\\NormalDist;\nuse HiFolks\\Statistics\\StudentT;\nuse HiFolks\\Statistics\\Utils\\Math;\n\nclass Stat\n{\n    final public const MEDIAN_TYPE_LOW = \"LOW\";\n\n    final public const MEDIAN_TYPE_HIGH = \"HIGH\";\n\n    final public const MEDIAN_TYPE_MIDDLE = \"MIDDLE\";\n\n    /**\n     * Count the element in the array\n     *\n     * @param  mixed[]  $data\n     */\n    public static function count(array $data): int\n    {\n        return count($data);\n    }\n\n    /**\n     * Return the sample arithmetic mean of numeric data\n     * The arithmetic mean is the sum of the data divided by the number of data points.\n     * It is commonly called “the average”,\n     * although it is only one of many different mathematical averages.\n     * It is a measure of the central location of the data.\n     * If data is empty, null is returned\n     *\n     * @param  array<int|float>  $data array of data\n     * @return int|float|null arithmetic mean\n     *\n     * @throws InvalidDataInputException if the data is empty\n     */\n    public static function mean(array $data): int|float|null\n    {\n        $sum = 0;\n        if (self::count($data) === 0) {\n            throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n        if (array_filter($data, is_string(...)) !== []) {\n            throw new InvalidDataInputException(\n                \"The data array contains a string.\",\n            );\n        }\n        $sum = array_sum($data);\n\n        return $sum / self::count($data);\n    }\n\n    /**\n     * Calculate the float number arithmetic mean of a float numbers dataset with optional weights and precision.\n     *\n     * Supports both unweighted and weighted means. 
Automatically casts values to float.\n     * Throws an exception if the input data is empty.\n     *\n     * @param  array<float>  $data Array of floating numbers\n     * @param  null|array<float>  $weights Optional array of weights (same length as $data).\n     * @param null|int $precision Optional number of decimal places to round the result (default is null, no round() is applied).\n     * @return float|null arithmetic mean\n     *\n     * @throws InvalidDataInputException if the data is empty\n     */\n    public static function fmean(\n        array $data,\n        ?array $weights = null,\n        ?int $precision = null,\n    ): ?float {\n        $sum = 0;\n        $count = self::count($data);\n        if ($count === 0) {\n            throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n\n        // Unweighted mean\n        if ($weights === null) {\n            $sum = array_sum(array_map(floatval(...), $data));\n            $count = count($data);\n            // Compare against null so that a precision of 0 still rounds\n            if ($precision !== null) {\n                return round($sum / $count, $precision);\n            }\n            return $sum / $count;\n        }\n\n        // Check lengths\n        if ($count !== count($weights)) {\n            throw new InvalidDataInputException(\n                \"The data and weights must be the same length\",\n            );\n        }\n\n        $weightedSum = 0.0;\n        $weightTotal = 0.0;\n        foreach ($data as $i => $value) {\n            $w = floatval($weights[$i]);\n            $weightedSum += floatval($value) * $w;\n            $weightTotal += $w;\n        }\n\n        if ($weightTotal == 0) {\n            throw new InvalidDataInputException(\n                \"The sum of weights must be non-zero\",\n            );\n        }\n\n        if ($precision !== null) {\n            return round($weightedSum / $weightTotal, $precision);\n        }\n        return $weightedSum / $weightTotal;\n    }\n\n    /**\n     * Return the trimmed (truncated) mean of the data.\n    
 * Computes the mean after removing the lowest and highest fraction of values.\n     * This is a robust measure of central tendency, less sensitive to outliers.\n     *\n     * @param  array<int|float>  $data\n     * @param  float  $proportionToCut  fraction (0..0.5) to trim from each side\n     * @param  int|null  $round whether to round the result\n     * @return float the trimmed mean\n     *\n     * @throws InvalidDataInputException if the data is empty, or proportionToCut is out of range,\n     *         or trimming would remove all elements\n     */\n    public static function trimmedMean(\n        array $data,\n        float $proportionToCut = 0.1,\n        ?int $round = null,\n    ): float {\n        $count = self::count($data);\n        if ($count === 0) {\n            throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n        if ($proportionToCut < 0 || $proportionToCut >= 0.5) {\n            throw new InvalidDataInputException(\n                \"proportionToCut must be in the range [0, 0.5).\",\n            );\n        }\n\n        sort($data);\n        $trimCount = (int) floor($count * $proportionToCut);\n        $trimmedData = array_slice($data, $trimCount, $count - 2 * $trimCount);\n\n        if ($trimmedData === []) {\n            // @codeCoverageIgnoreStart\n            throw new InvalidDataInputException(\n                \"Trimming removed all elements.\",\n            );\n            // @codeCoverageIgnoreEnd\n        }\n\n        return Math::round(\n            array_sum($trimmedData) / count($trimmedData),\n            $round,\n        );\n    }\n\n    /**\n     * Return the median (middle value) of data,\n     * using the common “mean of middle two” method.\n     *\n     * @param  mixed[]  $data\n     * @return mixed median of the data\n     * @throws InvalidDataInputException if the data is empty\n     */\n    public static function median(\n        array $data,\n        string $medianType = 
self::MEDIAN_TYPE_MIDDLE,\n    ): mixed {\n        sort($data);\n        $count = self::count($data);\n        if ($count === 0) {\n            throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n        $index = (int) floor($count / 2); // cache the index\n        if (($count & 1) !== 0) {\n            // count is odd\n            return $data[$index];\n        }\n\n        // count is even\n        return match ($medianType) {\n            self::MEDIAN_TYPE_LOW => $data[$index - 1],\n            self::MEDIAN_TYPE_HIGH => $data[$index],\n            default => ($data[$index - 1] + $data[$index]) / 2,\n        };\n    }\n\n    /**\n     * Return the weighted median of the data.\n     * The weighted median is the value where the cumulative weight\n     * reaches 50% of the total weight.\n     *\n     * @param  array<int|float>  $data\n     * @param  array<int|float>  $weights  array of weights (same length as $data, all > 0)\n     * @param  int|null  $round whether to round the result\n     * @return float the weighted median\n     *\n     * @throws InvalidDataInputException if the data is empty, weights length mismatches,\n     *         or any weight is not positive\n     */\n    public static function weightedMedian(array $data, array $weights, ?int $round = null): float\n    {\n        $count = self::count($data);\n        if ($count === 0) {\n            throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n        if ($count !== count($weights)) {\n            throw new InvalidDataInputException(\n                \"Data and weights must have the same number of elements.\",\n            );\n        }\n\n        // Validate weights and pair with data\n        $paired = [];\n        for ($i = 0; $i < $count; $i++) {\n            if (!is_numeric($weights[$i]) || $weights[$i] <= 0) {\n                throw new InvalidDataInputException(\n                    \"All weights must be positive numbers.\",\n        
        );\n            }\n            $paired[] = [(float) $data[$i], (float) $weights[$i]];\n        }\n\n        // Sort by value\n        usort($paired, fn(array $a, array $b): int => $a[0] <=> $b[0]);\n\n        $totalWeight = array_sum($weights);\n        $halfWeight = $totalWeight / 2.0;\n        $cumulative = 0.0;\n\n        for ($i = 0; $i < $count; $i++) {\n            $cumulative += $paired[$i][1];\n            if ($cumulative > $halfWeight) {\n                return Math::round($paired[$i][0], $round);\n            }\n            if ($cumulative === $halfWeight && $i + 1 < $count) {\n                // Exactly at the midpoint — average with the next value\n                return Math::round(($paired[$i][0] + $paired[$i + 1][0]) / 2.0, $round);\n            }\n        }\n\n        // @codeCoverageIgnoreStart\n        // Fallback: last element (all weight in one point)\n        /** @var array{float, float} $last */\n        $last = end($paired);\n\n        return Math::round($last[0], $round);\n        // @codeCoverageIgnoreEnd\n    }\n\n    /**\n     * Estimate the median for grouped data that has been binned\n     * around the midpoints of consecutive, fixed-width intervals.\n     *\n     * Uses interpolation within the median interval:\n     * L + interval * (n/2 - cf) / f\n     *\n     * where:\n     * - L is the lower limit of the median interval\n     * - cf is the cumulative frequency of the preceding interval\n     * - f is the number of elements in the median interval\n     *\n     * @param  array<int|float>  $data\n     * @param  float  $interval the width of each bin\n     * @return float the estimated median for grouped data\n     *\n     * @throws InvalidDataInputException if the data is empty\n     */\n    public static function medianGrouped(\n        array $data,\n        float $interval = 1.0,\n    ): float {\n        sort($data);\n        $n = count($data);\n        if ($n === 0) {\n            throw new InvalidDataInputException(\"The 
data must not be empty.\");\n        }\n\n        // Find the value at the midpoint (midpoint of the class interval)\n        $x = (float) $data[intdiv($n, 2)];\n\n        // Find where all the x values occur in the sorted data\n        // All x will lie within data[i:j]\n        $i = self::bisectLeft($data, $x);\n        $j = self::bisectRight($data, $x, $i);\n\n        // Lower limit of the median interval\n        $L = $x - $interval / 2.0;\n        // Cumulative frequency of the preceding interval\n        $cf = $i;\n        // Number of elements in the median interval\n        $f = $j - $i;\n\n        return $L + ($interval * ($n / 2.0 - $cf)) / $f;\n    }\n\n    /**\n     * Binary search: find the leftmost position where $target can be inserted\n     * in $data while keeping it sorted.\n     *\n     * @param  array<int|float>  $data sorted array\n     * @param  float  $target value to locate\n     */\n    private static function bisectLeft(array $data, float $target): int\n    {\n        $lo = 0;\n        $hi = count($data);\n        while ($lo < $hi) {\n            $mid = intdiv($lo + $hi, 2);\n            if ($data[$mid] < $target) {\n                $lo = $mid + 1;\n            } else {\n                $hi = $mid;\n            }\n        }\n\n        return $lo;\n    }\n\n    /**\n     * Binary search: find the rightmost position where $target can be inserted\n     * in $data while keeping it sorted.\n     *\n     * @param  array<int|float>  $data sorted array\n     * @param  float  $target value to locate\n     * @param  int  $lo lower bound for the search\n     */\n    private static function bisectRight(\n        array $data,\n        float $target,\n        int $lo = 0,\n    ): int {\n        $hi = count($data);\n        while ($lo < $hi) {\n            $mid = intdiv($lo + $hi, 2);\n            if ($data[$mid] <= $target) {\n                $lo = $mid + 1;\n            } else {\n                $hi = $mid;\n            }\n        }\n\n        return 
$lo;\n    }\n\n    /**\n     * Return the low median of data.\n     * The low median is always a member of the data set.\n     * When the number of data points is odd, the middle value is returned.\n     * When it is even, the smaller of the two middle values is returned.\n     *\n     * @param  mixed[]  $data\n     *\n     * @see Stat::median()\n     *\n     * @return mixed low median of the data\n     */\n    public static function medianLow(array $data): mixed\n    {\n        return self::median($data, self::MEDIAN_TYPE_LOW);\n    }\n\n    /**\n     * Return the high median of data.\n     * The high median is always a member of the data set.\n     * When the number of data points is odd, the middle value is returned.\n     * When it is even, the larger of the two middle values is returned.\n     *\n     * @param  mixed[]  $data\n     *\n     * @see Stat::median()\n     *\n     * @return mixed high median of the data\n     */\n    public static function medianHigh(array $data): mixed\n    {\n        return self::median($data, self::MEDIAN_TYPE_HIGH);\n    }\n\n    /**\n     * Return the most common data point from discrete or nominal data.\n     * The mode (when it exists) is the most typical value and serves as a measure of central location.\n     * If there are multiple modes with the same frequency, returns the first one encountered in the data.\n     *\n     * @param  mixed[]  $data\n     * @param  bool  $multimode whether to return all the modes\n     * @return mixed|mixed[]|null the most common data point, array of them or null, if there is no mode\n     *\n     * @throws InvalidDataInputException if the data is empty\n     */\n    public static function mode(array $data, bool $multimode = false): mixed\n    {\n        $frequencies = Freq::frequencies($data);\n        if ($frequencies === []) {\n            throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n        $sameMode = true;\n        foreach ($frequencies as $value) 
{\n            if ($value > 1) {\n                $sameMode = false;\n\n                break;\n            }\n        }\n        if ($sameMode) {\n            return null;\n        }\n\n        $highestFreq = max($frequencies);\n        $modes = array_keys($frequencies, $highestFreq, true);\n        if ($multimode) {\n            return $modes;\n        }\n\n        return $modes[0];\n    }\n\n    /**\n     * Return a list of the most frequently occurring values\n     *\n     * @param  mixed[]  $data\n     *\n     * @see Stat::mode()\n     *\n     * @return mixed[]|null array of the most common data points, or null if every element occurs only once\n     */\n    public static function multimode(array $data): ?array\n    {\n        return self::mode($data, true);\n    }\n\n    /**\n     * Return the quantiles of the data.\n     *\n     * @param  mixed[]  $data\n     * @param  int  $n number of quantiles\n     * @param  int|null  $round whether to round the result\n     * @param  string  $method 'exclusive' (default) or 'inclusive'\n     * @return mixed[] array of quantiles\n     *\n     * @throws InvalidDataInputException if number of quantiles is less than 1, or the data size is less than 2, or the method is invalid\n     */\n    public static function quantiles(\n        array $data,\n        int $n = 4,\n        ?int $round = null,\n        string $method = \"exclusive\",\n    ): array {\n        $count = self::count($data);\n        if ($count < 2 || $n < 1) {\n            throw new InvalidDataInputException(\n                \"The data must contain at least 2 elements and the number of quantiles must be at least 1.\",\n            );\n        }\n\n        if ($method !== \"exclusive\" && $method !== \"inclusive\") {\n            throw new InvalidDataInputException(\n                \"Invalid method '{$method}'. 
Must be 'exclusive' or 'inclusive'.\",\n            );\n        }\n\n        sort($data);\n        $result = [];\n\n        if ($method === \"inclusive\") {\n            $m = $count - 1;\n            // A plain for loop yields no cut points when $n is 1;\n            // range(1, 0) would count down and produce [1, 0].\n            for ($i = 1; $i < $n; $i++) {\n                $j = intdiv($i * $m, $n);\n                $delta = $i * $m - $j * $n;\n                $interpolated\n                    = ($data[$j] * ($n - $delta) + $data[$j + 1] * $delta) / $n;\n                $result[] = Math::round($interpolated, $round);\n            }\n        } else {\n            $m = $count + 1;\n            for ($i = 1; $i < $n; $i++) {\n                $j = (int) floor(($i * $m) / $n);\n                if ($j < 1) {\n                    $j = 1;\n                } elseif ($j > $count - 1) {\n                    $j = $count - 1;\n                }\n                $delta = $i * $m - $j * $n;\n                $interpolated\n                    = ($data[$j - 1] * ($n - $delta) + $data[$j] * $delta) / $n;\n                $result[] = Math::round($interpolated, $round);\n            }\n        }\n\n        return $result;\n    }\n\n    /**\n     * Return the first or lower quartile a.k.a. 
75th percentile.\n     *\n     * @param  mixed[]  $data\n     *\n     * @see Stat::quantiles()\n     *\n     * @return mixed the third quartile\n     */\n    public static function thirdQuartile(array $data): mixed\n    {\n        $quartiles = self::quantiles($data, 4);\n\n        return $quartiles[2];\n    }\n\n    /**\n     * Return the value at the given percentile of the data.\n     *\n     * Uses linear interpolation between adjacent data points,\n     * consistent with the exclusive quantile method.\n     *\n     * @param  array<int|float>  $data\n     * @param  float  $p  percentile in range 0..100\n     * @param  int|null  $round whether to round the result\n     * @return float the interpolated value at the given percentile\n     *\n     * @throws InvalidDataInputException if the data has fewer than 2 elements or p is out of range\n     */\n    public static function percentile(\n        array $data,\n        float $p,\n        ?int $round = null,\n    ): float {\n        $count = self::count($data);\n        if ($count < 2) {\n            throw new InvalidDataInputException(\n                \"Percentile requires at least 2 data points.\",\n            );\n        }\n        if ($p < 0 || $p > 100) {\n            throw new InvalidDataInputException(\n                \"Percentile must be between 0 and 100.\",\n            );\n        }\n\n        sort($data);\n\n        // Exclusive method: rank = p/100 * (n + 1), 1-based index\n        $rank = ($p / 100) * ($count + 1);\n\n        if ($rank <= 1) {\n            return Math::round((float) $data[0], $round);\n        }\n        if ($rank >= $count) {\n            return Math::round((float) $data[$count - 1], $round);\n        }\n\n        $lower = (int) floor($rank) - 1;\n        $fraction = $rank - floor($rank);\n        $interpolated\n            = $data[$lower] + $fraction * ($data[$lower + 1] - $data[$lower]);\n\n        return Math::round($interpolated, $round);\n    }\n\n    /**\n     * Return the 
**population** standard deviation,\n     * a measure of the amount of variation or dispersion of a set of values.\n     * A low standard deviation indicates that\n     * the values tend to be close to the mean of the set,\n     * while a high standard deviation indicates that\n     * the values are spread out over a wider range.\n     *\n     * @param  array<int|float>  $data\n     * @param  int|null  $round whether to round the result\n     *\n     * @see Stat::pvariance()\n     *\n     * @return float the population standard deviation\n     */\n    public static function pstdev(array $data, ?int $round = null): float\n    {\n        $variance = self::pvariance($data);\n\n        return Math::round(sqrt($variance), $round);\n    }\n\n    /**\n     * Return dispersion of the numeric data.\n     *\n     * @param  array<int|float>  $data\n     * @param  int|null  $round whether to round the result\n     * @return float the dispersion of the data\n     *\n     * @throws InvalidDataInputException if the data is empty\n     */\n    public static function pvariance(\n        array $data,\n        ?int $round = null,\n        int|float|null $mu = null,\n    ): float {\n        $num_of_elements = self::count($data);\n        if ($num_of_elements === 0) {\n            throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n        $sumSquareDifferences = 0.0;\n        $average = $mu ?? 
self::mean($data);\n\n        foreach ($data as $i) {\n            // sum of squares of differences between\n            // all numbers and means.\n            $sumSquareDifferences += ($i - $average) ** 2;\n        }\n\n        return Math::round($sumSquareDifferences / $num_of_elements, $round);\n    }\n\n    /**\n     * Return the standard deviation of the numeric data.\n     *\n     * @param  array<int|float>  $data\n     * @param  int|null  $round whether to round the result\n     *\n     * @see Stat::variance()\n     *\n     * @return float the standard deviation of the numeric data\n     */\n    public static function stdev(array $data, ?int $round = null): float\n    {\n        $variance = self::variance($data);\n\n        return Math::round(sqrt($variance), $round);\n    }\n\n    /**\n     * Return the standard error of the mean (SEM).\n     * SEM measures how precisely the sample mean estimates the population mean.\n     *\n     * Formula: stdev / sqrt(n)\n     *\n     * @param  array<int|float>  $data\n     * @param  int|null  $round whether to round the result\n     * @return float the standard error of the mean\n     *\n     * @throws InvalidDataInputException if data size is less than 2\n     */\n    public static function sem(array $data, ?int $round = null): float\n    {\n        return Math::round(self::stdev($data) / sqrt(self::count($data)), $round);\n    }\n\n    /**\n     * Return the mean absolute deviation (MAD) of the data.\n     * MAD is the average of the absolute deviations from the mean.\n     *\n     * Formula: (1/n) * Σ|xi - mean|\n     *\n     * @param  array<int|float>  $data\n     * @param  int|null  $round whether to round the result\n     * @return float the mean absolute deviation\n     *\n     * @throws InvalidDataInputException if the data is empty\n     */\n    public static function meanAbsoluteDeviation(array $data, ?int $round = null): float\n    {\n        $count = self::count($data);\n        if ($count === 0) {\n         
   throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n\n        $mean = self::mean($data);\n        $sum = 0.0;\n        foreach ($data as $value) {\n            $sum += abs($value - $mean);\n        }\n\n        return Math::round($sum / $count, $round);\n    }\n\n    /**\n     * Return the median absolute deviation of the data.\n     * This is the median of the absolute deviations from the median.\n     * It is a robust measure of dispersion, highly resistant to outliers.\n     *\n     * Formula: median(|xi - median(x)|)\n     *\n     * @param  array<int|float>  $data\n     * @param  int|null  $round whether to round the result\n     * @return float the median absolute deviation\n     *\n     * @throws InvalidDataInputException if the data is empty\n     */\n    public static function medianAbsoluteDeviation(array $data, ?int $round = null): float\n    {\n        $count = self::count($data);\n        if ($count === 0) {\n            throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n\n        $median = self::median($data);\n        $deviations = [];\n        foreach ($data as $value) {\n            $deviations[] = abs($value - $median);\n        }\n\n        return Math::round((float) self::median($deviations), $round);\n    }\n\n    /**\n     * Return the z-scores for each value in the dataset.\n     * A z-score indicates how many standard deviations a value is from the mean.\n     *\n     * Formula: zi = (xi - mean) / stdev\n     *\n     * @param  array<int|float>  $data\n     * @param  int|null  $round whether to round each z-score\n     * @return array<float> the z-scores\n     *\n     * @throws InvalidDataInputException if data size is less than 2 or stdev is zero\n     */\n    public static function zscores(array $data, ?int $round = null): array\n    {\n        $mean = self::mean($data);\n        $stdev = self::stdev($data);\n        if ($stdev == 0) {\n            throw new 
InvalidDataInputException(\n                \"Z-scores are undefined when all values are identical (standard deviation is zero).\",\n            );\n        }\n\n        $zscores = [];\n        foreach ($data as $value) {\n            $zscores[] = Math::round(($value - $mean) / $stdev, $round);\n        }\n\n        return $zscores;\n    }\n\n    /**\n     * Return values from the dataset that are outliers based on z-score threshold.\n     * A value is considered an outlier if its absolute z-score exceeds the threshold.\n     *\n     * The default threshold of 3.0 is a common convention (values more than 3 standard\n     * deviations from the mean).\n     *\n     * @param  array<int|float>  $data\n     * @param  float  $threshold  absolute z-score threshold (default 3.0)\n     * @return array<int|float> the outlier values\n     *\n     * @throws InvalidDataInputException if data size is less than 2 or stdev is zero\n     */\n    public static function outliers(array $data, float $threshold = 3.0): array\n    {\n        $zscores = self::zscores($data);\n        $outliers = [];\n        foreach ($data as $i => $value) {\n            if (abs($zscores[$i]) > $threshold) {\n                $outliers[] = $value;\n            }\n        }\n\n        return $outliers;\n    }\n\n    /**\n     * Return values from the dataset that are outliers based on the IQR method.\n     * A value is an outlier if it falls below Q1 - factor * IQR or above Q3 + factor * IQR.\n     *\n     * This method is robust and does not assume a normal distribution, making it\n     * suitable for skewed data. 
It is the same method used for box plot whiskers.\n     *\n     * @param  array<int|float>  $data\n     * @param  float  $factor  IQR multiplier (default 1.5 for mild outliers, use 3.0 for extreme)\n     * @return array<int|float> the outlier values\n     *\n     * @throws InvalidDataInputException if data has fewer than 2 elements\n     */\n    public static function iqrOutliers(array $data, float $factor = 1.5): array\n    {\n        $q1 = self::firstQuartile($data);\n        $q3 = self::thirdQuartile($data);\n        $iqr = $q3 - $q1;\n        $lowerFence = $q1 - $factor * $iqr;\n        $upperFence = $q3 + $factor * $iqr;\n\n        $outliers = [];\n        foreach ($data as $value) {\n            if ($value < $lowerFence || $value > $upperFence) {\n                $outliers[] = $value;\n            }\n        }\n\n        return $outliers;\n    }\n\n    /**\n     * Return the variance from the numeric data.\n     *\n     * @param  array<int|float>  $data\n     * @param  int|null  $round whether to round the result\n     * @return float the variance\n     *\n     * @throws InvalidDataInputException if data size is less than 2\n     */\n    public static function variance(\n        array $data,\n        ?int $round = null,\n        int|float|null $xbar = null,\n    ): float {\n        $num_of_elements = self::count($data);\n        if ($num_of_elements <= 1) {\n            throw new InvalidDataInputException(\n                \"The data size must be greater than 1.\",\n            );\n        }\n        $sumSquareDifferences = 0.0;\n        $average = $xbar ?? 
self::mean($data);\n\n        foreach ($data as $i) {\n            // sum of squares of differences between\n            // all numbers and means.\n            $sumSquareDifferences += ($i - $average) ** 2;\n        }\n\n        return Math::round(\n            $sumSquareDifferences / ($num_of_elements - 1),\n            $round,\n        );\n    }\n\n    /**\n     * Return the adjusted Fisher-Pearson sample skewness of the data.\n     * This is the same formula used by Excel's SKEW() and scipy.stats.skew(bias=False).\n     *\n     * Formula: [n / ((n-1)(n-2))] * Σ((xi - x̄) / s)³\n     *\n     * @param  array<int|float>  $data\n     * @param  int|null  $round whether to round the result\n     * @return float skewness\n     *\n     * @throws InvalidDataInputException if the data has fewer than 3 elements or all values are identical\n     */\n    public static function skewness(array $data, ?int $round = null): float\n    {\n        $n = self::count($data);\n        if ($n < 3) {\n            throw new InvalidDataInputException(\n                \"Skewness requires at least 3 data points.\",\n            );\n        }\n\n        $mean = self::mean($data);\n        $stdev = self::stdev($data);\n\n        if ($stdev == 0) {\n            throw new InvalidDataInputException(\n                \"Skewness is undefined when all values are identical (standard deviation is zero).\",\n            );\n        }\n\n        $sumCubes = 0.0;\n        foreach ($data as $xi) {\n            $sumCubes += (($xi - $mean) / $stdev) ** 3;\n        }\n\n        $skewness = ($n / (($n - 1) * ($n - 2))) * $sumCubes;\n\n        return Math::round($skewness, $round);\n    }\n\n    /**\n     * Return the population (biased) skewness of the data.\n     * This is the same formula used by scipy.stats.skew(bias=True).\n     *\n     * Formula: (1/n) * Σ((xi - x̄) / σ)³\n     *\n     * @param  array<int|float>  $data\n     * @param  int|null  $round whether to round the result\n     * @return float 
population skewness\n     *\n     * @throws InvalidDataInputException if the data has fewer than 3 elements or all values are identical\n     */\n    public static function pskewness(array $data, ?int $round = null): float\n    {\n        $n = self::count($data);\n        if ($n < 3) {\n            throw new InvalidDataInputException(\n                \"Skewness requires at least 3 data points.\",\n            );\n        }\n\n        $mean = self::mean($data);\n        $pstdev = self::pstdev($data);\n\n        if ($pstdev == 0) {\n            throw new InvalidDataInputException(\n                \"Skewness is undefined when all values are identical (standard deviation is zero).\",\n            );\n        }\n\n        $sumCubes = 0.0;\n        foreach ($data as $xi) {\n            $sumCubes += (($xi - $mean) / $pstdev) ** 3;\n        }\n\n        $pskewness = $sumCubes / $n;\n\n        return Math::round($pskewness, $round);\n    }\n\n    /**\n     * Return the excess kurtosis of the data using the sample formula.\n     * This is the same formula used by Excel's KURT() and Python's\n     * scipy.stats.kurtosis(bias=False, fisher=True).\n     *\n     * Excess kurtosis measures the \"tailedness\" of a distribution relative\n     * to a normal distribution. 
A normal distribution has excess kurtosis 0.\n     * Positive values (leptokurtic) indicate heavier tails and more outliers;\n     * negative values (platykurtic) indicate lighter tails and fewer outliers.\n     *\n     * Formula: [n(n+1) / ((n-1)(n-2)(n-3))] * Σ((xi - x̄) / s)⁴ − [3(n-1)² / ((n-2)(n-3))]\n     *\n     * @param  array<int|float>  $data\n     * @param  int|null  $round whether to round the result\n     * @return float excess kurtosis\n     *\n     * @throws InvalidDataInputException if the data has fewer than 4 elements or all values are identical\n     */\n    public static function kurtosis(array $data, ?int $round = null): float\n    {\n        $n = self::count($data);\n        if ($n < 4) {\n            throw new InvalidDataInputException(\n                \"Kurtosis requires at least 4 data points.\",\n            );\n        }\n\n        $mean = self::mean($data);\n        $stdev = self::stdev($data);\n\n        if ($stdev == 0) {\n            throw new InvalidDataInputException(\n                \"Kurtosis is undefined when all values are identical (standard deviation is zero).\",\n            );\n        }\n\n        $sumFourth = 0.0;\n        foreach ($data as $xi) {\n            $sumFourth += (($xi - $mean) / $stdev) ** 4;\n        }\n\n        $kurtosis\n            = (($n * ($n + 1)) / (($n - 1) * ($n - 2) * ($n - 3))) * $sumFourth\n            - (3 * ($n - 1) ** 2) / (($n - 2) * ($n - 3));\n\n        return Math::round($kurtosis, $round);\n    }\n\n    /**\n     * Return the coefficient of variation (CV) of the data.\n     * The coefficient of variation is the ratio of the standard deviation\n     * to the mean, expressed as a percentage. 
It measures relative variability\n     * and is useful for comparing dispersion across datasets with different units or scales.\n     *\n     * @param  array<int|float>  $data\n     * @param  int|null  $round whether to round the result\n     * @param  bool  $population if true, use population stdev/mean; otherwise sample\n     * @return float the coefficient of variation as a percentage\n     *\n     * @throws InvalidDataInputException if the data has fewer than 2 elements (sample)\n     *         or is empty (population), or if the mean is zero\n     */\n    public static function coefficientOfVariation(\n        array $data,\n        ?int $round = null,\n        bool $population = false,\n    ): float {\n        $mean = self::mean($data);\n        if ($mean == 0) {\n            throw new InvalidDataInputException(\n                \"Coefficient of variation is undefined when the mean is zero.\",\n            );\n        }\n\n        $sd = $population ? self::pstdev($data) : self::stdev($data);\n\n        return Math::round(($sd / abs($mean)) * 100, $round);\n    }\n\n    /**\n     * Return the geometric mean of the numeric data.\n     * That is the number that can replace each of these numbers so that their product\n     * does not change.\n     *\n     * @param  array<int|float>  $data\n     * @param  int|null  $round whether to round the result\n     * @return float geometric mean\n     *\n     * @throws InvalidDataInputException if the data is empty\n     */\n    public static function geometricMean(array $data, ?int $round = null): float\n    {\n        $count = self::count($data);\n        if ($count === 0) {\n            throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n        $product = 1;\n        foreach ($data as $value) {\n            $product *= $value;\n        }\n        $geometricMean = $product ** (1 / $count);\n\n        return Math::round($geometricMean, $round);\n    }\n\n    /**\n     * Return the harmonic 
mean (the reciprocal of the arithmetic mean of the reciprocals) of the numeric data.\n     *\n     * @param  array<int|float>  $data\n     * @param mixed[]|null $weights optional weights for the elements (as if each element were repeated that many times)\n     * @param  int|null  $round whether to round the result\n     * @return float harmonic mean\n     *\n     * @throws InvalidDataInputException if the data is empty\n     */\n    public static function harmonicMean(\n        array $data,\n        ?array $weights = null,\n        ?int $round = null,\n    ): float {\n        $sum = 0;\n        $count = self::count($data);\n        if ($count === 0) {\n            throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n        $sumWeight = 0;\n        foreach ($data as $key => $value) {\n            if (!$value) {\n                return 0;\n            }\n            $weight = is_null($weights) ? 1 : $weights[$key];\n            $sumWeight += $weight;\n            $sum += $weight / $value;\n        }\n\n        return Math::round($sumWeight / $sum, $round);\n    }\n\n    /**\n     * Return the sample covariance of two inputs *$x* and *$y*.\n     * Covariance is a measure of the joint variability of two inputs.\n     *\n     * @param  array<int|float>  $x\n     * @param  array<int|float>  $y\n     *\n     * @throws InvalidDataInputException if the 2 arrays have different sizes,\n     * or if the arrays have fewer than 2 elements, or if either array contains non-numeric elements\n     */\n    public static function covariance(array $x, array $y): false|float\n    {\n        $countX = count($x);\n        $countY = count($y);\n        if ($countX !== $countY) {\n            throw new InvalidDataInputException(\n                \"Covariance requires that both inputs have same number of data points.\",\n            );\n        }\n        if ($countX < 2) {\n            throw new InvalidDataInputException(\n                \"Covariance requires at least two data points.\",\n            );\n   
     }\n        $meanX = self::mean($x);\n        $meanY = self::mean($y);\n        $add = 0.0;\n\n        for ($pos = 0; $pos < $countX; $pos++) {\n            $valueX = $x[$pos];\n            if (!is_numeric($valueX)) {\n                throw new InvalidDataInputException(\n                    \"Covariance requires numeric data points.\",\n                );\n            }\n            $valueY = $y[$pos];\n            if (!is_numeric($valueY)) {\n                throw new InvalidDataInputException(\n                    \"Covariance requires numeric data points.\",\n                );\n            }\n            $diffX = $valueX - $meanX;\n            $diffY = $valueY - $meanY;\n            $add += $diffX * $diffY;\n        }\n\n        // covariance for sample: N - 1\n        return $add / (float) ($countX - 1);\n    }\n\n    /**\n     * Return Pearson’s correlation coefficient for two inputs.\n     * Pearson’s correlation coefficient r takes values between -1 and +1.\n     * It measures the strength and direction of the linear relationship,\n     * where +1 means a perfect positive linear relationship,\n     * -1 a perfect negative linear relationship,\n     * and 0 no linear relationship.\n     * With $method set to 'ranked', the inputs are first replaced by their average ranks,\n     * which yields Spearman’s rank correlation coefficient.\n     *\n     * @param  array<int|float>  $x\n     * @param  array<int|float>  $y\n     * @param  string  $method  'linear' (Pearson) or 'ranked' (Spearman)\n     *\n     * @throws InvalidDataInputException if the 2 arrays have different sizes,\n     * or if the arrays have fewer than 2 elements, or if either array contains non-numeric elements,\n     * or if at least one of the inputs is constant, or if $method is not 'linear' or 'ranked'\n     */\n    public static function correlation(\n        array $x,\n        array $y,\n        string $method = \"linear\",\n    ): false|float {\n        if ($method !== \"linear\" && $method !== \"ranked\") {\n            throw new InvalidDataInputException(\n                \"Correlation method must be 'linear' or 'ranked'.\",\n            );\n        }\n\n        $countX = count($x);\n        $countY = count($y);\n        if ($countX !== $countY) {\n    
        throw new InvalidDataInputException(\n                \"Correlation requires that both inputs have same number of data points.\",\n            );\n        }\n        if ($countX < 2) {\n            throw new InvalidDataInputException(\n                \"Correlation requires at least two data points.\",\n            );\n        }\n\n        if ($method === \"ranked\") {\n            $x = self::ranks($x);\n            $y = self::ranks($y);\n        }\n\n        $meanX = self::mean($x);\n        $meanY = self::mean($y);\n        $a = 0;\n        $bx = 0;\n        $by = 0;\n        $counter = count($x);\n        for ($i = 0; $i < $counter; $i++) {\n            $xr = $x[$i] - $meanX;\n            $yr = $y[$i] - $meanY;\n            $a += $xr * $yr;\n            $bx += $xr ** 2;\n            $by += $yr ** 2;\n        }\n        $b = sqrt($bx * $by);\n        if ($b == 0) {\n            throw new InvalidDataInputException(\n                \"Correlation, at least one of the inputs is constant.\",\n            );\n        }\n\n        return $a / $b;\n    }\n\n    /**\n     * Assign average ranks to data values (handles ties by averaging).\n     *\n     * @param  array<int|float>  $data\n     * @return array<float>\n     */\n    private static function ranks(array $data): array\n    {\n        $n = count($data);\n        $indexed = [];\n        for ($i = 0; $i < $n; $i++) {\n            $indexed[] = [$data[$i], $i];\n        }\n\n        usort($indexed, fn(array $a, array $b): int => $a[0] <=> $b[0]);\n\n        $ranks = array_fill(0, $n, 0.0);\n        $i = 0;\n        while ($i < $n) {\n            $j = $i;\n            while ($j < $n && $indexed[$j][0] === $indexed[$i][0]) {\n                $j++;\n            }\n            $averageRank = ($i + 1 + $j) / 2.0;\n            for ($k = $i; $k < $j; $k++) {\n                $ranks[$indexed[$k][1]] = $averageRank;\n            }\n            $i = $j;\n        }\n\n        return $ranks;\n    }\n\n    /**\n     * 
Create a continuous probability density function or cumulative distribution\n     * function from discrete sample data using Kernel Density Estimation.\n     *\n     * Returns a Closure that estimates the density (or CDF) at any given point.\n     *\n     * @param  array<int|float>  $data  sample data\n     * @param  float  $h  bandwidth (smoothing parameter), must be > 0\n     * @param  KdeKernel  $kernel  kernel to use for estimation\n     * @param  bool  $cumulative  if true, return CDF estimator; otherwise PDF estimator\n     * @return \\Closure  a callable that takes a float and returns the estimated density or CDF value\n     *\n     * @throws InvalidDataInputException if data is empty or bandwidth <= 0\n     */\n    public static function kde(\n        array $data,\n        float $h,\n        KdeKernel $kernel = KdeKernel::Normal,\n        bool $cumulative = false,\n    ): \\Closure {\n        if ($data === []) {\n            throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n        if ($h <= 0) {\n            throw new InvalidDataInputException(\n                \"Bandwidth h must be positive.\",\n            );\n        }\n\n        $kernel = $kernel->resolve();\n\n        $sqrt2pi = sqrt(2.0 * M_PI);\n\n        // Standard normal CDF using Abramowitz & Stegun approximation (26.2.17)\n        $normalCdf = static function (float $t) use ($sqrt2pi): float {\n            $negative = $t < 0;\n            $t = abs($t);\n            $b1 = 0.31938153;\n            $b2 = -0.356563782;\n            $b3 = 1.781477937;\n            $b4 = -1.821255978;\n            $b5 = 1.330274429;\n            $p = 0.2316419;\n            $k = 1.0 / (1.0 + $p * $t);\n            $pdf = exp((-$t * $t) / 2.0) / $sqrt2pi;\n            $cdf\n                = 1.0\n                - $pdf\n                    * $k\n                    * ($b1 + $k * ($b2 + $k * ($b3 + $k * ($b4 + $k * $b5))));\n\n            return $negative ? 
1.0 - $cdf : $cdf;\n        };\n\n        $kernels = [\n            KdeKernel::Normal->value => [\n                \"pdf\" => static fn(float $t): float => exp((-$t * $t) / 2.0)\n                    / $sqrt2pi,\n                \"cdf\" => $normalCdf,\n                \"support\" => null,\n            ],\n            KdeKernel::Logistic->value => [\n                \"pdf\" => static fn(float $t): float => 0.5 / (1.0 + cosh($t)),\n                \"cdf\" => static fn(float $t): float => 1.0 / (1.0 + exp(-$t)),\n                \"support\" => null,\n            ],\n            KdeKernel::Sigmoid->value => [\n                \"pdf\" => static fn(float $t): float => 1.0 / M_PI / cosh($t),\n                \"cdf\" => static fn(float $t): float => (2.0 / M_PI)\n                    * atan(exp($t)),\n                \"support\" => null,\n            ],\n            KdeKernel::Rectangular->value => [\n                \"pdf\" => static fn(float $t): float => 0.5,\n                \"cdf\" => static fn(float $t): float => 0.5 * $t + 0.5,\n                \"support\" => 1.0,\n            ],\n            KdeKernel::Triangular->value => [\n                \"pdf\" => static fn(float $t): float => 1.0 - abs($t),\n                \"cdf\" => static fn(float $t): float => $t >= 0\n                    ? 
1.0 - ((1.0 - $t) * (1.0 - $t)) / 2.0\n                    : ((1.0 + $t) * (1.0 + $t)) / 2.0,\n                \"support\" => 1.0,\n            ],\n            KdeKernel::Parabolic->value => [\n                \"pdf\" => static fn(float $t): float => 0.75 * (1.0 - $t * $t),\n                \"cdf\" => static fn(float $t): float => -0.25 * $t * $t * $t\n                    + 0.75 * $t\n                    + 0.5,\n                \"support\" => 1.0,\n            ],\n            KdeKernel::Quartic->value => [\n                \"pdf\" => static fn(float $t): float => (15.0 / 16.0)\n                    * (1.0 - $t * $t) ** 2,\n                \"cdf\" => static fn(float $t): float => (15.0 * $t\n                    - 10.0 * $t ** 3\n                    + 3.0 * $t ** 5)\n                    / 16.0\n                    + 0.5,\n                \"support\" => 1.0,\n            ],\n            KdeKernel::Triweight->value => [\n                \"pdf\" => static fn(float $t): float => (35.0 / 32.0)\n                    * (1.0 - $t * $t) ** 3,\n                \"cdf\" => static fn(float $t): float => (35.0 * $t\n                    - 35.0 * $t ** 3\n                    + 21.0 * $t ** 5\n                    - 5.0 * $t ** 7)\n                    / 32.0\n                    + 0.5,\n                \"support\" => 1.0,\n            ],\n            KdeKernel::Cosine->value => [\n                \"pdf\" => static fn(float $t): float => (M_PI / 4.0)\n                    * cos((M_PI * $t) / 2.0),\n                \"cdf\" => static fn(float $t): float => 0.5\n                    * sin((M_PI * $t) / 2.0)\n                    + 0.5,\n                \"support\" => 1.0,\n            ],\n        ];\n\n        $kernelDef = $kernels[$kernel->value]; // @phpstan-ignore offsetAccess.notFound\n        $support = $kernelDef[\"support\"];\n        $fn = $cumulative ? 
$kernelDef[\"cdf\"] : $kernelDef[\"pdf\"];\n\n        $sorted = $data;\n        sort($sorted);\n        $n = count($sorted);\n\n        if ($cumulative) {\n            return static function (float $x) use (\n                $sorted,\n                $n,\n                $h,\n                $fn,\n                $support,\n            ): float {\n                $sum = 0.0;\n                if ($support !== null) {\n                    $lo = self::bisectLeft($sorted, $x - $h * $support);\n                    $hi = self::bisectRight($sorted, $x + $h * $support);\n                    for ($i = $lo; $i < $hi; $i++) {\n                        $t = ($x - $sorted[$i]) / $h;\n                        $sum += $fn($t);\n                    }\n                    // Points entirely to the left contribute 1.0 each\n                    $sum += $lo;\n                } else {\n                    for ($i = 0; $i < $n; $i++) {\n                        $t = ($x - $sorted[$i]) / $h;\n                        $sum += $fn($t);\n                    }\n                }\n\n                return $sum / $n;\n            };\n        }\n\n        return static function (float $x) use (\n            $sorted,\n            $n,\n            $h,\n            $fn,\n            $support,\n        ): float {\n            $sum = 0.0;\n            if ($support !== null) {\n                $lo = self::bisectLeft($sorted, $x - $h * $support);\n                $hi = self::bisectRight($sorted, $x + $h * $support);\n                for ($i = $lo; $i < $hi; $i++) {\n                    $t = ($x - $sorted[$i]) / $h;\n                    $sum += $fn($t);\n                }\n            } else {\n                for ($i = 0; $i < $n; $i++) {\n                    $t = ($x - $sorted[$i]) / $h;\n                    $sum += $fn($t);\n                }\n            }\n\n            return $sum / ($n * $h);\n        };\n    }\n\n    /**\n     * Generate random samples from a Kernel Density Estimate.\n     *\n     
* Returns a Closure that, when called, produces a random float drawn\n     * from the KDE distribution defined by the data and bandwidth.\n     *\n     * @param  array<int|float>  $data  sample data\n     * @param  float  $h  bandwidth (smoothing parameter), must be > 0\n     * @param  KdeKernel  $kernel  kernel to use for estimation\n     * @param  int|null  $seed  optional seed for reproducibility\n     * @return \\Closure  a callable that returns a random float from the KDE\n     *\n     * @throws InvalidDataInputException if data is empty or bandwidth <= 0\n     */\n    public static function kdeRandom(\n        array $data,\n        float $h,\n        KdeKernel $kernel = KdeKernel::Normal,\n        ?int $seed = null,\n    ): \\Closure {\n        if ($data === []) {\n            throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n        if ($h <= 0) {\n            throw new InvalidDataInputException(\n                \"Bandwidth h must be positive.\",\n            );\n        }\n\n        $kernel = $kernel->resolve();\n\n        // Acklam rational approximation for standard normal inverse CDF\n        $normalInvCdf = static function (float $p): float {\n            $a = [\n                -3.969683028665376e1,\n                2.209460984245205e2,\n                -2.759285104469687e2,\n                1.38357751867269e2,\n                -3.066479806614716e1,\n                2.506628277459239,\n            ];\n            $b = [\n                -5.447609879822406e1,\n                1.615858368580409e2,\n                -1.556989798598866e2,\n                6.680131188771972e1,\n                -1.328068155288572e1,\n            ];\n            $c = [\n                -7.784894002430293e-3,\n                -3.223964580411365e-1,\n                -2.400758277161838,\n                -2.549732539343734,\n                4.374664141464968,\n                2.938163982698783,\n            ];\n            $d = [\n                
7.784695709041462e-3,\n                3.224671290700398e-1,\n                2.445134137142996,\n                3.754408661907416,\n            ];\n\n            $pLow = 0.02425;\n            $pHigh = 1.0 - $pLow;\n            if ($p < $pLow) {\n                $q = sqrt(-2.0 * log($p));\n                return ((((($c[0] * $q + $c[1]) * $q + $c[2]) * $q + $c[3])\n                    * $q\n                    + $c[4])\n                    * $q\n                    + $c[5])\n                    / (((($d[0] * $q + $d[1]) * $q + $d[2]) * $q + $d[3]) * $q\n                        + 1.0);\n            }\n\n            if ($p <= $pHigh) {\n                $q = $p - 0.5;\n                $r = $q * $q;\n                return (((((($a[0] * $r + $a[1]) * $r + $a[2]) * $r + $a[3])\n                    * $r\n                    + $a[4])\n                    * $r\n                    + $a[5])\n                    * $q)\n                    / ((((($b[0] * $r + $b[1]) * $r + $b[2]) * $r + $b[3]) * $r\n                        + $b[4])\n                        * $r\n                        + 1.0);\n            }\n            $q = sqrt(-2.0 * log(1.0 - $p));\n            return -(\n                (((($c[0] * $q + $c[1]) * $q + $c[2]) * $q + $c[3]) * $q\n                    + $c[4])\n                    * $q\n                + $c[5]\n            )\n                / (((($d[0] * $q + $d[1]) * $q + $d[2]) * $q + $d[3]) * $q + 1.0);\n        };\n\n        // Newton-Raphson solver for kernels without closed-form inverse CDF\n        $newtonRaphson = static function (\n            float $p,\n            callable $cdf,\n            callable $pdf,\n            float $x0,\n        ): float {\n            $x = $x0;\n            for ($i = 0; $i < 100; $i++) {\n                $err = $cdf($x) - $p;\n                if (abs($err) <= 1e-12) {\n                    break;\n                }\n                $x -= $err / $pdf($x);\n            }\n            return $x;\n        };\n\n        // 
Quartic CDF and PDF for Newton-Raphson\n        $quarticCdf = static fn(float $t): float => $t <= -1.0\n            ? 0.0 // @codeCoverageIgnore\n            : ($t >= 1.0\n                ? 1.0 // @codeCoverageIgnore\n                : (15.0 * $t - 10.0 * $t ** 3 + 3.0 * $t ** 5) / 16.0 + 0.5);\n        $quarticPdf = static fn(float $t): float => $t < -1.0 || $t > 1.0\n            ? 0.0 // @codeCoverageIgnore\n            : (15.0 / 16.0) * (1.0 - $t * $t) ** 2;\n\n        // Triweight CDF and PDF for Newton-Raphson\n        $triweightCdf = static fn(float $t): float => $t <= -1.0\n            ? 0.0 // @codeCoverageIgnore\n            : ($t >= 1.0\n                ? 1.0 // @codeCoverageIgnore\n                : (35.0 * $t\n                        - 35.0 * $t ** 3\n                        + 21.0 * $t ** 5\n                        - 5.0 * $t ** 7)\n                        / 32.0\n                    + 0.5);\n        $triweightPdf = static fn(float $t): float => $t < -1.0 || $t > 1.0\n            ? 0.0 // @codeCoverageIgnore\n            : (35.0 / 32.0) * (1.0 - $t * $t) ** 3;\n\n        $invcdfMap = [\n            KdeKernel::Normal->value => $normalInvCdf,\n            KdeKernel::Logistic->value => static fn(float $p): float => log(\n                $p / (1.0 - $p),\n            ),\n            KdeKernel::Sigmoid->value => static fn(float $p): float => log(\n                tan(($p * M_PI) / 2.0),\n            ),\n            KdeKernel::Rectangular->value => static fn(float $p): float => 2.0\n                * $p\n                - 1.0,\n            KdeKernel::Triangular->value => static fn(float $p): float => $p\n            < 0.5\n                ? 
sqrt(2.0 * $p) - 1.0\n                : 1.0 - sqrt(2.0 - 2.0 * $p),\n            KdeKernel::Parabolic->value => static fn(float $p): float => 2.0\n                * cos((acos(2.0 * $p - 1.0) + M_PI) / 3.0),\n            KdeKernel::Quartic->value => static function (float $p) use (\n                $newtonRaphson,\n                $quarticCdf,\n                $quarticPdf,\n            ): float {\n                if ($p <= 0.5) {\n                    $sign = 1.0;\n                } else {\n                    $sign = -1.0;\n                    $p = 1.0 - $p;\n                }\n                if ($p < 0.0106) {\n                    $x = (2.0 * $p) ** 0.3838 - 1.0;\n                } else {\n                    $x = (2.0 * $p) ** 0.4258865685331 - 1.0;\n                    if ($p < 0.499) {\n                        $x\n                            += 0.026818732\n                            * sin(7.101753784 * $p + 2.73230839482953);\n                    }\n                }\n                $x *= $sign;\n                return $newtonRaphson(\n                    $sign === 1.0 ? $p : 1.0 - $p,\n                    $quarticCdf,\n                    $quarticPdf,\n                    $x,\n                );\n            },\n            KdeKernel::Triweight->value => static function (float $p) use (\n                $newtonRaphson,\n                $triweightCdf,\n                $triweightPdf,\n            ): float {\n                if ($p <= 0.5) {\n                    $sign = 1.0;\n                } else {\n                    $sign = -1.0;\n                    $p = 1.0 - $p;\n                }\n                $x = (2.0 * $p) ** 0.3400218741872791 - 1.0;\n                if ($p > 0.00001 && $p < 0.499) {\n                    $x -= 0.033 * sin(1.07 * 2.0 * M_PI * ($p - 0.035));\n                }\n                $x *= $sign;\n                return $newtonRaphson(\n                    $sign === 1.0 ? 
$p : 1.0 - $p,\n                    $triweightCdf,\n                    $triweightPdf,\n                    $x,\n                );\n            },\n            KdeKernel::Cosine->value => static fn(float $p): float => (2.0\n                / M_PI)\n                * asin(2.0 * $p - 1.0),\n        ];\n\n        $invcdf = $invcdfMap[$kernel->value]; // @phpstan-ignore offsetAccess.notFound\n        $n = count($data);\n\n        if ($seed !== null) {\n            mt_srand($seed);\n        }\n\n        return static function () use ($data, $n, $h, $invcdf): float {\n            $i = mt_rand(0, $n - 1);\n            $u = mt_rand(1, mt_getrandmax()) / mt_getrandmax();\n            return $data[$i] + $h * $invcdf($u);\n        };\n    }\n\n    /**\n     * Return the slope and intercept of a simple linear regression of $y on $x,\n     * fitted by ordinary least squares. With $proportional set to true the line\n     * is forced through the origin and the returned intercept is 0.0.\n     *\n     * @param  array<int|float>  $x\n     * @param  array<int|float>  $y\n     * @param  bool  $proportional  if true, fit y = slope * x with no intercept\n     * @return array{float, float} [slope, intercept]\n     *\n     * @throws InvalidDataInputException if the 2 arrays have different sizes,\n     * or if the arrays have fewer than 2 elements, or if the x values are all identical\n     * (or all zero when $proportional is true)\n     */\n    public static function linearRegression(\n        array $x,\n        array $y,\n        bool $proportional = false,\n    ): array {\n        $countX = count($x);\n        $countY = count($y);\n        if ($countX !== $countY) {\n            throw new InvalidDataInputException(\n                \"Linear regression requires that both inputs have same number of data points.\",\n            );\n        }\n        if ($countX < 2) {\n            throw new InvalidDataInputException(\n                \"Linear regression requires at least two data points.\",\n            );\n        }\n        $sumX = array_sum($x);\n        $sumY = array_sum($y);\n        $sumXX = 0;\n        $sumXY = 0;\n\n        foreach ($x as $key => $value) {\n            $sumXY += $value * $y[$key];\n            $sumXX += $value * $value;\n        }\n\n        if ($proportional) {\n            if ($sumXX == 0) 
 {\n                throw new InvalidDataInputException(\n                    \"Proportional linear regression requires x values that are not all zeros.\",\n                );\n            }\n            $slope = (float) ($sumXY / $sumXX);\n\n            return [$slope, 0.0];\n        }\n\n        $denominator = $countX * $sumXX - $sumX * $sumX;\n        // Loose comparison so that a float 0.0 (float inputs) is caught as well as int 0.\n        if ($denominator == 0) {\n            throw new InvalidDataInputException(\n                \"Linear regression, the inputs is constant.\",\n            );\n        }\n        $slope = ($countX * $sumXY - $sumX * $sumY) / $denominator;\n        $intercept = ($sumY - $slope * $sumX) / $countX;\n\n        return [$slope, $intercept];\n    }\n\n    /**\n     * Logarithmic regression: fits y = a * ln(x) + b.\n     *\n     * Transforms x values to ln(x) and applies linear regression.\n     * Useful for data with diminishing returns (e.g., athletic improvement,\n     * learning curves) where growth slows over time.\n     *\n     * @param  array<int|float>  $x  must be positive values\n     * @param  array<int|float>  $y\n     * @return array{0: float, 1: float}  [a, b] coefficients\n     *\n     * @throws InvalidDataInputException if x contains non-positive values,\n     * or if arrays have different sizes or fewer than 2 elements\n     */\n    public static function logarithmicRegression(\n        array $x,\n        array $y,\n    ): array {\n        foreach ($x as $value) {\n            if ($value <= 0) {\n                throw new InvalidDataInputException(\n                    \"Logarithmic regression requires all x values to be positive.\",\n                );\n            }\n        }\n\n        $logX = array_map(log(...), $x);\n\n        return self::linearRegression($logX, $y);\n    }\n\n    /**\n     * Power regression: fits y = a * x^b.\n     *\n     * Linearizes as ln(y) = ln(a) + b * ln(x) and applies linear regression.\n     * Useful for data following power law relationships.\n     *\n     * @param  
array<int|float>  $x  must be positive values\n     * @param  array<int|float>  $y  must be positive values\n     * @return array{0: float, 1: float}  [a, b] coefficients where y = a * x^b\n     *\n     * @throws InvalidDataInputException if x or y contain non-positive values,\n     * or if arrays have different sizes or fewer than 2 elements\n     */\n    public static function powerRegression(\n        array $x,\n        array $y,\n    ): array {\n        foreach ($x as $value) {\n            if ($value <= 0) {\n                throw new InvalidDataInputException(\n                    \"Power regression requires all x values to be positive.\",\n                );\n            }\n        }\n        foreach ($y as $value) {\n            if ($value <= 0) {\n                throw new InvalidDataInputException(\n                    \"Power regression requires all y values to be positive.\",\n                );\n            }\n        }\n\n        $logX = array_map(log(...), $x);\n        $logY = array_map(log(...), $y);\n\n        [$b, $logA] = self::linearRegression($logX, $logY);\n\n        return [exp($logA), $b];\n    }\n\n    /**\n     * Exponential regression: fits y = a * e^(b*x).\n     *\n     * Linearizes as ln(y) = ln(a) + b*x and applies linear regression.\n     * Useful for data with exponential growth or decay.\n     *\n     * @param  array<int|float>  $x\n     * @param  array<int|float>  $y  must be positive values\n     * @return array{0: float, 1: float}  [a, b] coefficients where y = a * e^(b*x)\n     *\n     * @throws InvalidDataInputException if y contains non-positive values,\n     * or if arrays have different sizes or fewer than 2 elements\n     */\n    public static function exponentialRegression(\n        array $x,\n        array $y,\n    ): array {\n        foreach ($y as $value) {\n            if ($value <= 0) {\n                throw new InvalidDataInputException(\n                    \"Exponential regression requires all y values to be 
positive.\",\n                );\n            }\n        }\n\n        $logY = array_map(log(...), $y);\n\n        [$b, $logA] = self::linearRegression($x, $logY);\n\n        return [exp($logA), $b];\n    }\n\n    /**\n     * Calculate the coefficient of determination (R²).\n     *\n     * R² measures the proportion of variance in y explained by the\n     * linear regression on x. For a fit with an intercept the value lies\n     * between 0 and 1; with $proportional set to true (no intercept) it can be negative.\n     *\n     * @param  array<int|float>  $x\n     * @param  array<int|float>  $y\n     * @throws InvalidDataInputException if the arrays differ in length, have fewer than 2 elements,\n     * or if the y values are constant\n     */\n    public static function rSquared(array $x, array $y, bool $proportional = false, ?int $round = null): float\n    {\n        $countX = count($x);\n        $countY = count($y);\n\n        if ($countX !== $countY) {\n            throw new InvalidDataInputException(\n                \"R-squared requires x and y arrays of the same length.\",\n            );\n        }\n\n        if ($countX < 2) {\n            throw new InvalidDataInputException(\n                \"R-squared requires at least 2 data points.\",\n            );\n        }\n\n        [$slope, $intercept] = self::linearRegression($x, $y, $proportional);\n        $meanY = self::mean($y);\n\n        $ssRes = 0.0;\n        $ssTot = 0.0;\n\n        foreach ($y as $key => $yi) {\n            $predicted = $slope * $x[$key] + $intercept;\n            $ssRes += ($yi - $predicted) ** 2;\n            $ssTot += ($yi - $meanY) ** 2;\n        }\n\n        if ($ssTot == 0) {\n            throw new InvalidDataInputException(\n                \"R-squared is undefined when y values are constant (zero variance).\",\n            );\n        }\n\n        $rSquared = 1 - $ssRes / $ssTot;\n\n        if ($round !== null) {\n            return round($rSquared, $round);\n        }\n\n        return $rSquared;\n    }\n\n    /**\n     * Return the confidence interval for the mean using the normal (z) distribution.\n     *\n     * Computes: mean ± z * (stdev / √n)\n     *\n     * @param  
array<int|float>  $data\n     * @param  float  $confidenceLevel the confidence level (e.g. 0.95 for 95%)\n     * @param  int|null  $round whether to round the result\n     * @return array{0: float, 1: float} [lower bound, upper bound]\n     *\n     * @throws InvalidDataInputException if data has fewer than 2 elements or confidence level is not in (0, 1)\n     */\n    public static function confidenceInterval(\n        array $data,\n        float $confidenceLevel = 0.95,\n        ?int $round = null,\n    ): array {\n        if (self::count($data) < 2) {\n            throw new InvalidDataInputException(\n                \"Confidence interval requires at least 2 data points.\",\n            );\n        }\n\n        if ($confidenceLevel <= 0.0 || $confidenceLevel >= 1.0) {\n            throw new InvalidDataInputException(\n                \"Confidence level must be between 0 and 1 exclusive.\",\n            );\n        }\n\n        $mean = self::mean($data);\n        $standardError = self::sem($data);\n\n        $zCritical = (new NormalDist(0.0, 1.0))->invCdf((1 + $confidenceLevel) / 2);\n        $margin = $zCritical * $standardError;\n\n        $lower = $mean - $margin;\n        $upper = $mean + $margin;\n\n        if ($round !== null) {\n            return [Math::round($lower, $round), Math::round($upper, $round)];\n        }\n\n        return [$lower, $upper];\n    }\n\n    /**\n     * Perform a one-sample Z-test for the mean.\n     *\n     * Tests whether the sample mean differs significantly from a known\n     * population mean using the normal distribution.\n     *\n     * @param  array<int|float>  $data\n     * @param  float  $populationMean  the hypothesized population mean\n     * @param  Alternative  $alternative  the alternative hypothesis\n     * @param  int|null  $round  whether to round the results\n     * @return array{zScore: float, pValue: float}\n     *\n     * @throws InvalidDataInputException if data has fewer than 2 elements\n     */\n    public 
static function zTest(\n        array $data,\n        float $populationMean,\n        Alternative $alternative = Alternative::TwoSided,\n        ?int $round = null,\n    ): array {\n        if (self::count($data) < 2) {\n            throw new InvalidDataInputException(\n                \"Z-test requires at least 2 data points.\",\n            );\n        }\n\n        $zScore = (self::mean($data) - $populationMean) / self::sem($data);\n\n        $normalDist = new NormalDist(0.0, 1.0);\n\n        $pValue = match ($alternative) {\n            Alternative::TwoSided => 2 * (1 - $normalDist->cdf(abs($zScore))),\n            Alternative::Greater => 1 - $normalDist->cdf($zScore),\n            Alternative::Less => $normalDist->cdf($zScore),\n        };\n\n        return [\n            'zScore' => Math::round($zScore, $round),\n            'pValue' => Math::round($pValue, $round),\n        ];\n    }\n\n    /**\n     * Perform a one-sample t-test for the mean.\n     *\n     * Tests whether the sample mean differs significantly from a hypothesized\n     * population mean using the Student's t-distribution. 
Unlike the z-test,\n     * the t-test is appropriate for small samples where the population standard\n     * deviation is unknown.\n     *\n     * @param  array<int|float>  $data  the sample data (at least 2 elements)\n     * @param  float  $populationMean  the hypothesized population mean (H₀: μ = populationMean)\n     * @param  Alternative  $alternative  the alternative hypothesis\n     * @param  int|null  $round  optional decimal precision for rounding results\n     * @return array{tStatistic: float, pValue: float, degreesOfFreedom: int}\n     *\n     * @throws InvalidDataInputException if data has fewer than 2 elements\n     */\n    public static function tTest(\n        array $data,\n        float $populationMean,\n        Alternative $alternative = Alternative::TwoSided,\n        ?int $round = null,\n    ): array {\n        if (self::count($data) < 2) {\n            throw new InvalidDataInputException(\n                \"T-test requires at least 2 data points.\",\n            );\n        }\n\n        $df = self::count($data) - 1;\n        $tStatistic = (self::mean($data) - $populationMean) / self::sem($data);\n\n        $studentT = new StudentT($df);\n\n        $pValue = match ($alternative) {\n            Alternative::TwoSided => 2 * (1 - $studentT->cdf(abs($tStatistic))),\n            Alternative::Greater => 1 - $studentT->cdf($tStatistic),\n            Alternative::Less => $studentT->cdf($tStatistic),\n        };\n\n        return [\n            'tStatistic' => Math::round($tStatistic, $round),\n            'pValue' => Math::round($pValue, $round),\n            'degreesOfFreedom' => $df,\n        ];\n    }\n\n    /**\n     * Perform a two-sample independent t-test (Welch's t-test).\n     *\n     * Tests whether two independent samples have different means.\n     * Uses Welch's approximation for degrees of freedom, which does not\n     * assume equal variances.\n     *\n     * @param  array<int|float>  $data1  the first sample (at least 2 elements)\n     * 
@param  array<int|float>  $data2  the second sample (at least 2 elements)\n     * @param  Alternative  $alternative  the alternative hypothesis\n     * @param  int|null  $round  optional decimal precision for rounding results\n     * @return array{tStatistic: float, pValue: float, degreesOfFreedom: float}\n     *\n     * @throws InvalidDataInputException if either sample has fewer than 2 elements\n     */\n    public static function tTestTwoSample(\n        array $data1,\n        array $data2,\n        Alternative $alternative = Alternative::TwoSided,\n        ?int $round = null,\n    ): array {\n        $n1 = self::count($data1);\n        $n2 = self::count($data2);\n\n        if ($n1 < 2 || $n2 < 2) {\n            throw new InvalidDataInputException(\n                \"Two-sample t-test requires at least 2 data points in each sample.\",\n            );\n        }\n\n        $mean1 = self::mean($data1);\n        $mean2 = self::mean($data2);\n        $var1 = self::variance($data1);\n        $var2 = self::variance($data2);\n\n        $se = sqrt($var1 / $n1 + $var2 / $n2);\n\n        if ($se === 0.0) {\n            throw new InvalidDataInputException(\n                \"Two-sample t-test requires non-zero variance in at least one sample.\",\n            );\n        }\n\n        $tStatistic = ($mean1 - $mean2) / $se;\n\n        // Welch–Satterthwaite degrees of freedom\n        $v1 = $var1 / $n1;\n        $v2 = $var2 / $n2;\n        $df = (($v1 + $v2) ** 2) / (($v1 ** 2) / ($n1 - 1) + ($v2 ** 2) / ($n2 - 1));\n\n        $studentT = new StudentT($df);\n\n        $pValue = match ($alternative) {\n            Alternative::TwoSided => 2 * (1 - $studentT->cdf(abs($tStatistic))),\n            Alternative::Greater => 1 - $studentT->cdf($tStatistic),\n            Alternative::Less => $studentT->cdf($tStatistic),\n        };\n\n        return [\n            'tStatistic' => Math::round($tStatistic, $round),\n            'pValue' => Math::round($pValue, $round),\n            
'degreesOfFreedom' => Math::round($df, $round),\n        ];\n    }\n\n    /**\n     * Perform a paired t-test.\n     *\n     * Tests whether the mean difference between paired observations is\n     * significantly different from zero. This is equivalent to a one-sample\n     * t-test on the differences.\n     *\n     * @param  array<int|float>  $data1  the first set of observations\n     * @param  array<int|float>  $data2  the second set of observations (same length as $data1)\n     * @param  Alternative  $alternative  the alternative hypothesis\n     * @param  int|null  $round  optional decimal precision for rounding results\n     * @return array{tStatistic: float, pValue: float, degreesOfFreedom: int}\n     *\n     * @throws InvalidDataInputException if arrays have different lengths or fewer than 2 elements\n     */\n    public static function tTestPaired(\n        array $data1,\n        array $data2,\n        Alternative $alternative = Alternative::TwoSided,\n        ?int $round = null,\n    ): array {\n        $n1 = self::count($data1);\n        $n2 = self::count($data2);\n\n        if ($n1 !== $n2) {\n            throw new InvalidDataInputException(\n                \"Paired t-test requires both samples to have the same number of observations.\",\n            );\n        }\n\n        if ($n1 < 2) {\n            throw new InvalidDataInputException(\n                \"Paired t-test requires at least 2 data points.\",\n            );\n        }\n\n        // Compute differences\n        $differences = [];\n        for ($i = 0; $i < $n1; $i++) {\n            $differences[] = $data1[$i] - $data2[$i];\n        }\n\n        // Paired t-test is a one-sample t-test on the differences with μ₀ = 0\n        return self::tTest($differences, 0.0, $alternative, $round);\n    }\n}\n"
  },
  {
    "path": "src/Statistics.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics;\n\nuse HiFolks\\Statistics\\Enums\\Alternative;\nuse HiFolks\\Statistics\\Exception\\InvalidDataInputException;\nuse HiFolks\\Statistics\\Utils\\Arr;\n\nclass Statistics\n{\n    /**\n     * Original array (not sorted, with original keys)\n     *\n     * @var array<mixed>\n     */\n    private array $originalArray = [];\n\n    /**\n     * Sorted values, re-indexed from 0\n     *\n     * @var array<mixed>\n     */\n    private array $values = [];\n\n    /**\n     * Whether the array contains non-numeric values (null until first checked)\n     */\n    private ?bool $containsNan = null;\n\n    /**\n     * @param  array<mixed>  $values\n     */\n    public function __construct(\n        array $values = [],\n    ) {\n        $this->values = array_values($values);\n        $this->originalArray = $values;\n        sort($this->values);\n    }\n\n    /**\n     * @param  array<mixed>  $values\n     */\n    public static function make(array $values): self\n    {\n        return new self($values);\n    }\n\n    /**\n     * Remove '0' values from the array.\n     */\n    public function stripZeroes(): self\n    {\n        $this->values = Arr::stripZeroes($this->values);\n\n        return $this;\n    }\n\n    /**\n     * Get the original array.\n     *\n     * @return mixed[]\n     */\n    public function originalArray(): array\n    {\n        return $this->originalArray;\n    }\n\n    /**\n     * Create a frequencies table.\n     * It counts the occurrences of each value in the array.\n     * For non-discrete values you can try transforming them to integers.\n     *\n     * @see Freq::frequencies()\n     *\n     * @return array<int>\n     */\n    public function frequencies(bool $transformToInteger = false): array\n    {\n        return Freq::frequencies($this->values, $transformToInteger);\n    }\n\n    /**\n     * Return relative frequencies table.\n     *\n     * @see Freq::relativeFrequencies()\n     *\n     * @param  int|null  $round whether to round the result\n 
    * @return array<float>\n     */\n    public function relativeFrequencies(?int $round = null): array\n    {\n        return Freq::relativeFrequencies($this->values, $round);\n    }\n\n    /**\n     * Return cumulative relative frequencies table.\n     *\n     * @see Freq::cumulativeRelativeFrequencies()\n     *\n     * @return array<float>\n     */\n    public function cumulativeRelativeFrequencies(): array\n    {\n        return Freq::cumulativeRelativeFrequencies($this->values);\n    }\n\n    /**\n     * Return cumulative frequencies table.\n     *\n     * @see Freq::cumulativeFrequencies()\n     *\n     * @return array<float>\n     */\n    public function cumulativeFrequencies(): array\n    {\n        return Freq::cumulativeFrequencies($this->values);\n    }\n\n    /**\n     * Get the highest value.\n     */\n    public function max(): mixed\n    {\n        if ($this->values === []) {\n            return 0;\n        }\n        return max($this->values);\n    }\n\n    /**\n     * Get the lowest value.\n     */\n    public function min(): mixed\n    {\n        if ($this->values === []) {\n            return 0;\n        }\n        return min($this->values);\n    }\n\n    /**\n     * Get the range (max value - min value).\n     */\n    public function range(): int|float\n    {\n        return $this->max() - $this->min();\n    }\n\n    /**\n     * Count elements.\n     */\n    public function count(): int\n    {\n        return Stat::count($this->values);\n    }\n\n    /**\n     * Return the arithmetic mean of numeric data.\n     *\n     * @see Stat::mean()\n     */\n    public function mean(): int|float|null\n    {\n        return Stat::mean($this->numericalArray());\n    }\n\n    /**\n     * Return the trimmed (truncated) mean.\n     *\n     * @param  float  $proportionToCut  fraction (0..0.5) to trim from each side\n     * @param  int|null  $round whether to round the result\n     *\n     * @see Stat::trimmedMean()\n     */\n    public function 
trimmedMean(float $proportionToCut = 0.1, ?int $round = null): float\n    {\n        return Stat::trimmedMean($this->numericalArray(), $proportionToCut, $round);\n    }\n\n    /**\n     * Return the median (middle value) of data.\n     *\n     * @see Stat::median()\n     */\n    public function median(): mixed\n    {\n        return Stat::median($this->values);\n    }\n\n    /**\n     * Return the weighted median.\n     *\n     * @param  array<int|float>  $weights  array of weights (same length as data, all > 0)\n     * @param  int|null  $round whether to round the result\n     *\n     * @see Stat::weightedMedian()\n     */\n    public function weightedMedian(array $weights, ?int $round = null): float\n    {\n        return Stat::weightedMedian($this->numericalArray(), $weights, $round);\n    }\n\n    /**\n     * Estimate the median for grouped data.\n     *\n     * @param  float  $interval the width of each bin\n     *\n     * @see Stat::medianGrouped()\n     */\n    public function medianGrouped(float $interval = 1.0): float\n    {\n        return Stat::medianGrouped($this->numericalArray(), $interval);\n    }\n\n    /**\n     * Return the first quartile.\n     *\n     * @see Stat::firstQuartile()\n     */\n    public function firstQuartile(): mixed\n    {\n        return Stat::firstQuartile($this->values);\n    }\n\n    /**\n     * Return the third quartile.\n     *\n     * @see Stat::thirdQuartile()\n     */\n    public function thirdQuartile(): mixed\n    {\n        return Stat::thirdQuartile($this->values);\n    }\n\n    /**\n     * Return the interquartile range or midspread.\n     */\n    public function interquartileRange(): mixed\n    {\n        return $this->thirdQuartile() - $this->firstQuartile();\n    }\n\n    /**\n     * Return the most common data point from discrete or nominal data\n     *\n     * @see Stat::mode()\n     */\n    public function mode(): mixed\n    {\n        return Stat::mode($this->values);\n    }\n\n    /**\n     * Return the 
standard deviation of the numeric data.\n     *\n     * @param  int|null  $round whether to round the result\n     *\n     * @see Stat::stdev()\n     */\n    public function stdev(?int $round = null): float\n    {\n        return Stat::stdev($this->numericalArray(), $round);\n    }\n\n    /**\n     * Return the standard error of the mean (SEM).\n     *\n     * @param  int|null  $round whether to round the result\n     *\n     * @see Stat::sem()\n     */\n    public function sem(?int $round = null): float\n    {\n        return Stat::sem($this->numericalArray(), $round);\n    }\n\n    /**\n     * Return the confidence interval for the mean using the normal (z) distribution.\n     *\n     * @param  float  $confidenceLevel the confidence level (e.g. 0.95 for 95%)\n     * @param  int|null  $round whether to round the result\n     * @return array{0: float, 1: float} [lower bound, upper bound]\n     *\n     * @see Stat::confidenceInterval()\n     */\n    public function confidenceInterval(float $confidenceLevel = 0.95, ?int $round = null): array\n    {\n        return Stat::confidenceInterval($this->numericalArray(), $confidenceLevel, $round);\n    }\n\n    /**\n     * Perform a one-sample Z-test for the mean.\n     *\n     * @param  float  $populationMean  the hypothesized population mean\n     * @param  Alternative  $alternative  the alternative hypothesis\n     * @param  int|null  $round  whether to round the results\n     * @return array{zScore: float, pValue: float}\n     *\n     * @see Stat::zTest()\n     */\n    public function zTest(float $populationMean, Alternative $alternative = Alternative::TwoSided, ?int $round = null): array\n    {\n        return Stat::zTest($this->numericalArray(), $populationMean, $alternative, $round);\n    }\n\n    /**\n     * Perform a one-sample t-test for the mean.\n     *\n     * @param  float  $populationMean  the hypothesized population mean\n     * @param  Alternative  $alternative  the alternative hypothesis\n     * @param  
int|null  $round  whether to round the results\n     * @return array{tStatistic: float, pValue: float, degreesOfFreedom: int}\n     *\n     * @see Stat::tTest()\n     */\n    public function tTest(float $populationMean, Alternative $alternative = Alternative::TwoSided, ?int $round = null): array\n    {\n        return Stat::tTest($this->numericalArray(), $populationMean, $alternative, $round);\n    }\n\n    /**\n     * Perform a two-sample independent t-test (Welch's t-test).\n     *\n     * @param  array<int|float>  $data2  the second sample\n     * @param  Alternative  $alternative  the alternative hypothesis\n     * @param  int|null  $round  whether to round the results\n     * @return array{tStatistic: float, pValue: float, degreesOfFreedom: float}\n     *\n     * @see Stat::tTestTwoSample()\n     */\n    public function tTestTwoSample(array $data2, Alternative $alternative = Alternative::TwoSided, ?int $round = null): array\n    {\n        return Stat::tTestTwoSample($this->numericalArray(), $data2, $alternative, $round);\n    }\n\n    /**\n     * Perform a paired t-test.\n     *\n     * @param  array<int|float>  $data2  the second set of observations (same length)\n     * @param  Alternative  $alternative  the alternative hypothesis\n     * @param  int|null  $round  whether to round the results\n     * @return array{tStatistic: float, pValue: float, degreesOfFreedom: int}\n     *\n     * @see Stat::tTestPaired()\n     */\n    public function tTestPaired(array $data2, Alternative $alternative = Alternative::TwoSided, ?int $round = null): array\n    {\n        return Stat::tTestPaired($this->numericalArray(), $data2, $alternative, $round);\n    }\n\n    /**\n     * Return the mean absolute deviation (MAD).\n     *\n     * @param  int|null  $round whether to round the result\n     *\n     * @see Stat::meanAbsoluteDeviation()\n     */\n    public function meanAbsoluteDeviation(?int $round = null): float\n    {\n        return 
Stat::meanAbsoluteDeviation($this->numericalArray(), $round);\n    }\n\n    /**\n     * Return the median absolute deviation.\n     *\n     * @param  int|null  $round whether to round the result\n     *\n     * @see Stat::medianAbsoluteDeviation()\n     */\n    public function medianAbsoluteDeviation(?int $round = null): float\n    {\n        return Stat::medianAbsoluteDeviation($this->numericalArray(), $round);\n    }\n\n    /**\n     * Return the z-scores for each value in the dataset.\n     *\n     * @param  int|null  $round whether to round each z-score\n     * @return array<float>\n     *\n     * @see Stat::zscores()\n     */\n    public function zscores(?int $round = null): array\n    {\n        return Stat::zscores($this->numericalArray(), $round);\n    }\n\n    /**\n     * Return values that are outliers based on z-score threshold.\n     *\n     * @param  float  $threshold  absolute z-score threshold (default 3.0)\n     * @return array<int|float>\n     *\n     * @see Stat::outliers()\n     */\n    public function outliers(float $threshold = 3.0): array\n    {\n        return Stat::outliers($this->numericalArray(), $threshold);\n    }\n\n    /**\n     * Return values that are outliers based on the IQR method.\n     *\n     * @param  float  $factor  IQR multiplier (default 1.5)\n     * @return array<int|float>\n     *\n     * @see Stat::iqrOutliers()\n     */\n    public function iqrOutliers(float $factor = 1.5): array\n    {\n        return Stat::iqrOutliers($this->numericalArray(), $factor);\n    }\n\n    /**\n     * Return the variance from the numeric data\n     *\n     * @param  int|null  $round whether to round the result\n     *\n     * @see Stat::variance()\n     */\n    public function variance(?int $round = null): float\n    {\n        return Stat::variance($this->numericalArray(), $round);\n    }\n\n    /**\n     * Return the **population** standard deviation.\n     *\n     * @param  int|null  $round whether to round the result\n     *\n     * @see 
Stat::pstdev()\n     */\n    public function pstdev(?int $round = null): float\n    {\n        return Stat::pstdev($this->numericalArray(), $round);\n    }\n\n    /**\n     * Return the **population** variance of the numeric data.\n     *\n     * @param  int|null  $round whether to round the result\n     *\n     * @see Stat::pvariance()\n     */\n    public function pvariance(?int $round = null): float\n    {\n        return Stat::pvariance($this->numericalArray(), $round);\n    }\n\n    /**\n     * Return the adjusted Fisher-Pearson sample skewness.\n     *\n     * @param  int|null  $round whether to round the result\n     *\n     * @see Stat::skewness()\n     */\n    public function skewness(?int $round = null): float\n    {\n        return Stat::skewness($this->numericalArray(), $round);\n    }\n\n    /**\n     * Return the population (biased) skewness.\n     *\n     * @param  int|null  $round whether to round the result\n     *\n     * @see Stat::pskewness()\n     */\n    public function pskewness(?int $round = null): float\n    {\n        return Stat::pskewness($this->numericalArray(), $round);\n    }\n\n    /**\n     * Return the excess kurtosis (sample formula).\n     *\n     * @param  int|null  $round whether to round the result\n     *\n     * @see Stat::kurtosis()\n     */\n    public function kurtosis(?int $round = null): float\n    {\n        return Stat::kurtosis($this->numericalArray(), $round);\n    }\n\n    /**\n     * Return the value at the given percentile.\n     *\n     * @param  float  $p  percentile in range 0..100\n     * @param  int|null  $round whether to round the result\n     *\n     * @see Stat::percentile()\n     */\n    public function percentile(float $p, ?int $round = null): float\n    {\n        return Stat::percentile($this->numericalArray(), $p, $round);\n    }\n\n    /**\n     * Return the coefficient of variation (CV%) of the numeric data.\n     *\n     * @param  int|null  $round whether to round the result\n     * @param  bool  $population if 
true, use population stdev/mean\n     *\n     * @see Stat::coefficientOfVariation()\n     */\n    public function coefficientOfVariation(?int $round = null, bool $population = false): float\n    {\n        return Stat::coefficientOfVariation($this->numericalArray(), $round, $population);\n    }\n\n    /**\n     * Return the geometric mean of the numeric data.\n     *\n     * @param  int|null  $round whether to round the result\n     *\n     * @see Stat::geometricMean()\n     */\n    public function geometricMean(?int $round = null): float\n    {\n        return Stat::geometricMean($this->numericalArray(), $round);\n    }\n\n    /**\n     * Return the harmonic mean of the numeric data.\n     *\n     * @param  int|null  $round whether to round the result\n     * @param  mixed[]  $weights additional weight to the elements (as if there were several of them)\n     *\n     * @see Stat::harmonicMean()\n     */\n    public function harmonicMean(?int $round = null, ?array $weights = null): float\n    {\n        return Stat::harmonicMean($this->numericalArray(), $weights, $round);\n    }\n\n    /**\n     * Return a string with the values joined by a separator.\n     */\n    public function valuesToString(bool|int $sample = false): string\n    {\n        return Arr::toString($this->values, $sample);\n    }\n\n    /**\n     * Return the values after checking that every element is numeric.\n     * The result of the check is cached, so the scan runs only once.\n     *\n     * @return array<int|float>\n     *\n     * @throws InvalidDataInputException if any value is not numeric\n     */\n    public function numericalArray(): array\n    {\n        if ($this->containsNan === null) {\n            // Cache the negative result too; otherwise a clean array is rescanned on every call.\n            $this->containsNan = false;\n            foreach ($this->values as $value) {\n                if (!is_numeric($value)) {\n                    $this->containsNan = true;\n\n                    break;\n                }\n            }\n        }\n        if ($this->containsNan) {\n            throw new InvalidDataInputException('The data must not contain non-number elements.');\n        }\n\n        return $this->values;\n    }\n}\n"
  },
  {
    "path": "src/StreamingStat.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics;\n\nuse HiFolks\\Statistics\\Exception\\InvalidDataInputException;\nuse HiFolks\\Statistics\\Utils\\Math;\n\n/**\n * StreamingStat computes descriptive statistics in a single pass with O(1) memory.\n *\n * Uses Welford's online algorithm (extended by Terriberry/Pébay) to maintain\n * running moments, ideal for large datasets or generator-based streams.\n *\n * **Experimental** in version 1.x — will be released as stable in version 2.\n */\nclass StreamingStat\n{\n    private int $n = 0;\n\n    private float $mu = 0.0;\n\n    private float $m2 = 0.0;\n\n    private float $m3 = 0.0;\n\n    private float $m4 = 0.0;\n\n    private float $sum = 0.0;\n\n    private float $min = PHP_FLOAT_MAX;\n\n    private float $max = -PHP_FLOAT_MAX;\n\n    /**\n     * Add a value and update all accumulators using the online algorithm.\n     */\n    public function add(int|float $value): self\n    {\n        $this->sum += $value;\n        if ($value < $this->min) {\n            $this->min = (float) $value;\n        }\n        if ($value > $this->max) {\n            $this->max = (float) $value;\n        }\n\n        $n1 = $this->n;\n        $this->n++;\n        $n = $this->n;\n\n        $delta = $value - $this->mu;\n        $deltaN = $delta / $n;\n        $deltaN2 = $deltaN * $deltaN;\n        $term1 = $delta * $deltaN * $n1;\n\n        $this->mu += $deltaN;\n        $this->m4 += $term1 * $deltaN2 * ($n * $n - 3 * $n + 3)\n            + 6 * $deltaN2 * $this->m2\n            - 4 * $deltaN * $this->m3;\n        $this->m3 += $term1 * $deltaN * ($n - 2)\n            - 3 * $deltaN * $this->m2;\n        $this->m2 += $term1;\n\n        return $this;\n    }\n\n    /**\n     * Return the number of values added.\n     */\n    public function count(): int\n    {\n        return $this->n;\n    }\n\n    /**\n     * Return the sum of all values added.\n     */\n    public function sum(): float\n    {\n        if ($this->n < 1) {\n            
throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n\n        return $this->sum;\n    }\n\n    /**\n     * Return the minimum value added.\n     */\n    public function min(): float\n    {\n        if ($this->n < 1) {\n            throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n\n        return $this->min;\n    }\n\n    /**\n     * Return the maximum value added.\n     */\n    public function max(): float\n    {\n        if ($this->n < 1) {\n            throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n\n        return $this->max;\n    }\n\n    /**\n     * Return the arithmetic mean.\n     */\n    public function mean(?int $round = null): float\n    {\n        if ($this->n < 1) {\n            throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n\n        return Math::round($this->mu, $round);\n    }\n\n    /**\n     * Return the sample variance (m2 / (n - 1)).\n     */\n    public function variance(?int $round = null): float\n    {\n        if ($this->n < 2) {\n            throw new InvalidDataInputException(\n                \"The data size must be greater than 1.\",\n            );\n        }\n\n        return Math::round($this->m2 / ($this->n - 1), $round);\n    }\n\n    /**\n     * Return the population variance (m2 / n).\n     */\n    public function pvariance(?int $round = null): float\n    {\n        if ($this->n < 1) {\n            throw new InvalidDataInputException(\"The data must not be empty.\");\n        }\n\n        return Math::round($this->m2 / $this->n, $round);\n    }\n\n    /**\n     * Return the sample standard deviation.\n     */\n    public function stdev(?int $round = null): float\n    {\n        return Math::round(sqrt($this->variance()), $round);\n    }\n\n    /**\n     * Return the population standard deviation.\n     */\n    public function pstdev(?int $round = null): float\n    {\n        return 
Math::round(sqrt($this->pvariance()), $round);\n    }\n\n    /**\n     * Return the adjusted Fisher-Pearson sample skewness.\n     */\n    public function skewness(?int $round = null): float\n    {\n        if ($this->n < 3) {\n            throw new InvalidDataInputException(\"Skewness requires at least 3 data points.\");\n        }\n\n        if ($this->m2 === 0.0) {\n            throw new InvalidDataInputException(\"Skewness is undefined when all values are identical (standard deviation is zero).\");\n        }\n\n        $n = $this->n;\n        // population skewness: g1 = (sqrt(n) * m3) / m2^1.5\n        $populationSkew = (sqrt($n) * $this->m3) / ($this->m2 ** 1.5);\n        // adjustment from population to sample skewness:\n        // G1 = (sqrt(n*(n-1)) / (n-2)) * g1\n        $skewness = (sqrt($n * ($n - 1)) / ($n - 2)) * $populationSkew;\n\n        return Math::round($skewness, $round);\n    }\n\n    /**\n     * Return the population (biased) skewness.\n     */\n    public function pskewness(?int $round = null): float\n    {\n        if ($this->n < 3) {\n            throw new InvalidDataInputException(\"Skewness requires at least 3 data points.\");\n        }\n\n        if ($this->m2 === 0.0) {\n            throw new InvalidDataInputException(\"Skewness is undefined when all values are identical (standard deviation is zero).\");\n        }\n\n        $n = $this->n;\n        $pskewness = (sqrt($n) * $this->m3) / ($this->m2 ** 1.5);\n\n        return Math::round($pskewness, $round);\n    }\n\n    /**\n     * Return the excess kurtosis (sample, Fisher=True, bias=False).\n     * Same formula as Excel's KURT() and scipy.stats.kurtosis(bias=False).\n     */\n    public function kurtosis(?int $round = null): float\n    {\n        if ($this->n < 4) {\n            throw new 
InvalidDataInputException(\"Kurtosis requires at least 4 data points.\");\n        }\n\n        if ($this->m2 === 0.0) {\n            throw new InvalidDataInputException(\"Kurtosis is undefined when all values are identical (standard deviation is zero).\");\n        }\n\n        $n = $this->n;\n        // population excess kurtosis: (n * m4) / (m2^2) - 3\n        $populationKurtosis = ($n * $this->m4) / ($this->m2 * $this->m2) - 3;\n        // sample adjustment:\n        // G2 = ((n-1) / ((n-2)(n-3))) * ((n+1) * g2 + 6)\n        $kurtosis = (($n - 1) / (($n - 2) * ($n - 3))) * (($n + 1) * $populationKurtosis + 6);\n\n        return Math::round($kurtosis, $round);\n    }\n}\n"
  },
  {
    "path": "src/StudentT.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics;\n\nuse HiFolks\\Statistics\\Exception\\InvalidDataInputException;\n\nclass StudentT\n{\n    public function __construct(private readonly float $df)\n    {\n        if ($df <= 0) {\n            throw new InvalidDataInputException('Degrees of freedom (df) must be greater than 0.');\n        }\n    }\n\n    public function getDegreesOfFreedom(): float\n    {\n        return $this->df;\n    }\n\n    /**\n     * Probability density function of the Student's t-distribution.\n     */\n    public function pdf(float $t): float\n    {\n        $df = $this->df;\n\n        $logCoeff = self::logGamma(($df + 1) / 2) - self::logGamma($df / 2) - 0.5 * log($df * M_PI);\n        $logBody = -(($df + 1) / 2) * log(1 + ($t * $t) / $df);\n\n        return exp($logCoeff + $logBody);\n    }\n\n    public function pdfRounded(float $t, int $precision = 3): float\n    {\n        return round($this->pdf($t), $precision);\n    }\n\n    /**\n     * Cumulative distribution function of the Student's t-distribution.\n     *\n     * Uses the regularized incomplete beta function.\n     */\n    public function cdf(float $t): float\n    {\n        $df = $this->df;\n        $x = $df / ($df + $t * $t);\n        $ibeta = $this->regularizedIncompleteBeta($df / 2, 0.5, $x);\n\n        if ($t >= 0) {\n            return 1 - 0.5 * $ibeta;\n        }\n\n        return 0.5 * $ibeta;\n    }\n\n    public function cdfRounded(float $t, int $precision = 3): float\n    {\n        return round($this->cdf($t), $precision);\n    }\n\n    /**\n     * Inverse CDF (quantile function) using Newton-Raphson iteration.\n     *\n     * @param float $p Probability in (0, 1) exclusive.\n     * @return float The t-value such that cdf(t) = p.\n     * @throws InvalidDataInputException If p is not in (0, 1).\n     */\n    public function invCdf(float $p): float\n    {\n        if ($p <= 0.0 || $p >= 1.0) {\n            throw new InvalidDataInputException('p must be in the 
range (0, 1) exclusive.');\n        }\n\n        // Use normal approximation as initial guess\n        $normalDist = new NormalDist(0.0, 1.0);\n        $x = $normalDist->invCdf($p);\n\n        // Newton-Raphson iteration\n        $maxIter = 100;\n        $tol = 1e-12;\n        for ($i = 0; $i < $maxIter; $i++) {\n            $fx = $this->cdf($x) - $p;\n            $fpx = $this->pdf($x);\n            if ($fpx < 1e-15) {\n                break;\n            }\n            $delta = $fx / $fpx;\n            $x -= $delta;\n            if (abs($delta) < $tol) {\n                break;\n            }\n        }\n\n        return $x;\n    }\n\n    public function invCdfRounded(float $p, int $precision = 3): float\n    {\n        return round($this->invCdf($p), $precision);\n    }\n\n    /**\n     * Log-gamma function using the Lanczos approximation.\n     */\n    private static function logGamma(float $x): float\n    {\n        // Lanczos approximation coefficients (g=7, n=9)\n        $coef = [\n            0.99999999999980993,\n            676.5203681218851,\n            -1259.1392167224028,\n            771.32342877765313,\n            -176.61502916214059,\n            12.507343278686905,\n            -0.13857109526572012,\n            9.9843695780195716e-6,\n            1.5056327351493116e-7,\n        ];\n\n        if ($x < 0.5) {\n            // Reflection formula: Gamma(x) * Gamma(1-x) = pi / sin(pi*x)\n            return log(M_PI / sin(M_PI * $x)) - self::logGamma(1.0 - $x);\n        }\n\n        $x -= 1.0;\n        $a = $coef[0];\n        $t = $x + 7.5; // g + 0.5\n        for ($i = 1; $i < 9; $i++) {\n            $a += $coef[$i] / ($x + $i);\n        }\n\n        return 0.5 * log(2.0 * M_PI) + ($x + 0.5) * log($t) - $t + log($a);\n    }\n\n    /**\n     * Regularized incomplete beta function I_x(a, b).\n     *\n     * Uses the continued fraction expansion (Lentz's algorithm).\n     */\n    private function regularizedIncompleteBeta(float $a, float $b, float $x): 
float\n    {\n        // Use the symmetry relation when x > (a+1)/(a+b+2) for better convergence\n        if ($x > ($a + 1) / ($a + $b + 2)) {\n            return 1.0 - $this->regularizedIncompleteBeta($b, $a, 1.0 - $x);\n        }\n\n        // Log of the front factor: x^a * (1-x)^b / (a * B(a,b))\n        $logFront = $a * log($x) + $b * log(1 - $x) - log($a)\n            - (self::logGamma($a) + self::logGamma($b) - self::logGamma($a + $b));\n        $front = exp($logFront);\n\n        return $front * $this->incompleteBetaCf($a, $b, $x);\n    }\n\n    /**\n     * Continued fraction expansion for the incomplete beta function.\n     * Uses the modified Lentz's algorithm.\n     */\n    private function incompleteBetaCf(float $a, float $b, float $x): float\n    {\n        $maxIter = 200;\n        $eps = 1e-15;\n        $tiny = 1e-30;\n\n        $f = 1.0;\n        $c = 1.0;\n        $d = 1.0 - ($a + $b) * $x / ($a + 1);\n        if (abs($d) < $tiny) {\n            $d = $tiny; // @codeCoverageIgnore\n        }\n        $d = 1.0 / $d;\n        $f = $d;\n\n        for ($m = 1; $m <= $maxIter; $m++) {\n            // Even step\n            $numerator = $m * ($b - $m) * $x / (($a + 2 * $m - 1) * ($a + 2 * $m));\n\n            $d = 1.0 + $numerator * $d;\n            if (abs($d) < $tiny) {\n                $d = $tiny; // @codeCoverageIgnore\n            }\n            $c = 1.0 + $numerator / $c;\n            if (abs($c) < $tiny) {\n                $c = $tiny; // @codeCoverageIgnore\n            }\n            $d = 1.0 / $d;\n            $f *= $d * $c;\n\n            // Odd step\n            $numerator = -(($a + $m) * ($a + $b + $m) * $x) / (($a + 2 * $m) * ($a + 2 * $m + 1));\n\n            $d = 1.0 + $numerator * $d;\n            if (abs($d) < $tiny) {\n                $d = $tiny; // @codeCoverageIgnore\n            }\n            $c = 1.0 + $numerator / $c;\n            if (abs($c) < $tiny) {\n                $c = $tiny; // @codeCoverageIgnore\n            }\n            
$d = 1.0 / $d;\n            $delta = $d * $c;\n            $f *= $delta;\n\n            if (abs($delta - 1.0) < $eps) {\n                break;\n            }\n        }\n\n        return $f;\n    }\n}\n"
  },
  {
    "path": "src/Utils/Arr.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Utils;\n\nclass Arr\n{\n    /**\n     * Returns a string with values joined with a separator.\n     *\n     * @param  mixed[]  $data\n     */\n    public static function toString(array $data, bool|int $sample = false): string\n    {\n        if ($sample) {\n            return implode(',', array_slice($data, 0, (int) $sample));\n        }\n\n        return implode(',', $data);\n    }\n\n    /**\n     * Eliminate 0 value from the array.\n     *\n     * @param  mixed[]  $data\n     * @return mixed[]\n     */\n    public static function stripZeroes(array $data): array\n    {\n        $del_val = 0;\n\n        return array_values(array_filter($data, fn($e): bool => $e != $del_val));\n    }\n\n    /**\n     * Extract one or more columns from an array of associative arrays.\n     *\n     * Returns one array per requested column, in the same order as $columns.\n     *\n     * Example:\n     *   [$finishTimes, $ages] = Arr::extract($runners, ['finish', 'age']);\n     *\n     * @param  array<array<string, mixed>>  $data\n     * @param  string[]  $columns\n     * @return array<array<mixed>>\n     */\n    public static function extract(array $data, array $columns): array\n    {\n        $result = [];\n        foreach ($columns as $column) {\n            $result[] = array_column($data, $column);\n        }\n\n        return $result;\n    }\n\n    /**\n     * Partition an array of associative arrays into two groups based on a condition.\n     *\n     * Returns [$matching, $nonMatching] — both groups contain full rows.\n     * Supported operators: ==, !=, >, <, >=, <=\n     *\n     * Example:\n     *   [$menRunners, $womenRunners] = Arr::partition($runners, 'gender', '==', 'M');\n     *\n     * @param  array<array<string, mixed>>  $data\n     * @param  string  $operator  one of ==, !=, >, <, >=, <=\n     * @return array{0: array<array<string, mixed>>, 1: array<array<string, mixed>>}\n     */\n    public static function 
partition(array $data, string $field, string $operator, mixed $value): array\n    {\n        $matching = [];\n        $nonMatching = [];\n\n        foreach ($data as $row) {\n            // Rows that lack the field are compared as null.\n            $fieldValue = $row[$field] ?? null;\n\n            if (self::compare($fieldValue, $operator, $value)) {\n                $matching[] = $row;\n            } else {\n                $nonMatching[] = $row;\n            }\n        }\n\n        return [$matching, $nonMatching];\n    }\n\n    /**\n     * Compare a field value with a target value using PHP's loose comparison operators.\n     *\n     * An unsupported operator never matches, so partition() places every row in the non-matching group.\n     */\n    private static function compare(mixed $fieldValue, string $operator, mixed $value): bool\n    {\n        return match ($operator) {\n            '==' => $fieldValue == $value,\n            '!=' => $fieldValue != $value,\n            '>' => $fieldValue > $value,\n            '<' => $fieldValue < $value,\n            '>=' => $fieldValue >= $value,\n            '<=' => $fieldValue <= $value,\n            default => false,\n        };\n    }\n}\n"
  },
  {
    "path": "src/Utils/Format.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Utils;\n\nclass Format\n{\n    /**\n     * Convert seconds to an associative array with hours, minutes, and seconds.\n     *\n     * @return array{hours: int, minutes: int, seconds: int}\n     */\n    public static function secondsToHms(int|float $seconds): array\n    {\n        $totalSeconds = (int) $seconds;\n        $h = intdiv($totalSeconds, 3600);\n        $m = intdiv($totalSeconds % 3600, 60);\n        $s = $totalSeconds % 60;\n\n        return ['hours' => $h, 'minutes' => $m, 'seconds' => $s];\n    }\n\n    /**\n     * Convert hours, minutes, and seconds to total seconds.\n     */\n    public static function hmsToSeconds(int $hours, int $minutes, int $seconds): int\n    {\n        return ($hours * 3600) + ($minutes * 60) + $seconds;\n    }\n\n    /**\n     * Convert seconds to a human-readable time string (e.g. \"1:20:45\").\n     */\n    public static function secondsToTime(int|float $seconds): string\n    {\n        $hms = self::secondsToHms($seconds);\n\n        return sprintf('%d:%02d:%02d', $hms['hours'], $hms['minutes'], $hms['seconds']);\n    }\n\n    /**\n     * Parse a time string (e.g. \"01:20:45\" or \"1:20:45\") to total seconds.\n     */\n    public static function timeToSeconds(string $time): int\n    {\n        $parts = explode(':', $time);\n\n        if (count($parts) !== 3) {\n            throw new \\InvalidArgumentException(\n                \"Invalid time format '{$time}'. Expected format: H:MM:SS or HH:MM:SS\"\n            );\n        }\n\n        return self::hmsToSeconds((int) $parts[0], (int) $parts[1], (int) $parts[2]);\n    }\n}\n"
  },
  {
    "path": "src/Utils/Math.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Utils;\n\nclass Math\n{\n    /**\n     * Rounds value with the given precision, if the round is not null.\n     */\n    public static function round(float $value, ?int $round): float\n    {\n        return is_null($round) ? $value : round($value, $round);\n    }\n\n    /**\n     * Check if number is odd.\n     */\n    public static function isOdd(int $number): bool\n    {\n        return (bool) ($number & 1);\n    }\n}\n"
  },
  {
    "path": "tests/ArrTest.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Tests;\n\nuse HiFolks\\Statistics\\Utils\\Arr;\nuse PHPUnit\\Framework\\TestCase;\n\nclass ArrTest extends TestCase\n{\n    public function test_extract_single_column(): void\n    {\n        $data = [\n            ['name' => 'Alice', 'age' => 30],\n            ['name' => 'Bob', 'age' => 25],\n        ];\n\n        [$names] = Arr::extract($data, ['name']);\n        $this->assertSame(['Alice', 'Bob'], $names);\n    }\n\n    public function test_extract_multiple_columns(): void\n    {\n        $data = [\n            ['name' => 'Alice', 'age' => 30, 'score' => 95],\n            ['name' => 'Bob', 'age' => 25, 'score' => 87],\n        ];\n\n        [$ages, $scores] = Arr::extract($data, ['age', 'score']);\n        $this->assertSame([30, 25], $ages);\n        $this->assertSame([95, 87], $scores);\n    }\n\n    public function test_extract_empty_array(): void\n    {\n        $result = Arr::extract([], ['name']);\n        $this->assertSame([[]], $result);\n    }\n\n    public function test_partition_equals(): void\n    {\n        $data = [\n            ['gender' => 'M', 'time' => 100],\n            ['gender' => 'F', 'time' => 110],\n            ['gender' => 'M', 'time' => 105],\n        ];\n\n        [$men, $women] = Arr::partition($data, 'gender', '==', 'M');\n        $this->assertCount(2, $men);\n        $this->assertCount(1, $women);\n        $this->assertSame('F', $women[0]['gender']);\n    }\n\n    public function test_partition_not_equals(): void\n    {\n        $data = [\n            ['status' => 'active'],\n            ['status' => 'inactive'],\n            ['status' => 'active'],\n        ];\n\n        [$nonActive, $active] = Arr::partition($data, 'status', '!=', 'active');\n        $this->assertCount(1, $nonActive);\n        $this->assertCount(2, $active);\n    }\n\n    public function test_partition_greater_than(): void\n    {\n        $data = [\n            ['age' => 30],\n            ['age' => 20],\n       
     ['age' => 40],\n        ];\n\n        [$older, $younger] = Arr::partition($data, 'age', '>', 25);\n        $this->assertCount(2, $older);\n        $this->assertCount(1, $younger);\n    }\n\n    public function test_partition_less_than(): void\n    {\n        $data = [\n            ['score' => 50],\n            ['score' => 80],\n            ['score' => 30],\n        ];\n\n        [$low, $high] = Arr::partition($data, 'score', '<', 60);\n        $this->assertCount(2, $low);\n        $this->assertCount(1, $high);\n    }\n\n    public function test_partition_greater_than_or_equal(): void\n    {\n        $data = [\n            ['value' => 10],\n            ['value' => 20],\n            ['value' => 20],\n        ];\n\n        [$matching, $nonMatching] = Arr::partition($data, 'value', '>=', 20);\n        $this->assertCount(2, $matching);\n        $this->assertCount(1, $nonMatching);\n    }\n\n    public function test_partition_less_than_or_equal(): void\n    {\n        $data = [\n            ['value' => 10],\n            ['value' => 20],\n            ['value' => 30],\n        ];\n\n        [$matching, $nonMatching] = Arr::partition($data, 'value', '<=', 20);\n        $this->assertCount(2, $matching);\n        $this->assertCount(1, $nonMatching);\n    }\n\n    public function test_partition_empty_array(): void\n    {\n        [$matching, $nonMatching] = Arr::partition([], 'field', '==', 'value');\n        $this->assertSame([], $matching);\n        $this->assertSame([], $nonMatching);\n    }\n\n    public function test_partition_preserves_full_rows(): void\n    {\n        $data = [\n            ['name' => 'Alice', 'age' => 30, 'city' => 'NYC'],\n        ];\n\n        [$matching, $nonMatching] = Arr::partition($data, 'age', '==', 30);\n        $this->assertSame($data[0], $matching[0]);\n    }\n\n    public function test_partition_invalid_operator(): void\n    {\n        $data = [['value' => 10]];\n\n        [$matching, $nonMatching] = Arr::partition($data, 'value', 
'===', 10);\n        $this->assertCount(0, $matching);\n        $this->assertCount(1, $nonMatching);\n    }\n\n    public function test_to_string(): void\n    {\n        $this->assertSame('1,2,3', Arr::toString([1, 2, 3]));\n    }\n\n    public function test_to_string_with_sample(): void\n    {\n        $this->assertSame('1,2', Arr::toString([1, 2, 3, 4], 2));\n    }\n\n    public function test_strip_zeroes(): void\n    {\n        $this->assertSame([1, 2, 3], Arr::stripZeroes([0, 1, 0, 2, 3, 0]));\n    }\n}\n"
  },
  {
    "path": "tests/FormatTest.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Tests;\n\nuse HiFolks\\Statistics\\Utils\\Format;\nuse PHPUnit\\Framework\\TestCase;\n\nclass FormatTest extends TestCase\n{\n    public function test_seconds_to_hms(): void\n    {\n        $result = Format::secondsToHms(4845);\n        $this->assertSame(1, $result['hours']);\n        $this->assertSame(20, $result['minutes']);\n        $this->assertSame(45, $result['seconds']);\n    }\n\n    public function test_seconds_to_hms_zero(): void\n    {\n        $result = Format::secondsToHms(0);\n        $this->assertSame(0, $result['hours']);\n        $this->assertSame(0, $result['minutes']);\n        $this->assertSame(0, $result['seconds']);\n    }\n\n    public function test_seconds_to_hms_with_float(): void\n    {\n        $result = Format::secondsToHms(3661.7);\n        $this->assertSame(1, $result['hours']);\n        $this->assertSame(1, $result['minutes']);\n        $this->assertSame(1, $result['seconds']);\n    }\n\n    public function test_hms_to_seconds(): void\n    {\n        $this->assertSame(4845, Format::hmsToSeconds(1, 20, 45));\n    }\n\n    public function test_hms_to_seconds_zero(): void\n    {\n        $this->assertSame(0, Format::hmsToSeconds(0, 0, 0));\n    }\n\n    public function test_seconds_to_time(): void\n    {\n        $this->assertSame('1:20:45', Format::secondsToTime(4845));\n    }\n\n    public function test_seconds_to_time_with_padding(): void\n    {\n        $this->assertSame('0:05:03', Format::secondsToTime(303));\n    }\n\n    public function test_time_to_seconds(): void\n    {\n        $this->assertSame(4845, Format::timeToSeconds('1:20:45'));\n    }\n\n    public function test_time_to_seconds_with_leading_zeros(): void\n    {\n        $this->assertSame(4845, Format::timeToSeconds('01:20:45'));\n    }\n\n    public function test_round_trip_seconds(): void\n    {\n        $original = 9280;\n        $time = Format::secondsToTime($original);\n        $back = 
Format::timeToSeconds($time);\n        $this->assertSame($original, $back);\n    }\n\n    public function test_round_trip_hms(): void\n    {\n        $seconds = Format::hmsToSeconds(2, 35, 10);\n        $hms = Format::secondsToHms($seconds);\n        $this->assertSame(2, $hms['hours']);\n        $this->assertSame(35, $hms['minutes']);\n        $this->assertSame(10, $hms['seconds']);\n    }\n\n    public function test_time_to_seconds_invalid_format(): void\n    {\n        $this->expectException(\\InvalidArgumentException::class);\n        Format::timeToSeconds('20:45');\n    }\n\n    public function test_time_to_seconds_invalid_format_too_many_parts(): void\n    {\n        $this->expectException(\\InvalidArgumentException::class);\n        Format::timeToSeconds('1:20:45:00');\n    }\n}\n"
  },
  {
    "path": "tests/FreqTest.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Tests;\n\nuse HiFolks\\Statistics\\Freq;\nuse PHPUnit\\Framework\\TestCase;\n\nclass FreqTest extends TestCase\n{\n    public function test_can_calculate_freq_table(): void\n    {\n        $this->assertEquals([4 => 2, 3 => 1, 1 => 1, 2 => 1], Freq::frequencies([1, 2, 3, 4, 4]));\n        $this->assertEquals([], Freq::frequencies([]));\n\n        $result = Freq::frequencies(['red', 'blue', 'blue', 'red', 'green', 'red', 'red']);\n        $this->assertEquals(['red' => 4, 'blue' => 2, 'green' => 1], $result);\n        $this->assertCount(3, $result);\n        $this->assertEquals(4, $result['red']);\n        $this->assertEquals(2, $result['blue']);\n        $this->assertEquals(1, $result['green']);\n\n        $result = Freq::frequencies([2.1, 2.7, 1.4, 2.45], true);\n        $this->assertEquals([2 => 3, 1 => 1], $result);\n        $this->assertCount(2, $result);\n    }\n\n    public function test_can_calculate_relative_freq_table(): void\n    {\n        $this->assertEquals([4 => 40, 3 => 20, 1 => 20, 2 => 20], Freq::relativeFrequencies([1, 2, 3, 4, 4]));\n        $this->assertEquals([], Freq::relativeFrequencies([]));\n\n        $result = Freq::relativeFrequencies(['red', 'blue', 'blue', 'red', 'green', 'red', 'red'], 2);\n        $this->assertEquals(['red' => 57.14, 'blue' => 28.57, 'green' => 14.29], $result);\n        $this->assertCount(3, $result);\n        $this->assertEquals(57.14, $result['red']);\n        $this->assertEquals(28.57, $result['blue']);\n        $this->assertEquals(14.29, $result['green']);\n\n        $result = Freq::relativeFrequencies([2.1, 2.7, 1.4, 2.45], 1);\n        $this->assertEquals([2 => 75, 1 => 25], $result);\n        $this->assertCount(2, $result);\n    }\n\n    public function test_can_calculate_grouped_frequency_table(): void\n    {\n        $data = [1, 1, 1, 4, 4, 5, 5, 5, 6, 7, 8, 8, 8, 9, 9, 9, 9, 9, 9, 10, 10, 11, 12, 12,\n            13, 14, 14, 15, 15, 16, 16, 16, 16, 17, 
17, 17, 18, 18, ];\n\n        $table = Freq::frequencyTable($data, 7);\n        $this->assertCount(6, $table);\n        $this->assertEquals([1 => 3, 4 => 6, 7 => 10, 10 => 5, 13 => 5, 16 => 9], $table);\n        $this->assertEquals(count($data), array_sum($table));\n\n        $table = Freq::frequencyTable($data, 6);\n        $this->assertCount(6, $table);\n        $this->assertEquals([1 => 3, 4 => 6, 7 => 10, 10 => 5, 13 => 5, 16 => 9], $table);\n        $this->assertEquals(count($data), array_sum($table));\n\n        $table = Freq::frequencyTable($data, 8);\n        $this->assertCount(6, $table);\n        $this->assertEquals([1 => 3, 4 => 6, 7 => 10, 10 => 5, 13 => 5, 16 => 9], $table);\n        $this->assertEquals(count($data), array_sum($table));\n\n        $table = Freq::frequencyTable($data, 3);\n        $this->assertCount(3, $table);\n        $this->assertEquals([1 => 9, 7 => 15, 13 => 14], $table);\n        $this->assertEquals(count($data), array_sum($table));\n\n        $table = Freq::frequencyTable($data);\n        $this->assertCount(18, $table);\n        $this->assertEquals([1 => 3, 2 => 0, 3 => 0, 4 => 2, 5 => 3, 6 => 1, 7 => 1, 8 => 3, 9 => 6,\n            10 => 2, 11 => 1, 12 => 2, 13 => 1, 14 => 2, 15 => 2, 16 => 4, 17 => 3, 18 => 2, ], $table);\n    }\n\n    public function test_can_calculate_grouped_frequency_table_by_size(): void\n    {\n        $data = [1, 1, 1, 4, 4, 5, 5, 5, 6, 7, 8, 8, 8, 9, 9, 9, 9, 9, 9, 10, 10, 11, 12, 12,\n            13, 14, 14, 15, 15, 16, 16, 16, 16, 17, 17, 17, 18, 18, ];\n\n        $table = Freq::frequencyTableBySize($data, 4);\n        $this->assertCount(5, $table);\n        $this->assertEquals([1 => 5, 5 => 8, 9 => 11, 13 => 9, 17 => 5], $table);\n        $this->assertEquals(count($data), array_sum($table));\n\n        $table = Freq::frequencyTableBySize($data, 5);\n        $this->assertCount(4, $table);\n        $this->assertEquals([1 => 8, 6 => 13, 11 => 8, 16 => 9], $table);\n        
$this->assertEquals(count($data), array_sum($table));\n\n        $table = Freq::frequencyTableBySize($data, 8);\n        $this->assertCount(3, $table);\n        $this->assertEquals([1 => 13, 9 => 20, 17 => 5], $table);\n        $this->assertEquals(count($data), array_sum($table));\n    }\n\n    public function test_frequency_table_with_empty_array(): void\n    {\n        $this->assertSame([], Freq::frequencyTable([]));\n    }\n\n    public function test_frequency_table_by_size_with_empty_array(): void\n    {\n        $this->assertSame([], Freq::frequencyTableBySize([]));\n    }\n}\n"
  },
  {
    "path": "tests/FrequenciesTest.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Tests;\n\nuse HiFolks\\Statistics\\Exception\\InvalidDataInputException;\nuse HiFolks\\Statistics\\Statistics;\nuse PHPUnit\\Framework\\TestCase;\n\nclass FrequenciesTest extends TestCase\n{\n    public function test_can_calculate_frequencies(): void\n    {\n        $s = Statistics::make(\n            [98, 90, 70, 18, 92, 92, 55, 83, 45, 95, 88, 76],\n        );\n        $a = $s->frequencies();\n        $this->assertEquals(2, $a[92]);\n        $this->assertCount(11, $a);\n    }\n\n    public function test_can_calculate_relative_frequencies(): void\n    {\n        $s = Statistics::make(\n            [3, 4, 3, 1],\n        );\n        $a = $s->relativeFrequencies();\n        $this->assertEquals(50, $a[3]);\n        $this->assertCount(3, $a);\n        $this->assertCount(4, $s->originalArray());\n    }\n\n    public function test_can_calculate_cumulative_frequencies(): void\n    {\n        $s = Statistics::make(\n            [3, 4, 3, 1],\n        );\n        $a = $s->cumulativeFrequencies();\n        $this->assertEquals(3, $a[3]);\n        $this->assertCount(3, $a);\n        $this->assertCount(4, $s->originalArray());\n    }\n\n    public function test_can_calculate_cumulative_relative_frequencies(): void\n    {\n        $s = Statistics::make(\n            [3, 4, 3, 1],\n        );\n        $a = $s->cumulativeRelativeFrequencies();\n        $this->assertEquals(75, $a[3]);\n        $this->assertCount(3, $a);\n        $this->assertCount(4, $s->originalArray());\n    }\n\n    public function test_can_calculate_first_quartile(): void\n    {\n        $s = Statistics::make([3, 4, 3, 1]);\n        $this->assertEquals(1.5, $s->firstQuartile());\n\n        $s = Statistics::make([3, 4, 3]);\n        $this->assertEquals(3, $s->firstQuartile());\n    }\n\n    public function test_can_calculate_first_quartile_with_empty_array(): void\n    {\n        $s = Statistics::make([]);\n        
$this->expectException(InvalidDataInputException::class);\n        $s->firstQuartile();\n    }\n\n    public function test_can_calculate_third_quartile(): void\n    {\n        $s = Statistics::make([3, 4, 3, 1]);\n        $this->assertEquals(3.75, $s->thirdQuartile());\n\n        $s = Statistics::make([3, 4, 3]);\n        $this->assertEquals(4, $s->thirdQuartile());\n    }\n\n    public function test_can_calculate_third_quartile_with_empty_array(): void\n    {\n        $s = Statistics::make([]);\n        $this->expectException(InvalidDataInputException::class);\n        $s->thirdQuartile();\n    }\n}\n"
  },
  {
    "path": "tests/MathTest.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Tests;\n\nuse HiFolks\\Statistics\\Utils\\Math;\nuse PHPUnit\\Framework\\TestCase;\n\nclass MathTest extends TestCase\n{\n    public function test_is_odd(): void\n    {\n        $this->assertTrue(Math::isOdd(1));\n        $this->assertFalse(Math::isOdd(0));\n        $this->assertTrue(Math::isOdd(-5));\n        $this->assertFalse(Math::isOdd(-2));\n    }\n}\n"
  },
  {
    "path": "tests/NormalDistTest.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Tests;\n\nuse HiFolks\\Statistics\\NormalDist;\nuse PHPUnit\\Framework\\TestCase;\n\nclass NormalDistTest extends TestCase\n{\n    public function test_init_normal_dist(): void\n    {\n        $nd = new NormalDist(1060, 195);\n        $this->assertEquals(1060, $nd->getMean());\n        $this->assertEquals(195, $nd->getSigma());\n    }\n\n    public function test_can_calculate_normal_dist_cdf(): void\n    {\n        $nd = new NormalDist(1060, 195);\n        $this->assertEquals(0.184, round($nd->cdf(1200 + 0.5) - $nd->cdf(1100 - 0.5), 3));\n    }\n\n    public function test_can_calculate_normal_dist_pdf(): void\n    {\n        $nd = new NormalDist(10, 2);\n        $this->assertEquals(0.121, $nd->pdfRounded(12, 3));\n        $this->assertEquals(0.12, $nd->pdfRounded(12, 2));\n    }\n\n    public function test_median(): void\n    {\n        $nd = new NormalDist(100, 15);\n        $this->assertEquals(100.0, $nd->getMedian());\n        $this->assertEquals(100.0, $nd->getMedianRounded(2));\n\n        // Median always equals mean for a normal distribution\n        $nd2 = new NormalDist(5.123, 2.5);\n        $this->assertEquals($nd2->getMean(), $nd2->getMedian());\n    }\n\n    public function test_median_from_samples(): void\n    {\n        $samples = [2.5, 3.1, 2.1, 2.4, 2.7, 3.5];\n        $nd = NormalDist::fromSamples($samples);\n        $this->assertEquals($nd->getMeanRounded(5), $nd->getMedianRounded(5));\n        $this->assertEquals(2.71667, $nd->getMedianRounded(5));\n    }\n\n    public function test_mode(): void\n    {\n        $nd = new NormalDist(100, 15);\n        $this->assertEquals(100.0, $nd->getMode());\n        $this->assertEquals(100.0, $nd->getModeRounded(2));\n\n        // Mode always equals mean for a normal distribution\n        $nd2 = new NormalDist(5.123, 2.5);\n        $this->assertEquals($nd2->getMean(), $nd2->getMode());\n    }\n\n    public function test_mode_from_samples(): void\n    {\n   
     $samples = [2.5, 3.1, 2.1, 2.4, 2.7, 3.5];\n        $nd = NormalDist::fromSamples($samples);\n        $this->assertEquals($nd->getMeanRounded(5), $nd->getModeRounded(5));\n        $this->assertEquals(2.71667, $nd->getModeRounded(5));\n    }\n\n    public function test_variance(): void\n    {\n        $nd = new NormalDist(0, 1);\n        $this->assertEquals(1.0, $nd->getVariance());\n\n        $nd2 = new NormalDist(10, 2);\n        $this->assertEquals(4.0, $nd2->getVariance());\n\n        $nd3 = new NormalDist(100, 15);\n        $this->assertEquals(225.0, $nd3->getVariance());\n        $this->assertEquals(225.0, $nd3->getVarianceRounded(2));\n    }\n\n    public function test_variance_from_samples(): void\n    {\n        $samples = [2.5, 3.1, 2.1, 2.4, 2.7, 3.5];\n        $nd = NormalDist::fromSamples($samples);\n        $this->assertEquals(0.25767, $nd->getVarianceRounded(5));\n    }\n\n    public function test_load_normal_dist_from_samples(): void\n    {\n        $samples = [2.5, 3.1, 2.1, 2.4, 2.7, 3.5];\n        $normalDist = NormalDist::fromSamples($samples);\n        $this->assertEquals(2.71667, $normalDist->getMeanRounded(5));\n        $this->assertEquals(0.50761, $normalDist->getSigmaRounded(5));\n    }\n\n    public function test_add_to_normal_dist(): void\n    {\n        $birth_weights = NormalDist::fromSamples([2.5, 3.1, 2.1, 2.4, 2.7, 3.5]);\n        $drug_effects = new NormalDist(0.4, 0.15);\n        $combined = $birth_weights->add($drug_effects);\n        $this->assertEquals(3.1, $combined->getMeanRounded(1));\n        $this->assertEquals(0.5, $combined->getSigmaRounded(1));\n        $this->assertEquals(2.71667, $birth_weights->getMeanRounded(5));\n        $this->assertEquals(0.50761, $birth_weights->getSigmaRounded(5));\n    }\n\n    public function test_multiply_normal_dist(): void\n    {\n        $tempFebruaryCelsius = new NormalDist(5, 2.5);\n        $tempFebFahrenheit = $tempFebruaryCelsius->multiply(9 / 5)->add(32);\n        
$this->assertEquals(41.0, $tempFebFahrenheit->getMeanRounded(1));\n        $this->assertEquals(4.5, $tempFebFahrenheit->getSigmaRounded(1));\n        $this->assertEquals(5.0, $tempFebruaryCelsius->getMeanRounded(1));\n        $this->assertEquals(2.5, $tempFebruaryCelsius->getSigmaRounded(1));\n    }\n\n    public function test_subtract_constant_from_normal_dist(): void\n    {\n        $nd = new NormalDist(100, 15);\n        $result = $nd->subtract(32);\n        $this->assertEquals(68.0, $result->getMean());\n        $this->assertEquals(15.0, $result->getSigma());\n        // Original unchanged\n        $this->assertEquals(100.0, $nd->getMean());\n    }\n\n    public function test_subtract_normal_dist(): void\n    {\n        $n1 = new NormalDist(10, 3);\n        $n2 = new NormalDist(4, 2);\n        $result = $n1->subtract($n2);\n        $this->assertEquals(6.0, $result->getMean());\n        // sigma = sqrt(3^2 + 2^2) = sqrt(13)\n        $this->assertEqualsWithDelta(sqrt(13), $result->getSigma(), 1e-10);\n    }\n\n    public function test_divide_normal_dist(): void\n    {\n        // Fahrenheit to Celsius: (F - 32) / (9/5)\n        $tempFahrenheit = new NormalDist(41, 4.5);\n        $tempCelsius = $tempFahrenheit->subtract(32)->divide(9 / 5);\n        $this->assertEquals(5.0, $tempCelsius->getMeanRounded(1));\n        $this->assertEquals(2.5, $tempCelsius->getSigmaRounded(1));\n    }\n\n    public function test_divide_preserves_original(): void\n    {\n        $nd = new NormalDist(100, 20);\n        $result = $nd->divide(2);\n        $this->assertEquals(50.0, $result->getMean());\n        $this->assertEquals(10.0, $result->getSigma());\n        // Original unchanged\n        $this->assertEquals(100.0, $nd->getMean());\n        $this->assertEquals(20.0, $nd->getSigma());\n    }\n\n    public function test_divide_by_zero_throws(): void\n    {\n        $nd = new NormalDist(100, 15);\n        
$this->expectException(\\HiFolks\\Statistics\\Exception\\InvalidDataInputException::class);\n        $nd->divide(0);\n    }\n\n    public function test_quantiles_default_quartiles(): void\n    {\n        $nd = new NormalDist(0, 1);\n        $q = $nd->quantiles(); // default n=4\n        $this->assertCount(3, $q);\n        // Q1 ≈ -0.6745, Q2 = 0, Q3 ≈ 0.6745\n        $this->assertEqualsWithDelta(-0.6745, $q[0], 1e-3);\n        $this->assertEqualsWithDelta(0.0, $q[1], 1e-5);\n        $this->assertEqualsWithDelta(0.6745, $q[2], 1e-3);\n    }\n\n    public function test_quantiles_deciles(): void\n    {\n        $nd = new NormalDist(100, 15);\n        $q = $nd->quantiles(10); // deciles\n        $this->assertCount(9, $q);\n        // The 5th decile (median) should equal the mean\n        $this->assertEqualsWithDelta(100.0, $q[4], 1e-5);\n        // Symmetric: distance from mean should be equal for q[0] and q[8]\n        $this->assertEqualsWithDelta(\n            $q[4] - $q[0],\n            $q[8] - $q[4],\n            1e-5,\n        );\n    }\n\n    public function test_quantiles_percentiles(): void\n    {\n        $nd = new NormalDist(0, 1);\n        $q = $nd->quantiles(100);\n        $this->assertCount(99, $q);\n        // 50th percentile should be 0\n        $this->assertEqualsWithDelta(0.0, $q[49], 1e-5);\n    }\n\n    public function test_quantiles_n_one_returns_empty(): void\n    {\n        $nd = new NormalDist(0, 1);\n        $q = $nd->quantiles(1);\n        $this->assertCount(0, $q);\n    }\n\n    public function test_quantiles_throws_for_invalid_n(): void\n    {\n        $nd = new NormalDist(0, 1);\n        $this->expectException(\\HiFolks\\Statistics\\Exception\\InvalidDataInputException::class);\n        $nd->quantiles(0);\n    }\n\n    public function test_samples_count(): void\n    {\n        $nd = new NormalDist(100, 15);\n        $samples = $nd->samples(1000);\n        $this->assertCount(1000, $samples);\n    }\n\n    public function 
test_samples_statistical_properties(): void\n    {\n        // With enough samples, mean and stdev should approximate mu and sigma\n        $nd = new NormalDist(50, 10);\n        $samples = $nd->samples(10000, seed: 42);\n        $sampleMean = array_sum($samples) / count($samples);\n        $this->assertEqualsWithDelta(50, $sampleMean, 1.0);\n\n        $reconstructed = NormalDist::fromSamples($samples);\n        $this->assertEqualsWithDelta(10, $reconstructed->getSigma(), 1.0);\n    }\n\n    public function test_samples_seed_reproducibility(): void\n    {\n        $nd = new NormalDist(0, 1);\n        $a = $nd->samples(100, seed: 123);\n        $b = $nd->samples(100, seed: 123);\n        $this->assertEquals($a, $b);\n    }\n\n    public function test_samples_throws_for_invalid_n(): void\n    {\n        $nd = new NormalDist(0, 1);\n        $this->expectException(\\HiFolks\\Statistics\\Exception\\InvalidDataInputException::class);\n        $nd->samples(0);\n    }\n\n    public function test_zscore(): void\n    {\n        $nd = new NormalDist(100, 15);\n        // Mean has z-score of 0\n        $this->assertEquals(0.0, $nd->zscore(100));\n        // One standard deviation above\n        $this->assertEqualsWithDelta(1.0, $nd->zscore(115), 1e-10);\n        // One standard deviation below\n        $this->assertEqualsWithDelta(-1.0, $nd->zscore(85), 1e-10);\n        // Two standard deviations above\n        $this->assertEqualsWithDelta(2.0, $nd->zscore(130), 1e-10);\n    }\n\n    public function test_zscore_standard_normal(): void\n    {\n        // For standard normal, zscore(x) == x\n        $nd = new NormalDist(0, 1);\n        $this->assertEqualsWithDelta(1.5, $nd->zscore(1.5), 1e-10);\n        $this->assertEqualsWithDelta(-2.3, $nd->zscore(-2.3), 1e-10);\n    }\n\n    public function test_zscore_rounded(): void\n    {\n        $nd = new NormalDist(10, 3);\n        $this->assertEquals(1.333, $nd->zscoreRounded(14, 3));\n    }\n\n    public function 
test_zscore_throws_for_zero_sigma(): void\n    {\n        $nd = new NormalDist(5, 0);\n        $this->expectException(\\HiFolks\\Statistics\\Exception\\InvalidDataInputException::class);\n        $nd->zscore(5);\n    }\n\n    public function test_overlap_identical_distributions(): void\n    {\n        $nd = new NormalDist(0, 1);\n        // Identical distributions overlap completely\n        $this->assertEqualsWithDelta(1.0, $nd->overlap($nd), 1e-5);\n    }\n\n    public function test_overlap_different_means(): void\n    {\n        // Python reference: NormalDist(2.4, 1.6).overlap(NormalDist(3.2, 2.0)) ≈ 0.8035\n        $n1 = new NormalDist(2.4, 1.6);\n        $n2 = new NormalDist(3.2, 2.0);\n        $this->assertEqualsWithDelta(0.8035, $n1->overlapRounded($n2, 4), 1e-3);\n    }\n\n    public function test_overlap_equal_variances(): void\n    {\n        // Equal sigma, different means\n        $n1 = new NormalDist(0, 1);\n        $n2 = new NormalDist(1, 1);\n        $overlap = $n1->overlap($n2);\n        $this->assertGreaterThan(0.0, $overlap);\n        $this->assertLessThan(1.0, $overlap);\n        // Symmetric: order shouldn't matter\n        $this->assertEqualsWithDelta($overlap, $n2->overlap($n1), 1e-10);\n    }\n\n    public function test_overlap_far_apart_distributions(): void\n    {\n        // Very far apart distributions should have near-zero overlap\n        $n1 = new NormalDist(0, 1);\n        $n2 = new NormalDist(100, 1);\n        $this->assertEqualsWithDelta(0.0, $n1->overlap($n2), 1e-5);\n    }\n\n    public function test_overlap_is_symmetric(): void\n    {\n        $n1 = new NormalDist(5, 2);\n        $n2 = new NormalDist(10, 3);\n        $this->assertEqualsWithDelta(\n            $n1->overlap($n2),\n            $n2->overlap($n1),\n            1e-10,\n        );\n    }\n\n    public function test_inv_cdf_standard_normal(): void\n    {\n        $nd = new NormalDist(0, 1);\n        // inv_cdf(0.5) should be 0 for standard normal\n        
$this->assertEqualsWithDelta(0.0, $nd->invCdf(0.5), 1e-5);\n        // Known quantiles of standard normal\n        $this->assertEqualsWithDelta(-1.64485, $nd->invCdfRounded(0.05, 5), 1e-4);\n        $this->assertEqualsWithDelta(-1.28155, $nd->invCdfRounded(0.1, 5), 1e-4);\n        $this->assertEqualsWithDelta(1.28155, $nd->invCdfRounded(0.9, 5), 1e-4);\n        $this->assertEqualsWithDelta(1.64485, $nd->invCdfRounded(0.95, 5), 1e-4);\n    }\n\n    public function test_inv_cdf_custom_distribution(): void\n    {\n        $nd = new NormalDist(100, 15);\n        // inv_cdf(0.5) should equal the mean\n        $this->assertEqualsWithDelta(100.0, $nd->invCdf(0.5), 1e-5);\n        // A round-trip: cdf(inv_cdf(p)) should equal p\n        $this->assertEqualsWithDelta(0.25, $nd->cdf($nd->invCdf(0.25)), 1e-5);\n        $this->assertEqualsWithDelta(0.75, $nd->cdf($nd->invCdf(0.75)), 1e-5);\n        $this->assertEqualsWithDelta(0.99, $nd->cdf($nd->invCdf(0.99)), 1e-4);\n    }\n\n    public function test_inv_cdf_throws_for_invalid_p(): void\n    {\n        $nd = new NormalDist(0, 1);\n        $this->expectException(\\HiFolks\\Statistics\\Exception\\InvalidDataInputException::class);\n        $nd->invCdf(0.0);\n    }\n\n    public function test_inv_cdf_throws_for_p_equals_one(): void\n    {\n        $nd = new NormalDist(0, 1);\n        $this->expectException(\\HiFolks\\Statistics\\Exception\\InvalidDataInputException::class);\n        $nd->invCdf(1.0);\n    }\n\n    public function test_inv_cdf_extreme_tails(): void\n    {\n        $nd = new NormalDist(0, 1);\n        // Very low probability (lower tail)\n        $this->assertEqualsWithDelta(-3.09023, $nd->invCdfRounded(0.001, 5), 1e-3);\n        // Very high probability (upper tail)\n        $this->assertEqualsWithDelta(3.09023, $nd->invCdfRounded(0.999, 5), 1e-3);\n    }\n\n    public function test_constructor_negative_sigma_throws(): void\n    {\n        
$this->expectException(\\HiFolks\\Statistics\\Exception\\InvalidDataInputException::class);\n        new NormalDist(0.0, -1.0);\n    }\n\n    public function test_from_samples_empty_throws(): void\n    {\n        $this->expectException(\\HiFolks\\Statistics\\Exception\\InvalidDataInputException::class);\n        NormalDist::fromSamples([]);\n    }\n\n    public function test_cdf_rounded(): void\n    {\n        $nd = new NormalDist(0.0, 1.0);\n        $this->assertEquals(0.5, $nd->cdfRounded(0.0));\n        $this->assertEquals(0.841, $nd->cdfRounded(1.0));\n        $this->assertEquals(0.84134, $nd->cdfRounded(1.0, 5));\n    }\n\n    public function test_overlap_zero_sigma_throws(): void\n    {\n        $a = new NormalDist(0.0, 0.0);\n        $b = new NormalDist(1.0, 1.0);\n        $this->expectException(\\HiFolks\\Statistics\\Exception\\InvalidDataInputException::class);\n        $a->overlap($b);\n    }\n}\n"
  },
  {
    "path": "tests/StatDatasetTest.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Tests;\n\nuse HiFolks\\Statistics\\Stat;\nuse PHPUnit\\Framework\\Attributes\\DataProvider;\nuse PHPUnit\\Framework\\TestCase;\n\nclass StatDatasetTest extends TestCase\n{\n    public function test_mean(): void\n    {\n        $this->assertEquals(2.8, Stat::mean([1, 2, 3, 4, 4]));\n        $this->assertEquals(2.625, Stat::mean([-1.0, 2.5, 3.25, 5.75]));\n    }\n\n    public function test_mean_chain(): void\n    {\n        $this->assertEquals(2.8, Stat::mean([1, 2, 3, 4, 4]));\n        $this->assertEquals(2.625, Stat::mean([-1.0, 2.5, 3.25, 5.75]));\n    }\n\n    /** @param array<int|float> $input */\n    #[DataProvider('meanDatasetProvider')]\n    public function test_mean_dataset(array $input, float $result): void\n    {\n        $this->assertEquals($result, Stat::mean($input));\n    }\n\n    /** @return array<array{array<int|float>, float}> */\n    public static function meanDatasetProvider(): array\n    {\n        return [\n            [[1, 2, 3, 4, 4], 2.8],\n            [[-1.0, 2.5, 3.25, 5.75], 2.625],\n        ];\n    }\n\n    /** @param array<int|float> $input */\n    #[DataProvider('dynamicOperationProvider')]\n    public function test_dynamic_operation(string $methodName, array $input, float $result): void\n    {\n        $this->assertEquals($result, Stat::$methodName($input));\n    }\n\n    /** @return array<array{string, array<int|float>, float}> */\n    public static function dynamicOperationProvider(): array\n    {\n        return [\n            ['mean', [1, 2, 3, 4, 4], 2.8],\n            ['mean', [-1.0, 2.5, 3.25, 5.75], 2.625],\n            ['median', [1, 3, 5], 3],\n            ['median', [1, 3, 5, 7], 4],\n            ['medianLow', [1, 3, 5], 3],\n            ['medianLow', [1, 3, 5, 7], 3],\n        ];\n    }\n\n    /** @param array<int|float> $input */\n    #[DataProvider('externalDatasetProvider')]\n    public function test_dynamic_operation_with_external_dataset(string $methodName, array 
$input, float $result): void\n    {\n        $this->assertEquals(\n            $result,\n            Stat::$methodName($input),\n        );\n    }\n\n    /** @return array<array{string, array<int|float>, float}> */\n    public static function externalDatasetProvider(): array\n    {\n        return [\n            ['mean', [1, 2, 3, 4, 4], 2.8],\n            ['mean', [-1.0, 2.5, 3.25, 5.75], 2.625],\n            ['median', [1, 3, 5], 3],\n            ['median', [1, 3, 5, 7], 4],\n            ['medianLow', [1, 3, 5], 3],\n            ['medianLow', [1, 3, 5, 7], 3],\n        ];\n    }\n}\n"
  },
  {
    "path": "tests/StatFromCsvTest.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Tests;\n\nuse HiFolks\\Statistics\\Stat;\nuse PHPUnit\\Framework\\TestCase;\n\nclass StatFromCsvTest extends TestCase\n{\n    public function test_parse_csv(): void\n    {\n        $row = 0;\n\n        if (($handle = fopen(getcwd() . '/tests/data/income.data.csv', 'r')) !== false) {\n            $x = [];\n            $y = [];\n            while (($data = fgetcsv(\n                $handle,\n                1000,\n                separator: ',',\n                enclosure: '\"',\n                escape: \"\",\n            )) !== false) {\n                $num = count($data);\n                $this->assertEquals(3, $num);\n                $row++;\n                if ($row === 1) {\n                    continue;\n                }\n                $income = floatval($data[1]);\n                $x[] = $income;\n                $happiness = floatval($data[2]);\n                $y[] = $happiness;\n                $this->assertIsFloat($income);\n                $this->assertGreaterThan(0, $income);\n                $this->assertIsFloat($happiness);\n            }\n            [$slope, $intercept] = Stat::linearRegression($x, $y);\n            $this->assertEquals(0.71383, round($slope, 5));\n            $this->assertEquals(0.20427, round($intercept, 5));\n\n            fclose($handle);\n        }\n\n        $this->assertEquals(499, $row);\n    }\n}\n"
  },
  {
    "path": "tests/StatTest.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Tests;\n\nuse HiFolks\\Statistics\\Enums\\Alternative;\nuse HiFolks\\Statistics\\Exception\\InvalidDataInputException;\nuse HiFolks\\Statistics\\Enums\\KdeKernel;\nuse HiFolks\\Statistics\\Stat;\nuse PHPUnit\\Framework\\TestCase;\n\nclass StatTest extends TestCase\n{\n    public function test_calculates_mean(): void\n    {\n        $this->assertEquals(2.8, Stat::mean([1, 2, 3, 4, 4]));\n        $this->assertEquals(2.625, Stat::mean([-1.0, 2.5, 3.25, 5.75]));\n        $this->expectException(InvalidDataInputException::class);\n        Stat::mean([]);\n    }\n\n    public function test_calculates_fmean(): void\n    {\n        $this->assertEquals(2.8, Stat::mean([1, 2, 3, 4, 4]));\n        $this->assertEquals(2.625, Stat::mean([-1.0, 2.5, 3.25, 5.75]));\n\n        $result = Stat::fmean([3.5, 4.0, 5.25]);\n        $this->assertIsFloat($result);\n        $this->assertEquals(4.25, $result);\n\n        $result = Stat::fmean([85, 92, 83, 91], [0.20, 0.20, 0.30, 0.30], 2);\n        $this->assertIsFloat($result);\n        $this->assertEquals(87.6, $result);\n\n        $result = Stat::fmean([3.5, 4.0, 5.25], [1, 2, 1]);\n        $this->assertIsFloat($result);\n        $this->assertEquals(4.1875, $result);\n\n        $result = Stat::fmean([3.5, 4.0, 5.25], precision: 2);\n        $this->assertIsFloat($result);\n        $this->assertEquals(4.25, $result);\n\n        $result = Stat::fmean([3.5, 4.0, 5.25], [1, 2, 1], precision: 3);\n        $this->assertIsFloat($result);\n        $this->assertEquals(4.188, $result);\n    }\n\n    public function test_calculates_fmean_with_empty_array(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::mean([]);\n    }\n\n    public function test_fmean_empty_data_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::fmean([]);\n    }\n\n    public function test_fmean_mismatched_weights_throws(): 
void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::fmean([1, 2, 3], [1, 2]);\n    }\n\n    public function test_fmean_zero_weight_sum_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::fmean([1, 2, 3], [0, 0, 0]);\n    }\n\n    public function test_calculates_median(): void\n    {\n        $this->assertEquals(3, Stat::median([1, 3, 5]));\n        $this->assertEquals(4, Stat::median([1, 3, 5, 7]));\n        $this->assertEquals(1001, Stat::median([1001, 999, 998, 1001, 1002]));\n        $this->assertEquals(1001.5, Stat::median([1001, 999, 998, 1003, 1002, 1003]));\n        $this->assertEquals(7, Stat::median([1, 3, 5, 7, 9, 11, 13]));\n        $this->assertEquals(6, Stat::median([1, 3, 5, 7, 9, 11]));\n        $this->assertEquals(1.05, Stat::median([-11, 5.5, -3.4, 7.1, -9, 22]));\n        $this->assertEquals(0, Stat::median([-1, -2, -3, -4, 4, 3, 2, 1]));\n    }\n\n    public function test_calculates_median_with_empty_array(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::median([]);\n    }\n\n    public function test_calculates_median_low(): void\n    {\n        $this->assertEquals(3, Stat::medianLow([1, 3, 5]));\n        $this->assertEquals(3, Stat::medianLow([1, 3, 5, 7]));\n        $this->assertEquals(1001, Stat::medianLow([1001, 999, 998, 1003, 1002, 1003]));\n    }\n\n    public function test_calculates_median_low_with_empty_array(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::medianLow([]);\n    }\n\n    public function test_calculates_median_high(): void\n    {\n        $this->assertEquals(3, Stat::medianHigh([1, 3, 5]));\n        $this->assertEquals(5, Stat::medianHigh([1, 3, 5, 7]));\n        $this->assertEquals(1002, Stat::medianHigh([1001, 999, 998, 1003, 1002, 1003]));\n    }\n\n    public function test_calculates_median_high_with_empty_array(): void\n    {\n     
   $this->expectException(InvalidDataInputException::class);\n        Stat::medianHigh([]);\n    }\n\n    public function test_calculates_median_grouped(): void\n    {\n        // Python: median_grouped([1, 2, 2, 3, 4, 4, 4, 4, 4, 5]) == 3.7\n        $this->assertEquals(3.7, Stat::medianGrouped([1, 2, 2, 3, 4, 4, 4, 4, 4, 5]));\n\n        // Python: median_grouped([52, 52, 53, 54]) == 52.5\n        $this->assertEquals(52.5, Stat::medianGrouped([52, 52, 53, 54]));\n\n        // Python: median_grouped([1, 3, 3, 5, 7]) == 3.25\n        $this->assertEquals(3.25, Stat::medianGrouped([1, 3, 3, 5, 7]));\n\n        // With interval=2: median_grouped([1, 3, 3, 5, 7], interval=2) == 3.5\n        $this->assertEquals(3.5, Stat::medianGrouped([1, 3, 3, 5, 7], 2));\n\n        // Demographics example from Python docs (interval=10)\n        $data = array_merge(\n            array_fill(0, 172, 25),\n            array_fill(0, 484, 35),\n            array_fill(0, 387, 45),\n            array_fill(0, 22, 55),\n            array_fill(0, 6, 65),\n        );\n        $this->assertEquals(37.5, round(Stat::medianGrouped($data, 10), 1));\n\n        // Single element: L = 1 - 0.5 = 0.5, result = 0.5 + 1*(0.5-0)/1 = 1.0\n        $this->assertEquals(1.0, Stat::medianGrouped([1]));\n    }\n\n    public function test_calculates_median_grouped_with_empty_array(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::medianGrouped([]);\n    }\n\n    public function test_calculates_mode(): void\n    {\n        $this->assertEquals(3, Stat::mode([1, 1, 2, 3, 3, 3, 3, 4]));\n        $this->assertNull(Stat::mode([1, 2, 3]));\n        $this->assertEquals('red', Stat::mode(['red', 'blue', 'blue', 'red', 'green', 'red', 'red']));\n    }\n\n    public function test_calculates_mode_with_empty_array(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::mode([]);\n    }\n\n    public function test_calculates_multimode(): void\n 
   {\n        $this->assertEquals([3], Stat::multimode([1, 1, 2, 3, 3, 3, 3, 4]));\n        $this->assertEquals([1, 3], Stat::multimode([1, 1, 2, 3, 3, 3, 3, 1, 1, 4]));\n        $result = Stat::multimode(str_split('aabbbbccddddeeffffgg'));\n        $this->assertNotNull($result);\n        $this->assertEquals(['b', 'd', 'f'], $result);\n        $this->assertCount(3, $result);\n        $this->assertEquals('b', $result[0]);\n        $this->assertEquals('d', $result[1]);\n        $this->assertEquals('f', $result[2]);\n    }\n\n    public function test_calculates_multimode_with_empty_array(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::multimode([]);\n    }\n\n    public function test_calculates_population_standard_deviation(): void\n    {\n        $this->assertEquals(0.986893273527251, Stat::pstdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75]));\n        $this->assertEquals(2.4495, Stat::pstdev([1, 2, 4, 5, 8], 4));\n        $this->assertEquals(0, Stat::pstdev([1]));\n        $this->assertEquals(0.8291562, Stat::pstdev([1, 2, 3, 3], 7));\n    }\n\n    public function test_calculates_population_standard_deviation_with_empty_array(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::pstdev([]);\n    }\n\n    public function test_calculates_sample_standard_deviation(): void\n    {\n        $this->assertEquals(1.0810874155219827, Stat::stdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75]));\n        $this->assertEquals(2, Stat::stdev([1, 2, 2, 4, 6]));\n        $this->assertEquals(2.7386, Stat::stdev([1, 2, 4, 5, 8], 4));\n    }\n\n    public function test_calculates_sample_standard_deviation_with_empty_array(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::stdev([]);\n    }\n\n    public function test_calculates_sample_standard_deviation_with_single_element(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        
Stat::stdev([1]);\n    }\n\n    public function test_calculates_variance(): void\n    {\n        $this->assertEquals(1.3720238095238095, Stat::variance([2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]));\n    }\n\n    public function test_calculates_variance_with_precomputed_mean(): void\n    {\n        $data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5];\n        $mean = Stat::mean($data);\n        $this->assertEquals(\n            Stat::variance($data),\n            Stat::variance($data, xbar: $mean),\n        );\n    }\n\n    public function test_calculates_pvariance(): void\n    {\n        $this->assertEquals(1.25, Stat::pvariance([0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25]));\n        $this->assertEquals(0.6875, Stat::pvariance([1, 2, 3, 3]));\n    }\n\n    public function test_calculates_pvariance_with_precomputed_mean(): void\n    {\n        $data = [0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25];\n        $mean = Stat::mean($data);\n        $this->assertEquals(\n            Stat::pvariance($data),\n            Stat::pvariance($data, mu: $mean),\n        );\n    }\n\n    public function test_calculates_skewness_symmetric(): void\n    {\n        $this->assertEqualsWithDelta(0.0, Stat::skewness([1, 2, 3, 4, 5]), 1e-10);\n    }\n\n    public function test_calculates_skewness_right_skewed(): void\n    {\n        $skewness = Stat::skewness([1, 1, 1, 1, 1, 10]);\n        $this->assertGreaterThan(0, $skewness);\n    }\n\n    public function test_calculates_skewness_left_skewed(): void\n    {\n        $skewness = Stat::skewness([1, 10, 10, 10, 10, 10]);\n        $this->assertLessThan(0, $skewness);\n    }\n\n    public function test_calculates_skewness_with_rounding(): void\n    {\n        $skewness = Stat::skewness([1, 1, 1, 1, 1, 10], 4);\n        $this->assertGreaterThan(0, $skewness);\n        $this->assertEquals(round($skewness, 4), $skewness);\n    }\n\n    public function test_skewness_with_empty_array(): void\n    {\n        
$this->expectException(InvalidDataInputException::class);\n        Stat::skewness([]);\n    }\n\n    public function test_skewness_with_two_elements(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::skewness([1, 2]);\n    }\n\n    public function test_skewness_with_identical_values(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::skewness([5, 5, 5, 5]);\n    }\n\n    public function test_calculates_pskewness_symmetric(): void\n    {\n        $this->assertEqualsWithDelta(0.0, Stat::pskewness([1, 2, 3, 4, 5]), 1e-10);\n    }\n\n    public function test_calculates_pskewness_right_skewed(): void\n    {\n        $pskewness = Stat::pskewness([1, 1, 1, 1, 1, 10]);\n        $this->assertGreaterThan(0, $pskewness);\n    }\n\n    public function test_calculates_pskewness_left_skewed(): void\n    {\n        $pskewness = Stat::pskewness([1, 10, 10, 10, 10, 10]);\n        $this->assertLessThan(0, $pskewness);\n    }\n\n    public function test_calculates_pskewness_with_rounding(): void\n    {\n        $pskewness = Stat::pskewness([1, 1, 1, 1, 1, 10], 4);\n        $this->assertGreaterThan(0, $pskewness);\n        $this->assertEquals(round($pskewness, 4), $pskewness);\n    }\n\n    public function test_pskewness_with_empty_array(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::pskewness([]);\n    }\n\n    public function test_pskewness_with_two_elements(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::pskewness([1, 2]);\n    }\n\n    public function test_pskewness_with_identical_values(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::pskewness([5, 5, 5, 5]);\n    }\n\n    public function test_pskewness_less_than_skewness_for_small_samples(): void\n    {\n        $data = [1, 1, 1, 1, 1, 10];\n        $skewness = Stat::skewness($data);\n        
$pskewness = Stat::pskewness($data);\n        // Population skewness magnitude is smaller than sample skewness for small n\n        $this->assertLessThan(abs($skewness), abs($pskewness));\n    }\n\n    public function test_calculates_kurtosis_normal_like(): void\n    {\n        // A uniform-ish symmetric dataset: excess kurtosis near 0 or negative\n        $this->assertEqualsWithDelta(-1.2, Stat::kurtosis([1, 2, 3, 4, 5]), 0.1);\n    }\n\n    public function test_calculates_kurtosis_heavy_tails(): void\n    {\n        // Data with outliers should have positive excess kurtosis (leptokurtic)\n        $kurtosis = Stat::kurtosis([1, 2, 2, 2, 2, 2, 2, 2, 2, 50]);\n        $this->assertGreaterThan(0, $kurtosis);\n    }\n\n    public function test_calculates_kurtosis_light_tails(): void\n    {\n        // Uniform-like data should have negative excess kurtosis (platykurtic)\n        $kurtosis = Stat::kurtosis([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);\n        $this->assertLessThan(0, $kurtosis);\n    }\n\n    public function test_calculates_kurtosis_with_rounding(): void\n    {\n        $kurtosis = Stat::kurtosis([1, 2, 2, 2, 2, 2, 2, 2, 2, 50], 4);\n        $this->assertEquals(round($kurtosis, 4), $kurtosis);\n    }\n\n    public function test_kurtosis_with_empty_array(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::kurtosis([]);\n    }\n\n    public function test_kurtosis_with_three_elements(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::kurtosis([1, 2, 3]);\n    }\n\n    public function test_kurtosis_with_identical_values(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::kurtosis([5, 5, 5, 5]);\n    }\n\n    public function test_calculates_geometric_mean(): void\n    {\n        $this->assertEquals(36, Stat::geometricMean([54, 24, 36], 2));\n    }\n\n    public function test_calculates_geometric_mean_with_empty_array(): void\n    {\n   
     $this->expectException(InvalidDataInputException::class);\n        Stat::geometricMean([]);\n    }\n\n    public function test_calculates_harmonic_mean(): void\n    {\n        $this->assertEquals(48, Stat::harmonicMean([40, 60], round: 2));\n        $this->assertEquals(0, Stat::harmonicMean([10, 100, 0, 1]));\n        $this->assertEquals(56, Stat::harmonicMean([40, 60], [5, 30]));\n        $this->assertEquals(52.2, Stat::harmonicMean([60, 40], [7, 3], 1));\n    }\n\n    public function test_calculates_harmonic_mean_with_empty_array(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::harmonicMean([]);\n    }\n\n    public function test_calculates_quantiles(): void\n    {\n        $q = Stat::quantiles([98, 90, 70, 18, 92, 92, 55, 83, 45, 95, 88, 76]);\n        $this->assertEquals(58.75, $q[0]);\n        $this->assertEquals(85.5, $q[1]);\n        $this->assertEquals(92, $q[2]);\n\n        $q = Stat::quantiles([98, 90, 70, 18, 92, 92, 55, 83, 45, 95, 88]);\n        $this->assertEquals(55, $q[0]);\n        $this->assertEquals(88, $q[1]);\n        $this->assertEquals(92, $q[2]);\n\n        $q = Stat::quantiles([1, 2]);\n        $this->assertEquals(0.75, $q[0]);\n        $this->assertEquals(1.5, $q[1]);\n        $this->assertEquals(2.25, $q[2]);\n\n        $q = Stat::quantiles([1, 2, 4]);\n        $this->assertEquals(1, $q[0]);\n        $this->assertEquals(2, $q[1]);\n        $this->assertEquals(4, $q[2]);\n    }\n\n    public function test_calculates_quantiles_with_too_few_elements(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::quantiles([1]);\n    }\n\n    public function test_calculates_quantiles_with_invalid_n(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::quantiles([1, 2, 3], 0);\n    }\n\n    public function test_calculates_quantiles_inclusive(): void\n    {\n        $q = Stat::quantiles([1, 2, 3, 4, 5], method: 
'inclusive');\n        $this->assertEquals(2.0, $q[0]);\n        $this->assertEquals(3.0, $q[1]);\n        $this->assertEquals(4.0, $q[2]);\n\n        $q = Stat::quantiles([1, 2, 3, 4, 5], 10, method: 'inclusive');\n        $this->assertEquals(1.4, $q[0]);\n        $this->assertEquals(1.8, $q[1]);\n        $this->assertEquals(2.2, $q[2]);\n        $this->assertEquals(2.6, $q[3]);\n        $this->assertEquals(3.0, $q[4]);\n        $this->assertEquals(3.4, $q[5]);\n        $this->assertEquals(3.8, $q[6]);\n        $this->assertEquals(4.2, $q[7]);\n        $this->assertEquals(4.6, $q[8]);\n    }\n\n    public function test_calculates_quantiles_with_invalid_method(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::quantiles([1, 2, 3], method: 'invalid');\n    }\n\n    public function test_calculates_first_quartile(): void\n    {\n        $q = Stat::firstQuartile([98, 90, 70, 18, 92, 92, 55, 83, 45, 95, 88, 76]);\n        $this->assertEquals(58.75, $q);\n    }\n\n    public function test_calculates_first_quartile_with_empty_array(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::firstQuartile([]);\n    }\n\n    public function test_calculates_covariance(): void\n    {\n        $covariance = Stat::covariance(\n            [1, 2, 3, 4, 5, 6, 7, 8, 9],\n            [1, 2, 3, 1, 2, 3, 1, 2, 3],\n        );\n        $this->assertEquals(0.75, $covariance);\n\n        $covariance = Stat::covariance(\n            [9, 8, 7, 6, 5, 4, 3, 2, 1],\n            [1, 2, 3, 4, 5, 6, 7, 8, 9],\n        );\n        $this->assertEquals(-7.5, $covariance);\n\n        $covariance = Stat::covariance(\n            [1, 2, 3, 4, 5, 6, 7, 8, 9],\n            [9, 8, 7, 6, 5, 4, 3, 2, 1],\n        );\n        $this->assertEquals(-7.5, $covariance);\n    }\n\n    public function test_calculates_covariance_wrong_usage(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        
Stat::covariance(\n            [9, 8, 7, 6, 5, 4, 3, 2, 1],\n            [1, 2, 3, 4, 5, 6, 7, 8],\n        );\n    }\n\n    public function test_calculates_covariance_with_empty_arrays(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::covariance([], []);\n    }\n\n    public function test_calculates_covariance_with_single_element(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::covariance([3], [3]);\n    }\n\n    public function test_calculates_covariance_with_non_numeric_first(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        // Intentionally passing non-numeric values to test exception handling\n        Stat::covariance(['a', 1], ['b', 2]); // @phpstan-ignore argument.type, argument.type\n    }\n\n    public function test_calculates_covariance_with_non_numeric_second(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        // Intentionally passing non-numeric values to test exception handling\n        Stat::covariance([3, 1], ['b', 2]); // @phpstan-ignore argument.type\n    }\n\n    public function test_calculates_correlation(): void\n    {\n        $correlation = Stat::correlation(\n            [1, 2, 3, 4, 5, 6, 7, 8, 9],\n            [1, 2, 3, 4, 5, 6, 7, 8, 9],\n        );\n        $this->assertIsFloat($correlation);\n        $this->assertEquals(1, $correlation);\n\n        $correlation = Stat::correlation(\n            [1, 2, 3, 4, 5, 6, 7, 8, 9],\n            [9, 8, 7, 6, 5, 4, 3, 2, 1],\n        );\n        $this->assertIsFloat($correlation);\n        $this->assertEquals(-1, $correlation);\n\n        $correlation = Stat::correlation(\n            [3, 6, 9],\n            [70, 75, 80],\n        );\n        $this->assertIsFloat($correlation);\n        $this->assertEquals(1, $correlation);\n\n        $correlation = Stat::correlation(\n            [20, 23, 8, 29, 14, 11, 11, 20, 17, 17],\n     
       [30, 35, 21, 33, 33, 26, 22, 31, 33, 36],\n        );\n        $this->assertIsFloat($correlation);\n        $this->assertEquals(0.71, $correlation);\n    }\n\n    public function test_calculates_spearman_correlation(): void\n    {\n        // Monotonic relationship: ranks are perfectly correlated\n        $correlation = Stat::correlation(\n            [1, 2, 3, 4, 5],\n            [2, 4, 6, 8, 10],\n            'ranked',\n        );\n        $this->assertIsFloat($correlation);\n        $this->assertEqualsWithDelta(1.0, $correlation, 1e-9);\n\n        // Inverse monotonic relationship\n        $correlation = Stat::correlation(\n            [1, 2, 3, 4, 5],\n            [10, 8, 6, 4, 2],\n            'ranked',\n        );\n        $this->assertIsFloat($correlation);\n        $this->assertEqualsWithDelta(-1.0, $correlation, 1e-9);\n\n        // Non-linear but monotonic: Spearman = 1, Pearson < 1\n        $correlation = Stat::correlation(\n            [1, 2, 3, 4, 5],\n            [1, 4, 9, 16, 25],\n            'ranked',\n        );\n        $this->assertIsFloat($correlation);\n        $this->assertEqualsWithDelta(1.0, $correlation, 1e-9);\n    }\n\n    public function test_calculates_spearman_correlation_planets(): void\n    {\n        // Python docs example: planetary orbital periods and distances from the sun\n        $orbitalPeriod = [88, 225, 365, 687, 4331, 10_756, 30_687, 60_190];\n        $distFromSun = [58, 108, 150, 228, 778, 1_400, 2_900, 4_500];\n\n        // Perfect monotonic relationship → Spearman = 1.0\n        $correlation = Stat::correlation($orbitalPeriod, $distFromSun, 'ranked');\n        $this->assertEqualsWithDelta(1.0, $correlation, 1e-9);\n\n        // Linear (Pearson) correlation is imperfect\n        $correlation = Stat::correlation($orbitalPeriod, $distFromSun);\n        $this->assertIsFloat($correlation);\n        $this->assertEquals(0.9882, round($correlation, 4));\n\n        // Kepler's third law: linear correlation between\n       
 // the square of the period and the cube of the distance\n        $periodSquared = array_map(fn(int $p): int => $p * $p, $orbitalPeriod);\n        $distCubed = array_map(fn(int $d): int => $d * $d * $d, $distFromSun);\n        $correlation = Stat::correlation($periodSquared, $distCubed);\n        $this->assertIsFloat($correlation);\n        $this->assertEquals(1.0, round($correlation, 4));\n    }\n\n    public function test_calculates_spearman_correlation_with_ties(): void\n    {\n        // Ties should receive average ranks\n        $correlation = Stat::correlation(\n            [1, 2, 2, 3],\n            [10, 20, 20, 30],\n            'ranked',\n        );\n        $this->assertIsFloat($correlation);\n        $this->assertEqualsWithDelta(1.0, $correlation, 1e-9);\n    }\n\n    public function test_calculates_correlation_invalid_method(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::correlation(\n            [1, 2, 3],\n            [4, 5, 6],\n            'invalid',\n        );\n    }\n\n    public function test_calculates_correlation_wrong_usage_different_lengths(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::correlation(\n            [9, 8, 7, 6, 5, 4, 3, 2, 1],\n            [1, 2, 3, 4, 5, 6, 7, 8],\n        );\n    }\n\n    public function test_calculates_correlation_wrong_usage_empty(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::correlation([], []);\n    }\n\n    public function test_calculates_correlation_wrong_usage_single(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::correlation([3], [3]);\n    }\n\n    public function test_calculates_correlation_wrong_usage_constant(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::correlation([3, 1, 2], [2, 2, 2]);\n    }\n\n    public function 
test_calculates_linear_regression(): void\n    {\n        [$slope, $intercept] = Stat::linearRegression(\n            [1971, 1975, 1979, 1982, 1983],\n            [1, 2, 3, 4, 5],\n        );\n        $this->assertIsFloat($slope);\n        $this->assertEquals(0.31, $slope);\n        $this->assertIsFloat($intercept);\n        $this->assertEquals(-610.18, round($intercept, 2));\n\n        [$slope, $intercept] = Stat::linearRegression(\n            [1971, 1975, 1979, 1982, 1983],\n            [1, 2, 1, 3, 1],\n        );\n        $this->assertIsFloat($slope);\n        $this->assertEquals(0.05, $slope);\n        $this->assertIsFloat($intercept);\n        $this->assertEquals(-97.3, $intercept);\n        $this->assertEquals(4, round($slope * 2019 + $intercept));\n    }\n\n    public function test_calculates_linear_regression_with_single_element(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::linearRegression([3], [2]);\n    }\n\n    public function test_calculates_linear_regression_with_different_lengths(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::linearRegression([3, 3, 3, 3], [2, 1, 1, 1, 1]);\n    }\n\n    public function test_calculates_linear_regression_with_constant_x(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::linearRegression([3, 3, 3, 3, 3], [1, 1, 1, 1, 1]);\n    }\n\n    public function test_calculates_proportional_linear_regression(): void\n    {\n        [$slope, $intercept] = Stat::linearRegression(\n            [1, 2, 3, 4, 5],\n            [2, 4, 6, 8, 10],\n            proportional: true,\n        );\n        $this->assertIsFloat($slope);\n        $this->assertEquals(2.0, $slope);\n        $this->assertSame(0.0, $intercept);\n\n        [$slope, $intercept] = Stat::linearRegression(\n            [1, 2, 3, 4, 5],\n            [3, 5, 7, 9, 11],\n            proportional: true,\n        );\n        
$this->assertIsFloat($slope);\n        $this->assertEquals(2.27, round($slope, 2));\n        $this->assertSame(0.0, $intercept);\n    }\n\n    public function test_proportional_linear_regression_with_all_zeros_x(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::linearRegression([0, 0, 0, 0, 0], [1, 2, 3, 4, 5], proportional: true);\n    }\n\n    public function test_r_squared_perfect_fit(): void\n    {\n        $r2 = Stat::rSquared([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]);\n        $this->assertEqualsWithDelta(1.0, $r2, 1e-10);\n    }\n\n    public function test_r_squared_real_data(): void\n    {\n        $r2 = Stat::rSquared(\n            [1971, 1975, 1979, 1982, 1983],\n            [1, 2, 3, 4, 5],\n        );\n        $this->assertEqualsWithDelta(0.961, round($r2, 4), 1e-4);\n    }\n\n    public function test_r_squared_with_rounding(): void\n    {\n        $r2 = Stat::rSquared(\n            [1971, 1975, 1979, 1982, 1983],\n            [1, 2, 3, 4, 5],\n            round: 2,\n        );\n        $this->assertSame(0.96, $r2);\n    }\n\n    public function test_r_squared_proportional(): void\n    {\n        $r2 = Stat::rSquared(\n            [1, 2, 3, 4, 5],\n            [2, 4, 6, 8, 10],\n            proportional: true,\n        );\n        $this->assertEqualsWithDelta(1.0, $r2, 1e-10);\n    }\n\n    public function test_r_squared_with_different_lengths(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::rSquared([1, 2, 3], [1, 2]);\n    }\n\n    public function test_r_squared_with_single_element(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::rSquared([1], [2]);\n    }\n\n    public function test_r_squared_with_constant_y(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::rSquared([1, 2, 3, 4, 5], [3, 3, 3, 3, 3]);\n    }\n\n    public function test_logarithmic_regression(): void\n    
{\n        // y = 10 * ln(x) + 5\n        $x = [1, 2, 3, 4, 5, 6, 7, 8];\n        $y = array_map(fn(int $v): float => 10 * log($v) + 5, $x);\n\n        [$a, $b] = Stat::logarithmicRegression($x, $y);\n        $this->assertEqualsWithDelta(10.0, $a, 1e-10);\n        $this->assertEqualsWithDelta(5.0, $b, 1e-10);\n    }\n\n    public function test_logarithmic_regression_running_pace(): void\n    {\n        // Simulated weekly running pace (seconds/km) — diminishing improvement\n        $weeks = [1, 2, 3, 4, 5, 6, 7, 8];\n        $paces = [350, 342, 337, 333, 330, 328, 326, 325];\n\n        [$a, $b] = Stat::logarithmicRegression($weeks, $paces);\n        // a should be negative (improvement = pace decreasing)\n        $this->assertLessThan(0, $a);\n        // Prediction for week 12 using log model should be more conservative\n        // than linear (higher pace = slower, closer to reality)\n        $logPrediction = $a * log(12) + $b;\n        [$slope, $intercept] = Stat::linearRegression($weeks, $paces);\n        $linearPrediction = $slope * 12 + $intercept;\n        $this->assertGreaterThan($linearPrediction, $logPrediction);\n    }\n\n    public function test_logarithmic_regression_diminishing_values(): void\n    {\n        $x = range(1, 15);\n        $y = [59, 50, 44, 38, 33, 28, 23, 20, 17, 15, 13, 12, 11, 10, 9.5];\n\n        [$a, $b] = Stat::logarithmicRegression($x, $y);\n        $this->assertEqualsWithDelta(-20.1987, $a, 0.001);\n        $this->assertEqualsWithDelta(63.0686, $b, 0.001);\n    }\n\n    public function test_logarithmic_regression_with_non_positive_x(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::logarithmicRegression([0, 1, 2], [1, 2, 3]);\n    }\n\n    public function test_logarithmic_regression_with_negative_x(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::logarithmicRegression([-1, 1, 2], [1, 2, 3]);\n    }\n\n    public function 
test_power_regression(): void\n    {\n        // y = 3 * x^2\n        $x = [1, 2, 3, 4, 5];\n        $y = array_map(fn(int $v): int => 3 * $v ** 2, $x);\n\n        [$a, $b] = Stat::powerRegression($x, $y);\n        $this->assertEqualsWithDelta(3.0, $a, 1e-10);\n        $this->assertEqualsWithDelta(2.0, $b, 1e-10);\n    }\n\n    public function test_power_regression_with_non_positive_x(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::powerRegression([0, 1, 2], [1, 2, 3]);\n    }\n\n    public function test_power_regression_with_non_positive_y(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::powerRegression([1, 2, 3], [0, 2, 3]);\n    }\n\n    public function test_exponential_regression(): void\n    {\n        // y = 2 * e^(0.5*x)\n        $x = [1, 2, 3, 4, 5];\n        $y = array_map(fn(int $v): float => 2 * exp(0.5 * $v), $x);\n\n        [$a, $b] = Stat::exponentialRegression($x, $y);\n        $this->assertEqualsWithDelta(2.0, $a, 1e-10);\n        $this->assertEqualsWithDelta(0.5, $b, 1e-10);\n    }\n\n    public function test_exponential_regression_with_non_positive_y(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::exponentialRegression([1, 2, 3], [0, 2, 3]);\n    }\n\n    public function test_confidence_interval_95(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        [$lower, $upper] = Stat::confidenceInterval($data);\n        // mean = 5.0, stdev ≈ 2.1381, sem ≈ 0.7559, z = 1.96\n        // margin ≈ 1.4815\n        $this->assertEqualsWithDelta(3.5185, $lower, 0.01);\n        $this->assertEqualsWithDelta(6.4815, $upper, 0.01);\n    }\n\n    public function test_confidence_interval_99(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        [$lower, $upper] = Stat::confidenceInterval($data, confidenceLevel: 0.99);\n        // 99% CI is wider than 95% CI\n        $this->assertLessThan(3.5, 
$lower);\n        $this->assertGreaterThan(6.5, $upper);\n    }\n\n    public function test_confidence_interval_with_rounding(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        [$lower, $upper] = Stat::confidenceInterval($data, round: 2);\n        $this->assertSame(3.52, $lower);\n        $this->assertSame(6.48, $upper);\n    }\n\n    public function test_confidence_interval_narrows_with_more_data(): void\n    {\n        $small = [2, 4, 4, 4, 5, 5, 7, 9];\n        $large = [2, 4, 4, 4, 5, 5, 7, 9, 3, 4, 5, 6, 4, 5, 6, 5];\n        [$sLower, $sUpper] = Stat::confidenceInterval($small);\n        [$lLower, $lUpper] = Stat::confidenceInterval($large);\n        $this->assertLessThan($sUpper - $sLower, $lUpper - $lLower);\n    }\n\n    public function test_confidence_interval_single_element_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::confidenceInterval([42]);\n    }\n\n    public function test_confidence_interval_empty_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::confidenceInterval([]);\n    }\n\n    public function test_confidence_interval_invalid_confidence_level_throws(): void\n    {\n        $data = [1, 2, 3, 4, 5];\n        $this->expectException(InvalidDataInputException::class);\n        Stat::confidenceInterval($data, confidenceLevel: 0.0);\n    }\n\n    public function test_confidence_interval_confidence_level_one_throws(): void\n    {\n        $data = [1, 2, 3, 4, 5];\n        $this->expectException(InvalidDataInputException::class);\n        Stat::confidenceInterval($data, confidenceLevel: 1.0);\n    }\n\n    public function test_confidence_interval_confidence_level_above_one_throws(): void\n    {\n        $data = [1, 2, 3, 4, 5];\n        $this->expectException(InvalidDataInputException::class);\n        Stat::confidenceInterval($data, confidenceLevel: 1.5);\n    }\n\n    public function 
test_confidence_interval_negative_confidence_level_throws(): void\n    {\n        $data = [1, 2, 3, 4, 5];\n        $this->expectException(InvalidDataInputException::class);\n        Stat::confidenceInterval($data, confidenceLevel: -0.1);\n    }\n\n    // --- zTest ---\n\n    public function test_z_test_two_sided(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        $result = Stat::zTest($data, 3.0);\n        // mean = 5.0, sem = stdev/sqrt(8) ≈ 0.7559\n        // zScore = (5.0 - 3.0) / 0.7559 ≈ 2.6458\n        $this->assertArrayHasKey('zScore', $result);\n        $this->assertArrayHasKey('pValue', $result);\n        $this->assertEqualsWithDelta(2.6458, $result['zScore'], 0.001);\n        $this->assertLessThan(0.05, $result['pValue']);\n    }\n\n    public function test_z_test_greater(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        $result = Stat::zTest($data, 3.0, Alternative::Greater);\n        // One-tailed p-value should be roughly half the two-sided\n        $twoSided = Stat::zTest($data, 3.0);\n        $this->assertEqualsWithDelta($twoSided['pValue'] / 2, $result['pValue'], 0.001);\n    }\n\n    public function test_z_test_less(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        // mean=5 > populationMean=3, so P(Z < zScore) should be large\n        $result = Stat::zTest($data, 3.0, Alternative::Less);\n        $this->assertGreaterThan(0.95, $result['pValue']);\n    }\n\n    public function test_z_test_non_significant(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        // populationMean equals the sample mean (5.0)\n        $result = Stat::zTest($data, 5.0);\n        $this->assertEqualsWithDelta(0.0, $result['zScore'], 1e-10);\n        $this->assertEqualsWithDelta(1.0, $result['pValue'], 0.01);\n    }\n\n    public function test_z_test_with_rounding(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        $result = Stat::zTest($data, 3.0, round: 2);\n        
$this->assertSame(round($result['zScore'], 2), $result['zScore']);\n        $this->assertSame(round($result['pValue'], 2), $result['pValue']);\n    }\n\n    public function test_z_test_single_element_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::zTest([42], 40.0);\n    }\n\n    public function test_z_test_empty_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::zTest([], 0.0);\n    }\n\n    // --- tTest ---\n\n    public function test_t_test_two_sided(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        $result = Stat::tTest($data, 3.0);\n        // mean = 5.0, sem = stdev/sqrt(8) ≈ 0.7559\n        // tStatistic = (5.0 - 3.0) / 0.7559 ≈ 2.6458\n        $this->assertArrayHasKey('tStatistic', $result);\n        $this->assertArrayHasKey('pValue', $result);\n        $this->assertArrayHasKey('degreesOfFreedom', $result);\n        $this->assertEqualsWithDelta(2.6458, $result['tStatistic'], 0.001);\n        $this->assertLessThan(0.05, $result['pValue']);\n        $this->assertEquals(7, $result['degreesOfFreedom']);\n    }\n\n    public function test_t_test_greater(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        $result = Stat::tTest($data, 3.0, Alternative::Greater);\n        // One-tailed p-value should be roughly half the two-sided\n        $twoSided = Stat::tTest($data, 3.0);\n        $this->assertEqualsWithDelta($twoSided['pValue'] / 2, $result['pValue'], 0.001);\n    }\n\n    public function test_t_test_less(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        // mean=5 > populationMean=3, so P(T < tStatistic) should be large\n        $result = Stat::tTest($data, 3.0, Alternative::Less);\n        $this->assertGreaterThan(0.95, $result['pValue']);\n    }\n\n    public function test_t_test_non_significant(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        // populationMean equals the sample mean 
(5.0)\n        $result = Stat::tTest($data, 5.0);\n        $this->assertEqualsWithDelta(0.0, $result['tStatistic'], 1e-10);\n        $this->assertEqualsWithDelta(1.0, $result['pValue'], 0.01);\n    }\n\n    public function test_t_test_degrees_of_freedom(): void\n    {\n        $data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];\n        $result = Stat::tTest($data, 5.0);\n        $this->assertEquals(9, $result['degreesOfFreedom']);\n    }\n\n    public function test_t_test_large_sample_converges_to_z_test(): void\n    {\n        // With a large sample, t-test p-value should approximate z-test p-value\n        $data = range(1, 100);\n        $tResult = Stat::tTest($data, 45.0);\n        $zResult = Stat::zTest($data, 45.0);\n        $this->assertEqualsWithDelta($zResult['pValue'], $tResult['pValue'], 0.01);\n    }\n\n    public function test_t_test_with_rounding(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        $result = Stat::tTest($data, 3.0, round: 2);\n        $this->assertSame(round($result['tStatistic'], 2), $result['tStatistic']);\n        $this->assertSame(round($result['pValue'], 2), $result['pValue']);\n    }\n\n    public function test_t_test_single_element_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::tTest([42], 40.0);\n    }\n\n    public function test_t_test_empty_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::tTest([], 0.0);\n    }\n\n    // --- tTestTwoSample (Welch's t-test) ---\n\n    public function test_t_test_two_sample_two_sided(): void\n    {\n        // Two clearly different groups\n        $group1 = [30.02, 29.99, 30.11, 29.97, 30.01, 29.99];\n        $group2 = [29.89, 29.93, 29.72, 29.98, 30.02, 29.98];\n        $result = Stat::tTestTwoSample($group1, $group2);\n        $this->assertArrayHasKey('tStatistic', $result);\n        $this->assertArrayHasKey('pValue', $result);\n        
$this->assertArrayHasKey('degreesOfFreedom', $result);\n        // group1 mean > group2 mean, so tStatistic should be positive\n        $this->assertGreaterThan(0, $result['tStatistic']);\n    }\n\n    public function test_t_test_two_sample_equal_means(): void\n    {\n        $group1 = [1, 2, 3, 4, 5];\n        $group2 = [1, 2, 3, 4, 5];\n        $result = Stat::tTestTwoSample($group1, $group2);\n        $this->assertEqualsWithDelta(0.0, $result['tStatistic'], 1e-10);\n        $this->assertEqualsWithDelta(1.0, $result['pValue'], 0.01);\n    }\n\n    public function test_t_test_two_sample_significant_difference(): void\n    {\n        // Clearly different groups\n        $group1 = [10, 11, 12, 13, 14];\n        $group2 = [20, 21, 22, 23, 24];\n        $result = Stat::tTestTwoSample($group1, $group2);\n        $this->assertLessThan(0.001, $result['pValue']);\n        // group1 mean < group2 mean\n        $this->assertLessThan(0, $result['tStatistic']);\n    }\n\n    public function test_t_test_two_sample_greater(): void\n    {\n        $group1 = [10, 11, 12, 13, 14];\n        $group2 = [5, 6, 7, 8, 9];\n        $result = Stat::tTestTwoSample($group1, $group2, Alternative::Greater);\n        // group1 mean > group2 mean, so one-tailed should be very significant\n        $this->assertLessThan(0.001, $result['pValue']);\n    }\n\n    public function test_t_test_two_sample_less(): void\n    {\n        $group1 = [10, 11, 12, 13, 14];\n        $group2 = [5, 6, 7, 8, 9];\n        // group1 mean > group2 mean, so testing \"less\" should give large p-value\n        $result = Stat::tTestTwoSample($group1, $group2, Alternative::Less);\n        $this->assertGreaterThan(0.95, $result['pValue']);\n    }\n\n    public function test_t_test_two_sample_unequal_sizes(): void\n    {\n        $group1 = [1, 2, 3, 4, 5, 6, 7, 8];\n        $group2 = [3, 4, 5];\n        $result = Stat::tTestTwoSample($group1, $group2);\n        $this->assertArrayHasKey('tStatistic', $result);\n        
$this->assertArrayHasKey('pValue', $result);\n        $this->assertArrayHasKey('degreesOfFreedom', $result);\n    }\n\n    public function test_t_test_two_sample_with_rounding(): void\n    {\n        $group1 = [30.02, 29.99, 30.11, 29.97, 30.01, 29.99];\n        $group2 = [29.89, 29.93, 29.72, 29.98, 30.02, 29.98];\n        $result = Stat::tTestTwoSample($group1, $group2, round: 3);\n        $this->assertSame(round($result['tStatistic'], 3), $result['tStatistic']);\n        $this->assertSame(round($result['pValue'], 3), $result['pValue']);\n    }\n\n    public function test_t_test_two_sample_welch_df(): void\n    {\n        // With unequal variances, Welch df should differ from simple pooled df\n        $group1 = [1, 2, 3, 4, 5]; // n=5, var=2.5\n        $group2 = [10, 20, 30, 40, 50]; // n=5, var=250\n        $result = Stat::tTestTwoSample($group1, $group2);\n        // Welch df should be less than n1+n2-2 = 8\n        $this->assertLessThan(8, $result['degreesOfFreedom']);\n        $this->assertGreaterThan(0, $result['degreesOfFreedom']);\n    }\n\n    public function test_t_test_two_sample_single_element_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::tTestTwoSample([42], [1, 2, 3]);\n    }\n\n    public function test_t_test_two_sample_empty_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::tTestTwoSample([], [1, 2, 3]);\n    }\n\n    public function test_t_test_two_sample_zero_variance_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::tTestTwoSample([5, 5, 5], [5, 5, 5]);\n    }\n\n    // --- tTestPaired ---\n\n    public function test_t_test_paired_two_sided(): void\n    {\n        // Before and after treatment\n        $before = [200, 190, 210, 220, 215, 205, 195, 225];\n        $after = [192, 186, 198, 212, 208, 198, 188, 215];\n        $result = Stat::tTestPaired($before, $after);\n        
$this->assertArrayHasKey('tStatistic', $result);\n        $this->assertArrayHasKey('pValue', $result);\n        $this->assertArrayHasKey('degreesOfFreedom', $result);\n        // df should be n-1 = 7\n        $this->assertEquals(7, $result['degreesOfFreedom']);\n        // Before values are generally higher, so tStatistic should be positive\n        $this->assertGreaterThan(0, $result['tStatistic']);\n    }\n\n    public function test_t_test_paired_no_difference(): void\n    {\n        // Near-equal pairs with small random differences\n        $data1 = [1.0, 2.0, 3.0, 4.0, 5.0];\n        $data2 = [1.1, 1.9, 3.1, 3.9, 5.1];\n        $result = Stat::tTestPaired($data1, $data2);\n        // Differences are very small and mixed, p-value should be large\n        $this->assertGreaterThan(0.05, $result['pValue']);\n    }\n\n    public function test_t_test_paired_significant(): void\n    {\n        // Clear systematic difference with some variation in the differences\n        $before = [10, 12, 14, 16, 18];\n        $after = [15, 18, 20, 22, 24];\n        $result = Stat::tTestPaired($before, $after);\n        // Differences: -5, -6, -6, -6, -6 — large and consistent\n        $this->assertLessThan(0.001, $result['pValue']);\n        $this->assertEquals(4, $result['degreesOfFreedom']);\n    }\n\n    public function test_t_test_paired_greater(): void\n    {\n        $before = [10, 12, 14, 16, 18];\n        $after = [15, 18, 20, 22, 24];\n        // before < after, so mean diff is negative; testing \"greater\" should give large p\n        $result = Stat::tTestPaired($before, $after, Alternative::Greater);\n        $this->assertGreaterThan(0.95, $result['pValue']);\n    }\n\n    public function test_t_test_paired_less(): void\n    {\n        $before = [10, 12, 14, 16, 18];\n        $after = [15, 18, 20, 22, 24];\n        // before < after, so mean diff is negative; testing \"less\" should be significant\n        $result = Stat::tTestPaired($before, $after, Alternative::Less);\n 
       $this->assertLessThan(0.001, $result['pValue']);\n    }\n\n    public function test_t_test_paired_with_rounding(): void\n    {\n        $before = [200, 190, 210, 220, 215, 205, 195, 225];\n        $after = [192, 186, 198, 212, 208, 198, 188, 215];\n        $result = Stat::tTestPaired($before, $after, round: 3);\n        $this->assertSame(round($result['tStatistic'], 3), $result['tStatistic']);\n        $this->assertSame(round($result['pValue'], 3), $result['pValue']);\n    }\n\n    public function test_t_test_paired_different_lengths_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::tTestPaired([1, 2, 3], [1, 2]);\n    }\n\n    public function test_t_test_paired_single_element_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::tTestPaired([1], [2]);\n    }\n\n    public function test_t_test_paired_empty_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::tTestPaired([], []);\n    }\n\n    public function test_kde_normal(): void\n    {\n        $data = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2];\n        $f = Stat::kde($data, 1.5);\n        $this->assertIsCallable($f);\n\n        $density = $f(2.5);\n        $this->assertIsFloat($density);\n        $this->assertGreaterThan(0, $density);\n\n        // Verify against manually computed value\n        // f(2.5) = (1/(6*1.5)) * sum of K((2.5 - xi)/1.5)\n        $n = count($data);\n        $h = 1.5;\n        $sum = 0.0;\n        $sqrt2pi = sqrt(2.0 * M_PI);\n        foreach ($data as $xi) {\n            $t = (2.5 - $xi) / $h;\n            $sum += exp(-$t * $t / 2.0) / $sqrt2pi;\n        }\n        $expected = $sum / ($n * $h);\n        $this->assertEqualsWithDelta($expected, $density, 1e-10);\n    }\n\n    public function test_kde_all_kernels(): void\n    {\n        $data = [1.0, 2.0, 3.0, 4.0, 5.0];\n\n        foreach (KdeKernel::cases() as $kernel) {\n            $f 
= Stat::kde($data, 1.0, $kernel);\n            $this->assertIsCallable($f, \"Kernel '{$kernel->value}' should return a callable\");\n            $density = $f(3.0);\n            $this->assertGreaterThanOrEqual(0, $density, \"Kernel '{$kernel->value}' density should be >= 0\");\n        }\n    }\n\n    public function test_kde_cumulative(): void\n    {\n        $data = [1.0, 2.0, 3.0, 4.0, 5.0];\n        $F = Stat::kde($data, 1.0, KdeKernel::Normal, cumulative: true);\n        $this->assertIsCallable($F);\n\n        // CDF should be monotonically non-decreasing\n        $prev = $F(-100.0);\n        foreach ([-10.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 100.0] as $x) {\n            $current = $F($x);\n            $this->assertGreaterThanOrEqual($prev, $current, \"CDF should be non-decreasing at x=$x\");\n            $prev = $current;\n        }\n\n        // Approaches 0 far left and 1 far right\n        $this->assertEqualsWithDelta(0.0, $F(-100.0), 0.01);\n        $this->assertEqualsWithDelta(1.0, $F(100.0), 0.01);\n    }\n\n    public function test_kde_aliases(): void\n    {\n        $data = [1.0, 2.0, 3.0, 4.0, 5.0];\n        $x = 3.0;\n\n        // gauss == normal\n        $f1 = Stat::kde($data, 1.0, KdeKernel::Gauss);\n        $f2 = Stat::kde($data, 1.0, KdeKernel::Normal);\n        $this->assertEqualsWithDelta($f1($x), $f2($x), 1e-15);\n\n        // uniform == rectangular\n        $f1 = Stat::kde($data, 1.0, KdeKernel::Uniform);\n        $f2 = Stat::kde($data, 1.0, KdeKernel::Rectangular);\n        $this->assertEqualsWithDelta($f1($x), $f2($x), 1e-15);\n\n        // epanechnikov == parabolic\n        $f1 = Stat::kde($data, 1.0, KdeKernel::Epanechnikov);\n        $f2 = Stat::kde($data, 1.0, KdeKernel::Parabolic);\n        $this->assertEqualsWithDelta($f1($x), $f2($x), 1e-15);\n\n        // biweight == quartic\n        $f1 = Stat::kde($data, 1.0, KdeKernel::Biweight);\n        $f2 = Stat::kde($data, 1.0, KdeKernel::Quartic);\n        
$this->assertEqualsWithDelta($f1($x), $f2($x), 1e-15);\n    }\n\n    public function test_kde_empty_data(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::kde([], 1.0);\n    }\n\n    public function test_kde_invalid_bandwidth(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::kde([1.0, 2.0], 0.0);\n    }\n\n    public function test_kde_invalid_bandwidth_negative(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::kde([1.0, 2.0], -1.0);\n    }\n\n    public function test_kde_random_returns_callable(): void\n    {\n        $data = [1.0, 2.0, 3.0, 4.0, 5.0];\n        $sampler = Stat::kdeRandom($data, 1.0);\n        $this->assertIsCallable($sampler);\n\n        $value = $sampler();\n        $this->assertIsFloat($value);\n    }\n\n    public function test_kde_random_all_kernels(): void\n    {\n        $data = [1.0, 2.0, 3.0, 4.0, 5.0];\n\n        foreach (KdeKernel::cases() as $kernel) {\n            $sampler = Stat::kdeRandom($data, 1.0, $kernel, seed: 42);\n            $this->assertIsCallable($sampler, \"Kernel '{$kernel->value}' should return a callable\");\n            $value = $sampler();\n            $this->assertIsFloat($value, \"Kernel '{$kernel->value}' should return a float\");\n        }\n    }\n\n    public function test_kde_random_seed_reproducibility(): void\n    {\n        $data = [1.0, 2.0, 3.0, 4.0, 5.0];\n\n        $sampler1 = Stat::kdeRandom($data, 1.0, KdeKernel::Normal, seed: 123);\n        $values1 = [];\n        for ($i = 0; $i < 10; $i++) {\n            $values1[] = $sampler1();\n        }\n\n        $sampler2 = Stat::kdeRandom($data, 1.0, KdeKernel::Normal, seed: 123);\n        $values2 = [];\n        for ($i = 0; $i < 10; $i++) {\n            $values2[] = $sampler2();\n        }\n\n        $this->assertSame($values1, $values2);\n    }\n\n    public function test_kde_random_aliases(): void\n    {\n        
$data = [1.0, 2.0, 3.0, 4.0, 5.0];\n        $aliasPairs = [\n            [KdeKernel::Gauss, KdeKernel::Normal],\n            [KdeKernel::Uniform, KdeKernel::Rectangular],\n            [KdeKernel::Epanechnikov, KdeKernel::Parabolic],\n            [KdeKernel::Biweight, KdeKernel::Quartic],\n        ];\n\n        foreach ($aliasPairs as [$alias, $canonical]) {\n            $sampler1 = Stat::kdeRandom($data, 1.0, $alias, seed: 42);\n            $values1 = [];\n            for ($i = 0; $i < 5; $i++) {\n                $values1[] = $sampler1();\n            }\n\n            $sampler2 = Stat::kdeRandom($data, 1.0, $canonical, seed: 42);\n            $values2 = [];\n            for ($i = 0; $i < 5; $i++) {\n                $values2[] = $sampler2();\n            }\n\n            $this->assertSame(\n                $values1,\n                $values2,\n                \"Alias '{$alias->value}' should produce same results as '{$canonical->value}'\",\n            );\n        }\n    }\n\n    public function test_kde_random_known_output(): void\n    {\n        $data = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2];\n        $rand = Stat::kdeRandom($data, 1.5, seed: 8675309);\n        $results = [];\n        for ($i = 0; $i < 10; $i++) {\n            $results[] = round($rand(), 1);\n        }\n\n        $this->assertSame(\n            [2.5, 3.3, -1.8, 7.3, -2.1, 4.6, 4.4, 5.9, -3.2, -1.6],\n            $results,\n        );\n    }\n\n    public function test_kde_random_statistical_properties(): void\n    {\n        $data = [1.0, 2.0, 3.0, 4.0, 5.0];\n        $dataMean = 3.0;\n        $sampler = Stat::kdeRandom($data, 0.5, KdeKernel::Normal, seed: 42);\n\n        $sum = 0.0;\n        $n = 10000;\n        for ($i = 0; $i < $n; $i++) {\n            $sum += $sampler();\n        }\n        $sampleMean = $sum / $n;\n\n        $this->assertEqualsWithDelta($dataMean, $sampleMean, 0.15);\n    }\n\n    public function test_kde_random_empty_data(): void\n    {\n        
$this->expectException(InvalidDataInputException::class);\n        Stat::kdeRandom([], 1.0);\n    }\n\n    public function test_kde_random_invalid_bandwidth(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::kdeRandom([1.0, 2.0], 0.0);\n    }\n\n    public function test_covariance_non_numeric_x_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        // true passes mean()'s string filter and array_sum without warnings,\n        // but is_numeric(true) returns false, triggering the loop guard\n        Stat::covariance([true, 1, 2], [3, 4, 5]); // @phpstan-ignore argument.type\n    }\n\n    public function test_covariance_non_numeric_y_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::covariance([1, 2, 3], [true, 4, 5]); // @phpstan-ignore argument.type\n    }\n\n    public function test_kde_cumulative_bounded_kernels(): void\n    {\n        $data = [1.0, 2.0, 3.0, 4.0, 5.0];\n        $boundedKernels = [\n            KdeKernel::Triangular,\n            KdeKernel::Parabolic,\n            KdeKernel::Rectangular,\n            KdeKernel::Quartic,\n            KdeKernel::Triweight,\n            KdeKernel::Cosine,\n        ];\n\n        foreach ($boundedKernels as $kernel) {\n            $F = Stat::kde($data, 1.0, $kernel, cumulative: true);\n            $this->assertIsCallable($F);\n\n            // CDF must be monotonically non-decreasing\n            $prev = $F(-100.0);\n            foreach ([0.0, 1.0, 3.0, 5.0, 100.0] as $x) {\n                $current = $F($x);\n                $this->assertGreaterThanOrEqual(\n                    $prev,\n                    $current,\n                    \"CDF ({$kernel->value}) should be non-decreasing at x=$x\",\n                );\n                $prev = $current;\n            }\n\n            // Boundary behaviour\n            $this->assertEqualsWithDelta(0.0, $F(-100.0), 0.01);\n           
 $this->assertEqualsWithDelta(1.0, $F(100.0), 0.01);\n        }\n    }\n\n    public function test_kde_random_quartic_covers_small_p(): void\n    {\n        $data = [1.0, 2.0, 3.0, 4.0, 5.0];\n        // Generate enough samples so the rare p < 0.0106 branch is hit\n        $sampler = Stat::kdeRandom($data, 1.0, KdeKernel::Quartic, seed: 42);\n        for ($i = 0; $i < 500; $i++) {\n            $value = $sampler();\n            $this->assertIsFloat($value);\n        }\n    }\n\n    public function test_kde_random_triweight_covers_both_signs(): void\n    {\n        $data = [1.0, 2.0, 3.0, 4.0, 5.0];\n        // Generate enough samples so both p <= 0.5 and p > 0.5 branches are hit\n        $sampler = Stat::kdeRandom($data, 1.0, KdeKernel::Triweight, seed: 42);\n        for ($i = 0; $i < 500; $i++) {\n            $value = $sampler();\n            $this->assertIsFloat($value);\n        }\n    }\n\n    public function test_kde_random_triangular_covers_both_branches(): void\n    {\n        $data = [1.0, 2.0, 3.0, 4.0, 5.0];\n        // Generate enough samples so both p < 0.5 and p >= 0.5 branches are hit\n        $sampler = Stat::kdeRandom($data, 1.0, KdeKernel::Triangular, seed: 42);\n        for ($i = 0; $i < 500; $i++) {\n            $value = $sampler();\n            $this->assertIsFloat($value);\n        }\n    }\n\n    // --- percentile ---\n\n    public function test_percentile_median_matches(): void\n    {\n        $data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];\n        $p50 = Stat::percentile($data, 50);\n        $this->assertEqualsWithDelta(Stat::median($data), $p50, 1e-10);\n    }\n\n    public function test_percentile_quartiles(): void\n    {\n        $data = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20];\n        $q1 = Stat::percentile($data, 25);\n        $q3 = Stat::percentile($data, 75);\n        $this->assertEqualsWithDelta(Stat::firstQuartile($data), $q1, 1e-10);\n        $this->assertEqualsWithDelta(Stat::thirdQuartile($data), $q3, 1e-10);\n    }\n\n    public function 
test_percentile_boundaries(): void\n    {\n        $data = [10, 20, 30, 40, 50];\n        $this->assertEqualsWithDelta(10.0, Stat::percentile($data, 0), 1e-10);\n        $this->assertEqualsWithDelta(50.0, Stat::percentile($data, 100), 1e-10);\n    }\n\n    public function test_percentile_rounding(): void\n    {\n        $data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];\n        $result = Stat::percentile($data, 33, 2);\n        $this->assertEquals(round($result, 2), $result);\n    }\n\n    public function test_percentile_too_few_data_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::percentile([1], 50);\n    }\n\n    public function test_percentile_out_of_range_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::percentile([1, 2, 3], 101);\n    }\n\n    public function test_percentile_negative_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::percentile([1, 2, 3], -1);\n    }\n\n    // --- coefficientOfVariation ---\n\n    public function test_coefficient_of_variation(): void\n    {\n        $data = [10, 20, 30, 40, 50];\n        $expected = (Stat::stdev($data) / abs((float) Stat::mean($data))) * 100;\n        $this->assertEqualsWithDelta($expected, Stat::coefficientOfVariation($data), 1e-10);\n    }\n\n    public function test_coefficient_of_variation_population(): void\n    {\n        $data = [10, 20, 30, 40, 50];\n        $expected = (Stat::pstdev($data) / abs((float) Stat::mean($data))) * 100;\n        $this->assertEqualsWithDelta($expected, Stat::coefficientOfVariation($data, population: true), 1e-10);\n    }\n\n    public function test_coefficient_of_variation_rounding(): void\n    {\n        $data = [10, 20, 30, 40, 50];\n        $result = Stat::coefficientOfVariation($data, 2);\n        $this->assertEquals(round($result, 2), $result);\n    }\n\n    public function 
test_coefficient_of_variation_low_dispersion(): void\n    {\n        // Nearly identical values → low CV\n        $data = [100, 100.1, 99.9, 100.2, 99.8];\n        $cv = Stat::coefficientOfVariation($data);\n        $this->assertLessThan(1.0, $cv);\n    }\n\n    public function test_coefficient_of_variation_zero_mean_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::coefficientOfVariation([-1, 0, 1]);\n    }\n\n    public function test_coefficient_of_variation_negative_mean(): void\n    {\n        $data = [-10, -20, -30];\n        // Should use abs(mean), so CV is still positive\n        $cv = Stat::coefficientOfVariation($data);\n        $this->assertGreaterThan(0, $cv);\n    }\n\n    public function test_coefficient_of_variation_too_few_data_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::coefficientOfVariation([5]);\n    }\n\n    // --- trimmedMean ---\n\n    public function test_trimmed_mean_basic(): void\n    {\n        // [1, 2, 3, 4, 5, 6, 7, 8, 9, 100] with 10% trim\n        // removes 1 from each side → mean of [2, 3, 4, 5, 6, 7, 8, 9]\n        $data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100];\n        $result = Stat::trimmedMean($data, 0.1);\n        $this->assertEqualsWithDelta(5.5, $result, 1e-10);\n    }\n\n    public function test_trimmed_mean_zero_trim_equals_mean(): void\n    {\n        $data = [1, 2, 3, 4, 5];\n        $this->assertEqualsWithDelta(\n            (float) Stat::mean($data),\n            Stat::trimmedMean($data, 0.0),\n            1e-10,\n        );\n    }\n\n    public function test_trimmed_mean_with_rounding(): void\n    {\n        $data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];\n        $result = Stat::trimmedMean($data, 0.2, 2);\n        $this->assertEquals(round($result, 2), $result);\n    }\n\n    public function test_trimmed_mean_removes_outliers(): void\n    {\n        // Outliers at both ends; trimmed mean should be close to the 
\"clean\" mean\n        $data = [-1000, 2, 3, 4, 5, 6, 7, 8, 9, 1000];\n        $trimmed = Stat::trimmedMean($data, 0.1);\n        $this->assertEqualsWithDelta(5.5, $trimmed, 1e-10);\n    }\n\n    public function test_trimmed_mean_empty_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::trimmedMean([]);\n    }\n\n    public function test_trimmed_mean_proportion_too_high_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::trimmedMean([1, 2, 3], 0.5);\n    }\n\n    public function test_trimmed_mean_negative_proportion_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::trimmedMean([1, 2, 3], -0.1);\n    }\n\n    // --- weightedMedian ---\n\n    public function test_weighted_median_basic(): void\n    {\n        // Values: [1, 2, 3], Weights: [1, 1, 1] → equal weights, same as regular median\n        $this->assertEqualsWithDelta(2.0, Stat::weightedMedian([1, 2, 3], [1, 1, 1]), 1e-10);\n    }\n\n    public function test_weighted_median_skewed_weights(): void\n    {\n        // Heavy weight on 3 pulls the weighted median to 3\n        $this->assertEqualsWithDelta(3.0, Stat::weightedMedian([1, 2, 3], [1, 1, 10]), 1e-10);\n    }\n\n    public function test_weighted_median_unsorted_data(): void\n    {\n        // Data not pre-sorted — method should sort internally\n        $this->assertEqualsWithDelta(\n            Stat::weightedMedian([1, 2, 3], [1, 1, 1]),\n            Stat::weightedMedian([3, 1, 2], [1, 1, 1]),\n            1e-10,\n        );\n    }\n\n    public function test_weighted_median_interpolation_at_midpoint(): void\n    {\n        // Cumulative weight hits exactly 50% between two values → average them\n        // Values: [1, 2, 3, 4], Weights: [1, 1, 1, 1] → total=4, half=2\n        // After [1,2]: cumulative = 2 = half → average of 2 and 3 = 2.5\n        $this->assertEqualsWithDelta(2.5, 
Stat::weightedMedian([1, 2, 3, 4], [1, 1, 1, 1]), 1e-10);\n    }\n\n    public function test_weighted_median_single_element(): void\n    {\n        $this->assertEqualsWithDelta(42.0, Stat::weightedMedian([42], [5]), 1e-10);\n    }\n\n    public function test_weighted_median_with_rounding(): void\n    {\n        $result = Stat::weightedMedian([1, 2, 3, 4], [1, 1, 1, 1], 1);\n        $this->assertEquals(round($result, 1), $result);\n    }\n\n    public function test_weighted_median_empty_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::weightedMedian([], []);\n    }\n\n    public function test_weighted_median_length_mismatch_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::weightedMedian([1, 2, 3], [1, 2]);\n    }\n\n    public function test_weighted_median_negative_weight_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::weightedMedian([1, 2, 3], [1, -1, 1]);\n    }\n\n    public function test_weighted_median_zero_weight_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::weightedMedian([1, 2, 3], [1, 0, 1]);\n    }\n\n    // --- sem ---\n\n    public function test_sem(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        $expected = Stat::stdev($data) / sqrt(Stat::count($data));\n        $this->assertEqualsWithDelta($expected, Stat::sem($data), 1e-10);\n    }\n\n    public function test_sem_with_rounding(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        $result = Stat::sem($data, 4);\n        $this->assertEquals(round($result, 4), $result);\n    }\n\n    public function test_sem_decreases_with_larger_sample(): void\n    {\n        $small = [1, 2, 3, 4, 5];\n        $large = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5];\n        $this->assertGreaterThan(Stat::sem($large), Stat::sem($small));\n    }\n\n    public function 
test_sem_too_few_data_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::sem([5]);\n    }\n\n    // --- meanAbsoluteDeviation ---\n\n    public function test_mean_absolute_deviation(): void\n    {\n        // [1, 2, 3, 4, 5] → mean=3, deviations=[2,1,0,1,2], MAD=6/5=1.2\n        $this->assertEqualsWithDelta(1.2, Stat::meanAbsoluteDeviation([1, 2, 3, 4, 5]), 1e-10);\n    }\n\n    public function test_mean_absolute_deviation_single_element(): void\n    {\n        $this->assertEqualsWithDelta(0.0, Stat::meanAbsoluteDeviation([42]), 1e-10);\n    }\n\n    public function test_mean_absolute_deviation_identical_values(): void\n    {\n        $this->assertEqualsWithDelta(0.0, Stat::meanAbsoluteDeviation([5, 5, 5, 5]), 1e-10);\n    }\n\n    public function test_mean_absolute_deviation_with_rounding(): void\n    {\n        $result = Stat::meanAbsoluteDeviation([1, 2, 3, 4, 5], 2);\n        $this->assertEquals(round($result, 2), $result);\n    }\n\n    public function test_mean_absolute_deviation_less_than_stdev(): void\n    {\n        // The mean absolute deviation (about the mean) never exceeds the standard deviation\n        $data = [1, 2, 3, 10, 100];\n        $this->assertLessThanOrEqual(Stat::stdev($data), Stat::meanAbsoluteDeviation($data));\n    }\n\n    public function test_mean_absolute_deviation_empty_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::meanAbsoluteDeviation([]);\n    }\n\n    // --- medianAbsoluteDeviation ---\n\n    public function test_median_absolute_deviation(): void\n    {\n        // [1, 2, 3, 4, 5] → median=3, deviations=[2,1,0,1,2], median of deviations=1\n        $this->assertEqualsWithDelta(1.0, Stat::medianAbsoluteDeviation([1, 2, 3, 4, 5]), 1e-10);\n    }\n\n    public function test_median_absolute_deviation_with_outlier(): void\n    {\n        // The median absolute deviation should be resistant to the outlier\n        $clean = [1, 2, 3, 4, 5];\n        $withOutlier = [1, 2, 3, 4, 1000];\n        // 
median of clean = 3, deviations = [2,1,0,1,2], MAD = 1\n        // median of withOutlier = 3, deviations = [2,1,0,1,997], MAD = 1\n        $this->assertEqualsWithDelta(\n            Stat::medianAbsoluteDeviation($clean),\n            Stat::medianAbsoluteDeviation($withOutlier),\n            1e-10,\n        );\n    }\n\n    public function test_median_absolute_deviation_single_element(): void\n    {\n        $this->assertEqualsWithDelta(0.0, Stat::medianAbsoluteDeviation([42]), 1e-10);\n    }\n\n    public function test_median_absolute_deviation_identical_values(): void\n    {\n        $this->assertEqualsWithDelta(0.0, Stat::medianAbsoluteDeviation([5, 5, 5, 5]), 1e-10);\n    }\n\n    public function test_median_absolute_deviation_with_rounding(): void\n    {\n        $result = Stat::medianAbsoluteDeviation([1, 2, 3, 4, 5, 6, 7, 8], 3);\n        $this->assertEquals(round($result, 3), $result);\n    }\n\n    public function test_median_absolute_deviation_empty_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::medianAbsoluteDeviation([]);\n    }\n\n    // --- zscores ---\n\n    public function test_zscores(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        $zscores = Stat::zscores($data);\n        $mean = (float) Stat::mean($data);\n        $stdev = Stat::stdev($data);\n\n        $this->assertCount(count($data), $zscores);\n        foreach ($data as $i => $value) {\n            $this->assertEqualsWithDelta(($value - $mean) / $stdev, $zscores[$i], 1e-10);\n        }\n    }\n\n    public function test_zscores_sum_to_zero(): void\n    {\n        $data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];\n        $zscores = Stat::zscores($data);\n        $this->assertEqualsWithDelta(0.0, array_sum($zscores), 1e-10);\n    }\n\n    public function test_zscores_with_rounding(): void\n    {\n        $data = [2, 4, 4, 4, 5, 5, 7, 9];\n        $zscores = Stat::zscores($data, 2);\n        foreach ($zscores as $z) {\n            
$this->assertEquals(round($z, 2), $z);\n        }\n    }\n\n    public function test_zscores_identical_values_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::zscores([5, 5, 5, 5]);\n    }\n\n    public function test_zscores_too_few_data_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::zscores([5]);\n    }\n\n    // --- outliers ---\n\n    public function test_outliers_detects_extreme_values(): void\n    {\n        $data = [10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 100];\n        $outliers = Stat::outliers($data);\n        $this->assertContains(100, $outliers);\n    }\n\n    public function test_outliers_no_outliers(): void\n    {\n        $data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];\n        $outliers = Stat::outliers($data);\n        $this->assertEmpty($outliers);\n    }\n\n    public function test_outliers_custom_threshold(): void\n    {\n        $data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];\n        // With a very low threshold, more values are flagged\n        $outliers = Stat::outliers($data, 1.0);\n        $this->assertNotEmpty($outliers);\n    }\n\n    public function test_outliers_identical_values_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::outliers([5, 5, 5, 5]);\n    }\n\n    // --- iqrOutliers ---\n\n    public function test_iqr_outliers_detects_extreme_values(): void\n    {\n        // Ski downhill race times (seconds): most between 108-118, one DNF-like 200, one impossibly fast 50\n        $times = [110.2, 112.5, 108.9, 115.3, 111.7, 114.0, 109.8, 113.6, 200.0, 50.0];\n        $outliers = Stat::iqrOutliers($times);\n        $this->assertContains(200.0, $outliers);\n        $this->assertContains(50.0, $outliers);\n        // Normal times should not be flagged\n        $this->assertNotContains(112.5, $outliers);\n    }\n\n    public function 
test_iqr_outliers_no_outliers(): void\n    {\n        $data = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20];\n        $outliers = Stat::iqrOutliers($data);\n        $this->assertEmpty($outliers);\n    }\n\n    public function test_iqr_outliers_custom_factor(): void\n    {\n        // With factor 3.0 (extreme outliers only), fewer values flagged\n        $data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30];\n        $mild = Stat::iqrOutliers($data, 1.5);\n        $extreme = Stat::iqrOutliers($data, 3.0);\n        $this->assertGreaterThanOrEqual(count($extreme), count($mild));\n    }\n\n    public function test_iqr_outliers_identical_values(): void\n    {\n        // IQR = 0, so any different value is an outlier... but all values are the same\n        $data = [5, 5, 5, 5, 5];\n        $outliers = Stat::iqrOutliers($data);\n        $this->assertEmpty($outliers);\n    }\n\n    public function test_iqr_outliers_with_negative_values(): void\n    {\n        $data = [-100, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100];\n        $outliers = Stat::iqrOutliers($data);\n        $this->assertContains(-100, $outliers);\n        $this->assertContains(100, $outliers);\n    }\n\n    public function test_iqr_outliers_too_few_data_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Stat::iqrOutliers([5]);\n    }\n}\n"
  },
  {
    "path": "tests/StatisticTest.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Tests;\n\nuse HiFolks\\Statistics\\Enums\\Alternative;\nuse HiFolks\\Statistics\\Exception\\InvalidDataInputException;\nuse HiFolks\\Statistics\\Stat;\nuse HiFolks\\Statistics\\Statistics;\nuse PHPUnit\\Framework\\TestCase;\n\nclass StatisticTest extends TestCase\n{\n    public function test_can_calculate_statistics(): void\n    {\n        $s = Statistics::make(\n            [98, 90, 70, 18, 92, 92, 55, 83, 45, 95, 88, 76],\n        );\n        $this->assertEquals(12, $s->count());\n        $this->assertEquals(85.5, $s->median());\n        $this->assertEquals(58.75, $s->firstQuartile());\n        $this->assertEquals(92, $s->thirdQuartile());\n        $this->assertEquals(33.25, $s->interquartileRange());\n        $this->assertCount(12, $s->originalArray());\n\n        $s = Statistics::make(\n            [98, 90, 70, 18, 92, 92, 55, 83, 45, 95, 88],\n        );\n        $this->assertEquals(11, $s->count());\n        $this->assertEquals(88, $s->median());\n        $this->assertEquals(55, $s->firstQuartile());\n        $this->assertEquals(92, $s->thirdQuartile());\n        $this->assertEquals(37, $s->interquartileRange());\n        $this->assertCount(11, $s->originalArray());\n    }\n\n    public function test_can_calculate_statistics_again(): void\n    {\n        $s = Statistics::make(\n            [3, 5, 4, 7, 5, 2],\n        );\n        $this->assertEquals(6, $s->count());\n        $this->assertEquals(13 / 3, $s->mean());\n        $this->assertEquals(4.5, $s->median());\n        $this->assertEquals(5, $s->mode());\n        $this->assertEquals(2, $s->min());\n        $this->assertEquals(7, $s->max());\n        $this->assertEquals(5, $s->range());\n        $this->assertEquals(2.75, $s->firstQuartile());\n        $this->assertEquals(5.5, $s->thirdQuartile());\n    }\n\n    public function test_can_calculate_statistics_again_and_again(): void\n    {\n        $s = Statistics::make(\n            [13, 18, 13, 14, 13, 
16, 14, 21, 13],\n        );\n        $this->assertEquals(9, $s->count());\n        $this->assertEquals(15, $s->mean());\n        $this->assertEquals(14, $s->median());\n        $this->assertEquals(13, $s->mode());\n        $this->assertEquals(13, $s->min());\n        $this->assertEquals(21, $s->max());\n        $this->assertEquals(8, $s->range());\n        $this->assertEquals(13, $s->firstQuartile());\n        $this->assertEquals(17, $s->thirdQuartile());\n\n        $s = Statistics::make(\n            [1, 2, 4, 7],\n        );\n        $this->assertEquals(4, $s->count());\n        $this->assertEquals(3.5, $s->mean());\n        $this->assertEquals(3, $s->median());\n        $this->assertNull($s->mode());\n        $this->assertEquals(1, $s->min());\n        $this->assertEquals(7, $s->max());\n        $this->assertEquals(6, $s->range());\n    }\n\n    public function test_can_strip_zeros(): void\n    {\n        $s = Statistics::make(\n            [3, 5, 0, 0.1, 4, 7, 5, 2],\n        )->stripZeroes();\n        $this->assertEquals(7, $s->count());\n    }\n\n    public function test_can_calculate_mean(): void\n    {\n        $s = Statistics::make(\n            [3, 5, 4, 7, 5, 2],\n        );\n        $this->assertEquals(6, $s->count());\n        $this->assertEquals(13 / 3, $s->mean());\n\n        $s = Statistics::make([]);\n        $this->assertEquals(0, $s->count());\n        $this->expectException(InvalidDataInputException::class);\n        $s->mean();\n    }\n\n    public function test_can_calculate_mean_again(): void\n    {\n        $s = Statistics::make([1, 2, 3, 4, 4]);\n        $this->assertEquals(2.8, $s->mean());\n\n        $s = Statistics::make([-1.0, 2.5, 3.25, 5.75]);\n        $this->assertEquals(2.625, $s->mean());\n\n        $s = Statistics::make([0.5, 0.75, 0.625, 0.375]);\n        $this->assertEquals(0.5625, $s->mean());\n\n        $s = Statistics::make([3.5, 4.0, 5.25]);\n        $this->assertEquals(4.25, $s->mean());\n    }\n\n    public function 
test_can_values_to_string(): void\n    {\n        $s = Statistics::make([1, 2, 3, 4, 4]);\n        $this->assertEquals('1,2,3,4,4', $s->valuesToString(false));\n        $this->assertEquals('1,2,3', $s->valuesToString(3));\n    }\n\n    public function test_calculates_population_standard_deviation(): void\n    {\n        $this->assertEquals(\n            0.986893273527251,\n            Statistics::make([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])->pstdev(),\n        );\n        $this->assertEquals(\n            2.4495,\n            Statistics::make([1, 2, 4, 5, 8])->pstdev(4),\n        );\n        $this->assertEquals(0, Statistics::make([1])->pstdev());\n        $this->assertEquals(0.8291562, Statistics::make([1, 2, 3, 3])->pstdev(7));\n    }\n\n    public function test_calculates_population_standard_deviation_with_empty_array(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Statistics::make([])->pstdev();\n    }\n\n    public function test_calculates_sample_standard_deviation(): void\n    {\n        $this->assertEquals(\n            1.0810874155219827,\n            Statistics::make([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])->stdev(),\n        );\n        $this->assertEquals(2, Statistics::make([1, 2, 2, 4, 6])->stdev());\n        $this->assertEquals(2.7386, Statistics::make([1, 2, 4, 5, 8])->stdev(4));\n    }\n\n    public function test_calculates_sample_standard_deviation_with_empty_array(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Statistics::make([])->stdev();\n    }\n\n    public function test_calculates_sample_standard_deviation_with_single_element(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Statistics::make([1])->stdev();\n    }\n\n    public function test_calculates_variance(): void\n    {\n        $this->assertEquals(\n            1.3720238095238095,\n            Statistics::make([2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5])->variance(),\n       
 );\n    }\n\n    public function test_calculates_pvariance(): void\n    {\n        $this->assertEquals(\n            1.25,\n            Statistics::make([0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25])->pvariance(),\n        );\n        $this->assertEquals(0.6875, Statistics::make([1, 2, 3, 3])->pvariance());\n    }\n\n    public function test_calculates_skewness(): void\n    {\n        $this->assertEqualsWithDelta(0.0, Statistics::make([1, 2, 3, 4, 5])->skewness(), 1e-10);\n    }\n\n    public function test_calculates_pskewness(): void\n    {\n        $this->assertEqualsWithDelta(0.0, Statistics::make([1, 2, 3, 4, 5])->pskewness(), 1e-10);\n    }\n\n    public function test_calculates_kurtosis(): void\n    {\n        // Uniform-like data: negative excess kurtosis\n        $this->assertLessThan(0, Statistics::make([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])->kurtosis());\n    }\n\n    public function test_calculates_geometric_mean(): void\n    {\n        $this->assertEquals(36, Statistics::make([54, 24, 36])->geometricMean(2));\n        $this->assertEquals(6.81, Statistics::make([4, 8, 3, 9, 17])->geometricMean(2));\n    }\n\n    public function test_calculates_geometric_mean_with_empty_array(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Statistics::make([])->geometricMean();\n    }\n\n    public function test_calculates_harmonic_mean(): void\n    {\n        $this->assertEquals(48.0, Statistics::make([40, 60])->harmonicMean(1));\n    }\n\n    public function test_calculates_harmonic_mean_with_empty_array(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        Statistics::make([])->harmonicMean();\n    }\n\n    public function test_can_distinct_numeric_array(): void\n    {\n        $this->assertEquals([1, 2, 3], Statistics::make([1, 2, 3])->numericalArray());\n        $this->assertEquals([1, '2', 3], Statistics::make([1, '2', 3])->numericalArray());\n        $this->assertEquals([], 
Statistics::make([])->numericalArray());\n        $this->expectException(InvalidDataInputException::class);\n        Statistics::make([1, 'some string', 3])->numericalArray();\n    }\n\n    public function test_median_grouped(): void\n    {\n        $result = Statistics::make([1, 2, 2, 3, 4, 4, 4, 5])->medianGrouped();\n        $this->assertIsFloat($result);\n        $this->assertEquals(3.5, $result);\n    }\n\n    public function test_max_with_empty_array(): void\n    {\n        $this->assertEquals(0, Statistics::make([])->max());\n    }\n\n    public function test_min_with_empty_array(): void\n    {\n        $this->assertEquals(0, Statistics::make([])->min());\n    }\n\n    public function test_percentile(): void\n    {\n        $s = Statistics::make([10, 20, 30, 40, 50, 60, 70, 80, 90, 100]);\n        $this->assertEqualsWithDelta($s->median(), $s->percentile(50), 1e-10);\n        $this->assertEqualsWithDelta($s->firstQuartile(), $s->percentile(25), 1e-10);\n        $this->assertEqualsWithDelta($s->thirdQuartile(), $s->percentile(75), 1e-10);\n    }\n\n    public function test_percentile_with_rounding(): void\n    {\n        $s = Statistics::make([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);\n        $result = $s->percentile(33, 2);\n        $this->assertEquals(round($result, 2), $result);\n    }\n\n    public function test_coefficient_of_variation(): void\n    {\n        $s = Statistics::make([10, 20, 30, 40, 50]);\n        $cv = $s->coefficientOfVariation();\n        $this->assertGreaterThan(0, $cv);\n\n        $cvPop = $s->coefficientOfVariation(population: true);\n        $this->assertLessThan($cv, $cvPop);\n    }\n\n    public function test_coefficient_of_variation_with_rounding(): void\n    {\n        $s = Statistics::make([10, 20, 30, 40, 50]);\n        $result = $s->coefficientOfVariation(2);\n        $this->assertEquals(round($result, 2), $result);\n    }\n\n    public function test_trimmed_mean(): void\n    {\n        $s = Statistics::make([1, 2, 3, 4, 5, 6, 7, 8, 
9, 100]);\n        // 10% trim removes 1 element from each side → mean of [2..9]\n        $this->assertEqualsWithDelta(5.5, $s->trimmedMean(0.1), 1e-10);\n    }\n\n    public function test_trimmed_mean_with_rounding(): void\n    {\n        $s = Statistics::make([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);\n        $result = $s->trimmedMean(0.2, 2);\n        $this->assertEquals(round($result, 2), $result);\n    }\n\n    public function test_weighted_median(): void\n    {\n        $s = Statistics::make([1, 2, 3]);\n        // Equal weights → same as regular median\n        $this->assertEqualsWithDelta(2.0, $s->weightedMedian([1, 1, 1]), 1e-10);\n        // Heavy weight on 3 → weighted median is 3\n        $this->assertEqualsWithDelta(3.0, $s->weightedMedian([1, 1, 10]), 1e-10);\n    }\n\n    public function test_sem(): void\n    {\n        $s = Statistics::make([2, 4, 4, 4, 5, 5, 7, 9]);\n        $expected = Stat::stdev([2, 4, 4, 4, 5, 5, 7, 9]) / sqrt(8);\n        $this->assertEqualsWithDelta($expected, $s->sem(), 1e-10);\n    }\n\n    public function test_confidence_interval(): void\n    {\n        $s = Statistics::make([2, 4, 4, 4, 5, 5, 7, 9]);\n        [$lower, $upper] = $s->confidenceInterval();\n        [$expectedLower, $expectedUpper] = Stat::confidenceInterval([2, 4, 4, 4, 5, 5, 7, 9]);\n        $this->assertEqualsWithDelta($expectedLower, $lower, 1e-10);\n        $this->assertEqualsWithDelta($expectedUpper, $upper, 1e-10);\n    }\n\n    public function test_confidence_interval_with_params(): void\n    {\n        $s = Statistics::make([2, 4, 4, 4, 5, 5, 7, 9]);\n        [$lower, $upper] = $s->confidenceInterval(confidenceLevel: 0.99, round: 2);\n        [$expectedLower, $expectedUpper] = Stat::confidenceInterval([2, 4, 4, 4, 5, 5, 7, 9], confidenceLevel: 0.99, round: 2);\n        $this->assertSame($expectedLower, $lower);\n        $this->assertSame($expectedUpper, $upper);\n    }\n\n    public function test_mean_absolute_deviation(): void\n    {\n        $s = 
Statistics::make([1, 2, 3, 4, 5]);\n        $this->assertEqualsWithDelta(1.2, $s->meanAbsoluteDeviation(), 1e-10);\n    }\n\n    public function test_median_absolute_deviation(): void\n    {\n        $s = Statistics::make([1, 2, 3, 4, 5]);\n        $this->assertEqualsWithDelta(1.0, $s->medianAbsoluteDeviation(), 1e-10);\n    }\n\n    public function test_zscores(): void\n    {\n        $s = Statistics::make([2, 4, 4, 4, 5, 5, 7, 9]);\n        $zscores = $s->zscores();\n        $this->assertCount(8, $zscores);\n        $this->assertEqualsWithDelta(0.0, array_sum($zscores), 1e-10);\n    }\n\n    public function test_outliers(): void\n    {\n        $s = Statistics::make([10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 100]);\n        $this->assertContains(100, $s->outliers());\n    }\n\n    public function test_iqr_outliers(): void\n    {\n        $s = Statistics::make([110.2, 112.5, 108.9, 115.3, 111.7, 114.0, 109.8, 113.6, 200.0, 50.0]);\n        $outliers = $s->iqrOutliers();\n        $this->assertContains(200.0, $outliers);\n        $this->assertContains(50.0, $outliers);\n    }\n\n    public function test_z_test(): void\n    {\n        $s = Statistics::make([2, 4, 4, 4, 5, 5, 7, 9]);\n        $result = $s->zTest(3.0);\n        $expected = Stat::zTest([2, 4, 4, 4, 5, 5, 7, 9], 3.0);\n        $this->assertEqualsWithDelta($expected['zScore'], $result['zScore'], 1e-10);\n        $this->assertEqualsWithDelta($expected['pValue'], $result['pValue'], 1e-10);\n    }\n\n    public function test_z_test_with_params(): void\n    {\n        $s = Statistics::make([2, 4, 4, 4, 5, 5, 7, 9]);\n        $result = $s->zTest(3.0, Alternative::Greater, round: 4);\n        $expected = Stat::zTest([2, 4, 4, 4, 5, 5, 7, 9], 3.0, Alternative::Greater, round: 4);\n        $this->assertSame($expected['zScore'], $result['zScore']);\n        $this->assertSame($expected['pValue'], $result['pValue']);\n    }\n\n    public function test_t_test(): void\n    {\n        
$s = Statistics::make([2, 4, 4, 4, 5, 5, 7, 9]);\n        $result = $s->tTest(3.0);\n        $expected = Stat::tTest([2, 4, 4, 4, 5, 5, 7, 9], 3.0);\n        $this->assertEqualsWithDelta($expected['tStatistic'], $result['tStatistic'], 1e-10);\n        $this->assertEqualsWithDelta($expected['pValue'], $result['pValue'], 1e-10);\n        $this->assertSame($expected['degreesOfFreedom'], $result['degreesOfFreedom']);\n    }\n\n    public function test_t_test_with_params(): void\n    {\n        $s = Statistics::make([2, 4, 4, 4, 5, 5, 7, 9]);\n        $result = $s->tTest(3.0, Alternative::Greater, round: 4);\n        $expected = Stat::tTest([2, 4, 4, 4, 5, 5, 7, 9], 3.0, Alternative::Greater, round: 4);\n        $this->assertSame($expected['tStatistic'], $result['tStatistic']);\n        $this->assertSame($expected['pValue'], $result['pValue']);\n        $this->assertSame($expected['degreesOfFreedom'], $result['degreesOfFreedom']);\n    }\n}\n"
  },
  {
    "path": "tests/StreamingStatTest.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Tests;\n\nuse HiFolks\\Statistics\\Exception\\InvalidDataInputException;\nuse HiFolks\\Statistics\\Stat;\nuse HiFolks\\Statistics\\StreamingStat;\nuse PHPUnit\\Framework\\TestCase;\n\nclass StreamingStatTest extends TestCase\n{\n    private const TOLERANCE = 1e-10;\n\n    /**\n     * Helper: create a StreamingStat from an array.\n     */\n    /** @param array<int|float> $data */\n    private function fromArray(array $data): StreamingStat\n    {\n        $s = new StreamingStat();\n        foreach ($data as $value) {\n            $s->add($value);\n        }\n\n        return $s;\n    }\n\n    public function test_matches_stat_mean(): void\n    {\n        $data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];\n        $s = $this->fromArray($data);\n\n        $this->assertEqualsWithDelta(Stat::mean($data), $s->mean(), self::TOLERANCE);\n    }\n\n    public function test_matches_stat_variance(): void\n    {\n        $data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5];\n        $s = $this->fromArray($data);\n\n        $this->assertEqualsWithDelta(Stat::variance($data), $s->variance(), self::TOLERANCE);\n    }\n\n    public function test_matches_stat_pvariance(): void\n    {\n        $data = [0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25];\n        $s = $this->fromArray($data);\n\n        $this->assertEqualsWithDelta(Stat::pvariance($data), $s->pvariance(), self::TOLERANCE);\n    }\n\n    public function test_matches_stat_stdev(): void\n    {\n        $data = [1.5, 2.5, 2.5, 2.75, 3.25, 4.75];\n        $s = $this->fromArray($data);\n\n        $this->assertEqualsWithDelta(Stat::stdev($data), $s->stdev(), self::TOLERANCE);\n    }\n\n    public function test_matches_stat_pstdev(): void\n    {\n        $data = [1.5, 2.5, 2.5, 2.75, 3.25, 4.75];\n        $s = $this->fromArray($data);\n\n        $this->assertEqualsWithDelta(Stat::pstdev($data), $s->pstdev(), self::TOLERANCE);\n    }\n\n    public function test_matches_stat_skewness(): 
void\n    {\n        $data = [2, 8, 0, 4, 1, 9, 9, 0];\n        $s = $this->fromArray($data);\n\n        $this->assertEqualsWithDelta(Stat::skewness($data), $s->skewness(), self::TOLERANCE);\n    }\n\n    public function test_matches_stat_pskewness(): void\n    {\n        $data = [2, 8, 0, 4, 1, 9, 9, 0];\n        $s = $this->fromArray($data);\n\n        $this->assertEqualsWithDelta(Stat::pskewness($data), $s->pskewness(), self::TOLERANCE);\n    }\n\n    public function test_matches_stat_kurtosis(): void\n    {\n        $data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];\n        $s = $this->fromArray($data);\n\n        $this->assertEqualsWithDelta(Stat::kurtosis($data), $s->kurtosis(), self::TOLERANCE);\n    }\n\n    public function test_rounding(): void\n    {\n        $data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5];\n        $s = $this->fromArray($data);\n\n        $this->assertEquals(\n            round(Stat::variance($data), 4),\n            $s->variance(4),\n        );\n        $this->assertEquals(\n            round((float) Stat::mean($data), 2),\n            $s->mean(2),\n        );\n    }\n\n    public function test_chaining(): void\n    {\n        $s = (new StreamingStat())->add(1)->add(2)->add(3);\n        $this->assertEquals(3, $s->count());\n        $this->assertEqualsWithDelta(2.0, $s->mean(), self::TOLERANCE);\n    }\n\n    public function test_empty_mean_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        (new StreamingStat())->mean();\n    }\n\n    public function test_one_element_variance_throws(): void\n    {\n        $s = (new StreamingStat())->add(5);\n        $this->assertEqualsWithDelta(5.0, $s->mean(), self::TOLERANCE);\n\n        $this->expectException(InvalidDataInputException::class);\n        $s->variance();\n    }\n\n    public function test_two_elements_skewness_throws(): void\n    {\n        $s = (new StreamingStat())->add(1)->add(2);\n        // variance should work\n        
$this->assertEqualsWithDelta(Stat::variance([1, 2]), $s->variance(), self::TOLERANCE);\n\n        $this->expectException(InvalidDataInputException::class);\n        $s->skewness();\n    }\n\n    public function test_three_elements_kurtosis_throws(): void\n    {\n        $s = (new StreamingStat())->add(1)->add(2)->add(3);\n        // skewness should work\n        $this->assertEqualsWithDelta(Stat::skewness([1, 2, 3]), $s->skewness(), self::TOLERANCE);\n\n        $this->expectException(InvalidDataInputException::class);\n        $s->kurtosis();\n    }\n\n    public function test_insufficient_data_pskewness_throws(): void\n    {\n        $s = (new StreamingStat())->add(1)->add(2);\n        $this->expectException(InvalidDataInputException::class);\n        $s->pskewness();\n    }\n\n    public function test_identical_values_pskewness_throws(): void\n    {\n        $s = $this->fromArray([5, 5, 5, 5]);\n        $this->expectException(InvalidDataInputException::class);\n        $s->pskewness();\n    }\n\n    public function test_identical_values_skewness_throws(): void\n    {\n        $s = $this->fromArray([5, 5, 5, 5]);\n        $this->expectException(InvalidDataInputException::class);\n        $s->skewness();\n    }\n\n    public function test_identical_values_kurtosis_throws(): void\n    {\n        $s = $this->fromArray([5, 5, 5, 5]);\n        $this->expectException(InvalidDataInputException::class);\n        $s->kurtosis();\n    }\n\n    public function test_large_dataset(): void\n    {\n        // Generate deterministic pseudo-random data\n        mt_srand(42);\n        $data = [];\n        for ($i = 0; $i < 10000; $i++) {\n            $data[] = mt_rand(-10000, 10000) / 100.0;\n        }\n\n        $s = $this->fromArray($data);\n\n        $this->assertEqualsWithDelta(Stat::mean($data), $s->mean(), 1e-6);\n        $this->assertEqualsWithDelta(Stat::variance($data), $s->variance(), 1e-4);\n        $this->assertEqualsWithDelta(Stat::stdev($data), $s->stdev(), 1e-4);\n   
     $this->assertEqualsWithDelta(Stat::pvariance($data), $s->pvariance(), 1e-4);\n        $this->assertEqualsWithDelta(Stat::pstdev($data), $s->pstdev(), 1e-4);\n        $this->assertEqualsWithDelta(Stat::skewness($data), $s->skewness(), 1e-4);\n        $this->assertEqualsWithDelta(Stat::pskewness($data), $s->pskewness(), 1e-4);\n        $this->assertEqualsWithDelta(Stat::kurtosis($data), $s->kurtosis(), 1e-4);\n    }\n\n    public function test_count(): void\n    {\n        $s = new StreamingStat();\n        $this->assertEquals(0, $s->count());\n        $s->add(1)->add(2);\n        $this->assertEquals(2, $s->count());\n    }\n\n    public function test_negative_values(): void\n    {\n        $data = [-5, -3, -1, 0, 1, 3, 5];\n        $s = $this->fromArray($data);\n\n        $this->assertEqualsWithDelta(Stat::mean($data), $s->mean(), self::TOLERANCE);\n        $this->assertEqualsWithDelta(Stat::variance($data), $s->variance(), self::TOLERANCE);\n        $this->assertEqualsWithDelta(Stat::skewness($data), $s->skewness(), self::TOLERANCE);\n    }\n\n    public function test_pvariance_single_element(): void\n    {\n        $s = (new StreamingStat())->add(42);\n        $this->assertEqualsWithDelta(0.0, $s->pvariance(), self::TOLERANCE);\n    }\n\n    public function test_empty_pvariance_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        (new StreamingStat())->pvariance();\n    }\n\n    public function test_sum(): void\n    {\n        $data = [1.5, 2.5, 3.0, 4.0, 5.5];\n        $s = $this->fromArray($data);\n        $this->assertEqualsWithDelta(array_sum($data), $s->sum(), self::TOLERANCE);\n    }\n\n    public function test_empty_sum_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        (new StreamingStat())->sum();\n    }\n\n    public function test_min(): void\n    {\n        $data = [3, -1, 4, 1, 5, -9, 2, 6];\n        $s = $this->fromArray($data);\n        
$this->assertEqualsWithDelta(-9.0, $s->min(), self::TOLERANCE);\n    }\n\n    public function test_empty_min_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        (new StreamingStat())->min();\n    }\n\n    public function test_max(): void\n    {\n        $data = [3, -1, 4, 1, 5, -9, 2, 6];\n        $s = $this->fromArray($data);\n        $this->assertEqualsWithDelta(6.0, $s->max(), self::TOLERANCE);\n    }\n\n    public function test_empty_max_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        (new StreamingStat())->max();\n    }\n\n    public function test_min_max_single_element(): void\n    {\n        $s = (new StreamingStat())->add(42);\n        $this->assertEqualsWithDelta(42.0, $s->min(), self::TOLERANCE);\n        $this->assertEqualsWithDelta(42.0, $s->max(), self::TOLERANCE);\n    }\n\n    public function test_min_max_negative_values(): void\n    {\n        $data = [-10.5, -3.2, -7.8];\n        $s = $this->fromArray($data);\n        $this->assertEqualsWithDelta(-10.5, $s->min(), self::TOLERANCE);\n        $this->assertEqualsWithDelta(-3.2, $s->max(), self::TOLERANCE);\n    }\n}\n"
  },
  {
    "path": "tests/StudentTTest.php",
    "content": "<?php\n\nnamespace HiFolks\\Statistics\\Tests;\n\nuse HiFolks\\Statistics\\Exception\\InvalidDataInputException;\nuse HiFolks\\Statistics\\NormalDist;\nuse HiFolks\\Statistics\\StudentT;\nuse PHPUnit\\Framework\\TestCase;\n\nclass StudentTTest extends TestCase\n{\n    // --- Constructor ---\n\n    public function test_constructor_valid_df(): void\n    {\n        $t = new StudentT(5);\n        $this->assertEquals(5.0, $t->getDegreesOfFreedom());\n    }\n\n    public function test_constructor_fractional_df(): void\n    {\n        $t = new StudentT(2.5);\n        $this->assertEquals(2.5, $t->getDegreesOfFreedom());\n    }\n\n    public function test_constructor_zero_df_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        new StudentT(0);\n    }\n\n    public function test_constructor_negative_df_throws(): void\n    {\n        $this->expectException(InvalidDataInputException::class);\n        new StudentT(-1);\n    }\n\n    // --- PDF ---\n\n    public function test_pdf_df1_cauchy(): void\n    {\n        // For df=1, the t-distribution is the Cauchy distribution\n        // pdf(0) = 1/(pi) ≈ 0.31831\n        $t = new StudentT(1);\n        $this->assertEqualsWithDelta(1 / M_PI, $t->pdf(0), 1e-5);\n    }\n\n    public function test_pdf_df5(): void\n    {\n        // Known value from scipy: scipy.stats.t.pdf(0, 5) ≈ 0.37960669\n        $t = new StudentT(5);\n        $this->assertEqualsWithDelta(0.37961, $t->pdfRounded(0, 5), 1e-4);\n    }\n\n    public function test_pdf_df30(): void\n    {\n        // For df=30, pdf(0) should be close to standard normal pdf(0) ≈ 0.39894\n        $t = new StudentT(30);\n        $normal = new NormalDist(0, 1);\n        $this->assertEqualsWithDelta($normal->pdf(0), $t->pdf(0), 0.01);\n    }\n\n    public function test_pdf_symmetry(): void\n    {\n        $t = new StudentT(5);\n        $this->assertEqualsWithDelta($t->pdf(2.0), $t->pdf(-2.0), 1e-10);\n        
$this->assertEqualsWithDelta($t->pdf(0.5), $t->pdf(-0.5), 1e-10);\n    }\n\n    public function test_pdf_tails(): void\n    {\n        // PDF should decrease away from center\n        $t = new StudentT(10);\n        $this->assertGreaterThan($t->pdf(1), $t->pdf(0));\n        $this->assertGreaterThan($t->pdf(2), $t->pdf(1));\n    }\n\n    public function test_pdf_rounded(): void\n    {\n        $t = new StudentT(5);\n        $this->assertEquals(0.38, $t->pdfRounded(0, 2));\n    }\n\n    // --- CDF ---\n\n    public function test_cdf_at_zero(): void\n    {\n        // cdf(0) = 0.5 for all df (by symmetry)\n        foreach ([1, 2, 5, 10, 30, 100] as $df) {\n            $t = new StudentT($df);\n            $this->assertEqualsWithDelta(0.5, $t->cdf(0), 1e-6, \"cdf(0) should be 0.5 for df=$df\");\n        }\n    }\n\n    public function test_cdf_df1_cauchy(): void\n    {\n        // For df=1 (Cauchy): cdf(1) = 0.75, cdf(-1) = 0.25\n        $t = new StudentT(1);\n        $this->assertEqualsWithDelta(0.75, $t->cdf(1), 1e-4);\n        $this->assertEqualsWithDelta(0.25, $t->cdf(-1), 1e-4);\n    }\n\n    public function test_cdf_df5_known_values(): void\n    {\n        // scipy.stats.t.cdf(2.0, 5) ≈ 0.94874\n        $t = new StudentT(5);\n        $this->assertEqualsWithDelta(0.94874, $t->cdf(2.0), 1e-3);\n    }\n\n    public function test_cdf_monotonicity(): void\n    {\n        $t = new StudentT(10);\n        $prev = 0.0;\n        foreach ([-3, -2, -1, 0, 1, 2, 3] as $x) {\n            $val = $t->cdf($x);\n            $this->assertGreaterThan($prev, $val);\n            $prev = $val;\n        }\n    }\n\n    public function test_cdf_converges_to_normal_for_large_df(): void\n    {\n        $t = new StudentT(1000);\n        $normal = new NormalDist(0, 1);\n        foreach ([-2.0, -1.0, 0.0, 1.0, 2.0] as $x) {\n            $this->assertEqualsWithDelta(\n                $normal->cdf($x),\n                $t->cdf($x),\n                0.005,\n                \"t(1000) should 
approximate normal at x=$x\",\n            );\n        }\n    }\n\n    public function test_cdf_rounded(): void\n    {\n        $t = new StudentT(5);\n        $this->assertEquals(0.949, $t->cdfRounded(2.0, 3));\n    }\n\n    // --- invCdf ---\n\n    public function test_inv_cdf_round_trip(): void\n    {\n        $t = new StudentT(10);\n        foreach ([0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95] as $p) {\n            $this->assertEqualsWithDelta($p, $t->cdf($t->invCdf($p)), 1e-6, \"round-trip for p=$p\");\n        }\n    }\n\n    public function test_inv_cdf_symmetry(): void\n    {\n        $t = new StudentT(10);\n        foreach ([0.1, 0.25, 0.05] as $p) {\n            $this->assertEqualsWithDelta(\n                -$t->invCdf($p),\n                $t->invCdf(1 - $p),\n                1e-6,\n                \"invCdf(p) should equal -invCdf(1-p) for p=$p\",\n            );\n        }\n    }\n\n    public function test_inv_cdf_median(): void\n    {\n        // invCdf(0.5) = 0 for all df\n        foreach ([1, 5, 30] as $df) {\n            $t = new StudentT($df);\n            $this->assertEqualsWithDelta(0.0, $t->invCdf(0.5), 1e-6, \"invCdf(0.5)=0 for df=$df\");\n        }\n    }\n\n    public function test_inv_cdf_throws_for_p_zero(): void\n    {\n        $t = new StudentT(5);\n        $this->expectException(InvalidDataInputException::class);\n        $t->invCdf(0.0);\n    }\n\n    public function test_inv_cdf_throws_for_p_one(): void\n    {\n        $t = new StudentT(5);\n        $this->expectException(InvalidDataInputException::class);\n        $t->invCdf(1.0);\n    }\n\n    public function test_inv_cdf_rounded(): void\n    {\n        $t = new StudentT(10);\n        $val = $t->invCdfRounded(0.975, 3);\n        $this->assertEqualsWithDelta(2.228, $val, 0.001);\n    }\n\n    public function test_inv_cdf_extreme_tail(): void\n    {\n        // Very small p with df=1 (Cauchy) — the normal approximation\n        // initial guess lands where pdf is near zero, hitting the fpx 
< 1e-15 guard\n        $t = new StudentT(1);\n        $val = $t->invCdf(1e-10);\n        $this->assertLessThan(-1000.0, $val);\n    }\n\n    // --- Coverage: logGamma reflection formula (x < 0.5) ---\n\n    public function test_pdf_fractional_df_triggers_loggamma_reflection(): void\n    {\n        // df=0.5 => logGamma(0.25) is called, which triggers the\n        // reflection branch (x < 0.5) in logGamma\n        $t = new StudentT(0.5);\n        $this->assertGreaterThan(0.0, $t->pdf(0));\n        // Also verify cdf is still valid\n        $this->assertEqualsWithDelta(0.5, $t->cdf(0), 1e-6);\n    }\n\n    // --- Coverage: regularizedIncompleteBeta edge cases ---\n\n    public function test_cdf_very_large_t_value(): void\n    {\n        // Very large t => x = df/(df+t^2) ≈ 0, approaching the x===0 branch\n        $t = new StudentT(5);\n        $this->assertEqualsWithDelta(1.0, $t->cdf(1e8), 1e-6);\n    }\n\n    public function test_cdf_negative_very_large_t_value(): void\n    {\n        // Very negative t => cdf near 0\n        $t = new StudentT(5);\n        $this->assertEqualsWithDelta(0.0, $t->cdf(-1e8), 1e-6);\n    }\n\n    public function test_cdf_df1_wide_range(): void\n    {\n        // df=1 (Cauchy) with various t-values to exercise different\n        // paths in the incomplete beta (symmetry flip, CF convergence)\n        $t = new StudentT(1);\n        // Known Cauchy quantiles\n        $this->assertEqualsWithDelta(0.5, $t->cdf(0), 1e-6);\n        $this->assertEqualsWithDelta(0.75, $t->cdf(1), 1e-4);\n        $this->assertEqualsWithDelta(1.0 - 0.75, $t->cdf(-1), 1e-4);\n        // Extreme tails\n        $this->assertGreaterThan(0.99, $t->cdf(100));\n        $this->assertLessThan(0.01, $t->cdf(-100));\n    }\n\n    // --- Coverage: incompleteBetaCf tiny-value guards ---\n\n    public function test_cdf_df2_known_values(): void\n    {\n        // df=2: cdf(t) = 0.5 + t/(2*sqrt(2+t^2))\n        // This exercises the CF with a=1.0, b=0.5, which can produce\n      
  // near-zero denominators triggering the tiny guards\n        $t = new StudentT(2);\n        foreach ([0.5, 1.0, 2.0, 5.0] as $tv) {\n            $expected = 0.5 + $tv / (2 * sqrt(2 + $tv * $tv));\n            $this->assertEqualsWithDelta($expected, $t->cdf($tv), 1e-5, \"df=2, t=$tv\");\n        }\n    }\n\n    public function test_cdf_very_high_df(): void\n    {\n        // Very high df pushes x close to 1 in regularizedIncompleteBeta,\n        // forcing the symmetry flip branch\n        $t = new StudentT(10000);\n        $normal = new NormalDist(0, 1);\n        $this->assertEqualsWithDelta($normal->cdf(1.96), $t->cdf(1.96), 0.001);\n    }\n\n    public function test_cdf_small_t_many_df_values(): void\n    {\n        // Small t with various df to exercise different a/b ratios\n        // in the continued fraction, hitting different convergence paths\n        foreach ([1, 2, 3, 4, 5, 10, 50, 200] as $df) {\n            $t = new StudentT($df);\n            // cdf should be between 0.5 and 1 for positive t\n            $val = $t->cdf(0.1);\n            $this->assertGreaterThan(0.5, $val, \"cdf(0.1) > 0.5 for df=$df\");\n            $this->assertLessThan(1.0, $val, \"cdf(0.1) < 1.0 for df=$df\");\n        }\n    }\n\n    public function test_cdf_symmetry_identity(): void\n    {\n        // cdf(t) + cdf(-t) = 1 for all t and df\n        // This exercises both branches (t >= 0 and t < 0) of the CDF\n        $t = new StudentT(3);\n        foreach ([0.1, 0.5, 1.0, 2.0, 5.0, 10.0] as $tv) {\n            $this->assertEqualsWithDelta(\n                1.0,\n                $t->cdf($tv) + $t->cdf(-$tv),\n                1e-10,\n                \"cdf($tv) + cdf(-$tv) should equal 1\",\n            );\n        }\n    }\n}\n"
  },
  {
    "path": "tests/data/income.data.csv",
    "content": "\"\",\"income\",\"happiness\"\r\n\"1\",3.86264741839841,2.31448898284741\r\n\"2\",4.97938138246536,3.43348975853174\r\n\"3\",4.9239569362253,4.59937340433575\r\n\"4\",3.21437243884429,2.79111380280692\r\n\"5\",7.19640925107524,5.59639827336202\r\n\"6\",3.72964347945526,2.45855587216119\r\n\"7\",4.67451738892123,3.19299180946247\r\n\"8\",4.49810382071882,1.90713683309599\r\n\"9\",3.12163052614778,2.94244987165905\r\n\"10\",4.6399144353345,3.73794160485342\r\n\"11\",4.6328395139426,3.17540614694795\r\n\"12\",2.77317889546975,2.00904646125992\r\n\"13\",7.11947859451175,5.95181409854467\r\n\"14\",7.46665319614112,5.96054730753846\r\n\"15\",2.11774233123288,1.44579885772475\r\n\"16\",2.55916581954807,2.89858314201704\r\n\"17\",2.35479322029278,1.23116752380514\r\n\"18\",2.38815724756569,2.3129880546267\r\n\"19\",4.75568027375266,2.66611603454578\r\n\"20\",1.99427504651248,2.58472902193729\r\n\"21\",7.31091602565721,5.74744410385007\r\n\"22\",3.52831895649433,2.54652458689424\r\n\"23\",2.4287516749464,1.20078552531679\r\n\"24\",3.54274787474424,3.07829338129477\r\n\"25\",5.22720123874024,4.31776091901847\r\n\"26\",6.6919931396842,5.38147873730164\r\n\"27\",3.90040993876755,3.56522431909848\r\n\"28\",2.29105547815561,0.953412995602085\r\n\"29\",2.38051270600408,2.16916126054514\r\n\"30\",2.54960877634585,2.0607943111077\r\n\"31\",6.93329582735896,6.29910125940374\r\n\"32\",1.85564517276362,1.59035586179395\r\n\"33\",3.58902313839644,2.25092941356014\r\n\"34\",6.82647791085765,5.91424770000083\r\n\"35\",2.07060188334435,2.19183370197961\r\n\"36\",5.22420527413487,5.76781436577197\r\n\"37\",2.24311362951994,0.972882919584837\r\n\"38\",7.07616636715829,5.01057742729088\r\n\"39\",4.19067249633372,2.23966499176615\r\n\"40\",1.95648612035438,1.92757882928771\r\n\"41\",5.06175818480551,3.35807157024712\r\n\"42\",3.98218993423507,2.4000872914664\r\n\"43\",3.06505861971527,3.40798003884572\r\n\"44\",3.68287749029696,2.57617630706503\r\n\"45\",3.78942929115146,2.4730
7937742459\r\n\"46\",5.35871566459537,3.75265949244852\r\n\"47\",5.19611977692693,4.08763118311982\r\n\"48\",5.24118957063183,3.54320365405093\r\n\"49\",7.10161959333345,5.34835294041953\r\n\"50\",3.42402109270915,3.05637665252821\r\n\"51\",2.25339902099222,1.55842256342424\r\n\"52\",5.37033690372482,3.22513275150655\r\n\"53\",6.22560599958524,5.03423100904932\r\n\"54\",5.4828622341156,3.85742425923658\r\n\"55\",4.03417210886255,3.61905548983833\r\n\"56\",6.51021871855482,4.00453773871644\r\n\"57\",6.02921386202797,4.80209180676167\r\n\"58\",6.94911289354786,4.65889039162321\r\n\"59\",7.19503728859127,5.23170296769685\r\n\"60\",2.75733849825338,2.4806064668831\r\n\"61\",6.95607948163524,5.4981471927748\r\n\"62\",4.67019257834181,4.55063695843113\r\n\"63\",6.36829266836867,3.57001356718321\r\n\"64\",6.16668116580695,4.71966533293591\r\n\"65\",6.07415829179808,4.50310816816621\r\n\"66\",5.48471896536648,5.04608177486436\r\n\"67\",1.58957473793998,0.669715947797397\r\n\"68\",1.68047392833978,1.60607235734999\r\n\"69\",5.49994795722887,4.82660269623814\r\n\"70\",4.04389090323821,2.20824051578082\r\n\"71\",5.00509324064478,4.05649309934456\r\n\"72\",4.86358172586188,3.56790523339284\r\n\"73\",1.50627504475415,1.30848726312301\r\n\"74\",2.86466384539381,4.15960929985451\r\n\"75\",5.87790629407391,4.63391505682611\r\n\"76\",6.48398378910497,5.06874788141326\r\n\"77\",4.93803659174591,3.04079729363343\r\n\"78\",5.62543442146853,3.80429886107934\r\n\"79\",7.22826521471143,5.03400378544416\r\n\"80\",5.33746029390022,3.70343793163112\r\n\"81\",2.82582659041509,2.18893812105238\r\n\"82\",5.93136667320505,5.53804752930638\r\n\"83\",3.52025520242751,3.58387516187343\r\n\"84\",3.23994059348479,3.09688560965241\r\n\"85\",3.49838636862114,2.20098217177591\r\n\"86\",7.18611225765198,5.15159828927485\r\n\"87\",4.71916607161984,5.95098632787252\r\n\"88\",3.59480195771903,2.96818710245234\r\n\"89\",3.23394243232906,2.3995613260922\r\n\"90\",1.51415322348475,0.859499051809909\r\n\"91\",4
.00253706425428,1.77593255181846\r\n\"92\",6.19810375990346,4.6612612088626\r\n\"93\",2.28065086109564,0.727221191435334\r\n\"94\",2.18986551230773,0.771286567223164\r\n\"95\",3.43415113398805,3.3487882059211\r\n\"96\",5.93226967658848,3.96621541073731\r\n\"97\",5.30783943738788,2.89044741631859\r\n\"98\",5.66434490913525,3.7732606523266\r\n\"99\",7.43924761097878,6.35960000434704\r\n\"100\",2.13470214698464,0.268722097401768\r\n\"101\",6.50127472216263,4.37483231772116\r\n\"102\",3.6511831744574,2.15584332319238\r\n\"103\",2.28649539174512,1.89355685987461\r\n\"104\",4.74885913683102,4.90299157512251\r\n\"105\",5.45916076144204,4.83350642716834\r\n\"106\",3.43306476529688,3.17229946342246\r\n\"107\",7.17639965331182,5.02995166474041\r\n\"108\",5.50639511737972,4.26101300319181\r\n\"109\",3.09761573234573,1.67239058048437\r\n\"110\",4.64755599154159,1.49702412799332\r\n\"111\",1.8283063438721,1.26548891585784\r\n\"112\",3.53456623479724,2.6674654346945\r\n\"113\",4.60617584036663,1.9993255273609\r\n\"114\",5.36150313355029,5.23186331277666\r\n\"115\",6.87933303695172,5.21140132049364\r\n\"116\",4.31703230598941,3.66165645637435\r\n\"117\",3.38316357973963,1.41503465557127\r\n\"118\",4.93220723234117,4.93304409176638\r\n\"119\",4.93559651914984,4.13077826510296\r\n\"120\",2.60155331064016,2.28226693420823\r\n\"121\",5.7112635425292,3.90117033371217\r\n\"122\",6.11753055173904,4.69199885099164\r\n\"123\",3.77141514932737,3.57780067459739\r\n\"124\",7.11722011025995,5.56254547499372\r\n\"125\",2.19488185085356,2.39322811959199\r\n\"126\",5.95200191577896,3.56472374018866\r\n\"127\",3.92230276297778,2.25372151555325\r\n\"128\",7.08158905757591,4.12164774705515\r\n\"129\",6.95074499677867,4.16910075873337\r\n\"130\",3.66087654652074,3.82389868566928\r\n\"131\",1.78909178031608,0.458377573792869\r\n\"132\",3.54034109925851,2.5769399593697\r\n\"133\",4.5333951911889,2.94753147141445\r\n\"134\",4.86733944201842,3.73999579597949\r\n\"135\",4.056005443912,3.57144651080101\r\n
\"136\",5.63464347738773,4.80815042454505\r\n\"137\",5.46163588156924,4.01761120835699\r\n\"138\",3.18617582553998,1.8398020063859\r\n\"139\",4.41766555700451,3.46857379278554\r\n\"140\",5.76028922433034,4.75878549190717\r\n\"141\",3.71670024935156,2.39167751902943\r\n\"142\",2.18256156658754,0.992917364742155\r\n\"143\",4.29198368964717,3.16938023951209\r\n\"144\",3.41003027558327,2.08904235497713\r\n\"145\",3.58109703613445,1.84367577487797\r\n\"146\",3.50966309336945,1.61660743515923\r\n\"147\",6.66021636407822,5.94934750206047\r\n\"148\",6.27178643038496,4.940227776923\r\n\"149\",3.73501762608066,2.84123867701184\r\n\"150\",4.39320833981037,2.94439133547905\r\n\"151\",3.51221689721569,3.02691817511561\r\n\"152\",6.2397396331653,5.09780250912624\r\n\"153\",3.68148558586836,3.03681993741342\r\n\"154\",7.24131299927831,4.68281699020382\r\n\"155\",6.34536973200738,4.00081918312565\r\n\"156\",5.93974152440205,4.57080112702648\r\n\"157\",2.45932137966156,2.04273931237878\r\n\"158\",2.53908889135346,1.7486511242512\r\n\"159\",6.70860372623429,6.06401304814669\r\n\"160\",6.83132232539356,5.13240146874773\r\n\"161\",5.08265792531893,3.11334209035311\r\n\"162\",6.03060737159103,4.82181026180874\r\n\"163\",6.57459503179416,4.17952102324249\r\n\"164\",3.57429692940786,1.63126110706156\r\n\"165\",5.52990843169391,3.82217964303449\r\n\"166\",2.40938174631447,1.85754188647718\r\n\"167\",4.26478955196217,3.75108928103916\r\n\"168\",3.5303448275663,3.15861859438779\r\n\"169\",6.14314981643111,4.92713262492307\r\n\"170\",5.15769737726077,4.60011475687374\r\n\"171\",4.71084745740518,2.40837990156329\r\n\"172\",6.84751526685432,4.48670366480577\r\n\"173\",5.46464009257033,2.72771571220574\r\n\"174\",4.17653234163299,3.02072444715782\r\n\"175\",3.74809321481735,3.74918072735692\r\n\"176\",2.27452311431989,2.31155423708273\r\n\"177\",1.57636572187766,0.987603171535743\r\n\"178\",1.92413369799033,1.46114442110921\r\n\"179\",5.90424604341388,4.57685653033834\r\n\"180\",5.18903107568622
,4.76759563923268\r\n\"181\",1.87986818002537,0.438171566569843\r\n\"182\",2.54434764571488,2.35120246844228\r\n\"183\",3.22139361407608,3.45310273928936\r\n\"184\",7.26037414278835,5.38139847050254\r\n\"185\",6.48161696176976,4.82009124732981\r\n\"186\",5.68848772253841,4.6587426832043\r\n\"187\",6.63361882092431,5.38007020724562\r\n\"188\",5.97274135099724,3.31598947616809\r\n\"189\",3.89773791655898,2.79974748680421\r\n\"190\",6.46124332118779,4.20675491186833\r\n\"191\",6.62803630158305,4.40266319466238\r\n\"192\",3.11895919544622,2.76911812641752\r\n\"193\",4.69596361415461,2.78424454536\r\n\"194\",1.57369371503592,0.688090581428982\r\n\"195\",3.67037693085149,3.47649914026504\r\n\"196\",7.19440695969388,5.83619665594657\r\n\"197\",1.78047865815461,2.00392609305557\r\n\"198\",2.14235955476761,0.971332454187074\r\n\"199\",3.65648598968983,2.8576143643143\r\n\"200\",2.09035414131358,1.85211771113897\r\n\"201\",3.36309717223048,3.51517917678869\r\n\"202\",2.42314385203645,2.10055249406317\r\n\"203\",7.11158364219591,6.08647828390246\r\n\"204\",3.03994185570627,4.08382116125004\r\n\"205\",2.37323203543201,1.52478229133559\r\n\"206\",1.98456368967891,2.57951651511063\r\n\"207\",2.62848285259679,1.61922612693779\r\n\"208\",7.13675997080281,5.50662153705521\r\n\"209\",3.10491835372522,1.09599927077992\r\n\"210\",1.55865675117821,0.968527327097259\r\n\"211\",7.47844661772251,4.87772553834132\r\n\"212\",2.81390004744753,1.75969866729174\r\n\"213\",5.74453964363784,5.03221103359133\r\n\"214\",6.54098835680634,5.71380229380638\r\n\"215\",6.56279413215816,4.79584286738541\r\n\"216\",5.47012499487028,4.66028996116236\r\n\"217\",2.08513052528724,2.74039540725342\r\n\"218\",4.58957249717787,4.25056806449311\r\n\"219\",5.07450176915154,3.94101932161443\r\n\"220\",7.46350981341675,4.50344537998766\r\n\"221\",5.85390576347709,4.63322873238336\r\n\"222\",3.76453996170312,4.03086493801669\r\n\"223\",7.06279211025685,6.86338795095807\r\n\"224\",6.37737642601132,5.07947589856475\r\n
\"225\",1.92086276365444,1.62526839290523\r\n\"226\",7.36421337816864,6.61827984207343\r\n\"227\",6.53579886071384,4.65699857188463\r\n\"228\",7.30090272473171,6.00498747478251\r\n\"229\",3.03723172377795,2.41063556603061\r\n\"230\",6.7032668273896,4.26121997230496\r\n\"231\",1.92799668386579,1.06562380948865\r\n\"232\",4.22355357790366,2.29570014192334\r\n\"233\",2.92270632041618,1.91951110291046\r\n\"234\",4.42710860492662,3.58138103837618\r\n\"235\",2.07056241016835,0.628942061074953\r\n\"236\",5.22407022910193,3.34410837245363\r\n\"237\",7.1618728749454,6.16043410141322\r\n\"238\",2.21069647790864,2.86127444364931\r\n\"239\",7.20705969957635,4.20947057298817\r\n\"240\",2.1840849914588,1.22629813434264\r\n\"241\",4.41499846614897,4.79332515991528\r\n\"242\",5.01480966852978,3.3137965162018\r\n\"243\",2.60203711967915,2.15269843256791\r\n\"244\",2.91704867128283,2.92368387386963\r\n\"245\",6.24434163467959,3.45206723346595\r\n\"246\",6.85965449362993,4.25831152921851\r\n\"247\",2.37123042438179,3.96103276619262\r\n\"248\",5.96405759034678,3.4277231628448\r\n\"249\",7.15367408841848,4.72098733829071\r\n\"250\",2.28989742044359,1.57194981288347\r\n\"251\",6.37622803263366,6.18694560947344\r\n\"252\",3.54050429910421,3.5527369641354\r\n\"253\",3.13982571987435,1.24578594443855\r\n\"254\",6.46074230130762,3.42691165170281\r\n\"255\",2.64134846208617,1.87714704297141\r\n\"256\",2.00221355259418,1.81788911344585\r\n\"257\",6.39142813486978,5.23990949781775\r\n\"258\",2.76371977245435,1.38404411156753\r\n\"259\",6.83125754864886,3.83797776845087\r\n\"260\",3.82725452492014,2.32169768986459\r\n\"261\",3.77067089779302,3.31157822966062\r\n\"262\",3.15985489217564,3.05738783980748\r\n\"263\",5.09941715653986,4.73704595362766\r\n\"264\",5.61039082473144,4.59097046068678\r\n\"265\",1.85637175245211,0.65424214776212\r\n\"266\",5.36372958077118,3.9811277394736\r\n\"267\",2.33613356295973,2.33627603265793\r\n\"268\",4.97585136769339,4.54063614683775\r\n\"269\",2.62954677455127,2
.19274486318954\r\n\"270\",2.64645809680223,1.78817634141603\r\n\"271\",3.85989238740876,2.71943217959463\r\n\"272\",4.12153062131256,4.2126973388239\r\n\"273\",6.3859406397678,4.47727139172689\r\n\"274\",3.84270967869088,2.46885999413966\r\n\"275\",4.99054853757843,3.45177253771527\r\n\"276\",3.40059694088995,0.68692084864127\r\n\"277\",3.82011537160724,2.24674570685324\r\n\"278\",1.90949915349483,1.62769508948927\r\n\"279\",2.85846412321553,2.45320906080177\r\n\"280\",7.45150065002963,4.10994465330612\r\n\"281\",3.35425210744143,2.57259820309925\r\n\"282\",6.70782520249486,4.81042405824015\r\n\"283\",6.3259060671553,4.82741630352611\r\n\"284\",1.93118067691103,2.46576112842346\r\n\"285\",5.1285452269949,3.75737898692408\r\n\"286\",4.27820987161249,3.69134818860085\r\n\"287\",1.57947028800845,1.98180991059574\r\n\"288\",2.90768217667937,1.41022956743426\r\n\"289\",5.64471408957615,3.75430141971041\r\n\"290\",3.57117543602362,4.24521559706926\r\n\"291\",2.34510756656528,1.4378767241526\r\n\"292\",5.84519702754915,4.15988714358891\r\n\"293\",5.29848040826619,3.51489178158106\r\n\"294\",3.43469992512837,2.71449803816852\r\n\"295\",1.86599523201585,1.3573731356938\r\n\"296\",5.09561549127102,5.63846038658719\r\n\"297\",1.53080772701651,2.42146466926894\r\n\"298\",1.61831145873293,1.51120510776249\r\n\"299\",6.68767721811309,5.52289166781273\r\n\"300\",7.34724552277476,4.39662231880207\r\n\"301\",5.98384569631889,3.98845283464788\r\n\"302\",5.19608237268403,3.31760677959215\r\n\"303\",4.19300743052736,2.77968262704372\r\n\"304\",2.34768755128607,0.846505525734867\r\n\"305\",4.70948937814683,2.45792170724652\r\n\"306\",2.30719771049917,3.50960468808818\r\n\"307\",2.73076882073656,2.64690453683064\r\n\"308\",3.88251350540668,2.97791672449312\r\n\"309\",3.57439364539459,1.42445054844615\r\n\"310\",4.15957638435066,1.79034231699163\r\n\"311\",1.54463433288038,1.64463660327815\r\n\"312\",3.38389667356387,0.731636289703917\r\n\"313\",3.61474518338218,2.90178859153712\r\n\"314
\",6.5038823238574,4.31713315751556\r\n\"315\",1.84841255843639,2.09736906038893\r\n\"316\",4.42031190404668,4.39159297597301\r\n\"317\",6.47715323651209,5.31395690314421\r\n\"318\",6.56141998479143,6.28137019099328\r\n\"319\",7.18090719357133,5.60796663022193\r\n\"320\",2.80909078102559,2.9468704348362\r\n\"321\",5.68620540993288,3.87278884763581\r\n\"322\",4.80034374631941,2.84286835090218\r\n\"323\",2.41291215922683,1.46061633017032\r\n\"324\",2.92570351995528,3.75282476573453\r\n\"325\",3.17417576350272,3.12660323905035\r\n\"326\",2.68552962876856,2.82119264499813\r\n\"327\",2.12442924175411,2.73496005606334\r\n\"328\",2.6940221269615,2.19759211976013\r\n\"329\",4.23088879371062,4.15554086448376\r\n\"330\",5.35051566408947,4.07820883807991\r\n\"331\",5.09157950570807,4.45696361047167\r\n\"332\",6.25030221417546,4.93925895978283\r\n\"333\",5.32463256083429,3.73056995759183\r\n\"334\",2.04849771084264,2.57505485654663\r\n\"335\",2.38138524536043,2.17632214535203\r\n\"336\",6.9199429708533,6.23695531197885\r\n\"337\",2.35950779356062,0.898732802990241\r\n\"338\",4.54570746235549,3.62942438677575\r\n\"339\",7.31050263578072,5.92334607107772\r\n\"340\",1.69746369123459,0.546190176386527\r\n\"341\",4.48358581960201,3.03149706560407\r\n\"342\",2.07995533989742,2.84781684252972\r\n\"343\",4.83439738629386,4.5169605896786\r\n\"344\",4.76134447893128,4.59169701044968\r\n\"345\",4.27003655070439,3.15776509125229\r\n\"346\",5.37979542324319,4.28799968034075\r\n\"347\",4.25555722508579,3.61882978928995\r\n\"348\",4.2164684808813,3.20079786041367\r\n\"349\",5.69372506113723,3.99970300744161\r\n\"350\",2.99652070272714,1.66225319886238\r\n\"351\",2.84945821389556,2.39154999858735\r\n\"352\",3.74497363669798,3.06760945321428\r\n\"353\",3.44151328783482,2.01751358784846\r\n\"354\",4.28007538011298,3.64088397009386\r\n\"355\",1.82445571850985,0.266043663087602\r\n\"356\",2.88924200227484,1.29542729438059\r\n\"357\",2.16997749824077,1.85334761253999\r\n\"358\",6.97689869999886,5.3
6723610296883\r\n\"359\",2.68373122857884,2.46629234383312\r\n\"360\",1.65115609159693,1.72359356107647\r\n\"361\",4.41632057586685,4.30728841594257\r\n\"362\",3.88109869183972,3.15636743551461\r\n\"363\",3.78456055233255,1.69045809453004\r\n\"364\",6.2672605002299,5.08552301140633\r\n\"365\",1.89032675977796,2.48162387022579\r\n\"366\",1.76205817842856,1.35331734068273\r\n\"367\",6.08916251175106,5.27529415127748\r\n\"368\",6.37425386253744,3.67029152611131\r\n\"369\",6.39248312450945,3.69639040957121\r\n\"370\",3.20431634271517,2.6798873177165\r\n\"371\",2.90814691130072,2.36356466910435\r\n\"372\",2.34995206911117,1.43859610476326\r\n\"373\",7.00439883721992,5.36244826300704\r\n\"374\",3.48359911702573,2.32530922601278\r\n\"375\",2.51746380329132,1.52956094402572\r\n\"376\",5.69845134112984,4.03572701904756\r\n\"377\",4.38192328205332,3.55033718426243\r\n\"378\",4.77205908950418,3.55460031663999\r\n\"379\",5.39501645090058,4.19972852584098\r\n\"380\",5.45644130883738,3.13547515352636\r\n\"381\",1.78835992282256,1.14529581266695\r\n\"382\",3.11259278375655,3.53985412722536\r\n\"383\",5.08096844796091,3.82220396398113\r\n\"384\",4.85131391789764,3.83557764615373\r\n\"385\",6.60747211240232,5.68375056087792\r\n\"386\",3.1268561789766,2.51740087585715\r\n\"387\",6.56276283273473,4.32466038431927\r\n\"388\",2.34307547332719,2.95869680058244\r\n\"389\",1.60003301734105,1.68099713335246\r\n\"390\",5.24817553628236,4.25350130386043\r\n\"391\",3.03546131635085,2.26039569342357\r\n\"392\",4.94737214734778,3.61547105345786\r\n\"393\",1.57495361566544,1.50221162618331\r\n\"394\",5.74172488460317,3.77648463051399\r\n\"395\",5.91047526989132,3.20104243469125\r\n\"396\",6.1950039868243,4.05831161607775\r\n\"397\",5.4925897247158,4.10522451662447\r\n\"398\",4.51724457414821,3.72209763864769\r\n\"399\",4.31178599735722,3.82227031270074\r\n\"400\",4.63374137040228,3.70666098080036\r\n\"401\",7.42383249755949,4.90968846868095\r\n\"402\",2.50776854809374,2.15573362622405\r\n\"403\",
5.71258665947244,5.74373608123351\r\n\"404\",6.82237690966576,4.3375714519323\r\n\"405\",3.36759181832895,2.58241663950174\r\n\"406\",6.32019131071866,4.28430555785881\r\n\"407\",2.2986433650367,2.96216820958049\r\n\"408\",4.69989587925375,3.99041589363757\r\n\"409\",7.2467548311688,5.89014869657948\r\n\"410\",5.6669511529617,3.68853951803276\r\n\"411\",4.08746198192239,3.21784736329012\r\n\"412\",3.31455607060343,2.22047032745549\r\n\"413\",6.76833622762933,5.91028487798389\r\n\"414\",5.40574906021357,4.8530325187823\r\n\"415\",3.63228338956833,3.80297363286464\r\n\"416\",4.92494968743995,3.09216464780573\r\n\"417\",3.69920552149415,3.23331374691575\r\n\"418\",6.89692928921431,5.26547895759795\r\n\"419\",7.02551769465208,4.81153337990624\r\n\"420\",6.47562521696091,5.36804117292558\r\n\"421\",1.63938056956977,1.79871830329663\r\n\"422\",6.63595411553979,4.76001433281849\r\n\"423\",4.60440921457484,3.41580647919151\r\n\"424\",7.03205709904432,5.04017458169285\r\n\"425\",3.10234604263678,1.96563593278332\r\n\"426\",3.04411420365795,1.87195785402224\r\n\"427\",6.18289923202246,5.32350738914155\r\n\"428\",5.41505292803049,4.50115833597183\r\n\"429\",3.86069598142058,2.43509485602314\r\n\"430\",5.69533881219104,4.5532165478241\r\n\"431\",6.36491893976927,4.36170927009388\r\n\"432\",2.09909346699715,1.47878719948943\r\n\"433\",3.49896751809865,2.76852312891189\r\n\"434\",5.9946022820659,3.96063637146719\r\n\"435\",6.84093094803393,5.9157813950033\r\n\"436\",5.75494099454954,3.98019535045613\r\n\"437\",6.92672096891329,4.70820298832305\r\n\"438\",1.62381373671815,1.41666105541776\r\n\"439\",5.90635848417878,4.328417097403\r\n\"440\",7.48152138059959,6.19629554587246\r\n\"441\",2.60785315884277,1.94596105258794\r\n\"442\",1.66065049683675,0.698459964938276\r\n\"443\",6.23558535519987,5.13545531647409\r\n\"444\",2.96599172148854,1.5603553757432\r\n\"445\",6.27926506614313,4.23499121504727\r\n\"446\",2.2204038631171,0.688909230817426\r\n\"447\",2.07095024641603,0.68584887435
653\r\n\"448\",2.86841862648726,1.79175356811399\r\n\"449\",3.92814090382308,3.22747217285539\r\n\"450\",6.83574772579595,5.15933124018258\r\n\"451\",3.43716472154483,3.802970672606\r\n\"452\",5.13956521824002,4.21605759162201\r\n\"453\",6.0045212963596,4.93240997785224\r\n\"454\",3.28316689515486,2.23344058374244\r\n\"455\",5.9191292594187,4.37825945044752\r\n\"456\",3.91019854741171,3.4839256845149\r\n\"457\",2.6420985921286,2.0302970430243\r\n\"458\",5.72087741224095,3.59639831830423\r\n\"459\",3.39991641975939,3.26118210444504\r\n\"460\",1.61160631338134,3.48113776710441\r\n\"461\",4.78368549048901,3.39343915265214\r\n\"462\",4.59148910595104,4.22966334514803\r\n\"463\",5.67183973873034,4.32136429726237\r\n\"464\",6.05273140082136,4.40045514533445\r\n\"465\",4.31579417735338,3.3575973824697\r\n\"466\",6.49585476284847,5.33571800394344\r\n\"467\",6.54613258363679,5.58838933571186\r\n\"468\",4.68821962550282,2.99420428356126\r\n\"469\",3.1265240306966,2.41338175267275\r\n\"470\",6.78156412392855,4.98536659019169\r\n\"471\",7.11950537608936,4.98198112456772\r\n\"472\",7.29772228281945,6.38427449695588\r\n\"473\",5.87349354336038,3.26747543972\r\n\"474\",1.58731534564868,1.31275897723396\r\n\"475\",6.00180686311796,4.03954798601876\r\n\"476\",5.72322609554976,4.42599508434608\r\n\"477\",5.2378930519335,3.34274939573109\r\n\"478\",6.44994533993304,3.49231018567766\r\n\"479\",4.17233666824177,3.75122546273269\r\n\"480\",6.52277168026194,5.38655932895044\r\n\"481\",7.44811655720696,5.96342208635885\r\n\"482\",7.22519186418504,4.98525492664478\r\n\"483\",3.27533480711281,2.79816072820568\r\n\"484\",2.03181776404381,2.96732326832455\r\n\"485\",2.84014074783772,1.2287731839405\r\n\"486\",6.10156470257789,3.9617296397724\r\n\"487\",6.07432779902592,4.64914365195858\r\n\"488\",4.24387255171314,4.75416839515926\r\n\"489\",5.31857559224591,2.96195666265414\r\n\"490\",6.05238132085651,4.63344105474874\r\n\"491\",3.15404259134084,1.29413657590318\r\n\"492\",2.48238160461187,0.5
48365189283401\r\n\"493\",3.43510175077245,2.11513648497449\r\n\"494\",5.24920878885314,4.56870455997946\r\n\"495\",3.47179858479649,2.53500211304698\r\n\"496\",6.08760962309316,4.39745127503739\r\n\"497\",3.44084687624127,2.07066376753438\r\n\"498\",4.53054540697485,3.71019297842795\r\n"
  }
]