[
  {
    "path": ".github/workflows/ci.yml",
    "content": "# CI build that assembles artifacts and runs tests.\n# If validation is successful this workflow releases from the main dev branch.\n#\n# - skipping CI: add [skip ci] to the commit message\n# - skipping release: add [skip release] to the commit message\n\nname: CI\n\non:\n  push:\n    branches: ['main']\n    tags-ignore: [v*] # release tags are autogenerated after a successful CI, no need to run CI against them\n    paths-ignore:\n      - 'docs/**'\n      - '**.md'\n  pull_request:\n    branches: ['**']\n\njobs:\n  build:\n    runs-on: ubuntu-latest\n    strategy:\n      matrix:\n        include:\n          - scala-version: 2.11.8\n            spark-version: 2.3.0\n          - scala-version: 2.11.8\n            spark-version: 2.4.3\n          - scala-version: 2.12.11\n            spark-version: 2.4.3\n    if: \"! contains(toJSON(github.event.commits.*.message), '[skip ci]')\"\n    steps:\n      - name: Check out code\n        # https://github.com/actions/checkout\n        uses: actions/checkout@v6\n        with:\n          # Needed to get all tags. 
Refer https://github.com/shipkit/shipkit-changelog#fetch-depth-on-ci\n          fetch-depth: '0'\n      - name: Setup Java\n        uses: actions/setup-java@v5\n        with:\n          distribution: temurin  # Added for actions/setup-java v2+ compatibility\n          java-version: 1.8\n      - name: Build and test code, and test artifact publishing\n        run: ./gradlew build publishToMavenLocal artifactoryPublishAll -s -Partifactory.dryRun -PscalaVersion=$SCALA_VERSION -PsparkVersion=$SPARK_VERSION\n        env:\n          SCALA_VERSION: ${{ matrix.scala-version }}\n          SPARK_VERSION: ${{ matrix.spark-version }}\n      - name: Release to Maven Central and LinkedIn Artifactory\n        # Release job, only for pushes to the main development branch\n        if: github.event_name == 'push'\n          && github.ref == 'refs/heads/main'\n          && github.repository == 'linkedin/LiFT'\n          && !contains(toJSON(github.event.commits.*.message), '[skip release]')\n        run: ./gradlew publishToSonatype closeAndReleaseStagingRepository artifactoryPublishAll -s -PscalaVersion=$SCALA_VERSION -PsparkVersion=$SPARK_VERSION\n        env:\n          SCALA_VERSION: ${{ matrix.scala-version }}\n          SPARK_VERSION: ${{ matrix.spark-version }}\n          SONATYPE_USER: ${{ secrets.SONATYPE_USER }}\n          SONATYPE_PWD: ${{ secrets.SONATYPE_PWD }}\n          PGP_KEY: ${{ secrets.PGP_KEY }}\n          PGP_PWD: ${{ secrets.PGP_PWD }}\n          ARTIFACTORY_USER: ${{ secrets.ARTIFACTORY_USER }}\n          ARTIFACTORY_KEY: ${{ secrets.ARTIFACTORY_KEY }}\n  github-release:\n    runs-on: ubuntu-latest\n    needs: build\n    # Release job, only for pushes to the main development branch\n    if: github.event_name == 'push'\n      && github.ref == 'refs/heads/main'\n      && github.repository == 'linkedin/LiFT'\n      && !contains(toJSON(github.event.commits.*.message), '[skip release]')\n    steps:\n      - name: Check out code\n        # 
https://github.com/actions/checkout\n        uses: actions/checkout@v6\n        with:\n          # Needed to get all tags. Refer https://github.com/shipkit/shipkit-changelog#fetch-depth-on-ci\n          fetch-depth: '0'\n      - name: Setup Java\n        uses: actions/setup-java@v5\n        with:\n          distribution: temurin  # Added for actions/setup-java v2+ compatibility\n          java-version: 1.8\n      - name: Release\n        run: ./gradlew githubRelease -s\n        env:\n          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}\n"
  },
  {
    "path": ".gitignore",
    "content": ".gradle\nout\nbuild\n.DS_Store\n.idea\nspark-warehouse\n*.ipr\n*.iml\n*.iws\n*~\n"
  },
  {
    "path": "CONTRIBUTING.md",
    "content": "# Contribution Agreement\n\nAs a contributor, you represent that the code you submit is your\noriginal work or that of your employer (in which case you represent\nyou have the right to bind your employer).  By submitting code, you\n(and, if applicable, your employer) are licensing the submitted code\nto LinkedIn and the open source community subject to the BSD 2-Clause\nlicense.\n\n# Responsible Disclosure of Security Vulnerabilities\n\n**Do not file an issue on Github for security issues.**  Please review\nthe [guidelines for disclosure][disclosure_guidelines].  Reports should\nbe encrypted using PGP ([public key][pubkey]) and sent to\n[security@linkedin.com][disclosure_email] preferably with the title\n\"Vulnerability in Github LinkedIn/lift - &lt;short summary&gt;\".\n\n# Tips for Getting Your Pull Request Accepted\n\n1. Make sure all new features are tested and the tests pass.\n2. Bug fixes must include a test case demonstrating the error that it\n   fixes.\n3. Open an issue first and seek advice for your change before\n   submitting a pull request. Large features which have never been\n   discussed are unlikely to be accepted. **You have been warned.**\n\n[disclosure_guidelines]: https://www.linkedin.com/help/linkedin/answer/62924\n[pubkey]: https://www.linkedin.com/help/linkedin/answer/79676\n[disclosure_email]: mailto:security@linkedin.com?subject=Vulnerability%20in%20Github%20LinkedIn/fairsscale%20-%20%3Csummary%3E\n"
  },
  {
    "path": "LICENSE",
    "content": "BSD 2-CLAUSE LICENSE\n\nCopyright 2020 LinkedIn Corporation \nAll Rights Reserved.\n\nRedistribution and use in source and binary forms, with or\nwithout modification, are permitted provided that the following\nconditions are met:\n\n1. Redistributions of source code must retain the above copyright\nnotice, this list of conditions and the following disclaimer.\n\n2. Redistributions in binary form must reproduce the above\ncopyright notice, this list of conditions and the following\ndisclaimer in the documentation and/or other materials provided\nwith the distribution.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n\"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\nLIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\nA PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT\nHOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\nSPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\nLIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\nDATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\nTHEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\nOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n"
  },
  {
    "path": "NOTICE",
    "content": "Copyright 2020 LinkedIn Corporation\nAll Rights Reserved.\n\nLicensed under the BSD 2-Clause License (the \"License\").\nSee License in the project root for license information.\n\n================================================================================\n\nThis product uses Gradle (https://github.com/gradle/gradle) as its build system\nand includes the Gradle wrapper script.\nCopyright 2020 Gradle\nLicense: Apache-2.0\n\n========================================================================\nExternal dependencies\n========================================================================\nIn addition, this product automatically loads third party code from an external repository.\nSuch third party code is subject to other license terms than as set forth above.\nIn addition, such third party code may also depend on and load multiple tiers of dependencies.\n"
  },
  {
    "path": "README.md",
    "content": "# The LinkedIn Fairness Toolkit (LiFT)\n[![Build Status](https://github.com/linkedin/LiFT/actions/workflows/ci.yml/badge.svg?branch=main&event=push)](https://github.com/linkedin/LiFT/actions/workflows/ci.yml?query=branch%3Amain+event%3Apush)\n[![Release](https://img.shields.io/github/v/release/linkedin/LiFT)](https://github.com/linkedin/LiFT/releases/)\n[![License](https://img.shields.io/badge/License-BSD%202--Clause-orange.svg)](LICENSE)\n---\n> 📣 We've moved from Bintray to [Artifactory](https://linkedin.jfrog.io/artifactory/LiFT/)!\n>\n> As of version [0.2.2](https://github.com/linkedin/LiFT/releases/tag/v0.2.2), we are only publishing versions\n> to LinkedIn's Artifactory instance rather than Bintray, which is approaching end of life.\n\n## Introduction\nThe LinkedIn Fairness Toolkit (LiFT) is a Scala/Spark library that enables the measurement of fairness and the mitigation of bias in large-scale machine learning workflows.\nThe measurement module includes measuring biases in training data, evaluating fairness\nmetrics for ML models, and detecting statistically significant differences in their performance across different\nsubgroups. It can also be used for ad-hoc fairness analysis. The mitigation part includes a post-processing method for transforming\nmodel scores to ensure the so-called equality of opportunity for rankings (in the presence/absence of position bias). This method can be directly applied to the model-generated scores without changing the existing model training pipeline.\n\nThis library was created by [Sriram Vasudevan](https://www.linkedin.com/in/vasudevansriram/) and\n[Krishnaram Kenthapadi](https://www.linkedin.com/in/krishnaramkenthapadi/) (work done while at LinkedIn).\n\nAdditional Contributors:\n1. 
[Preetam Nandy](https://www.linkedin.com/in/preetamnandy/)\n\n## Copyright\n\nCopyright 2020 LinkedIn Corporation\nAll Rights Reserved.\n\nLicensed under the BSD 2-Clause License (the \"License\").\nSee [License](LICENSE) in the project root for license information.\n\n\n## Features\nLiFT provides a configuration-driven Spark job for scheduled deployments, with support for custom metrics through\nUser Defined Functions (UDFs). APIs at various levels are also exposed to enable users to build upon the library's\ncapabilities as they see fit. One can thus opt for a plug-and-play approach or deploy a customized job\nthat uses LiFT. As a result, the library can be easily integrated into ML pipelines. It can also be utilized in\nJupyter notebooks for more exploratory fairness analyses.\n\nLiFT leverages Apache Spark to load input data into in-memory, fault-tolerant and scalable data structures.\nIt strategically caches datasets and any pre-computation performed. Distributed computation is balanced with\nsingle system execution to obtain a good mix of scalability and speed. For example, distance,\ndistribution and divergence related metrics are computed on the entire dataset in a distributed\nmanner, while benefit vectors and permutation tests (for model performance) are computed\non scored dataset samples that can be collected to the driver.\n\nThe LinkedIn Fairness Toolkit (LiFT) provides the following capabilities:\n1. [Measuring Fairness Metrics on Training Data](dataset-fairness.md)\n2. [Measuring Fairness Metrics for Model Performance](model-fairness.md)\n3. 
[Achieving Equality of Opportunity](equality-of-opportunity.md)\n\nAs part of the model performance metrics, it also contains the implementation of a new permutation testing framework\nthat detects statistically significant differences in model performance (as measured by an arbitrary performance metric) across different subgroups.\n\nHigh-level details about the parameters, metrics supported and usage are described below.\nMore details about the metrics themselves are provided in the links above.\n\nA list of automatically downloaded direct dependencies is provided [here](dependencies.md).\n\n\n## Usage\n\n### Building the Library\n\nIt is recommended to use Scala 2.11.8 and Spark 2.3.0. To build, run the following:\n\n```bash\n./gradlew build\n```\nThis will produce a JAR file in the ``./lift/build/libs/`` directory.\n\nIf you want to use the library with Spark 2.4 (and the Scala 2.11.8 default), you can specify this when running the build command.\n\n```bash\n./gradlew build -PsparkVersion=2.4.3\n```\n\nYou can also build an artifact with Spark 2.4 and Scala 2.12.\n\n```bash\n./gradlew build -PsparkVersion=2.4.3 -PscalaVersion=2.12.11\n```\n\nTests typically run with the `test` task. If you want to force-run all tests, you can use:\n\n```bash\n./gradlew cleanTest test --no-build-cache\n```\n\nTo force rebuild the library, you can use:\n```bash\n./gradlew clean build --no-build-cache\n```\n\n\n### Add a LiFT Dependency to Your Project\n\nPlease check [Artifactory](https://linkedin.jfrog.io/artifactory/LiFT/) for the latest artifact versions.\n\n#### Gradle Example\n\nThe artifacts are available in LinkedIn's Artifactory instance and in Maven Central, so you can specify either repository in the top-level `build.gradle` file.\n\n```\nrepositories {\n    mavenCentral()\n    maven {\n        url \"https://linkedin.jfrog.io/artifactory/open-source/\"\n    }\n}\n```\n\nAdd the LiFT dependency to the module-level `build.gradle` file. 
Here are some examples for multiple recent Spark/Scala version combinations:\n\n```\ndependencies {\n    compile 'com.linkedin.lift:lift_2.3.0_2.11:0.1.4'\n}\n```\n```\ndependencies {\n    compile 'com.linkedin.lift:lift_2.4.3_2.11:0.1.4'\n}\n```\n```\ndependencies {\n    compile 'com.linkedin.lift:lift_2.4.3_2.12:0.1.4'\n}\n```\n\n#### Using the JAR File\n\nDepending on the mode of usage, the built JAR can be deployed as part of an offline data pipeline, depended \nupon to build jobs using its APIs, or added to the classpath of a Spark Jupyter notebook or a Spark Shell instance. For\nexample:\n```bash\n$SPARK_HOME/bin/spark-shell --jars target/lift_2.3.0_2.11_0.1.4.jar\n```\n\n\n### Usage Examples\n\n#### Measuring Dataset Fairness Metrics using the provided Spark job\nLiFT provides a Spark job for measuring fairness metrics for training data,\nas well as for the validation or test dataset:\n\n`com.linkedin.fairness.eval.jobs.MeasureDatasetFairnessMetrics`\n\nThis job can be configured using various parameters to compute fairness metrics on\nthe dataset of interest:\n```\n1. datasetPath: Input data path\n2. protectedDatasetPath: Input path to the protected dataset (optional).\n                         If not provided, the library attempts to use\n                         the right dataset based on the protected attribute.\n3. dataFormat: Format of the input datasets. This is the parameter passed\n              to the Spark reader's format method. Defaults to avro.\n4. dataOptions: A map of options to be used with Spark's reader (optional).\n5. uidField: The unique ID field, like a memberId field. It acts as the join key for the primary dataset.\n6. labelField: The label field\n7. protectedAttributeField: The protected attribute field\n8. uidProtectedAttributeField: The uid field (join key) for the protected attribute dataset\n9. outputPath: Output data path\n10. 
referenceDistribution: A reference distribution to compare against (optional).\n                          Only accepted value currently is UNIFORM.\n11. distanceMetrics: Distance and divergence metrics like SKEWS, INF_NORM_DIST,\n                    TOTAL_VAR_DIST, JS_DIVERGENCE, KL_DIVERGENCE and\n                    DEMOGRAPHIC_PARITY (optional).\n12. overallMetrics: Aggregate metrics like GENERALIZED_ENTROPY_INDEX,\n                    ATKINSONS_INDEX, THEIL_L_INDEX, THEIL_T_INDEX and\n                    COEFFICIENT_OF_VARIATION, along with their corresponding\n                    parameters.\n13. benefitMetrics: The distance/divergence metrics to use as the benefit\n                    vector when computing the overall metrics. Acceptable\n                    values are SKEWS and DEMOGRAPHIC_PARITY.\n```\nThe most up-to-date information on these parameters can always be found [here](lift/src/main/scala/com/linkedin/lift/eval/MeasureDatasetFairnessMetricsCmdLineArgs.scala).\n\nThe Spark job performs no preprocessing of the input data, and makes assumptions\nlike assuming that the unique ID field (the join key) is stored in the same\nformat in the input data and the `protectedAttribute` data. This might not\nbe the case for your dataset, in which case you can always create your own\nSpark job similar to the provided example (described below).\n\n#### Measuring Model Fairness Metrics using the provided Spark job\nLiFT provides a Spark job for measuring fairness metrics for model\nperformance, based on the labels and scores of the test or validation data:\n\n`com.linkedin.fairness.eval.jobs.MeasureModelFairnessMetrics`\n\nThis job can be configured using various parameters to compute fairness metrics on\nthe dataset of interest:\n```\n1. datasetPath Input data path\n2. 
protectedDatasetPath Input path to the protected dataset (optional).\n                        If not provided, the library attempts to use\n                        the right dataset based on the protected attribute.\n3. dataFormat: Format of the input datasets. This is the parameter passed\n              to the Spark reader's format method. Defaults to avro.\n4. dataOptions: A map of options to be used with Spark's reader (optional).\n5. uidField The unique ID field, like a memberId field. It acts as the join key for the primary dataset.\n6. labelField The label field\n7. scoreField The score field\n8. scoreType Whether the scores are raw scores or probabilities.\n             Accepted values are RAW or PROB.\n9. protectedAttributeField The protected attribute field\n10. uidProtectedAttributeField The uid field (join key) for the protected attribute dataset.\n11. groupIdField An optional field to be used for grouping, in case of ranking metrics\n12. outputPath Output data path\n13. referenceDistribution A reference distribution to compare against (optional).\n                          Only accepted value currently is UNIFORM.\n14. approxRows The approximate number of rows to sample from the input data\n               when computing model metrics. The final sampled value is\n               min(numRowsInDataset, approxRows)\n15. labelZeroPercentage The percentage of the sampled data that must\n                        be negatively labeled. This is useful in case\n                        the input data is highly skewed and you believe\n                        that stratified sampling will not obtain sufficient\n                        number of examples of a certain label.\n16. thresholdOpt An optional value that contains a threshold. It is used\n                 in case you want to generate hard binary classifications.\n                 If not provided and you request metrics that depend on\n                 explicit label predictions (eg. 
precision), the scoreType\n                 information is used to convert the scores into the\n                 probabilities of predicting positives. This is used for\n                 computing expected positive prediction counts.\n17. numTrials Number of trials to run the permutation test for. More trials\n              yield results with lower variance in the computed p-value,\n              but take more time\n18. seed The random value seed\n19. distanceMetrics Distance and divergence metrics that are to be computed.\n                    These are metrics such as Demographic Parity\n                    and Equalized Odds.\n20. permutationMetrics The metrics to use for permutation testing\n21. distanceBenefitMetrics The model metrics that are to be used for\n                           computing benefit vectors, one for each\n                           distance metric specified.\n22. performanceBenefitMetrics The model metrics that are to be used for\n                              computing benefit vectors, one for each\n                              model performance metric specified.\n23. overallMetrics The aggregate metrics that are to be computed on each\n                   of the benefit vectors generated.\n```\nThe most up-to-date information on these parameters can always be found [here](lift/src/main/scala/com/linkedin/lift/eval/MeasureModelFairnessMetricsCmdLineArgs.scala).\n\nThe Spark job performs no preprocessing of the input data, and makes assumptions\nlike assuming that the unique ID field (the join key) is stored in the same\nformat in the input data and the `protectedAttribute` data. 
This might not\nbe the case for your dataset, in which case you can always create your own\nSpark job similar to the provided example (described below).\n\n#### Learning and Applying Equality of Opportunity (EOpp) on Local Datasets\nAn example is provided in [EOppUtilsTest](lift/src/Test/scala/com/linkedin/lift/mitigation/EOppUtilsTest.scala) for applying the EOpp transformation to local datasets. We provide two simulated datasets, [TrainingData.csv](lift/src/Test/Data/TrainingData.csv) and [ValidationData.csv](lift/src/Test/Data/ValidationData.csv), each containing 1M samples. The workflow is provided as a test function <code>eOppTransformationTest()</code> consisting of the following steps:\n1. Learning position bias corrected EOpp transformation using the training data\n2. Applying the EOpp transformation on the validation data\n3. Checking EOpp in the transformed validation data with position bias\n4. Checking the (optional) score distribution preserving property of the EOpp transformation\n\n#### Custom Spark jobs built on LiFT\nIf you are implementing your own driver program to measure dataset metrics,\nhere's how you can make use of LiFT:\n\n```scala\nobject MeasureDatasetFairnessMetrics { \n  def main(progArgs: Array[String]): Unit = { \n    // Get spark session\n    val spark = SparkSession \n      .builder() \n      .appName(getClass.getSimpleName) \n      .getOrCreate() \n \n    // Parse args\n    val args = MeasureDatasetFairnessMetricsCmdLineArgs.parseArgs(progArgs) \n \n    // Load and preprocess data\n    val df = spark.read.format(args.dataFormat)\n      .load(args.datasetPath)\n      .select(args.uidField, args.labelField)\n \n    // Load protected data and join\n    val joinedDF = ...\n    joinedDF.persist \n\n    // Obtain reference distribution (optional). 
This can be used to provide a\n    // custom distribution to compare the dataset against.\n    val referenceDistrOpt = ...\n \n    // Passing in the appropriate parameters to this API computes and writes \n    // out the fairness metrics \n    FairnessMetricsUtils.computeAndWriteDatasetMetrics(distribution,\n      referenceDistrOpt, args) \n  } \n}\n```\nA complete example for the above can be found [here](lift/src/main/scala/com/linkedin/lift/eval/jobs/MeasureDatasetFairnessMetrics.scala).\n\nIn the case of measuring model metrics, a similar Spark job can be implemented:\n```scala\nobject MeasureModelFairnessMetrics { \n  def main(progArgs: Array[String]): Unit = { \n    // Get spark session\n    val spark = SparkSession \n      .builder() \n      .appName(getClass.getSimpleName) \n      .getOrCreate() \n \n    // Parse args\n    val args = MeasureModelFairnessMetricsCmdLineArgs.parseArgs(progArgs) \n \n    // Load and preprocess data\n    val df = spark.read.format(args.dataFormat)\n      .load(args.datasetPath)\n      .select(args.uidField, args.labelField)\n \n    // Load protected data and join\n    val joinedDF = ...\n    joinedDF.persist \n\n    // Obtain reference distribution (optional). This can be used to provide a\n    // custom distribution to compare the dataset against.\n    val referenceDistrOpt = ...\n \n    // Passing in the appropriate parameters to this API computes and writes \n    // out the fairness metrics \n    FairnessMetricsUtils.computeAndWriteModelMetrics(\n      joinedDF, referenceDistrOpt, args) \n  } \n}\n```\nA complete example for the above can be found [here](lift/src/main/scala/com/linkedin/lift/eval/jobs/MeasureModelFairnessMetrics.scala).\n\n\n## Contributions\n\nIf you would like to contribute to this project, please review the instructions [here](CONTRIBUTING.md).\n\n\n## Acknowledgments\n\nImplementations of some methods in LiFT were inspired by other open-source libraries. 
LiFT also contains the\nimplementation of a new permutation testing framework. Discussions with several LinkedIn employees influenced\naspects of this library. A full list of acknowledgements can be found [here](acknowledgements.md).\n\n\n## Citations\nIf you publish material that references the LinkedIn Fairness Toolkit (LiFT), you can use the following citations:\n```\n@inproceedings{vasudevan20lift,\n    author       = {Vasudevan, Sriram and Kenthapadi, Krishnaram},\n    title        = {{LiFT}: A Scalable Framework for Measuring Fairness in ML Applications},\n    booktitle    = {Proceedings of the 29th ACM International Conference on Information and Knowledge Management},\n    series       = {CIKM '20},\n    year         = {2020},\n    pages        = {},\n    numpages     = {8}\n}\n\n@misc{lift,\n    author       = {Vasudevan, Sriram and Kenthapadi, Krishnaram},\n    title        = {The LinkedIn Fairness Toolkit ({LiFT})},\n    howpublished = {\\url{https://github.com/linkedin/lift}},\n    month        = aug,\n    year         = 2020\n}\n```\n\nIf you publish material that references the permutation testing methodology that is available as part of LiFT,\nyou can use the following citation:\n```\n@inproceedings{diciccio20evaluating,\n    author       = {DiCiccio, Cyrus and Vasudevan, Sriram and Basu, Kinjal and Kenthapadi, Krishnaram and Agarwal, Deepak},\n    title        = {Evaluating Fairness Using Permutation Tests},\n    booktitle    = {Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},\n    series       = {KDD '20},\n    year         = {2020},\n    pages        = {},\n    numpages     = {11}\n}\n```\n\nIf you publish material that references the equality of opportunity methodology that is available as part of LiFT,\nyou can use the following citation:\n```\n@misc{nandy21mitigation,\n   author        = {Preetam Nandy and Cyrus Diciccio and Divya Venugopalan and Heloise Logan and Kinjal Basu and Noureddine El 
Karoui},\n   title         = {Achieving Fairness via Post-Processing in Web-Scale Recommender Systems}, \n   year          = {2021},\n   eprint        = {2006.11350},\n   archivePrefix = {arXiv}\n}\n```\n\n"
  },
  {
    "path": "acknowledgements.md",
    "content": "# Acknowledgements\n\nThe computation of AUC and ROC in this library was inspired by the following implementations:\n- numpy (3-clause BSD): https://numpy.org/doc/stable/reference/generated/numpy.trapz.html\n- spark-mllib (Apache 2.0): https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/evaluation/AreaUnderCurve.scala\n\nThe computation of a generalized confusion matrix was inspired by the following implementation:\n- AIF360 (Apache 2.0): https://github.com/IBM/AIF360/blob/master/aif360/metrics/classification_metric.py\n\nThe permutation test implemented in this library was developed in conjunction with Cyrus DiCiccio, Kinjal Basu and Deepak Agarwal.\n\nThe LinkedIn Fairness Toolkit (LiFT) was validated through its deployment as part of different product vertical ML workflows and existing ML\nplatforms at LinkedIn. Deepak Agarwal, Parvez Ahammad, Stuart Ambler, Romil Bansal, Kinjal Basu, Bee-Chung Chen,\nCyrus DiCiccio, Carlos Faham, Divya Gadde, Priyanka Gariba, Sahin Cem Geyik, Daniel Hewlett, Roshan Lal, Nicole Li,\nHeloise Logan, Sofus Macskassy, Varun Mithal, Arashpreet Singh Mor, Tanvi Motwani, Preetam Nandy, Cagri Ozcaglar,\nNitin Panjwani, Igor Perisic, Romer Rosales, Guillaume Saint-Jacques, Badrul Sarwar, Amir Sepehri, Arun Swami,\nRam Swaminathan, Grace Tang, Xin Wang, Ya Xu, and Yang Yang provided insightful feedback and discussions that influenced various aspects of LiFT.\n"
  },
  {
    "path": "build.gradle",
    "content": "buildscript {\n  repositories {\n    mavenLocal() // for local testing\n    maven {\n      url \"https://plugins.gradle.org/m2/\"\n    }\n  }\n  dependencies {\n    classpath \"org.jfrog.buildinfo:build-info-extractor-gradle:4.+\"\n    classpath \"io.github.gradle-nexus:publish-plugin:1.+\"\n    classpath \"org.shipkit:shipkit-auto-version:1.+\"\n    classpath \"org.shipkit:shipkit-changelog:1.+\"\n  }\n}\n\napply from: \"gradle/release.gradle\"\n\nallprojects {\n  apply plugin: \"eclipse\"\n  apply plugin: \"idea\"\n\n  group = \"com.linkedin.lift\"\n\n  repositories {\n    mavenCentral()\n  }\n}\n\ntask clean(type: Delete) {\n    delete \"build\"\n}\n"
  },
  {
    "path": "dataset-fairness.md",
    "content": "# Dataset-level Fairness Metrics\n\nWe provide here a list of the various metrics available for measuring fairness of\ndatasets, as well as a short description of each of them.\n\n1. **Metrics that compare against a given reference distribution:** These metrics involve computing\nsome measure of distance or divergence from a given reference distribution provided by the user.\nThe library supports only the `UNIFORM` distribution out of the box (all `label-protectedAttribute`\ncombinations must have equal number of records), but users may supply their own distribution\n(such as an apriori known gender distribution etc.).\n\n    For the most up-to-date documentation on the supported metrics, please look at the link [here](lift/src/main/scala/com/linkedin/lift/lib/DivergenceUtils.scala), and look for the\n    `computeDatasetDistanceMetrics` method as the starting point. The following metrics fall under this\n    category:\n\n    1. **Skews:** Computes the logarithm of the ratio of the observed value to the expected value. For example, if we are dealing with label-gender distributions, this metric computes\n\n        ![\\log\\left(\\frac{(0.0, MALE)_{obs}}{(0.0, MALE)_{exp}}\\right), \\log\\left(\\frac{(1.0, MALE)_{obs}}{(1.0, MALE)_{exp}}\\right), \\log\\left(\\frac{(0.0, FEMALE)_{obs}}{(0.0, FEMALE)_{exp}}\\right), \\log\\left(\\frac{(1.0, FEMALE)_{obs}}{(1.0, FEMALE)_{exp}}\\right)](https://render.githubusercontent.com/render/math?math=%5Clog%5Cleft(%5Cfrac%7B(0.0%2C%20MALE)_%7Bobs%7D%7D%7B(0.0%2C%20MALE)_%7Bexp%7D%7D%5Cright)%2C%20%5Clog%5Cleft(%5Cfrac%7B(1.0%2C%20MALE)_%7Bobs%7D%7D%7B(1.0%2C%20MALE)_%7Bexp%7D%7D%5Cright)%2C%20%5Clog%5Cleft(%5Cfrac%7B(0.0%2C%20FEMALE)_%7Bobs%7D%7D%7B(0.0%2C%20FEMALE)_%7Bexp%7D%7D%5Cright)%2C%20%5Clog%5Cleft(%5Cfrac%7B(1.0%2C%20FEMALE)_%7Bobs%7D%7D%7B(1.0%2C%20FEMALE)_%7Bexp%7D%7D%5Cright))\n\n    2. **Infinity Norm Distance:** Computes the Chebyshev Distance between the observed and reference distribution. 
It equals the maximum difference between the two distributions.\n    3. **Total Variation Distance:** Computes the Total Variation Distance between the observed and reference distribution. It is equal to half the L1 distance between the two distributions.\n    4. **JS Divergence:** The Jensen-Shannon Divergence between the observed and reference distribution. Suppose that the average of these two distributions is given by M. Then, the JS Divergence is the average of the KL Divergences between the observed distribution and M, and the reference distribution and M.\n    5. **KL Divergence:** The Kullback-Leibler Divergence between the observed and reference distribution. It is the expectation (over the observed distribution) of the logarithmic differences between the observed and reference distributions. The latter is the Skew we measure above.\n\n2. **Metrics computed on the observed distribution only:** These metrics compute some notion of\ndistance or divergence between various segments of the observed distribution.\n\n    For the most up-to-date documentation on the supported metrics, please look at the link [here](lift/src/main/scala/com/linkedin/lift/lib/DivergenceUtils.scala), and look for the\n    `computeDatasetDistanceMetrics` method as the starting point. The following metrics are supported for training data:\n\n    1. **Demographic Parity:** It measures the difference between the conditional expected value of the prediction (given one protected attribute value) and the conditional expected value of the prediction (given the other protected attribute value). This is measured for all pairs of protected attribute values.\n\n        ![DP_{(g_1, g_2)} = E\\\\[Y(X)|G=g_1\\\\] - E\\\\[Y(X)|G=g_2\\\\] = P(Y(X)=1|G=g_1) - P(Y(X)=1|G=g_2)](https://render.githubusercontent.com/render/math?math=DP_%7B(g_1%2C%20g_2)%7D%20%3D%20E%5C%5BY(X)%7CG%3Dg_1%5C%5D%20-%20E%5C%5BY(X)%7CG%3Dg_2%5C%5D%20%3D%20P(Y(X)%3D1%7CG%3Dg_1)%20-%20P(Y(X)%3D1%7CG%3Dg_2))\n\n3. 
**Aggregate Metrics:** These metrics are useful to obtain higher-level (or second-order) notions of inequality, when comparing multiple per-protected-attribute-value inequality metrics. For example, these could be used to say if one set of Skews measured is more equally distributed than another set of Skews. These lower-level metrics are called benefit vectors, and the aggregate metrics provide a notion of how uniformly these inequalities are distributed.\n\n    Note that these metrics capture inequalities within the vector. Thus, relying on these metrics alone is not sufficient. For example, take a benefit vector that captures Demographic Parity differences between (MALE, FEMALE), (FEMALE, UNKNOWN), and (MALE, UNKNOWN). Suppose that the vector for one distribution is (a, 2a, 3a) and the other is (0.5a, 1.5a, 2a). Even though the individual differences are smaller in the second distribution (for each pair of protected attribute values), an aggregate metric will deem it to be more unfair than the first because the differences between the elements of its vector are more drastic (for the first one, the ratio is 1:2:3 while for the second it is 1:3:4). However, the second distribution has better Demographic Parity. Hence, there may be conflicting notions of fairness being measured, and it is up to the end user to identify which one they would like to focus on.\n\n    For the most up-to-date documentation on the supported metrics, please look at the link [here](lift/src/main/scala/com/linkedin/lift/types/BenefitMap.scala), and look for the `computeMetric` method as the starting point. The following aggregate metrics are available:\n    1. **Generalized Entropy Index:** Computes an average of the relative benefits based on some input parameters.\n    2. **Atkinson Index:** A derivative of the Generalized Entropy Index. Used more commonly in the field of economics.\n    3. **Theil's L Index:** The Generalized Entropy Index when its parameter is set to 0. 
It is more sensitive to differences at the lower end of the distribution (the benefit vector values).\n    4. **Theil's T Index:** The Generalized Entropy Index when its parameter is set to 1. It is more sensitive to differences at the higher end of the distribution (the benefit vector values).\n    5. **Coefficient of Variation:** A derivative of the Generalized Entropy Index. It computes the value of the standard deviation divided by the mean of the benefit vector.\n"
  },
  {
    "path": "dependencies.md",
    "content": "# Dependencies\nThis product automatically downloads the following:\n1. Apache Spark and its subcomponents: While generally licensed under the Apache License 2.0, the Apache Spark project\ncontains subcomponents with separate copyright notices and license terms. See https://github.com/apache/spark/blob/branch-2.3/NOTICE.\n2. ScalaTest: While generally licensed under the Apache License 2.0, the ScalaTest library\ncontains subcomponents with separate copyright notices and license terms. See https://github.com/scalatest/scalatest/blob/3.1.x/NOTICE.\n3. scopt: While generally licensed under the MIT License, the scopt library\ncontains subcomponents with separate copyright notices and license terms. See https://github.com/scopt/scopt/blob/scopt4/LICENSE.md.\n4. TestNG: While generally licensed under the Apache License 2.0, the TestNG library\ncontains subcomponents with separate copyright notices and license terms. See https://github.com/cbeust/testng/blob/master/LICENSE.txt\n"
  },
  {
    "path": "docs/release-notes.md",
    "content": "<sup><sup>*Release notes were automatically generated by [Shipkit](http://shipkit.org/)*</sup></sup>\n\n#### 0.2.1\n - 2021-03-29 - [1 commit](https://github.com/linkedin/LiFT/compare/v0.2.0...v0.2.1) by [Preetam Nandy](https://github.com/preetamnandy) - published to [![Bintray](https://img.shields.io/badge/Bintray-0.2.1-green.svg)](https://bintray.com/linkedin/maven/LiFT/0.2.1)\n - mitigation: equality of opportunity [(#7)](https://github.com/linkedin/LiFT/pull/7)\n\n#### 0.2.0\n - 2020-09-15 - [1 commit](https://github.com/linkedin/LiFT/compare/v0.1.4...v0.2.0) by [Sriram Vasudevan](https://github.com/sriramvasudevan) - published to [![Bintray](https://img.shields.io/badge/Bintray-0.2.0-green.svg)](https://bintray.com/linkedin/maven/LiFT/0.2.0)\n - Update computeJoinedDF to take a DataFrame [(#3)](https://github.com/linkedin/LiFT/pull/3)\n\n#### 0.1.4\n - 2020-09-10 - [2 commits](https://github.com/linkedin/LiFT/compare/v0.1.3...v0.1.4) by [Sriram Vasudevan](https://github.com/sriramvasudevan) - published to [![Bintray](https://img.shields.io/badge/Bintray-0.1.4-green.svg)](https://bintray.com/linkedin/maven/LiFT/0.1.4)\n - Update Readme to document JCenter details [(#2)](https://github.com/linkedin/LiFT/pull/2)\n\n#### 0.1.3\n - 2020-09-08 - no code changes (no commits) - published to [![Bintray](https://img.shields.io/badge/Bintray-0.1.3-green.svg)](https://bintray.com/linkedin/maven/LiFT/0.1.3)\n\n#### 0.1.2\n - 2020-09-07 - 5 commits by [Sriram Vasudevan](https://github.com/sriramvasudevan) - published to [![Bintray](https://img.shields.io/badge/Bintray-0.1.2-green.svg)](https://bintray.com/linkedin/maven/LiFT/0.1.2)\n - No pull requests referenced in commit messages.\n\n"
  },
  {
    "path": "equality-of-opportunity.md",
    "content": "# Equality of Opportunity (EOpp)\n\n## EOpp Definition\nEquality of opportunity is one of the most widely used definitions of fairness. For a recommender system, EOpp suggests that randomly chosen ``qualified'' candidates should be represented equally regardless of which group they belong to; in other words, the exposure of qualified candidates from any group should be equal. Most recommender systems generate scores <em>s(X)</em> (predicting the relevance of an item to a binary response variable <em>Y</em>) to rank candidate items based on a feature set <em>X</em>. In these cases, EOpp corresponds to the independence of <<em>s(X)</em> and characteristic/attribute <em>C</em> given the response/label <em>Y=1</em>, i.e.\n\n![P(s(X) \\leq t \\mid C= c_1, Y=1) = P(s(X) \\leq t \\mid C= c_2, Y=1),\\forall c_1, c_2.](https://render.githubusercontent.com/render/math?math=P(s(X)%20%5Cleq%20t%20%5Cmid%20C%3D%20c_1%2C%20Y%3D1)%20%3D%20P(s(X)%20%5Cleq%20t%20%5Cmid%20C%3D%20c_2%2C%20Y%3D1)%2C%5Cforall%20c_1%2C%20c_2.)\n\n## EOpp Algorithm\nWe provide the post-processing technique presented in *[Nandy et al. (2021)](https://arxiv.org/abs/2006.11350)*. The function <code>eOppTransformation()</code> (see [EOppUtils](lift/src/main/scala/com/linkedin/lift/mitigation/EOppUtils.scala)) can be used to learn a transformation that can be applied to model scores for achieving EOpp. The distribution of the transformed scores can be forced to match as the distribution before transformation by setting the argument <code>originalScale = true</code>. 
This is useful for blending the transformed scores <em>s<sup>\\*</sup>(X)</em> with the original scores <em>s(X)</em> as <em>t \\* s<sup>\\*</sup>(X) + (1-t) \\* s(X)</em> to achieve a fairness-performance trade-off by adjusting the tuning parameter <em>t</em> in [0, 1].\n\n## Position bias adjustment\nTo define EOpp in the presence of the position bias, we need to take into account the dependency of the response variable <em>Y</em> on the position where the item is shown. To this end, we denote the counterfactual response when an item appears at position <em>j</em> by <em>Y(j)</em>. Furthermore, we use <img src=\"https://render.githubusercontent.com/render/math?math=%5Cgamma\"> to denote the position of an item in the ranking generated by <em>s(X)</em>. Therefore, the observed response is given by <em>Y(<img src=\"https://render.githubusercontent.com/render/math?math=%5Cgamma\">)</em>.\n\nA scoring function <em>s(X)</em> of a recommendation system satisfies EOpp with respect to a characteristic <em>C</em> if\n![P(s(X) \\leq t \\mid C=c_1,Y(\\gamma)=1) = P(s(X) \\leq t \\mid C=c_2,Y(\\gamma)=1), \\forall t, c_1, c_2.](https://render.githubusercontent.com/render/math?math=P(s(X)%20%5Cleq%20t%20%5Cmid%20C%3Dc_1%2CY(%5Cgamma)%3D1)%20%3D%20P(s(X)%20%5Cleq%20t%20%5Cmid%20C%3Dc_2%2CY(%5Cgamma)%3D1)%2C%20%5Cforall%20t%2C%20c_1%2C%20c_2.)\n\nWe provide a debiasing technique that should be applied before applying the EOpp algorithm in the presence of position bias. 
The function <code>debiasPositiveLabelScores()</code> (see [positionBiasUtils](lift/src/main/scala/com/linkedin/lift/lib/positionBiasUtils.scala)) removes the effect of position bias from the training data, and its output can be directly used by <code>eOppTransformation()</code> (see [EOppUtils](lift/src/main/scala/com/linkedin/lift/mitigation/EOppUtils.scala)) for learning the EOpp transformation.\n\n## Example\nWe illustrate the position-bias-adjusted EOpp algorithm in [EOppUtilsTest](lift/src/Test/scala/com/linkedin/lift/mitigation/EOppUtilsTest.scala). \n\n<strong>Data Generation</strong> (as in *[Nandy et al. (2021)](https://arxiv.org/abs/2006.11350)*): We generate a population of <em>p</em> = 50,000 items, where each item consists of id <em>i</em>, characteristic <em>C<sub>i</sub></em> in {0, 1}, label at position 1 <em>Y<sub>i</sub>(1)</em> and relevance <em>R<sub>i</sub></em>. We independently generate <em>C<sub>i</sub></em>'s from a <em>Bernoulli(0.6)</em> distribution. The conditional distribution of <em>Y<sub>i</sub>(1)</em> given <em>C<sub>i</sub> = 0</em> is <em>Bernoulli(0.4)</em>, and the conditional distribution of <em>Y<sub>i</sub>(1)</em> given <em>C<sub>i</sub> = 1</em> is <em>Bernoulli(0.5)</em>. Finally, <em>R<sub>i</sub></em> | <em>(C<sub>i</sub>, Y<sub>i</sub>(1))</em> is generated from \n<em>Gaussian(0.6Y<sub>i</sub>(1) + 2C<sub>i</sub>, 0.5) + (1 - C<sub>i</sub>) \\* Uniform[0, (1 + Y<sub>i</sub>(1))]</em>.\n\nWe consider a recommendation system with <em>K</em> = 50 slots. For each session, we randomly select 50 items from the population and assign a score <em>s<sub>i</sub> = R<sub>i</sub> + Gaussian(0, 0.1)</em> to each selected item <em>i</em> = 1,..., 50000. The selected items are then ranked according to <em>s<sub>i</sub></em> (in descending order) and assigned positions according to <em>rank(i)</em>. 
Finally, the item <em>i</em> at position <em>j</em> receives the observed response <em>Y(j) = Y(1) \\* Bernoulli(w<sub>j</sub>)</em> with position bias <em>w<sub>j</sub> = 1 /</em> log<sub>2</sub><em>(1+j)</em>. \n\n<strong>Validation</strong>: We learn the EOpp transformation using [training data](lift/src/Test/Data/TrainingData.csv) containing 20K i.i.d. sessions (i.e. 20K * 50 = 1M samples). For testing, we apply the transformation to [validation data](lift/src/Test/Data/TrainingData.csv) containing 20K i.i.d. sessions. To apply the effect of position bias in the transformed validation data, we multiply the labels <em>Y(1)</em> by Bernoulli(1/(1 + position)) random numbers, where the position corresponds to the rank of an item according to the transformed score. We validate the EOpp transformation by computing the 2nd Wasserstein distance between the transformed positive label score distributions corresponding to <em>C=0</em> and <em>C=1</em>. Additionally, we validate that the distribution of the transformed scores equals the distribution of the scores before the transformation.\n\n"
  },
  {
    "path": "gradle/java-publication.gradle",
    "content": "assert plugins.hasPlugin(JavaPlugin)\n\ntasks.withType(Jar) {\n  from \"$rootDir/LICENSE\"\n  from \"$rootDir/NOTICE\"\n}\n\n// Auxiliary jar files required by Maven module publications\ntask sourcesJar(type: Jar, dependsOn: classes) {\n  classifier 'sources'\n  from sourceSets.main.allSource\n}\n\ntask javadocJar(type: Jar, dependsOn: javadoc) {\n  classifier 'javadoc'\n  from javadoc.destinationDir\n}\n\nartifacts {\n  archives sourcesJar\n  archives javadocJar\n}\n\napply plugin: \"maven-publish\" // https://docs.gradle.org/current/userguide/publishing_maven.html\npublishing {\n  publications {\n    liftJar(MavenPublication) {\n      from components.java\n      artifact sourcesJar\n      artifact javadocJar\n\n      artifactId = project.archivesBaseName\n\n      pom {\n        name = artifactId\n        description = \"A Scala/Spark library that enables the measurement of fairness in large scale machine learning workflows.\"\n        url = \"https://github.com/linkedin/LiFT\"\n        licenses {\n          license {\n            name = 'BSD 2-CLAUSE'\n            url = 'https://github.com/linkedin/LiFT/blob/master/LICENSE'\n            distribution = 'repo'\n          }\n        }\n        developers {\n          [\n            'sriramvasudevan:Sriram Vasudevan'\n          ].each { devData ->\n            developer {\n              def devInfo = devData.split(':')\n              id = devInfo[0]\n              name = devInfo[1]\n              url = 'https://github.com/' + devInfo[0]\n              roles = [\"Core developer\"]\n            }\n          }\n        }\n        scm {\n          url = 'https://github.com/linkedin/LiFT.git'\n        }\n        issueManagement {\n          url = 'https://github.com/linkedin/LiFT/issues'\n          system = 'GitHub issues'\n        }\n        ciManagement {\n          url = 'https://travis-ci.com/linkedin/LiFT'\n          system = 'Travis CI'\n        }\n      }\n    }\n  }\n\n  //useful for testing - 
running \"publish\" will create artifacts/pom in a local dir\n  repositories { maven { url = \"$rootProject.buildDir/repo\" } }\n}\n\n//flushes out problems with Maven pom generation when building\ntasks.build.dependsOn(\"publishLiftJarPublicationToMavenLocal\")\n\napply plugin: 'signing' //https://docs.gradle.org/current/userguide/signing_plugin.html\nsigning {\n  if (System.getenv(\"PGP_KEY\")) {\n    useInMemoryPgpKeys(System.getenv(\"PGP_KEY\"), System.getenv(\"PGP_PWD\"))\n    sign publishing.publications.liftJar\n  }\n}\n\n//////////////////////////////////\n// LinkedIn Artifactory Config\n//////////////////////////////////\n\napply plugin: \"com.jfrog.artifactory\" //https://www.jfrog.com/confluence/display/rtf/gradle+artifactory+plugin\nartifactory {\n  contextUrl = 'https://linkedin.jfrog.io/artifactory'\n  publish {\n    repository {\n      repoKey = 'LiFT'\n      username = System.getenv('ARTIFACTORY_USER')\n      password = System.getenv('ARTIFACTORY_KEY')\n      maven = true\n    }\n\n    defaults {\n      publications('liftJar')\n      publishBuildInfo = true\n      publishArtifacts = true\n      publishPom = true\n      publishIvy = true\n    }\n  }\n  clientConfig.setIncludeEnvVars(false)\n}\n\nartifactoryPublish {\n  skip = project.hasProperty('artifactory.dryRun')\n}\n"
  },
  {
    "path": "gradle/release.gradle",
    "content": "//Plugin jars are added to the buildscript classpath in the root build.gradle file\n\n//////////////////////////////////\n// Token Verification Tasks\n//////////////////////////////////\n\ntask checkGitHubToken {\n  doFirst {\n    if (System.getenv(\"GITHUB_TOKEN\") == null) {\n      throw new Exception(\"Environment variable GITHUB_TOKEN not set.\");\n    }\n    println \"Using repository \" + System.getenv(\"GITHUB_REPOSITORY\")\n  }\n}\n\ntask verifyArtifactoryProperties {\n  doFirst {\n    if (!project.hasProperty('artifactory.dryRun')) {\n      if (System.getenv('ARTIFACTORY_USER') == null) {\n        throw new Exception(\"Environment variable ARTIFACTORY_USER not set.\");\n      }\n      if (System.getenv('ARTIFACTORY_KEY') == null) {\n        throw new Exception(\"Environment variable ARTIFACTORY_KEY not set.\");\n      }\n    }\n  }\n}\n\n//////////////////////////////////\n// Shipkit Tasks\n//////////////////////////////////\n\napply plugin: \"org.shipkit.shipkit-auto-version\" //https://github.com/shipkit/shipkit-auto-version\n\napply plugin: \"org.shipkit.shipkit-changelog\" //https://github.com/shipkit/shipkit-changelog\ntasks.named(\"generateChangelog\") {\n  dependsOn checkGitHubToken\n  previousRevision = project.ext.'shipkit-auto-version.previous-tag'\n  githubToken = System.getenv(\"GITHUB_TOKEN\")\n  repository = \"linkedin/LiFT\"\n}\n\napply plugin: \"org.shipkit.shipkit-github-release\" //https://github.com/shipkit/shipkit-changelog\ntasks.named(\"githubRelease\") {\n  def genTask = tasks.named(\"generateChangelog\").get()\n  dependsOn genTask\n  dependsOn checkGitHubToken\n  repository = genTask.repository\n  changelog = genTask.outputFile\n  githubToken = System.getenv(\"GITHUB_TOKEN\")\n  newTagRevision = System.getenv(\"GITHUB_SHA\")\n}\n\n//////////////////////////////////\n// Maven Central Config\n//////////////////////////////////\n\napply plugin: \"io.github.gradle-nexus.publish-plugin\" 
//https://github.com/gradle-nexus/publish-plugin/\nnexusPublishing {\n  repositories {\n    if (System.getenv(\"SONATYPE_PWD\")) {\n      sonatype {\n        username = System.getenv(\"SONATYPE_USER\")\n        password = System.getenv(\"SONATYPE_PWD\")\n      }\n    }\n  }\n}\n\n//////////////////////////////////\n// Additional Release Tasks\n//////////////////////////////////\n\ntask artifactoryPublishAll {\n  description = \"Runs 'artifactoryPublish' tasks from all projects\"\n}\n\nallprojects {\n  tasks.matching { it.name == \"artifactoryPublish\" }.all {\n    it.dependsOn verifyArtifactoryProperties\n    artifactoryPublishAll.dependsOn it\n  }\n}\n"
  },
  {
    "path": "gradle/wrapper/gradle-wrapper.properties",
    "content": "distributionBase=GRADLE_USER_HOME\ndistributionPath=wrapper/dists\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-6.8.3-bin.zip\nzipStoreBase=GRADLE_USER_HOME\nzipStorePath=wrapper/dists\n"
  },
  {
    "path": "gradle.properties",
    "content": "org.gradle.caching=true\n"
  },
  {
    "path": "gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=$((i+1))\n    done\n    case $i in\n        (0) set -- ;;\n        (1) set -- \"$args0\" ;;\n        (2) set -- \"$args0\" \"$args1\" ;;\n        (3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        (4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        (5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        (6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        (7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        (8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        (9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=$(save \"$@\")\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\n# by default we should be in the correct project dir, but when run from Finder on Mac, the cwd is wrong\nif [ \"$(uname)\" = \"Darwin\" ] && [ \"$HOME\" = \"$PWD\" ]; then\n  cd \"$(dirname \"$0\")\"\nfi\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "gradlew.bat",
    "content": "@rem\r\n@rem Copyright 2015 the original author or authors.\r\n@rem\r\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\r\n@rem you may not use this file except in compliance with the License.\r\n@rem You may obtain a copy of the License at\r\n@rem\r\n@rem      https://www.apache.org/licenses/LICENSE-2.0\r\n@rem\r\n@rem Unless required by applicable law or agreed to in writing, software\r\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\r\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\n@rem See the License for the specific language governing permissions and\r\n@rem limitations under the License.\r\n@rem\r\n\r\n@if \"%DEBUG%\" == \"\" @echo off\r\n@rem ##########################################################################\r\n@rem\r\n@rem  Gradle startup script for Windows\r\n@rem\r\n@rem ##########################################################################\r\n\r\n@rem Set local scope for the variables with windows NT shell\r\nif \"%OS%\"==\"Windows_NT\" setlocal\r\n\r\nset DIRNAME=%~dp0\r\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\r\nset APP_BASE_NAME=%~n0\r\nset APP_HOME=%DIRNAME%\r\n\r\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\r\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\r\n\r\n@rem Find java.exe\r\nif defined JAVA_HOME goto findJavaFromJavaHome\r\n\r\nset JAVA_EXE=java.exe\r\n%JAVA_EXE% -version >NUL 2>&1\r\nif \"%ERRORLEVEL%\" == \"0\" goto init\r\n\r\necho.\r\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:findJavaFromJavaHome\r\nset JAVA_HOME=%JAVA_HOME:\"=%\r\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\r\n\r\nif exist \"%JAVA_EXE%\" goto init\r\n\r\necho.\r\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:init\r\n@rem Get command-line arguments, handling Windows variants\r\n\r\nif not \"%OS%\" == \"Windows_NT\" goto win9xME_args\r\n\r\n:win9xME_args\r\n@rem Slurp the command line arguments.\r\nset CMD_LINE_ARGS=\r\nset _SKIP=2\r\n\r\n:win9xME_args_slurp\r\nif \"x%~1\" == \"x\" goto execute\r\n\r\nset CMD_LINE_ARGS=%*\r\n\r\n:execute\r\n@rem Setup the command line\r\n\r\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\r\n\r\n@rem Execute Gradle\r\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %CMD_LINE_ARGS%\r\n\r\n:end\r\n@rem End local scope for the variables with windows NT shell\r\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\r\n\r\n:fail\r\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\r\nrem the _cmd.exe /c_ return code!\r\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\r\nexit /b 1\r\n\r\n:mainEnd\r\nif \"%OS%\"==\"Windows_NT\" endlocal\r\n\r\n:omega\r\n"
  },
  {
    "path": "lift/build.gradle",
    "content": "import org.gradle.util.VersionNumber\n\nplugins {\n  id 'scala'\n}\n\ndef scalaVersion = findProperty(\"scalaVersion\") ?: \"2.11.8\"  // Scala 2.11.8 is the default Scala build version.\nprintln \"Scala version: $scalaVersion\"\n// If scalaVersion == \"2.11.8\", then scalaVersionShort == \"2.11\".\ndef scalaVersionShort = \"${VersionNumber.parse(scalaVersion).getMajor()}.${VersionNumber.parse(scalaVersion).getMinor()}\"\n\ndef sparkVersion = findProperty(\"sparkVersion\") ?: \"2.3.0\"  // Spark 2.3.0 is the default Spark build version.\nprintln \"Spark version: $sparkVersion\"\n\ndependencies {\n  if(VersionNumber.parse(sparkVersion) >= VersionNumber.parse(\"2.4.0\")) {\n    compile(\"org.apache.spark:spark-avro_$scalaVersionShort:$sparkVersion\")\n  } else {\n    compile(\"com.databricks:spark-avro_$scalaVersionShort:4.0.0\")\n  }\n  compile(\"com.github.scopt:scopt_$scalaVersionShort:3.5.0\")\n  compile(\"org.apache.spark:spark-core_$scalaVersionShort:$sparkVersion\")\n  compile(\"org.apache.spark:spark-sql_$scalaVersionShort:$sparkVersion\")\n  compile(\"org.apache.spark:spark-mllib_$scalaVersionShort:$sparkVersion\")\n  compile(\"org.scalatest:scalatest_$scalaVersionShort:3.1.0\")\n  compile(\"org.testng:testng:6.8.8\")\n}\n\ntest {\n  useTestNG()\n}\n\narchivesBaseName = \"${project.name}_${sparkVersion}_${scalaVersionShort}\"\n\napply from: \"$rootDir/gradle/java-publication.gradle\"\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/eval/FairnessMetricsUtils.scala",
    "content": "package com.linkedin.lift.eval\n\nimport com.linkedin.lift.lib.{DivergenceUtils, PermutationTestUtils, StatsUtils}\nimport com.linkedin.lift.types.{BenefitMap, Distribution, FairnessResult, ModelPrediction}\nimport org.apache.spark.sql.functions._\nimport org.apache.spark.sql.types.IntegerType\nimport org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}\n\n/**\n  * Utilities that stitch together various fairness metrics methods,\n  * to provide more higher level APIs.\n  */\nobject FairnessMetricsUtils {\n\n  /**\n    * Extract a flattened DataFrame containing (id, label, score) columns.\n    * This input DataFrame would have an ID field, a label field and a score field.\n    *\n    * @param df The input DataFrame\n    * @param uidField The unique ID field, like a Member ID\n    * @param labelField The label field\n    * @param scoreField The score field\n    * @param groupIdField The grouping ID field\n    * @return A flattened DataFrame containing the above 3 fields\n    */\n  def projectIdLabelsAndScores(df: DataFrame, uidField: String,\n    labelField: String, scoreField: String, groupIdField: String): DataFrame = {\n    if (groupIdField.isEmpty) {\n      df.select(col(uidField), col(labelField), col(scoreField))\n    } else {\n      val allFields = df.schema.fieldNames\n      if (allFields.contains(groupIdField)) {\n        df.select(col(uidField), col(labelField), col(scoreField), col(groupIdField))\n      } else {\n        df.select(col(uidField), col(labelField), col(scoreField))\n      }\n    }\n  }\n\n  /**\n    * Evaluate if the difference in metric values is significant, for every\n    * pair of protected attribute values (in the given set of predictions),\n    * using permutation testing.\n    *\n    * @param predictions The input predictions to evaluate\n    * @param dimType The dimension type, such as gender or age.\n    * @param metrics The metrics of interest\n    * @param numTrials Number of trials to run the permutation test 
for\n    * @param seed A random seed.\n    * @return A sequence of FairnessResults, one per metric and pair of protected attribute values\n    */\n  def computePermutationTestMetrics(predictions: Seq[ModelPrediction],\n    dimType: String, metrics: Seq[String], numTrials: Int,\n    seed: Long): Seq[FairnessResult] = {\n    // Compute all subsets of size 2 - nC2 such pairs in total\n    val allDimValPairs = predictions.map(_.dimensionValue)\n      .toSet\n      .subsets(2)\n      .toList\n\n    metrics.flatMap { metric =>\n      allDimValPairs.map { dimValPair =>\n        PermutationTestUtils.permutationTest(predictions, dimType,\n          dimValPair.head, dimValPair.last, metric, numTrials, seed)\n      }\n    }\n  }\n\n  /**\n    * Evaluate a requested set of overall fairness metrics on a benefit\n    * vector generated by computing a benefit metric for each\n    * protected attribute value (in the given set of predictions).\n    *\n    * @param sampledDF The sampled input DataFrame\n    * @param args The Model Fairness Measurement command line args\n    * @return A sequence of various model-performance-related fairness metrics.\n    */\n  def computeModelPerformanceMetrics(sampledDF: DataFrame,\n    args: MeasureModelFairnessMetricsCmdLineArgs): Seq[FairnessResult] = {\n    val samplePredictions = ModelPrediction.compute(sampledDF,\n      args.labelField, args.scoreField, args.groupIdField,\n      args.protectedAttributeField)\n\n    val permutationTestMetrics = computePermutationTestMetrics(\n      samplePredictions, args.protectedAttributeField,\n      args.permutationMetrics, args.numTrials, args.seed)\n\n    val benefitMaps = args.performanceBenefitMetrics.map { benefitMetric =>\n      BenefitMap.compute(samplePredictions,\n        args.protectedAttributeField, benefitMetric)\n    }\n    val overallFairnessMetrics =\n      benefitMaps.flatMap(_.computeOverallMetrics(args.overallMetrics))\n\n    permutationTestMetrics ++ overallFairnessMetrics\n  }\n\n  /**\n    * Join the input DataFrame 
with the protectedAttribute DataFrame, and return\n    * the input DataFrame appended with the protectedAttribute.\n    *\n    * @param protectedDF DataFrame containing the protected attribute data\n    * @param df The input DataFrame\n    * @param uidField The unique ID field, such as memberId\n    * @param protectedDatasetPath Path to the protected dataset. If empty, we\n    *                             attempt to load the right dataset based on the\n    *                             protectedAttribute specified.\n    * @param uidProtectedAttributeField The uid field of the protected\n    *                                   attribute dataset\n    * @param protectedAttributeField The protected attribute field in the\n    *                                protectedAttribute DataFrame\n    * @return The joined DataFrame\n    */\n  def computeJoinedDF(protectedDF: DataFrame, df: DataFrame, uidField: String,\n    protectedDatasetPath: String, uidProtectedAttributeField: String,\n    protectedAttributeField: String): DataFrame = {\n    protectedDF.select(col(uidProtectedAttributeField).as(uidField),\n      col(protectedAttributeField))\n      .join(df, uidField)\n  }\n\n  /**\n    * Computes a reference distribution based on the input dataset's\n    * distribution.\n    *\n    * @param inputDistr The input dataset's distribution of protectedAttributes only\n    * @param referenceDistribution The kind of reference distribution desired.\n    *                              Currently, only a uniform distribution is supported.\n    * @return The computed reference distribution, or None if the specified\n    *         referenceDistribution parameter is invalid.\n    */\n  def computeReferenceDistributionOpt(inputDistr: Distribution,\n    referenceDistribution: String):\n    Option[Distribution] = {\n    if (referenceDistribution != \"UNIFORM\") {\n      None\n    } else {\n      val numDims = inputDistr.entries.size\n      val uniformWeight: Double = 1.0 / numDims\n      val 
uniformEntries = inputDistr.entries.map { case (dimVal, _) =>\n        (dimVal, uniformWeight)\n      }\n      Some(Distribution(uniformEntries))\n    }\n  }\n\n  /**\n    * Computes all requested distance and divergence metrics for a given\n    * distribution. It compares it to a reference distribution if necessary.\n    *\n    * We assume that the original and reference distributions are for the training\n    * data, and are over (label, protectedAttribute), and that the reference\n    * distribution also contains similar dimensions.\n    *\n    * @param distribution The distribution of (label, protectedAttribute) for the\n    *                     training dataset\n    * @param referenceDistrOpt An optional field that contains a reference\n    *                          distribution of (label, protectedAttribute) to\n    *                          compare against. If not provided, distance and\n    *                          divergence metrics that perform comparisons will\n    *                          return empty results.\n    * @param args The set of parsed command line arguments for this job.\n    * @return A sequence of Metric values that contain the name of the metric,\n    *         any parameters used, and the result of the computation.\n    */\n  def computeDatasetMetrics(distribution: Distribution,\n    referenceDistrOpt: Option[Distribution],\n    args: MeasureDatasetFairnessMetricsCmdLineArgs,\n    scoreField: String = \"\"): Seq[FairnessResult] = {\n    val computedDistanceMetrics = DivergenceUtils.computeDistanceMetrics(\n      args.distanceMetrics, distribution, referenceDistrOpt,\n      args.labelField, scoreField, args.protectedAttributeField)\n\n    val benefitMaps = DivergenceUtils.computeDistanceMetrics(\n      args.benefitMetrics, distribution, referenceDistrOpt,\n      args.labelField, scoreField, args.protectedAttributeField)\n      .map(_.toBenefitMap)\n    val computedOverallMetrics =\n      
benefitMaps.flatMap(_.computeOverallMetrics(args.overallMetrics))\n\n    computedDistanceMetrics ++ computedOverallMetrics\n  }\n\n  /**\n    * Computes a DataFrame with probabilities for the score field. If the\n    * threshold is specified, we use it to obtain 0/1 values for the scores,\n    * thus giving us probabilities. If we have raw scores, we convert it into\n    * probabilities using the sigmoid function. If we have probabilities, we\n    * return them as is.\n    *\n    * @param df The input DataFrame\n    * @param thresholdOpt An optional threshold that can be provided to convert\n    *                     the scores into 0/1 predictions directly.\n    * @param labelField The label field\n    * @param scoreField The score field\n    * @param protectedAttributeField The protected attribute field\n    * @param scoreType Whether the scores are raw scores or probabilities\n    * @return The input DataFrame with its scores transformed into probabilities.\n    */\n  def computeProbabilityDF(df: DataFrame, thresholdOpt: Option[Double],\n    labelField: String, scoreField: String, protectedAttributeField: String,\n    scoreType: String): DataFrame = {\n    val probDF =\n      if (scoreType.equals(\"RAW\")) {\n        // Compute sigmoid(x) = 1.0 / (1.0 + exp(-x))\n        df.select(col(labelField),\n          (lit(1.0) /\n            (lit(1.0) + exp(-col(scoreField))))\n            .as(scoreField),\n          col(protectedAttributeField))\n      } else {\n        df.select(labelField, scoreField, protectedAttributeField)\n      }\n\n    thresholdOpt.map { threshold =>\n      probDF.select(col(labelField),\n        (col(scoreField) > threshold).cast(IntegerType).as(scoreField),\n        col(protectedAttributeField))\n    }.getOrElse(probDF)\n  }\n\n  /**\n    * Computes all requested model-related fairness metrics for a given\n    * dataset. 
We assume that the dataset has the (label, score/prediction,\n    * protectedAttribute) fields at the very least.\n    *\n    * At a high level, there are three kinds of metrics. The first involves\n    * checking for statistically significant differences in a particular metric\n    * across various protected groups. The second kind computes conventional\n    * notions of fairness such as Demographic Parity and Equalized Odds. The\n    * third kind is to compute aggregate metrics. This can further be divided\n    * into two kinds: the first is used to summarize metrics such as Demographic\n    * Parity and Equalized Odds, while the second is used to summarize model\n    * performance metrics across various groups (such as precision, TPR, FPR, AUC etc.).\n    *\n    * This method works by sampling the input data and computes metrics. Typically,\n    * 50k-100k rows of data are more than sufficient to compute good estimates\n    * without having to analyze the entire output. If the number of data points\n    * in the input DataFrame is less than this, the entire dataset is analyzed.\n    *\n    * @param df The input DataFrame\n    * @param referenceDistrOpt An optional reference distribution to compare against\n    * @param args The set of model-related fairness measurement command line args\n    * @return A sequence of Metric values that contain the name of the metric,\n    *         any parameters used, and the result of the computation.\n    */\n  def computeModelMetrics(df: DataFrame, referenceDistrOpt: Option[Distribution],\n    args: MeasureModelFairnessMetricsCmdLineArgs): Seq[FairnessResult] = {\n    val probabilityDF = computeProbabilityDF(df, args.thresholdOpt,\n      args.labelField, args.scoreField,\n      args.protectedAttributeField, args.scoreType)\n\n    val distribution = DivergenceUtils\n      .computeGeneralizedPredictionCountDistribution(probabilityDF,\n        args.labelField, args.scoreField, args.protectedAttributeField)\n\n    val distMetrics 
= computeDatasetMetrics(distribution, referenceDistrOpt,\n      MeasureDatasetFairnessMetricsCmdLineArgs(\n        datasetPath = args.datasetPath,\n        protectedDatasetPath = args.protectedDatasetPath,\n        uidField = args.uidField,\n        labelField = args.labelField,\n        protectedAttributeField = args.protectedAttributeField,\n        outputPath = args.outputPath,\n        referenceDistribution = args.referenceDistribution,\n        distanceMetrics = args.distanceMetrics,\n        overallMetrics = args.overallMetrics,\n        benefitMetrics = args.distanceBenefitMetrics), args.scoreField)\n\n    val sampledDF =\n      if (args.groupIdField.isEmpty) {\n        StatsUtils.sampleDataFrame(probabilityDF, args.labelField,\n          args.approxRows, args.labelZeroPercentage, args.seed)\n      } else {\n        StatsUtils.sampleDataFrameByGroupId(df, args.labelField, args.scoreField,\n          args.groupIdField, args.protectedAttributeField, args.approxRows, args.seed)\n      }\n\n    val modelPerfMetrics = computeModelPerformanceMetrics(sampledDF, args)\n    distMetrics ++ modelPerfMetrics\n  }\n\n  /**\n    * Writes a sequence of FairnessResults out to disk.\n    *\n    * @param spark The Spark Session to use\n    * @param dataFormat Output data format\n    * @param dataOptions options for the DataFrameWriter\n    * @param outputPath Output path for the results\n    * @param fairnessResults Sequence of results to be written out\n    */\n  def writeFairnessResults(spark: SparkSession, dataFormat: String,\n    dataOptions: Map[String, String], outputPath: String,\n    fairnessResults: Seq[FairnessResult]): Unit = {\n    FairnessResult.toDF(spark, fairnessResults)\n      .repartition(1)\n      .write\n      .mode(SaveMode.Overwrite)\n      .format(dataFormat)\n      .options(dataOptions)\n      .save(outputPath)\n  }\n\n  /**\n    * Computes dataset metrics and logs the output for tracking purposes.\n    *\n    * @param df The input data\n    * @param 
referenceDistrOpt An optional reference distribution\n    * @param args Command line args for dataset metrics measurement\n    */\n  def computeAndWriteDatasetMetrics(df: DataFrame,\n    referenceDistrOpt: Option[Distribution],\n    args: MeasureDatasetFairnessMetricsCmdLineArgs): Unit = {\n    // Compute the label-protected attribute distribution of the input data\n    val distribution = Distribution.compute(df,\n      Set(args.labelField, args.protectedAttributeField))\n\n    // Passing in the appropriate parameters to this API returns the fairness metrics\n    val fairnessMetrics = computeDatasetMetrics(distribution, referenceDistrOpt, args)\n\n    // The above fairness metrics can be written out to HDFS\n    writeFairnessResults(df.sparkSession, args.dataFormat, args.dataOptions,\n      args.outputPath, fairnessMetrics)\n  }\n\n  /**\n    * Computes model metrics and logs the output for tracking purposes.\n    *\n    * @param df The input data\n    * @param referenceDistrOpt An optional reference distribution\n    * @param args Command line args for model metrics measurement\n    */\n  def computeAndWriteModelMetrics(df: DataFrame,\n    referenceDistrOpt: Option[Distribution],\n    args: MeasureModelFairnessMetricsCmdLineArgs): Unit = {\n    // Passing in the appropriate parameters to this API returns the fairness metrics\n    val fairnessMetrics = computeModelMetrics(df, referenceDistrOpt, args)\n\n    // The above fairness metrics can be written out to HDFS\n    writeFairnessResults(df.sparkSession, args.dataFormat, args.dataOptions,\n      args.outputPath, fairnessMetrics)\n  }\n}\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/eval/MeasureDatasetFairnessMetricsCmdLineArgs.scala",
    "content": "package com.linkedin.lift.eval\n\n/**\n  * Contains the dataset metrics command line arguments\n  *\n  * @param datasetPath Input data path\n  * @param protectedDatasetPath Input path to the protected dataset (optional).\n  *                             If not provided, the library attempts to use\n  *                             the right dataset based on the protected attribute.\n  * @param dataFormat Format of the input datasets. This is the parameter passed\n  *                   to the Spark reader's format method. Defaults to avro.\n  * @param dataOptions A map of options to be used with Spark's reader (optional).\n  * @param uidField The unique ID field, like a memberId field.\n  * @param labelField The label field\n  * @param protectedAttributeField The protected attribute field\n  * @param uidProtectedAttributeField The uid field for the protected attribute dataset\n  * @param outputPath Output data path\n  * @param referenceDistribution A reference distribution to compare against (optional).\n  *                              Only accepted value currently is UNIFORM.\n  * @param distanceMetrics Distance and divergence metrics like SKEWS, INF_NORM_DIST,\n  *                        TOTAL_VAR_DIST, JS_DIVERGENCE, KL_DIVERGENCE and\n  *                        DEMOGRAPHIC_PARITY (optional).\n  * @param overallMetrics Aggregate metrics like GENERALIZED_ENTROPY_INDEX,\n  *                       ATKINSONS_INDEX, THEIL_L_INDEX, THEIL_T_INDEX and\n  *                       COEFFICIENT_OF_VARIATION, along with their corresponding\n  *                       parameters.\n  * @param benefitMetrics The distance/divergence metrics to use as the benefit\n  *                       vector when computing the overall metrics. 
Acceptable\n  *                       values are SKEWS and DEMOGRAPHIC_PARITY.\n  */\ncase class MeasureDatasetFairnessMetricsCmdLineArgs(\n  datasetPath: String = \"\",\n  protectedDatasetPath: String = \"\",\n  dataFormat: String = \"com.databricks.spark.avro\",\n  dataOptions: Map[String, String] = Map(),\n  uidField: String = \"\",\n  labelField: String = \"\",\n  protectedAttributeField: String = \"\",\n  uidProtectedAttributeField: String = \"memberId\",\n  outputPath: String = \"\",\n  referenceDistribution: String = \"\",\n  distanceMetrics: Seq[String] = Seq(),\n  overallMetrics: Map[String, String] = Map(),\n  benefitMetrics: Seq[String] = Seq()\n)\n\nobject MeasureDatasetFairnessMetricsCmdLineArgs {\n  /**\n    * Parse command line arguments to generate a structured case class.\n    *\n    * @param args The command line args\n    * @return A case class with the populated parameters\n    */\n  def parseArgs(args: Seq[String]): MeasureDatasetFairnessMetricsCmdLineArgs = {\n    val parser = new scopt.OptionParser[MeasureDatasetFairnessMetricsCmdLineArgs](\n      \"MeasureDatasetFairnessMetrics\") {\n      opt[String](\"datasetPath\") required() action { (x, c) =>\n        c.copy(datasetPath = x)\n      }\n      opt[String](\"protectedDatasetPath\") optional() action { (x, c) =>\n        c.copy(protectedDatasetPath = x)\n      }\n      opt[String](\"dataFormat\") optional() action { (x, c) =>\n        c.copy(dataFormat = x)\n      }\n      opt[Map[String, String]](\"dataOptions\") optional() action { (x, c) =>\n        c.copy(dataOptions = x)\n      }\n      opt[String](\"uidField\") required() action { (x, c) =>\n        c.copy(uidField = x)\n      }\n      opt[String](\"labelField\") required() action { (x, c) =>\n        c.copy(labelField = x)\n      }\n      opt[String](\"protectedAttributeField\") required() action { (x, c) =>\n        c.copy(protectedAttributeField = x)\n      }\n      opt[String](\"uidProtectedAttributeField\") optional() action { (x, 
c) =>\n        c.copy(uidProtectedAttributeField = x)\n      }\n      opt[String](\"outputPath\") required() action { (x, c) =>\n        c.copy(outputPath = x)\n      }\n      opt[String](\"referenceDistribution\") optional() action { (x, c) =>\n        c.copy(referenceDistribution = x)\n      }\n      opt[Seq[String]](\"distanceMetrics\") optional() action { (x, c) =>\n        c.copy(distanceMetrics = x)\n      }\n      opt[Map[String, String]](\"overallMetrics\") optional() action { (x, c) =>\n        c.copy(overallMetrics = x)\n      }\n      opt[Seq[String]](\"benefitMetrics\") optional() action { (x, c) =>\n        c.copy(benefitMetrics = x)\n      }\n    }\n\n    // If the parser was unable to read the arguments correctly,\n    // this will generate an exception and end the job\n    val cmdLineArgsOpt: Option[MeasureDatasetFairnessMetricsCmdLineArgs] = parser.parse(\n      args, MeasureDatasetFairnessMetricsCmdLineArgs())\n    require(cmdLineArgsOpt.isDefined)\n\n    cmdLineArgsOpt.get\n  }\n}\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/eval/MeasureModelFairnessMetricsCmdLineArgs.scala",
    "content": "package com.linkedin.lift.eval\n\n/**\n  * Contains the model metrics command line arguments\n  *\n  * @param datasetPath Input data path\n  * @param protectedDatasetPath Input path to the protected dataset (optional).\n  *                             If not provided, the library attempts to use\n  *                             the right dataset based on the protected attribute.\n  * @param dataFormat Format of the input datasets. This is the parameter passed\n  *                   to the Spark reader's format method. Defaults to avro.\n  * @param dataOptions A map of options to be used with Spark's reader (optional).\n  * @param uidField The unique ID field, like a memberId field.\n  * @param labelField The label field\n  * @param scoreField The score field\n  * @param scoreType Whether the scores are raw scores or probabilities.\n  *                  Accepted values are RAW or PROB.\n  * @param protectedAttributeField The protected attribute field\n  * @param uidProtectedAttributeField The uid field for the protected attribute dataset\n  * @param groupIdField An optional field to be used for grouping, in case of ranking metrics\n  * @param outputPath Output data path\n  * @param referenceDistribution A reference distribution to compare against (optional).\n  *                              Only accepted value currently is UNIFORM.\n  * @param approxRows The approximate number of rows to sample from the input data\n  *                   when computing model metrics. The final sampled value is\n  *                   min(numRowsInDataset, approxRows)\n  * @param labelZeroPercentage The percentage of the sampled data that must\n  *                            be negatively labeled. 
This is useful in case\n  *                            the input data is highly skewed and you believe\n  *                            that stratified sampling will not obtain a sufficient\n  *                            number of examples of a certain label.\n  * @param thresholdOpt An optional value that contains a threshold. It is used\n  *                     in case you want to generate hard binary classifications.\n  *                     If not provided and you request metrics that depend on\n  *                     explicit label predictions (e.g., precision), the scoreType\n  *                     information is used to convert the scores into the\n  *                     probabilities of predicting positives. This is used for\n  *                     computing expected positive prediction counts.\n  * @param numTrials Number of trials to run the permutation test for. More trials\n  *                  yield results with lower variance in the computed p-value,\n  *                  but take more time.\n  * @param seed The random value seed\n  * @param distanceMetrics Distance and divergence metrics that are to be computed.\n  *                        These are metrics such as Demographic Parity\n  *                        and Equalized Odds.\n  * @param permutationMetrics The metrics to use for permutation testing\n  * @param distanceBenefitMetrics The model metrics that are to be used for\n  *                               computing benefit vectors, one for each\n  *                               distance metric specified.\n  * @param performanceBenefitMetrics The model metrics that are to be used for\n  *                                  computing benefit vectors, one for each\n  *                                  model performance metric specified.\n  * @param overallMetrics The aggregate metrics that are to be computed on each\n  *                       of the benefit vectors generated.\n  */\ncase class MeasureModelFairnessMetricsCmdLineArgs(\n  
datasetPath: String = \"\",\n  protectedDatasetPath: String = \"\",\n  dataFormat: String = \"com.databricks.spark.avro\",\n  dataOptions: Map[String, String] = Map(),\n  uidField: String = \"\",\n  labelField: String = \"\",\n  scoreField: String = \"\",\n  scoreType: String = \"PROB\",\n  protectedAttributeField: String = \"\",\n  uidProtectedAttributeField: String = \"memberId\",\n  groupIdField: String = \"\",\n  outputPath: String = \"\",\n  referenceDistribution: String = \"\",\n  approxRows: Long = 500000L,\n  labelZeroPercentage: Double = -1.0,\n  thresholdOpt: Option[Double] = None,\n  numTrials: Int = 1000,\n  seed: Long = 0L,\n  distanceMetrics: Seq[String] = Seq(),\n  permutationMetrics: Seq[String] = Seq(),\n  distanceBenefitMetrics: Seq[String] = Seq(),\n  performanceBenefitMetrics: Seq[String] = Seq(),\n  overallMetrics: Map[String, String] = Map()\n)\n\nobject MeasureModelFairnessMetricsCmdLineArgs {\n  /**\n    * Parse command line arguments to generate a structured case class.\n    *\n    * @param args The command line args\n    * @return A case class with the populated parameters\n    */\n  def parseArgs(args: Seq[String]): MeasureModelFairnessMetricsCmdLineArgs = {\n    val parser = new scopt.OptionParser[MeasureModelFairnessMetricsCmdLineArgs](\n      \"MeasureModelFairnessMetrics\") {\n      opt[String](\"datasetPath\") required() action { (x, c) =>\n        c.copy(datasetPath = x)\n      }\n      opt[String](\"protectedDatasetPath\") optional() action { (x, c) =>\n        c.copy(protectedDatasetPath = x)\n      }\n      opt[String](\"dataFormat\") optional() action { (x, c) =>\n        c.copy(dataFormat = x)\n      }\n      opt[Map[String, String]](\"dataOptions\") optional() action { (x, c) =>\n        c.copy(dataOptions = x)\n      }\n      opt[String](\"uidField\") required() action { (x, c) =>\n        c.copy(uidField = x)\n      }\n      opt[String](\"labelField\") required() action { (x, c) =>\n        c.copy(labelField = x)\n      }\n  
    opt[String](\"scoreField\") required() action { (x, c) =>\n        c.copy(scoreField = x)\n      }\n      opt[String](\"scoreType\") required() action { (x, c) =>\n        c.copy(scoreType = x)\n      }\n      opt[String](\"protectedAttributeField\") required() action { (x, c) =>\n        c.copy(protectedAttributeField = x)\n      }\n      opt[String](\"uidProtectedAttributeField\") optional() action { (x, c) =>\n        c.copy(uidProtectedAttributeField = x)\n      }\n      opt[String](\"groupIdField\") optional() action { (x, c) =>\n        c.copy(groupIdField = x)\n      }\n      opt[String](\"outputPath\") required() action { (x, c) =>\n        c.copy(outputPath = x)\n      }\n      opt[String](\"referenceDistribution\") optional() action { (x, c) =>\n        c.copy(referenceDistribution = x)\n      }\n      opt[Long](\"approxRows\") optional() action { (x, c) =>\n        c.copy(approxRows = x)\n      }\n      opt[Double](\"labelZeroPercentage\") optional() action { (x, c) =>\n        c.copy(labelZeroPercentage = x)\n      }\n      opt[Double](\"threshold\") optional() action { (x, c) =>\n        c.copy(thresholdOpt = Some(x))\n      }\n      opt[Int](\"numTrials\") optional() action { (x, c) =>\n        c.copy(numTrials = x)\n      }\n      opt[Long](\"seed\") optional() action { (x, c) =>\n        c.copy(seed = x)\n      }\n      opt[Seq[String]](\"distanceMetrics\") optional() action { (x, c) =>\n        c.copy(distanceMetrics = x)\n      }\n      opt[Seq[String]](\"permutationMetrics\") optional() action { (x, c) =>\n        c.copy(permutationMetrics = x)\n      }\n      opt[Map[String, String]](\"overallMetrics\") optional() action { (x, c) =>\n        c.copy(overallMetrics = x)\n      }\n      opt[Seq[String]](\"distanceBenefitMetrics\") optional() action { (x, c) =>\n        c.copy(distanceBenefitMetrics = x)\n      }\n      opt[Seq[String]](\"performanceBenefitMetrics\") optional() action { (x, c) =>\n        c.copy(performanceBenefitMetrics = x)\n  
    }\n    }\n\n    // If the parser was unable to read the arguments correctly,\n    // this will generate an exception and end the job\n    val cmdLineArgsOpt: Option[MeasureModelFairnessMetricsCmdLineArgs] = parser.parse(\n      args, MeasureModelFairnessMetricsCmdLineArgs())\n    require(cmdLineArgsOpt.isDefined)\n\n    cmdLineArgsOpt.get\n  }\n}\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/eval/jobs/MeasureDatasetFairnessMetrics.scala",
    "content": "package com.linkedin.lift.eval.jobs\n\nimport com.linkedin.lift.eval.{FairnessMetricsUtils, MeasureDatasetFairnessMetricsCmdLineArgs}\nimport com.linkedin.lift.types.Distribution\nimport org.apache.spark.sql.SparkSession\n\n/**\n  * A basic dataset-level fairness metrics measurement program. If your use case\n  * is more involved, you can create a similar wrapper driver program that\n  * prepares the data and calls the computeDatasetMetrics API.\n  */\nobject MeasureDatasetFairnessMetrics {\n  /**\n    * Driver program to measure various fairness metrics\n    *\n    * @param progArgs Command line arguments\n    */\n  def main(progArgs: Array[String]): Unit = {\n    val spark = SparkSession\n      .builder()\n      .appName(getClass.getSimpleName)\n      .getOrCreate()\n\n    val args = MeasureDatasetFairnessMetricsCmdLineArgs.parseArgs(progArgs)\n\n    // One could choose to do their own preprocessing here\n    // For example, filtering out only certain records based on some threshold\n    val dfReader = spark.read.format(args.dataFormat).options(args.dataOptions)\n    val df = dfReader.load(args.datasetPath)\n      .select(args.uidField, args.labelField)\n    val protectedDF = dfReader.load(args.protectedDatasetPath)\n\n    // Similar preprocessing can be done with the protected attribute data\n    val joinedDF = FairnessMetricsUtils.computeJoinedDF(protectedDF, df, args.uidField,\n      args.protectedDatasetPath, args.uidProtectedAttributeField,\n      args.protectedAttributeField)\n\n    // Input distributions are computed using the joined data\n    val referenceDistrOpt =\n      if (args.referenceDistribution.isEmpty) {\n        None\n      } else {\n        val distribution = Distribution.compute(joinedDF,\n          Set(args.labelField, args.protectedAttributeField))\n        FairnessMetricsUtils.computeReferenceDistributionOpt(\n          distribution, args.referenceDistribution)\n      }\n\n    // Passing in the appropriate parameters to 
this API computes and writes\n    // out the fairness metrics\n    FairnessMetricsUtils.computeAndWriteDatasetMetrics(joinedDF,\n      referenceDistrOpt, args)\n  }\n}\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/eval/jobs/MeasureModelFairnessMetrics.scala",
    "content": "package com.linkedin.lift.eval.jobs\n\nimport com.linkedin.lift.eval.{FairnessMetricsUtils, MeasureModelFairnessMetricsCmdLineArgs}\nimport com.linkedin.lift.lib.DivergenceUtils\nimport org.apache.spark.sql.SparkSession\n\n/**\n  * A basic model-level fairness metrics measurement program. If your use case\n  * is more involved, you can create a similar wrapper driver program that\n  * prepares the data and calls the computeModelMetrics API.\n  */\nobject MeasureModelFairnessMetrics {\n  /**\n    * Driver program to measure various fairness metrics\n    *\n    * @param progArgs Command line arguments\n    */\n  def main(progArgs: Array[String]): Unit = {\n    val spark = SparkSession\n      .builder()\n      .appName(getClass.getSimpleName)\n      .getOrCreate()\n\n    val args = MeasureModelFairnessMetricsCmdLineArgs.parseArgs(progArgs)\n\n    // One could choose to do their own preprocessing here\n    // For example, filtering out only certain records based on some threshold\n    val dfReader = spark.read.format(args.dataFormat).options(args.dataOptions)\n    val df = FairnessMetricsUtils.projectIdLabelsAndScores(dfReader.load(args.datasetPath),\n      args.uidField, args.labelField, args.scoreField, args.groupIdField)\n    val protectedDF = dfReader.load(args.protectedDatasetPath)\n\n    // Similar preprocessing can be done with the protected attribute data\n    val joinedDF = FairnessMetricsUtils.computeJoinedDF(protectedDF, df, args.uidField,\n      args.protectedDatasetPath, args.uidProtectedAttributeField,\n      args.protectedAttributeField)\n    joinedDF.persist\n\n    // Input distributions are computed using the joined data\n    val referenceDistrOpt =\n      if (args.referenceDistribution.isEmpty) {\n        None\n      } else {\n        val probabilityDF = FairnessMetricsUtils.computeProbabilityDF(joinedDF,\n          args.thresholdOpt, args.labelField, args.scoreField,\n          args.protectedAttributeField, args.scoreType)\n        
val distribution = DivergenceUtils\n          .computeGeneralizedPredictionCountDistribution(probabilityDF,\n            args.labelField, args.scoreField, args.protectedAttributeField)\n          .computeMarginal(Set(args.scoreField, args.protectedAttributeField))\n        FairnessMetricsUtils.computeReferenceDistributionOpt(\n          distribution, args.referenceDistribution)\n      }\n\n    // Passing in the appropriate parameters to this API computes and writes\n    // out the fairness metrics\n    FairnessMetricsUtils.computeAndWriteModelMetrics(joinedDF,\n      referenceDistrOpt, args)\n  }\n}\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/lib/DivergenceUtils.scala",
    "content": "package com.linkedin.lift.lib\n\nimport com.linkedin.lift.types.Distribution.DimensionValues\nimport com.linkedin.lift.types.{Distribution, FairnessResult}\nimport org.apache.spark.sql.DataFrame\nimport org.apache.spark.sql.functions._\n\n/**\n  * Utilities to compute divergence, distance, and skew measures.\n  */\nobject DivergenceUtils {\n\n  /**\n    * KL divergence from q to p. Note that this is asymmetric.\n    * We assume that the distributions are valid\n    * (i.e., they don't have any non-negative values).\n    *\n    * This method normalizes the values into probabilities.\n    *\n    * There is support for Laplace smoothing for the source distribution\n    * to avoid divide-by-zero errors. To ensure numerical stability, we compute\n    * KL divergence on the counts and then adjust this to convert it into the\n    * actual KL divergence on probabilities. We use log base 2 to measure info\n    * in terms of bits.\n    *\n    * @param p Target distribution\n    * @param q Source distribution\n    * @param alpha Parameter to set amount of Laplace smoothing.\n    *              Defaults to 1.0 (add one smoothing)\n    * @return Kullback-Leibler Divergence\n    */\n  def computeKullbackLeiblerDivergence(p: Distribution, q: Distribution,\n    alpha: Double = 1.0): Double = {\n    val logVals = p.zip(q).map { case (_, pVal, qVal) =>\n      if (pVal == 0.0) {\n        0.0\n      } else {\n        pVal * math.log(pVal / (qVal + alpha))\n      }\n    }\n\n    val pSum = p.sum\n    val qSum = q.sum + (alpha * logVals.size)\n\n    1.0 / math.log(2.0) * ((logVals.sum / pSum) + math.log(qSum / pSum))\n  }\n\n  /**\n    * JS divergence of p and q. 
Note that this is symmetric.\n    * We assume that the distributions are valid\n    * (i.e., they don't have any negative values).\n    *\n    * The JS divergence is the average of the KL divergences from M to p and\n    * from M to q, where M is the average of the probability distributions of\n    * p and q.\n    *\n    * @param p First distribution\n    * @param q Second distribution\n    * @return Jensen-Shannon Divergence\n    */\n  def computeJensenShannonDivergence(p: Distribution, q: Distribution): Double = {\n    val pSum = p.sum\n    val qSum = q.sum\n    val avgDistributionEntries = p.zip(q)\n      .map { case (dimensions, pVal, qVal) =>\n        (dimensions, 0.5 * ((pVal / pSum) + (qVal / qSum)))\n      }\n      .toMap\n    val avgDistribution = Distribution(avgDistributionEntries)\n\n    // We don't need any smoothing since an avgDistribution value will be zero\n    // iff the corresponding p and q values are both 0.0. But if this is the\n    // case, p * log(p/avg) will be zero, so no divide by zero errors.\n    0.5 * (computeKullbackLeiblerDivergence(p, avgDistribution, 0.0) +\n      computeKullbackLeiblerDivergence(q, avgDistribution, 0.0))\n  }\n\n  /**\n    * Total variation distance between p and q. Note that this is symmetric.\n    * We assume that the distributions are valid\n    * (i.e., they don't have any negative values).\n    *\n    * Total variation distance between p and q equals half the L1-distance\n    * between the underlying probability distribution vectors. 
It also equals\n    * the largest possible difference between the probabilities that the two\n    * distributions can assign to the same event.\n    * https://en.wikipedia.org/wiki/Total_variation_distance_of_probability_measures\n    *\n    * @param p First distribution\n    * @param q Second distribution\n    * @return Total variation distance\n    */\n  def computeTotalVariationDistance(p: Distribution, q: Distribution): Double = {\n    val pSum = p.sum\n    val qSum = q.sum\n    val l1Distance = p.zip(q)\n      .map { case (_, pVal, qVal) =>\n        math.abs((pVal / pSum) - (qVal / qSum))\n      }\n      .sum\n\n    0.5 * l1Distance\n  }\n\n  /**\n    * Infinity norm distance (Chebyshev distance) between probability\n    * distributions corresponding to p and q. Note that this is symmetric.\n    * We assume that the distributions are valid\n    * (i.e., they don't have any negative values).\n    *\n    * Infinity norm distance (Chebyshev distance) equals the maximum\n    * difference between the probabilities assigned by p and q along any\n    * dimension.\n    * https://en.wikipedia.org/wiki/Chebyshev_distance\n    *\n    * @param p First distribution\n    * @param q Second distribution\n    * @return Infinity norm distance\n    */\n  def computeInfinityNormDistance(p: Distribution, q: Distribution): Double = {\n    val pSum = p.sum\n    val qSum = q.sum\n    val infinityNormDistance = p.zip(q)\n      .map { case (_, pVal, qVal) =>\n        math.abs((pVal / pSum) - (qVal / qSum))\n      }\n      .max\n\n    infinityNormDistance\n  }\n\n  /**\n    * Skew for a category (dimensions) in the observed distribution (p)\n    * with respect to the desired distribution (q), defined as the logarithmic\n    * ratio of the proportion for the category, dimensions observed in p to the\n    * corresponding desired proportion in q. 
Note that this is asymmetric.\n    *\n    * We assume that the distributions are valid\n    * (i.e., they don't have any negative values).\n    *\n    * There is support for Laplace smoothing for the source distribution\n    * to avoid divide-by-zero errors.\n    *\n    * @param p Observed distribution\n    * @param q Desired distribution\n    * @param dimensions Category for which skew is to be computed\n    * @param alpha Parameter to set amount of Laplace smoothing.\n    *              Defaults to 1.0 (add one smoothing)\n    * @return Skew\n    */\n  def computeSkew(p: Distribution, q: Distribution, dimensions: DimensionValues,\n    alpha: Double = 1.0): Double = {\n    val totalCategoryCount = p.zip(q).size\n    val pSum = p.sum + (alpha * totalCategoryCount)\n    val qSum = q.sum + (alpha * totalCategoryCount)\n\n    math.log(p.getValue(dimensions) + alpha) - math.log(pSum) + math.log(qSum) -\n      math.log(q.getValue(dimensions) + alpha)\n  }\n\n  /**\n    * Minimum skew over all categories in the observed distribution (p) with\n    * respect to the desired distribution (q), defined as the minimum over all\n    * categories of the logarithmic ratio of the proportion for a category\n    * observed in p to the corresponding desired proportion in q. 
Note that\n    * this is asymmetric.\n    *\n    * We assume that the distributions are valid\n    * (i.e., they don't have any negative values).\n    *\n    * There is support for Laplace smoothing for the source distribution\n    * to avoid divide-by-zero errors.\n    *\n    * @param p Observed distribution\n    * @param q Desired distribution\n    * @param alpha Parameter to set amount of Laplace smoothing.\n    *              Defaults to 1.0 (add one smoothing)\n    * @return (Category, skew) corresponding to the minimum skew\n    */\n  def computeMinSkew(p: Distribution, q: Distribution,\n    alpha: Double = 1.0): (DimensionValues, Double) = {\n    val probRatios = p.zip(q).map { case (dimensions, pVal, qVal) =>\n      (dimensions, (pVal + alpha) / (qVal + alpha))\n    }\n\n    val pSum = p.sum + (alpha * probRatios.size)\n    val qSum = q.sum + (alpha * probRatios.size)\n\n    val (minDimensions, minProbRatios) = probRatios.minBy(_._2)\n    (minDimensions, math.log(minProbRatios) + math.log(qSum / pSum))\n  }\n\n  /**\n    * Maximum skew over all categories in the observed distribution (p) with\n    * respect to the desired distribution (q), defined as the maximum over all\n    * categories of the logarithmic ratio of the proportion for a category\n    * observed in p to the corresponding desired proportion in q. 
Note that\n    * this is asymmetric.\n    *\n    * We assume that the distributions are valid\n    * (i.e., they don't have any negative values).\n    *\n    * There is support for Laplace smoothing for the source distribution\n    * to avoid divide-by-zero errors.\n    *\n    * @param p Observed distribution\n    * @param q Desired distribution\n    * @param alpha Parameter to set amount of Laplace smoothing.\n    *              Defaults to 1.0 (add one smoothing)\n    * @return (Category, skew) corresponding to the maximum skew\n    */\n  def computeMaxSkew(p: Distribution, q: Distribution,\n    alpha: Double = 1.0): (DimensionValues, Double) = {\n    val probRatios = p.zip(q).map { case (dimensions, pVal, qVal) =>\n      (dimensions, (pVal + alpha) / (qVal + alpha))\n    }\n\n    val pSum = p.sum + (alpha * probRatios.size)\n    val qSum = q.sum + (alpha * probRatios.size)\n\n    val (maxDimensions, maxProbRatios) = probRatios.maxBy(_._2)\n    (maxDimensions, math.log(maxProbRatios) + math.log(qSum / pSum))\n  }\n\n  /**\n    * Compute skew for all categories, where the skew for a category\n    * (dimensions) in the observed distribution (p) with respect to the\n    * desired distribution (q) is defined as the logarithmic ratio of the\n    * proportion for the category, dimensions observed in p to the\n    * corresponding desired proportion in q. 
Note that this is asymmetric.\n    *\n    * We assume that the distributions are valid\n    * (i.e., they don't have any negative values).\n    *\n    * There is support for Laplace smoothing for the source distribution\n    * to avoid divide-by-zero errors.\n    *\n    * @param p Observed distribution\n    * @param q Desired distribution\n    * @param alpha Parameter to set amount of Laplace smoothing.\n    *              Defaults to 1.0 (add one smoothing)\n    * @return A map of (category, skew) tuples\n    */\n  def computeAllSkews(p: Distribution, q: Distribution,\n    alpha: Double = 1.0): Map[DimensionValues, Double] = {\n    val pzipq = p.zip(q)\n    val totalCategoryCount = pzipq.size\n    val pSum = p.sum + (alpha * totalCategoryCount)\n    val qSum = q.sum + (alpha * totalCategoryCount)\n    val logSumDiff = math.log(pSum) - math.log(qSum)\n\n    pzipq.map { case (dimensions, pVal, qVal) =>\n      val skew = math.log(pVal + alpha) - math.log(qVal + alpha) - logSumDiff\n\n      (dimensions, skew)\n    }.toMap\n  }\n\n  /**\n    * Compute a distribution of (protectedAttributeValue, label, prediction)\n    * counts that works both for cases when prediction is {0.0, 1.0}, and when\n    * it is a probability P(y=1) in [0.0, 1.0]. The labels are assumed to be\n    * binary. The working is straightforward in the former case.\n    * In the latter, we compute the expected number of FPs, TPs, FNs and TNs.\n    * E[FPs] = E[C(label = 0, prediction = 1)]. Hence doing this for all\n    * protected attribute values gets us the expected counts as desired. 
The logic\n    * is similar to that used for computing the generalized confusion matrix.\n    *\n    * This method is typically to be used when there is no notion of a threshold\n    * for the classifier, ie., the model's scores are directly being used for\n    * things like ranking, but the model is actually a binary classifier.\n    *\n    * @param df The input DataFrame\n    * @param labelField Label field name\n    * @param scoreField Score field name\n    * @param protectedAttributeField Protected attribute field name\n    * @return A generalized count distribution\n    */\n  def computeGeneralizedPredictionCountDistribution(df: DataFrame,\n    labelField: String, scoreField: String,\n    protectedAttributeField: String): Distribution = {\n    // E[number of positive predictions] = sum(prob that ith example is positive * 1.0)\n    // Score of ith example is the probability it is positive (We assume that\n    // the scores are probabilities. Raw scores can be passed through a sigmoid\n    // to convert them into probabilities)\n    val entries = df.select(protectedAttributeField, labelField, scoreField)\n      .groupBy(protectedAttributeField, labelField)\n      .agg(sum(col(scoreField)), sum(lit(1.0) - col(scoreField)))\n      .collect\n      .flatMap { row =>\n        val rowSeq = row.toSeq.map { Option(_).fold(\"\") { _.toString } }\n        val protectedAttr = rowSeq.head\n        val label = rowSeq(1)\n\n        // score1 is the E[C(positive predictions | label, protectedAttr)]\n        // score0 is the E[C(negative predictions | label, protectedAttr)]\n        // They always sum to C(label, protectedAttr)\n        val score1 = rowSeq(2).toDouble\n        val score0 = rowSeq(3).toDouble\n\n        val dimVals0: DimensionValues = Map(\n          protectedAttributeField -> protectedAttr,\n          labelField -> label,\n          scoreField -> \"0.0\")\n        val dimVals1: DimensionValues = Map(\n          protectedAttributeField -> protectedAttr,\n        
  labelField -> label,\n          scoreField -> \"1.0\")\n        Seq((dimVals0, score0), (dimVals1, score1))\n      }.toMap\n\n    Distribution(entries)\n  }\n\n  /**\n    * Computes Demographic Parity deviations for all combinations of the protected\n    * attribute values. Demographic Parity is defined as\n    * P(Y=1|G=g1) = P(Y=1|G=g2) for all g1, g2 in G (the protected attribute).\n    * The variable Y is the label in the case of training data, and is the\n    * prediction in the case of scored outputs.\n    *\n    * Note that aiming to achieve Demographic Parity is not necessarily an ideal\n    * solution, since it only requires that the positive label rates are equal,\n    * and does not look into more meaningful values like true and false positive\n    * rates. Nevertheless, given two models with similar performance, the one\n    * with lower DP is generally more desirable.\n    *\n    * This metric is ideally suited for binary classifier problems.\n    *\n    * @param p The input distribution of (label/prediction, protectedAttribute) counts\n    * @param labelField The label/prediction field name\n    * @param protectedAttributeField The protected attribute field name\n    * @return A list of Demographic Parity deviations for all combinations of the\n    *         protected attribute values.\n    */\n  def computeDemographicParity(p: Distribution, labelField: String,\n    protectedAttributeField: String): FairnessResult = {\n    // Find out if the labels are 1/0 or 1.0/0.0\n    val labelVals = p.entries.map { case (dimVals, _) => dimVals(labelField) }.toSet\n    val labelValueOne =\n      if (labelVals.contains(\"1\")) {\n        \"1\"\n      } else {\n        \"1.0\"\n      }\n\n    // Compute P(Y=1 | G=g) for all g in G\n    val protectedAttributeDistr = p.computeMarginal(Set(protectedAttributeField))\n    val positiveLabelRates = protectedAttributeDistr.entries.map {\n      case (dimVals, protectedAttrCount) =>\n        val labelProtectedAttrCount =\n  
        p.getValue(dimVals ++ Map(labelField -> labelValueOne))\n      (dimVals.values.mkString(\",\"),\n        StatsUtils.roundDouble(labelProtectedAttrCount / protectedAttrCount))\n    }\n\n    // Compute all pairs {g1, g2} from G\n    val allDimValPairs = positiveLabelRates.keys\n      .toSet\n      .subsets(2)\n\n    // Compute differences for all the pairs\n    val constituentVals = allDimValPairs.map { dimValPair =>\n      val dimVal1 = dimValPair.head\n      val dimVal2 = dimValPair.last\n      (Map(protectedAttributeField + \"1\" -> dimVal1,\n        protectedAttributeField + \"2\" -> dimVal2),\n        StatsUtils.roundDouble(\n          math.abs(positiveLabelRates(dimVal1) - positiveLabelRates(dimVal2))))\n    }.toMap\n\n    FairnessResult(\n      resultType = \"DEMOGRAPHIC_PARITY\",\n      resultValOpt = None,\n      constituentVals = constituentVals,\n      additionalStats = positiveLabelRates)\n  }\n\n  /**\n    * Computes Equalized Odds deviations for all combinations of the protected\n    * attribute values. Equalized Odds is defined as\n    * P(Y_hat=1|Y=y,G=g1) = P(Y_hat=1|Y=y,G=g2) for y in {0, 1} (label) and\n    * for all g1, g2 in G (the protected attribute).\n    * The variable Y_hat is the predicted value.\n    *\n    * Note that aiming to achieve perfect Equalized Odds is not always possible,\n    * especially if the application at hand has requirements such as ensuring\n    * high precision across all groups. In such scenarios, it is possible only in\n    * trivial cases of perfect classifiers or equal prevalence rates amongst\n    * various protected groups. That is, all |gC2|*|y| equations might not\n    * all be simultaneously satisfiable. This is due to the impossibility results\n    * that link FPR, TPR (recall), precision and prevalence rates. 
Equalized Odds\n    * attempts to ensure that FPRs are equal (y=0) and TPRs are equal (y=1).\n    * Thus, this will come at the cost of precision of the model when prevalence\n    * rates are unequal.\n    *\n    * Nevertheless, obtaining these deviations is helpful to understand model\n    * biases upfront.\n    *\n    * This metric is ideally suited for binary classifier problems.\n    *\n    * @param p The input distribution of (label, prediction, protectedAttribute) counts\n    * @param labelField The label field name\n    * @param predictionField The prediction field name\n    * @param protectedAttributeField The protected attribute field name\n    * @return A list of EO deviations for all combinations of the\n    *         protected attribute values.\n    */\n  def computeEqualizedOdds(p: Distribution, labelField: String,\n    predictionField: String, protectedAttributeField: String): FairnessResult = {\n    // Find out if the predictions are 1/0 or 1.0/0.0\n    val predictionVals = p.entries.map { case (dimVals, _) => dimVals(predictionField) }.toSet\n    val predictionValueOne =\n      if (predictionVals.contains(\"1\")) {\n        \"1\"\n      } else {\n        \"1.0\"\n      }\n\n    // Compute P(Y_hat=1 | Y=y, G=g) for all y in Y and g in G\n    val labelProtectedAttributeDistr =\n      p.computeMarginal(Set(labelField, protectedAttributeField))\n    val trueFalsePositiveRates = labelProtectedAttributeDistr.entries.map {\n      case (dimVals, labelProtectedAttrCount) =>\n        val predictionLabelProtectedAttrCount =\n          p.getValue(dimVals ++ Map(predictionField -> predictionValueOne))\n        (dimVals,\n          StatsUtils.roundDouble(predictionLabelProtectedAttrCount / labelProtectedAttrCount))\n    }\n\n    // Group by Y, so that we don't compare TPRs and FPRs with each other\n    val constituentVals = trueFalsePositiveRates.groupBy { case (dimVals, _) =>\n      dimVals(labelField)\n    }.flatMap { case (label, positiveRatesForLabel) =>\n      
// Compute all pairs {g1, g2} from G\n      val allDimValPairs = positiveRatesForLabel.keys\n        .toSet\n        .subsets(2)\n\n      // Compute differences for all the pairs\n      allDimValPairs.map { dimValPair =>\n        val dimVal1 = dimValPair.head\n        val dimVal2 = dimValPair.last\n        (Map(protectedAttributeField + \"1\" -> dimVal1(protectedAttributeField),\n          protectedAttributeField + \"2\" -> dimVal2(protectedAttributeField),\n          labelField -> label), StatsUtils.roundDouble(\n          math.abs(positiveRatesForLabel(dimVal1) - positiveRatesForLabel(dimVal2))))\n      }\n    }\n\n    val additionalStats = trueFalsePositiveRates.map { case (dimVals, positiveRate) =>\n      (dimVals.values.mkString(\",\"), positiveRate)\n    }\n\n    FairnessResult(\n      resultType = \"EQUALIZED_ODDS\",\n      resultValOpt = None,\n      constituentVals = constituentVals,\n      additionalStats = additionalStats)\n  }\n\n  /**\n    * Computes a list of distance/divergence related fairness metrics over\n    * (protectedAttributeField, labelField/scoreField).\n    *\n    * @param distanceMetrics The set of metrics to compute\n    * @param distribution The input distribution to compute the metrics for.\n    *                     This is a distribution over (protectedAttribute, labelField)\n    * @param referenceDistrOpt An optional reference distribution, for metrics that\n    *                          compare the input distribution against another distribution\n    * @param labelField The label field. 
This could be the score field, in\n    *                   case one wants to compute the statistics on the\n    *                   (protectedAttribute, scoreField) distribution instead.\n    * @param protectedAttributeField The protected attribute field\n    * @return A sequence of FairnessResults containing distance/divergence metrics\n    */\n  def computeDatasetDistanceMetrics(distanceMetrics: Seq[String],\n    distribution: Distribution,\n    referenceDistrOpt: Option[Distribution], labelField: String,\n    protectedAttributeField: String): Seq[FairnessResult] = {\n    distanceMetrics.flatMap {\n      case \"SKEWS\" =>\n        referenceDistrOpt.map { referenceDistr =>\n          val allSkews = computeAllSkews(distribution, referenceDistr)\n          FairnessResult(\n            resultType = \"SKEWS\",\n            resultValOpt = None,\n            parameters = referenceDistr.toString,\n            constituentVals = allSkews)\n        }\n      case \"INF_NORM_DIST\" =>\n        referenceDistrOpt.map { referenceDistr =>\n          val infNormDist =\n            computeInfinityNormDistance(distribution, referenceDistr)\n          FairnessResult(\n            resultType = \"INF_NORM_DIST\",\n            resultValOpt = Some(infNormDist),\n            parameters = referenceDistr.toString,\n            constituentVals = Map())\n        }\n      case \"TOTAL_VAR_DIST\" =>\n        referenceDistrOpt.map { referenceDistr =>\n          val totalVarDist =\n            computeTotalVariationDistance(distribution, referenceDistr)\n          FairnessResult(\n            resultType = \"TOTAL_VAR_DIST\",\n            resultValOpt = Some(totalVarDist),\n            parameters = referenceDistr.toString,\n            constituentVals = Map())\n        }\n      case \"JS_DIVERGENCE\" =>\n        referenceDistrOpt.map { referenceDistr =>\n          val JSDivergence =\n            computeJensenShannonDivergence(distribution, referenceDistr)\n          FairnessResult(\n            
resultType = \"JS_DIVERGENCE\",\n            resultValOpt = Some(JSDivergence),\n            parameters = referenceDistr.toString,\n            constituentVals = Map())\n        }\n      case \"KL_DIVERGENCE\" =>\n        referenceDistrOpt.map { referenceDistr =>\n          val KLDivergence =\n            computeKullbackLeiblerDivergence(distribution, referenceDistr)\n          FairnessResult(\n            resultType = \"KL_DIVERGENCE\",\n            resultValOpt = Some(KLDivergence),\n            parameters = referenceDistr.toString,\n            constituentVals = Map())\n        }\n      case \"DEMOGRAPHIC_PARITY\" =>\n        Some(computeDemographicParity(distribution,\n          labelField, protectedAttributeField))\n      case _ => None\n    }\n  }\n\n  /**\n    * Computes a list of distance/divergence related fairness metrics over\n    * (protectedAttributeField, labelField, scoreField). If the scoreField is\n    * missing, it assumes that dataset metrics are being computed, and calls\n    * computeDatasetDistanceMetrics with the appropriate parameters.\n    *\n    * @param distanceMetrics The set of metrics to compute\n    * @param distribution The input distribution to compute the metrics for. This is\n    *                     a distribution over (protectedAttribute, label, score)\n    * @param referenceDistrOpt An optional reference distribution over\n    *                          (protectedAttributeField, scoreField)\n    * @param labelField The label field\n    * @param scoreField The score field. 
If empty, computes dataset-only metrics\n    * @param protectedAttributeField The protected attribute field\n    * @return A sequence of FairnessResults containing distance/divergence metrics\n    */\n  def computeDistanceMetrics(distanceMetrics: Seq[String], distribution: Distribution,\n    referenceDistrOpt: Option[Distribution], labelField: String,\n    scoreField: String, protectedAttributeField: String): Seq[FairnessResult] = {\n    if (scoreField.isEmpty) {\n      computeDatasetDistanceMetrics(distanceMetrics, distribution,\n        referenceDistrOpt, labelField, protectedAttributeField)\n    } else {\n      // Metrics that need only the score and protectedAttribute\n      val scoreProtectedAttrDistr = distribution.computeMarginal(Set(scoreField,\n        protectedAttributeField))\n      val computedMetrics = computeDatasetDistanceMetrics(distanceMetrics,\n        scoreProtectedAttrDistr, referenceDistrOpt, scoreField, protectedAttributeField)\n\n      // Metrics that need both score and label, and the protectedAttribute\n      val additionalOnes = distanceMetrics.flatMap {\n        case \"EQUALIZED_ODDS\" =>\n          Some(computeEqualizedOdds(distribution,\n            labelField, scoreField, protectedAttributeField))\n        case _ => None\n      }\n      computedMetrics ++ additionalOnes\n    }\n  }\n}\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/lib/PermutationTestUtils.scala",
    "content": "package com.linkedin.lift.lib\n\nimport com.linkedin.lift.types.{FairnessResult, ModelPrediction}\n\nimport scala.util.Random\n\n/**\n  * Utilities to perform statistical tests\n  */\nobject PermutationTestUtils {\n\n  /**\n    * Generates a bootstrap sample: Given a sequence of size N, generates a new\n    * sequence of the same size that has been obtained by sampling the input\n    * sequence (with replacement)\n    *\n    * @param predictions The input sequence\n    * @return The bootstrap sample\n    */\n  private[lift] def generateBootstrapSample(\n    predictions: Seq[ModelPrediction]): Seq[ModelPrediction] = {\n    val n = predictions.length\n    (0 until n).map { _ =>\n      predictions(Random.nextInt(n))\n    }\n  }\n\n  /**\n    * Estimates the standard deviation of a statistic (computed on a given sample).\n    * It achieves this by computing the statistic on multiple bootstrap samples of\n    * the input, and computes the standard deviation of the resulting\n    * distribution (of the statistic). 
In effect, this simulates the act of\n    * picking multiple samples (of the same size) from the original population\n    * and computing the same statistic for each of these.\n    *\n    * @param predictions The input sample to operate on\n    * @param bootstrapFn The statistic to compute on the bootstrap sample\n    * @param numTrials The number of trials to run the bootstrap sampling for.\n    *                  More trials produce a better estimate of the distribution.\n    * @return An estimate of the standard deviation of the statistic\n    */\n  private[lift] def computeBootstrapStdDev(predictions: Seq[ModelPrediction],\n    bootstrapFn: Seq[ModelPrediction] => Double, numTrials: Int): Double = {\n    val bootstrapDifferences = (0 until numTrials).map { _ =>\n      val sampledPredictions = generateBootstrapSample(predictions)\n      bootstrapFn(sampledPredictions)\n    }\n    StatsUtils.computeStdDev(bootstrapDifferences)\n  }\n\n  /**\n    * Computes the difference (in the same metric) between two different groups.\n    * This is the test statistic for permutation testing.\n    *\n    * @param dim1 The first group\n    * @param dim2 The second group\n    * @param fn The metric/statistic to compute on each group\n    * @param predictions The sample for which the difference is to be computed\n    * @return The value of the input fn evaluated for dim1 and dim2, and their difference\n    */\n  private[lift] def permutationFn(dim1: String, dim2: String, fn: Seq[ModelPrediction] => Double)\n    (predictions:Seq[ModelPrediction]): (Double, Double, Double) = {\n    val predictionsDim1 =\n      predictions.filter(_.dimensionValue == dim1)\n    val predictionsDim2 =\n      predictions.filter(_.dimensionValue == dim2)\n    val value1 = fn(predictionsDim1)\n    val value2 = fn(predictionsDim2)\n    (value1, value2, value1 - value2)\n  }\n\n  /**\n    * Implementation of the permutation testing methodology described in:\n    * \"Cyrus DiCiccio, Sriram Vasudevan, Kinjal 
Basu, Krishnaram Kenthapadi, Deepak Agarwal. 2020.\n    * Evaluating Fairness Using Permutation Tests. To Appear in Proceedings of the 26th ACM SIGKDD\n    * International Conference on Knowledge Discovery & Data Mining (KDD '20).\n    * Association for Computing Machinery, New York, NY, USA.\"\n    *\n    * Perform a two-sided permutation test (for a given function) to assess if\n    * the difference between two groups is statistically significant.\n    *\n    * The null hypothesis is that there is no difference between these groups.\n    * If this is the case, then randomly shuffling the samples around should\n    * have no impact on the difference between the two groups. To generate the\n    * distribution of data under the null hypotheses, we need to compute the\n    * difference between all possible permutations of the data split into\n    * two groups. To approximate this, we randomly shuffle the data N times,\n    * splitting it in the ratio of the two groups.\n    *\n    * We then compute the p-value, the probability (under the null hypothesis)\n    * of observing a result as extreme as (or more extreme than) the result\n    * we observed.\n    *\n    * Our sequence of extremeDiffs can be viewed as a biased coin with a bias\n    * equal to the p-value. This can then be looked at as a binomially distributed\n    * observation. We can then estimate its standard error as sqrt(p * (1-p) / n)\n    * (Refer: https://en.wikipedia.org/wiki/Margin_of_error#Calculations_assuming_random_sampling,\n    * https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval#Normal_approximation_interval)\n    *\n    * To decide if the difference is meaningful, one can think of the following:\n    * 1. Is the observed difference large enough to matter?\n    * 2. Is it statistically significant (wrt some significance level)\n    * 3. 
What is the confidence interval for the estimated p-value?\n    * (It is [p - z * std_err, p + z * std_err] where z is the critical value of\n    * a standard normal distribution corresponding to the (1 - alpha/2) quantile,\n    * and alpha is the target error = 1 - confidence_interval_percentage. Some\n    * useful values: 68.3% is about 1 std error, 95.4% is about 2, and 99.7% is about 3)\n    *\n    * @param predictions The sequence of model predictions to operate on\n    * @param dimType The dimension type, such as gender or age.\n    * @param dim1 The first dimension value / group\n    * @param dim2 The second dimension value / group\n    * @param metric The metric to evaluate using the permutation test\n    * @param numTrials Number of trials to perform (for both permutation\n    *                  testing and bootstrap estimate of std dev)\n    * @param seed Random seed. If not provided (or set to 0), uses a random seed.\n    * @return A FairnessResult containing the results\n    */\n  def permutationTest(predictions: Seq[ModelPrediction], dimType: String,\n    dim1: String, dim2: String, metric: String, numTrials: Int,\n    seed: Long = 0): FairnessResult = {\n    // Consider samples for only the two dimensions since simulations show\n    // that testing with only these samples results in more statistical power.\n    val predictionsDim12 = predictions.filter { prediction =>\n      (prediction.dimensionValue == dim1) ||\n        (prediction.dimensionValue == dim2)\n    }\n    val bucket1Length = predictionsDim12.count(_.dimensionValue == dim1)\n\n    // Set seed if provided and non-zero\n    if (seed != 0) {\n      Random.setSeed(seed)\n    }\n\n    // Obtain permutation functions\n    val fn = StatsUtils.getMetricFn(metric)\n    val permutationTestFn: Seq[ModelPrediction] => (Double, Double, Double) =\n      permutationFn(dim1, dim2, fn)\n\n    // Compute the observed difference and studentize it\n    val (value1, value2, 
observedDifference) = permutationTestFn(predictionsDim12)\n    val bootstrapStdDev = computeBootstrapStdDev(predictionsDim12,\n      permutationTestFn(_)._3, numTrials)\n    val studentizedObservedDifference = observedDifference / bootstrapStdDev\n\n    // Compute differences for n random trials and studentize it\n    val differenceHist: Seq[Double] =\n      (0 until numTrials).map { _ =>\n        val (shuffledBucket1, shuffledBucket2) =\n          Random.shuffle(predictionsDim12).splitAt(bucket1Length)\n        fn(shuffledBucket1) - fn(shuffledBucket2)\n      }\n    val differenceHistStdDev = StatsUtils.computeStdDev(differenceHist)\n    val studentizedDiffHist = differenceHist.map(_ / differenceHistStdDev)\n      .filterNot(_.isNaN)\n\n    // Compute p-value for the two-sided test\n    val extremeDiffs = studentizedDiffHist.map(math.abs)\n      .map(_ > math.abs(studentizedObservedDifference))\n      .map(_.compare(false))\n    val pVal = extremeDiffs.sum / numTrials.asInstanceOf[Double]\n    val stdError = math.sqrt(pVal * (1 - pVal) / numTrials)\n\n    // Build FairnessResults\n    val metricMap = Map(\n      \"metric\" -> metric,\n      \"numTrials\" -> numTrials.toString,\n      \"seed\" -> seed.toString)\n    FairnessResult(\n      resultType = \"PERMUTATION_TEST\",\n      resultValOpt = Some(StatsUtils.roundDouble(observedDifference)),\n      parameters = metricMap.toString,\n      constituentVals = Map(\n        Map(dimType -> dim1) -> StatsUtils.roundDouble(value1),\n        Map(dimType -> dim2) -> StatsUtils.roundDouble(value2)),\n      additionalStats = Map(\n        \"pValue\" -> StatsUtils.roundDouble(pVal),\n        \"stdError\" -> StatsUtils.roundDouble(stdError),\n        \"bootstrapStdDev\" -> bootstrapStdDev,\n        \"testStatisticStdDev\" -> differenceHistStdDev))\n  }\n}\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/lib/PositionBiasUtils.scala",
    "content": "package com.linkedin.lift.lib\n\nimport com.linkedin.lift.types.ScoreWithLabelAndPosition\nimport org.apache.spark.mllib.stat.KernelDensity\nimport org.apache.spark.sql.expressions.Window\nimport org.apache.spark.sql.functions.{broadcast, count, lit, min, pow, rand, stddev_pop}\nimport org.apache.spark.sql.{DataFrame, Dataset}\n\n/**\n  * Utilities for estimating position bias and removing the effect of position bias for learning\n  * an equality of opportunity transformation\n  */\nobject PositionBiasUtils {\n  case class PositionBias(position: Int, positionBias: Double)\n  /**\n    * Bandwidth computation based on Silverman's rule for kernel density estimation\n    *\n    * @param df a dataframe containing i.i.d. samples as \"value\"\n    * @return bandwidth\n    */\n  def getBandwidth(df: DataFrame): Double = {\n    import df.sparkSession.implicits._\n    df.agg(count($\"value\").as(\"numSamples\"), stddev_pop($\"value\").as(\"stdDev\"))\n      .select(lit(1.06) * $\"stdDev\" * pow($\"numSamples\", lit(-0.2))).head.getDouble(0)\n  }\n\n  /**\n    * Kernel density estimation with a Gaussian kernel for scores corresponding to a given position\n    *\n    * @param data     dataset containing score, binary label/response, position\n    * @param position position value for filtering\n    * @return a probability density function\n    */\n  def getDensity(data: Dataset[ScoreWithLabelAndPosition], position: Int): KernelDensity = {\n    import data.sparkSession.implicits._\n    val scores = data.filter(_.position == position).map(_.score)\n    val bw = getBandwidth(scores.toDF(\"value\"))\n    val density = new KernelDensity()\n      .setSample(scores.rdd)\n      .setBandwidth(bw)\n    density\n  }\n\n  /**\n    * Estimating the position bias at \"targetPosition\" with respect to \"basePosition\",\n    * where the position bias is defined as the decay in the number of positive examples\n    * from basePosition to targetPosition when similar items are 
served at each position.\n    * Typically, (the quality of) items differ from one position to another in the observational data.\n    * We correct for this discrepancy by matching the score distribution at the target position with\n    * the score distribution at the base position via importance sampling.\n    *\n    * @param data                dataset containing score, binary label {0, 1}, position\n    * @param maxImportanceWeight to control the variance of an importance sampling estimator\n    * @param targetPosition      target position for position bias estimation\n    * @param basePosition        base position for position bias estimation\n    * @return estimated position bias\n    */\n  def estimateAdjacentPositionBias(data: Dataset[ScoreWithLabelAndPosition],\n    maxImportanceWeight: Double, targetPosition: Int, basePosition: Int): Double = {\n    import data.sparkSession.implicits._\n\n    // estimating the density of the scores at targetPosition\n    val kdTargetPosition = getDensity(data, targetPosition)\n\n    // estimating the density of the scores at basePosition\n    val kdPreviousPosition = getDensity(data, basePosition)\n\n    // extracting scores corresponding to positive labels at targetPosition\n    val targetPositionPositiveScoresArray = data\n      .filter(r => r.label == 1 && r.position == targetPosition)\n      .map(r => r.score).collect\n\n    // importance-weighted count of positive labels at targetPosition; each weight is the\n    // density ratio base/target at the score, capped at maxImportanceWeight\n    val totalWeight = kdPreviousPosition.estimate(targetPositionPositiveScoresArray)\n      .zip(kdTargetPosition.estimate(targetPositionPositiveScoresArray))\n      .map(x => Math.min(x._1 / x._2, maxImportanceWeight)).sum\n\n    // counting the positive labels at basePosition\n    val basePositionPositiveScoresCount = data\n      .filter(r => r.label == 1 && r.position == basePosition)\n      .count\n\n    totalWeight / basePositionPositiveScoresCount\n  }\n\n  /**\n    * Estimating the 
position bias with respect to the top most position\n    * based on cumulative multiplications of estimated adjacent position biases.\n    * The adjacent position bias at a target position is estimated as the decay in the number of positive examples\n    * from the previous position to the target position after adjusting for\n    * the discrepancy in the score distributions with importance sampling (see above).\n    *\n    * @param data                         dataset containing score, binary label {0, 1}, position\n    * @param maxImportanceWeight          to control the variance of an importance sampling estimator\n    *                                     for the adjacent position bias estimations\n    * @param positionBiasEstimationCutOff all adjacent position biases will be set to one beyond this cutoff\n    * @return position bias estimates for all positions\n    */\n  def estimatePositionBias(data: Dataset[ScoreWithLabelAndPosition],\n    maxImportanceWeight: Double, positionBiasEstimationCutOff: Int): Seq[PositionBias] = {\n    import data.sparkSession.implicits._\n    val positions = data.map(_.position).distinct.collect.toSeq.sorted\n    val adjacentPositionBias = (1 until math.min(positions.size, positionBiasEstimationCutOff)).map(i =>\n      estimateAdjacentPositionBias(\n        data.filter($\"position\" === positions(i) or $\"position\" === positions(i - 1)),\n        maxImportanceWeight, positions(i), positions(i - 1)))\n    var estimate = Seq(PositionBias(positions.head, 1.0))\n    for (i <- 1 until positions.size) {\n      if (i < positionBiasEstimationCutOff) {\n        // note that adjacentPositionBias(i-1) = adjacent position bias at position(i)\n        estimate :+= PositionBias(positions(i), estimate.last.positionBias * adjacentPositionBias(i - 1))\n      } else {\n        estimate :+= PositionBias(positions(i), estimate.last.positionBias)\n      }\n    }\n    estimate\n  }\n\n  /**\n    * Resampling data with weights corresponds to the 
inverse position bias for removing the effect of position bias\n    * from the data with positive labels.\n    * We first compute normalized weights from the position bias estimates.\n    * Note that the normalized weights are in [0, 1].\n    * This allows us to get a weighted sample by down-sampling with the normalized weights as the down-sampling rate.\n    * We repeat the down-sampling with multiple copies of the original sample to improve\n    * the accuracy of the final weighted sample.\n    *\n    * @param data                         dataset containing score, binary label {0, 1}, position\n    * @param maxImportanceWeight          see estimatePositionBias()\n    * @param positionBiasEstimationCutOff see estimatePositionBias()\n    * @param repeatTimes                  number of times the dataset is repeated for resampling,\n    *                                     a larger number would lead to a more computationally expensive but\n    *                                     more accurate debiasing\n    * @param inflationRate                the maximum allowed ratio of the number of data points with positive labels\n    *                                     after and before debiasing\n    * @param numPartitions                the number of partitions for repartitioning the data with positive labels\n    * @param seed                         random seed, for reproducibility\n    * @return debiased dataset with positive labels\n    */\n  def debiasPositiveLabelScores(data: Dataset[ScoreWithLabelAndPosition],\n    maxImportanceWeight: Double = 1000, positionBiasEstimationCutOff: Int, repeatTimes: Int = 1000,\n    inflationRate: Double = 10, numPartitions: Int = 1000, seed: Long = scala.util.Random.nextLong()):\n  Dataset[ScoreWithLabelAndPosition] = {\n    import data.sparkSession.implicits._\n\n    if (positionBiasEstimationCutOff < 1) {\n      return data.filter(_.label == 1)\n    }\n\n    val positionBiasEstimates = estimatePositionBias(data, 
maxImportanceWeight, positionBiasEstimationCutOff)\n      .toDF\n\n    val dataWithWeight = data\n      .filter(_.label == 1)\n      .repartition(numPartitions, $\"position\")\n      .join(broadcast(positionBiasEstimates), Seq(\"position\"), \"left_outer\")\n      .withColumn(\"minPositionBias\", min($\"positionBias\").over(Window.partitionBy(lit(1))))\n      .withColumn(\"weight\", $\"minPositionBias\" / $\"positionBias\")\n\n    val repeatedSamples = Seq.range(0, repeatTimes)\n      .map(i => dataWithWeight.filter(rand(seed * i) <= $\"weight\").drop(\"weight\"))\n      .reduceOption(_ union _).getOrElse(throw new RuntimeException(\"Cannot create union data\"))\n      .as[ScoreWithLabelAndPosition]\n\n    repeatedSamples.persist()\n\n    val downSampleRate = inflationRate * data.count().toFloat / repeatedSamples.count()\n    if (downSampleRate < 1) {\n      repeatedSamples.sample(downSampleRate)\n    } else {\n      repeatedSamples\n    }\n  }\n}\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/lib/StatsUtils.scala",
    "content": "package com.linkedin.lift.lib\n\nimport com.linkedin.lift.types.{CustomMetric, ModelPrediction}\nimport org.apache.spark.sql.DataFrame\nimport org.apache.spark.sql.functions.col\n\n/**\n  * Utilities to perform statistical tests\n  */\nobject StatsUtils {\n\n  /**\n    * Represents a confusion matrix\n    *\n    * @param truePositive True Positive count\n    * @param falsePositive False Positive count\n    * @param trueNegative True Negative count\n    * @param falseNegative False Negative count\n    */\n  case class ConfusionMatrix(\n    truePositive: Double,\n    falsePositive: Double,\n    trueNegative: Double,\n    falseNegative: Double)\n\n  /**\n    * Round off a double to a certain number of digits of precision\n    * @param d The double to round off\n    * @param numDigits Number of digits of precision required\n    * @return The rounded off double\n    */\n  def roundDouble(d: Double, numDigits: Int = 5): Double = {\n    val multiplier = math.pow(10, numDigits)\n    math.round(d * multiplier) / multiplier\n  }\n\n  /**\n    * Compute the percentage of positive and negative labels to sample, given\n    * the source DataFrames for the positive and negative labels.\n    *\n    * @param posDF The original DataFrame containing only positive labels\n    * @param negDF The original DataFrame containing only negative labels\n    * @param approxRows The approximate number of rows to sample (in total)\n    * @param labelZeroPercentage Percentage of negative labels to be present in\n    *                            the sampled DataFrame. 
If not provided (or if\n    *                            an invalid percentage is given), the sampling\n    *                            ratio is according to that of the source\n    *                            DataFrame.\n    * @return The sampling percentages for the positive and negative labels\n    *         respectively, to be used for stratified sampling.\n    */\n  def computePosNegSamplePercentages(posDF: DataFrame, negDF: DataFrame,\n    approxRows: Long, labelZeroPercentage: Double = -1.0): (Double, Double) = {\n    val posCount = posDF.count.toDouble\n    val negCount = negDF.count.toDouble\n    val totalCount = negCount + posCount\n\n    val (samplePosPercentage, sampleNegPercentage) =\n      if (labelZeroPercentage >= 0.0 && labelZeroPercentage <= 1.0) {\n        ((approxRows * (1.0 - labelZeroPercentage)) / posCount,\n          (approxRows * labelZeroPercentage) / negCount)\n      } else {\n        (approxRows / totalCount,\n          approxRows / totalCount)\n      }\n    val updatedSamplePosPercentage =\n      if (samplePosPercentage > 1.0) 1.0 else samplePosPercentage\n    val updatedSampleNegPercentage =\n      if (sampleNegPercentage > 1.0) 1.0 else sampleNegPercentage\n    (updatedSamplePosPercentage, updatedSampleNegPercentage)\n  }\n\n  /**\n    * Sample an approximate number of entries from a DataFrame (using stratified\n    * sampling), ensuring that it contains a positive:negative label ratio\n    * as specified. If no such input is provided, we attempt to sample according\n    * to the ratio in which positives and negatives are present in the input\n    * DataFrame.\n    *\n    * @param df The DataFrame to operate on\n    * @param labelField The column containing the labels\n    * @param approxRows An approximate number of rows to sample\n    * @param labelZeroPercentage Percentage of negative labels to be present in\n    *                            the sampled DataFrame. 
If not provided (or if\n    *                            an invalid percentage is given), the sampling\n    *                            ratio is according to that of the source\n    *                            DataFrame.\n    * @param seed Random seed. If not provided (or set to 0), uses a random seed.\n    * @return The sampled DataFrame.\n    */\n  def sampleDataFrame(df: DataFrame, labelField: String, approxRows: Long,\n    labelZeroPercentage: Double = -1.0, seed: Long = 0): DataFrame = {\n    val posDF = df.filter(col(labelField) === 1.0)\n    val negDF = df.filter(col(labelField) === 0.0)\n\n    val (samplePosPercentage, sampleNegPercentage) =\n      computePosNegSamplePercentages(posDF, negDF, approxRows, labelZeroPercentage)\n\n    val (samplePosDF, sampleNegDF) =\n      if (seed == 0) {\n        (posDF.sample(samplePosPercentage),\n          negDF.sample(sampleNegPercentage))\n      } else {\n        (posDF.sample(samplePosPercentage, seed),\n          negDF.sample(sampleNegPercentage, seed))\n      }\n\n    samplePosDF.union(sampleNegDF)\n  }\n\n  /**\n    * Sample an approximate number of entries from a DataFrame by selecting all\n    * rows belonging to a group ID, for a bunch of randomly sampled group IDs.\n    * The idea behind this sampling technique is to ensure that per-groupID\n    * metrics have as little noise as possible (eg: a group ID may have only 25\n    * results, if the group ID is the search ID), while cutting down on the\n    * number of groups being analyzed. 
Ranking metrics average their results\n    * across group IDs, so sampling by this should only affect the averaging\n    * process.\n    *\n    * @param df The DataFrame to operate on\n    * @param labelField The column containing the labels\n    * @param scoreField The column containing the scores\n    * @param groupIdField The column containing the group IDs\n    * @param protectedAttributeField The column containing the protected attributes\n    * @param approxRows An approximate number of groupIDs to sample\n    * @param seed Random seed. If not provided (or set to 0), uses a random seed.\n    * @return The sampled DataFrame.\n    */\n  def sampleDataFrameByGroupId(df: DataFrame, labelField: String,\n    scoreField: String, groupIdField: String, protectedAttributeField: String,\n    approxRows: Long, seed: Long = 0): DataFrame = {\n    val modelPredictionDF = ModelPrediction.getModelPredictionDS(df,\n      labelField, scoreField, groupIdField, protectedAttributeField)\n      .toDF\n    val groupIdDF = modelPredictionDF.select(\"groupId\").distinct\n    val samplePercentage = math.min(1.0, approxRows.toDouble / groupIdDF.count)\n    val sampledGroupIdDF =\n      if (seed == 0) {\n        groupIdDF.sample(samplePercentage)\n      } else {\n        groupIdDF.sample(samplePercentage, seed)\n      }\n    modelPredictionDF.join(sampledGroupIdDF, \"groupId\")\n      .select(col(\"label\").as(labelField),\n        col(\"prediction\").as(scoreField),\n        col(\"groupId\").as(groupIdField),\n        col(\"dimensionValue\").as(protectedAttributeField))\n  }\n\n  /**\n    * Computes Precision@K at a given threshold. Data points within the top K whose labels\n    * are at or above this threshold are true positives and others are false positives.\n    * For example, if job views are labeled 1 and job applies are labeled 2,\n    * using a threshold of 2 computes P@K for job applies\n    * while 1 computes it for job views (includes job applies). 
Note that threshold\n    * indicates whether a result is 'relevant' or not (TP or FP), while K indicates\n    * the position up to which results are to be looked at.\n    *\n    * @param threshold Threshold above which to mark as true positive\n    * @param k The value of k for P@K\n    * @param data The data to compute this over\n    * @return The Precision@K value for the input data\n    */\n  def computePrecisionAtK(threshold: Double, k: Int)\n    (data: Seq[ModelPrediction]): Double = {\n    def singleQueryPrecisionAtK(predicted: Seq[ModelPrediction]):\n      Double = {\n      // Consider only predictions with ranks [1, k]\n      val predictedWithinK = predicted\n        .filter(_.rank <= k)\n\n      if (predictedWithinK.isEmpty) {\n        0.0\n      } else {\n        // Ranking metrics are computed by looking at the predicted ordering\n        // of labels rather than the predictions/scores themselves\n        val numRelevant = predictedWithinK\n          .count(_.label >= threshold)\n          .toDouble\n\n        numRelevant / predictedWithinK.length\n      }\n    }\n\n    val precisions = data.groupBy(_.groupId)\n      .map { case (_, perGroupPredictions) =>\n        singleQueryPrecisionAtK(perGroupPredictions)\n      }\n    precisions.sum / precisions.size\n  }\n\n  /**\n    * Retrieve the metric function corresponding to the requested metric\n    *\n    * @param metric The metric of interest\n    * @return The function that computes the requested metric\n    */\n  def getMetricFn(metric: String): Seq[ModelPrediction] => Double = {\n    if (metric.equals(\"AUC\")) {\n      computeAUC\n    } else if (metric.equals(\"FNR\")) {\n      computeFalseNegativeRate\n    } else if (metric.equals(\"FPR\")) {\n      computeFalsePositiveRate\n    } else if (metric.equals(\"TNR\")) {\n      computeTrueNegativeRate\n    } else if (metric.equals(\"PRECISION\")) {\n      computePrecision\n    } else if (metric.equals(\"RECALL\")) {\n      computeRecall\n    } else if 
(metric.matches(\"PRECISION/\\\\d*\\\\.*\\\\d+@\\\\d+\")) {\n      // The format is PRECISION/threshold@K\n      val parameters = metric.split(\"/\").last\n      val threshold = parameters.split(\"@\").head.toDouble\n      val k = parameters.split(\"@\").last.toInt\n      computePrecisionAtK(threshold, k)\n    } else {\n      Class.forName(metric)\n        .asInstanceOf[Class[CustomMetric]]\n        .newInstance\n        .compute\n    }\n  }\n\n  /**\n    * Compute the standard deviation of a given sample. This is obtained by\n    * taking the square root of the unbiased estimator of the variance, but the\n    * estimate of the standard deviation obtained as a result is biased.\n    * The unbiased estimator of the standard deviation is fairly involved and\n    * isn't worth it, especially when we're dealing with large samples.\n    *\n    * @param vals The values to obtain the standard deviation for\n    * @return The standard deviation\n    */\n  def computeStdDev(vals: Seq[Double]): Double = {\n    val valsWithoutNan = vals.filterNot(_.isNaN)\n    val variance =\n      if (valsWithoutNan.length < 2) {\n          0.0\n      } else {\n        // Compute an unbiased estimate of the variance\n        val n = valsWithoutNan.length\n        val mean = valsWithoutNan.sum / n\n        val sumOfSquareDeviations = valsWithoutNan\n          .map { v => (v - mean) * (v - mean) }\n          .sum\n        sumOfSquareDeviations / (n - 1)\n      }\n    math.sqrt(variance)\n  }\n\n  //////////////////////////////////////////////////////////////////////////////\n  // Metrics to evaluate using the permutation test are defined below.\n  //////////////////////////////////////////////////////////////////////////////\n\n  /**\n    * Computes a generalized confusion matrix for the given model prediction\n    * data. 
The values of this matrix are the sum of the prediction scores.\n    *\n    * If the predicted scores are thresholded (ie., either 0.0 or 1.0\n    * only), the generalized confusion matrix reduces\n    * to the traditional confusion matrix.\n    *\n    * @param data The model prediction details\n    * @return A confusion matrix containing the true positive, false positive,\n    *         true negative and false negative scores/counts.\n    */\n  def computeGeneralizedConfusionMatrix(data: Seq[ModelPrediction]):\n    ConfusionMatrix = {\n    if (data.isEmpty) {\n      ConfusionMatrix(0, 0, 0, 0)\n    } else {\n      data.map { modelPrediction =>\n\n        // The label indicates if it is a positive label (1) or not (0)\n        val tp = modelPrediction.prediction * modelPrediction.label\n        val fp = modelPrediction.prediction * (1.0 - modelPrediction.label)\n        val tn = (1.0 - modelPrediction.prediction) * (1.0 - modelPrediction.label)\n        val fn = (1.0 - modelPrediction.prediction) * modelPrediction.label\n\n        ConfusionMatrix(\n          truePositive = tp,\n          falsePositive = fp,\n          trueNegative = tn,\n          falseNegative = fn)\n      }.reduce { (cm1, cm2) =>\n        // Add up entries\n        ConfusionMatrix(\n          truePositive = cm1.truePositive + cm2.truePositive,\n          falsePositive = cm1.falsePositive + cm2.falsePositive,\n          trueNegative = cm1.trueNegative + cm2.trueNegative,\n          falseNegative = cm1.falseNegative + cm2.falseNegative)\n      }\n    }\n  }\n\n  /**\n    * Computes the precision/Positive Predictive Value for a given set of\n    * model predictions.\n    * Refer: https://en.wikipedia.org/wiki/Confusion_matrix\n    *\n    * @param data Sequence of model predictions.\n    *             We assume that the predictions are thresholded to 0/1.\n    * @return The precision for the given predictions\n    */\n  def computePrecision(data: Seq[ModelPrediction]): Double = {\n    val 
confusionMatrix = computeGeneralizedConfusionMatrix(data)\n    if (confusionMatrix.truePositive == 0) {\n      0.0\n    } else {\n      confusionMatrix.truePositive /\n        (confusionMatrix.truePositive + confusionMatrix.falsePositive)\n    }\n  }\n\n  /**\n    * Computes the False Positive Rate for a given set of model predictions.\n    * Refer: https://en.wikipedia.org/wiki/Confusion_matrix\n    *\n    * @param data Sequence of model predictions\n    *             We assume that the predictions are thresholded to 0/1.\n    * @return The FPR for the given predictions\n    */\n  def computeFalsePositiveRate(data: Seq[ModelPrediction]): Double = {\n    val confusionMatrix = computeGeneralizedConfusionMatrix(data)\n    if (confusionMatrix.falsePositive == 0) {\n      0.0\n    } else {\n      confusionMatrix.falsePositive /\n        (confusionMatrix.falsePositive + confusionMatrix.trueNegative)\n    }\n  }\n\n  /**\n    * Computes the False Negative Rate for a given set of model predictions.\n    * Refer: https://en.wikipedia.org/wiki/Confusion_matrix\n    *\n    * @param data Sequence of model predictions\n    *             We assume that the predictions are thresholded to 0/1.\n    * @return The FNR for the given predictions\n    */\n  def computeFalseNegativeRate(data: Seq[ModelPrediction]): Double = {\n    val confusionMatrix = computeGeneralizedConfusionMatrix(data)\n    if (confusionMatrix.falseNegative == 0) {\n      0.0\n    } else {\n      confusionMatrix.falseNegative /\n        (confusionMatrix.truePositive + confusionMatrix.falseNegative)\n    }\n  }\n\n  /**\n    * Computes the recall/sensitivity/True Positive Rate for a given set of\n    * model predictions.\n    * Refer: https://en.wikipedia.org/wiki/Confusion_matrix\n    *\n    * @param data Sequence of model predictions\n    *             We assume that the predictions are thresholded to 0/1.\n    * @return The recall for the given predictions\n    */\n  def computeRecall(data: 
Seq[ModelPrediction]): Double = {\n    val confusionMatrix = computeGeneralizedConfusionMatrix(data)\n    if (confusionMatrix.truePositive == 0) {\n      0.0\n    } else {\n      confusionMatrix.truePositive /\n        (confusionMatrix.truePositive + confusionMatrix.falseNegative)\n    }\n  }\n\n  /**\n    * Computes the True Negative Rate for a given set of model predictions.\n    * Refer: https://en.wikipedia.org/wiki/Confusion_matrix\n    *\n    * @param data Sequence of model predictions\n    *             We assume that the predictions are thresholded to 0/1.\n    * @return The TNR for the given predictions\n    */\n  def computeTrueNegativeRate(data: Seq[ModelPrediction]): Double = {\n    val confusionMatrix = computeGeneralizedConfusionMatrix(data)\n    if (confusionMatrix.trueNegative == 0) {\n      0.0\n    } else {\n      confusionMatrix.trueNegative /\n        (confusionMatrix.trueNegative + confusionMatrix.falsePositive)\n    }\n  }\n\n  /**\n    * Compute the ROC curve for a given sequence of labels and predictions.\n    * The implementation here is similar to NumPy's ROC curve computation.\n    *\n    * @param data The sequence of labels and predictions\n    * @return The False Positive Rate (FPR) and True Positive Rate (TPR) values\n    *         for the model, making up the X and Y axes of the ROC curve\n    */\n  def computeROCCurve(data: Seq[ModelPrediction]): (Seq[Double], Seq[Double]) = {\n    val descSortedData = data.sortBy(-_.prediction)\n\n    // Select the largest index for each unique prediction value. For example,\n    // for [0.8, 0.7, 0.7, 0.6, 0.5, 0.5, 0.5], we get [0, 2, 3, 6].\n    // We do this by finding the indices with non-zero differences between\n    // adjacent elements. 
We need to force-select the last index.\n    val thresholdIdxs = descSortedData.sliding(2)\n      .zipWithIndex\n      .collect { case (slidingWindow, idx)\n        if (slidingWindow(1).prediction - slidingWindow.head.prediction) != 0 =>\n        idx\n      }.toList :+ (descSortedData.length - 1)\n\n    val cumSums = descSortedData.scanLeft(0.0)(_ + _.label)\n      .tail // Drop the initial 0.0 that gets added to the list\n\n    // Select the cumulative sums at the identified thresholds. This would be the\n    // number of true positives, since the labels are 0 or 1.\n    val truePositives = thresholdIdxs.collect(cumSums)\n    val falsePositives = thresholdIdxs.zip(truePositives)\n      .map { case (thresholdIdx, truePositive) =>\n        // 1 + thresholdIdx is the number of entries marked +ve at that threshold\n        // so subtracting the TP count would give us the FP count.\n        1 + thresholdIdx - truePositive\n      }\n\n    // The last entries in the FP and TP counts would be the values when\n    // all datapoints are predicted as 1. This means that TN and FN would be zero\n    // due to no negative predictions. Thus, N = FP + TN = FP, and P = TP + FN = TP.\n    // That is, the total positive and negative counts are given by the last entries\n    // in the TP and FP lists respectively.\n    val numPositives = truePositives.lastOption.getOrElse(0.0)\n    val numNegatives = falsePositives.lastOption.getOrElse(0.0)\n\n    val fpr = falsePositives.map(_ / numNegatives)\n    val tpr = truePositives.map(_ / numPositives)\n    (fpr, tpr)\n  }\n\n  /**\n    * Compute AUC for a given sequence of labels and predictions.\n    *\n    * This works by computing the ROC curve, and estimates the integral of\n    * y = f(x) (where x is the fpr and y is the tpr) by using the trapezoidal\n    * rule. 
This is similar to how NumPy and Spark MLLib estimate AUC.\n    *\n    * @param data The sequence of labels and predictions\n    * @return The AUC for the model\n    */\n  def computeAUC(data: Seq[ModelPrediction]): Double = {\n    val (fpr, tpr) = computeROCCurve(data)\n\n    if (fpr.length == 1 && tpr.length == 1) {\n      0.0\n    } else {\n      // Integral from a to b of f(x) is estimated by computing the area of the\n      // trapezium (formed by a, b, f(a) and f(b)) as (b-a) * (f(a) + f(b)) / 2\n      fpr.zip(tpr)\n        .sliding(2)\n        .foldLeft(0.0) { case (auc, slidingWindow) =>\n          val xDiff = slidingWindow(1)._1 - slidingWindow.head._1\n          val yAvg = (slidingWindow(1)._2 + slidingWindow.head._2) / 2.0\n          auc + (xDiff * yAvg)\n        }\n    }\n  }\n}\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/lib/testing/TestCustomMetric.scala",
    "content": "package com.linkedin.lift.lib.testing\n\nimport com.linkedin.lift.types.{CustomMetric, ModelPrediction}\n\n/**\n  * Custom metric test class. Returns 1.0\n  */\nclass TestCustomMetric extends CustomMetric {\n  override def compute(data: Seq[ModelPrediction]): Double = {\n    1.0\n  }\n}\n\n\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/lib/testing/TestUtils.scala",
    "content": "package com.linkedin.lift.lib.testing\n\nimport com.linkedin.lift.types.ScoreWithLabelAndAttribute\nimport org.apache.spark.SparkConf\nimport org.apache.spark.sql.expressions.Window\nimport org.apache.spark.sql.functions.rank\nimport org.apache.spark.sql.types.StructType\nimport org.apache.spark.sql.{DataFrame, Dataset, SparkSession}\n\nimport scala.reflect.ClassTag\nimport scala.reflect.runtime.universe._\n\n/**\n  * Common utilities for testing purposes\n  */\nobject TestUtils {\n  /**\n    * Creates DataFrame from a subclass of Product\n    */\n  def createDFFromProduct[T <: Product: ClassTag](spark: SparkSession, data: Seq[T])\n    (implicit t: TypeTag[T]):\n    DataFrame = {\n    val rdd = spark.sparkContext.parallelize(data)\n    spark.createDataFrame(rdd)\n  }\n\n  /**\n    * Create the local SparkSession used for general-purpose Spark unit tests.\n    *\n    * @param appName: Name of the local spark app.\n    * @param numThreads: Parallelism of the local spark app, the input of\n    *                  numThreads can either be an integer or the character '*'\n    *                  which means spark will use as many worker\n    *                  threads as the logical cores.\n    */\n  def createSparkSession(appName: String = \"localtest\", numThreads: Any = 4):\n    SparkSession = {\n    val sparkConf: SparkConf = {\n      // Turn on Kryo serialization by default\n      val conf = new SparkConf()\n      conf.set(\"spark.serializer\", \"org.apache.spark.serializer.KryoSerializer\")\n      conf.set(\"spark.driver.host\", \"localhost\")\n      conf\n    }\n\n    // numThreads can either be an integer or '*' which means Spark will\n    // use as many worker threads as the logical cores\n    if (!numThreads.isInstanceOf[Int] && !numThreads.equals(\"*\")) {\n      throw new IllegalArgumentException(s\"Invalid arguments: The number of \" +\n        s\"threads ($numThreads) can only be integers or '*'.\")\n    }\n\n    SparkSession.builder\n      
.appName(appName)\n      .master(s\"local[$numThreads]\")\n      .config(sparkConf)\n      .getOrCreate()\n  }\n\n  /**\n    * Loading csv data\n    *\n    * @param spark spark session\n    * @param dataPath data path\n    * @param dataSchema data schema\n    * @param delimiter data separating delimiter\n    * @param numPartitions number of partitions\n    * @return loaded data as a dataframe\n    */\n  def loadCsvData(spark: SparkSession, dataPath: String, dataSchema: StructType, delimiter: String,\n    numPartitions: Int = 100): DataFrame = {\n    spark.read.format(\"csv\")\n      .option(\"header\", value = true)\n      .option(\"delimiter\", delimiter)\n      .option(\"numPartitions\", numPartitions)\n      .schema(dataSchema)\n      .load(dataPath)\n  }\n\n  /**\n    * To apply the effect of position bias (i.e. positive response decay), we multiply the labels with\n    * Bernoulli(1/log(1 + position)) random numbers (the probability is effectively capped at 1),\n    * where the position corresponds to the rank of an item in a session (according to item scores).\n    *\n    * @param dataWithoutPositionBias a dataset containing sessionId, score, position and label\n    * @return dataset with modified labels\n    */\n  def applyPositionBias(dataWithoutPositionBias: Dataset[ScoreWithLabelAndAttribute]):\n  Dataset[ScoreWithLabelAndAttribute] = {\n    import dataWithoutPositionBias.sparkSession.implicits._\n    dataWithoutPositionBias\n      .withColumn(\"position\", rank().over(Window.partitionBy($\"sessionId\").orderBy($\"score\".desc)))\n      .as[ScoreWithLabelAndAttribute]\n      .map(row => row.copy(label = if (math.random < 1 / math.log(1 + row.position.getOrElse(1)) &&\n        row.label == 1) 1 else 0))\n  }\n}\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/lib/testing/TestValues.scala",
    "content": "package com.linkedin.lift.lib.testing\n\nimport com.linkedin.lift.types.ScoreWithLabelAndPosition\nimport org.apache.spark.mllib.random.RandomRDDs.normalRDD\nimport org.apache.spark.rdd.RDD\nimport org.apache.spark.sql.{DataFrame, Dataset, SparkSession}\n\n/**\n  * Common values for testing purposes\n  */\nobject TestValues {\n  val spark: SparkSession = TestUtils.createSparkSession(numThreads = \"*\")\n  import spark.implicits._\n\n  case class JoinedData(memberId: Int, label: String,\n    predicted: String, gender: String, qid: String = \"\")\n  val testData: Seq[JoinedData] = Seq(\n    JoinedData(12340, \"0\", \"0\", \"MALE\"),\n    JoinedData(12341, \"1\", \"0\", \"MALE\"),\n    JoinedData(12342, \"0\", \"1\", \"MALE\"),\n    JoinedData(12343, \"0\", \"0\", \"MALE\"),\n    JoinedData(12344, \"1\", \"1\", \"MALE\"),\n    JoinedData(12345, \"0\", \"1\", \"UNKNOWN\"),\n    JoinedData(12346, \"1\", \"1\", \"FEMALE\"),\n    JoinedData(12347, \"1\", \"0\", \"FEMALE\"),\n    JoinedData(12348, \"0\", \"0\", \"FEMALE\"),\n    JoinedData(12349, \"0\", \"1\", \"FEMALE\"))\n  val df: DataFrame = TestUtils.createDFFromProduct(TestValues.spark, testData)\n\n  val testData2: Seq[JoinedData] = Seq(\n    JoinedData(12340, \"0.0\", \"0.3\", \"MALE\", \"1\"),\n    JoinedData(12341, \"1.0\", \"0.4\", \"MALE\", \"2\"),\n    JoinedData(12342, \"0.0\", \"0.8\", \"MALE\", \"3\"),\n    JoinedData(12343, \"0.0\", \"0.1\", \"MALE\", \"3\"),\n    JoinedData(12344, \"1.0\", \"0.7\", \"MALE\", \"1\"),\n    JoinedData(12345, \"0.0\", \"0.6\", \"UNKNOWN\", \"2\"),\n    JoinedData(12346, \"1.0\", \"0.9\", \"FEMALE\", \"2\"),\n    JoinedData(12347, \"1.0\", \"0.3\", \"FEMALE\", \"3\"),\n    JoinedData(12348, \"0.0\", \"0.2\", \"FEMALE\", \"2\"),\n    JoinedData(12349, \"0.0\", \"0.8\", \"FEMALE\", \"1\"))\n  val df2: DataFrame = TestUtils.createDFFromProduct(TestValues.spark, testData2)\n\n  // test data for PositionBiasUtils\n  val score00: RDD[ScoreWithLabelAndPosition] =\n    
normalRDD(spark.sparkContext, 1000L, 1, 12)\n    .map(x => ScoreWithLabelAndPosition(x, 0, 1))\n  val score10: RDD[ScoreWithLabelAndPosition] =\n    normalRDD(spark.sparkContext, 1000L, 1, 123)\n    .map(x => ScoreWithLabelAndPosition(x, 1, 1))\n  val score01: RDD[ScoreWithLabelAndPosition] =\n    normalRDD(spark.sparkContext, 200L, 1, 1234)\n    .map(x => ScoreWithLabelAndPosition(x, 0, 2))\n  val score11: RDD[ScoreWithLabelAndPosition] =\n    normalRDD(spark.sparkContext, 800L, 1, 12345)\n    .map(x => ScoreWithLabelAndPosition(x, 1, 2))\n  val score02: RDD[ScoreWithLabelAndPosition] =\n    normalRDD(spark.sparkContext, 100L, 1, 23)\n    .map(x => ScoreWithLabelAndPosition(x - 0.5, 0, 3))\n  val score12: RDD[ScoreWithLabelAndPosition] =\n    normalRDD(spark.sparkContext, 600L, 1, 234)\n    .map(x => ScoreWithLabelAndPosition(x - 0.5, 1, 3))\n\n  val positionBiasData: Dataset[ScoreWithLabelAndPosition] =\n    (score00 ++ score01 ++ score10 ++ score11 ++ score02 ++ score12).toDS\n}\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/mitigation/EOppUtils.scala",
    "content": "package com.linkedin.lift.mitigation\n\nimport com.linkedin.lift.types.{ScoreWithAttribute, ScoreWithLabelAndAttribute}\nimport org.apache.spark.sql.{DataFrame, Dataset}\n\n/**\n  * Utilities for learning and applying an equality of opportunity transformation\n  * (based on https://arxiv.org/abs/2006.11350)\n  */\nobject EOppUtils {\n  /**\n    * This is a helper function for applyTransformation() below.\n    * Transforming a single score using a transformation function given as a scala map.\n    * We first perform a binary search to determine the closest lowerBound and upperBound of the score in the keys\n    * of the transformation map. Then we transform the score assuming that the transformation function is linear in\n    * the interval (lowerBound, upperBound).\n    *\n    * @param score          score value\n    * @param sortedKeys     sorted keys of the transformation map\n    * @param transformation transformation function given as a scala map\n    * @return transformed score\n    */\n  def transformScore(score: Double, sortedKeys: Seq[Double], transformation: Map[Double, Double]): Double = {\n    if (score <= sortedKeys.head) {\n      return transformation(sortedKeys.head)\n    } else if (score >= sortedKeys.last) {\n      return transformation(sortedKeys.last)\n    }\n\n    var left = 0\n    var right = sortedKeys.length - 1\n\n    while (left <= right) {\n      val mid = left + (right - left) / 2\n      if (sortedKeys(mid) <= score)\n        left = mid + 1\n      else\n        right = mid - 1\n      if (score <= sortedKeys(left)) {\n        right = left - 1\n      } else if (score >= sortedKeys(right)) {\n        left = right + 1\n      }\n    }\n\n    val lowerBound = sortedKeys(right)\n    val upperBound = sortedKeys(left)\n    val deltaProportion = (score - lowerBound) / (upperBound - lowerBound)\n\n    transformation(lowerBound) + deltaProportion * (transformation(upperBound) - transformation(lowerBound))\n\n  }\n\n  /**\n    * 
Transform scores of a dataset based on the corresponding attribute using transformScore().\n    *\n    * @param data          dataset containing score and attribute\n    * @param attributeList list of attributes\n    * @param transformations transformations represented as a scala Map for each attribute\n    * @param numPartitions number of partitions to repartition the filtered data into before transforming\n    * @return transformed scores\n    */\n  def applyTransformation(data: Dataset[ScoreWithAttribute], attributeList: Seq[String],\n    transformations: Map[String, Map[Double, Double]], numPartitions: Int = 1000): Dataset[ScoreWithAttribute] = {\n    import data.sparkSession.implicits._\n    val sortedKeys: Map[String, Seq[Double]] = attributeList.zip(attributeList.map(\n      transformations(_).keys.toSeq.sorted)).toMap\n\n    data\n      .filter($\"attribute\".isin(attributeList: _*))\n      .repartition(numPartitions)\n      .map(row => row.copy(score = transformScore(row.score, sortedKeys(row.attribute), transformations(row.attribute)))\n      )\n  }\n\n  /**\n    * Compute the empirical CDF.\n    *\n    * @param data              dataframe containing \"score\"\n    * @param probabilities     array of probabilities for computing quantiles\n    * @param relativeTolerance relative tolerance for computing approximate quantiles\n    * @return the eCDF as a scala map\n    */\n  def cdfTransformation(data: DataFrame, probabilities: Array[Double],\n    relativeTolerance: Double): Map[Double, Double] = {\n    val quantiles = data.stat.approxQuantile(\"score\", probabilities, relativeTolerance)\n    quantiles.zip(probabilities).toMap\n  }\n\n  /**\n    * Adjust transformation such that the transformed score distribution is the same as the baseline score distribution\n    *\n    * @param baselineData dataset containing score and attribute\n    * @param attributeList list of attributes\n    * @param transformations transformations represented as a scala Map for each attribute\n    * @param numQuantiles the number of quantiles for computing a 
quantile-quantile map between the original score\n    *                     and the transformed score\n    * @param relativeTolerance relative tolerance for computing approximate quantiles\n    * @return modified transformations\n    */\n  def adjustScale(baselineData: Dataset[ScoreWithAttribute], attributeList: Seq[String],\n    transformations: Map[String, Map[Double, Double]],\n    numQuantiles: Int, relativeTolerance: Double): Map[String, Map[Double, Double]] = {\n\n    import baselineData.sparkSession.implicits._\n\n    val filteredData = baselineData\n      .filter($\"attribute\".isin(attributeList: _*))\n      .as[ScoreWithAttribute]\n\n    val probabilities = Array.range(0, numQuantiles + 1).map(x => x.toDouble / numQuantiles)\n\n    val quantilesBeforeTransformation = filteredData.stat.approxQuantile(\"score\", probabilities,\n      relativeTolerance)\n    val transformedData = applyTransformation(filteredData, attributeList, transformations)\n\n    val quantilesAfterTransformation = transformedData.stat.approxQuantile(\"score\", probabilities,\n      relativeTolerance)\n\n    val qqMap = quantilesAfterTransformation.zip(quantilesBeforeTransformation).toMap\n\n    transformations.transform((attribute, innerMap) => innerMap.transform((key, value) =>\n      transformScore(value, quantilesAfterTransformation.toSeq, qqMap)))\n  }\n\n  /**\n    * Learning the equality of opportunity (EOpp) transformation for datasets.\n    * By setting originalScale = true, a score distribution preserving transformation can be learned.\n    * However, this may affect the quality of the output (i.e. 
the EOpp transformation), especially when numQuantiles is\n    * not large enough.\n    *\n    * @param data              dataset containing score, label, attribute\n    * @param attributeList     list of attributes\n    * @param numQuantiles      number of points for representing transformation functions\n    *                          (quantile-quantile mappings).\n    * @param relativeTolerance relative tolerance for computing approximate quantiles\n    * @param originalScale     whether the distribution of the transformed score should be the same as the distribution\n    *                          before transformation.\n    * @return EOpp transformation represented as a scala Map[Double, Double] for each attribute\n    */\n  def eOppTransformation(data: Dataset[ScoreWithLabelAndAttribute], attributeList: Seq[String],\n    numQuantiles: Int = 10000, relativeTolerance: Double = 1e-6, originalScale: Boolean = false):\n  Map[String, Map[Double, Double]] = {\n    import data.sparkSession.implicits._\n\n    val probabilities = Array.range(0, numQuantiles + 1).map(x => x.toDouble / numQuantiles)\n    val eOppMaps = attributeList.zip(attributeList.map(attribute =>\n      cdfTransformation(data.filter($\"label\" === 1 and $\"attribute\" === attribute).toDF,\n        probabilities, relativeTolerance)\n    )).toMap\n\n    if (!originalScale) {\n      eOppMaps\n    } else {\n      adjustScale(data.drop(\"label\").as[ScoreWithAttribute], attributeList, eOppMaps,\n        numQuantiles, relativeTolerance)\n    }\n  }\n\n}\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/types/BenefitMap.scala",
    "content": "package com.linkedin.lift.types\n\nimport com.linkedin.lift.lib.StatsUtils\nimport com.linkedin.lift.types.Distribution.DimensionValues\n\n/**\n  * Class representing the benefits for different categories. It is a map\n  * from a category (specified as DimensionValues) to the corresponding benefit\n  * value. Examples of a category include [gender = Female], [gender = Male,\n  * age >= 40], and [age < 40, disability = yes]. Examples of a benefit value\n  * include AUC, precision, recall, error rate, FPR, FNR, FDR, and FOR. We\n  * assume that this map is non-empty, the benefit values are non-negative,\n  * at least one benefit value is positive, and the benefit value for missing\n  * dimensions equals zero.\n  *\n  * @param entries The map that represents benefits for different categories.\n  * @param benefitType The benefit metric whose values are stored in the entries.\n  */\ncase class BenefitMap(\n  entries: Map[DimensionValues, Double],\n  benefitType: String\n) {\n\n  val errorTolerance = 1e-12\n\n  /**\n    * Computes the mean of the benefit values\n    *\n    * @return The mean of all the entries\n    */\n  def mean: Double = entries.values.sum / entries.size\n\n  /**\n    * Computes the population variance of the benefit values\n    *\n    * While computing the inequality measures, we would be considering the\n    * entire \"population\" consisting of all dimensions (as opposed to\n    * \"sampling\" from the set of potential dimensions). 
Hence, we treat as\n    * though we calculate the population variance and do not apply Bessel's\n    * correction (https://en.wikipedia.org/wiki/Bessel%27s_correction).\n    *\n    * @return The population variance of all the entries\n    */\n  def variance: Double = {\n    entries.values.map(math.pow(_, 2)).sum / entries.size -\n      math.pow(this.mean, 2)\n  }\n\n  /**\n    * Get the value corresponding to a given DimensionValue\n    *\n    * @param key DimensionValue of interest\n    * @return value if present, else 0.0\n    */\n  def getValue(key: DimensionValues): Double = entries.getOrElse(key, 0.0)\n\n  /**\n    * Generalized entropy index as a measure of inequality of the distribution\n    * of the benefits over categories. References:\n    * https://arxiv.org/abs/1902.04783\n    * https://arxiv.org/abs/1807.00787\n    * https://en.wikipedia.org/wiki/Generalized_entropy_index\n    *\n    * We assume that the benefits are positive whenever alpha is set to 0 or 1,\n    * so it is recommended to use the useAbsVal flag (to convert benefit vectors\n    * into their positive counterparts) if the vector might contain negative values.\n    *\n    * @param alpha Parameter which regulates the weight given to distances\n    *              between benefits at different parts of the distribution.\n    * @return Generalized entropy index\n    */\n  def computeGeneralizedEntropyIndex(alpha: Double,\n    useAbsVal: Boolean = false): Double = {\n    val count = entries.size\n    val updatedBenefitMap =\n      if (useAbsVal) {\n        val posEntries = entries.map { case (dimVal, entry) =>\n          (dimVal, math.abs(entry))\n        }\n        BenefitMap(entries = posEntries, benefitType = this.benefitType)\n      } else {\n        this\n      }\n    val mean = updatedBenefitMap.mean\n\n    val normalizedBenefits = updatedBenefitMap.entries.map { case (_, benefit) =>\n      benefit / mean\n    }\n\n    if (math.abs(alpha - 1.0) < errorTolerance) {\n      
normalizedBenefits.map(x => x * math.log(x)).sum / count\n    } else if (math.abs(alpha) < errorTolerance) {\n      normalizedBenefits.map(x => - math.log(x)).sum / count\n    } else {\n      normalizedBenefits.map(x => math.pow(x, alpha) - 1.0).sum /\n        (count * alpha * (alpha - 1.0))\n    }\n  }\n\n  /**\n    * Theil T index as a measure of inequality of the distribution of the\n    * benefits over categories. Reference:\n    * https://en.wikipedia.org/wiki/Theil_index\n    *\n    * We assume that the distribution has only positive values.\n    *\n    * @return Theil T index\n    */\n  def computeTheilTIndex: Double = computeGeneralizedEntropyIndex(1.0, useAbsVal = true)\n\n  /**\n    * Theil L index as a measure of inequality of the distribution of the\n    * benefits over categories (also known as the mean log deviation).\n    * References:\n    * https://en.wikipedia.org/wiki/Theil_index\n    * https://en.wikipedia.org/wiki/Mean_log_deviation\n    *\n    * We assume that the distribution has only positive values.\n    *\n    * @return Theil L index\n    */\n  def computeTheilLIndex: Double = computeGeneralizedEntropyIndex(0.0, useAbsVal = true)\n\n  /**\n    * Atkinson index as a measure of inequality of the distribution\n    * of the benefits over categories. References:\n    * https://en.wikipedia.org/wiki/Atkinson_index\n    * Atkinson, On the measurement of inequality.\n    * Journal of Economic Theory, 2 (3), 1970\n    * http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.521.849&rep=rep1&type=pdf\n    * https://statisticalhorizons.com/wp-content/uploads/Inequality.pdf (Note\n    * that there is a typo in equation 17)\n    *\n    * We assume that the benefits are positive whenever epsilon > 1. 
Although\n    * Atkinson index can be expressed in terms of generalized entropy index,\n    * we compute directly for simplicity and to avoid positivity assumption\n    * when epsilon = 0.\n    *\n    * @param epsilon Inequality aversion parameter (greater than or equal to\n    *                zero, with zero corresponding to no aversion to inequality)\n    * @return Atkinson index\n    */\n  def computeAtkinsonIndex(epsilon: Double): Double = {\n    val count = entries.size\n    val mean = this.mean\n\n    val normalizedBenefits = entries.map {case (_, benefit) =>\n      benefit / mean\n    }\n\n    val alpha = 1 - epsilon\n    if (math.abs(alpha) < errorTolerance) {\n      1.0 - math.pow(normalizedBenefits.product, 1.0/count)\n    } else {\n      val normalizedBenefitPowerMean = normalizedBenefits\n        .map(math.pow(_, alpha))\n        .sum / count\n\n      1.0 - math.pow(normalizedBenefitPowerMean, 1.0/alpha)\n    }\n  }\n\n  /**\n    * Coefficient of variation as a measure of inequality of the distribution\n    * of the benefits over categories. 
References:\n    * https://en.wikipedia.org/wiki/Coefficient_of_variation\n    * https://statisticalhorizons.com/wp-content/uploads/Inequality.pdf\n    *\n    * Although coefficient of variation can be expressed in terms of\n    * generalized entropy index (GEI with alpha = 2 equals half the squared\n    * coefficient of variation), we compute directly for simplicity.\n    *\n    * @return Coefficient of variation\n    */\n  def computeCoefficientOfVariation: Double = {\n    math.sqrt(this.variance) / this.mean\n  }\n\n  /**\n    * Compute the requested metric, passing along any metric-specific parameters.\n    *\n    * @param metric The metric of interest\n    * @param metricParam A metric-specific parameter\n    * @return The computed metric of interest\n    */\n  def computeMetric(metric: String, metricParam: String): Option[FairnessResult] = {\n    metric match {\n      case \"GENERALIZED_ENTROPY_INDEX\" =>\n        Some(FairnessResult(\n          resultType = s\"$benefitType: $metric\",\n          resultValOpt = Some(computeGeneralizedEntropyIndex(metricParam.toDouble)),\n          parameters = metricParam,\n          constituentVals = Map()))\n      case \"ATKINSON_INDEX\" =>\n        Some(FairnessResult(\n          resultType = s\"$benefitType: $metric\",\n          resultValOpt = Some(computeAtkinsonIndex(metricParam.toDouble)),\n          parameters = metricParam,\n          constituentVals = Map()))\n      case \"THEIL_L_INDEX\" =>\n        Some(FairnessResult(\n          resultType = s\"$benefitType: $metric\",\n          resultValOpt = Some(computeTheilLIndex),\n          parameters = metricParam,\n          constituentVals = Map()))\n      case \"THEIL_T_INDEX\" =>\n        Some(FairnessResult(\n          resultType = s\"$benefitType: $metric\",\n          resultValOpt = Some(computeTheilTIndex),\n          parameters = metricParam,\n          constituentVals = Map()))\n      case \"COEFFICIENT_OF_VARIATION\" =>\n        Some(FairnessResult(\n          
resultType = s\"$benefitType: $metric\",\n          resultValOpt = Some(computeCoefficientOfVariation),\n          parameters = metricParam,\n          constituentVals = Map()))\n      case _ => None\n    }\n  }\n\n  /**\n    * Compute the aggregate metrics requested, and append the benefit metric\n    * used for these computations to the returned list of results.\n    *\n    * @param overallMetrics The aggregate metrics to compute\n    * @return The sequence of FairnessResults\n    */\n  def computeOverallMetrics(overallMetrics: Map[String, String]): Seq[FairnessResult] = {\n    val overallMetricsSeq =\n      overallMetrics.flatMap { case (overallMetric, metricParam) =>\n        computeMetric(overallMetric, metricParam)\n      }.toList\n\n    FairnessResult(\n      resultType = s\"Benefit Map for $benefitType\",\n      resultValOpt = None,\n      constituentVals = entries\n    ) +: overallMetricsSeq\n  }\n}\n\nobject BenefitMap {\n  /**\n    * Compute a Benefit Map that captures a benefit value for each dimension\n    * value, from a given set of model predictions and benefit function.\n    *\n    * @param predictions The model predictions to analyze\n    * @param dimensionType The dimension type of interest\n    * @param benefitMetric The benefit metric to compute for each dimension value\n    * @return The computed benefit map\n    */\n  def compute(predictions: Seq[ModelPrediction], dimensionType: String,\n    benefitMetric: String): BenefitMap = {\n    val benefitFn = StatsUtils.getMetricFn(benefitMetric)\n    val benefitEntries: Map[DimensionValues, Double] =\n      predictions.groupBy(_.dimensionValue)\n        .map { case (dimVal, entries) =>\n          (Map(dimensionType -> dimVal), benefitFn(entries))\n        }\n    BenefitMap(entries = benefitEntries, benefitType = benefitMetric)\n  }\n}\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/types/CustomMetric.scala",
    "content": "package com.linkedin.lift.types\n\n/**\n  * Abstract class that needs to be extended in case a custom metric needs to\n  * be computed. The compute method needs to be overridden.\n  */\nabstract class CustomMetric {\n  /**\n    * Compute a user-defined metric given a sequence of model predictions.\n    *\n    * @param data A sample of model predictions\n    * @return The custom computed metric\n    */\n  def compute(data: Seq[ModelPrediction]): Double\n}\n\n\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/types/Distribution.scala",
    "content": "package com.linkedin.lift.types\n\nimport com.linkedin.lift.types.Distribution.DimensionValues\nimport org.apache.spark.sql.functions.{col, count}\nimport org.apache.spark.sql.types.{DoubleType, StringType, StructField, StructType}\nimport org.apache.spark.sql.{DataFrame, Row, SparkSession}\n\n/**\n  * Class representing a data distribution. It is a map from a set of dimension\n  * entries to their corresponding value. We assume that this is a sparse\n  * representation, ie., missing dimensions correspond to a value of zero. Note\n  * that the class is not aware of the set of all possible dimension values -\n  * it will return a value of zero for any key it doesn't contain.\n  *\n  * The Distribution can be a frequency distribution or a discrete probability\n  * distribution.\n  *\n  * @param entries The map that represents the distribution.\n  */\ncase class Distribution(\n  entries: Map[DimensionValues, Double]\n) {\n\n  /**\n    * Computes the sum of the distribution\n    *\n    * @return The sum of all the entries\n    */\n  def sum: Double = entries.values.sum\n\n  /**\n    * Computes the max of the distribution\n    *\n    * @return The max of all the entries\n    */\n  def max: Double = entries.values.max\n\n  /**\n    * Get the value corresponding to a given DimensionValue\n    *\n    * @param key DimensionValue of interest\n    * @return value if present, else 0.0\n    */\n  def getValue(key: DimensionValues): Double = entries.getOrElse(key, 0.0)\n\n  /**\n    * Zips this distribution with another distribution\n    *\n    * @param other The other distribution to zip with\n    * @return An iterable over the two distributions, with imputed values for\n    *         dimensions missing in either distribution. 
Since we are not aware\n    *         of the set of all dimension values, we cannot impute values for\n    *         dimensions missing in both distributions.\n    */\n  def zip(other: Distribution): Seq[(DimensionValues, Double, Double)] = {\n    (this.entries.keys ++ other.entries.keys).map { key =>\n      (key, this.getValue(key), other.getValue(key))\n    }.toSeq\n  }\n\n  /**\n    * Computes marginal distribution with respect to the specified set of\n    * dimensions\n    *\n    * @param groupByCols Dimensions to group by\n    * @return  The resultant marginal distribution\n    */\n  def computeMarginal(groupByCols: Set[String]): Distribution = {\n    val marginalDistributionEntries = entries.toSeq\n      .map {  case (dimVals, count) =>\n        val marginalDimensions = groupByCols.map { groupByCol =>\n          (groupByCol, dimVals.getOrElse(groupByCol, \"\"))\n        }.toMap\n\n        (marginalDimensions, count)\n      }\n      .groupBy(_._1)\n      .map { case (marginalDimensions, countsGroup) =>\n        (marginalDimensions, countsGroup.map(_._2).sum)}\n\n    Distribution(entries = marginalDistributionEntries)\n  }\n\n  /**\n    * Convert the Distribution into a DataFrame\n    *\n    * @param spark The current Spark Session\n    * @return A DataFrame with column names (dim1, ...., dimN, count).\n    */\n  def toDF(spark: SparkSession): DataFrame = {\n    val allKeys = entries.keySet.flatMap(_.keySet).toSeq\n\n    // Build a schema corresponding to the distribution entries\n    val schema = StructType(\n      allKeys.map(StructField(_, StringType)) :+\n        StructField(\"count\", DoubleType))\n\n    // Build an RDD with the distribution entries\n    val entriesSeq = entries.map { case (dimVals, count) =>\n      val entries: Seq[Any] = allKeys.map(dimVals.getOrElse(_, \"\")) :+ count\n      entries\n    }.toSeq\n    val rowData = entriesSeq.map(entry => Row(entry: _*))\n    val rdd = spark.sparkContext.parallelize(rowData)\n\n    
spark.createDataFrame(rdd, schema)\n  }\n}\n\nobject Distribution {\n\n  type DimensionValues = Map[String, String]\n\n  /**\n    * Create a Distribution instance given a DataFrame and the fields to group on.\n    *\n    * @param df DataFrame to be grouped\n    * @param groupByCols Dimensions to group by\n    * @return The resultant distribution\n    */\n  def compute(df: DataFrame, groupByCols: Set[String]): Distribution = {\n    val groupBySqlCols = groupByCols.map(col).toSeq\n    val distributionEntries = df.select(groupBySqlCols: _*)\n      .groupBy(groupBySqlCols: _*)\n      .agg(count(\"*\"))\n      .collect\n      .toSeq\n      .map { row =>\n        val rowSeq = row.toSeq.map { Option(_).fold(\"\") { _.toString } }\n        val groupingVals = rowSeq.take(groupByCols.size)\n        val countVal = rowSeq.drop(groupByCols.size).head\n        val dimensions = groupByCols.zip(groupingVals).toMap\n\n        (dimensions, countVal.toDouble)\n      }\n      .toMap\n\n    Distribution(entries = distributionEntries)\n  }\n}\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/types/EOppCaseClasses.scala",
    "content": "package com.linkedin.lift.types\n\ncase class ScoreWithLabelAndPosition(score: Double, label: Int, position: Int, attribute: Option[String] = None)\n\ncase class ScoreWithAttribute(itemId: Int, score: Double, attribute: String,\n  sessionId: Option[Int] = None, position: Option[Int] = None)\n\ncase class ScoreWithLabelAndAttribute(itemId: Int, score: Double, label: Int, attribute: String,\n  sessionId: Option[Int] = None, position: Option[Int] = None)\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/types/FairnessResult.scala",
    "content": "package com.linkedin.lift.types\n\nimport com.linkedin.lift.types.Distribution.DimensionValues\nimport org.apache.spark.sql.{DataFrame, SparkSession}\n\n/**\n  * Captures the results of a generic fairness metric computation.\n  *\n  * @param resultType Description/title of the computed metric\n  * @param parameters Any parameters that were used in the computation\n  * @param resultValOpt The result of the computation. Some results involve a\n  *                     single metric, in which case this field is used. It is\n  *                     None otherwise.\n  * @param constituentVals Values/results that the result is comprised of. Some\n  *                        metrics produce a list of values, while others\n  *                        produce a single overall value. In the latter case,\n  *                        we attempt to capture the contributions of the\n  *                        individual dimensions in this field.\n  * @param additionalStats Any additional statistics related to the computation.\n  */\ncase class FairnessResult(\n  resultType: String,\n  parameters: String = \"\",\n  resultValOpt: Option[Double],\n  constituentVals: Map[DimensionValues, Double],\n  additionalStats: Map[String, Double] = Map()\n) {\n  /**\n    * Convert a FairnessResult into a BenefitMap. 
This works by using the\n    * constituent values of the FairnessResult as the benefit vector.\n    *\n    * @return The resultant BenefitMap\n    */\n  def toBenefitMap: BenefitMap = {\n    BenefitMap(entries = constituentVals, benefitType = resultType)\n  }\n}\n\nobject FairnessResult {\n  // Avro schemas allow only String keys for Maps\n  private case class AvroCompatibleResult(\n    resultType: String,\n    parameters: String,\n    resultValOpt: Option[Double],\n    constituentVals: Map[String, Double],\n    additionalStats: Map[String, Double])\n\n  /**\n    * Create an Avro-compatible DataFrame from a sequence of results.\n    *\n    * @param spark The Spark Session\n    * @param results The results to be converted\n    * @return A DataFrame containing the results\n    */\n  def toDF(spark: SparkSession, results: Seq[FairnessResult]): DataFrame = {\n    import spark.implicits._\n\n    results.toDS.map { result =>\n      AvroCompatibleResult(\n        resultType = result.resultType,\n        parameters = result.parameters,\n        resultValOpt = result.resultValOpt,\n        constituentVals = result.constituentVals.map { case (dimVals, value) =>\n          (dimVals.toString, value)\n        },\n        additionalStats = result.additionalStats)\n    }.toDF\n  }\n}\n"
  },
  {
    "path": "lift/src/main/scala/com/linkedin/lift/types/ModelPrediction.scala",
    "content": "package com.linkedin.lift.types\n\nimport org.apache.spark.sql.{DataFrame, Dataset, Encoders, Row}\n\n/**\n  * Represents a single data point's ground-truth label,\n  * the model's prediction (either a score, or a predicted class),\n  * and the dimension value it corresponds to. For a method to work with\n  * the permutation test, it needs to take a sequence of these case classes\n  * as its input.\n  *\n  * @param label The ground-truth label of the data point. Values are in {0, 1}\n  * @param prediction The model's prediction. Lies between [0, 1]\n  * @param groupId The optional groupId for ranking\n  * @param rank A value that indicates the rank of the prediction. If groupId is\n  *             empty, this would be the absolute rank. Otherwise, it is\n  *             the per-group rank. Starts from 1.\n  * @param dimensionValue The dimension value the data point belongs to\n  */\ncase class ModelPrediction(\n  label: Double,\n  prediction: Double,\n  groupId: String = \"\",\n  rank: Int = 0,\n  dimensionValue: String)\n\nobject ModelPrediction {\n  /**\n    * Retrieves the group ID from the specified field.\n    *\n    * @param row Input DataFrame's Row\n    * @param groupIdField The group ID field\n    * @return THe group ID value if present, else an empty string\n    */\n  def getGroupId(row: Row, groupIdField: String): String = {\n    val allFields = row.schema.fieldNames\n    val groupIdValOpt =\n      if (allFields.contains(groupIdField)) {\n        Some(row.getAs[CharSequence](groupIdField).toString)\n      } else {\n        None\n      }\n    groupIdValOpt.getOrElse(\"\")\n  }\n\n  /**\n    * Builds a Dataset[ModelPrediction] by extracting labels, predictions,\n    * dimension values and group IDs.\n    *\n    * @param df The DataFrame to process\n    * @param labelField The label field\n    * @param scoreField The score field\n    * @param groupIdField The group ID field\n    * @param dimValField The dimension value field\n    * @return 
The Dataset containing ModelPredictions\n    */\n  def getModelPredictionDS(df: DataFrame, labelField: String,\n    scoreField: String, groupIdField: String,\n    dimValField: String): Dataset[ModelPrediction] = {\n    df.map { row =>\n      val label = row.getAs[Any](labelField).toString.toDouble\n      val prediction = row.getAs[Any](scoreField).toString.toDouble\n      val groupIdVal = getGroupId(row, groupIdField)\n      val dimVal = row.getAs[CharSequence](dimValField).toString\n\n      ModelPrediction(\n        label = label,\n        prediction = prediction,\n        groupId = groupIdVal,\n        dimensionValue = dimVal)\n    } (Encoders.product[ModelPrediction])\n  }\n\n  /**\n    * Generate Model Predictions from a given DataFrame.\n    *\n    * @param df The DataFrame to process\n    * @param labelField Column containing the labels\n    * @param scoreField Column containing the model scores\n    * @param groupIdField Grouping column name (usually meant for ranking metrics)\n    * @param dimValField Column containing the dimension value of interest\n    * @return A sequence of model predictions extracted from the DataFrame\n    */\n  def compute(df: DataFrame, labelField: String, scoreField: String,\n    groupIdField: String, dimValField: String): Seq[ModelPrediction] = {\n    val modelPredictions = getModelPredictionDS(df,\n      labelField, scoreField, groupIdField, dimValField)\n      .collect\n      .toSeq\n\n    // Add ranking info\n    modelPredictions.groupBy(_.groupId)\n      .flatMap { case (_, predictions) =>\n        predictions.sortBy(-_.prediction)\n          .zipWithIndex\n          .map { case (prediction, rank) =>\n            prediction.copy(rank = rank + 1)\n          }\n      }.toList\n  }\n}\n"
  },
  {
    "path": "lift/src/test/scala/com/linkedin/lift/eval/FairnessMetricsUtilsTest.scala",
    "content": "package com.linkedin.lift.eval\n\nimport com.linkedin.lift.lib.testing.{TestUtils, TestValues}\nimport com.linkedin.lift.lib.testing.TestValues.JoinedData\nimport com.linkedin.lift.types.{Distribution, FairnessResult, ModelPrediction}\nimport org.testng.Assert\nimport org.testng.annotations.Test\n\n/**\n  * Tests for FairnessMetricsUtils\n  */\nclass FairnessMetricsUtilsTest {\n\n  val predictions: Seq[ModelPrediction] = Seq(\n    ModelPrediction(label = 1, prediction = 1, dimensionValue = \"MALE\"),\n    ModelPrediction(label = 1, prediction = 0, dimensionValue = \"MALE\"),\n    ModelPrediction(label = 0, prediction = 0, dimensionValue = \"MALE\"),\n    ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\"),\n    ModelPrediction(label = 0, prediction = 0, dimensionValue = \"FEMALE\"),\n    ModelPrediction(label = 1, prediction = 1, dimensionValue = \"FEMALE\"),\n    ModelPrediction(label = 0, prediction = 1, dimensionValue = \"FEMALE\"),\n    ModelPrediction(label = 0, prediction = 1, dimensionValue = \"FEMALE\"),\n    ModelPrediction(label = 1, prediction = 1, dimensionValue = \"UNKNOWN\"),\n    ModelPrediction(label = 0, prediction = 1, dimensionValue = \"UNKNOWN\"),\n    ModelPrediction(label = 0, prediction = 1, dimensionValue = \"UNKNOWN\"),\n    ModelPrediction(label = 0, prediction = 1, dimensionValue = \"UNKNOWN\"))\n\n  @Test(description = \"Project IDs, labels and scores\")\n  def testProjectIdLabelsAndScores(): Unit = {\n    val projectedDF = FairnessMetricsUtils.projectIdLabelsAndScores(TestValues.df,\n      \"memberId\", \"label\", \"predicted\", \"\")\n    val projectedDFStrSeq = projectedDF.collect.toSeq.map(_.toString)\n    val expectedStringSeq1 = Seq(\n      \"[12340,0,0]\", \"[12341,1,0]\", \"[12342,0,1]\", \"[12343,0,0]\", \"[12344,1,1]\",\n      \"[12345,0,1]\", \"[12346,1,1]\", \"[12347,1,0]\", \"[12348,0,0]\", \"[12349,0,1]\")\n    Assert.assertEquals(projectedDFStrSeq, expectedStringSeq1)\n  }\n\n  
@Test(description = \"Compute permutation test metrics\")\n  def testComputePermutationTestMetrics(): Unit = {\n    val actualResults = FairnessMetricsUtils.computePermutationTestMetrics(\n      predictions, \"gender\", Seq(\"PRECISION\", \"RECALL\"), 1000, 2)\n    Assert.assertEquals(actualResults, Seq(\n      FairnessResult(resultType = \"PERMUTATION_TEST\",\n        resultValOpt = Some(0.16667),\n        constituentVals =\n          Map(Map(\"gender\" -> \"MALE\") -> 0.5, Map(\"gender\" -> \"FEMALE\") -> 0.33333),\n        parameters = \"Map(metric -> PRECISION, numTrials -> 1000, seed -> 2)\",\n        additionalStats = Map(\"pValue\" -> 0.498, \"stdError\" -> 0.01581,\n          \"bootstrapStdDev\" -> 0.49003251236869033,\n          \"testStatisticStdDev\" -> 0.5265411200529941)),\n      FairnessResult(resultType = \"PERMUTATION_TEST\",\n        resultValOpt = Some(0.25),\n        constituentVals =\n          Map(Map(\"gender\" -> \"MALE\") -> 0.5, Map(\"gender\" -> \"UNKNOWN\") -> 0.25),\n        parameters = \"Map(metric -> PRECISION, numTrials -> 1000, seed -> 2)\",\n        additionalStats = Map(\"pValue\" -> 0.631, \"stdError\" -> 0.01526,\n          \"bootstrapStdDev\" -> 0.45875449105441796,\n          \"testStatisticStdDev\" -> 0.4308307912931494)),\n      FairnessResult(resultType = \"PERMUTATION_TEST\",\n        resultValOpt = Some(0.08333),\n        constituentVals =\n          Map(Map(\"gender\" -> \"FEMALE\") -> 0.33333, Map(\"gender\" -> \"UNKNOWN\") -> 0.25),\n        parameters = \"Map(metric -> PRECISION, numTrials -> 1000, seed -> 2)\",\n        additionalStats = Map(\"pValue\" -> 1.0, \"stdError\" -> 0.0,\n          \"bootstrapStdDev\" -> 0.38278033036295717,\n          \"testStatisticStdDev\" -> 0.373106162115032)),\n      FairnessResult(resultType = \"PERMUTATION_TEST\",\n        resultValOpt = Some(-0.5),\n        constituentVals =\n          Map(Map(\"gender\" -> \"MALE\") -> 0.5, Map(\"gender\" -> \"FEMALE\") -> 1.0),\n        
parameters = \"Map(metric -> RECALL, numTrials -> 1000, seed -> 2)\",\n        additionalStats = Map(\"pValue\" -> 0.444, \"stdError\" -> 0.01571,\n          \"bootstrapStdDev\" -> 0.6180776162116582,\n          \"testStatisticStdDev\" -> 0.7040802954178795)),\n      FairnessResult(resultType = \"PERMUTATION_TEST\",\n        resultValOpt = Some(-0.5),\n        constituentVals =\n          Map(Map(\"gender\" -> \"MALE\") -> 0.5, Map(\"gender\" -> \"UNKNOWN\") -> 1.0),\n        parameters = \"Map(metric -> RECALL, numTrials -> 1000, seed -> 2)\",\n        additionalStats = Map(\"pValue\" -> 0.417, \"stdError\" -> 0.01559,\n          \"bootstrapStdDev\" -> 0.6473608956252646,\n          \"testStatisticStdDev\" -> 0.6938742203424768)),\n      FairnessResult(resultType = \"PERMUTATION_TEST\",\n        resultValOpt = Some(0.0),\n        constituentVals =\n          Map(Map(\"gender\" -> \"FEMALE\") -> 1.0, Map(\"gender\" -> \"UNKNOWN\") -> 1.0),\n        parameters = \"Map(metric -> RECALL, numTrials -> 1000, seed -> 2)\",\n        additionalStats = Map(\"pValue\" -> 0.426, \"stdError\" -> 0.01564,\n          \"bootstrapStdDev\" -> 0.6958650934112338,\n          \"testStatisticStdDev\" -> 0.653010277424806))))\n  }\n\n  @Test(description = \"Compute reference distributions\")\n  def testComputeReferenceDistributionOpt(): Unit = {\n    val distribution = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"1\") -> 2.0,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"0\") -> 3.0,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1\") -> 2.0,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0\") -> 2.8,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"1\") -> 0.3,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0\") -> 1.0))\n\n    Assert.assertEquals(FairnessMetricsUtils.computeReferenceDistributionOpt(\n      distribution, \"incorrect\"), None)\n    Assert.assertEquals(FairnessMetricsUtils.computeReferenceDistributionOpt(\n      
distribution, \"UNIFORM\"), Some(Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"1\") -> 1.0/6.0,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"0\") -> 1.0/6.0,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1\") -> 1.0/6.0,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0\") -> 1.0/6.0,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"1\") -> 1.0/6.0,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0\") -> 1.0/6.0))))\n  }\n\n  @Test(description = \"Compute dataset metrics - no reference distribution\")\n  def testComputeDatasetMetricsNoRefDistr(): Unit = {\n    val distribution = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"1.0\") -> 0.3,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"0.0\") -> 0.2,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1.0\") -> 0.1,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0.0\") -> 0.2,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"1.0\") -> 0.1,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0.0\") -> 0.1))\n    val args = MeasureDatasetFairnessMetricsCmdLineArgs(\n      labelField = \"label\",\n      protectedAttributeField = \"gender\",\n      distanceMetrics = Seq(\"KL_DIVERGENCE\", \"DEMOGRAPHIC_PARITY\", \"EQUALIZED_ODDS\"),\n      overallMetrics = Map(\"THEIL_L_INDEX\" -> \"\", \"THEIL_T_INDEX\" -> \"\"),\n      benefitMetrics = Seq(\"SKEWS\"))\n\n    // Only distance metrics with no reference distribution are computed\n    val actualMetrics =\n      FairnessMetricsUtils.computeDatasetMetrics(distribution, None, args)\n    Assert.assertEquals(actualMetrics, Seq(FairnessResult(\n      resultType = \"DEMOGRAPHIC_PARITY\",\n      resultValOpt = None,\n      constituentVals = Map(\n        Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"UNKNOWN\") -> 0.16667,\n        Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"MALE\") -> 0.26667,\n        Map(\"gender1\" -> \"UNKNOWN\", \"gender2\" -> \"MALE\") -> 0.1),\n      additionalStats =\n  
      Map(\"FEMALE\" -> 0.33333, \"UNKNOWN\" -> 0.5, \"MALE\" -> 0.6))))\n  }\n\n  @Test(description = \"Compute dataset metrics - with reference distribution\")\n  def testComputeDatasetMetricsWithRefDistr(): Unit = {\n    val distribution = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"1.0\") -> 0.3,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"0.0\") -> 0.2,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1.0\") -> 0.1,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0.0\") -> 0.2,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"1.0\") -> 0.1,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0.0\") -> 0.1))\n    val args = MeasureDatasetFairnessMetricsCmdLineArgs(\n      labelField = \"label\",\n      protectedAttributeField = \"gender\",\n      distanceMetrics = Seq(\"KL_DIVERGENCE\", \"DEMOGRAPHIC_PARITY\", \"EQUALIZED_ODDS\"),\n      overallMetrics = Map(\"THEIL_L_INDEX\" -> \"\", \"THEIL_T_INDEX\" -> \"\"),\n      benefitMetrics = Seq(\"SKEWS\"))\n\n    // Dataset distance metrics\n    val referenceDistr = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"1.0\") -> 0.16666,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"0.0\") -> 0.16666,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1.0\") -> 0.16666,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0.0\") -> 0.16666,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"1.0\") -> 0.16666,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0.0\") -> 0.16666))\n    val actualMetrics =\n      FairnessMetricsUtils.computeDatasetMetrics(distribution, Some(referenceDistr), args)\n    Assert.assertEquals(actualMetrics, Seq(\n      FairnessResult(resultType = \"KL_DIVERGENCE\",\n        parameters = Distribution(Map(\n          Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1.0\") -> 0.16666,\n          Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0.0\") -> 0.16666,\n          Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"1.0\") -> 
0.16666,\n          Map(\"gender\" -> \"MALE\", \"label\" -> \"0.0\") -> 0.16666,\n          Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0.0\") -> 0.16666,\n          Map(\"gender\" -> \"MALE\", \"label\" -> \"1.0\") -> 0.16666)).toString,\n        resultValOpt = Some(0.13852315605014068),\n        constituentVals = Map()),\n      FairnessResult(resultType = \"DEMOGRAPHIC_PARITY\",\n        resultValOpt = None,\n        constituentVals = Map(\n          Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"UNKNOWN\") -> 0.16667,\n          Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"MALE\") -> 0.26667,\n          Map(\"gender1\" -> \"UNKNOWN\", \"gender2\" -> \"MALE\") -> 0.1),\n        additionalStats =\n          Map(\"FEMALE\" -> 0.33333, \"UNKNOWN\" -> 0.5, \"MALE\" -> 0.6)),\n      FairnessResult(\n        resultType = \"Benefit Map for SKEWS\",\n        resultValOpt = None,\n        constituentVals = Map(\n          Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1.0\") -> -0.058840500022933395,\n          Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0.0\") -> -0.058840500022933395,\n          Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"1.0\") -> -0.058840500022933395,\n          Map(\"gender\" -> \"MALE\", \"label\" -> \"0.0\") -> 0.028170876966696262,\n          Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0.0\") -> 0.028170876966696262,\n          Map(\"gender\" -> \"MALE\", \"label\" -> \"1.0\") -> 0.10821358464023273)),\n      FairnessResult(\n        resultType = \"SKEWS: THEIL_L_INDEX\",\n        resultValOpt = Some(0.10948572991373717),\n        constituentVals = Map()),\n      FairnessResult(\n        resultType = \"SKEWS: THEIL_T_INDEX\",\n        resultValOpt = Some(0.10611973507347484),\n        constituentVals = Map())))\n  }\n\n  @Test(description = \"Compute model metrics\")\n  def testComputeModelMetrics(): Unit = {\n    val testData = Seq(\n      JoinedData(memberId = 1, label = \"0.0\", predicted = \"1.0\", gender = \"MALE\"),\n      
JoinedData(memberId = 2, label = \"0.0\", predicted = \"1.0\", gender = \"MALE\"),\n      JoinedData(memberId = 3, label = \"0.0\", predicted = \"1.0\", gender = \"MALE\"),\n      JoinedData(memberId = 4, label = \"0.0\", predicted = \"1.0\", gender = \"MALE\"),\n      JoinedData(memberId = 5, label = \"1.0\", predicted = \"1.0\", gender = \"MALE\"),\n      JoinedData(memberId = 6, label = \"1.0\", predicted = \"1.0\", gender = \"MALE\"),\n      JoinedData(memberId = 7, label = \"0.0\", predicted = \"0.0\", gender = \"MALE\"),\n      JoinedData(memberId = 8, label = \"0.0\", predicted = \"0.0\", gender = \"MALE\"),\n      JoinedData(memberId = 9, label = \"1.0\", predicted = \"0.0\", gender = \"MALE\"),\n      JoinedData(memberId = 10, label = \"1.0\", predicted = \"0.0\", gender = \"MALE\"),\n      JoinedData(memberId = 11, label = \"0.0\", predicted = \"1.0\", gender = \"FEMALE\"),\n      JoinedData(memberId = 12, label = \"0.0\", predicted = \"1.0\", gender = \"FEMALE\"),\n      JoinedData(memberId = 13, label = \"0.0\", predicted = \"0.0\", gender = \"FEMALE\"),\n      JoinedData(memberId = 14, label = \"0.0\", predicted = \"0.0\", gender = \"FEMALE\"),\n      JoinedData(memberId = 15, label = \"1.0\", predicted = \"0.0\", gender = \"FEMALE\"),\n      JoinedData(memberId = 16, label = \"1.0\", predicted = \"0.0\", gender = \"FEMALE\"),\n      JoinedData(memberId = 17, label = \"1.0\", predicted = \"1.0\", gender = \"UNKNOWN\"),\n      JoinedData(memberId = 18, label = \"1.0\", predicted = \"1.0\", gender = \"UNKNOWN\"),\n      JoinedData(memberId = 19, label = \"0.0\", predicted = \"0.0\", gender = \"UNKNOWN\"),\n      JoinedData(memberId = 20, label = \"1.0\", predicted = \"0.0\", gender = \"UNKNOWN\"))\n    val df = TestUtils.createDFFromProduct(TestValues.spark, testData)\n    val args = MeasureModelFairnessMetricsCmdLineArgs(\n      labelField = \"label\",\n      scoreField = \"predicted\",\n      protectedAttributeField = \"gender\",\n      distanceMetrics
= Seq(\"KL_DIVERGENCE\", \"DEMOGRAPHIC_PARITY\", \"EQUALIZED_ODDS\"),\n      overallMetrics = Map(\"THEIL_L_INDEX\" -> \"\", \"THEIL_T_INDEX\" -> \"\"),\n      distanceBenefitMetrics = Seq(\"SKEWS\"))\n\n    // Model distance metrics\n    val referenceDistr = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"predicted\" -> \"1.0\") -> 0.16666,\n      Map(\"gender\" -> \"MALE\", \"predicted\" -> \"0.0\") -> 0.16666,\n      Map(\"gender\" -> \"FEMALE\", \"predicted\" -> \"1.0\") -> 0.16666,\n      Map(\"gender\" -> \"FEMALE\", \"predicted\" -> \"0.0\") -> 0.16666,\n      Map(\"gender\" -> \"UNKNOWN\", \"predicted\" -> \"1.0\") -> 0.16666,\n      Map(\"gender\" -> \"UNKNOWN\", \"predicted\" -> \"0.0\") -> 0.16666))\n    val actualMetrics = FairnessMetricsUtils.computeModelMetrics(df,\n      Some(referenceDistr), args)\n    Assert.assertEquals(actualMetrics, Seq(\n      FairnessResult(resultType = \"KL_DIVERGENCE\",\n        parameters = Distribution(Map(\n          Map(\"gender\" -> \"UNKNOWN\", \"predicted\" -> \"1.0\") -> 0.16666,\n          Map(\"gender\" -> \"MALE\", \"predicted\" -> \"1.0\") -> 0.16666,\n          Map(\"gender\" -> \"MALE\", \"predicted\" -> \"0.0\") -> 0.16666,\n          Map(\"gender\" -> \"FEMALE\", \"predicted\" -> \"0.0\") -> 0.16666,\n          Map(\"gender\" -> \"FEMALE\", \"predicted\" -> \"1.0\") -> 0.16666,\n          Map(\"gender\" -> \"UNKNOWN\", \"predicted\" -> \"0.0\") -> 0.16666)).toString,\n        resultValOpt = Some(0.13852315605014037),\n        constituentVals = Map()),\n      FairnessResult(resultType = \"DEMOGRAPHIC_PARITY\",\n        resultValOpt = None,\n        constituentVals = Map(\n          Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"UNKNOWN\") -> 0.16667,\n          Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"MALE\") -> 0.26667,\n          Map(\"gender1\" -> \"UNKNOWN\", \"gender2\" -> \"MALE\") -> 0.1),\n        additionalStats =\n          Map(\"FEMALE\" -> 0.33333, \"UNKNOWN\" -> 0.5, \"MALE\" -> 
0.6)),\n      FairnessResult(resultType = \"EQUALIZED_ODDS\",\n        resultValOpt = None,\n        constituentVals = Map(\n          Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"UNKNOWN\", \"label\" -> \"1.0\") -> 0.66667,\n          Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"MALE\", \"label\" -> \"1.0\") -> 0.5,\n          Map(\"gender1\" -> \"UNKNOWN\", \"gender2\" -> \"MALE\", \"label\" -> \"1.0\") -> 0.16667,\n          Map(\"gender1\" -> \"UNKNOWN\", \"gender2\" -> \"FEMALE\", \"label\" -> \"0.0\") -> 0.5,\n          Map(\"gender1\" -> \"MALE\", \"gender2\" -> \"FEMALE\", \"label\" -> \"0.0\") -> 0.16667,\n          Map(\"gender1\" -> \"UNKNOWN\", \"gender2\" -> \"MALE\", \"label\" -> \"0.0\") -> 0.66667),\n        additionalStats = Map(\"1.0,UNKNOWN\" -> 0.66667, \"0.0,UNKNOWN\" -> 0.0,\n          \"1.0,MALE\" -> 0.5, \"0.0,FEMALE\" -> 0.5,\n          \"0.0,MALE\" -> 0.66667, \"1.0,FEMALE\" -> 0.0)),\n      FairnessResult(\n        resultType = \"Benefit Map for SKEWS\",\n        resultValOpt = None,\n        constituentVals = Map(\n          Map(\"predicted\" -> \"1.0\", \"gender\" -> \"UNKNOWN\") -> -0.3677247801253174,\n          Map(\"predicted\" -> \"1.0\", \"gender\" -> \"MALE\") -> 0.47957308026188605,\n          Map(\"predicted\" -> \"0.0\", \"gender\" -> \"MALE\") -> 0.1431008436406731,\n          Map(\"predicted\" -> \"0.0\", \"gender\" -> \"FEMALE\") -> 0.1431008436406731,\n          Map(\"predicted\" -> \"1.0\", \"gender\" -> \"FEMALE\") -> -0.3677247801253174,\n          Map(\"predicted\" -> \"0.0\", \"gender\" -> \"UNKNOWN\") -> -0.3677247801253174)),\n      FairnessResult(\n        resultType = \"SKEWS: THEIL_L_INDEX\",\n        resultValOpt = Some(0.10437214313750444),\n        constituentVals = Map()),\n      FairnessResult(\n        resultType = \"SKEWS: THEIL_T_INDEX\",\n        resultValOpt = Some(0.08957922977452398),\n        constituentVals = Map())))\n  }\n\n  @Test(description = \"Compute probability DataFrame\")\n  def 
testComputeProbabilityDF(): Unit = {\n    // DF with probabilities\n    val actualDF1 = FairnessMetricsUtils.computeProbabilityDF(TestValues.df2, None,\n      \"label\", \"predicted\", \"gender\", \"PROB\")\n    Assert.assertEquals(actualDF1.collect.toSeq.map(_.mkString(\",\")),\n      Seq(\"0.0,0.3,MALE\", \"1.0,0.4,MALE\", \"0.0,0.8,MALE\", \"0.0,0.1,MALE\",\n        \"1.0,0.7,MALE\", \"0.0,0.6,UNKNOWN\", \"1.0,0.9,FEMALE\", \"1.0,0.3,FEMALE\",\n        \"0.0,0.2,FEMALE\", \"0.0,0.8,FEMALE\"))\n\n    // DF with threshold\n    val actualDF2 = FairnessMetricsUtils.computeProbabilityDF(TestValues.df2, Some(0.6),\n      \"label\", \"predicted\", \"gender\", \"RAW\")\n    Assert.assertEquals(actualDF2.collect.toSeq.map(_.mkString(\",\")),\n      Seq(\"0.0,0,MALE\", \"1.0,0,MALE\", \"0.0,1,MALE\", \"0.0,0,MALE\", \"1.0,1,MALE\",\n        \"0.0,1,UNKNOWN\", \"1.0,1,FEMALE\", \"1.0,0,FEMALE\", \"0.0,0,FEMALE\", \"0.0,1,FEMALE\"))\n\n    // DF with raw scores\n    val actualDF3 = FairnessMetricsUtils.computeProbabilityDF(TestValues.df2, None,\n      \"label\", \"predicted\", \"gender\", \"RAW\")\n    Assert.assertEquals(actualDF3.collect.toSeq.map(_.mkString(\",\")),\n      Seq(\n        \"0.0,0.574442516811659,MALE\", \"1.0,0.598687660112452,MALE\",\n        \"0.0,0.6899744811276125,MALE\", \"0.0,0.52497918747894,MALE\",\n        \"1.0,0.6681877721681662,MALE\", \"0.0,0.6456563062257954,UNKNOWN\",\n        \"1.0,0.7109495026250039,FEMALE\", \"1.0,0.574442516811659,FEMALE\",\n        \"0.0,0.549833997312478,FEMALE\", \"0.0,0.6899744811276125,FEMALE\"))\n  }\n}\n"
  },
  {
    "path": "lift/src/test/scala/com/linkedin/lift/lib/DivergenceUtilsTest.scala",
    "content": "package com.linkedin.lift.lib\n\nimport com.linkedin.lift.lib.testing.TestValues\nimport com.linkedin.lift.types.{Distribution, FairnessResult}\nimport org.testng.Assert\nimport org.testng.annotations.Test\n\n/**\n  * Tests for DivergenceUtils\n  */\nclass DivergenceUtilsTest {\n\n  val EPS = 1e-12\n\n  @Test(description = \"KL divergence - no overlap\")\n  def testKLDivergenceNoOverlap(): Unit = {\n    val testDist1 = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"20\") -> 24.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> 4.0))\n\n    val testDist2 = Distribution(Map(\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\") -> 20.0,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> 10.0))\n\n    val actualKLDivergence12 =\n      DivergenceUtils.computeKullbackLeiblerDivergence(testDist1, testDist2)\n    val expectedKLDivergence12 = (1.0 / math.log(2.0)) *\n      ((24.0 / 28.0 * math.log((24.0 / 28.0) / (1.0 / 34.0))) +\n        (4.0 / 28.0 * math.log((4.0 / 28.0) / (1.0 / 34.0))))\n    Assert.assertTrue(math.abs(actualKLDivergence12 - expectedKLDivergence12) < EPS)\n\n    val actualKLDivergence21 =\n      DivergenceUtils.computeKullbackLeiblerDivergence(testDist2, testDist1)\n    val expectedKLDivergence21 = (1.0 / math.log(2.0)) *\n      ((20.0 / 30.0 * math.log((20.0 / 30.0) / (1.0 / 32.0))) +\n        (10.0 / 30.0 * math.log((10.0 / 30.0) / (1.0 / 32.0))))\n    Assert.assertTrue(math.abs(actualKLDivergence21 - expectedKLDivergence21) < EPS)\n\n    // Ensure that the difference is asymmetric (in this case)\n    Assert.assertFalse(math.abs(actualKLDivergence12 - actualKLDivergence21) < EPS)\n  }\n\n  @Test(description = \"KL divergence - with overlap\")\n  def testKLDivergenceWithOverlap(): Unit = {\n    val testDist1 = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"20\") -> 24.0,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> 12.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> 
\"40\") -> 4.0))\n\n    val testDist2 = Distribution(Map(\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\") -> 20.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> 5.0,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> 10.0))\n\n    val actualKLDivergence12 =\n      DivergenceUtils.computeKullbackLeiblerDivergence(testDist1, testDist2)\n    val expectedKLDivergence12 = (1.0 / math.log(2.0)) *\n      ((24.0 / 40.0 * math.log((24.0 / 40.0) / (1.0 / 39.0))) +\n        (12.0 / 40.0 * math.log((12.0 / 40.0) / (11.0 / 39.0))) +\n        (4.0 / 40.0 * math.log((4.0 / 40.0) / (6.0 / 39.0))))\n    Assert.assertTrue(math.abs(actualKLDivergence12 - expectedKLDivergence12) < EPS)\n\n    val actualKLDivergence21 =\n      DivergenceUtils.computeKullbackLeiblerDivergence(testDist2, testDist1)\n    val expectedKLDivergence21 = (1.0 / math.log(2.0)) *\n      ((20.0 / 35.0 * math.log((20.0 / 35.0) / (1.0 / 44.0))) +\n        (5.0 / 35.0 * math.log((5.0 / 35.0) / (5.0 / 44.0))) +\n        (10.0 / 35.0 * math.log((10.0 / 35.0) / (13.0 / 44.0))))\n    Assert.assertTrue(math.abs(actualKLDivergence21 - expectedKLDivergence21) < EPS)\n\n    // Ensure that the difference is asymmetric (in this case)\n    Assert.assertFalse(math.abs(actualKLDivergence12 - actualKLDivergence21) < EPS)\n  }\n\n  @Test(description = \"JS divergence - no overlap\")\n  def testJSDivergenceNoOverlap(): Unit = {\n    val testDist1 = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"20\") -> 24.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> 4.0))\n\n    val testDist2 = Distribution(Map(\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\") -> 20.0,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> 10.0))\n\n    val expectedJSDivergence = (0.5 / math.log(2.0)) *\n      ((24.0 / 28.0 * math.log((24.0 / 28.0) / (12.0 / 28.0))) +\n        (4.0 / 28.0 * math.log((4.0 / 28.0) / (2.0 / 28.0))) +\n        (20.0 / 30.0 * math.log((20.0 / 30.0) / (10.0 / 
30.0))) +\n        (10.0 / 30.0 * math.log((10.0 / 30.0) / (5.0 / 30.0))))\n\n    val actualJSDivergence12 =\n      DivergenceUtils.computeJensenShannonDivergence(testDist1, testDist2)\n    Assert.assertTrue(math.abs(actualJSDivergence12 - expectedJSDivergence) < EPS)\n\n    val actualJSDivergence21 =\n      DivergenceUtils.computeJensenShannonDivergence(testDist2, testDist1)\n    Assert.assertTrue(math.abs(actualJSDivergence21 - expectedJSDivergence) < EPS)\n  }\n\n  @Test(description = \"JS divergence - with overlap\")\n  def testJSDivergenceWithOverlap(): Unit = {\n    val testDist1 = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"20\") -> 24.0,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> 12.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> 4.0))\n\n    val testDist2 = Distribution(Map(\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\") -> 20.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> 5.0,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> 10.0))\n\n    val expectedJSDivergence = (0.5 / math.log(2.0)) *\n      ((24.0 / 40.0 * math.log((24.0 / 40.0) / (12.0 / 40.0))) +\n        (12.0 / 40.0 * math.log((12.0 / 40.0) / (820.0 / 2800.0))) +\n        (4.0 / 40.0 * math.log((4.0 / 40.0) / (340.0 / 2800.0))) +\n        (20.0 / 35.0 * math.log((20.0 / 35.0) / (10.0 / 35.0))) +\n        (5.0 / 35.0 * math.log((5.0 / 35.0) / (340.0 / 2800.0))) +\n        (10.0 / 35.0 * math.log((10.0 / 35.0) / (820.0 / 2800.0))))\n\n    val actualJSDivergence12 =\n      DivergenceUtils.computeJensenShannonDivergence(testDist1, testDist2)\n    Assert.assertTrue(math.abs(actualJSDivergence12 - expectedJSDivergence) < EPS)\n\n    val actualJSDivergence21 =\n      DivergenceUtils.computeJensenShannonDivergence(testDist2, testDist1)\n    Assert.assertTrue(math.abs(actualJSDivergence21 - expectedJSDivergence) < EPS)\n  }\n\n  @Test(description = \"Total variation and infinity norm distances - no overlap\")\n  def 
testTotalVariationAndInfinityNormDistancesNoOverlap(): Unit = {\n    val testDist1 = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"20\") -> 24.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> 4.0))\n\n    val testDist2 = Distribution(Map(\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\") -> 20.0,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> 10.0))\n\n    val expectedTotalVariationDistance = 1.0\n    val expectedInfinityNormDistance = 24.0 / 28.0\n\n    val actualTotalVariationDistance12 =\n      DivergenceUtils.computeTotalVariationDistance(testDist1, testDist2)\n    Assert.assertTrue(math.abs(actualTotalVariationDistance12 -\n      expectedTotalVariationDistance) < EPS)\n\n    val actualTotalVariationDistance21 =\n      DivergenceUtils.computeTotalVariationDistance(testDist2, testDist1)\n    Assert.assertTrue(math.abs(actualTotalVariationDistance21 -\n      expectedTotalVariationDistance) < EPS)\n\n    val actualInfinityNormDistance12 =\n      DivergenceUtils.computeInfinityNormDistance(testDist1, testDist2)\n    Assert.assertTrue(math.abs(actualInfinityNormDistance12 -\n      expectedInfinityNormDistance) < EPS)\n\n    val actualInfinityNormDistance21 =\n      DivergenceUtils.computeInfinityNormDistance(testDist2, testDist1)\n    Assert.assertTrue(math.abs(actualInfinityNormDistance21 -\n      expectedInfinityNormDistance) < EPS)\n  }\n\n  @Test(description = \"Total variation and infinity norm distances - with overlap\")\n  def testTotalVariationAndInfinityNormDistancesWithOverlap(): Unit = {\n    val testDist1 = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"20\") -> 24.0,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> 12.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> 4.0))\n\n    val testDist2 = Distribution(Map(\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\") -> 25.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> 5.0,\n      Map(\"gender\" -> 
\"MALE\", \"age\" -> \"40\") -> 10.0))\n\n    val expectedTotalVariationDistance = 0.5 * (24.0 + 2.0 + 25.0 + 1.0) / 40.0\n    val expectedInfinityNormDistance = 25.0 / 40.0\n\n    val actualTotalVariationDistance12 =\n      DivergenceUtils.computeTotalVariationDistance(testDist1, testDist2)\n    Assert.assertTrue(math.abs(actualTotalVariationDistance12 -\n      expectedTotalVariationDistance) < EPS)\n\n    val actualTotalVariationDistance21 =\n      DivergenceUtils.computeTotalVariationDistance(testDist2, testDist1)\n    Assert.assertTrue(math.abs(actualTotalVariationDistance21 -\n      expectedTotalVariationDistance) < EPS)\n\n    val actualInfinityNormDistance12 =\n      DivergenceUtils.computeInfinityNormDistance(testDist1, testDist2)\n    Assert.assertTrue(math.abs(actualInfinityNormDistance12 -\n      expectedInfinityNormDistance) < EPS)\n\n    val actualInfinityNormDistance21 =\n      DivergenceUtils.computeInfinityNormDistance(testDist2, testDist1)\n    Assert.assertTrue(math.abs(actualInfinityNormDistance21 -\n      expectedInfinityNormDistance) < EPS)\n  }\n\n  @Test(description = \"Skew measures - no overlap\")\n  def testSkewMeasuresNoOverlap(): Unit = {\n    val testDist1 = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"20\") -> 24.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> 4.0))\n\n    val testDist2 = Distribution(Map(\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\") -> 20.0,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> 10.0))\n\n    val expectedSkew12GenderFemaleAge20 = math.log(1/21.0 * 34.0/32.0)\n    val expectedSkew12GenderMaleAge20  = math.log(25.0 * 34.0/32.0)\n    val expectedMinSkew12 = (Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\"),\n      expectedSkew12GenderFemaleAge20)\n    val expectedMaxSkew12 = (Map(\"gender\" -> \"MALE\", \"age\" -> \"20\"),\n      expectedSkew12GenderMaleAge20)\n    val expectedAllSkews12 = Map(\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"20\") -> 
expectedSkew12GenderMaleAge20,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> math.log(5.0 * 34.0/32.0),\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\") -> expectedSkew12GenderFemaleAge20,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> math.log(1/11.0 * 34.0/32.0))\n\n    val actualSkew12GenderFemaleAge20 =\n      DivergenceUtils.computeSkew(testDist1, testDist2,\n        Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\"))\n    Assert.assertTrue(math.abs(actualSkew12GenderFemaleAge20 -\n      expectedSkew12GenderFemaleAge20) < EPS)\n\n    val actualSkew12GenderMaleAge20 =\n      DivergenceUtils.computeSkew(testDist1, testDist2,\n        Map(\"gender\" -> \"MALE\", \"age\" -> \"20\"))\n    Assert.assertTrue(math.abs(actualSkew12GenderMaleAge20 -\n      expectedSkew12GenderMaleAge20) < EPS)\n\n    val actualMinSkew12 = DivergenceUtils.computeMinSkew(testDist1, testDist2)\n    Assert.assertEquals(actualMinSkew12._1, expectedMinSkew12._1)\n    Assert.assertTrue(math.abs(actualMinSkew12._2 -\n      expectedMinSkew12._2) < EPS)\n\n    val actualMaxSkew12 = DivergenceUtils.computeMaxSkew(testDist1, testDist2)\n    Assert.assertEquals(actualMaxSkew12._1, expectedMaxSkew12._1)\n    Assert.assertTrue(math.abs(actualMaxSkew12._2 -\n      expectedMaxSkew12._2) < EPS)\n\n    val actualAllSkews12 = DivergenceUtils.computeAllSkews(testDist1, testDist2)\n    actualAllSkews12.foreach { case (dimensions, skew) =>\n        Assert.assertTrue(math.abs(skew -\n          expectedAllSkews12.getOrElse(dimensions, 0.0)) < EPS)\n    }\n  }\n\n  @Test(description = \"Skew measures - with overlap\")\n  def testSkewMeasuresWithOverlap(): Unit = {\n    val testDist1 = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"20\") -> 24.0,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> 12.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> 4.0))\n\n    val testDist2 = Distribution(Map(\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\") 
-> 25.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> 5.0,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> 10.0))\n\n    val expectedSkew12GenderFemaleAge20 = math.log(1/26.0)\n    val expectedSkew12GenderMaleAge20  = math.log(25.0)\n    val expectedMinSkew12 = (Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\"),\n      expectedSkew12GenderFemaleAge20)\n    val expectedMaxSkew12 = (Map(\"gender\" -> \"MALE\", \"age\" -> \"20\"),\n      expectedSkew12GenderMaleAge20)\n    val expectedAllSkews12 = Map(\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"20\") -> expectedSkew12GenderMaleAge20,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> math.log(5.0/6.0),\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\") -> expectedSkew12GenderFemaleAge20,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> math.log(13.0/11.0))\n\n    val actualSkew12GenderFemaleAge20 =\n      DivergenceUtils.computeSkew(testDist1, testDist2,\n        Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\"))\n    Assert.assertTrue(math.abs(actualSkew12GenderFemaleAge20 -\n      expectedSkew12GenderFemaleAge20) < EPS)\n\n    val actualSkew12GenderMaleAge20 =\n      DivergenceUtils.computeSkew(testDist1, testDist2,\n        Map(\"gender\" -> \"MALE\", \"age\" -> \"20\"))\n    Assert.assertTrue(math.abs(actualSkew12GenderMaleAge20 -\n      expectedSkew12GenderMaleAge20) < EPS)\n\n    val actualMinSkew12 =\n      DivergenceUtils.computeMinSkew(testDist1, testDist2)\n    Assert.assertEquals(actualMinSkew12._1, expectedMinSkew12._1)\n    Assert.assertTrue(math.abs(actualMinSkew12._2 -\n      expectedMinSkew12._2) < EPS)\n\n    val actualMaxSkew12 =\n      DivergenceUtils.computeMaxSkew(testDist1, testDist2)\n    Assert.assertEquals(actualMaxSkew12._1, expectedMaxSkew12._1)\n    Assert.assertTrue(math.abs(actualMaxSkew12._2 -\n      expectedMaxSkew12._2) < EPS)\n\n    val actualAllSkews12 = DivergenceUtils.computeAllSkews(testDist1, testDist2)\n    actualAllSkews12.foreach 
{ case (dimensions, skew) =>\n      Assert.assertTrue(math.abs(skew -\n        expectedAllSkews12.getOrElse(dimensions, 0.0)) < EPS)\n    }\n  }\n\n  @Test(description = \"Generalized counts distribution\")\n  def testComputeGeneralizedPredictionCountDistribution(): Unit = {\n    val expectedDistr = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"1.0\", \"predicted\" -> \"1.0\") -> 1.1,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"1.0\", \"predicted\" -> \"0.0\") -> 0.9,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"0.0\", \"predicted\" -> \"1.0\") -> 1.2,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"0.0\", \"predicted\" -> \"0.0\") -> 1.8,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1.0\", \"predicted\" -> \"1.0\") -> 1.2,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1.0\", \"predicted\" -> \"0.0\") -> 0.8,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0.0\", \"predicted\" -> \"1.0\") -> 1.0,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0.0\", \"predicted\" -> \"0.0\") -> 1.0,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0.0\", \"predicted\" -> \"1.0\") -> 0.6,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0.0\", \"predicted\" -> \"0.0\") -> 0.4))\n\n    val actualDistr =\n      DivergenceUtils.computeGeneralizedPredictionCountDistribution(\n        TestValues.df2, \"label\", \"predicted\", \"gender\")\n    Assert.assertEquals(actualDistr.entries.size, expectedDistr.entries.size)\n    expectedDistr.entries.foreach { case (dimVals, expectedCounts) =>\n      Assert.assertTrue(math.abs(actualDistr.getValue(dimVals) - expectedCounts) < EPS)\n    }\n  }\n\n  @Test(description = \"Demographic Parity\")\n  def testComputeDemographicParity(): Unit = {\n    val distribution = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"1.0\") -> 345,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"0.0\") -> 123,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1.0\") -> 567,\n      
Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0.0\") -> 89,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"1.0\") -> 25,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0.0\") -> 70))\n    val actualResults =\n      DivergenceUtils.computeDemographicParity(distribution, \"label\", \"gender\")\n    val expectedResults = FairnessResult(\n      resultType = \"DEMOGRAPHIC_PARITY\",\n      resultValOpt = None,\n      constituentVals = Map(\n        Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"UNKNOWN\") -> 0.60117,\n        Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"MALE\") -> 0.12715,\n        Map(\"gender1\" -> \"UNKNOWN\", \"gender2\" -> \"MALE\") -> 0.47402),\n      additionalStats = Map(\"MALE\" -> 0.73718, \"FEMALE\" -> 0.86433, \"UNKNOWN\" -> 0.26316))\n    Assert.assertEquals(actualResults, expectedResults)\n\n    // Test with 0/1 labels\n    val distributionInt = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"1\") -> 345,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"0\") -> 123,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1\") -> 567,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0\") -> 89,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"1\") -> 25,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0\") -> 70))\n    val actualResultsInt =\n      DivergenceUtils.computeDemographicParity(distributionInt, \"label\", \"gender\")\n    Assert.assertEquals(actualResultsInt, expectedResults)\n  }\n\n  @Test(description = \"Equalized Odds\")\n  def testComputeEqualizedOdds(): Unit = {\n    val distribution = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"1.0\", \"predicted\" -> \"0.0\") -> 345,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"1.0\", \"predicted\" -> \"1.0\") -> 145,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"0.0\", \"predicted\" -> \"0.0\") -> 123,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"0.0\", \"predicted\" -> \"1.0\") -> 23,\n      Map(\"gender\" 
-> \"FEMALE\", \"label\" -> \"1.0\", \"predicted\" -> \"0.0\") -> 567,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1.0\", \"predicted\" -> \"1.0\") -> 367,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0.0\", \"predicted\" -> \"0.0\") -> 89,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0.0\", \"predicted\" -> \"1.0\") -> 49,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"1.0\", \"predicted\" -> \"0.0\") -> 25,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"1.0\", \"predicted\" -> \"1.0\") -> 35,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0.0\", \"predicted\" -> \"0.0\") -> 70,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0.0\", \"predicted\" -> \"1.0\") -> 20))\n\n    val actualResults =\n      DivergenceUtils.computeEqualizedOdds(distribution, \"label\", \"predicted\", \"gender\")\n    val expectedResults = FairnessResult(\n      resultType = \"EQUALIZED_ODDS\",\n      resultValOpt = None,\n      constituentVals = Map(\n        Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"UNKNOWN\", \"label\" -> \"1.0\") -> 0.1904,\n        Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"MALE\", \"label\" -> \"1.0\") -> 0.09701,\n        Map(\"gender1\" -> \"UNKNOWN\", \"gender2\" -> \"MALE\", \"label\" -> \"1.0\") -> 0.28741,\n        Map(\"gender1\" -> \"UNKNOWN\", \"gender2\" -> \"MALE\", \"label\" -> \"0.0\") -> 0.06469,\n        Map(\"gender1\" -> \"UNKNOWN\", \"gender2\" -> \"FEMALE\", \"label\" -> \"0.0\") -> 0.13285,\n        Map(\"gender1\" -> \"MALE\", \"gender2\" -> \"FEMALE\", \"label\" -> \"0.0\") -> 0.19754),\n      additionalStats = Map(\n        \"1.0,MALE\" -> 0.29592, \"1.0,FEMALE\" -> 0.39293, \"1.0,UNKNOWN\" -> 0.58333,\n        \"0.0,MALE\" -> 0.15753, \"0.0,FEMALE\" -> 0.35507, \"0.0,UNKNOWN\" -> 0.22222))\n    Assert.assertEquals(actualResults, expectedResults)\n\n    val distributionInt = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"1\", \"predicted\" -> \"0\") -> 345,\n      
Map(\"gender\" -> \"MALE\", \"label\" -> \"1\", \"predicted\" -> \"1\") -> 145,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"0\", \"predicted\" -> \"0\") -> 123,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"0\", \"predicted\" -> \"1\") -> 23,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1\", \"predicted\" -> \"0\") -> 567,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1\", \"predicted\" -> \"1\") -> 367,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0\", \"predicted\" -> \"0\") -> 89,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0\", \"predicted\" -> \"1\") -> 49,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"1\", \"predicted\" -> \"0\") -> 25,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"1\", \"predicted\" -> \"1\") -> 35,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0\", \"predicted\" -> \"0\") -> 70,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0\", \"predicted\" -> \"1\") -> 20))\n    val actualResultsInt =\n      DivergenceUtils.computeEqualizedOdds(distributionInt, \"label\", \"predicted\", \"gender\")\n    val expectedResultsInt = FairnessResult(\n      resultType = \"EQUALIZED_ODDS\",\n      resultValOpt = None,\n      constituentVals = Map(\n        Map(\"gender1\" -> \"UNKNOWN\", \"gender2\" -> \"FEMALE\", \"label\" -> \"1\") -> 0.1904,\n        Map(\"gender1\" -> \"MALE\", \"gender2\" -> \"FEMALE\", \"label\" -> \"1\") -> 0.09701,\n        Map(\"gender1\" -> \"MALE\", \"gender2\" -> \"UNKNOWN\", \"label\" -> \"1\") -> 0.28741,\n        Map(\"gender1\" -> \"UNKNOWN\", \"gender2\" -> \"MALE\", \"label\" -> \"0\") -> 0.06469,\n        Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"UNKNOWN\", \"label\" -> \"0\") -> 0.13285,\n        Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"MALE\", \"label\" -> \"0\") -> 0.19754),\n      additionalStats = Map(\n        \"1,MALE\" -> 0.29592, \"1,FEMALE\" -> 0.39293, \"1,UNKNOWN\" -> 0.58333,\n        \"0,MALE\" -> 0.15753, \"0,FEMALE\" -> 0.35507, 
\"0,UNKNOWN\" -> 0.22222))\n    Assert.assertEquals(actualResultsInt, expectedResultsInt)\n  }\n}\n"
  },
  {
    "path": "lift/src/test/scala/com/linkedin/lift/lib/PermutationTestUtilsTest.scala",
    "content": "package com.linkedin.lift.lib\n\nimport com.linkedin.lift.types.{FairnessResult, ModelPrediction}\nimport org.testng.Assert\nimport org.testng.annotations.Test\n\n/**\n  * Tests for PermutationTestUtils\n  */\nclass PermutationTestUtilsTest {\n  @Test(description = \"Permutation test with precision. Expected results obtained using R code.\")\n  def testPermutationTestPrecision(): Unit = {\n    val predictions1 = Seq(\n      ModelPrediction(label = 1, prediction = 1, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\"))\n\n    val actualResult1 = PermutationTestUtils.permutationTest(predictions1, \"gender\",\n      \"MALE\", \"FEMALE\", \"PRECISION\", 2000, 
1)\n    val expectedResult1 = FairnessResult(\n      resultType = \"PERMUTATION_TEST\",\n      parameters = \"Map(metric -> PRECISION, numTrials -> 2000, seed -> 1)\",\n      resultValOpt = Some(0.125),\n      constituentVals =\n        Map(Map(\"gender\" -> \"MALE\") -> 0.125, Map(\"gender\" -> \"FEMALE\") -> 0.0),\n      additionalStats = Map(\"pValue\" -> 0.438, \"stdError\" -> 0.01109,\n        \"bootstrapStdDev\" -> 0.12188672941783991,\n        \"testStatisticStdDev\" -> 0.1574454263804954))\n\n    Assert.assertEquals(actualResult1, expectedResult1)\n\n    val predictions2 = Seq(\n      ModelPrediction(label = 1, prediction = 1, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 1, prediction = 1, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\"))\n\n    val actualResult2 = PermutationTestUtils.permutationTest(predictions2, \"gender\",\n      \"FEMALE\", \"MALE\", \"PRECISION\", 2000, 1)\n    val expectedResult2 = FairnessResult(\n      resultType = \"PERMUTATION_TEST\",\n      parameters = \"Map(metric -> PRECISION, numTrials -> 2000, seed -> 1)\",\n      resultValOpt = Some(0.25),\n      constituentVals =\n        Map(Map(\"gender\" -> \"MALE\") -> 0.25, Map(\"gender\" -> \"FEMALE\") -> 0.5),\n      additionalStats = Map(\"pValue\" -> 0.753, \"stdError\" -> 0.00964,\n        \"bootstrapStdDev\" -> 0.4590352058557182,\n        \"testStatisticStdDev\" -> 
0.44861306534335205))\n\n    Assert.assertEquals(actualResult2, expectedResult2)\n\n    val predictions3 = Seq(\n      ModelPrediction(label = 1, prediction = 1, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 1, prediction = 1, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\"))\n\n    val actualResult3 = PermutationTestUtils.permutationTest(predictions3, \"gender\",\n      \"MALE\", \"FEMALE\", \"PRECISION\", 2000, 1)\n    val expectedResult3 = FairnessResult(\n      resultType = \"PERMUTATION_TEST\",\n      parameters = \"Map(metric -> PRECISION, numTrials -> 2000, seed -> 1)\",\n      resultValOpt = Some(0.0),\n      constituentVals =\n        Map(Map(\"gender\" -> \"MALE\") -> 0.25, Map(\"gender\" -> \"FEMALE\") -> 0.25),\n      additionalStats = Map(\"pValue\" -> 0.788, \"stdError\" -> 0.00914,\n        \"bootstrapStdDev\" -> 0.32798838458036056,\n        \"testStatisticStdDev\" -> 0.3334284273228113))\n    Assert.assertEquals(actualResult3, expectedResult3)\n  }\n\n  @Test(description = \"Permutation test for ranking\")\n  def testPermutationTestRanking(): Unit = {\n    val predictions = Seq(\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\", groupId = \"1\", rank = 1),\n      ModelPrediction(label = 1, prediction = 1, dimensionValue = \"MALE\", groupId = \"1\", rank = 2),\n      ModelPrediction(label = 0, prediction = 1, 
dimensionValue = \"FEMALE\", groupId = \"1\", rank = 3),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"MALE\", groupId = \"2\", rank = 1),\n      ModelPrediction(label = 1, prediction = 1, dimensionValue = \"FEMALE\", groupId = \"2\", rank = 2),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\", groupId = \"2\", rank = 4),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"MALE\", groupId = \"2\", rank = 7),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"FEMALE\", groupId = \"3\", rank = 1),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"FEMALE\", groupId = \"3\", rank = 2),\n      ModelPrediction(label = 1, prediction = 1, dimensionValue = \"MALE\", groupId = \"3\", rank = 3))\n\n    val actualResult3 = PermutationTestUtils.permutationTest(predictions, \"gender\",\n      \"MALE\", \"FEMALE\", \"PRECISION/1@5\", 2000, 1)\n    val expectedResult3 = FairnessResult(\n      resultType = \"PERMUTATION_TEST\",\n      parameters = \"Map(metric -> PRECISION/1@5, numTrials -> 2000, seed -> 1)\",\n      resultValOpt = Some(0.33333),\n      constituentVals =\n        Map(Map(\"gender\" -> \"MALE\") -> 0.66667, Map(\"gender\" -> \"FEMALE\") -> 0.33333),\n      additionalStats = Map(\"pValue\" -> 0.317, \"stdError\" -> 0.0104,\n        \"bootstrapStdDev\" -> 0.3387960612857448,\n        \"testStatisticStdDev\" -> 0.40255115701974276))\n    Assert.assertEquals(actualResult3, expectedResult3)\n  }\n}\n"
  },
  {
    "path": "lift/src/test/scala/com/linkedin/lift/lib/PositionBiasUtilsTest.scala",
    "content": "package com.linkedin.lift.lib\n\nimport com.linkedin.lift.lib.PositionBiasUtils._\nimport com.linkedin.lift.lib.testing.TestUtils\nimport com.linkedin.lift.lib.testing.TestValues.positionBiasData\nimport org.apache.spark.mllib.random.RandomRDDs.normalRDD\nimport org.apache.spark.sql.SparkSession\nimport org.testng.Assert\nimport org.testng.annotations.Test\n\n/**\n  * Tests for PositionBiasUtils\n  */\n\nclass PositionBiasUtilsTest {\n\n  final val spark: SparkSession = TestUtils.createSparkSession()\n\n  @Test(description = \"Bandwidth computation based on Silverman's rule\")\n  def getBandwidthTest(): Unit = {\n    import spark.implicits._\n    val df = normalRDD(spark.sparkContext, 10000L, 1, seed = 123).toDF(\"value\")\n    val bw = getBandwidth(df)\n    Assert.assertEquals(bw, 1.06 * Math.pow(10000, -0.2), 0.01)\n  }\n\n  @Test(description = \"Estimating the position bias at targetPosition with respect to basePosition\")\n  def estimateAdjacentPositionBiasTest(): Unit = {\n    val estimate = estimateAdjacentPositionBias(positionBiasData, 1e3,\n      2, 1)\n    Assert.assertEquals(estimate, 0.80, 0.01)\n  }\n\n  @Test(description = \"Position bias Estimation with respect to the top most position\")\n  def estimatePositionBiasTest(): Unit = {\n    val estimate = estimatePositionBias(positionBiasData, 1e3,\n      3)\n    Assert.assertEquals(estimate(1).positionBias, 0.80, 0.01)\n    Assert.assertEquals(estimate(2).positionBias, 0.60, 0.01)\n  }\n\n  @Test(description = \"Resampling data with weights corresponds to the inverse position bias\")\n  def debiasPositiveLabelScores(): Unit = {\n    import spark.implicits._\n    val debiasedPositiveLabelData = PositionBiasUtils.debiasPositiveLabelScores(positionBiasData,\n      1e3, 3, 5, 10, 1,\n      1234)\n\n    val positiveLabelRatioInDebiasedData21 = debiasedPositiveLabelData.filter(\n      $\"position\" === 2).count.toFloat /\n      debiasedPositiveLabelData.filter($\"position\" === 1).count\n\n    
val positiveLabelRatioInData21 = positionBiasData.filter($\"position\" === 2 and\n      $\"label\" === 1).count.toFloat /\n      positionBiasData.filter($\"position\" === 1 and $\"label\" === 1).count\n\n    // The ratio positiveLabelRatioInData21 / positiveLabelRatioInDebiasedData21 should match\n    // the position bias at position 2 with respect to position 1.\n    Assert.assertEquals(positiveLabelRatioInData21 / positiveLabelRatioInDebiasedData21, 0.80, 0.05)\n\n    val positiveLabelRatioInDebiasedData31 = debiasedPositiveLabelData.filter(\n      $\"position\" === 3).count.toFloat /\n      debiasedPositiveLabelData.filter($\"position\" === 1).count\n\n    val positiveLabelRatioInData31 = positionBiasData.filter($\"position\" === 3 and\n      $\"label\" === 1).count.toFloat /\n      positionBiasData.filter($\"position\" === 1 and $\"label\" === 1).count\n\n    // The ratio positiveLabelRatioInData31 / positiveLabelRatioInDebiasedData31 should match\n    // the position bias at position 3 with respect to position 1.\n    Assert.assertEquals(positiveLabelRatioInData31 / positiveLabelRatioInDebiasedData31, 0.60, 0.05)\n  }\n\n}\n"
  },
  {
    "path": "lift/src/test/scala/com/linkedin/lift/lib/StatsUtilsTest.scala",
    "content": "package com.linkedin.lift.lib\n\nimport com.linkedin.lift.lib.StatsUtils.ConfusionMatrix\nimport com.linkedin.lift.lib.testing.TestValues\nimport com.linkedin.lift.types.ModelPrediction\nimport org.apache.spark.sql.functions.col\nimport org.testng.Assert\nimport org.testng.annotations.Test\n\n/**\n  * Tests for StatsUtils\n  */\nclass StatsUtilsTest {\n\n  @Test(description = \"Round a double to specified digits of precision\")\n  def testRoundDouble(): Unit = {\n    Assert.assertEquals(StatsUtils.roundDouble(0.123456), 0.12346)\n    Assert.assertEquals(StatsUtils.roundDouble(0.123456, 4), 0.1235)\n    Assert.assertEquals(StatsUtils.roundDouble(0.123456, 2), 0.12)\n    Assert.assertEquals(StatsUtils.roundDouble(0.123456, 1), 0.1)\n  }\n\n  @Test(description = \"Compute positive and negative sample percentages\")\n  def testComputePosNegSamplePercentages(): Unit = {\n    val posDF = TestValues.df.filter(col(\"label\") === \"1\")\n    val negDF = TestValues.df.filter(col(\"label\") === \"0\")\n\n    // Sample 50% from each striation to ensure an overall 50% sample with the\n    // same pos:neg ratio as the source\n    val (posSamplePercentage1, negSamplePercentage1) =\n      StatsUtils.computePosNegSamplePercentages(posDF, negDF, 5)\n    Assert.assertEquals(posSamplePercentage1, 0.5)\n    Assert.assertEquals(negSamplePercentage1, 0.5)\n\n    // Sampling 1 out of 4 positives, and 4 out of 6 negatives will give us 0.8\n    // percentage of negative labels and a total of 5 rows.\n    val (posSamplePercentage2, negSamplePercentage2) =\n      StatsUtils.computePosNegSamplePercentages(posDF, negDF, 5, 0.8)\n    Assert.assertEquals(StatsUtils.roundDouble(posSamplePercentage2), 0.25)\n    Assert.assertEquals(StatsUtils.roundDouble(negSamplePercentage2), 0.66667)\n\n    // Requesting way too many samples should return 1.0\n    val (posSamplePercentage3, negSamplePercentage3) =\n      StatsUtils.computePosNegSamplePercentages(posDF, negDF, 100)\n    
Assert.assertEquals(posSamplePercentage3, 1.0)\n    Assert.assertEquals(negSamplePercentage3, 1.0)\n  }\n\n  @Test(description = \"Precision@K\")\n  def testComputePrecisionAtK(): Unit = {\n    val pAt5Threshold1 = StatsUtils.computePrecisionAtK(1.0, 5)(_)\n    val pAt5Threshold2 = StatsUtils.computePrecisionAtK(2.0, 5)(_)\n    val pAt10Threshold1 = StatsUtils.computePrecisionAtK(1.0, 10)(_)\n    val pAt10Threshold2 = StatsUtils.computePrecisionAtK(2.0, 10)(_)\n\n    val predictions = Seq(\n      ModelPrediction(label = 1, prediction = 1.0, dimensionValue = \"\", groupId = \"1\", rank = 1),\n      ModelPrediction(label = 1, prediction = 0.8, dimensionValue = \"\", groupId = \"1\", rank = 2),\n      ModelPrediction(label = 2, prediction = 0.8, dimensionValue = \"\", groupId = \"1\", rank = 3),\n      ModelPrediction(label = 0, prediction = 0.7, dimensionValue = \"\", groupId = \"1\", rank = 4),\n\n      ModelPrediction(label = 2, prediction = 0.9, dimensionValue = \"\", groupId = \"2\", rank = 1),\n      ModelPrediction(label = 2, prediction = 0.2, dimensionValue = \"\", groupId = \"2\", rank = 2),\n      ModelPrediction(label = 1, prediction = 0.3, dimensionValue = \"\", groupId = \"2\", rank = 3),\n      ModelPrediction(label = 1, prediction = 1.0, dimensionValue = \"\", groupId = \"2\", rank = 4),\n      ModelPrediction(label = 0, prediction = 0.6, dimensionValue = \"\", groupId = \"2\", rank = 5),\n      ModelPrediction(label = 1, prediction = 0.6, dimensionValue = \"\", groupId = \"2\", rank = 6),\n      ModelPrediction(label = 2, prediction = 0.6, dimensionValue = \"\", groupId = \"2\", rank = 7),\n      ModelPrediction(label = 2, prediction = 0.6, dimensionValue = \"\", groupId = \"2\", rank = 8),\n      ModelPrediction(label = 1, prediction = 0.6, dimensionValue = \"\", groupId = \"2\", rank = 9),\n      ModelPrediction(label = 0, prediction = 0.6, dimensionValue = \"\", groupId = \"2\", rank = 10),\n      ModelPrediction(label = 0, prediction = 0.7, 
dimensionValue = \"\", groupId = \"2\", rank = 11),\n\n      ModelPrediction(label = 2, prediction = 0.6, dimensionValue = \"\", groupId = \"3\", rank = 1),\n      ModelPrediction(label = 2, prediction = 0.6, dimensionValue = \"\", groupId = \"3\", rank = 2),\n      ModelPrediction(label = 2, prediction = 0.6, dimensionValue = \"\", groupId = \"3\", rank = 3),\n      ModelPrediction(label = 1, prediction = 0.6, dimensionValue = \"\", groupId = \"3\", rank = 4),\n      ModelPrediction(label = 1, prediction = 0.6, dimensionValue = \"\", groupId = \"3\", rank = 5),\n      ModelPrediction(label = 0, prediction = 0.7, dimensionValue = \"\", groupId = \"3\", rank = 6))\n\n    Assert.assertEquals(pAt5Threshold1(predictions), 0.85)\n    Assert.assertEquals(pAt5Threshold2(predictions), 0.4166666666666667)\n    Assert.assertEquals(pAt10Threshold1(predictions), 0.7944444444444444)\n    Assert.assertEquals(pAt10Threshold2(predictions), 0.3833333333333333)\n  }\n\n  @Test(description = \"Standard Deviation\")\n  def testComputeStdDev(): Unit = {\n    Assert.assertEquals(StatsUtils.computeStdDev(Seq()), 0.0)\n    Assert.assertEquals(StatsUtils.computeStdDev(Seq(1.0)), 0.0)\n\n    val testSeq1: Seq[Double] = Seq(1.0, 1.0, 1.0, 1.0)\n    Assert.assertEquals(StatsUtils.computeStdDev(testSeq1), 0.0)\n\n    val testSeq2: Seq[Double] = Seq(-2.0, -1.0, 0.0, 1.0, 2.0)\n    Assert.assertEquals(StatsUtils.computeStdDev(testSeq2), 1.5811388300841898)\n\n    val testSeq3: Seq[Double] = Seq(1.0, 1.2, 2.0, 1.3, -1.4, -2.3, -1.8, 4.4,\n      2.2, 5.8, -3.0, 0.0, 0.3, 0.1, -0.01, -4, -3, -2.0, 1.0, 4.1, -2.8, 3.3)\n    Assert.assertEquals(StatsUtils.computeStdDev(testSeq3), 2.6658697808242033)\n  }\n\n  @Test(description = \"Traditional confusion matrix\")\n  def testComputeTraditionalConfusionMatrix(): Unit = {\n    val predictions = Seq(\n      ModelPrediction(label = 1, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      
ModelPrediction(label = 0, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"))\n\n    val actualConfMatrix =\n      StatsUtils.computeGeneralizedConfusionMatrix(predictions)\n    val expectedConfMatrix = ConfusionMatrix(\n      truePositive = 1,\n      falsePositive = 4,\n      trueNegative = 2,\n      falseNegative = 3)\n    Assert.assertEquals(actualConfMatrix, expectedConfMatrix)\n  }\n\n  @Test(description = \"Generalized confusion matrix\")\n  def testComputeGeneralizedConfusionMatrix(): Unit = {\n    val predictions = Seq(\n      ModelPrediction(label = 1, prediction = 0.8, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0.0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.4, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.9, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0.2, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0.3, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1.0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.6, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.7, dimensionValue = \"\"))\n\n    val actualConfMatrix =\n      StatsUtils.computeGeneralizedConfusionMatrix(predictions)\n    val expectedConfMatrix = ConfusionMatrix(\n      truePositive = 1.3,\n      falsePositive = 3.7,\n 
     trueNegative = 2.3,\n      falseNegative = 2.7)\n    Assert.assertEquals(actualConfMatrix, expectedConfMatrix)\n  }\n\n  @Test(description = \"Compute precision\")\n  def testComputePrecision(): Unit = {\n    val predictions = Seq(\n      ModelPrediction(label = 1, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"))\n\n    val actualPrecision = StatsUtils.computePrecision(predictions)\n    Assert.assertEquals(actualPrecision, 0.2)\n  }\n\n  @Test(description = \"Compute FPR\")\n  def testComputeFalsePositiveRate(): Unit = {\n    val predictions = Seq(\n      ModelPrediction(label = 1, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"))\n\n    val actualFPR = 
StatsUtils.computeFalsePositiveRate(predictions)\n    Assert.assertEquals(StatsUtils.roundDouble(actualFPR), 0.66667)\n  }\n\n  @Test(description = \"Compute FNR\")\n  def testComputeFalseNegativeRate(): Unit = {\n    val predictions = Seq(\n      ModelPrediction(label = 1, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"))\n\n    val actualFNR = StatsUtils.computeFalseNegativeRate(predictions)\n    Assert.assertEquals(actualFNR, 0.75)\n  }\n\n  @Test(description = \"Compute Recall\")\n  def testComputeRecall(): Unit = {\n    val predictions = Seq(\n      ModelPrediction(label = 1, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"))\n\n    val actualRecall = 
StatsUtils.computeRecall(predictions)\n    Assert.assertEquals(actualRecall, 0.25)\n  }\n\n  @Test(description = \"Compute TNR\")\n  def testComputeTrueNegativeRate(): Unit = {\n    val predictions = Seq(\n      ModelPrediction(label = 1, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"\"))\n\n    val actualTNR = StatsUtils.computeTrueNegativeRate(predictions)\n    Assert.assertEquals(StatsUtils.roundDouble(actualTNR), 0.33333)\n  }\n\n  @Test(description = \"computeROCCurve\")\n  def testComputeROCCurve(): Unit = {\n    val predictions = Seq(\n      ModelPrediction(label = 1, prediction = 0.8, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0.0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.4, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.9, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0.2, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0.2, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1.0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.7, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.7, dimensionValue = \"\"))\n    val (fpr, tpr) = 
StatsUtils.computeROCCurve(predictions)\n    val roundedFpr = fpr.map(StatsUtils.roundDouble(_))\n    val roundedTpr = tpr.map(StatsUtils.roundDouble(_))\n    Assert.assertEquals(roundedFpr,\n      Seq(0.16667, 0.33333, 0.33333, 0.66667, 0.83333, 0.83333, 1.0, 1.0))\n    Assert.assertEquals(roundedTpr,\n      Seq(0.0, 0.0, 0.25, 0.25, 0.25, 0.75, 0.75, 1.0))\n  }\n\n  @Test(description = \"computeAUC\")\n  def testComputeAUC(): Unit = {\n    // Using the same predictions as above\n    val predictions1 = Seq(\n      ModelPrediction(label = 1, prediction = 0.8, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0.0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.4, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.9, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0.2, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0.2, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 1.0, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.7, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.7, dimensionValue = \"\"))\n    val auc1 = StatsUtils.computeAUC(predictions1)\n    Assert.assertEquals(StatsUtils.roundDouble(auc1), 0.25)\n\n    // Predictions with a good classifier\n    val predictions2 = Seq(\n      ModelPrediction(label = 1, prediction = 1.0, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0.8, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0.5, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0.7, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.7, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.6, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.4, dimensionValue = \"\"),\n      
ModelPrediction(label = 0, prediction = 0.2, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.0, dimensionValue = \"\"))\n    val auc2 = StatsUtils.computeAUC(predictions2)\n    Assert.assertEquals(StatsUtils.roundDouble(auc2), 0.89583)\n\n    // Predictions with a perfect classifier\n    val predictions3 = Seq(\n      ModelPrediction(label = 1, prediction = 1.0, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0.8, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0.7, dimensionValue = \"\"),\n      ModelPrediction(label = 1, prediction = 0.7, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.6, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.6, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.4, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.2, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.1, dimensionValue = \"\"),\n      ModelPrediction(label = 0, prediction = 0.0, dimensionValue = \"\"))\n    val auc3 = StatsUtils.computeAUC(predictions3)\n    Assert.assertEquals(StatsUtils.roundDouble(auc3), 1.0)\n  }\n}\n"
  },
  {
    "path": "lift/src/test/scala/com/linkedin/lift/mitigation/EOppUtilsTest.scala",
    "content": "package com.linkedin.lift.mitigation\n\nimport com.linkedin.lift.lib.PositionBiasUtils.debiasPositiveLabelScores\nimport com.linkedin.lift.lib.testing.TestUtils\nimport com.linkedin.lift.lib.testing.TestUtils.{applyPositionBias, loadCsvData}\nimport com.linkedin.lift.mitigation.EOppUtils._\nimport com.linkedin.lift.types.{ScoreWithAttribute, ScoreWithLabelAndAttribute, ScoreWithLabelAndPosition}\nimport org.apache.spark.mllib.random.RandomRDDs.uniformRDD\nimport org.apache.spark.sql.types._\nimport org.apache.spark.sql.{Row, SparkSession}\nimport org.scalatest.matchers.must.Matchers.contain\nimport org.scalatest.matchers.should.Matchers.convertToAnyShouldWrapper\nimport org.testng.Assert\nimport org.testng.annotations.Test\n\n\n/**\n  * Tests for EOppUtils\n  */\n\n\nclass EOppUtilsTest {\n\n\n  final val spark: SparkSession = TestUtils.createSparkSession()\n\n  @Test(description = \"Transforming a single score using a transformation function given as a scala map\")\n  def transformScoreTest: Unit = {\n    val transformation = Map(1.0 -> 2.0, 2.0 -> 4.0, 3.0 -> 5.0, 6.0 -> 11.0, 7.0 -> 11.0)\n    val sortedKeys = transformation.keys.toList.sorted\n\n    Assert.assertEquals(transformScore(0.0, sortedKeys, transformation), 2.0, 0)\n    Assert.assertEquals(transformScore(1.0, sortedKeys, transformation), 2.0, 0)\n    Assert.assertEquals(transformScore(1.5, sortedKeys, transformation), 3.0, 0)\n    Assert.assertEquals(transformScore(3.2, sortedKeys, transformation), 5.4, 0)\n    Assert.assertEquals(transformScore(6.2, sortedKeys, transformation), 11.0, 0)\n    Assert.assertEquals(transformScore(10, sortedKeys, transformation), 11.0, 0)\n  }\n\n  @Test(description = \"Transform scores of a dataset based on the corresponding attribute\")\n  def applyTransformationTest(): Unit = {\n    import spark.implicits._\n\n    val attributeList = List(\"0\", \"1\")\n\n    val transformations = Map(attributeList(0) -> Map(1.0 -> 2.0, 2.0 -> 4.0, 3.0 -> 5.0, 6.0 -> 
11.0, 7.0 -> 11.0),\n      attributeList(1) -> Map(1.0 -> 10.0, 3.0 -> 4.0, 5.0 -> 2.0))\n\n    val data = List(\n      ScoreWithAttribute(0, 0.0, attributeList(0)),\n      ScoreWithAttribute(1, 0.0, attributeList(1)),\n      ScoreWithAttribute(2, 1.5, attributeList(0)),\n      ScoreWithAttribute(3, 1.5, attributeList(1)),\n      ScoreWithAttribute(4, 4.0, attributeList(0)),\n      ScoreWithAttribute(5, 4.0, attributeList(1)),\n      ScoreWithAttribute(6, 6.0, attributeList(0)),\n      ScoreWithAttribute(7, 6.0, attributeList(1))).toDS\n\n    val transformedData = applyTransformation(data, attributeList, transformations)\n\n    val expectedOutput = List(\n      ScoreWithAttribute(0, 2.0, attributeList(0)),\n      ScoreWithAttribute(1, 10.0, attributeList(1)),\n      ScoreWithAttribute(2, 3.0, attributeList(0)),\n      ScoreWithAttribute(3, 8.5, attributeList(1)),\n      ScoreWithAttribute(4, 7.0, attributeList(0)),\n      ScoreWithAttribute(5, 3.0, attributeList(1)),\n      ScoreWithAttribute(6, 11.0, attributeList(0)),\n      ScoreWithAttribute(7, 2.0, attributeList(1))).toDS\n\n    transformedData.collect() should contain theSameElementsAs expectedOutput.collect\n  }\n\n  @Test(description = \"Computing the empirical CDF function\")\n  def cdfTransformationTest(): Unit = {\n    val schema = StructType(Array(StructField(\"score\", DoubleType)))\n    val scoreRDD = uniformRDD(spark.sparkContext, 10000L, 5, 12).map(Row(_))\n    val data = spark.createDataFrame(scoreRDD, schema)\n    val numQuantiles = 4\n    val probabilities = Array.range(0, numQuantiles + 1).map(x => x.toDouble / numQuantiles)\n    val cdf = cdfTransformation(data, probabilities, 1e-6)\n\n    val sortedKeys = cdf.keys.toList.sorted\n\n    Assert.assertEquals(transformScore(0.0, sortedKeys, cdf), 0.0, 0.01)\n    Assert.assertEquals(transformScore(0.25, sortedKeys, cdf), 0.25, 0.01)\n    Assert.assertEquals(transformScore(0.5, sortedKeys, cdf), 0.5, 0.01)\n    Assert.assertEquals(transformScore(1.0, 
sortedKeys, cdf), 1.0, 0.01)\n  }\n\n  @Test()\n  def adjustScaleTest(): Unit = {\n    import spark.implicits._\n\n    val attributeList = List(\"0\", \"1\")\n\n    val transformations = Map(attributeList(0) -> Map(1.0 -> 2.0, 2.0 -> 4.0, 3.0 -> 6.0),\n      attributeList(1) -> Map(1.0 -> 2.0, 2.0 -> 4.0, 3.0 -> 6.0))\n\n    val data = List(\n      ScoreWithAttribute(0, 1.0, attributeList(0)),\n      ScoreWithAttribute(1, 1.0, attributeList(1)),\n      ScoreWithAttribute(2, 2.0, attributeList(0)),\n      ScoreWithAttribute(3, 2.0, attributeList(1)),\n      ScoreWithAttribute(4, 3.0, attributeList(0)),\n      ScoreWithAttribute(5, 3.0, attributeList(1))).toDS\n\n    val adjustedTransformation = adjustScale(data, attributeList, transformations, 3,\n      1e-6)\n\n    val transformedData = applyTransformation(data, attributeList, adjustedTransformation)\n\n    transformedData.collect() should contain theSameElementsAs data.collect\n  }\n\n  //@Test() // it takes around 2-5 minutes to run\n  def eOppTransformationTest(): Unit = {\n    // Training data and validation data are generated using the models described in the simulation section of\n    // https://arxiv.org/abs/2006.11350. Each dataset contains 1 million rows\n    // (20k sessions with 50 randomly selected items from a population of 50k items) and 5 columns\n    // (itemId, sessionId, score, label, attribute). 
Please see equality-of-opportunity.md for further details\n\n    import spark.implicits._\n\n    val attributeList = List(\"0\", \"1\")\n    val dataSchema = StructType(Array(\n      StructField(\"itemId\", IntegerType),\n      StructField(\"sessionId\", IntegerType),\n      StructField(\"score\", DoubleType),\n      StructField(\"label\", IntegerType),\n      StructField(\"attribute\", StringType),\n      StructField(\"position\", IntegerType, true))\n    )\n\n\n    val trainingDataWithoutPositionBias = loadCsvData(spark,\n      \"src/test/data/TrainingData.csv\", dataSchema, \",\")\n      .as[ScoreWithLabelAndAttribute]\n\n    val trainingData = applyPositionBias(trainingDataWithoutPositionBias)\n    trainingData.persist\n\n    // Step 1: Learning position bias corrected EOpp transformation using the training data\n    val debiasedTrainingData = debiasPositiveLabelScores(positionBiasEstimationCutOff = 20,\n      data = trainingData.as[ScoreWithLabelAndPosition], repeatTimes = 10, inflationRate = 10,\n      numPartitions = 10, seed = 123)\n\n    val transformations = eOppTransformation(debiasedTrainingData.as[ScoreWithLabelAndAttribute],\n      attributeList, numQuantiles = 1000, relativeTolerance = 1e-4, true)\n\n    // Step 2: Applying the EOpp transformation on the validation data\n    val validationDataWithoutPositionBias = loadCsvData(spark,\n      \"src/test/data/ValidationData.csv\", dataSchema, \",\")\n      .as[ScoreWithLabelAndAttribute]\n\n    val validationDataWithoutLabel = validationDataWithoutPositionBias\n      .drop(\"label\").as[ScoreWithAttribute]\n\n    val transformedValidationData = applyTransformation(validationDataWithoutLabel, attributeList,\n      transformations, 10)\n\n    val joinedData = transformedValidationData\n      .join(validationDataWithoutPositionBias.select($\"itemId\", $\"sessionId\", $\"label\"),\n        Seq(\"itemId\", \"sessionId\"), \"inner\")\n      .as[ScoreWithLabelAndAttribute]\n\n    val 
transformedValidationDataWithPositionBias = applyPositionBias(joinedData)\n      .filter($\"label\" === 1)\n\n    // Step 3: checking EOpp in the transformed validation data with position bias\n    val numQuantiles = 1000\n    val relativeTolerance = 1e-4\n    val probabilities = Array.range(0, numQuantiles + 1).map(x => x.toDouble / numQuantiles)\n    val attribute0Quantiles = transformedValidationDataWithPositionBias.filter($\"attribute\" === \"0\")\n      .stat.approxQuantile(\"score\", probabilities, relativeTolerance)\n    val attribute1Quantiles = transformedValidationDataWithPositionBias.filter($\"attribute\" === \"1\")\n      .stat.approxQuantile(\"score\", probabilities, relativeTolerance)\n\n    val wasserstein2DistanceEOpp = attribute0Quantiles.zip(attribute1Quantiles)\n      .map(x => math.pow(x._1 - x._2, 2)).sum / numQuantiles\n\n    Assert.assertEquals(wasserstein2DistanceEOpp, 0, 0.05)\n\n    // Step 4: checking if the transformed score distribution is the same as the score distribution before\n    //transformation\n    val quantilesAfterTransformation = transformedValidationData\n      .stat.approxQuantile(\"score\", probabilities, relativeTolerance)\n\n    val quantilesBeforeTransformation = validationDataWithoutLabel\n      .stat.approxQuantile(\"score\", probabilities, relativeTolerance)\n\n    val wasserstein2DistanceRescaling = quantilesAfterTransformation.zip(quantilesBeforeTransformation)\n      .map(x => math.pow(x._1 - x._2, 2)).sum / numQuantiles\n\n    Assert.assertEquals(wasserstein2DistanceRescaling, 0, 0.05)\n\n  }\n}\n"
  },
  {
    "path": "lift/src/test/scala/com/linkedin/lift/types/BenefitMapTest.scala",
    "content": "package com.linkedin.lift.types\n\nimport org.testng.Assert\nimport org.testng.annotations.Test\n\n/**\n  * Tests for the BenefitMap class\n  */\n\nclass BenefitMapTest {\n\n  val EPS = 1e-12\n  val testBenefits: BenefitMap = BenefitMap(benefitType = \"x\", entries = Map(\n    Map(\"gender\" -> \"MALE\") -> 0.9,\n    Map(\"gender\" -> \"FEMALE\") -> 0.75,\n    Map(\"gender\" -> \"UNKNOWN\") -> 0.6))\n  val testBenefitsEqual: BenefitMap = BenefitMap(benefitType = \"y\", entries = Map(\n    Map(\"gender\" -> \"MALE\") -> 0.9,\n    Map(\"gender\" -> \"FEMALE\") -> 0.9,\n    Map(\"gender\" -> \"UNKNOWN\") -> 0.9))\n\n  @Test(description = \"Benefits mean and variance\")\n  def testMean(): Unit = {\n    Assert.assertEquals(testBenefits.mean, 0.75)\n    Assert.assertTrue(math.abs(testBenefits.variance - 0.015) < EPS)\n    Assert.assertEquals(testBenefitsEqual.mean, 0.9)\n    Assert.assertTrue(math.abs(testBenefitsEqual.variance) < EPS)\n  }\n\n  @Test(description = \"Inequality measures - unequal benefits\")\n  def testInequalityMeasuresUnequalBenefits(): Unit = {\n    val actualGEI20 = testBenefits.computeGeneralizedEntropyIndex(2.0)\n    val expectedGEI20 = 0.04 / 3\n    Assert.assertTrue(math.abs(actualGEI20 - expectedGEI20) < EPS)\n\n    val actualGEI10 = testBenefits.computeGeneralizedEntropyIndex(1.0)\n    val actualTheilT = testBenefits.computeTheilTIndex\n    val expectedGEI10 = (1.2 * math.log(1.2) + 0.8 * math.log(0.8)) / 3\n    Assert.assertTrue(math.abs(actualGEI10 - expectedGEI10) < EPS)\n    Assert.assertTrue(math.abs(actualTheilT - expectedGEI10) < EPS)\n\n    val actualGEI00 = testBenefits.computeGeneralizedEntropyIndex(0)\n    val actualTheilL = testBenefits.computeTheilLIndex\n    val expectedGEI00 = - (math.log(1.2) + math.log(0.8)) / 3\n    Assert.assertTrue(math.abs(actualGEI00 - expectedGEI00) < EPS)\n    Assert.assertTrue(math.abs(actualTheilL - expectedGEI00) < EPS)\n\n    val actualGEI05 = 
testBenefits.computeGeneralizedEntropyIndex(0.5)\n    val expectedGEI05 = (2 - math.sqrt(1.2) - math.sqrt(0.8)) * 4 / 3\n    Assert.assertTrue(math.abs(actualGEI05 - expectedGEI05) < EPS)\n\n    val actualAtkinson10 = testBenefits.computeAtkinsonIndex(1.0)\n    val expectedAtkinson10 = 1 - math.exp(-expectedGEI00)\n    Assert.assertTrue(math.abs(actualAtkinson10 - expectedAtkinson10) < EPS)\n\n    val actualAtkinson00 = testBenefits.computeAtkinsonIndex(0)\n    Assert.assertTrue(math.abs(actualAtkinson00) < EPS)\n\n    val actualAtkinson05 = testBenefits.computeAtkinsonIndex(0.5)\n    val expectedAtkinson05 =\n      1 - math.pow(math.sqrt(1.2) + math.sqrt(0.8) + 1, 2) / 9\n    Assert.assertTrue(math.abs(actualAtkinson05 - expectedAtkinson05) < EPS)\n\n    val actualCOV = testBenefits.computeCoefficientOfVariation\n    val expectedCOV = math.sqrt(0.015) / 0.75\n    Assert.assertTrue(math.abs(actualCOV - expectedCOV) < EPS)\n  }\n\n  @Test(description = \"Inequality measures - equal benefits\")\n  def testInequalityMeasuresEqualBenefits(): Unit = {\n    val actualGEI20 = testBenefitsEqual.computeGeneralizedEntropyIndex(2.0)\n    Assert.assertTrue(math.abs(actualGEI20) < EPS)\n\n    val actualGEI10 = testBenefitsEqual.computeGeneralizedEntropyIndex(1.0)\n    val actualTheilT = testBenefitsEqual.computeTheilTIndex\n    Assert.assertTrue(math.abs(actualGEI10) < EPS)\n    Assert.assertTrue(math.abs(actualTheilT) < EPS)\n\n    val actualGEI00 = testBenefitsEqual.computeGeneralizedEntropyIndex(0)\n    val actualTheilL = testBenefitsEqual.computeTheilLIndex\n    Assert.assertTrue(math.abs(actualGEI00) < EPS)\n    Assert.assertTrue(math.abs(actualTheilL) < EPS)\n\n    val actualGEI05 = testBenefitsEqual.computeGeneralizedEntropyIndex(0.5)\n    Assert.assertTrue(math.abs(actualGEI05) < EPS)\n\n    val actualAtkinson10 = testBenefitsEqual.computeAtkinsonIndex(1.0)\n    Assert.assertTrue(math.abs(actualAtkinson10) < EPS)\n\n    val actualAtkinson00 = 
testBenefitsEqual.computeAtkinsonIndex(0)\n    Assert.assertTrue(math.abs(actualAtkinson00) < EPS)\n    val actualAtkinson05 = testBenefitsEqual.computeAtkinsonIndex(0.5)\n    Assert.assertTrue(math.abs(actualAtkinson05) < EPS)\n\n    val actualCOV = testBenefitsEqual.computeCoefficientOfVariation\n    Assert.assertTrue(math.abs(actualCOV) < EPS)\n  }\n\n  @Test(description = \"Compute overall fairness metrics\")\n  def testComputeOverallMetrics(): Unit = {\n    val actualResults = testBenefits.computeOverallMetrics(Map(\n      \"GENERALIZED_ENTROPY_INDEX\" -> \"0.5\", \"THEIL_T_INDEX\" -> \"\",\n      \"THEIL_L_INDEX\" -> \"\"))\n\n    Assert.assertEquals(actualResults, Seq(\n      FairnessResult(resultType = \"Benefit Map for x\",\n        resultValOpt = None,\n        constituentVals = Map(Map(\"gender\" -> \"UNKNOWN\") -> 0.6,\n          Map(\"gender\" -> \"MALE\") -> 0.9,\n          Map(\"gender\" -> \"FEMALE\") -> 0.75)),\n      FairnessResult(resultType = \"x: GENERALIZED_ENTROPY_INDEX\",\n        parameters = \"0.5\",\n        constituentVals = Map(),\n        resultValOpt = Some(0.013503591986335994)),\n      FairnessResult(resultType = \"x: THEIL_T_INDEX\",\n        constituentVals = Map(),\n        resultValOpt = Some(0.013423675700459214)),\n      FairnessResult(resultType = \"x: THEIL_L_INDEX\",\n        constituentVals = Map(),\n        resultValOpt = Some(0.013607331506751752))))\n  }\n\n  @Test(description = \"BenefitMap computation\")\n  def testCompute(): Unit = {\n    val predictions = Seq(\n      ModelPrediction(label = 1, prediction = 1, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 1, prediction = 0, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"MALE\"),\n      ModelPrediction(label = 0, prediction = 0, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = 
\"FEMALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"FEMALE\"),\n      ModelPrediction(label = 1, prediction = 1, dimensionValue = \"UNKNOWN\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"UNKNOWN\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"UNKNOWN\"),\n      ModelPrediction(label = 0, prediction = 1, dimensionValue = \"UNKNOWN\"))\n\n    val actualBenefitMap1 = BenefitMap.compute(predictions,\n      \"gender\", \"PRECISION\")\n    Assert.assertEquals(actualBenefitMap1, BenefitMap(\n      benefitType = \"PRECISION\",\n      entries = Map(Map(\"gender\" -> \"MALE\") -> 0.5,\n        Map(\"gender\" -> \"FEMALE\") -> 0.0, Map(\"gender\" -> \"UNKNOWN\") -> 0.25)))\n\n    val actualBenefitMap2 = BenefitMap.compute(predictions, \"gender\",\n      \"com.linkedin.lift.lib.testing.TestCustomMetric\")\n    Assert.assertEquals(actualBenefitMap2, BenefitMap(\n      benefitType = \"com.linkedin.lift.lib.testing.TestCustomMetric\",\n      entries = Map(Map(\"gender\" -> \"MALE\") -> 1.0,\n      Map(\"gender\" -> \"FEMALE\") -> 1.0, Map(\"gender\" -> \"UNKNOWN\") -> 1.0)))\n  }\n\n  @Test(description = \"BenefitMap computation for ranking metric\")\n  def testComputeRanking(): Unit = {\n    val predictions = Seq(\n      ModelPrediction(label = 1, prediction = 0.35, dimensionValue = \"MALE\", groupId = \"1\", rank = 1),\n      ModelPrediction(label = 0, prediction = 0.25, dimensionValue = \"MALE\", groupId = \"1\", rank = 2),\n      ModelPrediction(label = 1, prediction = 0.11, dimensionValue = \"FEMALE\", groupId = \"1\", rank = 3),\n      ModelPrediction(label = 1, prediction = 0.88, dimensionValue = \"MALE\", groupId = \"2\", rank = 1),\n      ModelPrediction(label = 0, prediction = 0.65, dimensionValue = \"FEMALE\", groupId = \"2\", rank = 2),\n      ModelPrediction(label = 0, prediction = 0.22, dimensionValue = 
\"MALE\", groupId = \"2\", rank = 3),\n      ModelPrediction(label = 1, prediction = 0.10, dimensionValue = \"FEMALE\", groupId = \"2\", rank = 4),\n      ModelPrediction(label = 1, prediction = 0.11, dimensionValue = \"MALE\", groupId = \"3\", rank = 1))\n\n    val actualBenefitMap = BenefitMap.compute(predictions,\n      \"gender\", \"PRECISION/1@25\")\n    Assert.assertEquals(actualBenefitMap, BenefitMap(\n      benefitType = \"PRECISION/1@25\",\n      entries = Map(Map(\"gender\" -> \"MALE\") -> 0.6666666666666666,\n        Map(\"gender\" -> \"FEMALE\") -> 0.75)))\n  }\n}\n"
  },
  {
    "path": "lift/src/test/scala/com/linkedin/lift/types/DistributionTest.scala",
    "content": "package com.linkedin.lift.types\n\nimport com.linkedin.lift.lib.testing.TestValues\nimport org.testng.Assert\nimport org.testng.annotations.Test\n\n/**\n  * Tests for the Distribution class\n  */\nclass DistributionTest {\n\n  @Test(description = \"Distribution sum\")\n  def testSum(): Unit = {\n    val testDist = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"20\") -> 24.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\") -> 20.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> 4.0,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> 10.0))\n    Assert.assertEquals(testDist.sum, 58.0)\n\n    val testDistEmpty = Distribution(Map())\n    Assert.assertEquals(testDistEmpty.sum, 0.0)\n  }\n\n  @Test(description = \"Zip two different distributions - no overlap\")\n  def testZipNoOverlap(): Unit = {\n    val testDist1 = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"20\") -> 24.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> 4.0))\n\n    val testDist2 = Distribution(Map(\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\") -> 20.0,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> 10.0))\n\n    val expectedZip12 = Seq(\n      (Map(\"gender\" -> \"MALE\", \"age\" -> \"20\"), 24.0, 0.0),\n      (Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\"), 4.0, 0.0),\n      (Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\"), 0.0, 20.0),\n      (Map(\"gender\" -> \"MALE\", \"age\" -> \"40\"), 0.0, 10.0))\n    Assert.assertEquals(testDist1.zip(testDist2), expectedZip12)\n\n    val expectedZip21 = Seq(\n      (Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\"), 20.0, 0.0),\n      (Map(\"gender\" -> \"MALE\", \"age\" -> \"40\"), 10.0, 0.0),\n      (Map(\"gender\" -> \"MALE\", \"age\" -> \"20\"), 0.0, 24.0),\n      (Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\"), 0.0, 4.0))\n    Assert.assertEquals(testDist2.zip(testDist1), expectedZip21)\n  }\n\n  @Test(description = \"Zip two different 
distributions - with overlap\")\n  def testZipWithOverlap(): Unit = {\n    val testDist1 = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"20\") -> 24.0,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> 12.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> 4.0))\n\n    val testDist2 = Distribution(Map(\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\") -> 20.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> 5.0,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\") -> 10.0))\n\n    val expectedZip12 = Seq(\n      (Map(\"gender\" -> \"MALE\", \"age\" -> \"20\"), 24.0, 0.0),\n      (Map(\"gender\" -> \"MALE\", \"age\" -> \"40\"), 12.0, 10.0),\n      (Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\"), 4.0, 5.0),\n      (Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\"), 0.0, 20.0))\n    Assert.assertEquals(testDist1.zip(testDist2), expectedZip12)\n\n    val expectedZip21 = Seq(\n      (Map(\"gender\" -> \"FEMALE\", \"age\" -> \"20\"), 20.0, 0.0),\n      (Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\"), 5.0, 4.0),\n      (Map(\"gender\" -> \"MALE\", \"age\" -> \"40\"), 10.0, 12.0),\n      (Map(\"gender\" -> \"MALE\", \"age\" -> \"20\"), 0.0, 24.0))\n    Assert.assertEquals(testDist2.zip(testDist1), expectedZip21)\n  }\n\n  @Test(description = \"Marginal distribution computation\")\n  def testComputeMarginal(): Unit = {\n    val inputDistributionGenderLabel = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"0\") -> 10.0,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"1\") -> 3.0,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0\") -> 4.0,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0\") -> 5.0,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1\") -> 2.0))\n\n    val expectedMarginalDistributionGender = Distribution(Map(\n      Map(\"gender\" -> \"MALE\") -> 13.0,\n      Map(\"gender\" -> \"UNKNOWN\") -> 4.0,\n      Map(\"gender\" -> \"FEMALE\") -> 7.0))\n\n    val 
expectedMarginalDistributionLabel = Distribution(Map(\n      Map(\"label\" -> \"0\") -> 19.0,\n      Map(\"label\" -> \"1\") -> 5.0))\n\n    val actualMarginalDistributionGender =\n      inputDistributionGenderLabel.computeMarginal(Set(\"gender\"))\n    Assert.assertEquals(actualMarginalDistributionGender,\n      expectedMarginalDistributionGender)\n\n    val actualMarginalDistributionLabel =\n      inputDistributionGenderLabel.computeMarginal(Set(\"label\"))\n    Assert.assertEquals(actualMarginalDistributionLabel,\n      expectedMarginalDistributionLabel)\n\n    // Ensure that the marginal distribution is identical to the original\n    // distribution when all dimensions are included\n    val actualMarginalDistributionGenderLabel =\n      inputDistributionGenderLabel.computeMarginal(Set(\"gender\", \"label\"))\n    Assert.assertEquals(actualMarginalDistributionGenderLabel,\n      inputDistributionGenderLabel)\n  }\n\n  @Test(description = \"Distribution to DF conversion\")\n  def testToDF(): Unit = {\n    val testDist1 = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"20\") -> 24.0,\n      Map(\"gender\" -> \"MALE\", \"age\" -> \"40\", \"label\" -> \"1\") -> 12.0,\n      Map(\"gender\" -> \"FEMALE\", \"age\" -> \"40\") -> 4.0))\n    val df = testDist1.toDF(TestValues.spark)\n\n    // Ensure column names are correct\n    Assert.assertEquals(df.schema.fieldNames.toSeq,\n      Seq(\"gender\", \"age\", \"label\", \"count\"))\n\n    val actualDFSeq: Seq[Seq[Any]] = df.collect.toSeq.map { row =>\n      row.toSeq.map { Option(_).fold(\"\") { _.toString } }\n    }\n\n    val expectedDFSeq: Seq[Seq[Any]] = Seq(\n      Seq(\"MALE\", \"20\", \"\", \"24.0\"),\n      Seq(\"MALE\", \"40\", \"1\", \"12.0\"),\n      Seq(\"FEMALE\", \"40\", \"\", \"4.0\"))\n\n    // Ensure that datasets match\n    Assert.assertEquals(actualDFSeq, expectedDFSeq)\n  }\n\n  @Test(description = \"Distribution computation\")\n  def testCompute(): Unit = {\n    val 
actualDistributionGender =\n      Distribution.compute(TestValues.df, Set(\"gender\"))\n    val expectedDistributionGender = Distribution(Map(\n      Map(\"gender\" -> \"MALE\") -> 5,\n      Map(\"gender\" -> \"FEMALE\") -> 4,\n      Map(\"gender\" -> \"UNKNOWN\") -> 1))\n    Assert.assertEquals(actualDistributionGender, expectedDistributionGender)\n\n    val actualDistributionLabel =\n      Distribution.compute(TestValues.df, Set(\"label\"))\n    val expectedDistributionLabel = Distribution(Map(\n      Map(\"label\" -> \"0\") -> 6,\n      Map(\"label\" -> \"1\") -> 4))\n    Assert.assertEquals(actualDistributionLabel, expectedDistributionLabel)\n\n    val actualDistributionGenderLabel =\n      Distribution.compute(TestValues.df, Set(\"gender\", \"label\"))\n    val expectedDistributionGenderLabel = Distribution(Map(\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"0\") -> 3.0,\n      Map(\"gender\" -> \"MALE\", \"label\" -> \"1\") -> 2.0,\n      Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0\") -> 1.0,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0\") -> 2.0,\n      Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1\") -> 2.0))\n    Assert.assertEquals(actualDistributionGenderLabel,\n      expectedDistributionGenderLabel)\n  }\n}\n"
  },
  {
    "path": "lift/src/test/scala/com/linkedin/lift/types/FairnessResultTest.scala",
    "content": "package com.linkedin.lift.types\n\nimport com.linkedin.lift.lib.testing.TestValues\nimport org.testng.Assert\nimport org.testng.annotations.Test\n\n/**\n  * Tests for the FairnessResult class\n  */\n\nclass FairnessResultTest {\n  val EPS = 1e-12\n\n  @Test(description = \"FairnessResult to BenefitMap translation\")\n  def testToBenefitMap(): Unit = {\n    val fairnessResult = FairnessResult(\n      resultType = \"DEMOGRAPHIC_PARITY\",\n      resultValOpt = None,\n      constituentVals = Map(\n        Map(\"gender1\" -> \"MALE\", \"gender2\" -> \"FEMALE\") -> 0.01,\n        Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"UNKNOWN\") -> 0.03,\n        Map(\"gender1\" -> \"MALE\", \"gender2\" -> \"UNKNOWN\") -> 0.02),\n      additionalStats = Map(\"MALE\" -> 0.03, \"FEMALE\" -> 0.02, \"UNKNOWN\" -> 0.05))\n\n    val actualBenefitMap = fairnessResult.toBenefitMap\n    Assert.assertEquals(actualBenefitMap, BenefitMap(\n      benefitType = \"DEMOGRAPHIC_PARITY\",\n      entries = Map(\n        Map(\"gender1\" -> \"MALE\", \"gender2\" -> \"FEMALE\") -> 0.01,\n        Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"UNKNOWN\") -> 0.03,\n        Map(\"gender1\" -> \"MALE\", \"gender2\" -> \"UNKNOWN\") -> 0.02)))\n\n    val actualGEI = (3.0 - math.pow(0.5, 0.5) - math.pow(1.5, 0.5)\n      - math.pow(1.0, 0.5)) / 0.75\n    Assert.assertTrue(math.abs(\n      actualBenefitMap.computeGeneralizedEntropyIndex(0.5) - actualGEI) < EPS)\n  }\n\n  @Test(description = \"FairnessResults to DataFrame translation\")\n  def testToDF(): Unit = {\n    val results = Seq(\n      FairnessResult(resultType = \"KL_DIVERGENCE\",\n        parameters = Distribution(Map(\n          Map(\"gender\" -> \"FEMALE\", \"label\" -> \"1.0\") -> 0.16666,\n          Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"0.0\") -> 0.16666,\n          Map(\"gender\" -> \"UNKNOWN\", \"label\" -> \"1.0\") -> 0.16666,\n          Map(\"gender\" -> \"MALE\", \"label\" -> \"0.0\") -> 0.16666,\n          
Map(\"gender\" -> \"FEMALE\", \"label\" -> \"0.0\") -> 0.16666,\n          Map(\"gender\" -> \"MALE\", \"label\" -> \"1.0\") -> 0.16666)).toString,\n        resultValOpt = Some(0.13852315605014068),\n        constituentVals = Map()),\n      FairnessResult(resultType = \"DEMOGRAPHIC_PARITY\",\n        resultValOpt = None,\n        constituentVals = Map(\n          Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"UNKNOWN\") -> 0.16667,\n          Map(\"gender1\" -> \"FEMALE\", \"gender2\" -> \"MALE\") -> 0.26667,\n          Map(\"gender1\" -> \"UNKNOWN\", \"gender2\" -> \"MALE\") -> 0.1),\n        additionalStats =\n          Map(\"FEMALE\" -> 0.33333, \"UNKNOWN\" -> 0.5, \"MALE\" -> 0.6)))\n\n    val actualDFSeq = FairnessResult.toDF(TestValues.spark, results)\n      .collect\n      .toSeq.map(_.toString)\n    Assert.assertEquals(actualDFSeq, Seq(\"[KL_DIVERGENCE,Distribution(Map(Map(\" +\n      \"gender -> FEMALE, label -> 1.0) -> 0.16666, Map(gender -> UNKNOWN, \" +\n      \"label -> 0.0) -> 0.16666, Map(gender -> UNKNOWN, label -> 1.0) -> 0.16666, \" +\n      \"Map(gender -> MALE, label -> 0.0) -> 0.16666, Map(gender -> FEMALE, \" +\n      \"label -> 0.0) -> 0.16666, Map(gender -> MALE, label -> 1.0) -> 0.16666)),\" +\n      \"0.13852315605014068,Map(),Map()]\",\n      \"[DEMOGRAPHIC_PARITY,,null,Map(Map(gender1 -> FEMALE, gender2 -> UNKNOWN) -> 0.16667, \" +\n        \"Map(gender1 -> FEMALE, gender2 -> MALE) -> 0.26667, Map(gender1 -> UNKNOWN, \" +\n        \"gender2 -> MALE) -> 0.1),Map(FEMALE -> 0.33333, UNKNOWN -> 0.5, MALE -> 0.6)]\"))\n  }\n}\n"
  },
  {
    "path": "lift/src/test/scala/com/linkedin/lift/types/ModelPredictionTest.scala",
    "content": "package com.linkedin.lift.types\n\nimport com.linkedin.lift.lib.testing.TestValues\nimport org.testng.Assert\nimport org.testng.annotations.Test\n\n/**\n  * Tests for the ModelPrediction class\n  */\n\nclass ModelPredictionTest {\n  @Test(description = \"Compute ModelPrediction instances from a DF\")\n  def testCompute(): Unit = {\n    val actualPredictions = ModelPrediction.compute(TestValues.df,\n      \"label\", \"predicted\", \"\", \"gender\")\n    val expectedPredictions = TestValues.testData\n      .sortBy(- _.predicted.toDouble)\n      .zipWithIndex\n      .map { case (data, idx) =>\n        ModelPrediction(\n          label = data.label.toDouble,\n          prediction = data.predicted.toDouble,\n          rank = idx + 1,\n          dimensionValue = data.gender)\n      }\n    Assert.assertEquals(actualPredictions, expectedPredictions)\n  }\n\n  @Test(description = \"Compute ModelPrediction instances from a DF with groups\")\n  def testComputeWithGroups(): Unit = {\n    val actualPredictions = ModelPrediction.compute(TestValues.df2,\n      \"label\", \"predicted\", \"qid\", \"gender\")\n    val expectedPredictions = TestValues.testData2\n      .groupBy(_.qid)\n      .flatMap { case (_, dataPts) =>\n        dataPts.sortBy(- _.predicted.toDouble)\n          .zipWithIndex.map { case (data, idx) =>\n          ModelPrediction(\n            label = data.label.toDouble,\n            prediction = data.predicted.toDouble,\n            groupId = data.qid,\n            rank = idx + 1,\n            dimensionValue = data.gender)\n        }\n      }\n    Assert.assertEquals(actualPredictions, expectedPredictions)\n  }\n}\n"
  },
  {
    "path": "model-fairness.md",
    "content": "# Model-level Fairness Metrics\n\nAt a high level, these metrics require the score of the model and the corresponding\nprotected attribute value. Some metrics also make use of the\ncorresponding label. A model can produce either a raw score or a probability.\nOur fairness metrics specifically deal with models that output probabilities that\ncan be treated as ![P(\\hat{Y}(X) = 1)](https://render.githubusercontent.com/render/math?math=P(%5Chat%7BY%7D(X)%20%3D%201))\n(if the scores are raw scores, we pass them through a sigmoid link function\nto interpret them as probabilities). If your models\ndo not output binary prediction probabilities, or these probabilities\ncannot appropriately be interpreted as shown, you will need to preprocess\nthe scores before using the library. This can be done inline, in the Spark\njob that computes the model-related fairness metrics.\n\nIf your model is being used for binary classification (and not just for its scores), an\noptional threshold value can be provided, which will be used to binarize the predictions.\nIf a threshold value is not specified, the probabilities ![P(\\hat{Y}(X) = 1)](https://render.githubusercontent.com/render/math?math=P(%5Chat%7BY%7D(X)%20%3D%201))\nare used to compute expected TP, FP, TN and FN counts as needed.\n\nBelow is a list of the metrics available for measuring the fairness of\nML models, with a short description of each.\n\n1. **Metrics that compare against a given reference distribution:** These metrics involve computing\nsome measure of distance or divergence from a given reference distribution provided by the user.\nThe library supports only the `UNIFORM` distribution out of the box (all `score-protectedAttribute`\ncombinations must have an equal number of records), but users may supply their own distribution\n(such as a gender distribution known a priori). These metrics are\nsimilar to those computed on the training dataset. 
The only difference is that we make use\nof the predictions/scores instead of the labels, i.e., ![\\hat{Y}(X)](https://render.githubusercontent.com/render/math?math=%5Chat%7BY%7D(X))\ninstead of ![Y(X)](https://render.githubusercontent.com/render/math?math=Y(X)).\n\n    For the most up-to-date documentation on the supported metrics, please look at the link [here](lift/src/main/scala/com/linkedin/lift/lib/DivergenceUtils.scala), and look for the\n    `computeDistanceMetrics` method as the starting point. The following metrics fall under this\n    category:\n\n    1. **Skews:** Computes the logarithm of the ratio of the observed value to the expected value. For example, if we are dealing with score-gender distributions, this metric computes\n\n        ![\\log\\left(\\frac{(0.0, MALE)_{obs}}{(0.0, MALE)_{exp}}\\right), \\log\\left(\\frac{(1.0, MALE)_{obs}}{(1.0, MALE)_{exp}}\\right), \\log\\left(\\frac{(0.0, FEMALE)_{obs}}{(0.0, FEMALE)_{exp}}\\right), \\log\\left(\\frac{(1.0, FEMALE)_{obs}}{(1.0, FEMALE)_{exp}}\\right)](https://render.githubusercontent.com/render/math?math=%5Clog%5Cleft(%5Cfrac%7B(0.0%2C%20MALE)_%7Bobs%7D%7D%7B(0.0%2C%20MALE)_%7Bexp%7D%7D%5Cright)%2C%20%5Clog%5Cleft(%5Cfrac%7B(1.0%2C%20MALE)_%7Bobs%7D%7D%7B(1.0%2C%20MALE)_%7Bexp%7D%7D%5Cright)%2C%20%5Clog%5Cleft(%5Cfrac%7B(0.0%2C%20FEMALE)_%7Bobs%7D%7D%7B(0.0%2C%20FEMALE)_%7Bexp%7D%7D%5Cright)%2C%20%5Clog%5Cleft(%5Cfrac%7B(1.0%2C%20FEMALE)_%7Bobs%7D%7D%7B(1.0%2C%20FEMALE)_%7Bexp%7D%7D%5Cright))\n\n    2. **Infinity Norm Distance:** Computes the Chebyshev Distance between the observed and reference distributions. It equals the maximum absolute difference between the two distributions.\n    3. **Total Variation Distance:** Computes the Total Variation Distance between the observed and reference distributions. It is equal to half the L1 distance between the two distributions.\n    4. **JS Divergence:** The Jensen-Shannon Divergence between the observed and reference distributions. 
Suppose that the average of these two distributions is given by M. Then, the JS Divergence is the average of the KL Divergences between the observed distribution and M, and the reference distribution and M.\n    5. **KL Divergence:** The Kullback-Leibler Divergence between the observed and reference distributions. It is the expectation (over the observed distribution) of the logarithmic differences between the observed and reference distributions. The latter is the Skew we measure above.\n    \n2. **Metrics computed on the observed distribution only:** These metrics compute some notion of\ndistance or divergence between various segments of the observed distribution.\n\n    For the most up-to-date documentation on the supported metrics, please look at the link [here](lift/src/main/scala/com/linkedin/lift/lib/DivergenceUtils.scala), and look for the\n    `computeDistanceMetrics` method as the starting point. The following metrics fall under this\n    category:\n\n    1. **Demographic Parity:** It measures the difference between the conditional expected value of the prediction (given one protected attribute value) and the conditional expected value of the prediction (given the other protected attribute value). This is measured for all pairs of protected attribute values.\n\n        ![DP_{(g_1, g_2)} = E\\[\\hat{Y}(X)|G=g_1\\] - E\\[\\hat{Y}(X)|G=g_2\\] = P(\\hat{Y}(X)=1|G=g_1) - P(\\hat{Y}(X)=1|G=g_2)](https://render.githubusercontent.com/render/math?math=DP_%7B(g_1%2C%20g_2)%7D%20%3D%20E%5B%5Chat%7BY%7D(X)%7CG%3Dg_1%5D%20-%20E%5B%5Chat%7BY%7D(X)%7CG%3Dg_2%5D%20%3D%20P(%5Chat%7BY%7D(X)%3D1%7CG%3Dg_1)%20-%20P(%5Chat%7BY%7D(X)%3D1%7CG%3Dg_2))\n\n       This metric captures the idea that different protected groups should have similar acceptance rates. While this is desirable in an ideal scenario (and is related to the [80% Labor Law rule](https://en.wikipedia.org/wiki/Disparate_impact)), this might not always be true. 
For example, various socio-economic factors might contribute towards having different acceptance rates for different groups. That is, the difference is not due to the protected group itself, but rather due to other meaningful, but correlated variables. Furthermore, even if we are dealing with a scenario where DP is desirable, it does not deal with model performance at all. We might as well have a second model predict '1' randomly for one group (with a probability equal to the acceptance rate of the other group) to achieve DP. Thus, attempting to optimize for DP directly might not be a good goal, but using it to inform decisions is nevertheless helpful.\n\n    2. **Equalized Odds:** It measures the difference between the conditional expected value of the prediction (given one protected attribute value and its label) and the conditional expected value of the prediction (given the other protected attribute value and its label). This is measured for all pairs of protected attribute values and for each label.\n\n        ![EO_{(g_1, g_2, y)} = E\\[\\hat{Y}(X)|Y=y,G=g_1\\] - E\\[\\hat{Y}(X)|Y=y,G=g_2\\] = P(\\hat{Y}(X)=1|Y=y,G=g_1) - P(\\hat{Y}(X)=1|Y=y,G=g_2)](https://render.githubusercontent.com/render/math?math=EO_%7B(g_1%2C%20g_2%2C%20y)%7D%20%3D%20E%5B%5Chat%7BY%7D(X)%7CY%3Dy%2CG%3Dg_1%5D%20-%20E%5B%5Chat%7BY%7D(X)%7CY%3Dy%2CG%3Dg_2%5D%20%3D%20P(%5Chat%7BY%7D(X)%3D1%7CY%3Dy%2CG%3Dg_1)%20-%20P(%5Chat%7BY%7D(X)%3D1%7CY%3Dy%2CG%3Dg_2))\n\n3. **Statistical Tests for Fairness:** These tests compare a given model performance metric between two different protected groups. For example, comparing the AUC for men vs AUC for women. We need to be able to say if this difference is statistically significant, and we also need it to be metric-agnostic. We achieve this using Permutation Testing. Since this is a non-parametric statistical test, it can be slow, so users can control the sample size and the number of trials to run. 
The test provides a p-value and a measure of standard error (for the p-value) as well.\n\n    We support AUC, Precision, Recall, TNR, FNR and FPR out-of-the-box (the full list can be found by visiting `StatsUtils.scala` and looking at the `getMetricFn` method), and also support any user-defined custom metrics (these need to extend [CustomMetric.scala](lift/src/main/scala/com/linkedin/lift/types/CustomMetric.scala)). More details about the test itself can be found in the `permutationTest` method defined [here](lift/src/main/scala/com/linkedin/lift/lib/PermutationTestUtils.scala). To cite this work, please refer to the 'Citations' section of the [README](README.md).\n    \n4. **Aggregate Metrics:** These metrics are useful to obtain higher level (or second order) notions of inequality, when comparing multiple per-protected-attribute-value inequality metrics. For example, these could be used to say if one set of Skews measured is more equally distributed than another set of Skews. These lower-level metrics are called benefit vectors, and the aggregate metrics provide a notion of how uniformly these inequalities are distributed.\n\n    Note that these metrics capture inequalities within the vector. Thus, going by this metric alone is not sufficient. For example, take a benefit vector that captures Demographic Parity differences between (MALE, FEMALE), (FEMALE, UNKNOWN), and (MALE, UNKNOWN). Suppose that the vector for one distribution is (a, 2a, 3a) and the other is (0.5a, 1.5a, 2a). Even though the individual differences are smaller in the second distribution (for each pair of protected attribute values), an aggregate metric will deem it to be more unfair than the former because the differences between the elements of its vector are more drastic (for the first one, the ratio is 1:2:3 while for the second it is 1:3:4). However, the latter has better Demographic Parity. 
Hence, there may be conflicting notions of fairness being measured, and it is up to the end user to identify which one they would like to focus on.\n\n    We divide these into two categories: `distanceBenefitMetrics` and `performanceBenefitMetrics`. The former computes distance and divergence metrics (mentioned in 1 and 2) and uses these as the benefit vectors for aggregate metrics computation. The latter uses model performance metrics (such as AUC, TPR, FPR for different protected groups) as the benefit vector for aggregate metrics computation. There is no difference in the aggregate computation itself; this distinction is used by LiFT to just be more specific about what needs to be computed. The aggregate metrics can be computed for performance metrics supported out-of-the-box, as well as user-defined custom ones, as mentioned in 3.\n\n    For the most up-to-date documentation on the supported metrics, please look at the link [here](lift/src/main/scala/com/linkedin/lift/types/BenefitMap.scala), and look for the `computeMetric` method as the starting point. The following aggregate metrics are available:\n    1. **Generalized Entropy Index:** Computes an average of the relative benefits based on some input parameters.\n    2. **Atkinson Index:** A derivative of the Generalized Entropy Index. Used more commonly in the field of economics.\n    3. **Theil's L Index:** The Generalized Entropy Index when its parameter is set to 0. It is more sensitive to differences at the lower end of the distribution (the benefit vector values).\n    4. **Theil's T Index:** The Generalized Entropy Index when its parameter is set to 1. It is more sensitive to differences at the higher end of the distribution (the benefit vector values).\n    5. **Coefficient of Variation:** A derivative of the Generalized Entropy Index. It computes the value of the standard deviation divided by the mean of the benefit vector.\n\n"
  },
  {
    "path": "settings.gradle",
    "content": "/*\n * This file was generated by the Gradle 'init' task.\n *\n * The settings file is used to specify which projects to include in your build.\n *\n * Detailed information about configuring a multi-project build in Gradle can be found\n * in the user manual at https://docs.gradle.org/5.6.2/userguide/multi_project_builds.html\n */\n\nrootProject.name = 'lift'\n\ninclude 'lift'\n"
  },
  {
    "path": "version.properties",
    "content": "# Version of the produced binaries.\n# The version is inferred by shipkit-auto-version Gradle plugin (https://github.com/shipkit/shipkit-auto-version)\nversion=0.3.*\n"
  }
]