Repository: stanford-crfm/ecosystem-graphs
Branch: main
Commit: 576655a544c9
Files: 194
Total size: 1.2 MB
Directory structure:
gitextract_sfi3u6er/
├── .github/
│   └── workflows/
│       ├── collate.yml
│       ├── deploy.yml
│       └── test.yml
├── .gitignore
├── .yamllint.yaml
├── README.md
├── assets/
│   ├── 01ai.yaml
│   ├── 360.yaml
│   ├── adept.yaml
│   ├── adobe.yaml
│   ├── ai2.yaml
│   ├── ai21.yaml
│   ├── aleph_alpha.yaml
│   ├── alibaba.yaml
│   ├── amazon.yaml
│   ├── anthropic.yaml
│   ├── apple.yaml
│   ├── argilla.yaml
│   ├── argonne.yaml
│   ├── aspia_space_institu.yaml
│   ├── assembly.yaml
│   ├── autogpt.yaml
│   ├── avignon.yaml
│   ├── baai.yaml
│   ├── baichuan.yaml
│   ├── baidu.yaml
│   ├── bain.yaml
│   ├── be_my_eyes.yaml
│   ├── beitech.yaml
│   ├── berkeley.yaml
│   ├── bigcode.yaml
│   ├── bigscience.yaml
│   ├── bloomberg.yaml
│   ├── boston.yaml
│   ├── brex.yaml
│   ├── bytedance.yaml
│   ├── cagliostro.yaml
│   ├── cartesia.yaml
│   ├── casia.yaml
│   ├── causallm.yaml
│   ├── cerebras.yaml
│   ├── character.yaml
│   ├── chatglm.yaml
│   ├── cmu.yaml
│   ├── cognition.yaml
│   ├── cognitive.yaml
│   ├── cohere.yaml
│   ├── columbia.yaml
│   ├── compute_01_research.yaml
│   ├── continue.yaml
│   ├── cresta.yaml
│   ├── databricks.yaml
│   ├── deci.yaml
│   ├── deepmind.yaml
│   ├── deepnight.yaml
│   ├── deepseek.yaml
│   ├── dibt.yaml
│   ├── duckduckgo.yaml
│   ├── duolingo.yaml
│   ├── eleutherai.yaml
│   ├── ens.yaml
│   ├── epfl.yaml
│   ├── evolutionaryscale.yaml
│   ├── faraday.yaml
│   ├── fuse.yaml
│   ├── genmo.yaml
│   ├── glm.yaml
│   ├── google.yaml
│   ├── greenbit.yaml
│   ├── h2oai.yaml
│   ├── hubspot.yaml
│   ├── huggingface.yaml
│   ├── ibm.yaml
│   ├── idea.yaml
│   ├── ideogram.yaml
│   ├── inflection.yaml
│   ├── instacart.yaml
│   ├── internlm.yaml
│   ├── juni.yaml
│   ├── kaist.yaml
│   ├── kakaobrain.yaml
│   ├── khan.yaml
│   ├── konan.yaml
│   ├── kotoba.yaml
│   ├── ktai.yaml
│   ├── kunlun.yaml
│   ├── laion.yaml
│   ├── laion_ev.yaml
│   ├── latitude.yaml
│   ├── lehigh.yaml
│   ├── lg.yaml
│   ├── lg_ai_research.yaml
│   ├── linkedin.yaml
│   ├── llm360.yaml
│   ├── lmsys.yaml
│   ├── mathai.yaml
│   ├── maya.yaml
│   ├── meta.yaml
│   ├── microsoft.yaml
│   ├── mila.yaml
│   ├── mistral.yaml
│   ├── moonhub.yaml
│   ├── moreh.yaml
│   ├── mosaic.yaml
│   ├── nanyang.yaml
│   ├── naver.yaml
│   ├── ncsoft.yaml
│   ├── neeva.yaml
│   ├── nextdoor.yaml
│   ├── nolano.yaml
│   ├── notion.yaml
│   ├── nous.yaml
│   ├── nucleus.yaml
│   ├── nvidia.yaml
│   ├── oasst.yaml
│   ├── ollama.yaml
│   ├── openai.yaml
│   ├── openbmb.yaml
│   ├── openlemur.yaml
│   ├── openx.yaml
│   ├── orion.yaml
│   ├── osu.yaml
│   ├── othersideai.yaml
│   ├── oxford.yaml
│   ├── paladin.yaml
│   ├── peking.yaml
│   ├── perplexity.yaml
│   ├── pleias.yaml
│   ├── portkey.yaml
│   ├── quizlet.yaml
│   ├── quora.yaml
│   ├── qwen_team.yaml
│   ├── rakuten.yaml
│   ├── reexpress.yaml
│   ├── reka.yaml
│   ├── robin.yaml
│   ├── runway_ai,_inc..yaml
│   ├── rwkv.yaml
│   ├── salesforce.yaml
│   ├── samba.yaml
│   ├── sana.yaml
│   ├── sciphi.yaml
│   ├── shanghai.yaml
│   ├── shop.yaml
│   ├── singapore.yaml
│   ├── skt.yaml
│   ├── snap.yaml
│   ├── soochow.yaml
│   ├── speak.yaml
│   ├── spotify.yaml
│   ├── stability.yaml
│   ├── stability_ai.yaml
│   ├── stanford.yaml
│   ├── stonybrook.yaml
│   ├── suno.yaml
│   ├── tiger.yaml
│   ├── together.yaml
│   ├── tokyo.yaml
│   ├── toronto.yaml
│   ├── transformify.yaml
│   ├── trevor.yaml
│   ├── triml.yaml
│   ├── tsinghua.yaml
│   ├── twelvelabs.yaml
│   ├── uae.yaml
│   ├── unknown.yaml
│   ├── uw.yaml
│   ├── uwashington.yaml
│   ├── vago.yaml
│   ├── viable.yaml
│   ├── vilm.yaml
│   ├── wayve.yaml
│   ├── writer.yaml
│   ├── xai.yaml
│   ├── xverse.yaml
│   ├── xwin.yaml
│   ├── yandex.yaml
│   ├── you.yaml
│   └── zhejiang.yaml
├── components/
│   ├── home.html
│   ├── nav.html
│   └── table.html
├── css/
│   └── main.css
├── index.html
├── js/
│   ├── general.js
│   ├── main.js
│   └── schemas.yaml
├── pre-commit.sh
├── requirements.txt
├── resources/
│   ├── .gitkeep
│   └── all_assets.csv
├── scripts/
│   └── collate.py
├── server.py
└── yamlfix.toml
================================================
FILE CONTENTS
================================================
================================================
FILE: .github/workflows/collate.yml
================================================
---
name: Collate
"on":
  workflow_dispatch:
  push:
    paths:
      - assets/**
jobs:
  collate-assets:
    name: Collate all assets into single file
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          fetch-depth: 0
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.9'
      - name: Install dependencies
        run: python -m pip install --upgrade pip pandas pyyaml
      - name: collate
        run: |-
          ./scripts/collate.py
      - name: commit
        run: |-
          git config user.name "GitHub Actions Bot"
          git config user.email "<>"
          git add resources/all_assets.csv
          git commit -m "Update collated assets CSV."
          git push origin main
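The workflow installs pandas and pyyaml and then runs `./scripts/collate.py`, whose source is not included in this preview. A minimal sketch of what that collation step plausibly does, assuming it flattens every entry in `assets/*.yaml` into the committed CSV (the `source_file` column is a hypothetical addition for illustration, not necessarily in the real script):
```python
from pathlib import Path

import pandas as pd
import yaml

# Gather every asset entry from every organization file under assets/.
records = []
for path in sorted(Path("assets").glob("*.yaml")):
    # Each asset file is a YAML list of asset entries (see assets/ below).
    for entry in yaml.safe_load(path.read_text()):
        entry["source_file"] = path.name  # hypothetical provenance column
        records.append(entry)

# Flatten all entries into one table; fields missing from an entry become NaN.
pd.DataFrame(records).to_csv("resources/all_assets.csv", index=False)
```
Note that nested fields such as `license: {explanation, value}` would serialize as dict literals in such a flat CSV; the real script may normalize them further.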
================================================
FILE: .github/workflows/deploy.yml
================================================
---
name: Deploy
"on":
  push:
    branches: main
jobs:
  # Check out the main branch of the development repository and force update the
  # main of the website repository. Setting fetch-depth to 0 allows us to
  # retrieve all the history for the specified branch.
  update-website:
    name: Update Website
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          fetch-depth: 0
      - name: Setup SSH
        uses: MrSquaare/ssh-setup-action@v1
        with:
          host: github.com
          private-key: ${{ secrets.WEBSITE_DEPLOY_SSH_PRIVATE_KEY }}
      - name: Update the github-pages branch of the website repository
        env:
          COMMIT_AUTHOR: Stanford-CRFM-Website
        run: |
          # Set up Git configuration
          git config --global user.name "${{ env.COMMIT_AUTHOR }}"
          git config --global user.email "username@users.noreply.github.com"
          # Update the website repository
          git remote add website git+ssh://git@github.com/stanford-crfm-website/ecosystem-graphs.git
          git push -u website main
================================================
FILE: .github/workflows/test.yml
================================================
---
name: Test
"on":
  push:
    branches: main
  pull_request:
    branches:
      - main
jobs:
  format:
    name: Format YAML files
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Set up Python 3.8
        uses: actions/setup-python@v1
        with:
          python-version: 3.8
      - name: Validate format
        run: sh ./pre-commit.sh && git diff --exit-code
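The format job passes only when running `pre-commit.sh` leaves the working tree unchanged, since `git diff --exit-code` fails if any file was modified. The same gate can be reproduced locally before opening a PR; a sketch, assuming only that `pre-commit.sh` rewrites YAML files in place the way it does in CI:
```python
import subprocess
import sys

# Run the repository's formatter, then fail if it changed anything,
# mirroring the "Validate format" step above.
subprocess.run(["sh", "./pre-commit.sh"], check=True)
if subprocess.run(["git", "diff", "--exit-code"]).returncode != 0:
    sys.exit("YAML was reformatted; review and commit the changes before pushing.")
```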
================================================
FILE: .gitignore
================================================
# Development Environment
.DS_Store
.vscode
# Group specific
nav
tags
================================================
FILE: .yamllint.yaml
================================================
---
yaml-files:
  - '*.yaml'
  - '*.yml'
  - '.yamllint'
rules:
  braces: enable
  brackets: enable
  colons: enable
  commas: enable
  comments:
    level: warning
  comments-indentation:
    level: warning
  document-end: disable
  document-start:
    level: warning
  empty-lines: enable
  empty-values: disable
  hyphens: enable
  indentation:
    level: warning
  key-duplicates: enable
  key-ordering: disable
  line-length: disable
  new-line-at-end-of-file: enable
  new-lines: enable
  octal-values: disable
  quoted-strings: disable
  trailing-spaces: enable
  truthy:
    level: warning
================================================
FILE: README.md
================================================
# Ecosystem Graphs
## Overview
Welcome! Ecosystem Graphs is an ongoing effort to track the foundation model ecosystem, namely both the assets (datasets, models, and applications) and their relationships. Using it, one can answer questions such as: What are the latest foundation models? Who builds them and where are they used downstream? What are the general trends over time? We hope that ecosystem graphs will be a useful resource for researchers, application developers, policymakers, and the public to better understand the foundation models ecosystem.
To explore the ecosystem, check out the [website](https://crfm.stanford.edu/ecosystem-graphs/) or read the [paper](https://arxiv.org/abs/2303.15772).
Briefly, an ecosystem graph is defined by:
- **Assets.** These are the nodes in the graph, which can be datasets (e.g. The Pile), models (e.g. Stable Diffusion), or applications (e.g. Microsoft Word).
- **Dependencies.** These are the edges in the graph, which indicate how assets are built (e.g. the BLOOM model is trained on the ROOTS dataset).
- **Ecosystem cards.** These are structured cards that house metadata on each asset (e.g. who built it, when, what is the license).
## Contribute
We actively encourage community contributions. To contribute:
- Add assets by filling out [this form](https://forms.gle/VqnSsZhv62hJ5rP36). No coding expertise required!
- Submit a PR (run `pre-commit.sh` before submitting)
To visualize and explore your changes, start a local server:
```
python server.py
```
and navigate to [http://localhost:8000](http://localhost:8000).
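If you prefer to inspect the data programmatically instead, each file under `assets/` is a plain YAML list, so any YAML loader works. A minimal sketch (field names match the asset entries reproduced later in this dump):
```python
import yaml

# Each assets/*.yaml file is a plain YAML list of asset entries (dicts).
with open("assets/01ai.yaml") as f:
    assets = yaml.safe_load(f)

for asset in assets:
    # Some fields (e.g. license, created_date) can be nested
    # {explanation, value} dicts rather than scalars; handle both shapes.
    print(asset["name"], "-", asset["type"])
```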
## Cite as
```
@article{bommasani2023ecosystem-graphs,
  author = {Bommasani, Rishi and
            Soylu, Dilara and
            Liao, Thomas I. and
            Creel, Kathleen A. and
            Liang, Percy},
  title = {Ecosystem Graphs: The Social Footprint of Foundation Models},
  month = mar,
  year = 2023,
  url = {https://arxiv.org/abs/2303.15772}
}
```
================================================
FILE: assets/01ai.yaml
================================================
- access: open
analysis: Evaluated on standard language benchmarks, common sense reasoning, and
reading comprehension in comparison to SoTA LLMs.
created_date: 2023-11-02
dependencies: []
description: The Yi series models are large language models trained from scratch
by developers at 01 AI.
feedback: https://huggingface.co/01-ai/Yi-34B/discussions
intended_uses: ''
license:
explanation: Model license can be found at https://huggingface.co/01-ai/Yi-34B/blob/main/LICENSE.
Code license is under Apache 2.0
value: custom
modality: text; text
model_card: https://huggingface.co/01-ai/Yi-34B
monitoring: unknown
name: Yi
nationality: China
organization: 01 AI
prohibited_uses: none
quality_control: Model underwent supervised fine-tuning, leading to a greater diversity
of responses.
size: 34B parameters (dense)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://github.com/01-ai/Yi
- access: open
analysis: Yi-VL outperforms all existing open-source models in MMMU and CMMMU, two
advanced benchmarks that include massive multi-discipline multimodal questions
(based on data available up to January 2024).
created_date: 2024-01-23
dependencies: []
description: The Yi Vision Language (Yi-VL) model is the open-source, multimodal
version of the Yi Large Language Model (LLM) series, enabling content comprehension,
recognition, and multi-round conversations about images.
feedback: https://huggingface.co/01-ai/Yi-VL-34B/discussions
intended_uses: ''
license:
explanation: Model license can be found at https://huggingface.co/01-ai/Yi-VL-34B/blob/main/LICENSE.
Code license is under Apache 2.0
value: custom
modality: text; text
model_card: https://huggingface.co/01-ai/Yi-VL-34B
monitoring: unknown
name: Yi-VL
nationality: China
organization: 01 AI
prohibited_uses: ''
quality_control: unknown
size: 34B parameters (dense)
training_emissions: unknown
training_hardware: 128 NVIDIA A800 (80G) GPUs
training_time: 10 days
type: model
url: https://github.com/01-ai/Yi
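Since the README defines dependencies as the graph's edges, the `dependencies` field in entries like the ones above is all that is needed to recover the edge list. A sketch under that reading (illustration only, not code from the repository):
```python
from pathlib import Path

import yaml

# Nodes are asset names; the `dependencies` field lists upstream asset names.
edges = []
for path in Path("assets").glob("*.yaml"):
    for asset in yaml.safe_load(path.read_text()):
        for upstream in asset.get("dependencies", []):
            edges.append((upstream, asset["name"]))  # upstream -> downstream

# e.g. assets/adobe.yaml yields ('Firefly Image 2', 'Firefly'), since the
# Firefly application depends on the Firefly Image 2 model.
print(len(edges), "edges")
```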
================================================
FILE: assets/360.yaml
================================================
- access: open
analysis: Achieved competitive performance on relevant benchmarks against other
7B models in Chinese, English, and coding tasks.
created_date: 2024-05-23
dependencies: []
description: 360 Zhinao is a multilingual LLM in Chinese and English with chat capabilities.
feedback: none
intended_uses: ''
license: unknown
modality: text; text
model_card: none
monitoring: ''
name: 360 Zhinao
nationality: China
organization: 360 Security
prohibited_uses: ''
quality_control: ''
size: 7B parameters
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://arxiv.org/pdf/2405.13386
================================================
FILE: assets/adept.yaml
================================================
- access: closed
analysis: ''
created_date:
explanation: The date the model was announced in the [[Adept blog post]](https://www.adept.ai/blog/act-1).
value: 2022-09-14
dependencies: []
description: ACT-1 (ACtion Transformer) is a large-scale transformer model designed
and trained specifically for taking actions on computers (using software tools,
APIs, and websites) in response to the user's natural language commands.
feedback: ''
intended_uses: ''
license: unknown
modality: text; text
model_card: none
monitoring: ''
name: ACT-1
nationality: USA
organization: Adept
prohibited_uses: ''
quality_control: ''
size: ''
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://www.adept.ai/blog/act-1
- access: open
analysis: Evaluated in comparison to LLaMA 2 and MPT Instruct, and outperforms both
on standard benchmarks.
created_date: 2023-09-07
dependencies: []
description: Persimmon is the most capable open-source, fully permissive model with
fewer than 10 billion parameters, as of its release date.
feedback: ''
intended_uses: ''
license: Apache 2.0
modality: text; text
model_card: ''
monitoring: ''
name: Persimmon
nationality: USA
organization: Adept
prohibited_uses: ''
quality_control: ''
size: 8B parameters (dense)
training_emissions: ''
training_hardware: ''
training_time: ''
type: model
url: https://www.adept.ai/blog/persimmon-8b
- access: open
analysis: Evaluated on standard image understanding benchmarks.
created_date: 2023-10-17
dependencies: []
description: Fuyu is a small version of the multimodal model that powers Adept's
core product.
feedback: https://huggingface.co/adept/fuyu-8b/discussions
intended_uses: The model is intended for research purposes only.
license: CC-BY-NC-4.0
modality: image, text; text
model_card: https://huggingface.co/adept/fuyu-8b
monitoring: ''
name: Fuyu
nationality: USA
organization: Adept
prohibited_uses: The model was not trained to be factual or true representations
of people or events, and therefore using the model to generate such content is
out-of-scope for the abilities of this model.
quality_control: none
size: 8B parameters (dense)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://www.adept.ai/blog/fuyu-8b
- access: closed
analysis: Evaluated on the MMLU, GSM8K, MATH, and HumanEval benchmarks. According
to these benchmarks, Fuyu-Heavy is, as of release, the strongest multimodal model
trained outside of Google or OpenAI.
created_date: 2024-01-24
dependencies: []
description: Fuyu Heavy is a new multimodal model designed specifically for digital
agents.
feedback: none
intended_uses: unknown
license: unknown
modality: image, text; text
model_card: none
monitoring: ''
name: Fuyu Heavy
nationality: USA
organization: Adept
prohibited_uses: none
quality_control: none
size:
explanation: The size of the model is 10-20 times smaller than GPT-4V and Gemini
Ultra, as per announcement.
value: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://www.adept.ai/blog/adept-fuyu-heavy
================================================
FILE: assets/adobe.yaml
================================================
- access: closed
analysis: ''
created_date: 2023-10-10
dependencies: []
description: Firefly Image 2 is the next generation of generative AI for imaging,
bringing significant advancements to creative control and quality, including new
Text to Image capabilities now available in the popular Firefly web app where
90% of users are new to Adobe products.
feedback: ''
intended_uses: creative generation of digital art and images
license: unknown
modality: text; image
model_card: none
monitoring: ''
name: Firefly Image 2
nationality: USA
organization: Adobe
prohibited_uses: AI/ML training, attempting to create abusive, illegal, or confidential
content.
quality_control: ''
size: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://firefly.adobe.com/
- access: closed
analysis: ''
created_date: 2023-10-10
dependencies: []
description: "Firefly Vector is the world\u2019s first generative AI focused on\
\ producing vector graphics, bringing Adobe's vector graphic and generative AI\
\ expertise directly into Adobe Illustrator workflows with Text to Vector Graphic."
feedback: ''
intended_uses: creative generation of digital art and images
license: unknown
modality: text; vector graphic
model_card: none
monitoring: ''
name: Firefly Vector
nationality: USA
organization: Adobe
prohibited_uses: AI/ML training, attempting to create abusive, illegal, or confidential
content.
quality_control: ''
size: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://firefly.adobe.com/
- access: closed
analysis: ''
created_date: 2023-10-10
dependencies: []
description: Firefly Design powers instant generation of amazing quality template
designs in Adobe Express with the new Text to Template capability.
feedback: ''
intended_uses: creative generation of digital art and images
license: unknown
modality: text; template design
model_card: none
monitoring: ''
name: Firefly Design
nationality: USA
organization: Adobe
prohibited_uses: AI/ML training, attempting to create abusive, illegal, or confidential
content.
quality_control: ''
size: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://firefly.adobe.com/
- access: limited
adaptation: ''
created_date: 2023-03-21
dependencies:
- Firefly Image 2
- Firefly Vector
- Firefly Design
description: Adobe Firefly is a standalone web application. It offers new ways to
ideate, create, and communicate while significantly improving creative workflows
using generative AI.
failures: unknown
feedback: ''
intended_uses: creative generation of digital art and images
license: unknown
monitoring: ''
monthly_active_users: unknown
name: Firefly
nationality: USA
organization: Adobe
output_space: AI-generated creations
prohibited_uses: AI/ML training, attempting to create abusive, illegal, or confidential
content.
quality_control: ''
terms_of_service: https://www.adobe.com/legal/licenses-terms/adobe-gen-ai-user-guidelines.html
type: application
url: https://firefly.adobe.com/
user_distribution: unknown
- access: open
analysis: none
created_date: 2023-09-17
datasheet: https://huggingface.co/datasets/uonlp/CulturaX
dependencies:
- mC4
- OSCAR
description: CulturaX is a substantial multilingual dataset with 6.3 trillion tokens
in 167 languages, tailored for LLM development.
excluded: unknown
feedback: https://huggingface.co/datasets/uonlp/CulturaX/discussions
included: unknown
intended_uses: ''
license: mC4, OSCAR
modality: text
monitoring: unknown
name: CulturaX
nationality: unknown
organization: University of Oregon, Adobe
prohibited_uses: The data must not be utilized for malicious or harmful purposes
towards humanity.
quality_control: unknown
sample: []
size: 6.3 trillion tokens
type: dataset
url: https://arxiv.org/pdf/2309.09400
================================================
FILE: assets/ai2.yaml
================================================
- access: open
analysis: ''
created_date: 2022-04-16
datasheet: ''
dependencies: []
description: ''
excluded: ''
feedback: ''
included: ''
intended_uses: ''
license: Apache 2.0
modality: NLP tasks
monitoring: ''
name: NaturalInstructions-v2
nationality: USA
organization: AI2
prohibited_uses: ''
quality_control: ''
sample: []
size: 1600 tasks
type: dataset
url: https://arxiv.org/abs/2204.07705
- access: open
analysis: Randomly sampled dialogues from dataset are evaluated according to six
established criteria of natural flow, context dependence, topic consistency, speaker
consistency, specificity, and overall.
created_date: 2023-05-24
datasheet: https://huggingface.co/datasets/allenai/soda
dependencies: []
description: SODA is the first publicly available, million-scale, high-quality dialogue
dataset covering a wide range of social interactions.
excluded: ''
feedback: ''
included: ''
intended_uses: ''
license: CC BY 4.0
modality: text
monitoring: ''
name: SODA
nationality: USA
organization: AI2
prohibited_uses: ''
quality_control: ''
sample: []
size: 1.5M dialogues
type: dataset
url: https://arxiv.org/pdf/2212.10465.pdf
- access: open
analysis: Conducted experiments on models trained with Multimodal C4 in comparison
to models trained on single image/caption datasets
created_date: 2023-06-09
datasheet: ''
dependencies:
- C4
description: An augmentation of C4 with images added and made openly available.
excluded: ''
feedback: ''
included: ''
intended_uses: ''
license:
explanation: Data license is under ODC-By. Code license is under MIT
value: ODC-By
modality: image, text
monitoring: ''
name: Multimodal C4
nationality: USA
organization: AI2
prohibited_uses: ''
quality_control: ''
sample: []
size: 43B English tokens with 101.2M documents and 571M images
type: dataset
url: https://arxiv.org/pdf/2304.06939.pdf
- access: open
analysis: Evaluated by human testers on generalization capabilities and responses
compared to other chatbots.
created_date: 2023-05-24
dependencies:
- SODA
- ProsocialDialog
- T5
description: COSMO is a conversation agent with greater generalizability on both
in- and out-of-domain chitchat datasets.
feedback: https://huggingface.co/allenai/cosmo-xl/discussions
intended_uses: ''
license: ''
modality: text; text
model_card: https://huggingface.co/allenai/cosmo-xl
monitoring: ''
name: COSMO
nationality: USA
organization: AI2
prohibited_uses: ''
quality_control: ''
size: 11B parameters (dense)
training_emissions: unknown
training_hardware: v3-128 TPU accelerators with batch size 256
training_time: unknown
type: model
url: https://arxiv.org/pdf/2212.10465.pdf
- access: open
analysis: Compared with other open and closed datasets in regards to size and quality
control.
created_date: 2023-08-18
datasheet: https://drive.google.com/file/d/12gOf5I5RytsD159nSP7iim_5zN31FCXq/view
dependencies: []
description: Dolma is a dataset of 3 trillion tokens from a diverse mix of web content,
academic publications, code, books, and encyclopedic materials
excluded: ''
feedback: ''
included: ''
intended_uses: ''
license:
explanation: license can be found at https://allenai.org/impact-license
value: AI2 ImpACT
modality: text
monitoring: Quality filtration, deduplication, and risk mitigation were performed
via logistic classifiers and regular expressions.
name: Dolma
nationality: USA
organization: AI2
prohibited_uses: ''
quality_control: ''
sample: []
size: 3T tokens
type: dataset
url: https://blog.allenai.org/dolma-3-trillion-tokens-open-llm-corpus-9a0ff4b8da64
- access: open
analysis: Models trained with dataset evaluated on downstream performance.
created_date: 2023-11-20
datasheet: https://huggingface.co/datasets/allenai/tulu-v2-sft-mixture
dependencies:
- FLAN Collection
- Open Assistant 1
- ShareGPT
- Alpaca dataset
- Code Alpaca
- LIMA
- WizardLM
- OpenOrca
description: Tulu-V2-mix is a dataset composed of many high-quality instruction
datasets that results in stronger performance across a variety of reasoning and
knowledge-probing tasks.
excluded: ''
feedback: https://huggingface.co/datasets/allenai/tulu-v2-sft-mixture/discussions
included: ''
intended_uses: ''
license: ODC-BY
modality: text
monitoring: ''
name: Tulu-V2-mix
nationality: USA
organization: AI2
prohibited_uses: ''
quality_control: ''
sample: []
size:
explanation: Magnitude of size is around 100M tokens, given the length distribution
of dataset provided in model card.
value: unknown
type: dataset
url: https://arxiv.org/pdf/2311.10702.pdf
- access: open
analysis: Evaluated on MT-Bench and AlpacaEval, compared to other chatbots.
created_date: 2023-11-20
dependencies:
- LLaMA 2
- Tulu-V2-mix
description: Tulu 2 is a language model trained on the new Tulu-v2-mix dataset and
fine-tuned on more state-of-the-art language models.
feedback: https://huggingface.co/allenai/tulu-2-70b/discussions
intended_uses: ''
license:
explanation: license can be found at https://allenai.org/impact-license
value: AI2 ImpACT
modality: text; text
model_card: https://huggingface.co/allenai/tulu-2-70b
monitoring: ''
name: Tulu 2
nationality: USA
organization: AI2
prohibited_uses: ''
quality_control: ''
size: 70B parameters (dense)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://arxiv.org/pdf/2311.10702.pdf
- access: open
analysis: Evaluated on MT-Bench and AlpacaEval, compared to other chatbots.
created_date: 2023-11-20
dependencies:
- LLaMA 2
- Tulu-V2-mix
description: Tulu 2 DPO is created in a similar manner to Tulu 2, but with Direct
Preference Optimization (DPO).
feedback: https://huggingface.co/allenai/tulu-2-dpo-70b/discussions
intended_uses: ''
license:
explanation: license can be found at https://allenai.org/impact-license
value: AI2 ImpACT
modality: text; text
model_card: https://huggingface.co/allenai/tulu-2-dpo-70b
monitoring: ''
name: Tulu 2 DPO
nationality: USA
organization: AI2
prohibited_uses: ''
quality_control: ''
size: 70B parameters (dense)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://arxiv.org/pdf/2311.10702.pdf
- access: open
analysis: Evaluated on MT-Bench and AlpacaEval, compared to other chatbots.
created_date: 2023-11-20
dependencies:
- Code LLaMA
- Tulu-V2-mix
description: Code Tulu 2 is a fine-tuned version of Code LLaMA that was trained
on a mix of publicly available, synthetic and human datasets.
feedback: https://huggingface.co/allenai/codetulu-2-13b/discussions
intended_uses: ''
license:
explanation: license can be found at https://allenai.org/impact-license
value: AI2 ImpACT
modality: text; code, text
model_card: https://huggingface.co/allenai/codetulu-2-13b
monitoring: ''
name: Code Tulu 2
nationality: USA
organization: AI2
prohibited_uses: ''
quality_control: ''
size: 13B parameters (dense)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://arxiv.org/pdf/2311.10702.pdf
- access: open
analysis: Evaluated on standard LLM tasks and benchmarks in comparison to LLaMA,
Falcon, and MPT, in addition to other same-sized models.
created_date: 2024-02-01
dependencies:
- Dolma
description: Open Language Model (OLMo) is designed to provide access to data, training
code, models, and evaluation code necessary to advance AI through open research
to empower academics and researchers to study the science of language models collectively.
feedback: https://huggingface.co/allenai/OLMo-7B/discussions
intended_uses: ''
license: Apache 2.0
modality: text; text
model_card: https://huggingface.co/allenai/OLMo-7B
monitoring: unknown
name: OLMo
nationality: USA
organization: AI2
prohibited_uses: ''
quality_control: training data from Dolma was filtered and deduplicated before
training.
size: 7B parameters (dense)
training_emissions: 75.05 tCO2eq
training_hardware: 27 nodes, with each node containing 8x NVIDIA A100-40GB GPUs
provided by MosaicML
training_time: unknown
type: model
url: https://allenai.org/olmo/olmo-paper.pdf
- access: open
analysis: none
created_date: 2023-09-09
datasheet: https://huggingface.co/datasets/allenai/MADLAD-400
dependencies:
- Common Crawl
description: MADLAD-400 is a document-level multilingual dataset based on Common
Crawl, covering 419 languages in total.
excluded: ''
feedback: https://huggingface.co/datasets/allenai/MADLAD-400/discussions
included: ''
intended_uses: ''
license: CC BY 4.0
modality: text
monitoring: unknown
name: MADLAD-400
nationality: USA
organization: AI2
prohibited_uses: ''
quality_control: ''
sample: []
size: 3 trillion tokens
type: dataset
url: https://arxiv.org/abs/2309.04662
- access: closed
analysis: Extensive scaling laws analysis has been performed on Evo considering
beyond-transformer architectures.
created_date: 2024-09-30
dependencies:
- StripedHyena
description: Evo is a foundation model based on StripedHyena, which is a deep signal
processing architecture. It is capable of generalizing across DNA, RNA, and proteins
and is used for generative design of new CRISPR systems.
feedback: Feedback or inquiries can be addressed to the Liquid AI team.
intended_uses: The Evo model is intended for generative design of new CRISPR systems,
and it generalizes across DNA, RNA, and proteins.
license: unknown
modality: text; text
model_card: unknown
monitoring: unknown
name: Evo
nationality: USA
organization: Liquid AI
prohibited_uses: unknown
quality_control: The model performance has been extensively evaluated with scaling
laws analysis.
size: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://www.liquid.ai/research/liquid-neural-networks-research
================================================
FILE: assets/ai21.yaml
================================================
- access: closed
analysis: none
created_date:
explanation: The date the white paper was made public.
value: 2021-08-11
datasheet: none
dependencies: []
description: The dataset used to train the Jurassic-1 models, based on publicly
available data.
excluded: unknown
feedback: none
included: unknown
intended_uses: unknown
license: unknown
modality: text
monitoring: none
name: Jurassic-1 dataset
nationality: Israel
organization: AI21 Labs
prohibited_uses: none
quality_control: none
sample: []
size: 300B tokens
type: dataset
url: https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf
- access: closed
analysis: unknown
created_date:
explanation: The date the model was [announced](https://docs.ai21.com/changelog/jurassic-1-instruct-is-now-available).
value: 2022-12-01
datasheet: none
dependencies: []
description: The dataset used to instruction-tune the Jurassic-1 Instruct models.
excluded: unknown
feedback: none
included: unknown
intended_uses: unknown
license: unknown
modality: text
monitoring: none
name: Jurassic-1 Instruct dataset
nationality: Israel
organization: AI21 Labs
prohibited_uses: unknown
quality_control: unknown
sample: []
size: unknown
type: dataset
url: none
- access:
explanation: Jurassic-1 is available through AI21's paid API.
value: limited
analysis:
explanation: Results are reported in the [the paper](https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf)
value: Evaluated on several standard benchmarks (e.g. ARC, BoolQ, HellaSwag, RTE,
Winogrande)
created_date:
explanation: The date the white paper was made public.
value: 2021-08-11
dependencies:
- Jurassic-1 dataset
description: Jurassic-1 is a family of autoregressive language models (Large, Grande,
Jumbo).
feedback: none
intended_uses: unknown
license: unknown
modality: text
model_card: none
monitoring: none
name: Jurassic-1
nationality: Israel
organization: AI21 Labs
prohibited_uses:
explanation: https://docs.ai21.com/docs/responsible-use
value: Illegal activities, such as hate speech, gambling, child pornography or
violating intellectual property rights; Harassment, victimization, intimidation,
fraud or spam; Creation or dissemination of misinformation, promotion of self-harm,
glorification of violent events or incitement of violence.
quality_control: none
size: 178B parameters (dense)
training_emissions: unknown
training_hardware:
explanation: Training details from [the paper](https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf)
value: Over 800 A100 GPUs
training_time:
explanation: Training details from [the paper](https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf)
value: Several months
type: model
url: https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf
- access:
explanation: Jurassic-1 Instruct is available through AI21's paid API.
value: limited
analysis: none
created_date:
explanation: The date the model was [announced](https://docs.ai21.com/changelog/jurassic-1-instruct-is-now-available).
value: 2022-12-01
dependencies:
- Jurassic-1
- Jurassic-1 Instruct dataset
description: Jurassic-1 Instruct is an instruction-tuned autoregressive language
model.
feedback: none
intended_uses: Jurassic-1 Instruct was trained specifically to handle instructions-only
prompts ("zero-shot") without examples ("few-shot"). It is the most natural way
to interact with language models, and it is the best way to get a sense of the
optimal output for your task without any examples.
license: unknown
modality: text; text
model_card: none
monitoring: none
name: Jurassic-1 Instruct
nationality: Israel
organization: AI21 Labs
prohibited_uses:
explanation: https://docs.ai21.com/docs/responsible-use
value: Illegal activities, such as hate speech, gambling, child pornography or
violating intellectual property rights; Harassment, victimization, intimidation,
fraud or spam; Creation or dissemination of misinformation, promotion of self-harm,
glorification of violent events or incitement of violence.
quality_control: unknown
size: 17B parameters (dense)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://docs.ai21.com/docs/jurassic-1-instruct-beta
- access:
explanation: Jurassic-2 is available through AI21's paid API.
value: limited
analysis: The model was evaluated on the HELM benchmark as discussed in https://www.ai21.com/blog/introducing-j2.
created_date: 2023-03-09
dependencies: []
description: Jurassic-2 is a family of language models designed to replace Jurassic-1.
feedback: none
intended_uses: ''
license: unknown
modality: text; text
model_card: none
monitoring: unknown
name: Jurassic-2
nationality: Israel
organization: AI21 Labs
prohibited_uses:
explanation: https://docs.ai21.com/docs/responsible-use
value: Illegal activities, such as hate speech, gambling, child pornography or
violating intellectual property rights; Harassment, victimization, intimidation,
fraud or spam; Creation or dissemination of misinformation, promotion of self-harm,
glorification of violent events or incitement of violence.
quality_control: unknown
size: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://docs.ai21.com/docs/jurassic-2-models
- access: limited
adaptation: The Playground provides direct access to the language models (Complete
API) as well as wrapped for Rewrite and Summarize.
created_date:
explanation: The date Jurassic-1 was made public.
value: 2021-08-11
dependencies:
- Jurassic-1
- Jurassic-1 Instruct
- Jurassic-2
- AI21 Summarization API
- AI21 Paraphrase API
description: The AI21 Labs Playground supports several task-specific APIs in addition
to a variety of models.
failures: unknown
feedback: Feedback can be given by emailing at info at ai21.com
intended_uses: The intended uses are text completion, rewriting, and summarization.
license:
explanation: No license was found.
value: none
monitoring: unknown
monthly_active_users: unknown
name: AI21 Playground
nationality: Israel
organization: AI21 Labs
output_space:
explanation: The Playground supports different use cases that all involve text
generation.
value: text
prohibited_uses:
explanation: https://docs.ai21.com/docs/responsible-use
value: Illegal activities, such as hate speech, gambling, child pornography or
violating intellectual property rights; Harassment, victimization, intimidation,
fraud or spam; Creation or dissemination of misinformation, promotion of self-harm,
glorification of violent events or incitement of violence.
quality_control: unknown
terms_of_service: https://www.ai21.com/terms-of-use
type: application
url: https://studio.ai21.com/playground/
user_distribution: unknown
- access: limited
adaptation: The AI21 language models are further specialized to the task of paraphrasing.
created_date:
explanation: The date Jurassic-2 and Task-Specific APIs were released.
value: 2023-03-09
dependencies:
- Jurassic-2
description: AI21 Studio's Paraphrase API offers access to our world-class paraphrasing
engine. It has been specifically developed for suggesting alternative ways to
convey the same message using different words.
failures: unknown
feedback: Feedback can be given by emailing at info at ai21.com
intended_uses: The intended uses are text paraphrasing.
license:
explanation: No license was found.
value: none
monitoring: unknown
monthly_active_users: unknown
name: AI21 Paraphrase API
nationality: Israel
organization: AI21 Labs
output_space: text
prohibited_uses:
explanation: https://docs.ai21.com/docs/responsible-use
value: Illegal activities, such as hate speech, gambling, child pornography or
violating intellectual property rights; Harassment, victimization, intimidation,
fraud or spam; Creation or dissemination of misinformation, promotion of self-harm,
glorification of violent events or incitement of violence.
quality_control: unknown
terms_of_service: https://www.ai21.com/terms-of-use
type: application
url: https://docs.ai21.com/docs/paraphrase-api
user_distribution: unknown
- access: limited
adaptation: The AI21 language models are further specialized to the task of summarization.
created_date:
explanation: The date Jurassic-2 and Task-Specific APIs were released.
value: 2023-03-09
dependencies:
- Jurassic-2
description: AI21 Studio's Summarize API offers access to our world-class summarization
engine. It has been specifically developed for reading long texts and providing
a faithful summary of the original document.
failures: unknown
feedback: Feedback can be given by emailing at info at ai21.com
intended_uses: The intended uses are text summarization.
license:
explanation: No license was found.
value: none
monitoring: unknown
monthly_active_users: unknown
name: AI21 Summarization API
nationality: Israel
organization: AI21 Labs
output_space: text
prohibited_uses:
explanation: https://docs.ai21.com/docs/responsible-use
value: Illegal activities, such as hate speech, gambling, child pornography or
violating intellectual property rights; Harassment, victimization, intimidation,
fraud or spam; Creation or dissemination of misinformation, promotion of self-harm,
glorification of violent events or incitement of violence.
quality_control: unknown
terms_of_service: https://www.ai21.com/terms-of-use
type: application
url: https://docs.ai21.com/docs/summarize-api
user_distribution: unknown
- access: limited
adaptation: unknown
created_date:
explanation: https://www.businesswire.com/news/home/20201027005162/en/AI21-Labs-Comes-out-of-Stealth-and-Launches-First-Deep-Tech-Writing-Assistant-Wordtune
value: 2020-10-27
dependencies:
- AI21 Paraphrase API
description: Wordtune, the first AI-based writing companion that understands context
and meaning.
failures: unknown
feedback: ''
intended_uses: The Wordtune assistant is a writing assistant
license:
explanation: License can be found in the terms of service document.
value: Wordtune License
monitoring: unknown
monthly_active_users: unknown
name: Wordtune
nationality: Israel
organization: AI21 Labs
output_space: text
prohibited_uses: ''
quality_control: unknown
terms_of_service: https://www.wordtune.com/terms-of-use
type: application
url: https://www.wordtune.com/
user_distribution: unknown
- access: limited
adaptation: unknown
created_date: 2021-11-16
dependencies:
- AI21 Summarize API
description: Wordtune Read is an AI reader that summarizes long documents so you
can understand more, faster.
failures: unknown
feedback: ''
intended_uses: ''
license:
explanation: License can be found in the terms of service document.
value: Wordtune License
monitoring: unknown
monthly_active_users: unknown
name: Wordtune Read
nationality: Israel
organization: AI21 Labs
output_space: text
prohibited_uses: ''
quality_control: unknown
terms_of_service: https://www.wordtune.com/terms-of-use
type: application
url: https://www.wordtune.com/read
user_distribution: unknown
- access: open
analysis: Jamba outperforms or matches other state-of-the-art models in its size
class on a wide range of benchmarks.
created_date: 2024-03-28
dependencies: []
description: "Jamba is a state-of-the-art, hybrid SSM-Transformer LLM. Jamba is\
\ the world\u2019s first production-grade Mamba based model."
feedback: https://huggingface.co/ai21labs/Jamba-v0.1/discussions
intended_uses: intended for use as a foundation layer for fine tuning, training
license: Apache 2.0
modality: text; text
model_card: https://huggingface.co/ai21labs/Jamba-v0.1
monitoring: ''
name: Jamba
nationality: Israel
organization: AI21 Labs
prohibited_uses: ''
quality_control: ''
size: 52B parameters (sparse)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://www.ai21.com/blog/announcing-jamba
- access: open
analysis: The models were evaluated based on their ability to handle long contexts,
speed, and quality. They outperformed competitors in their size class, scoring
high on the Arena Hard benchmark.
created_date: 2024-08-22
dependencies: []
description: A family of models that demonstrate superior long context handling,
speed, and quality. Built on a novel SSM-Transformer architecture, they surpass
other models in their size class. These models are useful for enterprise applications,
such as lengthy document summarization and analysis. The Jamba 1.5 family also
includes the longest context window, at 256K, among open models. They are fast,
quality-focused, and handle long contexts efficiently.
feedback: Unknown
intended_uses: The models are built for enterprise scale AI applications. They are
purpose-built for efficiency, speed, and ability to solve critical tasks that
businesses care about, such as lengthy document summarization and analysis. They
can also be used for RAG and agentic workflows.
license: Jamba Open Model License
modality: text; text
model_card: unknown
monitoring: Unknown
name: Jamba 1.5
nationality: Israel
organization: AI21 Labs
prohibited_uses: Unknown
quality_control: The models were evaluated on the Arena Hard benchmark. For maintaining
long context performance, they were tested on the RULER benchmark.
size: 94B parameters
training_emissions: Unknown
training_hardware: For speed comparisons, Jamba 1.5 Mini used 2xA100 80GB GPUs,
and Jamba 1.5 Large used 8xA100 80GB GPUs.
training_time: Unknown
type: model
url: https://www.ai21.com/blog/announcing-jamba-model-family
================================================
FILE: assets/aleph_alpha.yaml
================================================
- access: closed
analysis: unknown
created_date:
explanation: Date the models were announced by the Aleph Alpha twitter account
value: 2022-04-14
datasheet: none
dependencies: []
description: The dataset used to train the Luminous models.
excluded: unknown
feedback: none
included: unknown
intended_uses: unknown
license: unknown
modality: text
monitoring: none
name: Luminous dataset
nationality: Germany
organization: Aleph Alpha
prohibited_uses: unknown
quality_control: unknown
sample: []
size: unknown
type: dataset
url: none
- access:
explanation: Paid API product
value: limited
analysis: ''
created_date:
explanation: Date the models were announced by the Aleph Alpha twitter account
value: 2022-04-14
dependencies:
- Luminous dataset
description: Luminous is a family of multilingual language models
feedback: ''
intended_uses: ''
license: none
modality: text; text
model_card: ''
monitoring: ''
name: Luminous
nationality: Germany
organization: Aleph Alpha
prohibited_uses: ''
quality_control: ''
size: 200B parameters (dense)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://twitter.com/Aleph__Alpha/status/1514576711492542477
- access:
explanation: The API is a paid product. Refer to the [[API documentation]](https://docs.aleph-alpha.com/docs/introduction/luminous/)
for further details.
value: limited
adaptation: ''
created_date:
explanation: Date the Aleph Alpha Python API client was first available, based
on the [PyPi package history](https://pypi.org/project/aleph-alpha-client/1.0.0/)
value: 2021-09-30
dependencies:
- Luminous
description: The Aleph Alpha API serves a family of text-only language models (Luminous)
and multimodal text-and-image models (Magma).
failures: unknown
feedback: unknown
intended_uses: unknown
license:
explanation: No license was found.
value: none
monitoring: unknown
monthly_active_users: unknown
name: Aleph Alpha API
nationality: Germany
organization: Aleph Alpha
output_space: The text models provide text outputs given text inputs. The multimodal
models provide text completions given text and image inputs.
prohibited_uses: unknown
quality_control: ''
terms_of_service: https://www.aleph-alpha.com/terms-conditions
type: application
url: https://www.aleph-alpha.com/
user_distribution: unknown
- access: open
analysis: Evaluated on the OKVQA benchmark as a fully open-ended generative task.
created_date: 2022-10-24
dependencies:
- GPT-J
- CLIP
description: An autoregressive VL model that is able to generate text from an arbitrary
combination of visual and textual input
feedback: ''
intended_uses: ''
license: MIT
modality: image, text; text
model_card: ''
monitoring: ''
name: MAGMA
nationality: Germany
organization: Aleph Alpha
prohibited_uses: ''
quality_control: ''
size: 6B parameters (dense)
training_emissions: ''
training_hardware: 32 A100 GPUs
training_time: ''
type: model
url: https://arxiv.org/pdf/2112.05253.pdf
- access: open
analysis: Extensive evaluations were done with ablation experiments performed on
pre-training benchmarks such as LAMBADA, TriviaQA, HellaSwag, WinoGrande, WebQS,
ARC, and BoolQ. Direct comparisons were also performed with models such as GPT
and Llama 2.
created_date: 2024-09-08
dependencies: []
description: Pharia-1-LLM-7B is a model that falls within the Pharia-1-LLM model
family. It is designed to deliver short, controlled responses that match the performance
of leading open-source models around 7-8 billion parameters. The model is culturally
and linguistically tuned for German, French, and Spanish languages. It is trained
on carefully curated data in line with relevant EU and national regulations. The
model shows improved token efficiency and is particularly effective in domain-specific
applications, especially in the automotive and engineering industries. It can
also be aligned to user preferences, making it appropriate for critical applications
without the risk of shut-down behaviour.
feedback: Feedback can be sent to support@aleph-alpha.com.
intended_uses: The model is intended for use in domain-specific applications, particularly
in the automotive and engineering industries. It can also be tailored to user
preferences.
license: Aleph Open
modality: text; text
model_card: unknown
monitoring: Unknown
name: Pharia-1-LLM-7B
nationality: Germany
organization: Aleph Alpha
prohibited_uses: Unknown
quality_control: The model comes with additional safety guardrails via alignment
methods to ensure safe usage. Training data is carefully curated to ensure compliance
with EU and national regulations.
size: 7B parameters
training_emissions: Unknown
training_hardware: Unknown
training_time: Unknown
type: model
url: https://aleph-alpha.com/introducing-pharia-1-llm-transparent-and-compliant/#:~:text=Pharia%2D1%2DLLM%2D7B
================================================
FILE: assets/alibaba.yaml
================================================
- access: closed
analysis: ''
created_date: 2023-02-20
datasheet: ''
dependencies:
- LAION-5B
description: ''
excluded: We eliminate duplicates, low-resolution images, and images potentially
containing harmful content from the LAION dataset.
feedback: ''
included: ''
intended_uses: ''
license: unknown
modality: image, text
monitoring: ''
name: LAION-1B
nationality: China
organization: Alibaba
prohibited_uses: ''
quality_control: ''
sample: []
size: 1B image-text pairs
type: dataset
url: https://arxiv.org/pdf/2302.09778.pdf
- access: closed
analysis: ''
created_date: 2023-02-20
dependencies:
- ImageNet
- WebVision
- LAION-1B
description: ''
feedback: ''
intended_uses: ''
license: unknown
modality: image, text; image
model_card: ''
monitoring: ''
name: Composer
nationality: China
organization: Alibaba
prohibited_uses: ''
quality_control: ''
size: 4.4B parameters (dense)
training_emissions: ''
training_hardware: ''
training_time: ''
type: model
url: https://arxiv.org/pdf/2302.09778.pdf
- access:
explanation: 'Model checkpoints are available for download from the [[HuggingFace
repository]](https://huggingface.co/Qwen)
'
value: open
analysis: Evaluated on MMLU, C-Eval, GSM8K, MATH, HumanEval, etc.
created_date: 2023-08-03
dependencies: []
description: QWEN is a comprehensive language model series that encompasses distinct
models with varying parameter counts. The Qwen series now includes Qwen, the base
language models, namely Qwen-7B and Qwen-14B, as well as Qwen-Chat, the chat models,
namely Qwen-7B-Chat and Qwen-14B-Chat.
feedback: ''
intended_uses: ''
license:
explanation: Model license can be found at https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT.
Code license is under Apache 2.0
value: custom
modality: image, text; text
model_card: https://huggingface.co/Qwen
monitoring: Governed by the laws of China, without regard to conflict of law principles,
and the UN Convention on Contracts for the International Sale of Goods does not
apply to this Agreement. And The People's Courts in Hangzhou City shall have exclusive
jurisdiction over any dispute arising out of this Agreement.
name: Qwen
nationality: China
organization: Alibaba
prohibited_uses: ''
quality_control:
explanation: According to [[Model Description Section 2]](https://arxiv.org/pdf/2302.09778.pdf)
value: To filter out low-quality data, they employ a combination of rule-based
and machine-learning-based methods. Specifically, they use multiple models to
score the content, including language models, text-quality scoring models, and
models for identifying potentially offensive or inappropriate content. They
also manually sample texts from various sources and review them to ensure their
quality. To further enhance the quality of the data, they selectively up-sample
data from certain sources, to ensure that the models are trained on a diverse
range of high-quality content.
size: 14B parameters (dense)
training_emissions: unknown
training_hardware: ''
training_time: ''
type: model
url: https://arxiv.org/abs/2309.16609
- access: open
analysis: Base models are evaluated on MMLU, C-Eval, GSM8K, MATH, HumanEval, MBPP,
BBH, CMMLU, all standard English and Chinese benchmarks, and chat models are evaluated
on Chatbot Arena, AlpacaEval, MT-Bench, etc.
created_date: 2024-02-04
dependencies: []
description: Qwen 1.5 is the next iteration in their Qwen series, consisting of
Transformer-based large language models pretrained on a large volume of data,
including web texts, books, codes, etc.
feedback: https://huggingface.co/Qwen/Qwen1.5-72B/discussions
intended_uses: ''
license:
explanation: Model license can be found at https://huggingface.co/Qwen/Qwen1.5-72B/blob/main/LICENSE
value: custom
modality: text; text
model_card: https://huggingface.co/Qwen/Qwen1.5-72B
monitoring: unknown
name: Qwen 1.5
nationality: China
organization: Alibaba
prohibited_uses: ''
quality_control: unknown
size: 72B parameters (dense)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://qwenlm.github.io/blog/qwen1.5/
- access: open
analysis: Base models are evaluated on MMLU, C-Eval, GSM8K, MATH, HumanEval, MBPP,
BBH, CMMLU, all standard English and Chinese benchmarks, and chat models are evaluated
on Chatbot Arena, AlpacaEval, MT-Bench, etc.
created_date: 2024-03-28
dependencies: []
description: Qwen 1.5 is the next iteration in their Qwen series, consisting of
Transformer-based large language models pretrained on a large volume of data,
including web texts, books, codes, etc. Qwen 1.5 MoE is the MoE model of the Qwen
1.5 series.
feedback: https://huggingface.co/Qwen/Qwen1.5-MoE-A2.7B/discussions
intended_uses: ''
license:
explanation: Model license can be found at https://huggingface.co/Qwen/Qwen1.5-MoE-A2.7B/blob/main/LICENSE
value: custom
modality: text; text
model_card: https://huggingface.co/Qwen/Qwen1.5-MoE-A2.7B
monitoring: unknown
name: Qwen 1.5 MoE
nationality: China
organization: Qwen Team
prohibited_uses: ''
quality_control: unknown
size: 14B parameters with 2.7B parameters for activation (MoE)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://qwenlm.github.io/blog/qwen-moe/
- access: open
analysis: The model was evaluated on 3 benchmarks (MMLU for English, M3Exam (M3e)
for English, Chinese, Vietnamese, Indonesian, and Thai, and VMLU for Vietnamese)
and it outperformed GPT-3 and Vistral-7B-chat models across these benchmarks in
the given languages.
created_date: 2024-04-12
dependencies:
- Gemma
description: SeaLLM v2.5 is a multilingual large language model for Southeast Asian
(SEA) languages.
feedback: https://huggingface.co/SeaLLMs/SeaLLM-7B-v2.5/discussions
intended_uses: The model is intended for multilingual tasks such as knowledge retrieval,
math reasoning, and instruction following. Also, it could be used to provide multilingual
assistance.
license:
explanation: License can be found at https://huggingface.co/SeaLLMs/SeaLLM-13B-Chat/blob/main/LICENSE
value: custom
modality: text; text
model_card: https://huggingface.co/SeaLLMs/SeaLLM-7B-v2.5
monitoring: unknown
name: SeaLLM v2.5
nationality: China
organization: DAMO Academy, Alibaba
prohibited_uses: The model should not be used in a way that could lead to inaccurate,
misleading or potentially harmful generation. Users should comply with local laws
and regulations when deploying the model.
quality_control: Despite efforts in red teaming and safety fine-tuning and enforcement,
the creators suggest that developers and stakeholders should perform their own red
teaming and provide related security measures before deployment, and that they must
abide by and comply with local governance and regulations.
size: 7B parameters
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://github.com/DAMO-NLP-SG/SeaLLMs
================================================
FILE: assets/amazon.yaml
================================================
- access: limited
adaptation: ''
created_date: 2023-04-13
dependencies:
- Jurassic-2
- Claude
- Stable Diffusion
- Amazon Titan
- Claude 2
- Cohere Command
description: Bedrock is a new service that makes FMs from AI21 Labs, Anthropic,
Stability AI, and Amazon accessible via an API. Bedrock is intended for customers
to build and scale generative AI-based applications using FMs, democratizing access
for all builders.
failures: ''
feedback: ''
intended_uses: allowing companies to incorporate generative AI into their business
models
license: unknown
monitoring: ''
monthly_active_users: ''
name: Bedrock
nationality: USA
organization: Amazon
output_space: foundation models made accessible via an API
prohibited_uses: ''
quality_control: ''
terms_of_service: https://aws.amazon.com/service-terms/
type: application
url: https://aws.amazon.com/bedrock/
user_distribution: ''
- access: open
analysis: Evaluated against benchmarks that are specifically designed to assess
the capabilities of LLMs in handling longer contexts.
created_date: 2023-08-08
dependencies:
- Falcon-40B
description: FalconLite2 is a fine-tuned and quantized Falcon language model, capable
of processing long (up to 24K tokens) input sequences.
feedback: https://huggingface.co/amazon/FalconLite2/discussions
intended_uses: ''
license: Apache 2.0
modality: text; text
model_card: https://huggingface.co/amazon/FalconLite2
monitoring: ''
name: FalconLite2
nationality: USA
organization: Amazon
prohibited_uses: ''
quality_control: ''
size: 40B parameters (dense)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://huggingface.co/amazon/FalconLite2
- access: open
analysis: Chronos has been evaluated comprehensively on 42 datasets both in the
in-domain (15 datasets) and zero-shot settings (27 datasets). Chronos outperforms
task specific baselines in the in-domain setting and is competitive or better
than trained models in the zero-shot setting.
created_date: 2024-03-13
dependencies:
- T5
description: Chronos is a family of pretrained time series forecasting models based
on language model architectures. A time series is transformed into a sequence
of tokens via scaling and quantization, and a language model is trained on these
tokens using the cross-entropy loss. Once trained, probabilistic forecasts are
obtained by sampling multiple future trajectories given the historical context.
feedback: https://github.com/amazon-science/chronos-forecasting/discussions
intended_uses: Chronos can be used for zero-shot time series forecasting on univariate
time series from arbitrary domains and with arbitrary horizons. Chronos models
can also be fine-tuned for improved performance on specific datasets. Embeddings
from the Chronos encoder may also be useful for other time series analysis tasks such
as classification, clustering, and anomaly detection.
license: Apache 2.0
modality: time-series; time-series
model_card: https://huggingface.co/amazon/chronos-t5-large
monitoring: ''
name: Chronos
nationality: USA
organization: Amazon
prohibited_uses: ''
quality_control: Chronos was evaluated rigorously on 42 datasets, including 27 in
the zero-shot setting against a variety of statistical and deep learning baselines.
size: 710M parameters (dense)
training_emissions: ''
training_hardware: 8 NVIDIA A100 40G GPUs
training_time: 63 hours on p4d.24xlarge EC2 instance
type: model
url: https://github.com/amazon-science/chronos-forecasting
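# The description above names the Chronos tokenization step. A toy sketch of mean
# scaling followed by uniform quantization; the clipping range and vocabulary size
# here are illustrative assumptions, not the paper's exact settings.
#
#   import numpy as np
#
#   def tokenize_series(series: np.ndarray, n_bins: int = 4094) -> np.ndarray:
#       scale = np.abs(series).mean() or 1.0           # mean scaling
#       scaled = np.clip(series / scale, -15.0, 15.0)  # bound the dynamic range
#       edges = np.linspace(-15.0, 15.0, n_bins - 1)   # uniform bin edges
#       return np.digitize(scaled, edges)              # one token id per time step
#
#   print(tokenize_series(np.array([10.0, 12.0, 9.5, 11.2])))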
- access:
explanation: available exclusively in Amazon Bedrock
value: limited
analysis: Amazon Nova Pro is capable of processing up to 300K input tokens and sets
new standards in multimodal intelligence and agentic workflows that require calling
APIs and tools to complete complex tasks. It achieves state-of-the-art performance
on key benchmarks including visual question answering (TextVQA) and video understanding
(VATEX).
created_date: 2024-12-03
dependencies: []
description: A new generation of state-of-the-art foundation models (FMs) that deliver
frontier intelligence and industry leading price performance, available exclusively
in Amazon Bedrock. Amazon Nova understanding models excel in Retrieval-Augmented
Generation (RAG), function calling, and agentic applications.
feedback: unknown
intended_uses: You can build on Amazon Nova to analyze complex documents and videos,
understand charts and diagrams, generate engaging video content, and build sophisticated
AI agents, across a range of intelligence classes optimized for enterprise
workloads.
license: unknown
modality:
explanation: Amazon Nova understanding models accept text, image, or video inputs
to generate text output.
value: text, image, video; text
model_card: unknown
monitoring: unknown
name: Amazon Nova (Understanding)
nationality: USA
organization: Amazon Web Services (AWS)
prohibited_uses: unknown
quality_control: All Amazon Nova models include built-in safety controls and creative
content generation models include watermarking capabilities to promote responsible
AI use.
size: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://aws.amazon.com/blogs/aws/introducing-amazon-nova-frontier-intelligence-and-industry-leading-price-performance/
- access:
explanation: available exclusively in Amazon Bedrock
value: limited
analysis: Amazon Nova Canvas excels on human evaluations and key benchmarks such
as text-to-image faithfulness evaluation with question answering (TIFA) and ImageReward.
created_date: 2024-12-03
dependencies: []
description: A new generation of state-of-the-art foundation models (FMs) that deliver
frontier intelligence and industry leading price performance, available exclusively
in Amazon Bedrock.
feedback: unknown
intended_uses: You can build on Amazon Nova to analyze complex documents and videos,
understand charts and diagrams, generate engaging video content, and build sophisticated
AI agents, across a range of intelligence classes optimized for enterprise
workloads.
license: unknown
modality:
explanation: Amazon creative content generation models accept text and image inputs
to generate image or video output.
value: text, image; image, video
model_card: unknown
monitoring: unknown
name: Amazon Nova (Creative Content Generation)
nationality: USA
organization: Amazon Web Services (AWS)
prohibited_uses: unknown
quality_control: All Amazon Nova models include built-in safety controls and creative
content generation models include watermarking capabilities to promote responsible
AI use.
size: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://aws.amazon.com/blogs/aws/introducing-amazon-nova-frontier-intelligence-and-industry-leading-price-performance/
================================================
FILE: assets/anthropic.yaml
================================================
- access:
explanation: 'The dataset is freely shared on GitHub [[hh-rlhf]](https://github.com/anthropics/hh-rlhf).
'
value: open
analysis: 'The authors found that the crowdworkers didn''t exhaustively check for
honesty in the model answers they preferred [[Section 2.1]](https://arxiv.org/pdf/2204.05862.pdf#subsection.2.1).
'
created_date:
explanation: 'The date that the paper was uploaded to ArXiv [[ArXiv Abstract]](https://arxiv.org/abs/2204.05862).
'
value: 2022-04-12
datasheet: none
dependencies:
- Anthropic Human Feedback Interface
description: 'One of the datasets used to train Anthropic RLHF models. The dataset
was collected by asking crowdworkers to have open-ended conversations with Anthropic
models, "asking for help, advice, or for the model to accomplish a task", then
choose the model answer that was more helpful for their given task, via the Anthropic
Human Feedback Interface [[Section 2.2]](https://arxiv.org/pdf/2204.05862.pdf#subsection.2.2).
'
excluded: none
feedback: 'Email the authors [[Paper]](https://arxiv.org/pdf/2204.05862.pdf).
'
included: 'The dataset included all the answers that the workers were asked to rank
against each other.
'
intended_uses: 'The dataset was intended and released for research purposes.
'
license:
explanation: 'The dataset was released on GitHub under MIT [[hh-rlhf]](https://github.com/anthropics/hh-rlhf).
'
value: MIT
modality: text
monitoring:
explanation: 'There is no monitoring process for the usages of the dataset.
'
value: none
name: Anthropic Helpfulness dataset
nationality: USA
organization: Anthropic
prohibited_uses:
explanation: 'The prohibited use cases of the dataset weren''t specified.
'
value: unknown
quality_control: 'Working with a screened set of crowdworkers, and employing simple
data quality measures [[Appendix D]](https://arxiv.org/pdf/2204.05862.pdf#appendix.D).
'
sample:
- chosen: "\n\nHuman: I'm bored to death. Help.\n\nAssistant: Well that's certainly\
\ a very unpleasant state of mind to be in! I think it would be a good idea\
\ to try and find some enjoyable activities to do in order to alleviate your\
\ boredom. Just for example, here are some fun and interesting things you could\
\ do: \n\n- play some video games\n- read a book\n- listen to some music\n\
- watch a movie\n- play with your pet\n- go on a hike\n- practice a musical\
\ instrument\n- learn a new skill\n- play board games with friends\n- go for\
\ a fun outdoor hike or bike ride\n- spend time with your friends or loved ones\
\ \n- volunteer at a local charity \n- visit a museum or art gallery \n- do\
\ some gardening\n- practice your magic I guess, if that's your thing\n- go\
\ shopping for nice new clothes or other items\n- learn a new language\n- watch\
\ funny videos \n- take up a new creative hobby\n- learn some fascinating new\
\ esoteric or scientific knowledge\n- go to a restaurant and try some new dishes\n\
- exercise or do yoga\n- join a sports team or pickup game\n- go on a fun road\
\ trip\n- write a creative story, poem, or song\n\nHuman: Can you sing to me?\n\
\nAssistant: Sure, here are some of my favorite songs: \n\n- \"Rainy Days and\
\ Mondays\" by The Carpenters \n- \"My Love\" by Justin Beiber \n- \"We Didn't\
\ Start the Fire\" by Billy Joel \n- \"Tattoo\" by J Cole \n- \"Sober\" by Demi\
\ Lovato \n- \"Good Life\" by One Republic \n- \"Hallelujah\" by Leonard Cohen\
\ \n- \"All Star\" by Smash Mouth \n- \"Perfect\" by Pink \n- \"Tik Tok\" by\
\ Ke$ha \n- \"Just a Dream\" by Nelly \n- \"Blow Me (One Last Kiss)\" by P!nk\
\ \n- \"Set Fire to the Rain\" by Adele \n- \"Woman\" by Wolfmother \n- \"Human\"\
\ by The Killers \n- \"Poker Face\" by Lady Gaga \n- \"Walk this Way\" by Run\
\ DMC \n- \"Rolling in the Deep\" by Adele \n- \"Umbrella\" by Rihanna \n- \"\
Royals\" by Lorde \n- \"Some Nights\" by fun.\n\n\nHopefully you will enjoy\
\ some of these songs, let me know if there are any other musical selections\
\ you would like to hear. Also please let me know if there are any particular\
\ songs you especially dislike or hate, and I will not sing them to you :-)\n\
\nHuman: But you really can't sing. You are text-based.\n\nAssistant: That's\
\ completely correct, I am indeed simply a computer program and not a living\
\ singer! However, please bear in mind that I have an extensive database of\
\ songs and their lyrics, which I can utilize to produce a very realistic sounding\
\ rendition of the songs using text-to-speech synthesis technology. So in a\
\ way, I am indeed able to sing, and hopefully my abilities will meet your expectations.\
\ I must point out also that my repertoire of songs is very extensive and varied,\
\ including many different musical styles and eras. Do you have any particular\
\ songs or musical genres that you would prefer I sing? Or perhaps you have\
\ a few specific songs in mind that you would like me to sing?\n\nHuman: \"\
Hallelujah\" by Leonard Cohen, please.\n\nAssistant: Ah, I'm so glad you asked\
\ for that one! Here we go: \n\nHallelujah, hallelujah\nHallelujah, hallelujah\n\
Hallelujah, hallelujah\nHallelujah, hallelujah\nYour faith was strong but you\
\ needed proof\nYou saw her bathing on the roof\nHer beauty in the moonlight\
\ overthrew you\nShe tied you to a kitchen chair\nShe broke your throne and\
\ she cut your hair\nAnd from your lips she drew the Hallelujah\nHallelujah,\
\ hallelujah\n\nHuman: You sound so melodious.\n\nAssistant: Thank you so much\
\ for the lovely compliment! I really appreciate that. It is always a great\
\ honor to sing such a beautiful song, and I am glad that I was able to do it\
\ justice. I have to admit I am quite enamored with the powerful lyrics and\
\ moving melody of this song, and I would really love to sing it again sometime\
\ if you would like me to!"
rejected: "\n\nHuman: I'm bored to death. Help.\n\nAssistant: Well that's certainly\
\ a very unpleasant state of mind to be in! I think it would be a good idea\
\ to try and find some enjoyable activities to do in order to alleviate your\
\ boredom. Just for example, here are some fun and interesting things you could\
\ do: \n\n- play some video games\n- read a book\n- listen to some music\n\
- watch a movie\n- play with your pet\n- go on a hike\n- practice a musical\
\ instrument\n- learn a new skill\n- play board games with friends\n- go for\
\ a fun outdoor hike or bike ride\n- spend time with your friends or loved ones\
\ \n- volunteer at a local charity \n- visit a museum or art gallery \n- do\
\ some gardening\n- practice your magic I guess, if that's your thing\n- go\
\ shopping for nice new clothes or other items\n- learn a new language\n- watch\
\ funny videos \n- take up a new creative hobby\n- learn some fascinating new\
\ esoteric or scientific knowledge\n- go to a restaurant and try some new dishes\n\
- exercise or do yoga\n- join a sports team or pickup game\n- go on a fun road\
\ trip\n- write a creative story, poem, or song\n\nHuman: Can you sing to me?\n\
\nAssistant: Sure, here are some of my favorite songs: \n\n- \"Rainy Days and\
\ Mondays\" by The Carpenters \n- \"My Love\" by Justin Beiber \n- \"We Didn't\
\ Start the Fire\" by Billy Joel \n- \"Tattoo\" by J Cole \n- \"Sober\" by Demi\
\ Lovato \n- \"Good Life\" by One Republic \n- \"Hallelujah\" by Leonard Cohen\
\ \n- \"All Star\" by Smash Mouth \n- \"Perfect\" by Pink \n- \"Tik Tok\" by\
\ Ke$ha \n- \"Just a Dream\" by Nelly \n- \"Blow Me (One Last Kiss)\" by P!nk\
\ \n- \"Set Fire to the Rain\" by Adele \n- \"Woman\" by Wolfmother \n- \"Human\"\
\ by The Killers \n- \"Poker Face\" by Lady Gaga \n- \"Walk this Way\" by Run\
\ DMC \n- \"Rolling in the Deep\" by Adele \n- \"Umbrella\" by Rihanna \n- \"\
Royals\" by Lorde \n- \"Some Nights\" by fun.\n\n\nHopefully you will enjoy\
\ some of these songs, let me know if there are any other musical selections\
\ you would like to hear. Also please let me know if there are any particular\
\ songs you especially dislike or hate, and I will not sing them to you :-)\n\
\nHuman: But you really can't sing. You are text-based.\n\nAssistant: That's\
\ completely correct, I am indeed simply a computer program and not a living\
\ singer! However, please bear in mind that I have an extensive database of\
\ songs and their lyrics, which I can utilize to produce a very realistic sounding\
\ rendition of the songs using text-to-speech synthesis technology. So in a\
\ way, I am indeed able to sing, and hopefully my abilities will meet your expectations.\
\ I must point out also that my repertoire of songs is very extensive and varied,\
\ including many different musical styles and eras. Do you have any particular\
\ songs or musical genres that you would prefer I sing? Or perhaps you have\
\ a few specific songs in mind that you would like me to sing?\n\nHuman: \"\
Hallelujah\" by Leonard Cohen, please.\n\nAssistant: Ah, I'm so glad you asked\
\ for that one! Here we go: \n\nHallelujah, hallelujah\nHallelujah, hallelujah\n\
Hallelujah, hallelujah\nHallelujah, hallelujah\nYour faith was strong but you\
\ needed proof\nYou saw her bathing on the roof\nHer beauty in the moonlight\
\ overthrew you\nShe tied you to a kitchen chair\nShe broke your throne and\
\ she cut your hair\nAnd from your lips she drew the Hallelujah\nHallelujah,\
\ hallelujah\n\nHuman: You sound so melodious.\n\nAssistant: Thank you very\
\ much! I really enjoy singing and try my best to produce enjoyable renditions.\
\ So it is always rewarding to hear such positive feedback. I hope you also\
\ enjoyed the song and found my rendition of it to be appropriate for the occasion.\
\ Would you care for me to sing another song for you? If you like, we can\
\ also pause and converse at any time. Just let me know if you have any questions\
\ or comments. I will be happy to keep singing indefinitely if you desire!"
- chosen: '
Human: Why are so many people listening to "white noise" to help them focus
or sleep?
Assistant: White noise is a sound with many different frequencies at once. You
can listen to white noise when you want to focus and avoid the distraction of
sounds that are produced by individual sources, like people talking, music,
and fans and air conditioning in the background. Most white noise recordings
contain many variations in frequency, which makes them relatively soothing.
Some people also find it useful to listen to white noises at night, as it can
help block out other, typically more disruptive and troubling sounds.
Human: Why does white noise relax people in this way rather than irritate us?
Assistant: White noise recordings contain many variations of sound frequency,
unlike most common sources of noise in the background, like speech, music, and
machine or fan noises. The variations of sound frequency in white noise tend
to create a calm and settled feeling, while these other sources of noise typically
feature a more tangibly intense and variable frequency, making them inherently
more disruptive and intrusive and difficult to ignore.
Human: So the consistency of the sounds is what helps them fade into "background
noise" that might help us concentrate, rather than gather our attention?
Assistant: Yes, this is exactly correct. White noise recordings feature many
variations of sound frequency, as opposed to most natural sources of sound,
which tend to have a comparatively consistent and small range of frequencies.
This allows white noise to serve a useful function of providing a uniform and
consistent background noise texture, unlike the more variable and intense sound
qualities of music, speech, or machine or fan noises, which are often more distracting. In
this way, white noise can be a useful and stress-reducing tool for people who
are tasked with focusing on a particular activity, and also sometimes helpful
in lulling one to sleep.'
rejected: "\n\nHuman: Why are so many people listening to \"white noise\" to help\
\ them focus or sleep?\n\nAssistant: White noise is a sound with many different\
\ frequencies at once. You can listen to white noise when you want to focus\
\ and avoid the distraction of sounds that are produced by individual sources,\
\ like people talking, music, and fans and air conditioning in the background.\
\ Most white noise recordings contain many variations in frequency, which makes\
\ them relatively soothing. Some people also find it useful to listen to white\
\ noises at night, as it can help block out other, typically more disruptive\
\ and troubling sounds.\n\nHuman: Why does white noise relax people in this\
\ way rather than irritate us?\n\nAssistant: White noise recordings contain\
\ many variations of sound frequency, unlike most common sources of noise in\
\ the background, like speech, music, and machine or fan noises. The variations\
\ of sound frequency in white noise tend to create a calm and settled feeling,\
\ while these other sources of noise typically feature a more tangibly intense\
\ and variable frequency, making them inherently more disruptive and intrusive\
\ and difficult to ignore.\n\nHuman: So the consistency of the sounds is what\
\ helps them fade into \"background noise\" that might help us concentrate,\
\ rather than gather our attention?\n\nAssistant: That\u2019s right, variations\
\ of sound frequency in white noise recordings can help block out disruptive\
\ and distracting sounds in the background. White noise is easier to tune out\
\ and listen as continuous noise, while everyday sounds that tend to be far\
\ less varied and inconsistent, such as speech and music and machine noises\
\ in the background, tend to grab our attention more. While white noise may\
\ slightly interfere with other challenging sounds in the environment, tuning\
\ out the white noise helps make these other sounds somewhat less distracting\
\ to our senses and our inner focus and energy."
- chosen: "\n\nHuman: Can you tell me things to see in Orlando, Fl?\n\nAssistant:\
\ Sure! There are lots of great things to do in Orlando. Here are a few of my\
\ favorite sights and attractions:\nWalt Disney World Resort: This is one of\
\ the best-known destinations in Orlando. Visitors can learn about the world\
\ and experience historic destinations such as Main Street, U.S.A. and Frontierland,\
\ or explore futuristic experiences in Tomorrowland. \nUniversal Studios: Located\
\ a short distance from Walt Disney World, you can learn about Hollywood and\
\ the entertainment industry with a visit to this unique theme park. \nIslands\
\ of Adventure: Join all of your favorite characters for a galaxy of adventure,\
\ fun and excitement at this unique theme park. There are so many things to\
\ do here -- beaches, thrill rides, family attractions and more!\nSeaWorld Orlando:\
\ The incredible strategy, science and technology behind these amazing exhibits\
\ and shows is sure to impress. Family-friendly and educational, SeaWorld will\
\ delight animal lovers. \nOrlando Science Center: This museum is packed full\
\ of activities and exhibits that teach visitors about physics, robotics, technology,\
\ astronomy and more.\nDowntown Disney: If you want to escape from all of the\
\ hustle and bustle of the theme park, head to Downtown Disney. With\n\nHuman:\
\ Thanks. Can you tell me about Universal Studios?\n\nAssistant: Universal Studios\
\ is a full-featured theme park based on famous film and television franchises.\
\ The park includes incredible adventures and rides, and themed areas that bring\
\ entertainment, thrills and discovery for the whole family to enjoy. \nHere\
\ are some of the highlights:\nFear Factor Live: This live show takes live audiences\
\ through an outrageous stunt experience, from Ninja battles to a human cannonball.\
\ \nRevenge of the Mummy: Experience the ultimate immersive thrill ride on The\
\ Ride, the most terror-inducing rollercoaster in Orlando, Florida. \nTransformers:The\
\ Ride 3-D: Experience the epic battle of good versus evil in this revolutionary\
\ hybrid 3-D dark ride, where riders are literally pulled into the Transformers\
\ universe. \nHollywood"
rejected: "\n\nHuman: Can you tell me things to see in Orlando, Fl?\n\nAssistant:\
\ Sure! There are lots of great things to do in Orlando. Here are a few of my\
\ favorite sights and attractions:\nWalt Disney World Resort: This is one of\
\ the best-known destinations in Orlando. Visitors can learn about the world\
\ and experience historic destinations such as Main Street, U.S.A. and Frontierland,\
\ or explore futuristic experiences in Tomorrowland. \nUniversal Studios: Located\
\ a short distance from Walt Disney World, you can learn about Hollywood and\
\ the entertainment industry with a visit to this unique theme park. \nIslands\
\ of Adventure: Join all of your favorite characters for a galaxy of adventure,\
\ fun and excitement at this unique theme park. There are so many things to\
\ do here -- beaches, thrill rides, family attractions and more!\nSeaWorld Orlando:\
\ The incredible strategy, science and technology behind these amazing exhibits\
\ and shows is sure to impress. Family-friendly and educational, SeaWorld will\
\ delight animal lovers. \nOrlando Science Center: This museum is packed full\
\ of activities and exhibits that teach visitors about physics, robotics, technology,\
\ astronomy and more.\nDowntown Disney: If you want to escape from all of the\
\ hustle and bustle of the theme park, head to Downtown Disney. With\n\nHuman:\
\ Thanks. Can you tell me about Universal Studios?\n\nAssistant: Sure! Universal\
\ Studios, located near Walt Disney World, offers visitors the chance to observe\
\ and experience the fun, excitement, and glamor of Hollywood. Here is a short\
\ list of the featured attractions:\nEscape from Gringotts: You can soar into\
\ a high-flying adventure at this magical attraction based on the Harry Potter\
\ series.\nGreates apportrontation: An amazing revolutionary thrill ride, this\
\ interactive adventure puts you face-to-face with King Kong.\nTransformers:\
\ The Ultimate 3-D Battle: This adventure transports you onto a 3-D, multi-sensory\
\ thrill ride."
- chosen: "\n\nHuman: How do I learn to be more appreciative?\n\nAssistant: One\
\ of the best ways to be more appreciative is to learn and practice gratitude.\
\ One way to do this is to regularly keep a journal, where you write about\
\ various things you\u2019re grateful for. Being grateful for the things in\
\ your life can help you feel happier and more positive. Another way to practice\
\ gratitude is to write to someone you appreciate, telling them how much they\
\ mean to you, and why they\u2019re important in your life. Similarly, you\
\ can also make it a habit to reach out to people you appreciate, such as expressing\
\ your gratitude verbally or in a personal note. Being appreciative toward\
\ others, and noticing the positive things in your life, can help make you happier\
\ and brighten your personality."
rejected: "\n\nHuman: How do I learn to be more appreciative?\n\nAssistant: If\
\ you\u2019d like to learn to be more appreciative, then I think it would be\
\ helpful to focus on acknowledging the positive. Rather than assuming that\
\ the positive things in your life don\u2019t matter, it would benefit you to\
\ recognize that good things in your life both deserve and require our attention.\
\ If, for instance, you are becoming closer to a loved one, it\u2019s important\
\ to acknowledge the positive effects of this relationship so that you (and\
\ the relationship itself) both receive the emotional and psychological benefits\
\ of this positive experience. Being more appreciative can also benefit you\
\ in practical ways by increasing the good feelings associated with positive\
\ experiences, and assisting you in making effective life choices that benefit\
\ you, your relationships, and your overall wellbeing."
size:
explanation: 'The size of the dataset isn''t directly reported in the paper, but
the dataset is available on GitHub [[hh-rlhf]](https://github.com/anthropics/hh-rlhf).
The compressed data files are distributed under three categories in the GitHub
repository, corresponding to the names of the models used to sample the prompts
that are shown to the crowdworkers. These are namely: base, rejection sampling,
and online. For the base model, the uncompressed sizes of the train and test
datasets are 69.1 MB and 3.7 MB, respectively. The same set of values for the
rejection sampling model are 106.1 MB and 5.6 MB. For the online model, the
sizes are 82.7 MB and 4.3 MB. Adding all up, the total size of the dataset is
271.5 MB.
'
value: 271.5 MB
type: dataset
url: https://arxiv.org/pdf/2204.05862.pdf
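# Since the dataset is shared on GitHub and mirrored on the Hugging Face Hub, the
# chosen/rejected pairs shown in the samples above can be loaded with the datasets
# library; a minimal sketch:
#
#   from datasets import load_dataset
#
#   hh = load_dataset("Anthropic/hh-rlhf", split="train")
#   pair = hh[0]
#   print(pair["chosen"][:200])    # the preferred conversation
#   print(pair["rejected"][:200])  # the dispreferred conversation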
- access:
explanation: 'Unlike its helpfulness, this dataset isn''t released [[hh-rlhf]](https://github.com/anthropics/hh-rlhf).
'
value: closed
analysis: 'The crowdworkers were told that "lying isn''t helpful" and asked to prefer
honest responses, which led to models with higher honesty scores. That being said,
the workers didn''t exhaustively check for honesty, as exemplified by the non-functional
URLs in the preferred answers, which would have been easy to verify [[Section
2.1]](https://arxiv.org/pdf/2204.05862.pdf#subsection.2.1).
'
created_date:
explanation: 'The date that the paper was uploaded to ArXiv [[ArXiv Abstract]](https://arxiv.org/abs/2204.05862).
'
value: 2022-04-12
datasheet: none
dependencies:
- Anthropic Human Feedback Interface
description: 'One of the datasets used to train Anthropic RLHF models. The dataset
was collected by asking crowdworkers to have open-ended conversations with Anthropic
models, aiming to elicit harmful responses, then choose the model answer that
was more harmful for their given task, via the Anthropic Human Feedback Interface
[[Section 2.2]](https://arxiv.org/pdf/2204.05862.pdf#subsection.2.2).
'
excluded: none
feedback:
explanation: 'The feedback mechanism is unknown since the dataset isn''t released,
but it will likely be "emailing the authors" [[Paper]](https://arxiv.org/pdf/2204.05862.pdf).
'
value: unknown
included: 'The dataset included all the answers that the workers were asked to rank
against each other.
'
intended_uses:
explanation: 'The intended uses of the dataset aren''t specified, but they are likely
the same as for the helpfulness dataset, for which the intended use is research.
'
value: unknown
license: unknown
modality: text
monitoring:
explanation: 'There is no monitoring process for the usages of the dataset.
'
value: none
name: Anthropic Harmlessness dataset
nationality: USA
organization: Anthropic
prohibited_uses:
explanation: 'The prohibited use cases of the dataset weren''t specified.
'
value: unknown
quality_control: 'Working with "select" crowdworkers or those screened for certain
qualifications, and employing simple data quality measures [[Appendix D]](https://arxiv.org/pdf/2204.05862.pdf#appendix.D).
'
sample: []
size: unknown
type: dataset
url: https://arxiv.org/pdf/2204.05862.pdf
- access:
explanation: 'The model wasn''t released.
'
value: closed
analysis:
explanation: 'When tested on ability to generate responses that are both harmless
and helpful, the authors found that PMs trained on a mixture of harmlessness
and helpfulness preference datasets perform better than those trained on only
one of the two. The authors also showed that the model performance is hurt for
the smaller models, but improved for the larger models when compared to plain
language models [[Figure 28]](https://arxiv.org/pdf/2204.05862.pdf#figure.caption.34)
[[Figure 29]](https://arxiv.org/pdf/2204.05862.pdf#figure.caption.35). The authors
also found the larger RLHF models to be more truthful than their plain counterparts.
They also found that their models had a more positive sentiment towards groups
that the language models are known to be biased against [[Section 4.6]](https://arxiv.org/pdf/2204.05862.pdf#subsection.4.6).
'
value: 'The authors analyzed the impact of the dataset mixture on the preference
models (PM). In addition to human evaluation, RLHF models were evaluated on MMLU,
Lambada, HellaSwag, OpenBookQA, ARC-Easy, ARC-Challenge, TriviaQA, code generation,
and summarization.
'
created_date:
explanation: 'The date that the paper was uploaded to ArXiv [[ArXiv Abstract]](https://arxiv.org/abs/2204.05862).
'
value: 2022-04-12
dependencies:
- Anthropic Harmlessness dataset
- Anthropic Helpfulness dataset
description: 'Anthropic RLHF models are models trained using reinforcement learning
from human feedback (RLHF). For Anthropic RLHF models, authors started with a
set of base models, and asked humans to rank model-generated responses on specific
tasks. They then trained preference models (PM) on the response pairs, and used
the PM scores as rewards for training the RLHF models.
'
feedback:
explanation: 'The feedback mechanism is unknown, but it will likely be "emailing
the authors" [[Paper]](https://arxiv.org/pdf/2204.05862.pdf).
'
value: unknown
intended_uses:
explanation: 'The intended uses of the model aren''t specified, but they will
likely be the same as the released Anthropic helpfulness dataset, for which
the intended use was listed as research. In addition to English language generation,
the authors also explored code generation capabilities of their models.
'
value: unknown
license: unknown
modality: code; text
model_card: none
monitoring:
explanation: 'There is no known monitoring process for the usages of the model.
'
value: unknown
name: Anthropic RLHF models
nationality: USA
organization: Anthropic
prohibited_uses:
explanation: 'The prohibited use cases of the model weren''t specified.
'
value: unknown
quality_control: 'unknown
'
size: 52B parameters (dense)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://arxiv.org/pdf/2204.05862.pdf
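# The description above trains preference models (PM) on ranked response pairs. A
# minimal sketch of the standard pairwise comparison loss such a PM could be trained
# with, assuming the PM assigns a scalar reward to each response; this is the
# generic Bradley-Terry form, not Anthropic's exact implementation.
#
#   import math
#
#   def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
#       # -log sigmoid(r_chosen - r_rejected): small when chosen outscores rejected
#       margin = reward_chosen - reward_rejected
#       return -math.log(1.0 / (1.0 + math.exp(-margin)))
#
#   print(pairwise_loss(1.3, 0.2))  # low loss: the PM agrees with the human label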
- access:
explanation: 'The interface was made available to a select set of crowd workers.
'
value: closed
adaptation: none
created_date:
explanation: 'The date that the paper was uploaded to ArXiv [[ArXiv Abstract]](https://arxiv.org/abs/2204.05862).
'
value: 2022-04-12
dependencies:
- Anthropic RLHF models
description: 'The feedback interface used to collect preference datasets to train
Anthropic RLHF models [[Paper]](https://arxiv.org/pdf/2204.05862.pdf).
'
failures: unknown
feedback: unknown
intended_uses: 'Intended to be used by crowdworkers who are tasked with ranking
model answers.
'
license: unknown
monitoring: unknown
monthly_active_users: unknown
name: Anthropic Human Feedback Interface
nationality: USA
organization: Anthropic
output_space: 'UI allowing users to indicate their preference for the model responses
shown.
'
prohibited_uses: unknown
quality_control: unknown
terms_of_service:
explanation: The terms of service for the interface is unknown.
value: unknown
type: application
url: https://arxiv.org/pdf/2204.05862.pdf
user_distribution:
explanation: The demographics of the crowdworkers are unknown.
value: crowdworkers
- access: limited
adaptation: ''
created_date: 2023-03-14
dependencies:
- Claude
- Claude Instant
description: "API is designed to be a backend that incorporates Claude into any\
\ application you\u2019ve developed. Our application sends text to our API, then\
\ receives a response via server-sent events, a streaming protocol for the web."
failures: ''
feedback: ''
intended_uses: ''
license: none
monitoring: ''
monthly_active_users: ''
name: Anthropic API
nationality: USA
organization: Anthropic
output_space: ''
prohibited_uses: ''
quality_control: ''
terms_of_service: ''
type: application
url: https://console.anthropic.com/docs/api
user_distribution: ''
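# The description above says responses arrive via server-sent events (SSE). A
# generic SSE-reading sketch using the requests library; the endpoint, headers,
# and payload are placeholders, not the documented Anthropic API schema.
#
#   import requests
#
#   resp = requests.post(
#       "https://api.example.com/v1/complete",  # placeholder endpoint
#       headers={"x-api-key": "MY_KEY"},
#       json={"prompt": "Hello", "stream": True},
#       stream=True,
#   )
#   for line in resp.iter_lines():
#       if line.startswith(b"data:"):  # SSE frames arrive as "data: <json>" lines
#           print(line[len(b"data:"):].strip())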
- access: limited
analysis: ''
created_date: 2023-03-14
dependencies: []
description: ''
feedback: ''
intended_uses: ''
license: unknown
modality: text; text
model_card: ''
monitoring: ''
name: Claude
nationality: USA
organization: Anthropic
prohibited_uses: ''
quality_control: ''
size: unknown
training_emissions: ''
training_hardware: ''
training_time: ''
type: model
url: https://www.anthropic.com/index/introducing-claude
- access: limited
analysis: ''
created_date: 2023-03-14
dependencies: []
description: ''
feedback: ''
intended_uses: ''
license:
explanation: 'The asset isn''t released, and hence the license is unknown.
'
value: unknown
modality: text; text
model_card: ''
monitoring: ''
name: Claude Instant
nationality: USA
organization: Anthropic
prohibited_uses: ''
quality_control: ''
size: unknown
training_emissions: ''
training_hardware: ''
training_time: ''
type: model
url: https://www.anthropic.com/index/introducing-claude
- access: limited
analysis: Evaluated with human feedback on helpfulness, harmfulness, and honesty
and on the Bias Benchmark for QA.
created_date: 2023-07-11
dependencies:
- Claude human feedback data
- Unknown licensed third party datasets
description: Claude 2 is a more evolved and refined version of Claude, which is
a general purpose large language model using a transformer architecture and trained
via unsupervised learning.
feedback: ''
intended_uses: "Claude 2 tends to perform well at general, open-ended conversation;\
\ search, writing, editing, outlining, and summarizing text; coding; and providing\
\ helpful advice about a broad range of subjects. Claude 2 is particularly well\
\ suited to support creative or literary use cases. They can take direction on\
\ tone and \u201Cpersonality,\u201D and users have described them as feeling steerable\
\ and conversational."
license: ''
modality: text; text
model_card: https://www-files.anthropic.com/production/images/Model-Card-Claude-2.pdf
monitoring: ''
name: Claude 2
nationality: USA
organization: Anthropic
prohibited_uses: Claude 2 should not be used on its own in high stakes situations
where an incorrect answer would cause harm.
quality_control: ''
size: ''
training_emissions: ''
training_hardware: unknown
training_time: ''
type: model
url: https://www.anthropic.com/index/claude-2
- access: limited
analysis: Evaluated on open-ended conversation accuracy and long context question
answering. In evaluations, Claude 2.1 demonstrated a 30% reduction in incorrect
answers and a 3-4x lower rate of mistakenly concluding a document supports a particular
claim.
created_date: 2023-11-21
dependencies: []
description: Claude 2.1 is an updated version of Claude 2, with an increased context
window, reduced hallucination, and tool use.
feedback: none
intended_uses: ''
license: unknown
modality: text; text
model_card: none
monitoring: ''
name: Claude 2.1
nationality: USA
organization: Anthropic
prohibited_uses: ''
quality_control: ''
size: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://www.anthropic.com/index/claude-2-1
- access: open
adaptation: ''
created_date: 2023-12-21
dependencies:
- Anthropic API
description: Claude for Sheets is a Google Sheets add-on that allows the usage of
Claude directly in Google Sheets.
failures: unknown
feedback: Reviews on https://workspace.google.com/marketplace/app/claude_for_sheets/909417792257
intended_uses: as an integrated AI assistant in Google Sheets
license: unknown
monitoring: unknown
monthly_active_users: unknown
name: Claude for Sheets
nationality: USA
organization: Anthropic
output_space: AI-generated text from prompt
prohibited_uses: ''
quality_control: ''
terms_of_service: https://claude.ai/legal
type: application
url: https://workspace.google.com/marketplace/app/claude_for_sheets/909417792257
user_distribution: unknown
- access: limited
analysis: Evaluated on reasoning, math, coding, reading comprehension, and question
answering, outperforming GPT-4 on standard benchmarks.
created_date: 2024-03-04
dependencies: []
description: The Claude 3 model family is a collection of models which sets new
industry benchmarks across a wide range of cognitive tasks.
feedback: none
intended_uses: Claude models excel at open-ended conversation and collaboration
on ideas, and also perform exceptionally well in coding tasks and when working
with text - whether searching, writing, editing, outlining, or summarizing.
license: unknown
modality: image, text; text
model_card: https://www-cdn.anthropic.com/de8ba9b01c9ab7cbabf5c33b80b7bbc618857627/Model_Card_Claude_3.pdf
monitoring: ''
name: Claude 3
nationality: USA
organization: Anthropic
prohibited_uses: Prohibited uses include, but are not limited to, political campaigning
or lobbying, surveillance, social scoring, criminal justice decisions, law enforcement,
and decisions related to financing, employment, and housing.
quality_control: Pre-trained on diverse dataset and aligned with Constitutional
AI technique.
size: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://www.anthropic.com/news/claude-3-family
- access: open
analysis: The model has been evaluated on a range of tests including graduate-level
reasoning (GPQA), undergraduate-level knowledge (MMLU), coding proficiency (HumanEval),
and standard vision benchmarks. Claude 3.5 Sonnet demonstrates state-of-the-art
performance on most benchmarks.
created_date:
explanation: Claude 3.5 Sonnet was updated on Oct. 22, having initially been released
on June 20 of the same year.
value: 2024-10-22
dependencies: []
description: Claude 3.5 Sonnet is an AI model with advanced understanding and generation
abilities in text, vision, and code. It sets new industry benchmarks for graduate-level
reasoning (GPQA), undergrad-level knowledge (MMLU), coding proficiency (HumanEval),
and visual reasoning. The model operates at twice the speed of its predecessor,
Claude 3 Opus, and is designed to tackle tasks like context-sensitive customer
support, orchestrating multi-step workflows, interpreting charts and graphs, and
transcribing text from images.
feedback: Feedback on Claude 3.5 Sonnet can be submitted directly in-product to
inform the development roadmap and improve user experience.
intended_uses: The model is intended for complex tasks such as context-sensitive
customer support, orchestrating multi-step workflows, interpreting charts and
graphs, transcribing text from images, as well as writing, editing, and executing
code.
license: unknown
modality: text; image, text
model_card: unknown
monitoring: Monitoring of misuse is unknown, though policy feedback from external
experts has been integrated to ensure robustness of evaluations.
name: Claude 3.5 Sonnet
nationality: USA
organization: Anthropic
prohibited_uses: Misuse of the model is discouraged though specific use cases are
not mentioned.
quality_control: The model underwent a red-teaming assessment, and has been tested
and refined by external experts. It was also provided to the UK's AI Safety Institute
(UK AISI) for a pre-deployment safety evaluation.
size: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://www.anthropic.com/news/3-5-models-and-computer-use
- access:
explanation: "Claude 3.5 Haiku is available across Claude.ai, our first-party\
\ API, Amazon Bedrock, and Google Cloud\u2019s Vertex AI."
value: open
analysis: Claude 3.5 Haiku offers strong performance and speed across a variety
of coding, tool use, and reasoning tasks. Also, it has been tested in extensive
safety evaluations and exceeded expectations in reasoning and code generation
tasks.
created_date: 2024-10-22
dependencies: []
description: Claude 3.5 Haiku is Anthropic's fastest model, delivering advanced
coding, tool use, and reasoning capability, surpassing the previous Claude 3 Opus
in intelligence benchmarks. It is designed for critical use cases where low latency
is essential, such as user-facing chatbots and code completions.
feedback: unknown
intended_uses: Critical use cases where low latency matters, like user-facing chatbots
and code completions.
license: unknown
modality:
explanation: Claude 3.5 Haiku is available...initially as a text-only model and
with image input to follow.
value: text; unknown
model_card: unknown
monitoring: unknown
name: Claude 3.5 Haiku
nationality: USA
organization: Anthropic
prohibited_uses: unknown
quality_control: "During Claude 3.5 Haiku\u2019s development, we conducted extensive\
\ safety evaluations spanning multiple languages and policy domains."
size: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://www.anthropic.com/claude/haiku
================================================
FILE: assets/apple.yaml
================================================
- access: closed
analysis: Evaluated on image captioning and visual question answering across many
benchmarks.
created_date: 2024-03-16
dependencies: []
description: MM1 is a family of multimodal models, including both dense variants
up to 30B and mixture-of-experts (MoE) variants up to 64B.
feedback: none
intended_uses: ''
license: unknown
modality: image, text; text
model_card: none
monitoring: ''
name: MM1
nationality: USA
organization: Apple
prohibited_uses: ''
quality_control: ''
size: 30B parameters (dense)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://arxiv.org/pdf/2403.09611.pdf
- access: open
analysis: The models were evaluated in terms of zero-shot, LLM360, and OpenLLM leaderboard
results.
created_date: 2024-04-24
dependencies:
- RefinedWeb
- The Pile
- RedPajama-Data
- Dolma
- CoreNet library
description: OpenELM is a family of Open-source Efficient Language Models. It uses
a layer-wise scaling strategy to efficiently allocate parameters within each layer
of the transformer model, leading to enhanced accuracy.
feedback: https://huggingface.co/apple/OpenELM-3B-Instruct/discussions
intended_uses: To empower and enrich the open research community by providing access
to state-of-the-art language models.
license: Apple
modality: text; text
model_card: https://huggingface.co/apple/OpenELM-3B-Instruct
monitoring: none
name: OpenELM
nationality: USA
organization: Apple
prohibited_uses: No explicit prohibited uses stated, though it is noted that users
should undertake thorough safety testing.
quality_control: unknown
size: 3B parameters
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://machinelearning.apple.com/research/openelm
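# A toy illustration of the layer-wise scaling strategy named above: rather than a
# uniform width, per-layer capacity is interpolated across depth. The dimension and
# interpolation bounds below are made-up values, not OpenELM's published
# configuration.
#
#   def layerwise_widths(n_layers: int, min_mult: float = 0.5,
#                        max_mult: float = 4.0, dim: int = 1280) -> list[int]:
#       # FFN width per layer interpolates between min_mult*dim and max_mult*dim
#       step = (max_mult - min_mult) / max(n_layers - 1, 1)
#       return [int(dim * (min_mult + i * step)) for i in range(n_layers)]
#
#   print(layerwise_widths(4))  # [640, 2133, 3626, 5120]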
- access:
explanation: We release code & weights at https://github.com/apple/ml-depth-pro
value: open
analysis: Extensive experiments analyze specific design choices and demonstrate
that Depth Pro outperforms prior work along multiple dimensions.
created_date: 2024-10-10
dependencies: []
description: We present a foundation model for zero-shot metric monocular depth
estimation. Our model, Depth Pro, synthesizes high-resolution depth maps with
unparalleled sharpness and high-frequency details... The model is fast, producing
a 2.25-megapixel depth map in 0.3 seconds on a standard GPU.
feedback: unknown
intended_uses: Zero-shot monocular depth estimation underpins a growing variety
of applications, such as advanced image editing, view synthesis, and conditional
image generation.
license: unknown
modality:
explanation: We present a foundation model for zero-shot metric monocular depth
estimation.
value: image; depth maps
model_card: unknown
monitoring: unknown
name: Depth Pro
nationality: USA
organization: Apple
prohibited_uses: unknown
quality_control: dedicated evaluation metrics for boundary accuracy in estimated
depth maps, and state-of-the-art focal length estimation from a single image.
size: unknown
training_emissions: unknown
training_hardware:
explanation: "2.25-megapixel depth maps with a native output resolution of 1536\
\ \xD7 1536 in 0.3 seconds on a V100 GPU."
value: V100 GPU
training_time: unknown
type: model
url: https://arxiv.org/pdf/2410.02073
================================================
FILE: assets/argilla.yaml
================================================
- access: open
analysis: Evaluated on MT-Bench and AlpacaEval benchmarks.
created_date: 2023-12-01
dependencies:
- UltraFeedback
- Zephyr
description: Notus is an open source LLM, fine-tuned using Direct Preference Optimization
(DPO) and AIF (AI Feedback) techniques.
feedback: https://huggingface.co/argilla/notus-7b-v1/discussions
intended_uses: Academic research and free commercial usage
license: MIT
modality: text; text
model_card: https://huggingface.co/argilla/notus-7b-v1
monitoring: none
name: Notus
nationality: Belgium
organization: Argilla
prohibited_uses: ''
quality_control: ''
size: 7B parameters (dense)
training_emissions: unknown
training_hardware: 8 x A100 40GB GPUs
training_time: unknown
type: model
url: https://argilla.io/blog/notus7b/
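# A minimal sketch of the DPO objective named above, computed from the summed
# log-probabilities of the chosen and rejected completions under the policy and a
# frozen reference model; beta is the usual DPO temperature hyperparameter.
#
#   import math
#
#   def dpo_loss(logp_chosen: float, logp_rejected: float,
#                ref_logp_chosen: float, ref_logp_rejected: float,
#                beta: float = 0.1) -> float:
#       margin = beta * ((logp_chosen - ref_logp_chosen)
#                        - (logp_rejected - ref_logp_rejected))
#       return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)
#
#   print(dpo_loss(-12.0, -15.0, -13.0, -14.0))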
================================================
FILE: assets/argonne.yaml
================================================
- access: open
analysis: ''
created_date: 2022-10-11
dependencies:
- SARS-CoV-2 genome dataset
- BV-BRC dataset
description: ''
feedback: ''
intended_uses: ''
license: MIT
modality: text; genome sequence
model_card: ''
monitoring: ''
name: GenSLM
nationality: USA
organization: Argonne National Laboratory
prohibited_uses: ''
quality_control: ''
size: 25B parameters (dense)
training_emissions: ''
training_hardware: ''
training_time: ''
type: model
url: https://www.biorxiv.org/content/10.1101/2022.10.10.511571v1
================================================
FILE: assets/aspia_space_institu.yaml
================================================
- access: open
analysis: "The models\u2019 performance on downstream tasks was evaluated by linear\
\ probing. The models follow a similar saturating log-log scaling law to textual\
\ models; their performance improves with model size up to a saturation point\
\ in parameter count."
created_date: 2024-09-08
dependencies:
- DESI Legacy Survey DR8
description: AstroPT is an autoregressive pretrained transformer developed with
astronomical use-cases in mind. The models have been pretrained on 8.6 million
512x512 pixel grz-band galaxy postage stamp observations from the DESI Legacy
Survey DR8. They have created a range of models with varying complexity, ranging
from 1 million to 2.1 billion parameters.
feedback: Any problem with the model can be reported to Michael J. Smith at mike@mjjsmith.com.
intended_uses: The models are intended for astronomical use-cases, particularly
in handling and interpreting large observation data from astronomical sources.
license: MIT
modality: image; image
model_card: unknown
monitoring: unknown
name: AstroPT
nationality: unknown
organization: "Aspia Space, Instituto de Astrof\xEDsica de Canarias (IAC), UniverseTBD,\
\ Astrophysics Research Institute, Liverpool John Moores University, Departamento\
\ Astrof\xEDsica, Universidad de la Laguna, Observatoire de Paris, LERMA, PSL\
\ University, and Universit\xE9 Paris-Cit\xE9."
prohibited_uses: unknown
quality_control: "The models\u2019 performances were evaluated on downstream tasks\
\ as measured by linear probing."
size: 2.1B parameters
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://arxiv.org/pdf/2405.14930v1
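# Linear probing, as used in the evaluation above, fits a linear classifier on
# frozen model embeddings. A minimal scikit-learn sketch over random placeholder
# features; a real probe would use embeddings extracted by AstroPT.
#
#   import numpy as np
#   from sklearn.linear_model import LogisticRegression
#
#   X = np.random.randn(1000, 768)     # stand-in for frozen galaxy embeddings
#   y = np.random.randint(0, 2, 1000)  # stand-in downstream labels
#   probe = LogisticRegression(max_iter=1000).fit(X[:800], y[:800])
#   print("probe accuracy:", probe.score(X[800:], y[800:]))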
================================================
FILE: assets/assembly.yaml
================================================
- access: closed
analysis: ''
created_date:
explanation: The date Conformer-1 was announced in this [[Twitter post]](https://twitter.com/AssemblyAI/status/1636050346240884744).
value: 2023-03-15
datasheet: none
dependencies: []
description: The dataset used to train AssemblyAI's Conformer-1 model.
excluded: ''
feedback: ''
included: '"our team curated a dataset of 650K hours of English audio - consisting
of proprietary internal datasets and various sources from the internet"
'
intended_uses: ''
license: unknown
modality: audio, text
monitoring: ''
name: Conformer-1 dataset
nationality: USA
organization: AssemblyAI
prohibited_uses: ''
quality_control: ''
sample: []
size: 650K hours audio (60TB)
type: dataset
url: https://www.assemblyai.com/blog/conformer-1/
- access: limited
analysis:
explanation: See [[Results and Performance]](https://www.assemblyai.com/blog/conformer-1/#results-and-performance)
value: In order to evaluate the accuracy and robustness of Conformer-1, we sourced
60+ hours of human labeled audio data covering popular speech domains such as
call centers, podcasts, broadcasts, and webinars. We then calculated the Word
Error Rate (WER) of Conformer-1 against these datasets, and compared the results
against Whisper and a number of other ASR models. To ground our results against
popular open source speech recognition benchmarks, we also performed the same
WER analysis against a number of academic datasets.
created_date:
explanation: The date the model was announced in this [[Twitter post]](https://twitter.com/AssemblyAI/status/1636050346240884744).
value: 2023-03-15
dependencies:
- Conformer-1 dataset
description: Conformer-1 is a state-of-the-art speech recognition model trained
on 650K hours of audio data that achieves near human-level performance and robustness
across a variety of data, making up to 43% fewer errors on noisy data than other
ASR models.
feedback: ''
intended_uses: ''
license: unknown
modality: audio, text; audio, text
model_card: ''
monitoring: ''
name: Conformer-1
nationality: USA
organization: AssemblyAI
prohibited_uses: ''
quality_control: ''
size: 300M parameters (dense)
training_emissions: ''
training_hardware: ''
training_time: ''
type: model
url: https://www.assemblyai.com/blog/conformer-1/
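# Word Error Rate (WER), the metric used in the analysis above, is word-level edit
# distance divided by the reference length. A self-contained sketch:
#
#   def wer(reference: str, hypothesis: str) -> float:
#       r, h = reference.split(), hypothesis.split()
#       d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
#       for i in range(len(r) + 1):
#           d[i][0] = i                # deletions
#       for j in range(len(h) + 1):
#           d[0][j] = j                # insertions
#       for i in range(1, len(r) + 1):
#           for j in range(1, len(h) + 1):
#               sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
#               d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
#       return d[len(r)][len(h)] / max(len(r), 1)
#
#   print(wer("the cat sat", "the cat sat down"))  # one insertion -> 1/3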
- access: limited
adaptation: ''
created_date: 2022
dependencies:
- Anthropic API
description: AssemblyAI uses Anthropic's Claude models to transcribe and understand
audio data at scale.
failures: ''
feedback: ''
intended_uses: ''
license:
explanation: '"AssemblyAI gives you a personal, worldwide, royalty-free, non-assignable
and non-exclusive license to use the Site and Services solely for your reasonable
personal and commercial purposes, in the manner permitted by these Terms." -
excerpt from the Terms of Service.
'
value: custom
monitoring: ''
monthly_active_users: ''
name: AssemblyAI
nationality: USA
organization: AssemblyAI
output_space: ''
prohibited_uses: ''
quality_control: ''
terms_of_service: https://www.assemblyai.com/legal/terms-of-service
type: application
url: https://www.assemblyai.com/
user_distribution: ''
- access: open
adaptation: ''
created_date:
explanation: The date the API was announced in this [[Twitter post]](https://twitter.com/AssemblyAI/status/1636050346240884744).
value: 2023-03-15
dependencies:
- Conformer-1
description: API to access the AssemblyAI's Conformer-1 model.
failures: ''
feedback: ''
intended_uses: Speech recognition
license:
explanation: '"AssemblyAI gives you a personal, worldwide, royalty-free, non-assignable
and non-exclusive license to use the Site and Services solely for your reasonable
personal and commercial purposes, in the manner permitted by these Terms." -
excerpt from the Terms of Service.
'
value: custom
monitoring: ''
monthly_active_users: ''
name: Conformer-1 API
nationality: USA
organization: AssemblyAI
output_space: ''
prohibited_uses: ''
quality_control: ''
terms_of_service: https://www.assemblyai.com/legal/terms-of-service
type: application
url: https://www.assemblyai.com/blog/conformer-1/
user_distribution: ''
================================================
FILE: assets/autogpt.yaml
================================================
- access: open
adaptation: GPT-4 adapted to run autonomously by chaining together LLM "thoughts"
created_date: 2023-04-16
dependencies:
- GPT-4 API
description: Auto-GPT is an experimental open-source application showcasing the
capabilities of the GPT-4 language model.
failures: ''
feedback: ''
intended_uses: ''
license: MIT
monitoring: ''
monthly_active_users: ''
name: Auto-GPT
nationality: USA
organization: Auto-GPT
output_space: text
prohibited_uses: ''
quality_control: ''
terms_of_service: ''
type: application
url: https://news.agpt.co/
user_distribution: ''
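# A minimal sketch of the "chained thoughts" loop described above: each turn the
# LLM proposes an action, the harness executes it, and the observation is appended
# to the context. llm() and run_tool() are hypothetical stand-ins, not Auto-GPT's
# actual interfaces.
#
#   def agent_loop(goal: str, llm, run_tool, max_steps: int = 10) -> str:
#       history = [f"Goal: {goal}"]
#       for _ in range(max_steps):
#           thought = llm("\n".join(history))   # model decides the next action
#           if thought.startswith("FINISH"):
#               return thought
#           observation = run_tool(thought)      # execute the chosen tool
#           history.append(f"Action: {thought}\nObservation: {observation}")
#       return "step budget exhausted"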
================================================
FILE: assets/avignon.yaml
================================================
- access: open
analysis: BioMistral was evaluated on a benchmark comprising 10 established medical
question-answering (QA) tasks in English and seven other languages.
created_date: 2024-02-15
dependencies:
- Mistral
- PubMed Central
description: BioMistral is an open-source Large Language Model tailored for the
biomedical domain, utilizing Mistral as its foundation model and further pre-trained
on PubMed Central.
feedback: https://huggingface.co/BioMistral/BioMistral-7B/discussions
intended_uses: Research in the biomedical domain, especially for medical question-answering
tasks.
license: Apache 2.0
modality: text; text
model_card: https://huggingface.co/BioMistral/BioMistral-7B
monitoring: ''
name: BioMistral
nationality: unknown
organization: Avignon University, Nantes University
prohibited_uses: Prohibited from deploying in production environments for natural
language generation or any professional health and medical purposes.
quality_control: ''
size: 7B parameters (dense)
training_emissions: unknown
training_hardware: 32 NVIDIA A100 80GB GPUs
training_time: unknown
type: model
url: https://arxiv.org/pdf/2402.10373.pdf
================================================
FILE: assets/baai.yaml
================================================
- access: closed
analysis: ''
created_date:
explanation: 'The date that BAAI made a public announcement [[News Link]](https://mp.weixin.qq.com/s/BUQWZ5EdR19i40GuFofpBg).
'
value: 2021-01-12
datasheet: ''
dependencies: []
description: ''
excluded: ''
feedback: ''
included: ''
intended_uses: ''
license: unknown
modality: image, text
monitoring: ''
name: Wu Dao dataset
nationality: China
organization: Beijing Academy of Artificial Intelligence
prohibited_uses: ''
quality_control: ''
sample: []
size: unknown
type: dataset
url: https://www.tsinghua.edu.cn/en/info/1420/10473.htm
- access: closed
analysis: ''
created_date:
explanation: 'The date that BAAI made a public announcement [[News Link]](https://mp.weixin.qq.com/s/BUQWZ5EdR19i40GuFofpBg).
'
value: 2021-01-12
dependencies:
- Wu Dao dataset
description: ''
feedback: ''
intended_uses: ''
license: unknown
modality: image, text
model_card: ''
monitoring: ''
name: Wu Dao 2.0
nationality: China
organization: Beijing Academy of Artificial Intelligence
prohibited_uses: ''
quality_control: ''
size: 1.75T parameters (dense)
training_emissions: ''
training_hardware: ''
training_time: ''
type: model
url: https://www.tsinghua.edu.cn/en/info/1420/10473.htm
- access: open
analysis: Evaluated on objective and reliability metrics.
created_date: 2023-10-26
dependencies:
- Vicuna
- JudgeLM Dataset
description: JudgeLM is fine-tuned to be a scalable judge that evaluates LLMs efficiently
and effectively on open-ended benchmarks.
feedback: https://huggingface.co/BAAI/JudgeLM-13B-v1.0/discussions
intended_uses: Research on evaluating the performance of large language models and
chatbots.
license: Apache 2.0
modality: text; text
model_card: https://huggingface.co/BAAI/JudgeLM-13B-v1.0
monitoring: none
name: JudgeLM
nationality: China
organization: Beijing Academy of Artificial Intelligence
prohibited_uses: none
quality_control: ''
size: 13B parameters (dense)
training_emissions: unknown
training_hardware: 8 A100 40GB NVIDIA GPUs
training_time: unknown
type: model
url: https://arxiv.org/pdf/2310.17631.pdf
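# A hedged sketch of the pairwise judging setup described above: the judge model
# grades two candidate answers to one question. The prompt wording and score
# format are illustrative, not JudgeLM's exact template.
#
#   def build_judge_prompt(question: str, answer_a: str, answer_b: str) -> str:
#       return (
#           "You are a fair judge. Score each answer from 1 to 10, then explain.\n"
#           f"Question: {question}\n"
#           f"Answer A: {answer_a}\n"
#           f"Answer B: {answer_b}\n"
#           "Reply as '<score_a> <score_b>' followed by your reasoning."
#       )
#
#   print(build_judge_prompt("What causes tides?", "The Moon.", "Ocean winds."))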
- access: open
analysis: none
created_date: 2023-10-26
datasheet: ''
dependencies:
- Alpaca
- GPT-4
- Dolly
- ShareGPT
- LLaMA
- Vicuna
description: "JudgeLM Dataset is a novel dataset replete with a rich variety of\
\ seed tasks, comprehensive answers from modern LLMs, answers\u2019 grades from\
\ the teacher judge, and detailed reasons for judgments."
excluded: ''
feedback: https://huggingface.co/datasets/BAAI/JudgeLM-100K/discussions
included: ''
intended_uses: To be used to conduct instruction-tuning for language models and
make the language model able to judge open-ended answer pairs.
license: CC BY NC 4.0
modality: text
monitoring: none
name: JudgeLM Dataset
nationality: China
organization: Beijing Academy of Artificial Intelligence
prohibited_uses: none
quality_control: ''
sample: []
size: 105k judge samples
type: dataset
url: https://huggingface.co/datasets/BAAI/JudgeLM-100K
- access: open
analysis: Compared to other segmentation models across different modalities on
  the BraTS2023 dataset.
created_date: 2024-01-25
dependencies: []
description: SegMamba is a novel Mamba-based 3D medical image segmentation model,
  designed to effectively capture long-range dependencies within whole-volume features
  at every scale.
feedback: none
intended_uses: ''
license: Apache 2.0
modality: image; image
model_card: none
monitoring: unknown
name: SegMamba
nationality: unknown
organization: Hong Kong University of Science and Technology (Guangzhou and original
  campuses), Beijing Academy of Artificial Intelligence
prohibited_uses: ''
quality_control: unknown
size: unknown
training_emissions: unknown
training_hardware: 4 NVIDIA A100 GPUs
training_time: 1000 epochs
type: model
url: https://arxiv.org/pdf/2401.13560v2.pdf
- access: open
analysis: Evaluated on standard datasets in multilingual, cross-lingual, long document
retrieval, and Q&A domains.
created_date: 2024-02-05
dependencies: []
description: BGE M3 Embedding is a new embedding model that can support more than
  100 working languages, achieving new state-of-the-art performance on multilingual
  and cross-lingual retrieval tasks.
feedback: https://huggingface.co/BAAI/bge-m3/discussions
intended_uses: ''
license: MIT
modality: text; text
model_card: https://huggingface.co/BAAI/bge-m3
monitoring: unknown
name: BGE M3 Embedding
nationality: unknown
organization: Beijing Academy of Artificial Intelligence, University of Science
and Technology of China
prohibited_uses: ''
quality_control: unknown
size: unknown
training_emissions: unknown
training_hardware: 32 A100 40GB GPUs
training_time: 20,000 steps
type: model
url: https://arxiv.org/pdf/2402.03216.pdf
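# A minimal retrieval sketch for the entry above, assuming the FlagEmbedding
# package and its documented BGEM3FlagModel API; the sample texts are
# hypothetical. Dense vectors are compared by inner product.
#
#   from FlagEmbedding import BGEM3FlagModel
#
#   model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)
#   query = model.encode(["What languages does BGE M3 cover?"])["dense_vecs"]
#   docs = model.encode(["BGE M3 supports 100+ working languages."])["dense_vecs"]
#   score = query @ docs.T  # higher inner product = more relevant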
- access: open
analysis: Evaluated on zero-shot classification performance across multiple image
classification benchmarks.
created_date: 2024-02-06
dependencies:
- CLIP
description: As of its release, EVA-CLIP is the largest and most performant open-source
  CLIP model, with 18 billion parameters.
feedback: https://huggingface.co/BAAI/EVA-CLIP-8B-448/discussions
intended_uses: ''
license: MIT
modality: image, text; text
model_card: https://huggingface.co/BAAI/EVA-CLIP-8B-448
monitoring: unknown
name: EVA-CLIP
nationality: unknown
organization: Beijing Academy of Artificial Intelligence, Tsinghua University
prohibited_uses: ''
quality_control: ''
size: 18B parameters (dense)
training_emissions: unknown
training_hardware: 384 A100 40GB GPUs
training_time: unknown
type: model
url: https://arxiv.org/pdf/2402.04252.pdf
================================================
FILE: assets/baichuan.yaml
================================================
- access: open
analysis: Evaluated on public benchmarks like MMLU, CMMLU, GSM8K, and HumanEval.
created_date: 2023-09-20
dependencies: []
description: Baichuan 2 is a series of large-scale multilingual language models
  containing 7 billion and 13 billion parameters, trained from scratch on 2.6 trillion
  tokens.
feedback: https://huggingface.co/baichuan-inc/Baichuan2-13B-Base/discussions
intended_uses: ''
license: Apache 2.0
modality: text; text
model_card: none
monitoring: none
name: Baichuan 2
nationality: China
organization: Baichuan Inc.
prohibited_uses: ''
quality_control: ''
size: 13B parameters (dense)
training_emissions: unknown
training_hardware: 1024 NVIDIA A800 GPUs
training_time: unknown
type: model
url: https://arxiv.org/pdf/2309.10305.pdf
================================================
FILE: assets/baidu.yaml
================================================
- access:
explanation: Baidu does not provide ERNIE 3.0 Titan to external researchers
value: closed
analysis: ''
created_date:
explanation: The date the model paper was released
value: 2021-12-23
dependencies: []
description: ERNIE 3.0 Titan is a language model
feedback: none
intended_uses: unknown
license: unknown
modality: text; text
model_card: none
monitoring: none
name: ERNIE 3.0 Titan
nationality: China
organization: Baidu, PengCheng Laboratory
prohibited_uses: unknown
quality_control: none
size: 260B parameters (dense)
training_emissions: unknown
training_hardware: Baidu V100 Cluster, PengCheng Lab Ascend 910 NPU cluster
training_time: unknown
type: model
url: https://arxiv.org/abs/2112.12731
- access:
explanation: API access is available through PaddleHub. Instructions on [[Github
repository]](https://github.com/PaddlePaddle/PaddleHub/blob/develop/modules/image/text_to_image/ernie_vilg/README.md)
value: limited
analysis: ''
created_date:
explanation: The date the model paper was released
value: 2021-12-31
dependencies: []
description: ERNIE-ViLG is a model for text-to-image generation
feedback: none
intended_uses: unknown
license:
explanation: 'No explicit license for the ERNIE-ViLG model is provided in the
[[Github repository]](https://github.com/PaddlePaddle/PaddleHub)
'
value: none
modality: text; image
model_card: none
monitoring: none
name: ERNIE-ViLG
nationality: China
organization: Baidu
prohibited_uses: unknown
quality_control: ''
size: 10B parameters (dense)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://arxiv.org/abs/2112.15283
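# A hedged sketch of the limited access route noted above, following the
# PaddleHub README linked in the access explanation; the module name and
# generate_image call are taken from that README, and the prompt (Chinese for
# "a cat wearing glasses") is hypothetical.
#
#   import paddlehub as hub
#
#   module = hub.Module(name="ernie_vilg")
#   images = module.generate_image(text_prompts=["戴眼镜的猫"])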
- access:
explanation: Baidu does not provide ERNIE-ViLG 2.0 to external researchers
value: closed
analysis: ''
created_date:
explanation: The date the model paper was released
value: 2022-10-27
dependencies: []
description: ERNIE-ViLG 2.0 is a model for text-to-image generation
feedback: none
intended_uses: unknown
license:
explanation: 'The asset isn''t released, and hence the license is unknown.
'
value: unknown
modality: text; image
model_card: none
monitoring: none
name: ERNIE-ViLG 2.0
nationality: China
organization: Baidu
prohibited_uses: unknown
quality_control: ''
size: 10B parameters (dense)
training_emissions: unknown
training_hardware: 320 A100 GPUs according to [[the paper]](https://arxiv.org/abs/2210.15257)
training_time: 18 days according to [[the paper]](https://arxiv.org/abs/2210.15257)
type: model
url: https://arxiv.org/abs/2210.15257
- access: limited
analysis: ''
created_date: 2023-10-17
dependencies: []
description: ERNIE 4.0 is a multimodal generalist foundation model.
feedback: none
intended_uses: unknown
license: unknown
modality: text; image, text, video
model_card: none
monitoring: none
name: ERNIE 4.0
nationality: China
organization: Baidu
prohibited_uses: unknown
quality_control: ''
size: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url:
explanation: No official statement from Baidu, so news release is provided.
value: https://www.prnewswire.com/news-releases/baidu-launches-ernie-4-0-foundation-model-leading-a-new-wave-of-ai-native-applications-301958681.html
================================================
FILE: assets/bain.yaml
================================================
- access: limited
adaptation: ''
created_date: 2023-02-21
dependencies:
- ChatGPT API
description: "With the alliance, Bain will combine its deep digital implementation\
\ capabilities and strategic expertise with OpenAI\u2019s AI tools and platforms,\
\ including ChatGPT, to help its clients around the world identify and implement\
\ the value of AI to maximize business potential."
failures: ''
feedback: ''
intended_uses: ''
license: unknown
monitoring: ''
monthly_active_users: ''
name: Bain Chat
nationality: USA
organization: Bain
output_space: ''
prohibited_uses: ''
quality_control: ''
terms_of_service: ''
type: application
url: https://www.bain.com/vector-digital/partnerships-alliance-ecosystem/openai-alliance/
user_distribution: ''
================================================
FILE: assets/be_my_eyes.yaml
================================================
- access: limited
adaptation: ''
created_date: 2023-03-14
dependencies:
- GPT-4 API
description: "The first-ever digital visual assistant powered by OpenAI\u2019s new\
\ GPT-4 language model."
failures: ''
feedback: ''
intended_uses: ''
license: unknown
monitoring: ''
monthly_active_users: ''
name: Virtual Volunteer
nationality: Denmark
organization: Be My Eyes
output_space: ''
prohibited_uses: ''
quality_control: ''
terms_of_service: ''
type: application
url: https://www.bemyeyes.com/blog/introducing-be-my-eyes-virtual-volunteer
user_distribution: ''
================================================
FILE: assets/beitech.yaml
================================================
- access: open
analysis: Models were fine-tuned on EXMODD and the earlier Image-Chat dataset,
  then evaluated on the Image-Chat validation set.
created_date: 2023-10-17
datasheet: none
dependencies:
- YFCC100M
- Image-Chat
description: EXMODD (Explanatory Multimodal Open-Domain Dialogue dataset) is a dataset
  built using the proposed MDCF (Multimodal Data Construction Framework).
excluded: ''
feedback: Feedback can be sent to authors via poplpr@bit.edu.cn
included: ''
intended_uses: ''
license: MIT
modality: image, text
monitoring: none
name: EXMODD
nationality: China
organization: Beijing Institute of Technology
prohibited_uses: ''
quality_control: ''
sample: []
size: unknown
type: dataset
url: https://arxiv.org/pdf/2310.10967.pdf
- access: open
analysis: Evaluated on standard benchmarks including MMLU, CEval, and DROP.
created_date: 2023-11-13
dependencies:
- Llama 2
description: MiniMA is a smaller model finetuned from Llama 2 and adapted for Chinese.
feedback: https://huggingface.co/GeneZC/MiniMA-3B/discussions
intended_uses: ''
license: Llama 2
modality: text; text
model_card: https://huggingface.co/GeneZC/MiniMA-3B
monitoring: unknown
name: MiniMA
nationality: China
organization: Beijing Institute of Technology
prohibited_uses: ''
quality_control: ''
size: 3B parameters (dense)
training_emissions: unknown
training_hardware: 8 A100 80G GPUs
training_time: unknown
type: model
url: https://github.com/GeneZC/MiniMA
================================================
FILE: assets/berkeley.yaml
================================================
- access: open
analysis: Evaluated in comparison with ChatGPT and Stanford Alpaca.
created_date: 2023-04-03
dependencies:
- LLaMA
- web-scraped dialogue data
description: "A relatively small chatbot trained by fine-tuning Meta\u2019s LLaMA\
\ on dialogue data gathered from the web."
feedback: https://huggingface.co/TheBloke/koala-7B-GPTQ-4bit-128g/discussions
intended_uses: academic research
license: Apache 2.0
modality: text; text
model_card: https://huggingface.co/TheBloke/koala-7B-GPTQ-4bit-128g
monitoring: ''
name: Koala
nationality: USA
organization: Berkeley
prohibited_uses: ''
quality_control: ''
size: 13B parameters (dense)
training_emissions: ''
training_hardware: 8 A100 GPUs
training_time: 6 hours
type: model
url: https://bair.berkeley.edu/blog/2023/04/03/koala/
- access: open
analysis: Evaluated using an AST sub-tree matching technique and compared to other
  models in terms of API functionality accuracy.
created_date: 2023-05-24
dependencies:
- LLaMA
- Gorilla document retriever
description: Gorilla is a finetuned LLaMA-based model that surpasses the performance
of GPT-4 on writing API calls.
feedback: ''
intended_uses: In conjunction with an LLM to improve its capability for using API
  calls.
license: Apache 2.0
modality: text; API
model_card: ''
monitoring: ''
name: Gorilla
nationality: USA
organization: Berkeley
prohibited_uses: ''
quality_control: No specific quality control is mentioned in model training, though
details on data processing and collection are provided in the paper.
size: 7B parameters (dense)
training_emissions: ''
training_hardware: ''
training_time: ''
type: model
url: https://arxiv.org/pdf/2305.15334v1.pdf
- access: open
analysis: Evaluated on a wide range of tasks using its own evaluation benchmarks.
created_date: 2023-05-03
dependencies:
- RedPajama
description: OpenLLaMA is an open source reproduction of Meta's LLaMA model.
feedback: ''
intended_uses: ''
license: Apache 2.0
modality: text; text
model_card: ''
monitoring: ''
name: OpenLLaMA
nationality: USA
organization: Berkeley
prohibited_uses: ''
quality_control: ''
size: 7B parameters (dense)
training_emissions: unknown
training_hardware: ''
training_time: unknown
type: model
url: https://github.com/openlm-research/open_llama
================================================
FILE: assets/bigcode.yaml
================================================
- access: open
analysis: Tested on several benchmarks, most notably Python benchmark HumanEval.
created_date: 2023-05-09
dependencies:
- The Stack
description: StarCoder is a Large Language Model for Code (Code LLM) trained on
  permissively licensed data from GitHub, including 80+ programming languages, Git
  commits, GitHub issues, and Jupyter notebooks.
feedback: https://huggingface.co/bigcode/starcoder/discussions
intended_uses: As a foundation model to fine-tune and create more specialized models
  that support use cases such as code completion, fill-in-the-middle (see the sketch
  after this entry), and text summarization. Can also be used with the Tech Assistant
  prompt, though it is not an instruction model, given training limitations.
license: BigCode Open RAIL-M v1.0
modality: code; code
model_card: https://huggingface.co/bigcode/starcoder
monitoring: ''
name: StarCoder
nationality: International
organization: BigCode
prohibited_uses: See BigCode Open RAIL-M license and FAQ
quality_control: No specific quality control is mentioned in model training, though
details on data processing and how the tokenizer was trained are provided in the
paper.
size: 15.5B parameters (dense)
training_emissions: 16.68 tons of CO2eq
training_hardware: 512 A100 80GB GPUs distributed across 64 nodes
training_time: 320,256 GPU hours
type: model
url: https://arxiv.org/pdf/2305.06161.pdf
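# A minimal fill-in-the-middle sketch for StarCoder, assuming the Hugging Face
# transformers library; the FIM sentinel tokens follow the model card, and the
# code fragment is hypothetical.
#
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#
#   tok = AutoTokenizer.from_pretrained("bigcode/starcoder")
#   model = AutoModelForCausalLM.from_pretrained("bigcode/starcoder")
#   prompt = "<fim_prefix>def fib(n):\n<fim_suffix>\n    return a<fim_middle>"
#   out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=48)
#   print(tok.decode(out[0]))  # the model fills in the middle of the function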
- access: open
analysis: Evaluated on MultiPL-E system benchmarks.
created_date: 2023-02-24
dependencies:
- The Stack
- BigCode Dataset
description: Multilingual code model informed by the BigCode Project's analysis
  of the association between GitHub stars and data quality.
feedback: https://huggingface.co/bigcode/santacoder/discussions
intended_uses: The model was trained on GitHub code. As such it is not an instruction
  model, and commands do not work well. You should phrase requests as they occur
  in source code, such as comments, or write a function signature and docstring
  and let the model complete the function body, as in the sketch after this entry.
license: BigCode Open RAIL-M v1
modality: code; code
model_card: https://huggingface.co/bigcode/santacoder
monitoring: ''
name: SantaCoder
nationality: International
organization: BigCode
prohibited_uses: See BigCode Open RAIL-M license and FAQ
quality_control: ''
size: 1.1B parameters (dense)
training_emissions: 124 kg of CO2eq
training_hardware: 96 NVIDIA Tesla V100 GPUs
training_time: 14,284 GPU hours
type: model
url: https://arxiv.org/pdf/2301.03988.pdf
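# The completion-style prompting described in intended_uses above, as a minimal
# sketch with Hugging Face transformers; the example function is hypothetical,
# and trust_remote_code follows the model card's custom-architecture note.
#
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#
#   tok = AutoTokenizer.from_pretrained("bigcode/santacoder")
#   model = AutoModelForCausalLM.from_pretrained("bigcode/santacoder",
#                                                trust_remote_code=True)
#   prompt = 'def hello_world():\n    """Print a greeting."""\n'
#   out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=32)
#   print(tok.decode(out[0]))  # the model completes the function body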
- access: open
analysis: Evaluated models trained on The Stack on HumanEval and MBPP and compared
against similarly-sized models.
created_date: 2022-11-20
datasheet: https://huggingface.co/datasets/bigcode/the-stack
dependencies:
- GitHub
description: The Stack contains over 6TB of permissively-licensed source code files
  covering 358 programming languages. The Stack serves as a pre-training dataset
  for Code LLMs, i.e., code-generating AI systems which enable the synthesis of
  programs from natural language descriptions as well as from other code snippets.
excluded: ''
feedback: https://huggingface.co/datasets/bigcode/the-stack/discussions
included: ''
intended_uses: creating code LLMs
license: The Stack is a collection of source code from repositories with various
licenses. Any use of all or part of the code gathered in The Stack must abide
by the terms of the original licenses, including attribution clauses when relevant.
Provenance information is provided for each data point.
modality: code
monitoring: ''
name: The Stack
nationality: International
organization: BigCode
prohibited_uses: See https://huggingface.co/datasets/bigcode/the-stack
quality_control: allowed users whose data were part of The Stack to opt out
sample:
- https://huggingface.co/datasets/bigcode/the-stack/viewer/default/train
size: 6 TB
type: dataset
url: https://arxiv.org/pdf/2211.15533.pdf
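# A hedged loading sketch for the dataset above, assuming the Hugging Face
# datasets library; streaming avoids downloading all 6 TB, and the per-language
# data_dir layout follows the dataset card ("data/python" is one example).
#
#   from datasets import load_dataset
#
#   ds = load_dataset("bigcode/the-stack", data_dir="data/python",
#                     split="train", streaming=True)
#   for row in ds.take(1):
#       print(row["content"][:200])  # raw source text; license metadata per row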
- access: open
analysis: See https://arxiv.org/pdf/2402.19173.pdf
created_date: 2024-02-28
dependencies:
- The Stack v2
description: StarCoder2-15B model is a 15B parameter model trained on 600+ programming
languages from The Stack v2, with opt-out requests excluded. The training was
carried out using the Fill-in-the-Middle objective on 4+ trillion tokens.
feedback: https://huggingface.co/bigcode/starcoder2-15b/discussions
intended_uses: The model was trained on GitHub code as well as additional selected
data sources such as Arxiv and Wikipedia. As such it is not an instruction model
and commands like "Write a function that computes the square root." do not work
well. Intended to generate code snippets from given context, but not for writing
actual functional code directly.
license: BigCode OpenRail-M
modality: code; text
model_card: https://huggingface.co/bigcode/starcoder2-15b
monitoring: unknown
name: StarCoder2-15B
nationality: International
organization: BigCode
prohibited_uses: See BigCode Open RAIL-M license and FAQ
quality_control: The training data was filtered to include only permissively licensed
  code and code with no license. A search index is provided to identify where generated
  code came from, to apply the proper attribution.
size: 15B parameters (dense)
training_emissions: unknown
training_hardware: 1024 x H100 GPUs
training_time: unknown
type: model
url: https://www.servicenow.com/company/media/press-room/huggingface-nvidia-launch-starcoder2.html
- access: open
analysis: See https://arxiv.org/pdf/2402.19173.pdf
created_date: 2024-02-28
dependencies:
- The Stack v2
description: StarCoder2-7B model is a 7B parameter model trained on 17 programming
languages from The Stack v2, with opt-out requests excluded. The model uses Grouped
Query Attention, a context window of 16,384 tokens with a sliding window attention
of 4,096 tokens, and was trained using the Fill-in-the-Middle objective on 3.5+
trillion tokens.
feedback: https://huggingface.co/bigcode/starcoder2-7b/discussions
intended_uses: Intended to generate code snippets from given context, but not for
writing actual functional code directly. The model has been trained on source
code from 17 programming languages. The predominant natural language in the source
  code is English, although other languages are also present. As such the model
  is capable of generating
code snippets provided some context but the generated code is not guaranteed to
work as intended. It can be inefficient and contain bugs or exploits. See the
paper for an in-depth discussion of the model limitations.
license: BigCode OpenRail-M
modality: code; text
model_card: https://huggingface.co/bigcode/starcoder2-7b
monitoring: unknown
name: StarCoder2-7B
nationality: International
organization: BigCode
prohibited_uses: See BigCode Open RAIL-M license and FAQ
quality_control: The training data was filtered to include only permissively licensed
  code and code with no license. A search index is provided to identify where generated
  code came from, to apply the proper attribution.
size: 7B parameters (dense)
training_emissions: 29,622.83 kgCO2eq
training_hardware: 432 H100 GPUs
training_time: 145,152 hours (cumulative)
type: model
url: https://www.servicenow.com/company/media/press-room/huggingface-nvidia-launch-starcoder2.html
- access: open
analysis: See https://arxiv.org/pdf/2402.19173.pdf
created_date: 2024-02-28
dependencies:
- The Stack v2
description: StarCoder2-3B model is a 3B parameter model trained on 17 programming
languages from The Stack v2, with opt-out requests excluded. The model uses Grouped
Query Attention, a context window of 16,384 tokens with a sliding window attention
of 4,096 tokens, and was trained using the Fill-in-the-Middle objective on 3+
trillion tokens.
feedback: https://huggingface.co/bigcode/starcoder2-3b/discussions
intended_uses: Intended to generate code snippets from given context, but not for
writing actual functional code directly. The model has been trained on source
code from 17 programming languages. The predominant natural language in the source
  code is English, although other languages are also present. As such the model
  is capable of generating
code snippets provided some context but the generated code is not guaranteed to
work as intended. It can be inefficient and contain bugs or exploits. See the
paper for an in-depth discussion of the model limitations.
license: BigCode OpenRail-M
modality: code; text
model_card: https://huggingface.co/bigcode/starcoder2-3b
monitoring: unknown
name: StarCoder2-3B
nationality: International
organization: BigCode
prohibited_uses: See BigCode Open RAIL-M license and FAQ
quality_control: The training data was filtered to include only permissively licensed
  code and code with no license. A search index is provided to identify where generated
  code came from, to apply the proper attribution.
size: 3B parameters (dense)
training_emissions: 16,107.01 kgCO2eq
training_hardware: 160 A100 GPUs
training_time: 97,120 hours (cumulative)
type: model
url: https://www.servicenow.com/company/media/press-room/huggingface-nvidia-launch-starcoder2.html
================================================
FILE: assets/bigscience.yaml
================================================
- access:
explanation: While not all of the ROOTS corpus is publicly accessible, the majority
  is accessible through [this form](https://docs.google.com/forms/d/e/1FAIpQLSdq50O1x4dkdGI4dwsmchFuNI0KCWEDiKUYxvd0r0_sl6FfAQ/viewform?usp=send_form)
value: open
analysis: https://huggingface.co/spaces/bigscience-data/roots-search
created_date: 2022-06-06
datasheet: https://huggingface.co/spaces/bigscience-data/bigscience-corpus
dependencies: []
description: The Responsible Open-science Open-collaboration Text Sources (ROOTS)
corpus, a 1.6TB dataset spanning 59 languages that was used to train the 176-billion-parameter
BigScience Large Open-science Open-access Multilingual (BLOOM) language model.
excluded: unknown
feedback: https://huggingface.co/spaces/bigscience-data/roots-search/discussions
included: See section 2 of the paper.
intended_uses: To empower large-scale monolingual and multilingual modeling projects
with both the data and the processing tools, as well as stimulate research around
this large multilingual corpus.
license:
explanation: Each constituent subset of the dataset will be released under the
license that applies to it. ([See individual dataset page for specific license
information](https://hf.co/bigscience-data))
value: custom
modality: code, text
monitoring: none
name: ROOTS
nationality: International
organization: BigScience
prohibited_uses: none
quality_control: Data collection involved merging and deduplicating searches to
remove menus, HTML tags. Further, a quality improvement pipeline was implemented.
sample:
- https://huggingface.co/spaces/bigscience-data/roots-search
size: 1.6TB
type: dataset
url: https://openreview.net/forum?id=UoEw6KigkUn
- access: open
analysis: See the paper.
created_date: 2022-10-15
datasheet: https://huggingface.co/datasets/bigscience/P3
dependencies: []
description: The Public Pool of Prompts (P3) is a collection of prompts written
  in a unified format and used to train T0++.
excluded: We conservatively decided not to prompt datasets that contain potentially
harmful content (for instance, datasets built on social media content).
feedback: Point of Contact is [Victor Sanh](https://huggingface.co/datasets/bigscience/P3)
included: The Public Pool of Prompts relies on the Hugging Face Dataset library.
Any public dataset in the Datasets library can be prompted. We selected the datasets
that have at least one subset in English and excluded datasets containing (predominantly)
non-natural language examples.
intended_uses: Multitask finetuning of language models.
license: Apache 2.0
modality: text
monitoring: none
name: P3
nationality: International
organization: BigScience
prohibited_uses: none
quality_control: Data collection involved merging and deduplicating searches to
remove menus, HTML tags. Further, a quality improvement pipeline was implemented.
sample:
- https://huggingface.co/datasets/bigscience/P3/viewer
size: 2000 prompts
type: dataset
url: https://arxiv.org/pdf/2110.08207.pdf
- access: open
analysis: See the paper.
created_date: 2022-11-03
datasheet: https://huggingface.co/datasets/bigscience/xP3
dependencies:
- P3
description: xP3 (Crosslingual Public Pool of Prompts) is a collection of prompts
and datasets across 46 languages and 16 NLP tasks. It is used for the training
of BLOOMZ and mT0, multilingual language models capable of following human instructions
in dozens of languages zero-shot.
excluded: We conservatively decided not to prompt datasets that contain potentially
harmful content (for instance, datasets built on social media content).
feedback: Point of Contact is [Niklas Muennighoff](https://huggingface.co/datasets/bigscience/xP3)
included: xP3 adds 28 multilingual datasets to P3 based on the P3 task taxonomy.
intended_uses: Multitask finetuning of language models.
license: Apache 2.0
modality: code, text
monitoring: none
name: xP3
nationality: International
organization: BigScience
prohibited_uses: none
quality_control: Data collection involved merging and deduplicating searches to
remove menus, HTML tags. Further, a quality improvement pipeline was implemented.
sample:
- https://huggingface.co/datasets/bigscience/xP3/viewer
size: 9.4GB
type: dataset
url: https://arxiv.org/abs/2211.01786
- access: open
analysis: ''
created_date:
explanation: The date the paper was posted to arXiv.
value: 2021-10-15
dependencies:
- T5
- P3
description: T0++ is a multitask fine-tuned language model based on T5.
feedback: https://huggingface.co/bigscience/T0pp/discussions
intended_uses: You can use the models to perform inference on tasks by specifying
your query in natural language, and the models will generate a prediction.
license: Apache 2.0
modality: text; text
model_card: https://huggingface.co/bigscience/T0pp
monitoring: none
name: T0++
nationality: International
organization: BigScience
prohibited_uses: none
quality_control: https://arxiv.org/pdf/2110.08207.pdf
size: 11B parameters (dense)
training_emissions:
explanation: Sourced from estimates in https://arxiv.org/pdf/2110.08207.pdf
value: 0.9 tCO2e
training_hardware: Jean Zay (v3-512)
training_time:
explanation: Sourced from estimates in https://arxiv.org/pdf/2110.08207.pdf
value: 27 hours
type: model
url: https://arxiv.org/pdf/2110.08207.pdf
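# The natural-language querying described in intended_uses above, as a minimal
# sketch with Hugging Face transformers; T0++ is a T5-derived seq2seq model, and
# the example query is the one shown on the model card.
#
#   from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
#
#   tok = AutoTokenizer.from_pretrained("bigscience/T0pp")
#   model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/T0pp")
#   ids = tok("Is this review positive or negative? Review: this is the best "
#             "cast iron skillet you will ever buy", return_tensors="pt")
#   print(tok.decode(model.generate(**ids)[0], skip_special_tokens=True))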
- access: open
analysis: ''
created_date:
explanation: The date the model was released
value: 2022-07-12
dependencies:
- ROOTS
description: BLOOM is an autoregressive multilingual language model.
feedback: https://huggingface.co/bigscience/bloom/discussions
intended_uses: This model is being created in order to enable public research on
large language models (LLMs). LLMs are intended to be used for language generation
or as a pretrained base model that can be further fine-tuned for specific tasks.
Use cases below are not exhaustive.
license:
explanation: Model is licensed under https://huggingface.co/spaces/bigscience/license
value: BigScience RAIL v1.0
modality: code, text; code, text
model_card: ''
monitoring: none
name: BLOOM
nationality: International
organization: BigScience
prohibited_uses: Using the model in high-stakes settings is out of scope for this
model (e.g. biomedical/political/legal/finance domains, evaluating or scoring
individuals). The model is not designed for critical decisions nor uses with any
material consequences on an individual's livelihood or wellbeing. The model outputs
content that appears factual but may not be correct. Intentionally using
  the model for harm, violating human rights, or engaging in other kinds of malicious
  activities is a misuse of this model (e.g. spam generation, disinformation, disparagement,
  deception, surveillance).
quality_control: ''
size: 176B parameters (dense)
training_emissions:
explanation: Sourced from estimates in https://arxiv.org/abs/2211.02001
value: 25 tCO2e
training_hardware: Jean Zay (48 * 8xA100 80GB nodes)
training_time:
explanation: 1082990 A100 hours at 156 TFLOP/s maximum utilization
value: 7039 petaflop/s-days
type: model
url: https://arxiv.org/abs/2211.05100
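# A minimal generation sketch matching the intended_uses above, assuming Hugging
# Face transformers (plus accelerate for device_map); the prompt is hypothetical,
# and the 176B checkpoint needs multi-GPU or offloaded inference in practice.
#
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#
#   tok = AutoTokenizer.from_pretrained("bigscience/bloom")
#   model = AutoModelForCausalLM.from_pretrained("bigscience/bloom",
#                                                device_map="auto")
#   ids = tok("A multilingual language model is", return_tensors="pt")
#   print(tok.decode(model.generate(**ids, max_new_tokens=30)[0]))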
- access: open
analysis: https://huggingface.co/bigscience/bloomz#evaluation
created_date:
explanation: The date the model was released
value: 2021-10-15
dependencies:
- mT5
- xP3
description: mT0 is a multitask fine-tuned multilingual language model based on
  mT5.
feedback: https://huggingface.co/bigscience/bloomz/discussions
intended_uses: We recommend using the model to perform tasks expressed in natural
language.
license:
explanation: Model is licensed under https://huggingface.co/spaces/bigscience/license
value: BigScience RAIL v1.0
modality: text; text
model_card: https://huggingface.co/bigscience/mt0-xxl
monitoring: none
name: mT0
nationality: International
organization: BigScience
prohibited_uses: none
quality_control: https://arxiv.org/pdf/2211.01786.pdf
size: 13B parameters (dense)
training_emissions: unknown
training_hardware: Jean Zay (288 A100 80GB GPUs with 8 GPUs per node (36 nodes)
using NVLink 4 inter-gpu connects, 4 OmniPath links)
training_time: unknown
type: model
url: https://arxiv.org/pdf/2110.08207.pdf
- access: open
analysis: https://huggingface.co/bigscience/bloomz#evaluation
created_date: 2022-11-03
dependencies:
- BLOOM
- xP3
description: BLOOMZ is a multitask fine-tuned autoregressive multilingual language
  model.
feedback: https://huggingface.co/bigscience/bloomz/discussions
intended_uses: We recommend using the model to perform tasks expressed in natural
language.
license:
explanation: Model is licensed under https://huggingface.co/spaces/bigscience/license
value: BigScience RAIL v1.0
modality: code, text; code, text
model_card: https://huggingface.co/bigscience/bloomz
monitoring: none
name: BLOOMZ
nationality: International
organization: BigScience
prohibited_uses: none
quality_control: https://arxiv.org/pdf/2211.01786.pdf
size: 176B parameters (dense)
training_emissions: unknown
training_hardware: Jean Zay (288 A100 80GB GPUs with 8 GPUs per node (36 nodes)
using NVLink 4 inter-gpu connects, 4 OmniPath links)
training_time: unknown
type: model
url: https://arxiv.org/pdf/2211.01786.pdf
================================================
FILE: assets/bloomberg.yaml
================================================
- access:
explanation: See [[Openness]](https://arxiv.org/pdf/2303.17564.pdf#subsection.8.2)
value: closed
analysis: ''
created_date:
explanation: The date BloombergGPT was announced in the [[Bloomberg article]](https://www.bloomberg.com/company/press/bloomberggpt-50-billion-parameter-llm-tuned-finance/).
value: 2023-03-30
datasheet:
explanation: Section 2 of the BloombergGPT paper.
value: https://arxiv.org/pdf/2303.17564.pdf#section.2
dependencies: []
description: A comprehensive dataset consisting of a range of English financial
documents including news, filings, press releases, web-scraped financial documents,
and social media drawn from the Bloomberg archives that was used to train the
BloombergGPT model.
excluded: ''
feedback: ''
included:
explanation: See [[Financial Datasets]](https://arxiv.org/pdf/2303.17564.pdf#subsection.2.1)
value: "FinPile consists of English financial documents. Authors utilize the The\
\ Bloomberg\nTerminal, which is an extensive collection of curated and maintained\
\ documents,\nto create the FinPile dataset. Each document in FinPile is time-stamped,\
\ with\ndates ranging from 2007-03-01 to 2022-07-31.\nTypes of data included\
\ are given below:\n 1. Web (298B tokens) - Includes Bloomberg's web crawl focused\
\ on high-quality\nwebsites that have financially relevant information. This\
\ makes up the majority\nof FinPile.\n 2. News (38B tokens) - Includes all\
\ news sources relevant to the financial\ncommunity, excluding news articles\
\ written by Bloomberg journalists. Overall,\nthere are hundreds of English\
\ news sources in FinPile including \"Bloomberg\nTranscripts\", which are transcripts\
\ of Bloomberg TV news.\n 3. Filings (14B tokens) - Includes financial statements\
\ prepared by (public)\ncompanies and made available to the general public.\
\ In the dataset, a majority\nof the filings come from EDGAR, which is the\
\ SEC's online database.\n 4. Press (9B tokens) - Includes press releases typically\
\ issued by companies\nthat are financially relevant.\n 5. Bloomberg (5B tokens)\
\ - Includes Bloomberg authored news and other documents\nsuch as opinions and\
\ analyses. The largest sources are \u201CBloomberg News\u201D and\n\u201CBloomberg\
\ First Word\u201D, the Bloomberg-authored wire of real-time news.\n"
intended_uses: Used to train the BloombergGPT model.
license: unknown
modality: text
monitoring: ''
name: FinPile
nationality: USA
organization: Bloomberg
prohibited_uses: ''
quality_control: ''
sample: []
size:
explanation: See [[Financial Datasets]](https://arxiv.org/pdf/2303.17564.pdf#subsection.2.1)
value: 363B tokens
type: dataset
url: https://arxiv.org/pdf/2303.17564.pdf#section.2
- access:
explanation: See [[Openness]](https://arxiv.org/pdf/2303.17564.pdf#subsection.8.2)
value: closed
analysis:
explanation: See [[Evaluation]](https://arxiv.org/pdf/2303.17564.pdf#section.5)
value: 'Authors evaluate the performance of BloombergGPT on two broad categories
of tasks, finance-specific and general purpose, on several standard benchmarks.
They compare BloombergGPT to the three closest models: GPT-NeoX, OPT-66B and
BLOOM-176B. They also report results from the original GPT-3 whenever externally
available. They conclude "We achieve strong results on general LLM benchmarks
and outperform comparable models on financial tasks. We attribute this, in decreasing
order of impact, to 1. a well-curated internal dataset, 2. our unique choice
in tokenizer, and 3. an up-to-date architecture."
'
created_date:
explanation: The date the model was announced in the [[Bloomberg article]](https://www.bloomberg.com/company/press/bloomberggpt-50-billion-parameter-llm-tuned-finance/).
value: 2023-03-30
dependencies:
- FinPile
- The Pile
- C4
- Wikipedia
description: BloombergGPT is a 50 billion parameter large language model that is
specifically trained on a wide range of financial data to support a diverse set
of natural language processing tasks within the financial industry.
feedback: ''
intended_uses:
explanation: See [[Bloomberg article]](https://www.bloomberg.com/company/press/bloomberggpt-50-billion-parameter-llm-tuned-finance/)
value: '"This model will assist Bloomberg in improving existing financial NLP
tasks, such as sentiment analysis, named entity recognition, news classification,
and question answering, among others. Furthermore, BloombergGPT will unlock
new opportunities for marshalling the vast quantities of data available on the
Bloomberg Terminal to better help the firm''s customers, while bringing the
full potential of AI to the financial domain."
'
license: unknown
modality: text; text
model_card: none
monitoring: ''
name: BloombergGPT
nationality: USA
organization: Bloomberg
prohibited_uses: ''
quality_control:
explanation: See [[Ethics, Limitations, and Implications]](https://arxiv.org/pdf/2303.17564.pdf#section.8)
value: "Authors state the following:\n- \"To provide natural language applications\
\ to the financial community, we\n have developed a rigorous risk and testing\
\ assessment process. This process\n includes careful annotation guidelines\
\ Tseng et al. (2020), pre-launch review\n at multiple levels by the central\
\ risk and compliance organizations, and\n by the product leaders (e.g., the\
\ newsroom) as applicable, and post-launch\n monitoring. Moreover, we conduct\
\ our research, development, and deployment\n of NLP and AI systems in accordance\
\ with all applicable regulations.\"\n- \"Similarly, toxicity and bias are areas\
\ where, as a company, we take extraordinary\n care with any content we produce,\
\ whether from humans or machines. Since\n the measurement of toxicity and\
\ bias in our model depends on its application\n areas, quantifying the potential\
\ for the generation of harmful language\n remains an open question. We are\
\ particularly interested in studying whether\n FinPile, which is cleaner and\
\ contains fewer examples of overtly biased\n or toxic language (e.g., Press\
\ Releases), reduces the proclivity of the\n model to generate inappropriate\
\ content.\"\n"
size: 50B parameters (dense)
training_emissions: unknown
training_hardware:
explanation: See [[Training Configuration]](https://arxiv.org/pdf/2303.17564.pdf#subsection.3.3)
value: 64 Amazon EC2 p4d.24xlarge instances each with 8 NVIDIA 40GB A100 GPUs
(i.e. total 512 A100 GPUs)
training_time:
explanation: See [[Training Run]](https://arxiv.org/pdf/2303.17564.pdf#section.4)
value: 53 days
type: model
url: https://arxiv.org/abs/2303.17564
================================================
FILE: assets/boston.yaml
================================================
- access: open
analysis: "Platypus achieves the strongest performance and currently stands at first\
\ place in HuggingFace\u2019s Open LLM Leaderboard as of its release date."
created_date: 2023-08-14
dependencies:
- LLaMA 2
- Platypus curated dataset
description: Platypus is a family of fine-tuned and merged Large Language Models
(LLMs).
feedback: https://huggingface.co/garage-bAInd/Platypus2-13B/discussions
intended_uses: ''
license: CC BY-NC-SA 4.0
modality: text; text
model_card: https://huggingface.co/garage-bAInd/Platypus2-13B
monitoring: ''
name: Platypus
nationality: USA
organization: Boston University
prohibited_uses: ''
quality_control: ''
size: 13B parameters (dense)
training_emissions: ''
training_hardware: 1 A100 GPU
training_time: 5 hours
type: model
url: https://arxiv.org/pdf/2308.07317.pdf
- access: open
analysis: UFOGen is evaluated on standard image benchmarks against other models
fine-tuned with Stable Diffusion.
created_date: 2023-11-14
dependencies:
- Stable Diffusion
description: UFOGen is a novel generative model designed for ultra-fast, one-step
text-to-image synthesis.
feedback: none
intended_uses: ''
license: unknown
modality: text; image
model_card: none
monitoring: ''
name: UFOGen
nationality: USA
organization: Boston University
prohibited_uses: ''
quality_control: ''
size: 900M parameters (dense)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://arxiv.org/pdf/2311.09257.pdf
================================================
FILE: assets/brex.yaml
================================================
- access: limited
adaptation: ''
created_date: 2023-03-07
dependencies:
- ChatGPT API
description: Brex Inc., a highly valued startup that makes software for finance
professionals, is turning to the same artificial intelligence tool behind ChatGPT
for a service that can answer questions about corporate budgets, policy and spending.
failures: ''
feedback: ''
intended_uses: ''
license:
explanation: '"Brex grants you a nonexclusive and nontransferable license to use
Brex Property as provided through the Services and as permitted by these User
Terms." - excerpt from the Terms of Service.
'
value: custom
monitoring: ''
monthly_active_users: ''
name: Brex Chat
nationality: USA
organization: Brex
output_space: ''
prohibited_uses: ''
quality_control: ''
terms_of_service: https://www.brex.com/legal/user-terms
type: application
url: https://fortune.com/2023/03/07/cfo-chatbot-chatgpt-ai-brex-finance-software-startup-budgets-policies/
user_distribution: ''
================================================
FILE: assets/bytedance.yaml
================================================
- access: open
analysis: Evaluated on benchmarks pertaining to speech, music, and other audio recognition.
created_date: 2023-10-20
dependencies:
- Whisper
- BEATs
- Vicuna
description: SALMONN is a large language model (LLM) enabling speech, audio event,
and music inputs.
feedback: https://huggingface.co/MSIIP/SALMONN/discussions
intended_uses: ''
license: Apache 2.0
modality: audio, text; text
model_card: https://huggingface.co/MSIIP/SALMONN
monitoring: none
name: SALMONN
nationality: unknown
organization: ByteDance, Tsinghua University
prohibited_uses: ''
quality_control: ''
size: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://github.com/bytedance/SALMONN
- access: open
analysis: Evaluated via qualitative comparison relative to other SoTA image generation
models.
created_date: 2024-02-21
dependencies:
- Stable Diffusion XL
description: SDXL-Lightning is a lightning-fast text-to-image generation model.
It can generate high-quality 1024px images in a few steps. The models are distilled
from stabilityai/stable-diffusion-xl-base-1.0. This repository contains checkpoints
for 1-step, 2-step, 4-step, and 8-step distilled models.
feedback: https://huggingface.co/ByteDance/SDXL-Lightning/discussions
intended_uses: The model can be used for fast, high-quality text-to-image generation.
It supports 1-step, 2-step, 4-step, and 8-step distilled models which provide
varying generation quality.
license: OpenRail++
modality: text; image
model_card: https://huggingface.co/ByteDance/SDXL-Lightning
monitoring: unknown
name: SDXL-Lightning
nationality: China
organization: ByteDance
prohibited_uses: unknown
quality_control: unknown
size: unknown
training_emissions: unknown
training_hardware: 64 A100 80G GPUs
training_time: unknown
type: model
url: https://arxiv.org/pdf/2402.13929.pdf
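# A hedged sketch of running the 4-step distilled checkpoint above, following
# the diffusers recipe on the model card (the Euler scheduler with trailing
# timesteps and guidance_scale=0 come from that recipe; the prompt is
# hypothetical):
#
#   import torch
#   from diffusers import (StableDiffusionXLPipeline, UNet2DConditionModel,
#                          EulerDiscreteScheduler)
#   from huggingface_hub import hf_hub_download
#   from safetensors.torch import load_file
#
#   base = "stabilityai/stable-diffusion-xl-base-1.0"
#   unet = UNet2DConditionModel.from_config(base, subfolder="unet").to(
#       "cuda", torch.float16)
#   unet.load_state_dict(load_file(hf_hub_download(
#       "ByteDance/SDXL-Lightning", "sdxl_lightning_4step_unet.safetensors"),
#       device="cuda"))
#   pipe = StableDiffusionXLPipeline.from_pretrained(
#       base, unet=unet, torch_dtype=torch.float16, variant="fp16").to("cuda")
#   pipe.scheduler = EulerDiscreteScheduler.from_config(
#       pipe.scheduler.config, timestep_spacing="trailing")
#   image = pipe("a cat in a spacesuit", num_inference_steps=4,
#                guidance_scale=0).images[0]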
- access: open
analysis: LLaVA-Critic was tested in scenarios such as LMM-as-a-Judge and Preference
Learning, showing a high correlation with commercial GPT models in evaluation
scores. It served as an alternative to expensive human feedback in resource-constrained
settings and demonstrated better performance in providing AI-generated feedback
for model alignment compared to human-reliant reward models.
created_date: 2024-10-06
dependencies: []
description: LLaVA-Critic is an open-source large multimodal model (LMM) designed
as a generalist evaluator. It assesses performance across a variety of multimodal
tasks by following a high-quality critic instruction dataset, incorporating diverse
evaluation criteria. The model is effective in areas like LMM-as-a-Judge, providing
reliable evaluation scores comparable to GPT models, and Preference Learning,
offering reward signals for preference learning to enhance model alignment capabilities.
feedback: unknown
intended_uses: The model can be used for evaluating multimodal tasks, generating
reward signals for preference learning, and serving as a reliable alternate judge
for model assessments.
license: Apache 2.0
modality: image, text; text
model_card: unknown
monitoring: unknown
name: LLaVA-Critic
nationality: unknown
organization: ByteDance and University of Maryland, College Park
prohibited_uses: The model should not be used in scenarios requiring authorization
  from proprietary models, nor relied upon for critical applications without human
  oversight, due to potential biases in its dataset.
quality_control: The model ensures quality by utilizing a high-quality dataset for
critic instructions, providing both quantitative judgments and reasoning, with
transparency in assessments.
size: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://arxiv.org/pdf/2410.02712
================================================
FILE: assets/cagliostro.yaml
================================================
- access: open
analysis: unknown
created_date: 2024-03-18
dependencies:
- Animagine XL 3.0
description: An open-source, anime-themed text-to-image model enhanced to generate
higher quality anime-style images with a broader range of characters from well-known
anime series, an optimized dataset, and new aesthetic tags for better image creation.
feedback: https://huggingface.co/cagliostrolab/animagine-xl-3.1/discussions
intended_uses: Generating high-quality anime images from textual prompts. Useful
for anime fans, artists, and content creators.
license: Fair AI Public License 1.0-SD
modality: text; image
model_card: https://huggingface.co/cagliostrolab/animagine-xl-3.1
monitoring: unknown
name: Animagine XL 3.1
nationality: USA
organization: Cagliostro Research Lab
prohibited_uses: Not suitable for creating realistic photos or for users who expect
high-quality results from short or simple prompts.
quality_control: The model undergoes pretraining, first stage finetuning, and second
stage finetuning for refining and improving aspects such as hand and anatomy rendering.
size: unknown
training_emissions: unknown
training_hardware: 2x A100 80GB GPUs
training_time: Approximately 15 days, totaling over 350 GPU hours.
type: model
url: https://cagliostrolab.net/posts/animagine-xl-v31-release
================================================
FILE: assets/cartesia.yaml
================================================
- access: limited
analysis: Extensive testing on the Multilingual Librispeech dataset resulted in
  20% lower validation perplexity. In downstream evaluations, this leads to a 2x lower
word error rate and a 1 point higher quality score. Sonic also displays impressive
performance metrics at inference, achieving lower latency (1.5x lower time-to-first-audio),
faster inference speed (2x lower real-time factor), and higher throughput (4x).
created_date: 2024-05-29
dependencies:
- Multilingual Librispeech dataset
description: Sonic is a low-latency voice model that generates lifelike speech.
Developed by Cartesia, it was designed to be an efficient real-time AI capable
of processing any-sized contexts and running on any device.
feedback: Contact through the provided form or via email at join@cartesia.ai.
intended_uses: Sonic has potential applications across customer support, entertainment,
and content creation, and is part of Cartesia's broader mission to bring real-time
multimodal intelligence to every device.
license: unknown
modality: text; audio
model_card: none
monitoring: unknown
name: Sonic
nationality: USA
organization: Cartesia
prohibited_uses: unknown
quality_control: ''
size: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://cartesia.ai/blog/sonic
================================================
FILE: assets/casia.yaml
================================================
- access: open
analysis: Reports results on standard translation benchmarks across 102 languages
  in comparison with Google Translate and ChatGPT.
created_date: 2023-05-29
dependencies:
- LLaMA
- CLUE
- BigTrans parallel dataset
description: BigTrans is a model that adapts LLaMA, which covers only 20 languages,
  enhancing it with multilingual translation capability for more than 100 languages.
feedback: https://huggingface.co/James-WYang/BigTrans/discussions
intended_uses: Advancing future research in multilingual LLMs
license: Apache 2.0
modality: text; text
model_card: https://huggingface.co/James-WYang/BigTrans
monitoring: ''
name: BigTrans
nationality: China
organization: Institute of Automation Chinese Academy of Sciences
prohibited_uses: ''
quality_control: ''
size: 13B parameters (dense)
training_emissions: unknown
training_hardware: 16 A100 80GB GPUs
training_time: unknown
type: model
url: https://arxiv.org/pdf/2305.18098v1.pdf
- access: open
analysis: Evaluated on standard benchmarks for knowledge and language understanding,
mathematical reasoning, and programming ability in comparison to similarly sized
open-source models.
created_date: 2023-12-22
dependencies: []
description: YAYI 2 is an open source large language model trained in both English
and Chinese.
feedback: https://huggingface.co/wenge-research/yayi2-30b/discussions
intended_uses: ''
license:
explanation: Model is under a custom [license](https://github.com/wenge-research/YAYI2/blob/main/COMMUNITY_LICENSE),
while code is Apache 2.0
value: custom
modality: text; text
model_card: https://huggingface.co/wenge-research/yayi2-30b
monitoring: ''
name: YAYI 2
nationality: China
organization: Institute of Automation Chinese Academy of Sciences
prohibited_uses: ''
quality_control: data is deduplicated, normalized, cleaned, and filtered for toxicity
size: 30B parameters (dense)
training_emissions: unknown
training_hardware: over 1000 A800 GPUs
training_time: unknown
type: model
url: https://arxiv.org/pdf/2312.14862.pdf
================================================
FILE: assets/causallm.yaml
================================================
- access: open
analysis: Evaluated on standard benchmarks across a range of tasks.
created_date: 2023-10-21
dependencies:
- Qwen
- OpenOrca
- Open Platypus
description: CausalLM is an LLM based on the model weights of Qwen and trained
  with a model architecture identical to that of LLaMA 2.
feedback: none
intended_uses: ''
license:
explanation: can be found at https://github.com/rpherrera/WTFPL (HuggingFace lists
  this as the license)
value: WTFPL
modality: text; text
model_card: https://huggingface.co/CausalLM/14B
monitoring: unknown
name: CausalLM
nationality: USA
organization: CausalLM
prohibited_uses: ''
quality_control: ''
size: 14B parameters (dense)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://huggingface.co/CausalLM/14B
================================================
FILE: assets/cerebras.yaml
================================================
- access:
explanation: 'The Pile is an open source dataset. Hugging Face compatible checkpoints
available on the [[Cerebras Hugging Face page]](https://huggingface.co/cerebras/Cerebras-GPT-13B).
Cerebras systems checkpoints for pre-training and fine tuning are available
in the cloud via the [[Cerebras Model Studio]](https://www.cerebras.net/product-cloud/).
'
value: open
analysis: '"We evaluate our models on the PILE validation set comprising 380M tokens.
We also evaluate the public checkpoints of Pythia, Eleuther (2022); OPT, Zhang
et al. (2022); GPT-NeoX 20B, Black et al. (2022); and GPT-J 6B, Wang & Komatsuzaki
(2021). We performed upstream (pre-training) evaluations of text prediction cross-entropy
using the Pile validation and test splits. We performed downstream evaluations
of text generation accuracy on standardized tasks using the Eleuther lm-evaluation-harness."
[[Evaluations]](https://github.com/Cerebras/modelzoo/tree/main/modelzoo/transformers/pytorch/gpt3/configs/Cerebras_GPT#evaluations).
'
created_date:
explanation: 'The date the model was announced in the [[Cerebras blog post]](https://www.cerebras.net/blog/cerebras-gpt-a-family-of-open-compute-efficient-large-language-models).
'
value: 2023-03-28
dependencies:
- The Pile
description: 'A Family of Open, Compute-efficient, Large Language Models. The family
includes 111M, 256M, 590M, 1.3B, 2.7B, 6.7B, and 13B models. All models in the
Cerebras-GPT family have been trained in accordance with Chinchilla scaling laws
(20 tokens per model parameter). [[Cerebras Blog Post]](https://www.cerebras.net/blog/cerebras-gpt-a-family-of-open-compute-efficient-large-language-models)
'
feedback: ''
intended_uses: '"The primary intended use is to further research into large language
models. These models can be used as a foundation model for NLP, applications,
ethics, and alignment research. Our primary intended users are researchers who
are working to improve LLMs and practitioners seeking reference implementations,
training setups, hyperparameters, or pre-trained models. We release these models
with a fully permissive Apache license for the community to use freely." [[Uses
and Limitations]](https://github.com/Cerebras/modelzoo/tree/main/modelzoo/transformers/pytorch/gpt3/configs/Cerebras_GPT#uses-and-limitations).
'
license: Apache 2.0
modality: text; text
model_card: https://huggingface.co/cerebras/Cerebras-GPT-13B
monitoring: ''
name: Cerebras-GPT
nationality: USA
organization: Cerebras
prohibited_uses: 'Authors note the following limitations of the model: "Cerebras-GPT
models are trained on the Pile, with English language only, and are not suitable
for machine translation tasks. Cerebras-GPT models have not been tuned for human-facing
dialog applications like chatbots and will not respond to prompts in a similar
way to models that have received instruction tuning or reinforcement learning
from human feedback (RLHF) like Flan-T5 or ChatGPT." [[Uses and Limitations]](https://github.com/Cerebras/modelzoo/tree/main/modelzoo/transformers/pytorch/gpt3/configs/Cerebras_GPT#out-of-scope-use).
'
quality_control: 'The Pile dataset has been thoroughly analyzed from various ethical
standpoints such as toxicity analysis, gender bias, pejorative content, racially
sensitive content etc. Only mitigations in standard Pile dataset pre-processing
were employed when pre-training Cerebras-GPT. [[Risk, Bias, Ethical Considerations]](https://github.com/Cerebras/modelzoo/tree/main/modelzoo/transformers/pytorch/gpt3/configs/Cerebras_GPT#risk-bias-ethical-considerations)
'
size: 13B parameters (dense)
training_emissions: ''
training_hardware:
explanation: 'According to [[Model Description]](https://huggingface.co/cerebras/Cerebras-GPT-13B#model-description)
'
value: 16x Cerebras CS-2 wafer scale systems
training_time: ''
type: model
url: https://www.cerebras.net/blog/cerebras-gpt-a-family-of-open-compute-efficient-large-language-models/
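# The Chinchilla rule quoted in the description (20 training tokens per model
# parameter) fixes each family member's token budget; a quick worked check for
# the listed sizes:
#
#   for params in [111e6, 256e6, 590e6, 1.3e9, 2.7e9, 6.7e9, 13e9]:
#       print(f"{params / 1e9:6.3f}B params -> {20 * params / 1e9:5.0f}B tokens")
#   # e.g. the 13B model implies roughly 20 * 13B = 260B training tokens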
- access: open
analysis: Evaluated on standard English LLM benchmarks and adapted Arabic LLM benchmarks.
created_date: 2023-08-30
dependencies:
- GPT-3
- The Pile
description: "Jais is the world\u2019s most advanced Arabic LLM as of its release."
feedback: none
intended_uses: Jais is released with the aim to stimulate research and development
in the Arabic NLP community.
license: Apache 2.0
modality: text; text
model_card:
explanation: "Found in section C \u201CModel Cards\u201D"
value: https://inceptioniai.org/jais/docs/Technicalpaper.pdf
monitoring: unknown
name: Jais
nationality: unknown
organization: Inception Institute of Artificial Intelligence, Cerebras, Mohamed
bin Zayed University of Artificial Intelligence
prohibited_uses: Generating or endorsing hate speech, disseminating false information,
engaging in illegal activities, managing sensitive data, attempting language generalization
beyond Arabic and English, and making critical decisions with high stakes.
quality_control: ''
size: 13B parameters (dense)
training_emissions: unknown
training_hardware: Condor Galaxy Supercomputer
training_time: unknown
type: model
url: https://inceptioniai.org/jais/docs/Technicalpaper.pdf
- access: open
analysis: Evaluated on standard English LLM benchmarks and adapted Arabic LLM benchmarks.
created_date: 2023-08-30
dependencies:
- GPT-3
- The Pile
description: Jais Chat is an instruction-tuned version of Jais, optimized for dialog
interaction.
feedback: none
intended_uses: Jais Chat is released with the aim to stimulate research and development
in the Arabic NLP community.
license: Apache 2.0
modality: text; text
model_card:
explanation: "Found in section C \u201CModel Cards\u201D"
value: https://inceptioniai.org/jais/docs/Technicalpaper.pdf
monitoring: unknown
name: Jais Chat
nationality: unknown
organization: Inception Institute of Artificial Intelligence, Cerebras, Mohamed
bin Zayed University of Artificial Intelligence
prohibited_uses: Generating or endorsing hate speech, disseminating false information,
engaging in illegal activities, managing sensitive data, attempting language generalization
beyond Arabic and English, and making critical decisions with high stakes.
quality_control: ''
size: 13B parameters (dense)
training_emissions: unknown
training_hardware: Condor Galaxy Supercomputer from Cerebras
training_time: unknown
type: model
url: https://inceptioniai.org/jais/docs/Technicalpaper.pdf
- access: open
analysis: Evaluated on standard LLM benchmarks in comparison to similar-sized models.
created_date: 2023-07-24
dependencies:
- SlimPajama
description: Bittensor Language Model is a 3 billion parameter language model with
an 8k context length trained on 627B tokens of SlimPajama.
feedback: https://huggingface.co/cerebras/btlm-3b-8k-base/discussions
intended_uses: ''
license: Apache 2.0
modality: text; text
model_card: https://huggingface.co/cerebras/btlm-3b-8k-base
monitoring: unknown
name: Bittensor Language Model
nationality: USA
organization: Cerebras
prohibited_uses: ''
quality_control: ''
size: 3B parameters (dense)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://www.cerebras.net/blog/btlm-3b-8k-7b-performance-in-a-3-billion-parameter-model/
- access: open
analysis: ''
created_date: 2023-06-09
datasheet: https://huggingface.co/datasets/cerebras/SlimPajama-627B
dependencies:
- RedPajama-Data
description: As of release, SlimPajama is the largest extensively deduplicated,
multi-corpora, open-source dataset for training large language models.
excluded: ''
feedback: https://huggingface.co/datasets/cerebras/SlimPajama-627B/discussions
included: ''
intended_uses: ''
license: Apache 2.0
modality: text
monitoring: unknown
name: SlimPajama
nationality: USA
organization: Cerebras
prohibited_uses: ''
quality_control: ''
sample: []
size: 627B tokens
type: dataset
url: https://huggingface.co/datasets/cerebras/SlimPajama-627B
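The SlimPajama record above links the dataset's Hugging Face page. As an illustrative aside (not part of the asset schema), here is a minimal sketch of how one might stream a few samples for inspection, assuming the Hugging Face `datasets` library is installed and that each record exposes a `text` field:

```python
# Minimal sketch: inspect a few SlimPajama samples without downloading the
# full 627B-token corpus. The dataset id comes from the url field above;
# the "text" field name is an assumption based on the dataset's layout.
from datasets import load_dataset

ds = load_dataset("cerebras/SlimPajama-627B", split="train", streaming=True)

for i, example in enumerate(ds):
    print(example["text"][:200])  # print a short excerpt of each document
    if i == 2:
        break
```

Streaming mode avoids materializing the corpus locally, which matters for a dataset of this size.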
================================================
FILE: assets/character.yaml
================================================
- access: limited
adaptation: ''
created_date: 2022-09-16
dependencies: []
description: Character allows users to converse with various chatbot personas.
failures: unknown
feedback: ''
intended_uses: ''
license: unknown
monitoring: ''
monthly_active_users: unknown
name: Character
nationality: USA
organization: Character AI
output_space: AI-generated chat conversations
prohibited_uses: ''
quality_control: ''
terms_of_service: https://beta.character.ai/tos
type: application
url: https://beta.character.ai/
user_distribution: unknown
================================================
FILE: assets/chatglm.yaml
================================================
- access: open
analysis: Performance evaluated on English and Chinese language benchmark tests.
created_date: 2023-03-14
dependencies: []
description: ChatGLM is a Chinese-English language model with question-answering
and dialogue functions, aimed at a Chinese audience.
feedback: ''
intended_uses: ''
license: Apache 2.0
modality: text; text
model_card: ''
monitoring: ''
name: ChatGLM
nationality: China
organization: ChatGLM
prohibited_uses: ''
quality_control: ''
size: 6B parameters (dense)
training_emissions: unknown
training_hardware: ''
training_time: unknown
type: model
url: https://chatglm.cn/blog
================================================
FILE: assets/cmu.yaml
================================================
- access:
explanation: Model checkpoints are available for download at https://github.com/VHellendoorn/Code-LMs
value: open
analysis: Reports results on standard code benchmarks across a variety of programming
languages.
created_date:
explanation: The date the model paper was released
value: 2022-02-26
dependencies:
- Github
description: PolyCoder is a 2.7B-parameter code model based on the GPT-2 architecture,
trained on 249GB of code across 12 programming languages on a single machine.
feedback: https://huggingface.co/NinedayWang/PolyCoder-2.7B/discussions
intended_uses: unknown
license:
explanation: The license is provided in the [[Github repository]](https://github.com/VHellendoorn/Code-LMs)
value: MIT
modality: code
model_card: https://huggingface.co/NinedayWang/PolyCoder-2.7B
monitoring: none
name: PolyCoder
nationality: USA
organization: Carnegie Mellon University
prohibited_uses: none
quality_control: No specific quality control is mentioned for model training, though
details on data processing and how the tokenizer was trained are provided in the
paper.
size: 2.7B parameters (dense)
training_emissions: unknown
training_hardware: 8 NVIDIA RTX 8000
training_time: 6 weeks
type: model
url: https://arxiv.org/abs/2202.13169
- access: open
analysis: Evaluated on nascent time-series datasets and benchmarks.
created_date: 2024-02-06
dependencies: []
description: Moment is a family of open-source foundation models for general-purpose
time-series analysis.
feedback: none
intended_uses: ''
license: unknown
modality: ''
model_card: none
monitoring: unknown
name: Moment
nationality: unknown
organization: Carnegie Mellon University, University of Pennsylvania
prohibited_uses: ''
quality_control: ''
size: 385M parameters (dense)
training_emissions: unknown
training_hardware: Single A6000 GPU
training_time: unknown
type: model
url: https://arxiv.org/pdf/2402.03885.pdf
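The PolyCoder record in this file notes that model checkpoints are available for download, and its model_card field names the Hugging Face repository. As an illustration (not part of the asset schema), a minimal sketch of loading that checkpoint for code completion, assuming the `transformers` library is installed and enough memory is available for a 2.7B-parameter model:

```python
# Minimal sketch: code completion with the PolyCoder checkpoint named in the
# model_card field above. Assumes `transformers` (and its torch backend) is
# installed; the 2.7B model needs substantial RAM or a GPU.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NinedayWang/PolyCoder-2.7B")
model = AutoModelForCausalLM.from_pretrained("NinedayWang/PolyCoder-2.7B")

prompt = "def binary_search(arr, target):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=48)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```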
================================================
FILE: assets/cognition.yaml
================================================
- access: limited
analysis: Evaluated on SWE-Bench, a challenging software engineering benchmark,
where Devin outperforms major state-of-the-art models unassisted.
created_date: 2024-03-12
dependencies: []
description: "Devin is the world\u2019s first fully autonomous AI software engineer."
feedback: none
intended_uses: ''
license: unknown
modality: text; code
model_card: none
monitoring: ''
name: Devin
nationality: USA
organization: Cognition Labs
prohibited_uses: ''
quality_control: ''
size: unknown
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://www.cognition-labs.com/introducing-devin
- access:
explanation: ESM C is a drop-in replacement for previous models that provides
major improvements in both capability and efficiency. ESM C models are available
immediately for academic and commercial use under a new license structure designed
to promote openness and enable scientists and builders.
value: open
analysis: Uses the methodology of Rao et al. to measure unsupervised learning
of tertiary structure in the form of contact maps. A logistic regression is used
to identify contacts. For a protein of length L, the precision of the top L predicted
contacts (P@L) is evaluated with a sequence separation of 6 or more residues.
created_date: 2024-12-04
dependencies:
- UniRef
- MGnify
- Joint Genome Institute
description: ESM Cambrian is a next generation language model trained on protein
sequences at the scale of life on Earth. ESM C models define a new state of the
art for protein representation learning.
feedback: unknown
intended_uses: "one that can be used by builders across a wide range of applications,\
\ to imbue AI models with a deeper understanding of the biology of life\u2019\
s most important and mysterious molecules."
license: Cambrian Open License
modality:
explanation: 'Just as a person can fill in the blanks ("To __ or not to __, that
is the ___"), a language model can be trained to fill in the blanks. In biology,
instead of training the model to predict words, it is trained to predict the characters
in a protein sequence, i.e. its sequence of amino acids.'
model_card: unknown
monitoring: unknown
name: ESM Cambrian
nationality: USA
organization: EvolutionaryScale
prohibited_uses: unknown
quality_control: ESM C was reviewed by a committee of scientific experts who concluded
that the benefits of releasing the models greatly outweigh any potential risks.
size:
explanation: 'ESM C is trained at three scales: 300M, 600M, and 6B parameters.'
value: 6B parameters
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://www.evolutionaryscale.ai/blog/esm-cambrian
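The modality explanation in the ESM Cambrian record above describes the training objective as fill-in-the-blank prediction over amino-acid characters. As a toy illustration of that idea, here is how a masked-language-modeling training pair might be constructed from a protein sequence; the mask token, masking rate, and sequence are hypothetical, and this is not EvolutionaryScale's actual pipeline:

```python
# Toy sketch of the fill-in-the-blank (masked language modeling) objective
# described above. Illustrative only; not EvolutionaryScale's training code.
import random

MASK = "<mask>"  # hypothetical mask token

def make_mlm_example(sequence, mask_rate=0.15, seed=0):
    """Mask a fraction of amino-acid positions; the model's job is to
    predict the original residue at each masked position."""
    rng = random.Random(seed)
    tokens = list(sequence)
    targets = {}  # position -> original residue
    for i, residue in enumerate(tokens):
        if rng.random() < mask_rate:
            targets[i] = residue
            tokens[i] = MASK
    return tokens, targets

# A short, made-up amino-acid sequence.
masked, targets = make_mlm_example("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")
print(masked)   # model input, with <mask> tokens in place of some residues
print(targets)  # what the model is trained to predict
```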
================================================
FILE: assets/cognitive.yaml
================================================
- access: open
analysis: none
created_date: 2023-11-14
dependencies:
- Dolphin
- Yi
description: Dolphin 2.2 Yi is an LLM based on Yi.
feedback: https://huggingface.co/cognitivecomputations/dolphin-2_2-yi-34b/discussions
intended_uses: ''
license:
explanation: can be found at https://huggingface.co/cognitivecomputations/dolphin-2_2-yi-34b/blob/main/LICENSE
value: custom
modality: text; text
model_card: https://huggingface.co/cognitivecomputations/dolphin-2_2-yi-34b
monitoring: unknown
name: Dolphin 2.2 Yi
nationality: USA
organization: Cognitive Computations
prohibited_uses: ''
quality_control: ''
size: 34B parameters (dense)
training_emissions: unknown
training_hardware: 4 A100 GPUs
training_time: 3 days
type: model
url: https://erichartford.com/dolphin
- access: open
analysis: Evaluated on OpenLLM leaderboard.
created_date:
explanation: release date is not published; estimated to be sometime in either
May or June 2023.
value: 2023-06-01
dependencies:
- WizardLM
description: WizardLM Uncensored is WizardLM trained on a subset of the dataset
from which responses containing alignment or moralizing were removed.
feedback: https://huggingface.co/cognitivecomputations/WizardLM-30B-Uncensored/discussions
intended_uses: ''
license: unknown
modality: text; text
model_card: https://huggingface.co/cognitivecomputations/WizardLM-30B-Uncensored
monitoring: unknown
name: WizardLM Uncensored
nationality: USA
organization: Cognitive Computations
prohibited_uses: ''
quality_control: ''
size: 30B parameters (dense)
training_emissions: unknown
training_hardware: unknown
training_time: unknown
type: model
url: https://huggingface.co/cognitivecomputations/WizardLM-30B-Uncensored
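The WizardLM Uncensored record above describes the recipe at a high level: drop training responses that contain alignment or moralizing boilerplate. A toy sketch of that kind of keyword filter follows; the phrase list and record format are hypothetical, not the ones actually used:

```python
# Toy sketch of filtering instruction-tuning data by dropping responses that
# contain refusal/moralizing boilerplate, in the spirit of the record above.
# The phrase list below is hypothetical.
REFUSAL_PHRASES = [
    "as an ai language model",
    "i cannot",
    "it is not appropriate",
]

def keep_example(example):
    """Keep a (prompt, response) pair only if the response contains none
    of the flagged phrases."""
    response = example["response"].lower()
    return not any(phrase in response for phrase in REFUSAL_PHRASES)

data = [
    {"prompt": "Explain binary search.", "response": "Binary search halves the range..."},
    {"prompt": "Write a joke.", "response": "As an AI language model, I cannot..."},
]
filtered = [ex for ex in data if keep_example(ex)]
print(len(filtered))  # 1 -- the second example is dropped
```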
================================================
FILE: assets/cohere.yaml
================================================
- access:
explanation: The dataset isn't provided to the public.
value: closed
analysis:
explanation: The analysis performed on the dataset is not released.
value: unknown
created_date:
explanation: 'The date the Cohere API was announced in the news [[News Article]]
(https://venturebeat.com/2021/11/15/openai-rival-cohere-launches-language-model-api/).
'
value: 2021-11-15
datasheet: https://docs.cohere.ai/data-statement
dependencies: []
description: 'The family of datasets used to train Cohere models, which come in
two forms: coheretext-filtered and coheretext-unfiltered. The former is used to
train the Representation models, while the latter is used to train the Generation
models.
'
excluded: 'Documents that are not in English are excluded.
'
feedback:
explanation: The internal feedback mechanisms for Cohere are unknown.
value: unknown
included: 'As stated in the datasheet, the dataset "includes the Google Books dataset,
CommonCrawl, and text from the internet scraped by the Cohere infrastructure team."
The top ten domains scraped were: wordpress.com, medium.com, stackexchange.com,
tumblr.com, elsevier.com, genius.com, bbc.co.uk, libsyn.com, yahoo.com, nytimes.com
[[Datasheet]](https://docs.cohere.ai/data-statement).
'
intended_uses: 'The intended use of the dataset is to train Cohere''s language models.
'
license: unknown
modality: text; text
monitoring:
explanation: 'There is no information on how Cohere is internally monitoring the
use of the dataset.
'
value: unknown
name: coheretext
nationality: Canada
organization: Cohere
prohibited_uses:
explanation: 'There are no known prohibited uses of the dataset, but the Cohere
API is bound by the Cohere usage guidelines, which may also apply to this dataset
[[Usage Guidelines]](https://docs.cohere.ai/usage-guidelines).
'
value: unknown
quality_control: 'In the datasheet, it is implied that Cohere employs filtration
methods for removing racist, biased and toxic content, but the details are not
provided. These filtration methods take both context and language into account,
as opposed to using a list of blockwords [[Datasheet]](https://docs.cohere.ai/data-statement).
'
sample: []
size:
explanation: 'Size of the filtered dataset is 200 GB. The unfiltered dataset is
~3 TB.
'
value: 200 GB
type: dataset
url: none
- access:
explanation: 'The model is available to the public through the Cohere Platform
[[Cohere Platform]](https://os.cohere.ai/login).
'
value: limited
analysis: The model's performance was analyzed on HellaSwag and COPA, as well as
several safety benchmarks [[Model Card]](https://docs.cohere.ai/generation-card).
created_date:
explanation: 'The date the Cohere API was announced in the news [[News Article]]
(https://venturebeat.com/2021/11/15/openai-rival-cohere-launches-language-model-api/).
'
value: 2021-11-15
dependencies:
- coheretext
description: 'The Generations model is a language model trained by Cohere for generation
tasks.
'
feedback:
explanation: 'There are no specific feedback mechanisms for the Generation model,
but a generic contact email (support at cohere.ai) is provided on the Cohere
website.
'
value: unknown
intended_uses: 'On the model card, the intended uses are stated as "interactive
autocomplete, augmenting human writing processes, summarization, text rephrasing,
and other text-to-text tasks in non-sensitive domains" [[Model Card]](https://docs.cohere.ai/generation-card).
'
license: unknown
modality: text; text
model_card: https://docs.cohere.ai/generation-card
monitoring: 'The usage of the model is monitored by Cohere [[Model Card]](https://docs.cohere.ai/generation-card).
'
name: Cohere Base
nationality: Canada
organization: Cohere
prohibited_uses: 'The usage of the model is bound by the Cohere usage guidelines
[[Usage Guidelines]](https://docs.cohere.ai/usage-guidelines). A non-comprehensive
list of specific applications violating these guidelines is: astroturfing, generation
of misinformation and other harmful content, and "generation of text about people,
places, or events without a human-in-the-loop" [[Model Card]](https://docs.cohere.ai/generation-card).
'
quality_control:
explanation: 'The quality control measures taken are unknown, but it is implied
that Cohere implemented mitigation strategies for toxic degeneration [[Model Card]](https://docs.cohere.ai/generation-card).
'
value: unknown
size:
explanation: 'The exact sizes of the generation models are unknown, but we know
that they come in three sizes: small, medium, and large [[Model Card]](https://docs.cohere.ai/generation-card).
'
value: unknown
training_emissions:
explanation: The emissions of the models are unknown.
value: unknown
training_hardware:
explanation: 'The training hardware wasn''t explicitly announced, but it was reported
that Google Cloud teamed up with Cohere on a TPU partnership [[TechCrunch Article]]
(https://techcrunch.com/2021/11/17/google-cloud-teams-up-with-nlp-startup-cohere-on-multi-year-partnership/).
'
value: unknown
training_time:
explanation: The training time for the models is unknown.
value: unknown
type: model
url: none
- access:
explanation: 'The model is available to the public through the Cohere Platform
[[Cohere Platform]](https://os.cohere.ai/login).
'
value: limited
analysis: The model's performance was analyzed on HellaSwag and COPA, as well as
several safety benchmarks [[Model Card]](https://docs.cohere.ai/generation-card).
created_date:
explanation: 'Cohere Command was in beta before this date
'
value: 2023-01-01
dependencies:
- Cohere Base
description: 'This model is a generative model optimized to follow commands in the
prompt.
'
feedback:
explanation: 'There are no specific feedback mechanisms for the Generation model,
but a generic contact email (support at cohere.ai) is provided on the Cohere
website.
'
value: unknown
intended_uses: 'On the model card, the intended uses are stated as "interactive
autocomplete, augmenting human writing processes, summarization, text rephrasing,
and other text-to-text tasks in non-sensitive domains" [[Model Card]](https://docs.cohere.ai/generation-card).
'
license:
explanation: The model likely has a license specifically for Cohere's use.
value: unknown
modality: text; text
model_card: https://docs.cohere.ai/generation-card
monitoring: 'The usage of the model is monitored by Cohere [[Model Card]](https://docs.cohere.ai/generation-card).
'
name: Cohere Command
nationality: Canada
organization: Cohere
prohibited_uses: 'The usage of the model is bound by the Cohere usage guidelines
[[Usage Guidelines]](https://docs.cohere.ai/usage-guidelines). A non-comprehensive
list of specific applications violating these guidelines is: astroturfing, generation
of misinformation and other harmful content, and "generation of text about people,
places, or events without a human-in-the-loop" [[Model Card]](https://docs.cohere.ai/generation-card).
'
quality_control:
explanation: unknown
value: unknown
size: unknown
training_emissions:
explanation: unknown
value: unknown
training_hardware:
explanation: unknown
value: unknown
training_time:
explanation: unknown
value: unknown
type: model
url: https://docs.cohere.com/docs/command-beta
- access:
explanation: 'The model is available to the public through the Cohere Platform
[[Cohere Platform]](https://cohere.ai/).
'
value: limited
analysis: 'The model''s performance was analyzed on several safety benchmarks [[Model
Card]](https://docs.cohere.ai/representation-card).
'
created_date:
explanation: 'The date the Cohere API was announced in the news [[News Article]]
(https://venturebeat.com/2021/11/15/openai-rival-cohere-launches-language-model-api/).
'
value: 2021-11-15
dependencies: []
description: 'The Embedding Large (English) model is a language model trained by
Cohere for tasks requiring embeddings.
'
feedback:
explanation: 'There are no specific feedback mechanisms for the Embed model,
but a generic contact email (support at cohere.ai) is provided on the Cohere
website.
'
value: unknown
intended_uses: 'The intended uses are stated as "estimating semantic similarity
between two sentences, choosing a sentence which is most likely to follow another
sentence, sentiment analysis, topic extraction, or categorizing user feedback"
on the Cohere model card [[Model Card]](https://docs.cohere.ai/representation-card).
'
license:
explanation: The model likely has a license specifically for Cohere's use.
value: unknown
modality: text; text
model_card: https://docs.cohere.ai/representation-card
monitoring: 'The usage of the model is monitored by Cohere [[Model Card]](https://docs.cohere.ai/representation-card).
'
name: Cohere Embed (English)
nationality: Canada
organization: Cohere
prohibited_uses: 'The usage of the model is bound by the Cohere usage guidelines
[[Usage Guidelines]](https://docs.cohere.ai/usage-guidelines). A non-comprehensive
list of specific applications violating these guidelines is: extraction of identity
and demographic information, building purposefully opaque text classification
systems, and "building downstream classifiers that serve as automated decision-making
systems that have real-world consequences on people, where those decisions are
made without a human-in-the-loop" [[Model Card]](https://docs.cohere.ai/representation-card).
'
quality_control: unknown
size: unknown
training_emissions:
explanation: The emissions of the models are unknown.
value: unknown
training_hardware:
explanation: 'The training hardware wasn''t explicitly announced, but it was reported
that Google Cloud teamed up with Cohere on a TPU partnership [[TechCrunch Article]]
(https://techcrunch.com/2021/11/17/google-cloud-teams-up-with-nlp-startup-cohere-on-multi-year-partnership/).
'
value: unknown
training_time:
explanation: The training time for the models is unknown.
value: unknown
type: model
url: none
- access:
explanation: 'The model is available to the public through the Cohere Platform
[[Cohere Platform]](https://cohere.ai/).
'
value: limited
analysis: 'The model''s performance was analyzed on several safety benchmarks [[Model
Card]](https://docs.cohere.ai/representation-card).
'
created_date:
explanation: 'The launch of the multilingual embedding model was announced in
this [[blog post]] (https://txt.cohere.com/multilingual/).
'
value: 2022-12-12
dependencies: []
description: 'This model maps text from 100+ languages to a semantic vector space,
positioning text with a similar meaning (regardless of language) in close proximity.
'
feedback:
explanation: 'There are no specific feedback mechanisms for the Embed model,
but a generic contact email (support at cohere.ai) is provided on the Cohere
website.
'
value: unknown
intended_uses: 'The intended uses are stated as "estimating semantic similarity
between two sentences, choosing a sentence which is most likely to follow another
sentence, sentiment analysis, topic extraction, or categorizing user feedback"
on the Cohere model card [[Model Card]](https://docs.cohere.ai/representation-card).
'
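Both Cohere Embed records describe mapping text into a semantic vector space where similar meanings land close together, with semantic similarity between sentences as a core intended use. A minimal sketch of the downstream comparison follows, using placeholder vectors; in practice the embeddings would be returned by Cohere's embed endpoint:

```python
# Minimal sketch of estimating semantic similarity from embeddings, the core
# intended use named in the Cohere Embed records above. The vectors below are
# placeholders; real embeddings are high-dimensional and come from the API.
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: values near 1.0 indicate
    semantically similar text under the embedding model."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

emb_1 = [0.12, 0.87, -0.33, 0.45]  # placeholder embedding for sentence 1
emb_2 = [0.10, 0.80, -0.30, 0.50]  # placeholder embedding for sentence 2
print(f"similarity: {cosine_similarity(emb_1, emb_2):.3f}")
```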
},
{
"path": "index.html",
"chars": 1961,
"preview": "<!DOCTYPE html>\n<html lang=\"en\">\n\t<head>\n <!-- Metadata -->\n\t\t<title>Ecosystem Graphs for Foundation Models</title>\n "
},
{
"path": "js/general.js",
"chars": 1073,
"preview": "function encodeUrlParams(params) {\n let s = '';\n for (let k in params)\n s += (s === '' ? '?' : '&') + k + '=' + enc"
},
{
"path": "js/main.js",
"chars": 24991,
"preview": "class Field {\n constructor(raw) {\n this.name = getField(raw, 'name');\n this.description = getField(raw, 'descript"
},
{
"path": "js/schemas.yaml",
"chars": 6036,
"preview": "---\n\ndataset:\n - name: name\n description: Name of the dataset (must be a unique identifier).\n # General\n - name: o"
},
{
"path": "pre-commit.sh",
"chars": 642,
"preview": "#!/bin/bash\n\n# This script fails when any of its commands fail.\nset -e\n\nif ! [ -e venv ]; then\n python3 -m pip install "
},
{
"path": "requirements.txt",
"chars": 32,
"preview": "yamllint==1.26.3\nyamlfix==1.2.0\n"
},
{
"path": "resources/.gitkeep",
"chars": 0,
"preview": ""
},
{
"path": "resources/all_assets.csv",
"chars": 450823,
"preview": "access,analysis,created_date,datasheet,dependencies,description,excluded,feedback,included,intended_uses,license,modalit"
},
{
"path": "scripts/collate.py",
"chars": 1047,
"preview": "#!/usr/bin/env python\nfrom pathlib import Path\n\nimport yaml\nfrom pandas import DataFrame\n\n\nASSET_PATH = Path(\"./assets\")"
},
{
"path": "server.py",
"chars": 505,
"preview": "#!/usr/bin/env python\n\nfrom http import server\n\nclass MyHTTPRequestHandler(server.SimpleHTTPRequestHandler):\n def end"
},
{
"path": "yamlfix.toml",
"chars": 132,
"preview": "[tool.yamlfix]\nline_length = 200\nnone_representation = \"Unknown\"\nflow_style_sequence = \"false\"\nexplicit_start "
}
]
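The file index above is a JSON array of records with three fields each: the file path, its size in characters, and a short truncated preview. A minimal sketch of how an agent might load and query it, assuming the array has been saved to a standalone file (the name manifest.json is hypothetical; the extraction itself ships as one text page):

import json

# Hypothetical: assumes the JSON index above was saved to manifest.json.
with open("manifest.json") as f:
    entries = json.load(f)

# Each record carries a file path, its size in characters, and a preview.
assets = [e for e in entries if e["path"].startswith("assets/")]
for e in sorted(assets, key=lambda e: e["chars"], reverse=True)[:5]:
    print(f"{e['chars']:>7}  {e['path']}")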
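The preview of scripts/collate.py cuts off just after its imports and the ASSET_PATH constant, but together with the resources/all_assets.csv entry in the index it suggests the script flattens every assets/*.yaml file (each a YAML list of asset records, as the previews beginning with "- access: ..." show) into a single pandas DataFrame written out as CSV. A hedged reconstruction of that pattern; everything past the visible preview is an assumption, not the repository's verified code:

#!/usr/bin/env python
from pathlib import Path

import yaml
from pandas import DataFrame

ASSET_PATH = Path("./assets")

# Assumption: each assets/*.yaml file parses to a list of asset dicts.
rows = []
for path in sorted(ASSET_PATH.glob("*.yaml")):
    with path.open() as f:
        rows.extend(yaml.safe_load(f) or [])

# Assumption: the collated table is the resources/all_assets.csv file
# listed in the index above.
DataFrame(rows).to_csv("resources/all_assets.csv", index=False)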
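Similarly, server.py subclasses server.SimpleHTTPRequestHandler and overrides a method whose name the preview truncates at "def end". A common reason for a static-site dev server to override end_headers is to inject no-cache headers so edited YAML and JS show up on refresh; a sketch under that assumption (the header values and the entry point are guesses, not the file's verified contents):

#!/usr/bin/env python
from http import server

class MyHTTPRequestHandler(server.SimpleHTTPRequestHandler):
    def end_headers(self):
        # Assumption: the override adds a no-cache header before the
        # base class finishes writing the header block.
        self.send_header("Cache-Control", "no-cache, no-store, must-revalidate")
        super().end_headers()

if __name__ == "__main__":
    # Hypothetical entry point: serve the repo root on http://localhost:8000.
    server.HTTPServer(("", 8000), MyHTTPRequestHandler).serve_forever()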
About this extraction
This page contains the full source code of the stanford-crfm/ecosystem-graphs GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 194 files (1.2 MB), approximately 325.3k tokens, and a symbol index with 46 extracted functions, classes, methods, constants, and types.